I want to use a clustering algorithm that can capture the following structure in a multivariate binary dataset. In the sample below, class 1 and class 2 both have a 1 in columns A and B, so they should form a cluster. The same holds for class 5 and class 6, which share columns C and D. Class 3 and class 4 should form a cluster that sits closer to the class 1 and 2 cluster, since classes 1 to 4 all have a 1 in column B. Is hierarchical clustering an appropriate technique for displaying this kind of relationship?
The data are as follows:
| | A | B | C | D |
|---|---|---|---|---|
| class1 | 1 | 1 | 0 | 0 |
| class2 | 1 | 1 | 0 | 0 |
| class3 | 0 | 1 | 0 | 0 |
| class4 | 0 | 1 | 0 | 0 |
| class5 | 0 | 0 | 1 | 1 |
| class6 | 0 | 0 | 1 | 1 |
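
One way to check whether hierarchical clustering recovers this structure is to run a small agglomerative pass over the sample. The sketch below is only an illustration under two assumptions that are not part of the question: Jaccard distance as the dissimilarity between binary rows, and average linkage between clusters. The helper names (`jaccard_distance`, `average_linkage`, `agglomerate`) are made up for this example.

```python
from itertools import combinations

# Rows of the sample table; the four values are columns A, B, C, D.
data = {
    "class1": (1, 1, 0, 0),
    "class2": (1, 1, 0, 0),
    "class3": (0, 1, 0, 0),
    "class4": (0, 1, 0, 0),
    "class5": (0, 0, 1, 1),
    "class6": (0, 0, 1, 1),
}


def jaccard_distance(u, v):
    """1 - (columns set in both rows) / (columns set in at least one row)."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage(c1, c2):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    total = sum(jaccard_distance(data[a], data[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(names):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in names]
    history = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_linkage(*pair))
        height = average_linkage(c1, c2)
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        history.append((c1, c2, height))
    return history


merges = agglomerate(list(data))
for left, right, height in merges:
    print(f"{height:.2f}  {left} + {right}")
```

With these choices, the three identical pairs merge at height 0, the class 1/2 and class 3/4 clusters join at 0.5, and class 5/6 joins last at 1.0, which matches the nesting described above. The same computation can be done with SciPy's `pdist` (with `metric="jaccard"`) and `linkage`, which also lets you draw the result as a dendrogram; a different distance or linkage may give different merge heights.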