Cluster analysis is the search for groups (clusters) in data, such that objects belonging to the same cluster resemble each other, whereas objects in different clusters are dissimilar.
When all of the variables in a data set are binary, a natural way to divide the observations is by splitting the data into two groups based on the two values of a particular binary variable. Monothetic analysis produces a hierarchy of clusters in which at each step a group is split in two based on the value of one of the binary variables.
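The splitting procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the dialog or the mona function. The rule used here to choose the splitting variable (the largest total association |ad - bc| with the other variables) is an assumption taken from the usual textbook description of monothetic analysis, and the names DATA, association, choose_variable, and monothetic_hierarchy are hypothetical. Missing values are not handled.

```python
# Illustrative sketch of monothetic (one-variable-at-a-time) divisive clustering.
# Assumption: the splitting variable is the one with the largest total
# association |a*d - b*c| with the other candidate variables.

# Hypothetical example data: object label -> tuple of binary values.
DATA = {
    "a": (1, 1, 0),
    "b": (1, 1, 1),
    "c": (0, 0, 1),
    "d": (0, 0, 0),
}


def association(rows, f, g):
    """Return |a*d - b*c| from the 2x2 table of binary variables f and g."""
    a = sum(1 for r in rows if r[f] == 1 and r[g] == 1)
    b = sum(1 for r in rows if r[f] == 1 and r[g] == 0)
    c = sum(1 for r in rows if r[f] == 0 and r[g] == 1)
    d = sum(1 for r in rows if r[f] == 0 and r[g] == 0)
    return abs(a * d - b * c)


def choose_variable(rows, candidates):
    """Pick the candidate with the largest total association.

    Only variables that take both values within the group can split it.
    Returns None when no such variable exists.
    """
    usable = [v for v in candidates if len({r[v] for r in rows}) == 2]
    if not usable:
        return None
    return max(
        usable,
        key=lambda f: sum(association(rows, f, g) for g in candidates if g != f),
    )


def monothetic_hierarchy(data, variables):
    """Return the splits in order as (variable, objects_with_1, objects_with_0)."""
    splits = []

    def recurse(labels, remaining):
        if len(labels) < 2:
            return  # a single object cannot be split further
        rows = [data[k] for k in labels]
        var = choose_variable(rows, remaining)
        if var is None:
            return  # all remaining variables are constant in this group
        ones = [k for k in labels if data[k][var] == 1]
        zeros = [k for k in labels if data[k][var] == 0]
        splits.append((var, ones, zeros))
        rest = [v for v in remaining if v != var]
        recurse(ones, rest)
        recurse(zeros, rest)

    recurse(list(data), list(variables))
    return splits


if __name__ == "__main__":
    for var, ones, zeros in monothetic_hierarchy(DATA, [0, 1, 2]):
        print(f"split on variable {var}: {ones} | {zeros}")
```

Each recorded split uses exactly one variable, which is what makes the hierarchy monothetic: every cluster can be described by the values of the variables used to reach it.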
To perform monothetic clustering
Choose Statistics > Cluster Analysis > Monothetic (Binary Variables). The Monothetic Clustering dialog appears.
Model page
In the Monothetic Clustering dialog, the Model page has the following options:
Data
Data Set for Monothetic Clustering
Specify the data set. For monothetic analysis, all variables must be binary. A limited number of missing values (NAs) is allowed, subject to the following conditions: every observation must have at least one non-missing value, no variable may have half or more of its values missing, and at least one variable must have no missing values. In addition, a variable whose non-missing values are all identical is not allowed.
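The requirements above can be checked before running the analysis. The following Python sketch is one possible pre-check, not part of S-PLUS. The function name check_binary_data and the data layout (a dictionary of columns, with None standing in for NA) are hypothetical, and the sketch assumes that a variable with exactly half of its values missing is also rejected.

```python
# Illustrative pre-check of the data requirements for monothetic clustering.
# Assumptions: columns is a non-empty dict mapping a variable name to a list
# of 0, 1, or None (None plays the role of NA); all lists have equal length.

def check_binary_data(columns):
    """Return a list of problems; an empty list means all checks pass."""
    problems = []
    names = list(columns)
    n = len(columns[names[0]])

    for name in names:
        col = columns[name]
        present = [v for v in col if v is not None]
        if any(v not in (0, 1) for v in present):
            problems.append(f"{name}: not binary")
        if 2 * (n - len(present)) >= n:
            problems.append(f"{name}: half or more of the values are missing")
        if present and len(set(present)) == 1:
            problems.append(f"{name}: all non-missing values are identical")

    # Every observation needs at least one non-missing value.
    for i in range(n):
        if all(columns[name][i] is None for name in names):
            problems.append(f"row {i}: every value is missing")

    # At least one variable must be complete.
    if not any(None not in columns[name] for name in names):
        problems.append("no variable is free of missing values")

    return problems


if __name__ == "__main__":
    example = {"x": [1, 0, 1, 0], "y": [1, None, 0, 0]}
    print(check_binary_data(example))
```

Returning a list of messages rather than stopping at the first failure makes it possible to fix all problems in the data set in one pass.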
Clustering Variables
Select the numeric variables to cluster from the drop-down list. If your data set contains factor variables, use the Compute Dissimilarities dialog to create dissimilarity objects for use in cluster analysis. However, dissimilarity objects cannot be used in K-Means or Monothetic clustering.
Subset Rows with
Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.
Omit Rows with Missing Values
Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
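The combined effect of the two row options can be pictured with a short Python sketch. It is only an analogy for what the dialog does: the function select_rows is hypothetical, the keep predicate stands in for the S-PLUS subset expression, and None stands in for NA.

```python
# Illustrative sketch: subset rows first, then optionally drop incomplete rows.
# Assumptions: rows is a list of dicts (one per observation), keep is an
# optional predicate playing the role of the subset expression, and None
# marks a missing value.

def select_rows(rows, keep=None, omit_missing=False):
    """Return the rows that would enter the analysis."""
    # A missing predicate corresponds to leaving the subset field blank.
    selected = [r for r in rows if keep is None or keep(r)]
    if omit_missing:
        selected = [r for r in selected if all(v is not None for v in r.values())]
    return selected


if __name__ == "__main__":
    rows = [
        {"x": 1, "y": 0, "site": "A"},
        {"x": None, "y": 1, "site": "A"},
        {"x": 0, "y": 1, "site": "B"},
    ]
    kept = select_rows(rows, keep=lambda r: r["site"] == "A", omit_missing=True)
    print(kept)
```

For simplicity the sketch checks every field of a row for missing values, whereas the dialog considers only the variables in the model.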
In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.
Results page
In the Monothetic Clustering dialog, the Results page has the following options:
Printed Results
Output Type
Select None for no printed output, Short for a short printed summary, or Long for a more detailed printed summary. (Long output is not available for all functions.)
Plot page
In the Monothetic Clustering dialog, the Plot page has the following options:
Banner Plot
Select this option to create a banner plot. The banner plot displays the hierarchy of clusters and is equivalent to a tree. The banner plots the diameter of each cluster being split.
Related programming language functions
mona