Monothetic Clustering

Cluster analysis is the search for groups (clusters) in the data such that objects belonging to the same cluster resemble each other, whereas objects in different clusters are dissimilar.

When all of the variables in a data set are binary, a natural way to divide the observations is by splitting the data into two groups based on the two values of a particular binary variable. Monothetic analysis produces a hierarchy of clusters in which at each step a group is split in two based on the value of one of the binary variables.
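The splitting step described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the dialog or the mona function. The text does not say how the splitting variable is chosen, so the sketch assumes the rule usually given for monothetic analysis: pick the variable with the largest total association to the other variables, with the association of two binary variables taken as |ad - bc| from their 2x2 table. The names association, choose_variable, split_once, and rows are hypothetical.

```python
def association(rows, f, g):
    """Return |a*d - b*c| from the 2x2 table of binary variables f and g."""
    a = sum(1 for r in rows if r[f] == 1 and r[g] == 1)
    b = sum(1 for r in rows if r[f] == 1 and r[g] == 0)
    c = sum(1 for r in rows if r[f] == 0 and r[g] == 1)
    d = sum(1 for r in rows if r[f] == 0 and r[g] == 0)
    return abs(a * d - b * c)


def choose_variable(rows, candidates):
    """Assumed rule: largest total association to the other candidates."""
    def total(f):
        return sum(association(rows, f, g) for g in candidates if g != f)
    return max(candidates, key=total)


def split_once(rows, candidates):
    """Split one cluster in two on the value of a single binary variable."""
    # A variable can only split the cluster if it takes both values in it.
    usable = [v for v in candidates if len({r[v] for r in rows}) == 2]
    if not usable:
        return None
    v = choose_variable(rows, usable)
    zeros = [r for r in rows if r[v] == 0]
    ones = [r for r in rows if r[v] == 1]
    return v, zeros, ones


# Six observations on three binary variables (made-up illustration data).
rows = [
    (1, 1, 0),
    (1, 1, 1),
    (1, 1, 0),
    (0, 0, 1),
    (0, 0, 0),
    (0, 1, 1),
]

variable, zeros, ones = split_once(rows, [0, 1, 2])
print("split on variable", variable)
print("group with value 0:", zeros)
print("group with value 1:", ones)
```

Applying split_once again to each resulting group, with the variable just used removed from the candidates, would produce the hierarchy of clusters described above.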

To perform monothetic clustering

Choose Statistics > Cluster Analysis > Monothetic (Binary Variables). The dialog shown below appears.

Model page

[Image: Model page of the Monothetic Clustering dialog]

In the Monothetic Clustering dialog, the Model page has the following options:

Data

Data Set for Monothetic Clustering

Specify the data set. For monothetic analysis, all variables must be binary. A limited number of missing values (NAs) is allowed. Every observation must have at least one value different from NA. No variable should have half of its values missing. There must be at least one variable that has no missing values. A variable with all its non-missing values identical is not allowed.
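The requirements above can be checked before running the analysis. The following Python sketch is one way to do that and is not part of the product. It assumes the data are held as a list of rows with missing values (NAs) represented by None, and it reads "half of its values missing" as "half or more", which is an interpretation rather than something the text states. The function name check_binary_data is hypothetical.

```python
def check_binary_data(rows):
    """Return a list of problems found; an empty list means all checks pass."""
    problems = []
    n = len(rows)
    columns = list(zip(*rows))

    # Every observation must have at least one non-missing value.
    for i, row in enumerate(rows):
        if all(v is None for v in row):
            problems.append(f"observation {i} has only missing values")

    has_complete_variable = False
    for j, col in enumerate(columns):
        present = [v for v in col if v is not None]
        missing = n - len(present)

        # All non-missing values must be binary.
        if any(v not in (0, 1) for v in present):
            problems.append(f"variable {j} is not binary")
        if missing == 0:
            has_complete_variable = True
        # Assumption: "half" is read as "half or more".
        if 2 * missing >= n:
            problems.append(f"variable {j} has half or more of its values missing")
        # A variable whose non-missing values are all identical is not allowed.
        if len(set(present)) == 1:
            problems.append(f"variable {j} has identical non-missing values")

    # At least one variable must have no missing values.
    if not has_complete_variable:
        problems.append("no variable is free of missing values")
    return problems


good = [(1, 0, 1), (0, None, 1), (1, 1, 0), (0, 0, 0)]
bad = [(1, None, 1), (0, None, 1), (1, 1, 1), (None, None, None)]

print(check_binary_data(good))
for problem in check_binary_data(bad):
    print(problem)
```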

Clustering Variables

Select the numeric variables to use from the dropdown list. If your data set contains factor variables, use the Compute Dissimilarities dialog to create dissimilarity objects for use in cluster analysis. Note, however, that dissimilarity objects cannot be used in K-Means or Monothetic clustering.

Subset Rows

Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.

Omit Rows with Missing Values

Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.

Save Model Object

In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.

Results page

[Image: Results page of the Monothetic Clustering dialog]

In the Monothetic Clustering dialog, the Results page has the following options:

Printed Results

Output Type

Select None for no printed output, Short for a short printed summary, or Long for a more detailed printed summary. (Long output is not available for all functions.)

Plot page

[Image: Plot page of the Monothetic Clustering dialog]

In the Monothetic Clustering dialog, the Plot page has the following options:

Banner Plot

Select this to create a banner plot. The banner plot displays the hierarchy of clusters, and is equivalent to a tree. The banner plots the diameter of each cluster being split.

Related programming language functions

mona