mclust(x, method = "S*", signif = rep(0, dim(x)[2]), noise = F, scale = rep(1, dim(x)[2]), shape = c(1, rep(0.2, (dim(x)[2]-1))), workspace = <<see below>>)
"S*", "S", "spherical" (with varying sizes), "sum of squares" or "trace" (Ward's method), "unconstrained", "determinant", "centroid", "weighted average link", "group average link", "complete link" or "farthest neighbor", "single link" or "nearest neighbor". Only enough of the string to determine a unique match is required.
x. Nonpositive components are allowed. Used in initializing clustering in some methods.
The ith column of x is multiplied by scale[i] before cluster analysis begins.
"S*" and "S".
(dim(x)[1]*(dim(x)[1]-1)) + 10*dim(x)[1].
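The argument conventions above can be sketched in a few lines. This is an illustration only and is not taken from mclust itself: the names method.names, col.scale, and default.workspace are hypothetical, the data matrix is made up, pmatch is used merely to imitate the unique-prefix rule, and the last expression is assumed to be the default for workspace.

```r
# 1. Partial matching of the method string, imitated with pmatch.
#    method.names is a hypothetical vector of the primary option strings.
method.names <- c("S*", "S", "spherical", "sum of squares", "unconstrained",
                  "determinant", "centroid", "weighted average link",
                  "group average link", "complete link", "single link")
method.names[pmatch("cent", method.names)]  # "centroid": unique prefix
pmatch("s", method.names)                   # NA: prefix of three options
method.names[pmatch("S", method.names)]     # "S": exact match beats "S*"

# 2. Column scaling: the ith column of x is multiplied by scale[i].
#    col.scale stands in for the scale argument.
x <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)
col.scale <- c(1, 0.1)
x.scaled <- sweep(x, 2, col.scale, "*")

# 3. Workspace size from the expression given above (assumed default).
n <- dim(x)[1]
default.workspace <- (n * (n - 1)) + 10 * n   # 4*3 + 40 = 52
```

With four observations the expression gives 52; it grows roughly as the square of the number of observations.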
merge, height, and order conforming to the output of the function hclust, but here height is just the stage of the merge. This output can be used with several functions such as plclust and subtree.
mclass).
"S*", "S", "spherical" (with varying sizes), "sum of squares" or "trace" (Ward's method), "unconstrained", and "determinant".
mclust.
The amount of storage needed depends on the ordering of the data points. If the limit is exceeded, reordering the data points may make it possible to rerun without increasing workspace.
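One simple way to reorder the observations is a random permutation of the rows. The sketch below uses a made-up data matrix; the names perm and x.reordered are hypothetical, and nothing here guarantees that a particular ordering will fit within the workspace limit.

```r
# Made-up data: 4 observations on 3 variables
x <- matrix(1:12, ncol = 3)

# Random permutation of the row indices
set.seed(1)
perm <- sample(nrow(x))

# Rows in their new order; perm[i] is the original row number of
# the ith reordered observation, so results can be mapped back
x.reordered <- x[perm, , drop = F]
```

Keeping perm allows any labels or classifications obtained from the reordered data to be matched to the original rows.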
Hierarchical merging is used in all cases; the criterion for each merge depends on the kind of clusters expected. The criteria include the standard sum-of-squares method for hyperspherical clusters and the determinant criterion for ellipsoidal clusters that point in the same direction. Several criteria also take the shape of the clusters into account, such as "S*", which is optimal for clusters that are long and point in different directions, perhaps even overlapping.
Some standard heuristic criteria are included along with the model-based methods. For the heuristic methods ("centroid", "weighted average link", "group average link", "complete link", and "single link"), the initial criterion is the Euclidean distance between observations. The function hclust allows more general initialization for the "group average link", "complete link", and "single link" methods.
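The more general initialization offered by hclust can be sketched by passing it a distance structure other than the Euclidean one. The data below are made up and the object names are hypothetical; the sketch assumes the standard dist and hclust functions, with "average" selecting the group average link and "manhattan" selecting city-block distance.

```r
# Made-up data: two loose groups of points in the plane
x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(10, 10), c(10, 11), c(11, 10))

# Group average link started from Euclidean distances (the dist default)
tree.euclid <- hclust(dist(x), method = "average")

# The same linkage started from Manhattan distances instead
tree.manhat <- hclust(dist(x, "manhattan"), method = "average")

# Each tree records one merge per stage: n - 1 merges for n observations
dim(tree.euclid$merge)
```

Either tree can be drawn with the same plotting functions used for the output of mclust.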
Separate functions are available for classification (mclass) and iterative relocation (mreloc).
Banfield, J. D. and Raftery, A. E. (1993). Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3), 803-822.
Gordon, A. D. (1981). Classification: Methods for the Exploratory Analysis of Multivariate Data. London: Chapman and Hall.
elect.years <- c("1960", "1964", "1968", "1972", "1976")
votes.clust <- mclust(votes.repub[,elect.years], method = "S", noise = T)
# display dendrogram on current graphics device
plclust(votes.clust$tree, label = state.abb)
# plot the awe
plot(x = 1:length(votes.clust$awe), y = votes.clust$awe)