good part of the data.
cov.rob(x, cor=FALSE, quantile.used=floor((n + p + 1)/2),
method=c("mve", "mcd", "classical"), nsamp="best", seed)
good points.
"best" or "exact" or "sample". If "sample" the number chosen is
min(5*p, 3000), taken from Rousseeuw and Hubert (1997). If "best"
exhaustive enumeration is done up to 5000 samples; if "exact"
exhaustive enumeration will be attempted however many samples are needed.
.Random.seed. The current value of .Random.seed will be preserved if it
is set.
cor=T) the estimate of the correlation matrix.
quantile.used.
good points.
For method "mve", an approximate search is made of a subset of size
quantile.used with an enclosing ellipsoid of smallest volume; in method
"mcd" it is the volume of the Gaussian confidence ellipsoid, equivalently
the determinant of the classical covariance matrix, that is minimized.
The mean of the subset provides a first estimate of the location, and the
rescaled covariance matrix a first estimate of scatter. The Mahalanobis
distances of all the points from the location estimate for this
covariance matrix are calculated, and those points within the 97.5% point
under Gaussian assumptions are declared to be good. The final estimates
are the mean and rescaled covariance of the good points.
The rescaling is by the appropriate percentile under Gaussian data; in
addition the first covariance matrix has an ad hoc finite-sample
correction given by Marazzi.
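The reweighting step described above can be sketched in base R. This is only an illustration, not the code used inside cov.rob: the function name reweight_step and its arguments are hypothetical, the 97.5% point is assumed to be the chi-squared quantile with p degrees of freedom, and the rescaling of the covariance matrices is left out.

```r
# Illustrative sketch of the reweighting step (hypothetical helper, not part of MASS).
reweight_step <- function(x, center0, cov0) {
  x <- as.matrix(x)
  p <- ncol(x)
  # Squared Mahalanobis distances of all points from the first estimate
  d2 <- mahalanobis(x, center0, cov0)
  # Assumption: "good" means within the 97.5% chi-squared quantile on p df
  good <- d2 <= qchisq(0.975, df = p)
  xg <- x[good, , drop = FALSE]
  # Mean and (unrescaled) covariance of the good points
  list(center = colMeans(xg), cov = cov(xg), good = which(good))
}

# The classical estimates stand in for the robust first estimate here,
# purely so that the example is self-contained.
x <- as.matrix(stack.x)
fit <- reweight_step(x, colMeans(x), cov(x))
fit$center
fit$good
```

In cov.rob itself the first estimate would come from the "mve" or "mcd" search rather than from the classical mean and covariance.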
For method "mve" the search is made over ellipsoids determined by the
covariance matrix of p of the data points. For method "mcd" an additional
improvement step suggested by Rousseeuw and van Driessen (1999) is used,
in which once a subset of size quantile.used is selected, an ellipsoid
based on its covariance is tested (as this will have no larger a
determinant, and may be smaller).
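A minimal sketch of one such improvement step, written as a concentration step in the style of Rousseeuw and van Driessen, may help. It assumes the step consists of recomputing the mean and covariance from the current subset and keeping the points closest in Mahalanobis distance; the helper name c_step and the arbitrary starting subset are hypothetical, and the code inside cov.rob may differ in detail.

```r
# Illustrative concentration step (hypothetical helper, not part of MASS).
c_step <- function(x, subset) {
  x <- as.matrix(x)
  h <- length(subset)
  xs <- x[subset, , drop = FALSE]
  # Distances of all points from the ellipsoid defined by the current subset
  d2 <- mahalanobis(x, colMeans(xs), cov(xs))
  # Keep the h closest points as the new subset
  sort(order(d2)[seq_len(h)])
}

x <- as.matrix(stack.x)
h <- floor((nrow(x) + ncol(x) + 1) / 2)   # default quantile.used
start <- seq_len(h)                       # arbitrary starting subset
improved <- c_step(x, start)

# The determinant should not increase after the step
det_before <- det(cov(x[start, , drop = FALSE]))
det_after <- det(cov(x[improved, , drop = FALSE]))
c(before = det_before, after = det_after)
```

Repeating the step until the subset stops changing gives a local minimum of the determinant for that starting subset.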
P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.
A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth & Brooks/Cole.
P. J. Rousseeuw and B. C. van Zomeren (1990) Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association, 85, 633-639.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212-223.
stackloss <- data.frame(stack.x, stack.loss)
cov.rob(stackloss)
cov.rob(stack.x, method="mcd", nsamp="exact")