Univariate Location and Scale Estimation

DESCRIPTION:

Returns a list containing robust estimates of location and spread of univariate data using the least trimmed squares (LTS) estimator (Rousseeuw 1984, 1985). location.lts uses the exact algorithm of Rousseeuw and Leroy (1987, pages 171-172).

USAGE:

location.lts(xvec, quan=floor(length(xvec)/2)+1) 

REQUIRED ARGUMENTS:

xvec
a vector containing the observations whose location is to be estimated. Missing values (NAs) and infinite values (Infs) are not accepted.

OPTIONAL ARGUMENTS:

quan
the number of observations in the subset whose empirical variance is minimized. In general, quan must be an integer between the default value, floor(n/2)+1, and n=length(xvec).

VALUE:

a list containing the solution with components:
loc
an estimate for the location of the data.
scale
an estimate for the spread of the data.

DETAILS:

Let n be the number of univariate observations. The LTS method (Rousseeuw 1984, 1985) estimates univariate location and scale. The location estimator is defined as the mean of the subset of quan observations that has the smallest sum of squared deviations from its own mean. The scale estimator is essentially the square root of this smallest sum of squares divided by quan.
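The definition above can be written out as follows. This is only a restatement in symbols: h (for quan), H, and SS are notation introduced here, not taken from the references, and the approximate equality for the scale reflects the word "essentially" in the text, since any correction factor is not specified there.

```latex
% h stands for quan; H ranges over all subsets of {1, ..., n} with h elements.
\bar{x}_H = \frac{1}{h} \sum_{i \in H} x_i, \qquad
SS(H) = \sum_{i \in H} \left( x_i - \bar{x}_H \right)^2, \qquad
H^{*} = \arg\min_{|H| = h} SS(H)

\hat{\mu}_{\mathrm{LTS}} = \bar{x}_{H^{*}}, \qquad
\hat{\sigma}_{\mathrm{LTS}} \approx \sqrt{ SS(H^{*}) / h }
```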

The exact algorithm (Rousseeuw and Leroy, 1987, pages 171-172) proceeds as follows. First the observations are ordered. The means of the successive quan-subsets are computed. (Note that not all quan-subsets have to be considered, because an optimal quan-subset must consist of contiguous observations.) For the location estimate, the mean of the quan-subset with the smallest sum of squares is returned. If there are several such subsets, the low median of their centers is returned. The scale estimate (which is always unique) is the square root of the smallest sum of squares divided by quan. Throughout the algorithm, the means and the sums of squares are computed by update formulas.
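The steps above can be sketched in a few lines of S/R code. This is a minimal illustration, not the source of location.lts: the name lts_location_sketch is hypothetical, each window's mean and sum of squares is recomputed directly rather than by update formulas, the "centers" of tied subsets are assumed to be their means, and no correction factor is applied to the scale.

```r
# Hypothetical helper (not part of the documented function): a plain sketch
# of the contiguous-subset search described in the text.
lts_location_sketch <- function(xvec, quan = floor(length(xvec) / 2) + 1) {
  stopifnot(is.numeric(xvec), all(is.finite(xvec)))  # no NAs or Infs
  n <- length(xvec)
  stopifnot(quan >= 1, quan <= n)

  x <- sort(xvec)                    # order the observations
  starts <- seq_len(n - quan + 1)    # one start index per contiguous quan-subset

  # Mean and sum of squared deviations of every contiguous quan-subset,
  # recomputed directly here instead of using update formulas.
  means <- sapply(starts, function(i) mean(x[i:(i + quan - 1)]))
  ssq <- sapply(starts, function(i) {
    w <- x[i:(i + quan - 1)]
    sum((w - mean(w))^2)
  })

  # Subsets attaining the smallest sum of squares; with ties, take the
  # low median of their means (assumed meaning of "centers").
  best <- which(ssq == min(ssq))
  centers <- sort(means[best])
  loc <- centers[floor((length(centers) + 1) / 2)]

  # Square root of the smallest sum of squares divided by quan,
  # with no further correction factor (an assumption of this sketch).
  list(loc = loc, scale = sqrt(min(ssq) / quan))
}
```

For x <- c(90,93,86,92,95,83,75,40,88,80) and the default quan of 6, the sketch selects the six largest observations, so its loc is their mean, about 90.67. Its scale may differ from the value returned by location.lts if that function rescales the estimate.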

Univariate location and scale estimation can be considered a particular case of the general regression model (a model with an intercept only), so the method of LTS location is a special case of ltsreg. Also the minimum covariance determinant estimator reduces to LTS location in one dimension. For quan=n, the LTS location is the average of all n observations.

BACKGROUND:

The LTS estimator has breakdown value 50% when using the default quan. That is, the estimate cannot be pulled arbitrarily far away without changing about half of the data. For a larger quan, the breakdown value is roughly (n-quan)/n.
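The rough formula (n-quan)/n can be tabulated for a few choices of quan. The helper name approx_breakdown and the sample size n = 100 are illustrative assumptions, and the values are only the approximation quoted above, not exact breakdown values.

```r
# Hypothetical helper: the rough breakdown fraction (n - quan) / n from the text.
approx_breakdown <- function(n, quan) (n - quan) / n

n <- 100
quan_default <- floor(n / 2) + 1     # default quan from the USAGE line
quans <- c(quan_default, 75, 90, n)  # from the default up to all observations

# One row per choice of quan, with the approximate breakdown fraction.
data.frame(quan = quans, breakdown = approx_breakdown(n, quans))
```

A larger quan uses more of the data in the estimate but tolerates fewer outliers; at quan = n the fraction is 0, matching the fact that the estimate is then the ordinary average.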

REFERENCES:

Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79, 871-881.

Rousseeuw, P. J. (1985). Multivariate estimation with high breakdown point. In Mathematical Statistics and Applications. W. Grossmann, G. Pflug, I. Vincze and W. Wertz, eds. Reidel: Dordrecht, 283-297.

Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. New York: Wiley.

EXAMPLES:

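# Ten observations with one gross outlier (40)
x <- c(90,93,86,92,95,83,75,40,88,80)
sort(x)
mean(x)          # pulled down by the outlier
median(x)
location.lts(x)  # robust location and scale, default quan = floor(10/2)+1 = 6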