rreg(x, y, w=rep(1,n), int=T, init=lsfit(x, y, w, int=F)$coef, method=wt.default, wx=weights, iter=20, acc=10*.Machine$single.eps^0.5, test.vec="resid", weights=NULL)
1s for the intercept unless int=FALSE. Missing values (NAs) are allowed.
x. Missing values (NAs) are allowed.
w may be the weights computed from residuals in previous iterations of rreg. The argument wx should be used for weights that are to remain constant from iteration to iteration.
x
.
wt.default, is the converged Huber estimate followed by the bisquare. See the examples below for a way to give different values for the tuning constants where applicable.
weights argument instead of this argument, for consistency with other S-PLUS functions.
"resid", for coefficients with "coef", and for weights with "w". If test.vec=NULL, then the orthogonality of the residuals to x is tested.
NAs) are allowed.
wx
.
"converged" if the iterations converged normally, or else a message describing abnormal convergence.
The routine uses iteratively reweighted least squares to
approximate the robust fit, with residuals from the current
fit passed through a weighting function to give weights for
the next iteration.
There are several possible weighting functions, and you are free to create
your own.
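As a rough illustration of writing your own weighting function, the sketch below defines a hypothetical wt.mine. It assumes, going by the examples at the end of this page, that rreg calls the method function with a vector of scaled residuals and expects a vector of weights of the same length. The name wt.mine, its formula and its constant c=2 are invented for illustration and are not part of S-PLUS.

```r
# Hypothetical user-written weight function (not part of S-PLUS).
# Assumption: u is a vector of scaled residuals and the result is a
# vector of weights between 0 and 1 of the same length.
wt.mine <- function(u, c = 2) {
    # full weight for small residuals, decreasing weight beyond c
    ifelse(abs(u) <= c, 1, (c / abs(u))^2)
}

w <- wt.mine(c(-6, -1, 0, 1.5, 4))
print(w)

# It would then be passed to rreg like a built-in weight function:
# rreg(stack.x, stack.loss, method = wt.mine)
```

The rreg call is left as a comment because it depends on the assumed calling convention.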
The weight functions that are available are: wt.andrews, wt.bisquare, wt.cauchy, wt.default, wt.fair, wt.hampel, wt.huber, wt.logistic, wt.median, wt.talworth, wt.welsch.
Here we describe the calculations for some of the weight functions
available.
The vector u below is the vector of residuals divided by the (Gaussian consistent) MAD of the residuals. c is a "tuning" constant with default values as indicated for each method. (See the examples for suggestions on how to override these defaults.)
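The scaling of residuals and the reweighting loop can be sketched as follows. This is a minimal illustration of iteratively reweighted least squares with Huber-type weights, not the rreg source. The function name irls.sketch is hypothetical, the scale is assumed to be mad(r) with its default Gaussian-consistent constant, a fixed number of iterations stands in for a convergence test, and the data are simulated.

```r
# Minimal IRLS sketch with Huber-type weights; not the rreg source code.
# Assumptions: scale = mad(r) with its default constant, and a fixed
# number of iterations instead of a convergence test.
irls.sketch <- function(x, y, c = 1.345, iter = 20) {
    w <- rep(1, length(y))                    # start from ordinary least squares
    for (i in 1:iter) {
        fit <- lsfit(x, y, wt = w)            # weighted least-squares step
        r <- as.vector(y - cbind(1, x) %*% fit$coef)   # raw residuals
        u <- r / mad(r)                       # residuals scaled by the MAD
        w <- ifelse(abs(u) < c, 1, c / abs(u))         # Huber weights
    }
    list(coef = fit$coef, w = w)
}

# Simulated data (illustrative only): a noisy line with one gross
# outlier in the response at the largest x value
x <- 1:20
y <- 2 + 3 * x + sin(1:20)
y[20] <- y[20] + 100
ls.coef <- lsfit(x, y)$coef
rob <- irls.sketch(x, y)
print(ls.coef)
print(rob$coef)
```

In this sketch the outlying observation ends up with a small weight, so the robust slope stays much closer to the true value of 3 than the least squares slope does.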
wt.andrews: sin(u/c)/(u/c) for abs(u) <= pi*c and 0 otherwise. The default c is 1.339.
wt.bisquare: (1 - (u/c)^2)^2 if abs(u) < c and 0 otherwise. The default is c=4.685.
wt.cauchy: 1/(1 + (u/c)^2) with c=2.385 the default.
wt.fair: 1/(1 + abs(u/c))^2 is the weight function and 1.4 is the default for c.
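wt.hampel: tuning constants a, b and c. The weight function is 1 if abs(u)<=a; it is a/abs(u) for a<abs(u)<=b; it is a*(c-abs(u))/(abs(u)*(c-b)) for b<abs(u)<=c; and it is 0 otherwise. The defaults for a, b and c are 2, 4 and 8.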
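A three-part Hampel weight can be sketched as below. The name hampel.sketch is hypothetical, and the descending branch between b and c is assumed to follow the standard Hampel form. It has not been checked against the wt.hampel source.

```r
# Sketch of a three-part Hampel weight. hampel.sketch is a hypothetical
# name; the branch for b < abs(u) <= c assumes the standard Hampel form.
hampel.sketch <- function(u, a = 2, b = 4, c = 8) {
    au <- abs(u)
    w <- rep(0, length(u))                    # weight 0 beyond c
    w[au <= a] <- 1                           # full weight up to a
    mid <- au > a & au <= b
    w[mid] <- a / au[mid]                     # Huber-like decline
    far <- au > b & au <= c
    w[far] <- a * (c - au[far]) / (au[far] * (c - b))   # reaches 0 at c
    w
}

w <- hampel.sketch(c(0, 1, 3, 4, 6, 8, 10))
print(w)
```

The weights are continuous at a, b and c, which is a quick sanity check on the three branches.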
wt.huber: 1 for abs(u) < c and c/abs(u) otherwise. The default is c=1.345.
wt.logistic: tanh(u/c)/(u/c) with c=1.205 the default.
wt.talworth: 1 if abs(u) <= c, otherwise it is 0. c=2.795 is the default.
wt.welsch: exp(-2*(abs(u/(2*c))^2)) with a default c of 2.985.
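To make the formulas concrete, here is a small sketch of the Huber, bisquare and Cauchy weights written directly from the descriptions above, with the stated default tuning constants. The *.sketch names are hypothetical and are chosen so as not to mask the built-in wt.* functions. The bisquare cutoff is applied to abs(u), which is an assumption about the intended condition.

```r
# Sketches of three of the weights described above, using the stated
# default tuning constants. These are illustrations, not the built-in
# wt.huber, wt.bisquare and wt.cauchy functions.
huber.sketch    <- function(u, c = 1.345) ifelse(abs(u) < c, 1, c / abs(u))
bisquare.sketch <- function(u, c = 4.685) ifelse(abs(u) < c, (1 - (u / c)^2)^2, 0)
cauchy.sketch   <- function(u, c = 2.385) 1 / (1 + (u / c)^2)

# Tabulate the weights at a few scaled residuals
u <- c(0, 1, 2.385, 5)
tab <- cbind(u = u,
             huber    = huber.sketch(u),
             bisquare = bisquare.sketch(u),
             cauchy   = cauchy.sketch(u))
print(round(tab, 3))
```

The table shows the qualitative difference: the Huber weight never reaches zero, the bisquare weight is exactly zero beyond c, and the Cauchy weight decays smoothly.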
rreg, which is described in Heiberger and Becker (1992), is modeled on the algorithm described in Coleman et al. (1980).
These regression estimates are useful when there are outliers in the response.
Least squares regression is very sensitive to the assumption that the errors
have a Gaussian distribution, so robust techniques are often of value.
The M-estimates that rreg produces are susceptible to high leverage points, however (see hat). In technical terms: M-estimates do not have bounded influence.
The ltsreg function is an alternative: it has the highest possible breakdown (almost half the observations, both x and y, can be moved arbitrarily and the estimate will change only a finite amount) but it is somewhat inefficient.
Two ways to currently get high-breakdown, high-efficiency regression are to use rreg with weights that are based on the hat diagonal, or to use least squares with weights based on the size of the residuals from the least trimmed squares regression. Research is currently very active on this question; more definite solutions can be expected in the future.
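Weights based on the hat diagonal might be computed along the following lines. The particular choice sqrt(1 - h) is an illustrative assumption and is not taken from this page; other decreasing functions of the leverage could be used. The data and the name lev.wt are made up for the example.

```r
# Sketch: fixed weights derived from the hat diagonal. The choice
# sqrt(1 - h) is an illustrative assumption, not a recommendation.
x <- cbind(x1 = c(1, 2, 3, 4, 5, 20), x2 = c(2, 1, 4, 3, 5, 4))
h <- hat(x)                  # leverage of each row (intercept included)
lev.wt <- sqrt(1 - h)        # smaller weight for higher leverage
print(round(cbind(h, lev.wt), 3))

# The weights could then be held constant across iterations, e.g.
# rreg(x, y, weights = lev.wt)
```

Here the last row, which lies far from the others in x1, has the largest hat value and therefore the smallest weight.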
Coleman, D., Holland, P., Kaden, N., Klema, V. and Peters, S. C. (1980). A system of subroutines for iteratively re-weighted least-squares computations. ACM Transactions on Mathematical Software 6, 327-336.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust Statistics, The Approach Based on Influence Functions. Wiley, New York.
Heiberger, R. and Becker, R. A. (1992). Design of an S function for Robust Regression Using Iteratively Reweighted Least Squares. Journal of Computational and Graphical Statistics 1, 181-196.
rreg(stack.x, stack.loss, method=wt.huber) # use a Huber weight function

# If a different value for the tuning constant is desired, say c=1.2, then
rreg(stack.x, stack.loss, method=function(u) wt.huber(u, c=1.2))

# To implement different values for the tuning constants, we might also do:
hamp.139 <- function(r) wt.hampel(r, a=0.1, b=0.3, c=0.9)
rreg(stack.x, stack.loss, method=hamp.139)

reg1 <- rreg(x, y, method=wt.talworth, iter=3)
rreg(x, y, w=reg1$w, method=wt.bisquare, iter=3)