Quantile Regression
DESCRIPTION:
Returns an object of class rq or rq.process that represents
a quantile regression fit.
USAGE:
rq(formula,tau=.5,data,weights,na.action,method="br",contrasts,...)
REQUIRED ARGUMENTS:
- formula
-
a formula object, with the response on the left of a `~' operator,
and the terms, separated by + operators, on the right.
- tau
-
the quantile to be estimated, this is generally a number between 0 and 1,
but if specified outside this range, it is presumed that the solutions
for all values of tau in (0,1) are desired. In the former case an
object of class rq is returned, in the latter, an object of class rq.process
is returned.
OPTIONAL ARGUMENTS:
- data
-
a data.frame in which to interpret the variables
named in the formula, or in the subset and the weights argument.
If this is missing, then the variables in the formula should be on the
search list. This may also be a single number to handle some special
cases -- see below for details.
- weights
-
vector of observation weights; if supplied, the algorithm fits to minimize the
sum of the weights multiplied into the absolute residuals.
The length of weights must be the same as the
number of observations. The weights must be nonnegative and it is strongly
recommended that they be strictly positive, since zero weights are ambiguous.
- na.action
-
a function to filter missing data.
This is applied to the model.frame after any subset argument has been used.
The default (with na.fail) is to create an error if any missing values are
found. A possible alternative is na.omit, which
deletes observations that contain one or more missing values.
- method
-
the algorithmic method used to compute the fit. There are currently
three options: The default method is the
modified version of the Barrodale and Roberts algorithm for
l1-regression, used by l1fit in S, and is described in detail in
Koenker and d"Orey(1987,1994), default = "br".
This is quite efficient for problems up to several thousand observations,
and may be used to compute the full quantile regression process. It
also implements a scheme for computing confidence intervals for
the estimated parameters, based on inversion of a rank test described
in Koenker(1994). For larger problems it is advantagous to use
the Frisch Newton interior point method "fn".
And very large problems one can use the Frisch-Newton approach after
preprocessing "pfn". Both of the latter methods are described in detail in
Portnoy and Koenker(1997).
- contrasts
-
a list giving contrasts for some or all of the factors
default = NULL. appearing in the model formula.
The elements of the list should have the same name as the variable
and should be either a contrast matrix (specifically, any full-rank
matrix with as many rows as there are levels in the factor),
or else a function to compute such a matrix given the number of levels.
- ...
-
additional arguments for the fitting routines
(see rq.fit.br and rq.fit.fn and the functions they call).
VALUE:
See rq.object and rq.process.object for details.
EXAMPLES:
rq(stack.loss~stack.x,.5) #median (l1) regression fit for the stackloss data.
rq(stack.loss~stack.x,.25) #the 1st quartile,
#note that 8 of the 21 points lie exactly on this plane in 4-space
rq(stack.loss~stack.x,tau=-1) #this returns the full rq process
rq(rnorm(50)~1,ci=F) #ordinary sample median --no rank inversion ci
rq(rnorm(50)~11,weights=runif(50),ci=F) #weighted sample median
METHOD:
The function computes an estimate on the tau-th conditional quantile
function of the response, given the covariates, as specified by the
formula argument. Like lm(), the function presumes a linear specification
for the quantile regression model, i.e. that the formula defines a model
that is linear in parameters. For non-linear quantile regression
see the function nlrq(). [To appear real soon now on a screen near you.]
The function minimizes a weighted sum of absolute residuals that can
be formulated as a linear programming problem. As noted above, there
are three different algorithms that can be chosen depending on problem
size and other characteristics. For moderate sized problems (n<<5,000,
p<<20) it is recommended that the default "br" method be used. There
are several choices of methods for computing confidence intervals and
associated test statistics. Using "br" the default approach produces
confidence intervals for each of the estimated model parameters
based on inversion of a rank test. See the documentation for rq.fit.br
for further details and options. For larger problems, the "fn" and "pfn"
are preferred, and there are several methods of computing standard errors
and associated test statistics described in the help files for rq.fit.fn,
and summary.rq.
REFERENCES:
[1] Koenker, R.W. and Bassett, G.W. (1978). Regression quantiles,
Econometrica, 46, 33-50.
[2] Koenker, R.W. and d'Orey (1987,1994). Computing Regression Quantiles.
Applied Statistics, 36, 383-393, and 43, 410-414.
[3] Gutenbrunner, C. Jureckova, J. (1991).
Regression quantile and regression rank score process in the
linear model and derived statistics, Annals of Statistics, 20, 305-330.
[4] Koenker, R.W. (1994). Confidence Intervals for Regression Quantiles, in
P. Mandl and M. Huskova (eds.), Asymptotic Statistics, 349-359,
Springer-Verlag, New York.
There is also recent information available at the url: www.econ.uiuc.edu.
SEE ALSO:
summary.rq, print.rq, print.summary.rq, rq.object, rq.process.object