bootstrap(data, statistic, ..., lmsampler="observations")
bootstrap.lm(data, statistic, ..., lmsampler="observations")
bootstrap.glm(data, statistic, ...)
lm or glm objects and returns a vector or matrix.
It may be a function (e.g. coef) which accepts data as the first argument;
other arguments may be passed using args.stat.
predict(fit,newdata=orig.frame).
If the data object is given by name (e.g. data=fit) then use that name
in the expression, otherwise (e.g. data=glm(formula,dataframe)) use the
name data in the expression, e.g. predict(data,newdata=orig.frame).
"observations", "residuals", "wild" or "wild-as" (may be abbreviated).
When bootstrapping observations, the data from the data argument to the
call generating the lm object are resampled. When bootstrapping
residuals, the (unadjusted) residuals and predicted values for the fit
of the original data are computed. The residuals are then resampled and
the statistic is evaluated on the fit with the response variable
replaced by the original predicted values plus the resampled residuals.
The wild bootstraps are variations on resampling residuals; for the
simple wild bootstrap, each residual is either added to or subtracted
from the predicted value for that observation. For the asymmetric wild
bootstrap, the residual times Q is added to the prediction, where Q is
a discrete random variable with mean 0, variance 1, and E(Q^3) = 1.
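The three residual-based samplers described above can be sketched in base R. This is an illustration only, not the code behind bootstrap.lm: the data set mtcars is a stand-in, the helper names new.response and boot.coef are hypothetical, and the two-point distribution used for Q is one assumed choice that satisfies the stated moment conditions (the text does not say which distribution is used).

```r
# Base R sketch of the residual, wild and asymmetric wild samplers.
# Assumptions: mtcars as stand-in data; helper names are hypothetical.
set.seed(0)
fit   <- lm(mpg ~ wt, data = mtcars)
preds <- fitted(fit)   # predicted values from the original fit
res   <- resid(fit)    # unadjusted residuals from the original fit
n     <- length(res)
X     <- model.matrix(fit)

# Assumed two-point distribution for Q: mean 0, variance 1, E(Q^3) = 1
q.vals  <- c(-(sqrt(5) - 1) / 2, (sqrt(5) + 1) / 2)
q.probs <- c((sqrt(5) + 1) / (2 * sqrt(5)), (sqrt(5) - 1) / (2 * sqrt(5)))

# Build one bootstrap response vector for the chosen sampler
new.response <- function(sampler) {
  switch(sampler,
    residuals = preds + sample(res, n, replace = TRUE),
    wild      = preds + res * sample(c(-1, 1), n, replace = TRUE),
    "wild-as" = preds + res * sample(q.vals, n, replace = TRUE, prob = q.probs))
}

# Refit on B bootstrap responses; returns a B x p matrix of coefficients
boot.coef <- function(sampler, B = 200) {
  t(replicate(B, lm.fit(X, new.response(sampler))$coefficients))
}

# Bootstrap standard errors of the coefficients, one column per sampler
se <- sapply(c("residuals", "wild", "wild-as"),
             function(s) apply(boot.coef(s), 2, sd))
```

In the plain residual sampler any residual can land on any observation, whereas both wild samplers keep each residual attached to its own observation and only change its sign or scale.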
bootstrap which inherits from resamp.
When resampling residuals, the result has a component order.matters
set to "resampling residuals"; this disables functions that are only
appropriate for the ordinary bootstrap.
These functions are designed to speed up bootstrap computations when the statistic of interest requires fitting a model. Typically one has, for example,
#
bootstrap(data=data.frame, statistic(lm(formula, data),...),...)
#
In this case lm is called once per iteration, and a new object of
class lm is created each time. Faster (but equivalent) results are
obtained by using
#
bootstrap(lm(formula, data.frame), statistic(...), ...)
#
which dispatches to bootstrap.lm. The savings come from reducing the
overhead required to create fitted models.
The methods described here do this work just once, save the result in
an object of class model.list, and then resample the model.list.
Thus the following are equivalent:
#
# Slow
bootstrap(data=data.frame, stat(lm(formula, data),...),...)
#
# Fast
fit <- lm(formula, data.frame) # returns lm object
bootstrap(fit, stat(fit,...),...) # uses bootstrap.lm
#
# Fast
modlst <- lm(data,...,method="model.list") # returns model.list object
bootstrap(modlst, stat(lm(modlst),...),...) # uses bootstrap.default
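
The idea of doing the model setup once and resampling the stored pieces can be illustrated in base R. This is a sketch under assumptions, not the model.list mechanism itself: mtcars is a stand-in data set, and the model matrix and response are extracted once and then indexed, in place of calling lm on a resampled data frame each time.

```r
# Base R sketch: refit from scratch each iteration versus
# resampling rows of a model matrix that was built only once.
# Assumptions: mtcars as stand-in data; this is not the model.list code.
set.seed(1)
form <- mpg ~ wt + disp
fit  <- lm(form, data = mtcars)

X <- model.matrix(fit)                 # design matrix, built once
y <- model.response(model.frame(fit))  # response, extracted once
n <- nrow(X)
B <- 100

# One column of resampled row indices per bootstrap replicate
idx <- replicate(B, sample(n, replace = TRUE))

# Slow: rebuild the model frame and a full lm object every iteration
slow <- apply(idx, 2, function(i) coef(lm(form, data = mtcars[i, ])))

# Fast: index the stored matrix and response, then do only the fit
fast <- apply(idx, 2,
              function(i) lm.fit(X[i, , drop = FALSE], y[i])$coefficients)
```

Both versions use the same resampled indices, so slow and fast hold the same coefficients; only the per-iteration overhead differs.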
# bootstrap and lm
bootstrap(fuel.frame, coef(lm(Fuel~Weight+Disp.)), seed=10)
# the same thing but faster, using bootstrap.lm
fit <- lm(Fuel~Weight+Disp., data=fuel.frame)
bootstrap(fit, coef)
# Bootstrapping unadjusted residuals in lm (2 equivalent ways)
fit.lm <- lm(Mileage~Weight, fuel.frame)
resids <- resid(fit.lm)
preds <- predict(fit.lm)
bootstrap(resids, lm(resids+preds~fuel.frame$Weight)$coef, B=500, seed=0)
bootstrap(fit.lm, coef, lmsampler="resid", B=500, seed=0)
# Other statistics
bootstrap(fit, coef(fit)[1]-coef(fit)[2])
bootstrap(fit, predict, args.stat=list(newdata=fuel.frame))
bootstrap(fit, function(x) predict(x,newdata=fuel.frame))
# bootstrap and glm
mform <- Kyphosis ~ Age + (Number > 5)*Start
fit <- glm(mform, family = binomial, data = kyphosis,
           control=glm.control(maxit=20))
bootstrap(fit, coef, B=50, seed=8)