ABC Confidence Limits

DESCRIPTION:

Calculate ABC (Approximate Bootstrap Confidence) limits. This is a generic function.

USAGE:

limits.abc(data, statistic, args.stat = NULL, group = NULL, 
          subject = NULL, probs = c(25, 50, 950, 975)/1000, 
          positive = T, 
          tilt = "exponential", subset.statistic=1:p, 
          assign.frame1 = F, weights = NULL, 
          epsilon = 0.01/n, unbiased = F, returnL = F, 
          save.group, save.subject, subjectDivide = F, 
          modifiedStatistic) 

Some of the arguments to limits.abc are the same as for . See for details in addition to those provided below.

REQUIRED ARGUMENTS:

data
data; may be a vector, matrix, or data frame. Variable naming restrictions apply in case data is a data frame -- see "Details", below.

Alternatively, data may be a bootstrap, jackknife, influence, or other resamp object. In this case, statistic and other arguments are not required; they are assumed to be the same as for the earlier object.

statistic
statistic to be calculated; a function or expression that returns a vector or matrix. Not all expressions work; see below. It may be a function which accepts data as the first argument and an argument named weights; other arguments may be passed using args.stat.
Or it may be an expression such as mean(x,trim=.2). If data is given by name (e.g. data=x) then use that name in the expression, otherwise (e.g. data=air[,4]) use the name data in the expression. If data is a data frame, the expression may involve variables in the data frame.
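
As a minimal sketch of the function form described above, the following defines a statistic that takes the data as its first argument and has an argument named weights. The name wtd.mean is hypothetical and is not part of the library; the limits.abc call is shown only as a comment because it assumes the resampling library is attached.

```r
# Hypothetical weighted mean; the name wtd.mean is illustrative only.
# When weights is NULL, every observation gets equal weight.
wtd.mean <- function(x, weights = NULL) {
  if (is.null(weights)) weights <- rep(1, length(x))
  sum(weights * x) / sum(weights)
}

x <- c(2, 4, 6, 8)
wtd.mean(x)                           # equal weights: the ordinary mean
wtd.mean(x, weights = c(1, 1, 1, 5))  # more weight on the last value

# With the resampling library attached, this could then be passed as
#   limits.abc(x, wtd.mean)
```

Dividing by sum(weights) means the function does not depend on the weights being normalized to sum to 1.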

OPTIONAL ARGUMENTS:

args.stat
list of other arguments, if any, passed to statistic when calculating the statistic.
group
vector of length equal to the number of observations in data, for stratified sampling or multiple-sample problems. Sampling is done separately for each group (determined by unique values of this vector). If data is a data frame, this may be a variable in the data frame, or an expression involving such variables.
subject
vector of length equal to the number of observations in data; if present then subjects (determined by unique values of this vector) are resampled rather than individual observations. If data is a data frame, this may be a variable in the data frame, or an expression involving such variables. If group is also present, subject must be nested within group (each subject must be in only one group).
probs
probabilities for one-sided confidence limits; e.g. c(.025, .975) gives a two-sided 95% confidence interval. Note that limits are undefined at 0 and 1.
positive
logical; if TRUE, negative weights (for the linear method) are changed to zero, in which case the resulting intervals may be too narrow. If FALSE, negative weights are passed unchanged to the statistic.
tilt
one of "exponential", "ml", "both", or "none", indicating which tilting versions of the ABC limits (exponential tilting, maximum likelihood tilting, both, or neither) are computed in addition to the usual linear ABC limits.
subset.statistic
subscript expression; if the statistic that was bootstrapped has length greater than 1, use this to request intervals for only some elements (parameters) of the statistic.
assign.frame1
logical flag indicating whether the resampled data should be assigned to frame 1 before evaluating the statistic. Try assign.frame1=T if all estimates are identical (this is slower).
weights
a vector of length equal to the number of observations (or subjects). The empirical influence function is calculated at the empirical distribution with these probabilities (normalized to sum to 1) on the observations or subjects. When sampling by subject the vector may be named, in which case the names must correspond to the unique values of subject. Otherwise the weights are taken to be ordered with respect to the sorted values of subject. If data is a data frame, this may be a variable in the data frame, or an expression involving such variables. The default implies equal weights.
Weights are not yet supported for ABC inferences; if supplied the result is similar to the result of calling .
epsilon
small value used for numerical evaluation of derivatives. A larger value should be used for non-smooth functions.
unbiased
logical value; if TRUE then standard error estimates are computed using a divisor of (n-1) instead of n, so that the squared standard error estimates are more nearly unbiased.
returnL
logical flag, if TRUE then only the L matrix is returned, rather than the list described below.
save.group, save.subject
logical flags, if TRUE then group and subject vectors, respectively, are saved in the returned object. Both defaults are TRUE if n<=10000.
subjectDivide
logical flag, meaningful only if sampling by subject. Internal calculations involve assigning weights to subjects; if TRUE then the weight for each subject is divided among observations for that subject before calculating the statistic; if FALSE the subject weight is replicated to observations for that subject. Also, if TRUE and input weights are supplied for observations (as a vector with length equal to the number of observations), then initial subject weights are the sums of weights for the observations.
modifiedStatistic
if your statistic is an expression that calls a function with a "hidden" weights argument, then pass this to indicate how to call your function. See below.
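
The two allocation rules described under subjectDivide can be illustrated with a small sketch. This is not the library's internal code; the subject labels and weights below are made-up values chosen only to show the arithmetic.

```r
# Illustrative data: six observations on three subjects (made-up values).
subject <- c("a", "a", "b", "c", "c", "c")
subj.w <- c(a = 0.5, b = 0.2, c = 0.3)   # one weight per subject, summing to 1

# Number of observations for the subject that each observation belongs to
n.obs <- as.vector(table(subject)[subject])

# Rule corresponding to subjectDivide = FALSE:
# each observation receives its subject's full weight.
replicated <- as.vector(subj.w[subject])

# Rule corresponding to subjectDivide = TRUE:
# each subject's weight is split evenly among its observations.
divided <- replicated / n.obs

sum(replicated)   # exceeds 1 when subjects have several observations
sum(divided)      # the observation weights still sum to 1
```

Which rule is appropriate depends on whether the statistic treats its weights as observation-level or subject-level quantities.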

VALUE:

object of class c("limits.abc", "influence", "resamp"), with components call, observed, L, estimate, n, B, dim.obs, and epsilon (see for components not described below):
abc.limits
the ABC confidence limits.
exp.limits, ml.limits
if present, these contain the exponential and maximum likelihood tilting versions of the ABC limits.
replicates
value of statistic evaluated at distance epsilon in each direction from weights. If sampling by subject, the rows are named with the unique values of subject.
replicates2
value of statistic evaluated in the steepest descent directions (for each statistic). These are used to calculate the curvature of the statistic in that direction, which is a component of the bias of the statistic.
L
the empirical influence function values. If sampling by subject, the rows are named with the unique values of subject.
estimate
data frame with columns containing the mean of the replicates, and estimated bias and standard error. In addition, if weights is missing, it has columns containing estimates of acceleration, z0, and cq used by other bootstrap procedures.

DETAILS:

This function shares much code in common with . Calculations involve perturbing the empirical (weighted) distribution represented by data and measuring the effect on statistic. The statistic is evaluated using a number of different weight vectors: once with the input weights (or no weights), n times in order to estimate the empirical influence function and asymptotic bias, and 2p times in order to estimate curvature in the steepest descent directions for each component of a multivariate statistic (where p is the number of components). Then the statistic is evaluated again at certain distances from the original weights in the steepest descent direction (either linearly in that direction, or using tilting), with distances determined by the confidence limits desired (determined by probs), adjusted for the estimated bias, curvature, and skewness of the empirical influence function in order to obtain accurate confidence limits.
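
The idea of perturbing the weights and measuring the effect on the statistic can be sketched with a one-sided finite difference. This is a generic textbook construction under simplifying assumptions (equal initial weights, a scalar statistic), not the library's actual algorithm; the names wtd.mean and influence.fd are hypothetical.

```r
# Hypothetical helpers, for illustration only; not the library's internal code.
wtd.mean <- function(x, weights) sum(weights * x) / sum(weights)

influence.fd <- function(x, stat, epsilon = 0.01 / length(x)) {
  n <- length(x)
  w0 <- rep(1 / n, n)            # empirical distribution: equal weights
  t0 <- stat(x, weights = w0)    # statistic at the unperturbed weights
  L <- numeric(n)
  for (i in 1:n) {
    w <- (1 - epsilon) * w0      # shrink all weights slightly ...
    w[i] <- w[i] + epsilon       # ... and move mass epsilon onto observation i
    L[i] <- (stat(x, weights = w) - t0) / epsilon
  }
  L
}

x <- c(1, 3, 4, 10)
influence.fd(x, wtd.mean)   # for the mean this equals x - mean(x), up to rounding
```

For the mean the finite difference is exact because the statistic is linear in the weights; for nonlinear statistics the result depends on epsilon, which is why a larger value is suggested for non-smooth functions.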

The name "Splus.resamp.weights" is reserved for internal use by influence. To avoid naming conflicts, that name cannot be used as a variable name in data if data is a data frame.

When statistic is an expression, for example mean(x), a modified expression mean(x, weights = Splus.resamp.weights) is created. Only calls to functions that have an argument named weights are modified; e.g. sum(x)/length(x) would fail. If your expression calls a function with a "hidden" weights argument, e.g. you may pass weights as part of the ... list, then use the modifiedStatistic argument to specify that, e.g. modifiedStatistic = myFun(x, weights = Splus.resamp.weights). An expression such as mean(y[a==1]) is converted to mean(y[a==1], weights = Splus.resamp.weights) which will fail because the weights vector was not subscripted along with y. In cases such as these pass a function that performs the desired calculations, or use
modifiedStatistic = mean(y[a==1], weights = Splus.resamp.weights[a==1])
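
The alternative mentioned above, passing a function that performs the desired calculation, might look like the following sketch. The function name sub.mean and the data frame dat are hypothetical; the point is only that the weights are subscripted with the same condition as y. The limits.abc call is left as a comment because it assumes the resampling library is attached.

```r
# Hypothetical statistic: weighted mean of y over rows where a == 1.
# The weights are subscripted with the same condition as y.
sub.mean <- function(data, weights = NULL) {
  if (is.null(weights)) weights <- rep(1, nrow(data))
  keep <- data$a == 1
  sum(weights[keep] * data$y[keep]) / sum(weights[keep])
}

dat <- data.frame(y = c(1, 2, 3, 4), a = c(1, 0, 1, 1))
sub.mean(dat)                           # unweighted mean of the rows with a == 1
sub.mean(dat, weights = c(1, 9, 1, 2))  # the weight on the excluded row has no effect

# With the resampling library attached, this could then be passed as
#   limits.abc(dat, sub.mean)
```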

The usual ABC intervals may require evaluating the statistic with negative or zero weights. The "exponential" and "ml" methods always use positive weights.

REFERENCES:

Davison, A.C. and Hinkley, D.V. (1997), Bootstrap Methods and Their Application, Cambridge University Press.

Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap, New York: Chapman & Hall.

SEE ALSO:

and do similar calculations.

More details on many arguments, see .

Print, summarize, plot: , , , .

Description of a "limits.abc" object, extract parts: , , .

Modify a "limits.abc" object: .

For an annotated list of functions in the package, including other high-level resampling functions, see: .

EXAMPLES:

set.seed(1); x <- rcauchy(40) 
limits.abc(x, location.m) 
influence.obj <- influence(x, location.m) 
 
limits.abc(stack.loss, var) 
 
set.seed(0) 
y <- cbind(1:15, runif(15)) 
limits.abc(y, cor(y)[2,1])  # gives warning, negative weights