Tilting-after-bootstrap diagnostics

DESCRIPTION:

Diagnostic procedure to estimate how summaries of a bootstrap distribution (such as quantiles, bias, or standard error) are related to the statistic of interest.

USAGE:

tiltAfterBootstrap(boot.obj, 
    functional = "Bias&SE", passObserved = F, 
    probs = ppoints(10, 0.5), L, tau, 
    column = 1, tilt = "exponential", 
    subjectDivide = F, modifiedStatistic = NULL, 
    ..., frame.eval = boot.obj$parent.frame) 
 

REQUIRED ARGUMENTS:

boot.obj
a bootstrap object, created by bootstrap.

OPTIONAL ARGUMENTS:

functional
This may be a character string, one of: "Quantiles", "Centered Quantiles", "Standardized Quantiles", "Mean", "Bias", "SE", or "Bias&SE". Or it may be a function that takes the matrix of bootstrap replicates as its first argument, such as colMeans; see DETAILS below for additional requirements.
passObserved
logical; TRUE if functional accepts the observed value of the bootstrap statistic as an argument; e.g. "Bias" is the mean of the bootstrap distribution minus the observed value. If functional is one of the character strings above this is set automatically, to TRUE for "Bias", "Bias&SE", "Centered Quantiles" and "Standardized Quantiles". If TRUE, the bootstrap statistic will be modified to accept a weights argument; see "DETAILS", below.
probs
vector of values between 0 and 1; each value determines a tilting factor used to reweight the bootstrap distribution with the new center at roughly the probs quantile of the unweighted bootstrap distribution. This argument is ignored if tau is provided.
L
influence values, to be used for computing probabilities by exponential or maximum likelihood tilting. This may be a vector or matrix, in which case the length (or number of rows) is equal to the number of observations (or number of subjects, if sampling by subject was used) in the original bootstrap data. The influence values are computed internally if this argument is missing; see .
tau
vector of tilting parameters; may be supplied in place of probs. Must be supplied if tilt="ml".
column
an integer from 1:p, where p = length(boot.obj$observed). When p>1, the bootstrap distribution is a joint distribution, and column describes the component of the joint distribution for which tilting parameters are computed.
tilt
one of "exponential" or "ml", for exponential or maximum likelihood tilting. If "ml", then you must specify tau rather than probs.
subjectDivide
logical flag, meaningful only if the sampling in boot.obj was by subject. Internal calculations involve assigning weights to subjects; if TRUE then the weight for each subject is divided among observations for that subject before calculating the statistic; if FALSE the subject weight is replicated to observations for that subject.
modifiedStatistic
if the bootstrap statistic is an expression that calls a function with a "hidden" weights argument, then pass this to indicate how to call your function. See "DETAILS", below.
...
other arguments passed to functional.
frame.eval
frame where the data and other objects used when creating boot.obj can be found. You need to specify this if objects can't be found by their original names, or have changed; see .

VALUE:

an object of class "tiltAfterBootstrap"; a list with components
call
the function call to this function,
Func
the value of the functional applied to the original bootstrap distribution (including control adjustments), converted to a vector. Let p denote the length of this.
Func.replicates
matrix with one row for every value of probs or tau, and p columns. Each row is the value of the functional when applied to one tilted bootstrap distribution. Let k be the number of rows.
observed
the original observed statistic, from the bootstrap object.
statistic
the bootstrap statistic, recalculated for the weighted empirical distribution with the same values as the original sample but with unequal probabilities determined by tilting. A matrix of dimension k x p, where k is the length of tau. In some cases this will be missing. See "DETAILS", below.
tau
vector of tilting parameters, length k.
probs
vector of probabilities, length k, approximate quantiles of the original bootstrap distribution (of a linear approximation to the statistic).
column
integer - this determines which dimension of the statistic to use for tilting, when the statistic is multivariate.
effectiveB
vector of length k, giving the effective bootstrap sample size with unequal weights, assuming independence between the weights and other quantities (this assumption is violated, so these numbers are only a guideline).
quantiles
logical, TRUE if the functional is known to compute quantiles, centered quantiles, or standardized quantiles.
dim.Func
dimension of the functional calculated for a single bootstrap distribution.
dimnames.Func
dimnames of the functional calculated for a single bootstrap distribution.

DETAILS:

Suppose one were to modify the original empirical distribution by placing unequal weights on the observations. As the weights change, both the statistic calculated from the weighted distribution, and the bootstrap distribution obtained by sampling with probabilities equal to those weights, change. Bootstrap tilting looks at the relationships between the statistic and summaries ("functionals") of the bootstrap distribution, as the weights change. In particular, the weights are selected by exponential or maximum likelihood tilting; these approximately maximize the change in the statistic given the distance (forward or backward Kullback-Leibler distance) between the weights and the original equal weights.
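
As a minimal sketch of the exponential tilting described above (not the code used by tiltAfterBootstrap), the weights can be taken proportional to exp(tau * L). The helpers expTiltWeights and klDistance and the values of tau are hypothetical; the data are those of the EXAMPLES section, and L = x - mean(x) is used as the influence values for the mean.

```r
# Hypothetical helper: exponential tilting weights for influence values L.
# Weights are proportional to exp(tau * L) and sum to 1; tau = 0 gives equal weights.
expTiltWeights <- function(tau, L) {
  w <- exp(tau * (L - mean(L)))  # centering L does not change the normalized weights
  w / sum(w)
}

# Hypothetical helper: a Kullback-Leibler distance between weights w and equal weights 1/n.
klDistance <- function(w) sum(w * log(length(w) * w))

x <- qexp(ppoints(30))  # same data as in EXAMPLES
n <- length(x)
L <- x - mean(x)        # influence values for the mean

w0     <- expTiltWeights(0, L)     # equal weights
wRight <- expTiltWeights(0.5, L)   # assumed tau; more weight on large observations
wLeft  <- expTiltWeights(-0.5, L)  # assumed tau; more weight on small observations

# The weighted statistic moves right or left as tau changes sign
weightedMeans <- c(left = sum(wLeft * x), equal = sum(w0 * x), right = sum(wRight * x))
weightedMeans

# Distance from the equal weights grows as tau moves away from 0
c(klDistance(w0), klDistance(wRight))
```

Each value of tau gives one set of weights, and hence one weighted statistic and one tilted bootstrap distribution.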

Assuming that the weighted statistic can be calculated, the most generally useful plot is of the functional (such as quantiles of the weighted bootstrap distribution) against the weighted statistics.

For example, when considering use of an inference procedure such as t-tests or confidence intervals that assumes that standard errors are independent of the statistic, it is useful to check that assumption by plotting either "Centered Quantiles" or "SE" (standard error) against the statistic.

The implementation here doesn't actually require bootstrap sampling with unequal probabilities. Instead it uses importance sampling reweighting to obtain a weighted bootstrap distribution that approximates the shape that would be obtained from sampling with probabilities equal to the weights.
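
The reweighting idea can be sketched as follows, under stated assumptions: an ordinary bootstrap of the mean, target weights from exponential tilting with an arbitrary tau, and effective sample size measured as 1/sum(weights^2). This illustrates the importance sampling identity only; it is not the internal code, and all variable names are hypothetical.

```r
set.seed(1)
x <- qexp(ppoints(30))
n <- length(x)
B <- 1000

# Ordinary bootstrap: B resamples drawn with equal probabilities 1/n.
# Each column of indices holds the observation numbers for one resample.
indices <- matrix(sample(n, n * B, replace = TRUE), nrow = n)
replicates <- colMeans(matrix(x[indices], nrow = n))

# Target weights on the observations (exponential tilting, tau = 0.15 assumed)
w <- exp(0.15 * (x - mean(x)))
w <- w / sum(w)

# Importance weight of a resample: its probability under sampling with
# probabilities w, divided by its probability under equal probabilities,
# which is the product of n * w over the observations in the resample.
logRatio <- colSums(matrix(log(n * w[indices]), nrow = n))
impWeights <- exp(logRatio)
impWeights <- impWeights / sum(impWeights)

# Summaries of the reweighted bootstrap distribution
tiltedMean <- sum(impWeights * replicates)
tiltedSE <- sqrt(sum(impWeights * (replicates - tiltedMean)^2))

# Effective number of resamples under unequal weights (a guideline only)
effectiveB <- 1 / sum(impWeights^2)

c(weightedStatistic = sum(w * x), tiltedMean = tiltedMean,
  tiltedSE = tiltedSE, effectiveB = effectiveB)
```

Values of tau further from 0 make the importance weights more uneven and reduce effectiveB, which is the source of the noise in the tails noted in EXAMPLES.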

This function attempts to recalculate the bootstrap statistic by adding weights determined by the tilting parameters.

The name "Splus.resamp.weights" is reserved for internal use. To avoid naming conflicts, that name cannot be used as a variable name in the data argument to boot.obj, if data is a data frame.

When the bootstrap statistic is an expression, for example mean(x), a modified expression mean(x, weights = Splus.resamp.weights) is created. Only calls to functions that have an argument named weights are modified; e.g. sum(x)/length(x) would fail because sum does not have a weights argument. If the expression calls a function with a "hidden" weights argument (for example, one that accepts weights through its ... list), use the modifiedStatistic argument to specify the call, e.g. modifiedStatistic = myFun(x, weights = Splus.resamp.weights).

An expression such as mean(y[a==1]) is converted to mean(y[a==1], weights = Splus.resamp.weights), which will fail because the weights vector is not subscripted along with y. In such cases, pass a function that performs the desired calculations, or use
modifiedStatistic = mean(y[a==1], weights = Splus.resamp.weights[a==1])

(You must use the name Splus.resamp.weights in modifiedStatistic.) If modifiedStatistic is not provided and the bootstrap statistic cannot be successfully modified to accommodate weights, there are two possibilities: if passObserved = T, an error is signalled, since the recalculated statistic values are required by functional; if passObserved = F, the statistic component is simply omitted from the output.
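
The subscripting problem can be illustrated outside the bootstrap functions. In this sketch weightedMean is a hypothetical stand-in for a statistic with a weights argument, the data are made up, and the vector w plays the role of Splus.resamp.weights (one weight per original observation).

```r
# Hypothetical stand-in for a statistic that accepts weights
weightedMean <- function(x, weights) {
  if (length(weights) != length(x)) stop("weights and x differ in length")
  sum(weights * x) / sum(weights)
}

y <- c(2, 4, 6, 8, 10, 12)
a <- c(1, 0, 1, 1, 0, 1)
w <- c(0.1, 0.2, 0.1, 0.3, 0.2, 0.1)  # made-up weights, one per observation

# Weights subscripted the same way as y: lengths agree
ok <- weightedMean(y[a == 1], weights = w[a == 1])

# Weights not subscripted: six weights for four values, so the call fails
bad <- try(weightedMean(y[a == 1], weights = w), silent = TRUE)
```

Here ok is a number, while bad holds the error from the mismatched lengths.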

The bootstrap statistic should be "functional"; that is, the observed value of the statistic should be equal to the value computed with equal weights. A warning occurs if this is not the case.
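
A short sketch of what "functional" means here, using two hypothetical weighted statistics (wMean and wVar are not part of the library): the mean agrees with its equal-weights version, while var, with its divisor n-1, does not.

```r
# Hypothetical weighted statistics; weights are assumed to sum to 1
wMean <- function(x, weights = rep(1/length(x), length(x))) sum(weights * x)
wVar <- function(x, weights = rep(1/length(x), length(x))) {
  m <- sum(weights * x)
  sum(weights * (x - m)^2)
}

x <- qexp(ppoints(30))
n <- length(x)
equal <- rep(1/n, n)

# The mean is functional: equal weights reproduce the observed value
meanGap <- wMean(x, equal) - mean(x)

# var(x) uses divisor n-1, so it differs from the equal-weights value
# by the factor (n-1)/n and is not functional in this sense
varGap <- wVar(x, equal) - var(x)

c(meanGap = meanGap, varGap = varGap)
```

A statistic like var(x) would be expected to trigger the warning described above, since its observed value differs from the value computed with equal weights.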

REFERENCES:

Hesterberg, T.C. (2001), "Bootstrap Tilting Diagnostics" Proceedings of the Statistical Computing Section, American Statistical Association (CD-ROM).

SEE ALSO:

bootstrap creates the bootstrap objects. plots the result. performs diagnostics similar to those here. contains example functionals, including the built-in options. is used to calculate L, a linear approximation to the statistic of interest. creates reweighted bootstrap distributions (using importance sampling identities) for arbitrary weights. does confidence limits based on bootstrap tilting. tiltWeights creates sets of weights using exponential or maximum likelihood tilting, using L. does asymptotic calculations for standard error or bias based on the linear approximation and a directional quadratic approximation.

EXAMPLES:

x <- qexp(ppoints(30)) 
boot <- bootstrap(x, mean, seed=1, save.indices=T) 
tab1 <- tiltAfterBootstrap(boot) 
tab1 # k = 10.  For 10 different sets of weights, this 
     # shows the weighted mean, five quantiles of the bootstrap 
     # distribution, and effective sample sizes.  This loses 
     # effective sample size in the tails unless importance sampling 
     # is used, see below or doc/tutorial.ssc 
plot(tab1) 
tab2 <- tiltAfterBootstrap(boot, "Centered Quantiles") 
plot(tab2) 
# The bootstrap distributions are wider when the statistic 
# increases (but with some noise on the right due to small effective 
# sample sizes) 
 
tab3 <- tiltAfterBootstrap(boot, functional = "SE") 
plot(tab3) 
# standard error (standard deviation of the bootstrap distribution) 
# increases as the statistic increases (but with noise on the right) 
# Could use this to estimate a variance-stabilizing relationship. 
 
# Use importance sampling to improve the effective sample sizes on ends 
taus <- saddlepointPSolve(c(.05, .95), L=x) 
boot2 <- bootstrap(x, mean, B = c(400, 300, 300), 
    # 400 observations with equal probabilities, 300 tilted each direction 
    sampler.prob = list(NULL,  
      tiltWeights(taus[1], L=x),  # tilt left 
      tiltWeights(taus[2], L=x)), # tilt right 
    seed=1, save.indices=T) 
tab4 <- tiltAfterBootstrap(boot2) 
tab4 # effective sample sizes are now all over 400 
plot(tab4) 
tab5 <- tiltAfterBootstrap(boot2, "SE") 
plot(tab5)