Bootstrap Inference

In statistical analysis, the researcher is usually interested in obtaining not only a point estimate of a statistic but also an estimate of the variation in this point estimate and a confidence interval for the true value of the parameter. For example, a researcher may calculate not only a sample mean but also the standard error of the mean and a confidence interval for the mean.

The traditional methods for calculating standard errors and confidence intervals generally rely upon a statistic, or some known transformation of it, being asymptotically normally distributed. If this normality assumption does not hold, the traditional methods may be inaccurate.

Resampling techniques, such as the bootstrap and jackknife, provide estimates of the standard error, confidence intervals, and distributions for any statistic. To use these procedures, the user must supply the name of the data set under examination and an S-PLUS function or expression that calculates the statistic of interest.
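The resampling idea can be illustrated outside of S-PLUS. The following is a minimal Python sketch, not the S-PLUS implementation: the function name bootstrap_replicates and the sample values are made up for illustration. It draws resamples with replacement, evaluates the statistic on each one, and uses the spread of the replicates as a standard error and an empirical 95% interval.

```python
import random
import statistics

def bootstrap_replicates(data, statistic, n_resamples=1000, seed=0):
    """Evaluate the statistic on n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]

# Made-up illustrative values, not taken from any S-PLUS data set.
sample = [21.0, 24.5, 19.8, 30.2, 27.1, 22.4, 25.9, 33.0, 18.7, 26.3]

replicates = bootstrap_replicates(sample, statistics.mean)

# Bootstrap standard error: standard deviation of the replicates.
standard_error = statistics.stdev(replicates)

# Empirical 95% interval: with n=40 the cut points fall at 2.5%, 5%, ..., 97.5%.
cuts = statistics.quantiles(replicates, n=40)
lower, upper = cuts[0], cuts[-1]
print(standard_error, lower, upper)
```

Any function that maps a data set to a number can be passed in place of statistics.mean.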

The Bootstrap Inference dialog performs bootstrap inference for a specified statistic and data set. See the Guide to Statistics for details.

To perform bootstrap inference

Choose Statistics > Resample > Bootstrap. The dialog shown below appears.

Model page

__image\boot1.gif

In the Bootstrap Inference dialog, the Model page has the following options:

Data

Data Set

Specify the data to bootstrap or jackknife. This may be a vector, matrix, or data set.

Statistic to Estimate

Expression

This field applies to Bootstrap and Jackknife inference. In the Expression field, specify the expression describing the statistic to be bootstrapped or jackknifed. It may be a function that accepts data as the first argument and returns a vector or matrix, or a call referring to the data that evaluates to a vector or matrix.

For example, to bootstrap or jackknife the regression coefficients for regressing Mileage on Weight in the fuel.frame data, use the expression coef(lm(Mileage~Weight, fuel.frame)) and specify fuel.frame as the Data Set. To bootstrap or jackknife the mean of Mileage, use the expression mean(Mileage).
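To show what bootstrapping a vector-valued statistic involves, here is a rough Python sketch of resampling whole observations and refitting a straight line each time. It is an analogy to the coef(lm(...)) expression, not S-PLUS code. The helper names ols_coefficients and bootstrap_rows are hypothetical, and the (weight, mileage) pairs are invented rather than taken from fuel.frame.

```python
import random

def ols_coefficients(rows):
    """Return (intercept, slope) from least squares of y on x for (x, y) rows."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    # sxx is zero only if a resample repeats a single row n times,
    # which is vanishingly unlikely with ten distinct rows.
    slope = sxy / sxx
    return (mean_y - slope * mean_x, slope)

def bootstrap_rows(rows, statistic, n_resamples=500, seed=0):
    """Resample whole rows with replacement and evaluate the statistic on each."""
    rng = random.Random(seed)
    n = len(rows)
    return [statistic(rng.choices(rows, k=n)) for _ in range(n_resamples)]

# Invented (weight, mileage) pairs for illustration only.
rows = [(1900, 35.1), (2100, 33.0), (2300, 31.8), (2500, 29.7), (2700, 28.6),
        (2900, 26.5), (3100, 25.4), (3300, 23.2), (3500, 22.1), (3700, 20.3)]

coefficient_replicates = bootstrap_rows(rows, ols_coefficients)
slopes = [slope for _, slope in coefficient_replicates]
print(min(slopes), max(slopes))
```

Each replicate is a pair of coefficients, so the result is effectively a matrix with one row per resample and one column per coefficient.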

Save Model Object

Save As

In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.

Save Resampling Indices

Select this to save the matrix of resampling indices describing which observations appear in each resample.
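As a rough illustration of what such an index matrix contains, the Python sketch below stores one list of observation indices per resample. The function name resampling_indices is hypothetical, the layout S-PLUS uses may differ, and Python indices start at 0 rather than 1.

```python
import random

def resampling_indices(n_observations, n_resamples, seed=0):
    """Return one list of observation indices (drawn with replacement) per resample."""
    rng = random.Random(seed)
    return [[rng.randrange(n_observations) for _ in range(n_observations)]
            for _ in range(n_resamples)]

# Made-up illustrative values.
sample = [21.0, 24.5, 19.8, 30.2, 27.1, 22.4]

indices = resampling_indices(len(sample), n_resamples=5)

# Any resample can be rebuilt later from its saved indices.
third_resample = [sample[i] for i in indices[2]]
print(indices[2], third_resample)
```

Saving the indices makes it possible to see exactly which observations contributed to each replicate, which is what jackknife-after-bootstrap relies on.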

Options page

__image\boot2.gif

In the Bootstrap Inference dialog, the Options page has the following options:

Resampling Options

Number of Resamples

Specify the number of replicates to draw. The default is 1000, the minimum number generally recommended for estimating percentiles.

Grouping Variable

Specify a grouping variable to use when resampling observations. If a grouping variable is specified, sampling is done within each group, so that the subgroup proportions in each resample match those of the original sample. This provides inference conditional on the subgroup sizes.
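Sampling within groups can be sketched in Python as follows. This is only an illustration of the idea, with a hypothetical function name (stratified_resample) and invented data. Each group is resampled separately, so every resample keeps the original group sizes.

```python
import random
from collections import defaultdict

def stratified_resample(values, groups, rng):
    """Resample with replacement inside each group, preserving group sizes."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    resampled_values, resampled_groups = [], []
    for group, members in by_group.items():
        resampled_values.extend(rng.choices(members, k=len(members)))
        resampled_groups.extend([group] * len(members))
    return resampled_values, resampled_groups

# Invented data: three observations in group "a", four in group "b".
values = [5.1, 4.8, 5.5, 9.9, 10.4, 10.1, 9.7]
groups = ["a", "a", "a", "b", "b", "b", "b"]

rng = random.Random(0)
resampled_values, resampled_groups = stratified_resample(values, groups, rng)
print(resampled_groups)
print(resampled_values)
```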

Random Number Seed

Specify an integer between 0 and 1000 to set the random number seed to a desired value. Specifying the seed makes it possible to obtain identical results from multiple bootstrap runs.
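The effect of fixing the seed can be demonstrated with a short Python sketch. It uses Python's own random number generator, not the S-PLUS one, and the function name bootstrap_means and the data are invented. Two runs with the same seed give identical replicates, while a different seed gives a different set.

```python
import random
import statistics

def bootstrap_means(data, n_resamples, seed):
    """Bootstrap replicates of the mean from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [statistics.mean(rng.choices(data, k=len(data)))
            for _ in range(n_resamples)]

# Made-up illustrative values.
sample = [21.0, 24.5, 19.8, 30.2, 27.1, 22.4, 25.9, 33.0, 18.7, 26.3]

first_run = bootstrap_means(sample, 200, seed=42)
second_run = bootstrap_means(sample, 200, seed=42)
other_run = bootstrap_means(sample, 200, seed=7)

print(first_run == second_run)  # same seed
print(first_run == other_run)   # different seed
```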

Block Size

Specify the block size to use when calling the sampling function. See the language help for bootstrap for details.

Print Iteration Numbers

Select this to display iteration progress by printing iteration number ranges. Because of the timing of output display, this option is less useful from the dialog than from a command line call to the function.

Assign Resampled Data to Frame 1

Select this to assign the resampled data to frame 1 as each sample is generated. See the language help for bootstrap or jackknife for details.

Results page

__image\boot3.gif

In the Bootstrap Inference dialog, the Results page has the following options:

Printed Results

Summary Statistics

Select this to print basic summaries such as the bootstrap or jackknife estimates of bias, mean, and standard error.
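These summaries follow directly from the replicates. The Python sketch below is illustrative only, with a hypothetical function name (bootstrap_summary) and invented data. It takes the bias as the mean of the replicates minus the observed statistic, and the standard error as the standard deviation of the replicates. The population variance is used as the statistic because it is known to be biased downward, so the bootstrap bias estimate should come out negative.

```python
import random
import statistics

def bootstrap_summary(data, statistic, n_resamples=1000, seed=0):
    """Return the observed statistic with bootstrap mean, bias, and standard error."""
    rng = random.Random(seed)
    n = len(data)
    observed = statistic(data)
    replicates = [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]
    replicate_mean = statistics.mean(replicates)
    return {
        "observed": observed,
        "mean": replicate_mean,
        "bias": replicate_mean - observed,    # estimated bias of the statistic
        "se": statistics.stdev(replicates),   # bootstrap standard error
    }

# Made-up illustrative values.
sample = [21.0, 24.5, 19.8, 30.2, 27.1, 22.4, 25.9, 33.0, 18.7, 26.3]

summary = bootstrap_summary(sample, statistics.pvariance)
print(summary)
```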

Empirical Percentiles

Select this to print empirical percentiles for the statistic under consideration.

BCa Percentiles

Select this to print BCa (bias-corrected and accelerated) percentiles for the statistic under consideration. BCa percentiles are generally more accurate than empirical percentiles.
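For readers curious how the bias correction and acceleration enter, the following Python sketch applies the standard textbook BCa adjustment. It is an assumption that this matches what S-PLUS computes in every detail; the function name bca_percentiles and the data are invented. The requested percentile levels are shifted using a bias-correction term z0 and an acceleration term estimated from leave-one-out values, and the shifted levels are then read off the sorted replicates.

```python
import math
import random
import statistics

def bca_percentiles(data, statistic, levels, n_resamples=1000, seed=0):
    """Return BCa-adjusted percentiles of the bootstrap replicates."""
    rng = random.Random(seed)
    n = len(data)
    observed = statistic(data)
    replicates = sorted(statistic(rng.choices(data, k=n))
                        for _ in range(n_resamples))
    normal = statistics.NormalDist()

    # Bias correction from the share of replicates below the observed value.
    # (inv_cdf fails if that share is exactly 0 or 1.)
    proportion_below = sum(r < observed for r in replicates) / n_resamples
    z0 = normal.inv_cdf(proportion_below)

    # Acceleration from leave-one-out (jackknife) values of the statistic.
    jackknife = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = statistics.mean(jackknife)
    diffs = [jack_mean - value for value in jackknife]
    acceleration = (sum(d ** 3 for d in diffs)
                    / (6 * sum(d ** 2 for d in diffs) ** 1.5))

    results = []
    for level in levels:
        z = z0 + normal.inv_cdf(level)
        adjusted = normal.cdf(z0 + z / (1 - acceleration * z))
        # Nearest order statistic at the adjusted level.
        k = min(n_resamples - 1, max(0, math.ceil(adjusted * n_resamples) - 1))
        results.append(replicates[k])
    return results

# Made-up illustrative values.
sample = [21.0, 24.5, 19.8, 30.2, 27.1, 22.4, 25.9, 33.0, 18.7, 26.3]

lower, upper = bca_percentiles(sample, statistics.mean, [0.025, 0.975])
print(lower, upper)
```

When z0 and the acceleration are both zero, the adjusted levels equal the requested levels and the result reduces to the empirical percentiles.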

Correlation Matrix of Estimates

Select this to print the correlation matrix for the estimates. Note that this is only relevant if the statistic under consideration is a vector, such as a vector of regression coefficients.

Percentile Options

Percentile Levels

Specify a vector of percentile levels at which to evaluate the empirical or BCa percentiles.

Plot page

__image\boot4.gif

In the Bootstrap Inference dialog, the Plot page has the following options:

Plots

Distribution of Replicates

Select this to plot the distribution of the replicates for each statistic of interest.

Normal Quantile-Quantile

Select this to plot a normal quantile-quantile plot for each statistic of interest.

Jack After Boot page

Jackknife-after-bootstrap is a technique applied to the results of a bootstrap analysis, to get estimates of variability and influence for some functional of the distribution of bootstrap replicates. It is useful for determining which observations most influence the bootstrap results, and for getting estimates of standard error for bootstrap statistics.
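The core of the technique can be sketched in a few lines of Python. This is a simplified illustration rather than the jack.after.bootstrap implementation, and the function name jack_after_boot and the data are invented. For each observation, the functional is recomputed using only the bootstrap replicates whose resamples did not include that observation. A large change from the full value indicates an influential observation.

```python
import random
import statistics

def jack_after_boot(data, statistic, functional, n_resamples=1000, seed=0):
    """Return the functional on all replicates and on each leave-one-out subset."""
    rng = random.Random(seed)
    n = len(data)
    index_sets, replicates = [], []
    for _ in range(n_resamples):
        indices = [rng.randrange(n) for _ in range(n)]
        index_sets.append(set(indices))
        replicates.append(statistic([data[i] for i in indices]))
    full_value = functional(replicates)
    leave_out_values = []
    for i in range(n):
        # Keep only replicates from resamples that never drew observation i.
        kept = [r for r, used in zip(replicates, index_sets) if i not in used]
        leave_out_values.append(functional(kept))
    return full_value, leave_out_values

# Made-up values; the last observation is a deliberate outlier.
sample = [21.0, 24.5, 19.8, 30.2, 27.1, 22.4, 25.9, 33.0, 18.7, 60.0]

# Functional: standard error of the mean (standard deviation of replicates).
full_se, leave_out_se = jack_after_boot(sample, statistics.mean, statistics.stdev)
influence = [value - full_se for value in leave_out_se]
print(full_se)
print(influence)
```

In this example the outlier should show the largest drop in standard error when it is left out, which is the kind of pattern an influence plot makes visible.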

__image\boot5.gif

In the Bootstrap Inference dialog, the Jack After Boot page has the following options:

Jackknife After Bootstrap

Functional

Specify the functional to apply to the distribution of replicates. This may be Mean, Bias, SE, or the name of a function such as max.

Results

Print Results

Select this to print the jackknife-after-bootstrap summaries.

Save In

Enter the name for the object in which to save the jackknife-after-bootstrap results. If an object with this name already exists, its contents are overwritten.

Plots

Influence Plot

Select this to plot a jackknife-after-bootstrap influence plot indicating the degree of influence of each observation on the bootstrap results.

Related programming language functions

bootstrap, jack.after.bootstrap