bootstrap2(data, statistic, treatment, data2, ratio = F, B = 1000,
           group, subject, seed = .Random.seed,
           trace = resampleOptions()$trace,
           save.group, save.subject, save.treatment, L = NULL,
           twoSample.args = NULL, ...)
"data" must not be used for any
column name.
Alternatively it may be an expression such as mean(x,trim=.2).
If data is given by name (e.g. data=x) then use that name in the
expression, otherwise (e.g. data=air[,4]) use the name "data" in the
expression (e.g. mean(data,trim=.2)). An exception to this rule is when
argument data2 is used. In that case, you must use the name "data" in
the expression, regardless of whether argument data or data2 is given
by name. In any case, the name "data" is reserved for use to refer to
the data to be bootstrapped, and should not be used in statistic to
refer to any other object. If data is a data frame, the expression may
involve variables in the data frame.
For examples see
.
data. This must have two unique values, which determine the two
samples to be compared. If data is a data frame, this may be a variable
in the data frame, or an expression involving such variables. One of
treatment or data2 (but not both) must be used.
data. Observations in data are taken to be one sample, and those in
data2 are taken to be the other. If data2 is a matrix or data frame, it
must have the same number of columns (and column names, if any) as
data. One of treatment or data2 (but not both) must be used.
FALSE (the default) then bootstrap the difference in statistics between
the two samples; if TRUE then bootstrap the ratio.
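The difference-versus-ratio choice described above can be illustrated with a minimal base R sketch. This is not the library's implementation: twoSampleBoot is a hypothetical helper written only for illustration, and it assumes each sample is a plain numeric vector that is resampled independently with replacement.

```r
# Hypothetical helper (not part of the resample library): resample each
# sample independently, then combine the two statistics by difference
# (ratio = FALSE) or by ratio (ratio = TRUE).
twoSampleBoot <- function(x1, x2, statistic = mean, ratio = FALSE, B = 1000) {
  combine <- function(a, b) if (ratio) a / b else a - b
  observed <- combine(statistic(x1), statistic(x2))
  replicates <- numeric(B)
  for (b in seq_len(B)) {
    s1 <- x1[sample(length(x1), replace = TRUE)]  # resample first sample
    s2 <- x2[sample(length(x2), replace = TRUE)]  # resample second sample
    replicates[b] <- combine(statistic(s1), statistic(s2))
  }
  list(observed = observed, replicates = replicates)
}

set.seed(1)
x1 <- rnorm(20, mean = 5)
x2 <- rnorm(15, mean = 4)
diffBoot  <- twoSampleBoot(x1, x2)                # difference in means
ratioBoot <- twoSampleBoot(x1, x2, ratio = TRUE)  # ratio of means
```

The replicates vector plays the same role as the replicates component of the returned object, although the random sampling differs from what bootstrap2 does.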
data if treatment supplied, or in data and data2), for stratified
sampling. Within each of the two samples defined by treatment or data2,
sampling is done separately for each stratum (determined by unique
values of the group vector). If data is a data frame and treatment is
used, this may be a variable in the data frame, or an expression
involving such variables.
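As a rough illustration of sampling separately for each stratum, the following base R sketch draws resampled row indices within each level of a group vector. The helper stratifiedIndices is hypothetical and is not the library's internal sampler; in bootstrap2 this would happen within each of the two samples.

```r
# Hypothetical helper: resample indices with replacement separately
# within each stratum, so every stratum keeps its original size.
stratifiedIndices <- function(group) {
  idx <- seq_along(group)
  byStratum <- lapply(split(idx, group), function(i) {
    i[sample.int(length(i), replace = TRUE)]
  })
  unlist(byStratum, use.names = FALSE)
}

set.seed(2)
g <- rep(1:2, c(10, 20))   # stratum sizes 10 and 20
i <- stratifiedIndices(g)
table(g[i])                # stratum sizes are preserved in the resample
```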
data is a data frame and treatment is used, this may be a variable in
the data frame, or an expression involving such variables. This must be
nested within treatment, and within group, if group is used (all
observations for a subject must be in the same treatment or group
sample). bootstrap makes resampled subjects unique before calling the
statistic.
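One way to picture resampling by subject is sketched below in base R. It is an assumption about the general idea, not the library's code: resampleSubjects and the newSubject column are hypothetical names, and relabelling each draw is just one possible way to make repeated subjects distinct.

```r
# Hypothetical helper: draw whole subjects with replacement and keep all
# of their observations; each draw gets a fresh label so that a subject
# drawn twice is treated as two distinct subjects.
resampleSubjects <- function(data, subject) {
  ids <- unique(subject)
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(k) {
    rows <- data[subject == drawn[k], , drop = FALSE]
    rows$newSubject <- k   # unique label for this draw
    rows
  })
  do.call(rbind, pieces)
}

set.seed(3)
d <- data.frame(y = rnorm(12))
subj <- rep(c("a", "b", "c", "d"), each = 3)
r <- resampleSubjects(d, subj)
```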
group, subject and treatment vectors. Default is TRUE if number of
observations is <= 10000 or if data2 supplied, FALSE otherwise.
If not saved these can generally be recreated by
when needed if treatment was supplied, but not if data2 was supplied.
"jackknife",
"influence",
"regression",
"ace", or
"choose".
See
for further information and references.
Or it may be a matrix with n (number of observations or subjects in
data, or in data and data2) rows and p (length of the returned
statistic) columns; in this case the L values for the second treatment
group (or data2) should be -1 times the value they would have for e.g.
bootstrap(data2, statistic).
bootstrap2 which inherits from bootstrap and resamp. This has
components call, observed, replicates, estimate, B, n, dim.obs,
treatment, parent.frame, seed.start, seed.end, and bootstrap.objects.
It may have components ratio, group, subject, L, Lstar, indices,
compressedIndices, and others.
See
for a description of most components. Components particularly relevant
are:
p (the number of variables in data), containing the difference in the
statistic computed on each of the two samples, for the original data.
B by p, containing the difference in the statistic computed on each of
the two samples, for each resample.
p rows and columns "Mean", "Bias" and "SE".
TRUE.
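The "Mean", "Bias" and "SE" columns can be reproduced from the observed and replicates components under the usual bootstrap definitions. The sketch below assumes those definitions (bias as mean of replicates minus observed, SE as the standard deviation of the replicates); summarizeReplicates is a hypothetical helper, not a function in the library.

```r
# Hypothetical helper: summarize a B-by-p matrix of replicates against
# the observed statistic (length p), using standard bootstrap definitions.
summarizeReplicates <- function(observed, replicates) {
  replicates <- as.matrix(replicates)   # B rows, p columns
  m <- colMeans(replicates)
  data.frame(Mean = m,
             Bias = m - observed,       # assumed: mean of replicates - observed
             SE   = apply(replicates, 2, sd))
}

reps <- cbind(c(1, 2, 3), c(2, 4, 6))   # toy replicates: B = 3, p = 2
est <- summarizeReplicates(observed = c(1.5, 4), replicates = reps)
est
```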
bootstrap2 causes creation of the dataset .Random.seed if it does not
already exist, otherwise its value is updated.
This function resamples within each of the two treatment samples
separately. The results are logically equivalent to
    bootstrap(data, statistic(data[treatment1,]) - statistic(data[treatment2,]),
              group = treatment, ...)
although different random sampling is used, and only bootstrap2
supports stratified sampling.
Internally, bootstrap2 calls bootstrap twice, once for each treatment
value.
For comparison, and permute the entire data set and then divide it into two samples before computing the statistic on each sample.
# treatment argument: two samples defined by a logical vector
set.seed(0)
x <- matrix(rnorm(15*3), 15)
treatment <- rep(c(T,F), length=15)
bootstrap2(x, statistic = colMeans, treatment = treatment, seed = 1)

# data2 and group arguments
set.seed(10)
data1 <- data.frame(x = runif(30), g = rep(1:2, c(10, 20)))
data2 <- data.frame(x = runif(20), g = rep(1:2, 10))
boot <- bootstrap2(data = data1, statistic = mean(x), data2 = data2,
                   group = g, L="regression")
boot

# twoSample.args
boot <- update(boot,
               twoSample.args = list( list(seed=5), list(seed=6)))
boot
boot$bootstrap.objects[[1]]