bootstrap2(data, statistic, treatment, data2, ratio = F, B = 1000, group, subject, seed = .Random.seed, trace = resampleOptions()$trace, save.group, save.subject, save.treatment, L = NULL, twoSample.args = NULL, ...)
"data"
must not be used for any column name. Alternatively it may be an expression such as mean(x,trim=.2). If data is given by name (e.g. data=x) then use that name in the expression, otherwise (e.g. data=air[,4]) use the name "data" in the expression (e.g. mean(data,trim=.2)). An exception to this rule is when argument data2 is used. In that case, you must use the name "data" in the expression, regardless of whether argument data or data2 is given by name. In any case, the name "data" is reserved for use to refer to the data to be bootstrapped, and should not be used in statistic to refer to any other object. If data is a data frame, the expression may involve variables in the data frame.
For examples see
.
data. This must have two unique values, which determine the two samples to be compared. If data is a data frame, this may be a variable in the data frame, or an expression involving such variables. One of treatment or data2 (but not both) must be used.
data. Observations in data are taken to be one sample, and those in data2 are taken to be the other. If data2 is a matrix or data frame, it must have the same number of columns (and column names, if any) as data. One of treatment or data2 (but not both) must be used.
FALSE (the default) then bootstrap the difference in statistics between the two samples; if TRUE then bootstrap the ratio.
data if treatment supplied, or in data and data2), for stratified sampling. Within each of the two samples defined by treatment or data2, sampling is done separately for each stratum (determined by unique values of the group vector). If data is a data frame and treatment is used, this may be a variable in the data frame, or an expression involving such variables.
data is a data frame and treatment is used, this may be a variable in the data frame, or an expression involving such variables. This must be nested within treatment, and within group, if group is used (all observations for a subject must be in the same treatment or group sample). bootstrap makes resampled subjects unique before calling the statistic.
group, subject and treatment vectors. Default is TRUE if number of observations is <= 10000 or if data2 supplied, FALSE otherwise. If not saved these can generally be recreated by when needed if treatment was supplied, but not if data2 was supplied.

"jackknife", "influence", "regression", "ace", or "choose". See for further information and references. Or it may be a matrix with n (number of observations or subjects in data, or in data and data2) rows and p (length of the returned statistic) columns; in this case the L values for the second treatment group (or data2) should be -1 times the value they would have for e.g. bootstrap(data2, statistic).
bootstrap2 which inherits from bootstrap and resamp. This has components call, observed, replicates, estimate, B, n, dim.obs, treatment, parent.frame, seed.start, seed.end, and bootstrap.objects. It may have components ratio, group, subject, L, Lstar, indices, compressedIndices, and others. See for a description of most components. Components particularly relevant are:
p (the number of variables in data), containing the difference in the statistic computed on each of the two samples, for the original data.
B by p, containing the difference in the statistic computed on each of the two samples, for each resample.
p rows and columns "Mean", "Bias" and "SE".
TRUE.
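
As a rough illustration of how a summary with columns "Mean", "Bias" and "SE" relates to the observed value and the replicates, the following base R sketch computes such a table from a B by p matrix of replicates. It assumes the usual bootstrap conventions (bias as the mean of the replicates minus the observed value, SE as the standard deviation of the replicates); the text above does not spell these formulas out, and the function name summarizeReplicates and the toy numbers are hypothetical.

```r
# Hypothetical helper, not part of any library: summarize bootstrap
# replicates under the assumed conventions described in the lead-in.
summarizeReplicates <- function(observed, replicates) {
  replicates <- as.matrix(replicates)   # B rows, p columns
  stopifnot(ncol(replicates) == length(observed))
  means <- colMeans(replicates)
  data.frame(Mean = means,
             Bias = means - observed,           # assumed definition of bias
             SE   = apply(replicates, 2, sd))   # assumed definition of SE
}

# Toy input: p = 2 statistics, B = 3 replicates (made-up numbers).
observed   <- c(0.5, -1.0)
replicates <- cbind(c(0.4, 0.6, 0.8), c(-1.5, -1.0, -0.5))
est <- summarizeReplicates(observed, replicates)
est
```

The result has one row per element of the statistic, which matches the "p rows" layout described above.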
bootstrap2 causes creation of the dataset .Random.seed if it does not already exist, otherwise its value is updated.
This function resamples within each of the two treatment samples separately. The results are logically equivalent to
bootstrap(data, statistic(data[treatment1,]) - statistic(data[treatment2,]), group = treatment, ...)
although different random sampling is used, and only bootstrap2 supports stratified sampling.
Internally, bootstrap2 calls twice, once for each treatment value.
For comparison, and permute the entire data set and then divide it into two samples before computing the statistic on each sample.
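
The scheme of resampling within each treatment sample separately can be sketched in base R, without the resample library. This is only an illustration under stated assumptions, not the code that bootstrap2 runs: the helper name twoSampleBoot and its arguments are hypothetical, it handles only a numeric vector with a scalar statistic, it takes the first value found in treatment as the first sample, and it ignores group, subject and L.

```r
# Hypothetical helper, not part of any library: a minimal two-sample
# bootstrap for a numeric vector x and a two-valued treatment vector.
twoSampleBoot <- function(x, treatment, statistic = mean,
                          ratio = FALSE, B = 1000, seed = 1) {
  values <- unique(treatment)
  stopifnot(length(values) == 2, length(x) == length(treatment))
  x1 <- x[treatment == values[1]]
  x2 <- x[treatment == values[2]]
  # Difference of statistics by default, ratio if requested.
  combine <- if (ratio) function(a, b) a / b else function(a, b) a - b

  set.seed(seed)
  replicates <- numeric(B)
  for (b in seq_len(B)) {
    # Resample each treatment sample separately, with replacement,
    # keeping the original sample sizes.
    s1 <- x1[sample.int(length(x1), replace = TRUE)]
    s2 <- x2[sample.int(length(x2), replace = TRUE)]
    replicates[b] <- combine(statistic(s1), statistic(s2))
  }
  list(observed = combine(statistic(x1), statistic(x2)),
       replicates = replicates)
}

set.seed(0)
x <- rnorm(15)
treatment <- rep(c(TRUE, FALSE), length.out = 15)
fit <- twoSampleBoot(x, treatment, B = 200)
fit$observed
```

A permutation approach would instead shuffle the pooled data and then split it into two samples, so the two schemes answer different questions: the bootstrap here estimates the sampling variability of the difference, while permutation generates the distribution under no treatment effect.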
For more details on many arguments see: .
Confidence intervals: , , , , .
For a hypothesis test comparing two samples, see: , .
For an annotated list of functions in the package, including other high-level resampling functions, see: .
set.seed(0)
x <- matrix(rnorm(15*3), 15)
treatment <- rep(c(T,F), length=15)
bootstrap2(x, statistic = colMeans, treatment = treatment, seed = 1)

# data2 and group arguments
set.seed(10)
data1 <- data.frame(x = runif(30), g = rep(1:2, c(10, 20)))
data2 <- data.frame(x = runif(20), g = rep(1:2, 10))
boot <- bootstrap2(data = data1, statistic = mean(x), data2 = data2, group = g, L="regression")
boot

# twoSample.args
boot <- update(boot, twoSample.args = list(list(seed=5), list(seed=6)))
boot
boot$bootstrap.objects[[1]]