Permutation test for comparing means of two samples

DESCRIPTION:

Permutation test for comparing means of two samples, for one or more variables.

USAGE:

permutationTestMeans(data, treatment, data2, B = 999, 
    alternative = "two.sided",  
    ratio = F, 
    paired = F, group = NULL, 
    combine = NULL, combineGroup = F, 
    combinationFunction = combinePValues.Fisher, 
    seed = .Random.seed, diffMeans = T, 
    label, statisticNames) 

REQUIRED ARGUMENTS:

data
numerical vector or matrix, or data frame.

OPTIONAL ARGUMENTS:

treatment
vector with as many observations as data. If data is a data frame and the name of treatment is a column in data, then treatment is extracted from the data frame. This must have two unique values, which determine two samples to be compared. One of treatment or data2 (but not both) must be used.
data2
numerical vector or matrix, or data frame, like data. Observations in data are taken to be one sample, and those in data2 are taken to be the other. If data2 is a matrix or data frame, it must have the same number of columns, and column names, if any, as data. One of treatment or data2 (but not both) must be used.
B
integer, number of random permutations to use. With the default value of B=999, p-values are multiples of 1/1000.
alternative
character, one of "two.sided", "greater", or "less" (may be abbreviated), indicating the type of hypothesis test to perform. This may be a vector with length equal to the number of variables in data.
ratio
logical value, if FALSE (the default) then resample the difference in means between the two samples; if TRUE then resample the ratio.
paired
logical, if TRUE then observations are paired, and observations within each pair are randomly permuted. This is equivalent to supplying group as a vector with a different value for each pair of observations. If paired is supplied then argument group is ignored.
group
vector of length equal to the number of observations in data, for stratified sampling or multiple-sample problems. Random permutations are drawn separately for each group (determined by unique values of this vector). If data is a data frame, this may be a variable in the data frame, or an expression involving such variables; if the name of group is a column in data, then group is extracted from the data frame.
combine
numerical, logical, or character vector, indicating which variables to use for computing combined p-values. Or this may be a list, each of whose elements indicate a set of variables to use.
combineGroup
logical; if T, combine p-values for individual groups. See return component "group$combineGroup-p-values" below.
combinationFunction
a function which combines p-values; the default is combinePValues.Fisher.
seed
seed for generating resampling indices. May be a legal random number seed or an integer between 0 and 1023 which is passed to set.seed.
diffMeans
logical flag, if TRUE (the default) then the statistic calculated is the difference in means between the two groups determined by the treatment argument. If FALSE the statistic is the sum of the first group. The p-values are the same either way.
label
character, if supplied is used when printing, and as the main title for plotting.
statisticNames
character vector of length equal to the number of statistics calculated; if supplied is used as the statistic names for printing and plotting.

VALUE:

An object of class permutationTestMeans which inherits from resamp. This has components call, observed, replicates, estimate, B, n, dim.obs, p-value, parent.frame, label (if supplied), defaultLabel, combined-p-value (only if p-values are combined), group (only if sampling by group), seed.start, and seed.end. See help for resamp.object for a description of most components. Components particularly relevant are:
observed
vector of length p, containing the difference in means for the two samples determined by the treatment vector, for the original data.
replicates
matrix with dimensions B by p, containing the differences in means between the two samples in each permutation sample.
estimate
data.frame with p rows and columns "alternative" and "p-value", where p is the number of variables (excluding treatment and group).
combined-p-value
vector of combined p-values, of length equal to the number of combinations requested by argument combine.
group
a list, with components
group$p-value
matrix of p-values with K rows and p columns (where K is the number of groups).
group$combined-p-value
this is present only if p-values are combined across variables: matrix of combined p-values, one row for each combination and p columns.
group$combineGroup-p-values
this is present only if combineGroup=T: vector of length p; p-values for each variable obtained by a nonparametric combination across the individual-group p-values for that variable. This is more heavily influenced by the results in small groups or results from groups with small standard deviations than is the "p-value" column in estimate. This calculation is currently slow.

SIDE EFFECTS:

The function permutationTestMeans causes creation of the dataset .Random.seed if it does not already exist, otherwise its value is updated.

DETAILS:

If diffMeans=FALSE, then the observed and replicates return components contain the sum of the first sample (observations with treatment==treatment[1]) rather than the difference between samples. This does not affect p-values.
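
The equivalence of the two statistics can be checked with a short sketch in base S/R code that does not call permutationTestMeans. It assumes a one-sided ("greater") alternative and the common convention p = (1 + number of replicates at least as large as the observed value)/(B + 1), which is consistent with p-values being multiples of 1/1000 when B=999; the function's internal computation may differ in detail. All names below are illustrative and not part of the package.

```r
set.seed(1)
x <- rnorm(15)                          # one variable, 15 observations
treatment <- rep(1:2, length.out = 15)  # alternating treatment labels
first <- treatment[1]                   # label of the "first" sample
B <- 999

# Difference in means (first sample minus the other) for a given labelling
diffMeansStat <- function(x, lab, first) {
  g <- lab == first
  sum(x[g]) / sum(g) - sum(x[!g]) / sum(!g)
}

# Sum of the first sample for a given labelling
sumFirstStat <- function(x, lab, first) sum(x[lab == first])

obsDiff <- diffMeansStat(x, treatment, first)
obsSum <- sumFirstStat(x, treatment, first)

repDiff <- numeric(B)
repSum <- numeric(B)
for (b in 1:B) {
  lab <- sample(treatment)              # permute labels; sample sizes are kept
  repDiff[b] <- diffMeansStat(x, lab, first)
  repSum[b] <- sumFirstStat(x, lab, first)
}

# Assumed p-value convention: (1 + count) / (B + 1)
pDiff <- (1 + sum(repDiff >= obsDiff)) / (B + 1)
pSum <- (1 + sum(repSum >= obsSum)) / (B + 1)
c(pDiff, pSum)                          # the two p-values agree
```

With fixed sample sizes the difference in means is an increasing linear function of the sum of the first sample, so both statistics order the permutations identically.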

If treatment and/or group are extracted from data, those columns are deleted. Numerical or logical subscripts in combine should refer to remaining columns of data, and the length of alternative should match the number of remaining columns.
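
For combined p-values, the default combinationFunction is combinePValues.Fisher. The sketch below shows only the textbook Fisher combining statistic, -2 times the sum of the log p-values; fisherStatistic is a hypothetical helper, not the package function, and the three p-values are made up for illustration. In the nonparametric combination methodology of Pesarin (2001), such a statistic is judged against its permutation distribution rather than a chi-squared reference.

```r
# Hypothetical helper (not the package function): Fisher's combining statistic.
# Larger values indicate stronger joint evidence against the null hypotheses.
fisherStatistic <- function(p) -2 * sum(log(p))

# Made-up p-values for three variables
p <- c(0.04, 0.20, 0.50)
stat <- fisherStatistic(p)
stat

# Classical reference, valid only if the p-values are independent:
# chi-squared with 2 * (number of p-values) degrees of freedom.
pIndep <- 1 - pchisq(stat, df = 2 * length(p))
pIndep
```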

REFERENCES:

Pesarin, F. (2001), Multivariate Permutation Tests with Applications to Biostatistics: Nonparametric Combination Methodology, Wiley, Chichester, UK. (Describes nonparametric combination methodology.)
We wish to thank Dr. Luigi Salmaso for help designing and testing this function.

SEE ALSO:

More details on many arguments, see .

Combination of p-values for multivariate statistics, or across groups in the case of : , , , .

Print, summarize, plot: , , , ,

Description of a "permutationTestMeans" object, extract parts: , , , .

Modify a "permutationTestMeans" object: .

For an annotated list of functions in the package, including other high-level resampling functions, see: .

EXAMPLES:

set.seed(0) 
x <- matrix(rnorm(15*3), 15) 
treatment <- rep(1:2, length=15) 
result <- permutationTestMeans(x, treatment = treatment, seed=1) 
result 
summary(result) 
plot(result) 
 
# combined p-values for two sets of variables (all three, then the first two)
update(result, combine = list(1:3, 1:2)) 
 
# permute within three groups, and combine p-values across groups
update(result, group = rep(c("a","b","c"), each=5), combineGroup = T) 
 
# Example using two sets of data instead of treatment vector 
y <- x + rnorm(length(x)) 
permutationTestMeans(x, data2=y) 
 
# Paired permutation test 
permutationTestMeans(x, data2=y, paired = T)
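
What the paired test permutes can be sketched without the package. For the difference in means, swapping the two observations within a pair flips the sign of that pair's difference, so the permutation distribution can be generated by random sign flips. The code below is an illustration in base S/R under that reading, with made-up data, a one-sided ("greater") alternative, and the assumed convention p = (1 + count)/(B + 1); the names are not part of the package.

```r
set.seed(2)
n <- 10
before <- rnorm(n, mean = 10)            # first member of each pair
after <- before + rnorm(n, mean = 0.5)   # second member of each pair
d <- after - before                      # within-pair differences
observed <- sum(d) / n                   # equals mean(after) - mean(before)

B <- 999
replicates <- numeric(B)
for (b in 1:B) {
  # swapping the members of pair i is the same as flipping the sign of d[i]
  signs <- sample(c(-1, 1), n, replace = TRUE)
  replicates[b] <- sum(signs * d) / n
}

pGreater <- (1 + sum(replicates >= observed)) / (B + 1)
pGreater
```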