Multiple Comparisons For Generalized Least Squares Models

DESCRIPTION:

Compute simultaneous or non-simultaneous confidence intervals or bounds for the specified estimable linear combinations of the effects in a generalized least squares model.

USAGE:

multicomp.gls(x, focus = NULL, adjust = NULL, lmat = NULL, 
              comparisons = "mca", alpha = 0.05, bounds = "both",
              error.type = "fwe", method = "best.fast", 
              crit.point = NULL, Srank = NULL, control = NULL, 
              simsize = NULL, plot = F, labels = NULL, 
              valid.check = T, est.check = T)

REQUIRED ARGUMENTS:

x
a gls object constructed by the gls function.

OPTIONAL ARGUMENTS:

focus
character string specifying the name of a factor in the generalized least squares model upon whose levels linear combinations are computed. The focus, adjust, lmat and/or comparisons arguments together specify the linear combinations. If focus, adjust, and lmat are all NULL, the first factor in the model (if any) is used as the focus factor.
adjust
list of factors and/or covariates in the model (other than the focus factor), and specified adjustment values for them. If adjust = NULL, the adjustment values are the average over the levels of every (non-focus) factor, and the grand average values of numeric covariates. Several combinations of values may be specified for each factor and covariate; adjusted means for the focus factor are computed for every combination specified in the adjust list.
lmat
matrix of coefficients specifying linear combinations. Each column of lmat specifies one linear combination to be estimated under the "textbook parameterization" of the generalized least squares model (see DETAILS). Factor levels are ordered alphabetically within each factor, so the linear combinations should be defined accordingly. Specifying lmat directly overrides the focus and adjust arguments.
comparisons
keyword specifying which standard differences of the adjusted means to compute. The available keywords are: the default "mca" for all pairwise differences; "mcc" for pairwise differences between a single adjusted mean of the focus factor or lmat column and the remaining adjusted means (see the control argument below); and "none" if the adjusted means or lmat columns themselves are of interest without further differencing. Any comparisons value other than these three keywords has the same effect as "none". If several adjustment combinations are specified in the adjust list, the differences given by comparisons are applied within each of the combinations.
alpha
the error rate. 1 - alpha is the desired joint confidence level if error.type = "fwe", and the comparison-wise confidence level if error.type = "cwe". The default is alpha = 0.05.
bounds
character string specifying the type of intervals to compute. Available values are: the default "both" for two-sided intervals; "lower" for intervals with infinite upper bounds but sharper lower bounds than those obtained with bounds = "both"; and "upper" for intervals with infinite lower bounds but sharper upper bounds than those obtained with bounds = "both". Mixed bounds can be achieved by specifying, for example, bounds = "upper" and then providing the negative versions of combinations for which lower bounds are desired in lmat.
error.type
character string specifying the type of error rate. The default "fwe" specifies family-wise error rate protection, so the probability that all bounds hold is at least 1 - alpha. The "cwe" option specifies comparison-wise error rate protection, so the probability that any one preselected bound holds is 1 - alpha.
method
character string specifying the desired method for critical point calculation. The default is "best.fast". Available methods at this time are:

"lsd": Fisher's unprotected lsd method. The critical point for two-sided intervals is the upper alpha/2 quantile of Student's t distribution, qt(1 - alpha/2, df.residual). For one-sided confidence bounds, the critical point is the upper alpha quantile, qt(1 - alpha, df.residual). This is the only method available if error.type = "cwe", and it is not available if error.type = "fwe".

"tukey": Tukey's method. If the linear combinations specify all pairwise differences between several quantities, the critical point is the Tukey studentized-range quantile divided by sqrt(2). When more than three quantities are to be compared, validity of the Tukey method is checked using the Hayter (1989) sufficient condition, unless you specify valid.check = F.

"dunnett": Dunnett's method. If comparisons = "mcc", the critical point for comparisons-with-control intervals or bounds is computed, if valid (see Dunnett (1964)). The validity condition requires the covariance matrix of the treatment-control differences to be equivalent to that of a one-factor model (allowing unequal sample sizes). You may override the validity check by specifying valid.check = F.

"sidak": Sidak's method. If bounds = "both", the critical point is the upper alpha quantile of the maximum absolute value of k uncorrelated multivariate t random variables (see Sidak (1967)). If bounds = "upper" or bounds = "lower", and the estimators specified by lmat and/or comparisons are uncorrelated (or valid.check = F), then the critical point is the corresponding quantile of the maximum without taking absolute values.

"bon": Bonferroni's method. If a total of m bounds are to be computed (counting each interval as 2 bounds), the critical point is qt(1 - alpha/m, df.residual).

"scheffe": Scheffe's method. If the rank of the covariance matrix of the estimators for the linear combinations is Srank, the critical point is sqrt(Srank * qf(1 - alpha, Srank, df.residual)).

"sim": Simulation method. An approximate critical point is generated using the simulation-based method described in Edwards and Berry (1987). This may take a few seconds or more of computer time, depending on the size of the problem. With the default simulation size, the critical point generated gives a family-wise error rate within 10% of the specified alpha, with 99% confidence. See the simsize argument below.

"best.fast": The default method. The smallest critical point among the valid "fast" methods (that is, all methods except method = "sim") is computed.

"best": The smallest critical point among all the valid methods above is computed.

crit.point
a value for the critical point. If none of the methods available via the method argument suits you, a value for the critical point may be specified directly with crit.point. In this case ensuring validity of the critical point is your responsibility. When crit.point is defined, alpha and error.type are merely labels in the output object and may have no meaning.
Srank
rank of the covariance matrix of the estimators for the linear combinations.
control
the column of lmat that should be treated as the control when comparisons = "mcc". The default is the last column of lmat.
simsize
simulation size for the "sim" method of computing critical points. The default choice provides intervals or bounds that have family-wise error rates within 10% of the nominal alpha, with probability 0.99. This amounts to simulation sizes in the tens of thousands for most cases. Smaller simulation sizes are not recommended; see Edwards and Berry (1987).
plot
logical value: if TRUE, a plot of the calculated intervals is displayed on the current graphics device. Alternatively, the output object can be used as an argument to the generic function plot.
labels
vector of character strings specifying labels for the intervals/bounds. The labels are used as row names in the output table and in any created plot. For standard comparisons, multicomp attempts to generate sensible labels.
valid.check
logical value: if TRUE, validity of the specified critical point calculation method is checked. If the validity condition fails, multicomp terminates with an error message.
est.check
logical value: if TRUE, estimability of the desired linear combinations is checked. If the condition fails, multicomp terminates with an error message. Note that in certain cases, too much rounding in the lmat entries can cause the estimability condition to fail.
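
The closed-form critical points described under the method argument can be reproduced by hand. The following is a minimal sketch in R-compatible syntax, not the code that multicomp.gls itself runs. The values of alpha, df.residual, and k are hypothetical, and qtukey is assumed to be available as the studentized-range quantile function (it is the base R name; other S dialects may differ).

```r
# Hypothetical setting: all pairwise differences among k = 6 adjusted means,
# two-sided intervals, 54 residual degrees of freedom.
alpha <- 0.05
df.residual <- 54
k <- 6
ncomp <- k * (k - 1) / 2   # 15 pairwise differences
m <- 2 * ncomp             # each two-sided interval counts as 2 bounds
Srank <- k - 1             # rank of the covariance matrix of the differences

# "lsd": comparison-wise Student's t quantile
crit.lsd <- qt(1 - alpha / 2, df.residual)
# "bon": Bonferroni adjustment over m bounds
crit.bon <- qt(1 - alpha / m, df.residual)
# "scheffe": based on the F distribution with Srank numerator df
crit.scheffe <- sqrt(Srank * qf(1 - alpha, Srank, df.residual))
# "tukey": studentized-range quantile divided by sqrt(2)
crit.tukey <- qtukey(1 - alpha, k, df.residual) / sqrt(2)

crit <- c(lsd = crit.lsd, tukey = crit.tukey,
          bon = crit.bon, scheffe = crit.scheffe)
print(round(crit, 3))
```

In this setting, Tukey gives the smallest of the three family-wise critical points shown. The lsd value is smaller still, but it offers only comparison-wise protection.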

VALUE:

an object of class "multicomp" with components:
table
matrix in which each row supplies an estimate of a linear combination, a standard error, and lower and/or upper confidence bounds. The row names of the matrix are either generated or taken from the labels argument.
alpha
input value of alpha.
error.type
input value of error.type.
method
method used for critical point calculations.
crit.point
value of the critical point used in calculating the intervals.
lmat
matrix of coefficients specifying the linear combinations estimated.
Srank
rank of the covariance matrix of the estimators for the linear combinations.
simsize
size of the simulation used to generate a critical point when method = "sim".
ylabel
character string used to label the response variable.

SIDE EFFECTS:

If plot = T, a plot is displayed on the current graphics device.

DETAILS:

The textbook parameterization used by multicomp.gls is designed to facilitate the specification of meaningful linear combinations of the model parameters. For example, if Trt is a factor with 3 levels, the parameters for the one-way anova model Y ~ Trt are (mu, Trt1, Trt2, Trt3). For a 3x2 factorial design of A and B, the model Y ~ A * B has the parameters (mu, A1, A2, A3, B1, B2, A1B1, A1B2, A2B1, A2B2, A3B1, A3B2). Note that these models are over-parameterized, so not every linear combination of these parameters is estimable.
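
To make the over-parameterization concrete, the sketch below builds a textbook-style design matrix for a one-way layout by hand, in R-compatible syntax, and tests estimability with a rank comparison. The data, the helper name is.estimable, and the tolerance are illustrative assumptions; this is not necessarily the check that multicomp.gls performs internally.

```r
# Hypothetical one-way layout: 3 levels, 2 observations each
Trt <- factor(rep(c("a", "b", "c"), each = 2))

# Intercept column plus one indicator column per level
ind <- outer(as.integer(Trt), 1:3, "==") * 1
X <- cbind(1, ind)
colnames(X) <- c("mu", "Trt1", "Trt2", "Trt3")

# Four parameters, but the columns span only three dimensions
X.rank <- qr(X)$rank

# A combination l is estimable when it lies in the row space of X,
# i.e. appending it as an extra row does not increase the rank.
is.estimable <- function(l, X, tol = 1e-7) {
  qr(rbind(X, l), tol = tol)$rank == qr(X, tol = tol)$rank
}

is.estimable(c(0, 1, -1, 0), X)  # Trt1 - Trt2: a difference of effects
is.estimable(c(1, 1, 0, 0), X)   # mu + Trt1: a cell mean
is.estimable(c(0, 1, 0, 0), X)   # Trt1 alone: not estimable
```

The first two calls return TRUE and the last returns FALSE: differences of effects and cell means are estimable, whereas a single effect on its own is not.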

Warning: the gls function orders the levels of a factor alphabetically according to the character strings used to identify levels. This is therefore the order of the factor's effects in the textbook parameterization as well. If you are in doubt about the textbook parameterization for your model, invoke multicomp.gls with your gls object and inspect the row labels of the lmat element in the output list.
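
The effect of alphabetical ordering can be illustrated with a small sketch in R-compatible syntax. The vector below is hypothetical data whose level names are borrowed from the labels used in the EXAMPLES section; it assumes that factor() sorts levels by default, as it does in base R.

```r
# Levels entered in arbitrary order
Type <- factor(c("Van", "Compact", "Sporty", "Large", "Small", "Medium"))

# Levels are stored in sorted order regardless of entry order
lev <- levels(Type)
print(lev)

# The last level is the one a default "last column" control would pick
last.level <- lev[length(lev)]
print(last.level)
```

Under this ordering, "Van" is the last level, so coefficients in a hand-built lmat need to follow the sorted order rather than the order in which levels appear in the data.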

When specified, focus and adjust attempt to generate an lmat consisting of coefficients of adjusted means for the levels of the focus factor, at each combination in the adjust list. Once lmat is specified, either directly through the lmat argument or indirectly through focus and adjust, it may be further modified by comparisons. If you are in doubt about the lmat matrix in your call to multicomp, carefully examine the lmat element of the output list.

After lmat is specified, linear combinations are checked for estimability by verifying that each is (up to a tolerance) a linear combination of the rows of the design matrix that corresponds to the textbook parameterization. After checking estimability, multicomp.gls generates estimates of the linear combinations of interest, along with their covariance matrix. The critical point and the intervals or bounds are then computed by a call to multicomp.default. The computed intervals are of the form t(lmat) %*% x +/- crit.point * sqrt(diag(t(lmat) %*% vmat %*% lmat)), where x here denotes the vector of estimates to which lmat is applied (not the gls object) and vmat is the covariance matrix of those estimates.
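
The interval form can be sketched numerically as follows, in R-compatible syntax. The estimates x, the covariance matrix vmat, the degrees of freedom, and the choice of a Tukey-type critical point are all hypothetical illustration values; the standard errors are taken as the square roots of the diagonal of t(lmat) %*% vmat %*% lmat.

```r
# Hypothetical estimates of three means and their covariance matrix
x <- c(A = 20.1, B = 22.4, C = 25.0)
vmat <- diag(c(0.50, 0.40, 0.60))

# Columns of lmat define the pairwise differences
lmat <- cbind("A-B" = c(1, -1, 0),
              "A-C" = c(1, 0, -1),
              "B-C" = c(0, 1, -1))

# Hypothetical critical point: Tukey-type, 3 means, 20 degrees of freedom
crit.point <- qtukey(0.95, 3, 20) / sqrt(2)

estimate <- drop(t(lmat) %*% x)
std.err <- sqrt(diag(t(lmat) %*% vmat %*% lmat))

tab <- cbind(estimate = estimate,
             std.err = std.err,
             lower = estimate - crit.point * std.err,
             upper = estimate + crit.point * std.err)
print(round(tab, 3))
```

Each row of tab corresponds to one column of lmat, mirroring the layout of the table component described under VALUE.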

The Tukey, Dunnett, and one-sided Sidak methods for critical point computations have not been shown to be valid for all choices of the covariance matrix of the estimators, t(lmat) %*% vmat %*% lmat. If this matrix is larger than 3x3, the Hayter (1989) sufficient condition is used to check the validity of Tukey's critical point. For Dunnett's method, the matrix is checked to see whether it has the form that results from all-to-one comparisons of uncorrelated estimators. Sidak's method is always valid for two-sided intervals, and it is exact when the estimators are uncorrelated; for one-sided bounds, uncorrelatedness of the estimators is the condition used to validate the Sidak method. These conditions are sufficient for the validity of these methods, but in most cases they are not necessary. Expert users may therefore override the built-in checks, at their own risk, by setting valid.check = F or by supplying crit.point directly.
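
The uncorrelatedness condition mentioned for the one-sided Sidak method can be sketched as a check on the off-diagonal correlations of t(lmat) %*% vmat %*% lmat. The helper name is.uncorrelated, the tolerance, and the matrices below are illustrative assumptions in R-compatible syntax, not the internal check used by multicomp.

```r
# TRUE when all off-diagonal correlations are zero (up to a tolerance)
is.uncorrelated <- function(V, tol = 1e-8) {
  R <- cov2cor(V)
  all(abs(R[upper.tri(R)]) < tol)
}

# Hypothetical covariance matrix of three independent mean estimates
vmat <- diag(c(0.5, 0.4, 0.6))

# The means themselves (comparisons = "none"): uncorrelated estimators
means.lmat <- diag(3)
means.ok <- is.uncorrelated(t(means.lmat) %*% vmat %*% means.lmat)

# Pairwise differences share terms, so the estimators are correlated
pairs.lmat <- cbind(c(1, -1, 0), c(1, 0, -1), c(0, 1, -1))
pairs.ok <- is.uncorrelated(t(pairs.lmat) %*% vmat %*% pairs.lmat)

print(c(means = means.ok, pairs = pairs.ok))
```

Here the means pass the check while the pairwise differences do not, which illustrates why this condition rarely holds once differencing is applied.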

REFERENCES:

Dunnett, C.W. (1964). New tables for multiple comparisons with a control. Biometrics, 20: 482-491.

Edwards, D. and Berry, J.J. (1987). The efficiency of simulation-based multiple comparisons. Biometrics, 43: 913-928.

Hayter, A.J. (1989). Pairwise comparisons of generally correlated means. Journal of the American Statistical Association, 84: 208-213.

Hochberg, Y. and Tamhane, A.C. (1987). Multiple Comparison Procedures. New York: Wiley.

Hsu, J.C. (1996). Multiple Comparisons: Theory and Methods. London: Chapman and Hall.

Sidak, Z. (1967). Rectangular confidence regions for the means of multivariate normal distributions. Journal of the American Statistical Association, 62: 626-633.

EXAMPLES:

# all-pairwise comparisons via the Tukey-Kramer method.
fuelGLS1 <- gls(Fuel ~ Type, data = fuel.frame,
                weights = varExp(form = ~ Disp.))
fuelMC <- multicomp(fuelGLS1)
print(fuelMC)
plot(fuelMC)
  
# 90% simultaneous upper bounds for comparisons of all other models
#   with the Van by Dunnett's method
fuelGLS1 <- gls(Fuel ~ Type, data = fuel.frame, 
                weights = varExp(form = ~ Disp.))
multicomp(fuelGLS1, comparisons = "mcc", method = "dunnett",
          bounds = "upper", plot = T, alpha = 0.10)
  
# 95% simultaneous lower bounds for comparisons of all other models
# with Compact cars. The focus factor is specified directly.
fuelGLS1 <- gls(Fuel ~ Type, data = fuel.frame,
                weights = varExp(form = ~ Disp.))
multicomp(fuelGLS1, focus = "Type", comparisons = "mcc",
          bounds = "lower", control = 1, plot = T)
  
# use lmat to directly specify mcc intervals with the Van;
#   method = "best.fast" automatically selects Dunnett.
fuelGLS1 <- gls(Fuel ~ Type, data = fuel.frame,
                weights = varExp(form = ~ Disp.))
mcclmat <- rbind(rep(0, 5), contr.sum(6))
mcclabels <- c("Compact-Van", "Large-Van", "Medium-Van",
               "Small-Van", "Sporty-Van")
multicomp(fuelGLS1, comparisons = "none", lmat = mcclmat,
          labels = mcclabels)
  
# all-pairwise comparisons of adjusted means
#   method = "best" chooses simulation-based.
fuelGLS2 <- gls(Fuel ~ Type + Weight, data = fuel.frame,
                weights = varExp(form = ~ Disp.))
multicomp(fuelGLS2, method = "best")
  
# non-simultaneous intervals for adjusted mean fuel mileage,
#   adjusting to both  Weight = 2500 and Weight = 3500
fuelGLS2 <- gls(Fuel ~ Type + Weight, data = fuel.frame,
                weights = varExp(form = ~ Disp.))
multicomp(fuelGLS2, adjust = list(Weight = c(2500, 3500)),
          comparisons = "none", method = "lsd", error.type = "cwe")