Combine Multiple Imputation Inferences

DESCRIPTION:

Compute likelihood ratio test statistics in the presence of multiple imputations.

USAGE:

miLikelihoodTest(data, FUN, df1, estimates, estimates0, ...) 
miLikelihoodTest(data, FUN, df1, ...) 

REQUIRED ARGUMENTS:

data
an mi object, containing multiple imputations.
FUN
function such that FUN(x, e1, e0, ...) calculates the likelihood ratio statistic: two times the log of the ratio of the likelihood at e1 to the likelihood at e0, where e1 and e0 are the maximum likelihood estimates under the alternative and null hypotheses, respectively.

Alternatively, if estimates is missing, then FUN(x, ...) must calculate parameter estimates internally and return the likelihood ratio statistic. There are additional conditions in this case, described under DETAILS.
df1
integer, degrees of freedom for the likelihood ratio test.

OPTIONAL ARGUMENTS:

estimates
an mi object whose completed data sets are parameter estimates under the alternative hypothesis, based on the completed data sets from data.
estimates0
like estimates, but for the null hypothesis.
...
additional arguments to FUN. These should be given in the name=value form. In particular, these may specify the null hypothesis. These should not be mi objects; instead any mi objects should be incorporated into data (which may be a list containing multiple mi objects).

VALUE:

list with components:
Fstat
F test statistic.
df1, df2
numerator and denominator degrees of freedom for the F-statistic; df1 is the same as the degrees of freedom for the likelihood-ratio chi-square statistic that would be used in the absence of missing data.
r
estimate of the average relative increase in variance due to missing data.

DETAILS:

The standard way of calling this function, where estimates and estimates0 are supplied, assumes that these parameter estimates are approximately normally distributed. For some of the calculations the estimates are averaged across imputations. This averaging is not invariant under transformations of the parameters, and it can produce averaged estimates that lie outside the parameter space.
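As a rough illustration of where the averaged estimates enter, the following base R sketch combines likelihood ratio statistics along the lines of Meng and Rubin (1992). It is not the implementation of miLikelihoodTest. The helper name combineLikelihoodRatio, the use of plain lists in place of mi objects, and the denominator degrees of freedom formula (taken from Meng and Rubin) are assumptions made for this example; the library may compute df2 differently.

```r
# Hypothetical helper, not part of the library: combine likelihood ratio
# statistics over m completed data sets, following Meng and Rubin (1992).
#   datasets   plain list of completed data sets (assumption: not an mi object)
#   FUN        FUN(x, e1, e0) returns the likelihood ratio statistic
#   df1        degrees of freedom of the complete-data test
#   est1, est0 lists of per-data-set estimates (alternative, null)
combineLikelihoodRatio <- function(datasets, FUN, df1, est1, est0) {
  m <- length(datasets)
  # Mean of the statistics, each evaluated at its own data set's estimates
  dBar <- mean(sapply(seq_len(m), function(i)
    FUN(datasets[[i]], est1[[i]], est0[[i]])))
  # Average the estimates across imputations (the step that is not
  # invariant under transformations of the parameters)
  e1Bar <- Reduce("+", est1) / m
  e0Bar <- Reduce("+", est0) / m
  # Mean of the statistics, all evaluated at the averaged estimates
  dTilde <- mean(sapply(datasets, FUN, e1 = e1Bar, e0 = e0Bar))
  # Estimated average relative increase in variance due to missing data
  r <- (m + 1) / (df1 * (m - 1)) * (dBar - dTilde)
  Fstat <- dTilde / (df1 * (1 + r))
  # Denominator degrees of freedom as given by Meng and Rubin (1992)
  t <- df1 * (m - 1)
  if (t > 4) {
    df2 <- 4 + (t - 4) * (1 + (1 - 2 / t) / r)^2
  } else {
    df2 <- t * (1 + 1 / df1) * (1 + 1 / r)^2 / 2
  }
  list(Fstat = Fstat, df1 = df1, df2 = df2, r = r)
}
```

When all completed data sets are identical, the two averages coincide, so r is zero and Fstat reduces to the complete-data likelihood ratio statistic divided by df1.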

The alternative way of calling this function requires that FUN calculate estimates internally. Furthermore, the data must be such that the completed data sets can be stacked using rbind to form a single large data set; calling FUN(largeData, ...) should then internally compute maximum likelihood parameter estimates based on the large data set and return a likelihood ratio statistic for the stacked data. The estimates should lie within the parameter space; they are typically approximately equal to the averages of the estimates from the individual completed data sets. The estimates and likelihood ratios should not be sensitive to ties in the data. Finally, the log-likelihood ratio obtained by stacking m identical data sets should be m times that for a single data set. This alternative is invariant under transformations of the parameters.
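The stacking condition can be checked directly before passing a function to miLikelihoodTest. The base R sketch below does this for a normal-mean likelihood ratio of the same form as f2 in the EXAMPLES section. The function name lrNormal and the data values are made up for illustration, and for a plain vector stacking amounts to concatenation (rbind would be used for data frames or matrices).

```r
# Hypothetical check, assuming a normal model with H0: mean = 0 and
# unknown variance. lrNormal computes the MLEs internally.
lrNormal <- function(dat, ...) {
  n <- length(dat)
  mu1 <- mean(dat)               # mean MLE under the alternative
  var0 <- mean(dat^2)            # variance MLE under H0 (mean fixed at 0)
  var1 <- mean((dat - mu1)^2)    # variance MLE under the alternative
  2 * ((-n * log(var1) / 2 - sum((dat - mu1)^2) / (2 * var1)) -
       (-n * log(var0) / 2 - sum(dat^2) / (2 * var0)))
}

x <- c(1.2, -0.4, 0.8, 2.1, 0.3)   # made-up data for illustration
m <- 5
stacked <- rep(x, m)               # m identical copies, stacked

# The statistic for the stacked data should be m times the single-copy value
stackingOK <- isTRUE(all.equal(lrNormal(stacked), m * lrNormal(x)))
```

A function that fails this check, for example one that uses an n - 1 divisor for the variance, does not meet the conditions for the alternative calling form.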

REFERENCES:

Hesterberg, T. (1998). Combining multiple imputation t, chi-square, and F inferences. Insightful Technical Report number 75.

Meng, X.-L. and Rubin, D. B. (1992). Performing likelihood ratio tests with multiply-imputed data sets. Biometrika 79, 103-111.

Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. Chapman & Hall, London.

EXAMPLES:

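# Example, Normal, H0: mu=230, unknown variance
# Subtract 230 so that the null hypothesis becomes mean = 0
x <- miEval(cholesterolImpExample[[3]] - 230)
# Maximum likelihood estimates c(mean, variance) for each completed data set
estimates <- miEval( c(mean(x), mean((x-mean(x))^2)))
estimates0 <- miEval( c(0, mean(x^2)))
# f1: likelihood ratio statistic at supplied estimates e1 (alternative)
# and e0 (null), each of the form c(mean, variance)
f1 <- function(dat, e1, e0, ...){
  n <- length(dat)
  2*((-n*log(e1[2])/2 - sum((dat-e1[1])^2)/(2*e1[2])) -
     (-n*log(e0[2])/2 - sum((dat-e0[1])^2)/(2*e0[2])))
}
# f2: same statistic, with the estimates computed internally
f2 <- function(dat, ...){
  n <- length(dat)
  mu0 <- 0                     # mean under H0
  mu1 <- mean(dat)             # mean under the alternative
  var0 <- mean(dat^2)          # variance under H0
  var1 <- mean((dat-mu1)^2)    # variance under the alternative
  2*((-n*log(var1)/2 - sum((dat-mu1)^2)/(2*var1)) -
     (-n*log(var0)/2 - sum((dat-mu0)^2)/(2*var0)))
}
# Standard call: estimates are supplied
miLikelihoodTest( x, f1, 1, estimates, estimates0)
# Alternative call: FUN computes estimates from the stacked data
miLikelihoodTest( x, f2, 1)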