Estimates for Multivariate Normal Models

DESCRIPTION:

Estimates parameters for a multivariate normal model. There are four methods for handling missing values.

USAGE:

mdGauss(object, subset, prior = <<see below>>, na.proc = "fail", 
    start = <<see below>>, control) 

REQUIRED ARGUMENTS:

object
a class "preGauss" or "missmodel" object, or a data frame or matrix containing the raw data. When a data frame is input, the model is applied to all numeric variables. When a matrix is input, all variables are used. Using a class "preGauss" object can considerably shorten the total compute time if mdGauss is called more than once. Use routine preGauss to create this object.

OPTIONAL ARGUMENTS:

subset
expression specifying which rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of rows), a numeric vector indicating the observation numbers to be included, or a character vector of the row names to be included. All observations are included by default. If object is a data frame, this expression may use variables in the data frame. This argument is not used if argument object is a class "preGauss" or "missmodel" object.
prior
specifies normal inverted-Wishart prior hyperparameters. Supply either a character string, or an object of class "priorGauss".

Valid character strings are "ml" (maximum likelihood), "noninformative", and "ridge" (for the default ridge prior). String matching is used, so the characters "m", "n", or "r" are sufficient.

A class "priorGauss" object is created by routine priorGauss.

The default value is a noninformative prior. When a class "missmodel" object is input, any value specified in a previous call has priority over the default value (but not over any currently specified value).
na.proc
character, the method to use in handling missing data. Possible values are:
"fail"
stop with an error message if missing values are encountered,
"omit"
omit observations with missing values,
"em"
use the EM algorithm,
"da"
use a data augmentation algorithm.

When argument object is a class "preGauss" or "missmodel" object, argument na.proc must be either "da" or "em".
start
starting values of the parameters. The parameters estimated by mdGauss are the mean and variance-covariance matrix of a multivariate normal distribution. Thus, start may be a list with vector component mu giving the mean and matrix component sigma giving the variance-covariance matrix. Alternatively, a class "Gauss" object created as the paramIter component of a class "missmodel" object may be used. Routines mdGauss, daGauss, and emGauss may be used to create an appropriate "missmodel" object.

In most cases the default starting values are equal to the mean and a diagonal variance-covariance matrix estimate obtained from the observations with no missing values. If an entire column is missing, the default mean for the column is zero, and the default variance for the column is one. Another exception occurs when argument object is a class "missmodel" object. In this case argument start defaults to the final estimates in the input "missmodel" object.
control
a list of parameters used to control the algorithm. If not given, these default to the emGauss.control values, or to the daGauss.control values, as appropriate. See the help files for emGauss.control and daGauss.control for details.

When a class "missmodel" object is input, the control values specified on a previous call have priority over the default values (but not over any currently specified value), but only if they are of the required class ("da" or "em").
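
As a rough sketch of how explicit starting values might be supplied, the fragment below computes by hand the kind of default described above for start (complete-case means and a diagonal variance-covariance matrix) and passes them as a list with components mu and sigma. The matrix toy and the names complete, mu0, sigma0, and start0 are hypothetical and serve only as illustration. The call to mdGauss uses only arguments described above and is guarded so that the fragment also runs where the library is not attached.

```r
# Hypothetical toy data: 6 observations on 3 variables, some values missing
toy <- matrix(c(1.0, 2.0, 3.0,
                2.0,  NA, 5.0,
                3.0, 4.0,  NA,
                4.0, 5.0, 7.0,
                 NA, 6.0, 8.0,
                6.0, 8.0, 9.0), ncol = 3, byrow = TRUE)

# Rows with no missing values
complete <- !apply(is.na(toy), 1, any)

# Means and a diagonal variance-covariance matrix from the complete rows
mu0 <- apply(toy[complete, , drop = FALSE], 2, mean)
sigma0 <- diag(apply(toy[complete, , drop = FALSE], 2, var))

# start is a list with vector component mu and matrix component sigma
start0 <- list(mu = mu0, sigma = sigma0)

# Assumption: mdGauss is only called where the library is attached
if (exists("mdGauss")) {
    toy.em <- mdGauss(toy, na.proc = "em", start = start0)
}
```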

VALUE:

an object of class "missmodel" is returned; see the help file for "missmodel" objects for details.

SIDE EFFECTS:

The function mdGauss creates the data set .Random.seed if it does not already exist; otherwise it updates its value.

DETAILS:

The mdGauss function computes estimates of the mean and variance-covariance matrix in a multivariate normal model, and provides several methods for handling missing values. The EM algorithm computes the mode of the posterior probability distribution. Alternatively, the data augmentation algorithm uses Markov chain Monte Carlo (MCMC) methods, alternately simulating values for the missing data and values for the parameters. With this method, care must be taken to ensure that the Markov chain has reached a steady state. The sequence of estimates should be analyzed to diagnose convergence.
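
To make the EM idea concrete, here is a minimal sketch of a textbook EM iteration for the multivariate normal model with missing values. It is not the code used by mdGauss: it computes maximum likelihood estimates only (no prior), runs a fixed number of iterations with no convergence check, and assumes at least two rows with no missing values. The function name em.mvn.sketch and the data x.toy are hypothetical.

```r
em.mvn.sketch <- function(x, iter = 50) {
    x <- as.matrix(x)
    n <- nrow(x)
    p <- ncol(x)
    # Starting values: complete-case means and a diagonal covariance matrix
    complete <- !apply(is.na(x), 1, any)
    mu <- apply(x[complete, , drop = FALSE], 2, mean)
    sigma <- matrix(0, p, p)
    sigma[cbind(1:p, 1:p)] <- apply(x[complete, , drop = FALSE], 2, var)
    for (it in 1:iter) {
        t1 <- rep(0, p)        # expected sum of the observations
        t2 <- matrix(0, p, p)  # expected sum of squares and cross-products
        for (i in 1:n) {
            xi <- x[i, ]
            m <- is.na(xi)
            cc <- matrix(0, p, p)  # conditional covariance of the missing part
            if (all(m)) {
                xi <- mu
                cc <- sigma
            } else if (any(m)) {
                # E-step: regress the missing part on the observed part
                b <- sigma[m, !m, drop = FALSE] %*%
                    solve(sigma[!m, !m, drop = FALSE])
                xi[m] <- mu[m] + as.vector(b %*% (xi[!m] - mu[!m]))
                cc[m, m] <- sigma[m, m, drop = FALSE] -
                    b %*% sigma[!m, m, drop = FALSE]
            }
            t1 <- t1 + xi
            t2 <- t2 + outer(xi, xi) + cc
        }
        # M-step: update the mean and the (ML) covariance matrix
        mu <- t1 / n
        sigma <- t2 / n - outer(mu, mu)
    }
    list(mu = mu, sigma = sigma)
}

# Hypothetical data with missing values
x.toy <- matrix(c(1.0, 2.1,
                  2.0,  NA,
                  3.0, 3.9,
                   NA, 5.2,
                  5.0, 6.1,
                  6.0, 6.8), ncol = 2, byrow = TRUE)
fit.toy <- em.mvn.sketch(x.toy)
```

With no missing values a single iteration returns the sample mean and the maximum likelihood (divisor n) covariance matrix, which is a convenient sanity check on the sketch.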

Because the mdGauss function is often called more than once, it is usually preferable to precompute many of the statistics used by mdGauss. This may be done using the preGauss function.

Parameter estimates from either the EM or data augmentation algorithms may be used as starting values to the impGauss function.

REFERENCES:

Schafer, J. L. (1997), Analysis of Incomplete Multivariate Data, Chapman & Hall, London.

EXAMPLES:

mdGauss(object = cholesterol)   # fails by default
                                # because cholesterol has missing data

# use EM
mdGauss(object = cholesterol, na.proc = "em")

# same, but first create preGauss object for greater efficiency
cholesterol.pre <- preGauss(cholesterol)
cholesterol.em <- mdGauss(cholesterol.pre, na.proc = "em")
cholesterol.em <- emGauss(cholesterol.pre)  # same

# Data augmentation: start with last parameter estimates
# given in cholesterol.em, save iterates 101 to 1100
mdGauss(cholesterol.em, na.proc = "da",
                        control=list(save=101:1100))
# same as:
daGauss(cholesterol.em, control=list(save=101:1100))