EM Algorithm for Conditional Gaussian Models

DESCRIPTION:

Computes EM algorithm estimates for Conditional Gaussian Models.

This function is generic (see Methods); method functions can be written to handle specific classes of data. Classes which already have methods for this function include: preCgm and missmodel. A default method operates on matrices and data frames.

USAGE:

emCgm(object, ...) 

REQUIRED ARGUMENTS:

object
a matrix, data frame, or object of class "preCgm" or "missmodel".

OPTIONAL ARGUMENTS:

...
most methods accept the arguments margins, gauss, design, optData, subset, prior, start, control, and contrasts. Additional arguments are possible; see the specific method for the full list.

VALUE:

an object of class "missmodel". See the documentation for that class for details.

DETAILS:

The emCgm function estimates the parameters of a Conditional Gaussian Model, in which the factor variables follow a hierarchical log-linear model and, conditional on the factor variables, the numeric variables are multivariate normal. In a hierarchical model, including an interaction effect automatically includes every lower-order effect contained in it. For example, for factors A, B, and C, including A:B:C means that A, B, C, A:B, A:C, and B:C are also in the model.
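The set of terms implied by a three-way interaction can be listed with the standard formula machinery. This is only an illustrative sketch: it does not call emCgm, the factor names A, B, and C are placeholders, and it relies on the `*` operator, which spells out the full hierarchy explicitly.

```r
# Expand A*B*C into every term of the hierarchical model that contains A:B:C.
# terms() only parses the formula; no data and no model fitting are involved.
implied <- attr(terms(~ A * B * C), "term.labels")

# The three main effects, the three two-way interactions, and A:B:C itself.
print(implied)
```

The result has seven entries: the six lower-order effects named above plus A:B:C.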

emCgm handles missing values by using the EM algorithm to compute a mode of the posterior distribution of the parameters, given a Dirichlet prior distribution on the cell probabilities in the log-linear model. A noninformative prior is always assumed for the parameters of the multivariate normal distribution.
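The effect of the Dirichlet hyperparameter is easiest to see in the simplest case. The following is a sketch under stated assumptions, not a description of emCgm's internals: a saturated model for a table with D cells, fully observed counts n_1, ..., n_D summing to N, and one common hyperparameter alpha (assumed here to correspond to the prior argument). The symbols are notation introduced for this illustration.

```latex
\[
p(\theta \mid n) \;\propto\; \prod_{d=1}^{D} \theta_d^{\,n_d + \alpha - 1},
\qquad
\hat{\theta}_d \;=\; \frac{n_d + \alpha - 1}{N + D(\alpha - 1)}
\quad (\alpha \ge 1).
\]
```

With alpha = 1 the mode reduces to the maximum-likelihood estimate n_d / N, which is zero for any empty cell; with alpha > 1 every estimated cell probability is strictly positive. When data are missing, the EM algorithm works with expected counts in place of the n_d, and under a restricted log-linear model the mode generally has no closed form.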

Because the emCgm function is often called more than once, it is usually preferable to precompute the quantities it uses. This can be done with the preCgm function.

REFERENCES:

Schafer, J. L. (1997), Analysis of Incomplete Multivariate Data, Chapman & Hall, London.

EXAMPLES:

# emCgm(language)          # NOTE: this iterates a long time,
                           # suggesting need for restricted model.
# emCgm.default(language)  # same.

# Preprocess
language.pre <- preCgm(language) 

# Categorical variables LAN, AGE, PRI, SEX, GRD specify a 5-dimensional
# contingency table with 4*5*5*2*5 = 1000 cells.
# Specify a log-linear model with all main effects and 2-variable associations:
margins.form <- ~ LAN + AGE + PRI + SEX + GRD + 
             LAN:AGE + LAN:PRI + LAN:SEX + LAN:GRD + 
             AGE:PRI + AGE:SEX + AGE:GRD + 
             PRI:SEX + PRI:GRD + 
             SEX:GRD 

# Linear contrast
lc <- c(-2,-1,0,1,2) 
design.form <- ~ LAN + C(AGE,lc,1) + C(PRI,lc,1) + SEX + C(GRD,lc,1) 

# Set hyperparameter to 1.05 to ensure a mode in the
# interior of the parameter space 
language.em <- emCgm(language.pre, margins = margins.form,
                     design = design.form, prior = 1.05) 
# Same as:
emCgm.preCgm(language.pre, margins = margins.form,
             design = design.form, prior = 1.05)