emCgm.default(object, margins, gauss, design, optData, subset,
    prior = 1, start = NULL, control = emCgm.control(), contrasts = NULL)
emCgm.preCgm(object, margins, gauss, design, optData,
    prior = 1, start = NULL, control = emCgm.control(), contrasts = NULL)
emCgm.missmodel(object, margins, gauss, design, optData,
    prior = 1, start = NULL, control = emCgm.control(), contrasts = NULL)
emCgm.default: a data frame or matrix containing the raw data.
When a data frame is input and the margins argument is not provided,
the loglinear part of the model is assumed to be a saturated model in
which all factor variables are used to form the table. If the gauss
argument is not provided, then all numeric variables in the data frame
are included in the conditional Gaussian part of the model.
When a matrix is input, you must provide the margins argument, which
identifies the variables to use in the discrete part of the model. If
the gauss argument is omitted, then all remaining variables in the
matrix are used in the Gaussian part of the conditional Gaussian
distribution.
emCgm.preCgm: an object of class "preCgm".
emCgm.missmodel: an object of class "missmodel" containing the results
of a previous analysis. Any of the functions mdCgm, completeCgm, emCgm,
or daCgm may be used to produce the missmodel object.
list(1:2, 3:4) would indicate fitting the 1,2 margin (summing over
variables 3 and 4) and the 3,4 margin in a four-way table. This same
model can be specified using the names of the variables (e.g.,
list(c("V1", "V2"), c("V3", "V4"))), or using formula notation, as in
margins = ~V1:V2 + V3:V4.
If margins is not specified, a saturated model is fitted.
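The list form of margins resembles the list-of-vectors convention used by the margin argument of base R's loglin. As a minimal sketch, assuming base R rather than emCgm itself and a made-up 2 x 2 x 2 x 2 table of counts (the table and the names tab, fit, and fitted are illustrative only), the following fits the 1,2 and 3,4 margins and checks that the fitted table reproduces both:

```r
# Illustrative only: uses stats::loglin, not emCgm, on invented counts.
set.seed(1)
tab <- array(rpois(16, lambda = 20) + 1, dim = c(2, 2, 2, 2),
             dimnames = list(V1 = c("a", "b"), V2 = c("a", "b"),
                             V3 = c("a", "b"), V4 = c("a", "b")))

# Fit the {1,2} margin and the {3,4} margin, as in list(1:2, 3:4).
fit <- loglin(tab, margin = list(1:2, 3:4), fit = TRUE, print = FALSE)
fitted <- array(fit$fit, dim = dim(tab))

# Largest discrepancy between fitted and observed two-way margins.
m12.err <- max(abs(apply(fitted, 1:2, sum) - apply(tab, 1:2, sum)))
m34.err <- max(abs(apply(fitted, 3:4, sum) - apply(tab, 3:4, sum)))
```

Both discrepancies should be numerically negligible, because a model specified by margins constrains exactly those marginal totals.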
emCgm.default: When a matrix is input as argument object, argument
margins must be specified. When a data frame is input and argument
margins is missing, a saturated model involving all factor variables is
fitted.
emCgm.missmodel: If not given, argument margins defaults to the margins
specified in the call statement of the input "missmodel" object.
c(1, 2, 4), as a vector of variable names, e.g. c("V1", "V2", "V4"),
and using formula notation, e.g. ~V1+V2+V4. If argument gauss is
omitted, then all numeric variables that do not appear in argument
margins are used in the multivariate Gaussian model.
optData. Optionally, an ncell by m matrix may be input directly as the
design matrix.
Let i=1, ..., ncell denote the cells in the loglinear model, and let
mu(i) denote the vector of numeric variable means in cell i. Then the
formula design provides the design matrix for predicting the cell
means. As an example, let "V1" and "V2" be the names of the factor
variables, and let "age" be a vector giving an average age for the
subjects in each cell. Then formula design=~V1+V2 indicates a
main-effects model for the cell means, while design=~V1 + V2 + age
indicates a main-effects model for the cell means, adjusted for average
cell age.
ncell by m matrix may be input. In this case, the regression model is
obtained as a linear function of the columns of the input matrix.
If design is not specified, then the design matrix is taken to be an
identity matrix.
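To make the shape of the design matrix concrete, here is a minimal sketch using base R's model.matrix on a hypothetical 2 x 3 table (ncell = 6). The data frame cells, its factor levels, and the age values are invented for illustration; how emCgm builds its matrix internally is not shown here and may differ.

```r
# One row per cell of a hypothetical 2 x 3 contingency table.
cells <- expand.grid(V1 = factor(c("a", "b")),
                     V2 = factor(c("x", "y", "z")))
cells$age <- c(31, 35, 42, 40, 55, 58)  # made-up average age per cell

# Main-effects model for the cell means.
X.main <- model.matrix(~ V1 + V2, data = cells)

# Main-effects model adjusted for average cell age.
X.age <- model.matrix(~ V1 + V2 + age, data = cells)

# Identity design, the stated default when design is not specified.
X.id <- diag(nrow(cells))

dim(X.main)  # 6 rows (cells) by 4 columns
dim(X.age)   # 6 rows (cells) by 5 columns
```

Each matrix has ncell rows, so a directly supplied ncell by m matrix plays the same role as the matrix a formula would generate.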
ncell rows containing predictors to be used in computing the design
matrix. In the example given in the description for argument design,
the variable age would be input in argument optData.
If object is a data frame, this expression may use variables in the
data frame.
"priorLoglin", or a vector of hyperparameters.
"ml" (maximum likelihood), "noninformative", and "data.dependent".
Partial string matching is used, so the characters "m", "n", or "d" are
sufficient. The values of the hyperparameters change with the algorithm
(see priorLoglin for details). For example, "noninformative" means a
common value of 1 for EM and a common value of 0.5 for DA.
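The abbreviation rule can be illustrated with base R's pmatch. This is only a sketch of partial matching in general; whether emCgm resolves the string this way internally is an assumption, and resolve.prior is a hypothetical helper, not part of the library.

```r
# The three documented character values for prior.
choices <- c("ml", "noninformative", "data.dependent")

# Hypothetical helper: expand an abbreviation to the full option name.
resolve.prior <- function(x) choices[pmatch(x, choices)]

resolve.prior("m")  # "ml"
resolve.prior("n")  # "noninformative"
resolve.prior("d")  # "data.dependent"
```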
"priorLoglin" object is created by routine priorLoglin.
dataDepPrior.
See
for details.
"noninformative".
emCgm.missmodel: If not given, argument prior defaults to the prior
probabilities specified in the call statement of the input "missmodel"
object. If these are not specified, then the default (which depends on
the algorithm) is used.
"cgm" object of starting values of the model parameters. The parameters
estimated by mdCgm are the cell means and variance-covariance matrix of
a multivariate Gaussian distribution, and log-linear model cell
probabilities.
start may be a list with matrix component mu giving the matrix of means
in each of its ncell columns (where the columns must be in the same
order as the log-linear model cells, and the rows must be in the same
order as the continuous variables), a matrix component sigma giving the
variance-covariance matrix, and a vector pi giving the cell
probabilities. If structural zeros appear in the contingency table,
start$pi must contain zeros to indicate the
structural zeros; see
for details.
"cgm" object created as the paramIter component of the class
"missmodel" object may be input for the starting values. Routines
mdCgm, daCgm, and emCgm may be used to create an appropriate
"missmodel" object.
1s for pi, and a matrix of means and a diagonal matrix of variances
estimated from the numeric observations with no missing values.
If object is a class "missmodel" object, start defaults to the final
estimates in the input "missmodel" object.
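The list form of start described above can be assembled in base R. The sketch below is an assumption-laden illustration: the data matrix y, the sizes ncell and p, and the choice of equal cell probabilities are all invented, and the means and variances are taken from complete rows only, in the spirit of the defaults described above.

```r
# Hypothetical sizes: 2 continuous variables, 6 log-linear model cells.
ncell <- 6
p <- 2

# Made-up continuous data with some missing values.
y <- cbind(height = c(170, 165, NA, 180, 175),
           weight = c(70, NA, 80, 85, 60))
complete <- y[complete.cases(y), , drop = FALSE]

start <- list(
  # p rows (continuous variables) by ncell columns (cells).
  mu = matrix(colMeans(complete), nrow = p, ncol = ncell,
              dimnames = list(colnames(y), NULL)),
  # Diagonal variance-covariance matrix from the complete rows.
  sigma = diag(apply(complete, 2, var)),
  # Equal cell probabilities (an illustrative choice); a structural
  # zero would be marked by a 0 in the corresponding position.
  pi = rep(1 / ncell, ncell)
)
```

The column order of mu must match the order of the log-linear model cells, and its row order must match the order of the continuous variables.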
emCgm.missmodel: if not given, argument control defaults to the control
parameters specified in the call statement of the input "missmodel"
object, but only if these are of the correct class. If these are not
given (or are not of the correct class), then the argument control
defaults to emCgm.control values.
design formula. The elements of the list should have the same
name as the variable and should be either a contrast matrix
(specifically, any full-rank matrix with as many rows as there are
levels in the factor), or else a function to compute such a matrix
given the number of levels.
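As a rough illustration of what such a list looks like, the sketch below uses base R's model.matrix, whose contrasts.arg argument accepts the same kinds of elements (a contrast matrix or a function of the number of levels). The data frame d is invented, and it is an assumption that emCgm treats its contrasts argument analogously. The second part shows the single-column linear contrast written with C(), as in the Examples.

```r
# Hypothetical factors: AGE with 5 levels, SEX with 2 levels.
d <- data.frame(AGE = factor(1:5),
                SEX = factor(c("f", "m", "f", "m", "f")))

# Named list: a contrast matrix for AGE, a function for SEX.
X <- model.matrix(~ AGE + SEX, data = d,
                  contrasts.arg = list(AGE = contr.poly(5),
                                       SEX = contr.sum))
ncol(X)  # intercept + 4 AGE columns + 1 SEX column

# A single linear contrast attached to the factor with C().
lc <- c(-2, -1, 0, 1, 2)
AGE.lin <- C(d$AGE, lc, 1)
X.lin <- model.matrix(~ AGE.lin)
ncol(X.lin)  # intercept + 1 linear-trend column
```

Restricting a five-level factor to one linear column, as C(AGE, lc, 1) does, reduces the number of regression parameters for the cell means.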
"missmodel" is returned; see
for details.
See the help file for for additional details.
emCgm(object = language)  # NOTE: this iterates forever,
                          # suggesting the need for a restricted model.
emCgm.default(object = language)  # same
# Restricted model
# Categorical variables LAN, AGE, PRI, SEX, GRD specify a 5-dimensional
# contingency table with 4*5*5*2*5 = 1000 cells
# Specify loglinear model with all main effects and 2-variable associations:
margins.form <- ~ LAN + AGE + PRI + SEX + GRD +
    LAN:AGE + LAN:PRI + LAN:SEX + LAN:GRD +
    AGE:PRI + AGE:SEX + AGE:GRD +
    PRI:SEX + PRI:GRD +
    SEX:GRD
# Linear contrast
lc <- c(-2, -1, 0, 1, 2)
design.form <- ~ LAN + C(AGE, lc, 1) + C(PRI, lc, 1) + SEX + C(GRD, lc, 1)
# Preprocess
language.pre <- preCgm(language)
# Set hyperparameter to 1.05 to ensure a mode in the
# interior of the parameter space
language.em <- emCgm(language.pre, margins = margins.form,
    design = design.form, prior = 1.05)
# same as:
emCgm.preCgm(language.pre, margins = margins.form,
    design = design.form, prior = 1.05)