completeLoglin(data, frequency, margins, subset, prior = 1, start = NULL, control = emLoglin.control())
data
.
If data is a data frame and this is the (unquoted) name of a variable
in the data frame, then that variable is used. If omitted, all
frequencies are assumed to be 1 (unless specified in argument margins).
list(1:2, 3:4) would indicate fitting the 1,2 margin (summing over
variables 3 and 4) and the 3,4 margin in a four-way table. This same
model can be specified using the names of the variables (e.g.,
list(c("V1", "V2"), c("V3", "V4"))), or using formula notation, as in
margins = ~V1:V2 + V3:V4. When formula notation is used, the argument
frequency can be included as the dependent variable (as in
margins = frequency~V1:V2 + V3:V4).
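The idea of "fitting a margin" can be illustrated outside of
completeLoglin. The following is a minimal Python sketch, not part of
this library: the helper marginal_table and the 2 x 2 x 2 x 2 table of
unit counts are hypothetical, and positions are 0-based, so the 1,2
margin of the text corresponds to keep=(0, 1).

```python
from collections import defaultdict
from itertools import product


def marginal_table(cells, keep):
    """Sum cell counts over every variable whose position is not in `keep`."""
    margin = defaultdict(int)
    for levels, count in cells.items():
        margin[tuple(levels[i] for i in keep)] += count
    return dict(margin)


# Hypothetical four-way table: two levels per variable, every count equal to 1.
cells = {levels: 1 for levels in product((1, 2), repeat=4)}

# The 1,2 margin sums over variables 3 and 4; the 3,4 margin sums over 1 and 2.
margin_12 = marginal_table(cells, keep=(0, 1))
margin_34 = marginal_table(cells, keep=(2, 3))

print(margin_12)
print(margin_34)
```

Each marginal table has four cells, and each cell holds the sum of the
four full-table cells that share its two retained levels.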
If margins is not specified, a saturated model is fitted. When a matrix
is input as argument data, a saturated model is defined as a model with
a single interaction term that includes every column in the data
matrix. When a data frame is input, a saturated model includes all
factor variables in the single interaction term. Cell counts in the
table are determined by the frequency variable.
If data is a data frame, this expression may use variables in the data
frame.
"priorLoglin", or an array of hyperparameters.
"ml" (maximum likelihood) and "noninformative". String matching is
used, so the characters "m" or "n" are sufficient. The values of the
hyperparameters change with the algorithm (see for details). For
example, "noninformative" means a common value of 1 for EM, and a
common value of 0.5 for DA.
A "priorLoglin" object is created by routine priorLoglin.
start for the order to use in specifying a vector of hyperparameters.
If a single numeric value is input, its value is replicated for all
cells in the table. The hyperparameters for a data-dependent prior
(following an independence model) can be generated using routine
dataDepPrior.
See for details.
"noninformative". When a class "missmodel" object is input, any value
specified in a previous call has priority over the default value (but
not over any currently specified value).
(NA), when a vector of hyperparameters is input as argument prior.
mdLoglin are the cell probabilities. Thus, start is an array with
length equal to the total number of cells in the table and containing a
probability estimate for each cell. Starting values for cells that are
structural zeros in the table should be zero. The default starting
values are all equal to one divided by the total number of cells in the
table. Suppose that the table is defined by the variables X1, X2, and
X3. Then the cells in the table are ordered such that the index for
variable X1 varies fastest, the index for variable X2 varies next
fastest, etc.
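The cell ordering described above (first variable varying fastest) can
be made concrete with a short Python sketch. It is an illustration
only: the helper cell_index, the 0-based level codes, and the table
dimensions (2, 3, 2) are assumptions chosen for the example, not part
of this library.

```python
def cell_index(levels, dims):
    """0-based position of a cell when the first variable varies fastest."""
    index, stride = 0, 1
    for level, size in zip(levels, dims):
        index += level * stride
        stride *= size
    return index


# Hypothetical numbers of levels for X1, X2, and X3.
dims = (2, 3, 2)

# Enumerate all cells, then sort them into the order used for `start`.
order = sorted(
    ((i, j, k) for i in range(2) for j in range(3) for k in range(2)),
    key=lambda cell: cell_index(cell, dims),
)

# Default starting values: one divided by the total number of cells.
n_cells = len(order)
default_start = [1.0 / n_cells] * n_cells

print(order[:4])
```

Stepping X1 moves to the next position, stepping X2 skips over all
levels of X1, and stepping X3 skips over all combinations of X1 and X2.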
maxit, tolerance, and trace.
"missmodel" is returned; see for details. In the class "missmodel"
object returned by completeLoglin, the paramIter component contains one
or more rows of parameter estimates, and the algorithm element contains
an object of class "em".
The completeLoglin function computes estimates of the cell
probabilities in hierarchical log-linear models. A hierarchical
log-linear model is a multinomial model that expresses the log of the
multinomial cell probabilities as a linear factorial model. In a
hierarchical model, including an interaction effect automatically
includes every lower-order effect formed from the same factors. For
example, for factors A, B, and C, inclusion of A:B:C automatically
means that A, B, C, A:B, A:C, and B:C are also included in the model.
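The hierarchy rule can be checked mechanically. The following Python
sketch is an illustration only, with a hypothetical helper named
implied_terms; it lists every term that a single interaction brings
into a hierarchical model, including the interaction itself.

```python
from itertools import combinations


def implied_terms(term):
    """List all effects included when `term` enters a hierarchical model."""
    factors = term.split(":")
    terms = []
    # Every non-empty subset of the factors is an included effect.
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return terms


print(implied_terms("A:B:C"))
# ['A', 'B', 'C', 'A:B', 'A:C', 'B:C', 'A:B:C']
```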
Agresti, A. (1990), Categorical Data Analysis, John Wiley & Sons,
New York.
Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975),
Discrete Multivariate Analysis: Theory and Practice, MIT Press,
Cambridge, MA.
completeLoglin(data = na.omit(crime), margins = count~Visit.1:Visit.2)