loglin, to allow log-linear models to be specified and fitted in a
manner similar to that of other fitting functions, such as glm.
loglm(formula, data = sys.parent(), subset, na.action, ...)
data argument is required and must be a (complete) array of frequencies.
In this case the variables on the right-hand side may be the names of
the dimnames attribute of the frequency array, or may be the positive
integers 1, 2, 3, ... used as alternative names for the 1st, 2nd,
3rd, ... dimension (classifying factor).
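
The two ways of naming dimensions described above can be sketched as follows. This is a minimal illustration, not part of the original examples: it assumes the built-in HairEyeColor table (a 3-way array with dimnames Hair, Eye and Sex) and the object names by_name and by_index are hypothetical.

```r
library(MASS)  # provides loglm

# HairEyeColor is a complete 3-way array of frequencies
# with named dimnames: Hair, Eye, Sex.
names(dimnames(HairEyeColor))

# Empty left-hand side: the data argument supplies the array.
# Dimensions referred to by their dimnames names ...
by_name <- loglm(~ Hair + Eye + Sex, data = HairEyeColor)

# ... or by position, using 1, 2, 3 as alternative names.
by_index <- loglm(~ 1 + 2 + 3, data = HairEyeColor)

# Both formulas describe the same mutual-independence model,
# so the deviances and degrees of freedom should agree.
c(deviance(by_name), deviance(by_index))
c(by_name$df, by_index$df)
```

Naming dimensions by position is convenient when the array has no names on its dimnames; when names are present, using them tends to make the formula easier to read.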
. to stand for "all other variables in the data frame" is allowed. Any
non-factors on the right-hand side of the formula are coerced to factors.
crosstabs.
TRUE specifies that the (possibly constructed) array of frequencies is
to be retained as part of the fitted model object. The default action
is to use the same value as that used for fit.
loglin.
"loglm"
conveying the results of the fitted log-linear model. Methods exist for
the generic functions print, summary, deviance, fitted, coef, resid,
anova and update, which perform the expected tasks.
Only log-likelihood ratio tests are allowed using anova. The deviance
is simply an alternative name for the log-likelihood ratio statistic
for testing the current model within a saturated model, in accordance
with standard usage in generalized linear models.
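
As a rough check of this statement, the likelihood ratio statistic can be computed by hand from the observed and fitted frequencies and compared with the value returned by deviance. The sketch below assumes the built-in HairEyeColor table as data; the names obs, fv, pos and g2 are introduced only for illustration.

```r
library(MASS)  # provides loglm

# Mutual-independence model; fit = TRUE keeps the fitted values.
fm <- loglm(~ Hair + Eye + Sex, data = HairEyeColor, fit = TRUE)

obs <- HairEyeColor   # observed frequencies
fv  <- fitted(fm)     # fitted frequencies, same shape as obs

# Likelihood ratio statistic against the saturated model:
#   G^2 = 2 * sum(obs * log(obs / fitted)),
# with cells of zero observed frequency contributing nothing.
pos <- obs > 0
g2  <- 2 * sum(obs[pos] * log(obs[pos] / fv[pos]))

# The hand-computed statistic and the deviance should agree.
c(by_hand = g2, deviance = deviance(fm))
```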
If the left-hand side of the formula is empty, the data argument
supplies the frequency array and the right-hand side of the formula is
used to construct the list of fixed faces as required by loglin.
Structural zeros may be specified by giving a start argument with
those entries set to zero, as described in the help information for
loglin.

If the left-hand side is not empty, all variables on the right-hand
side are regarded as classifying factors and an array of frequencies
is constructed. If some cells in the complete array are not specified
they are treated as structural zeros. The right-hand side of the
formula is again used to construct the list of faces on which the
observed and fitted totals must agree, as required by loglin. Hence
terms such as a:b, a*b and a/b are all equivalent.
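
The equivalence of a:b, a*b and a/b can be illustrated with a short sketch. It assumes the built-in HairEyeColor table as data; the object names m_colon, m_star and m_slash are hypothetical. Because only the fixed faces matter, each formula below asks for the same Hair-by-Eye margin plus the Sex margin.

```r
library(MASS)  # provides loglm

# Three spellings of the same set of fixed margins:
# the Hair x Eye face and the Sex face.
m_colon <- loglm(~ Hair:Eye + Sex, data = HairEyeColor)
m_star  <- loglm(~ Hair*Eye + Sex, data = HairEyeColor)
m_slash <- loglm(~ Hair/Eye + Sex, data = HairEyeColor)

models <- list(colon = m_colon, star = m_star, slash = m_slash)

# Deviances and residual degrees of freedom should coincide.
devs <- sapply(models, deviance)
dfs  <- sapply(models, function(m) m$df)
devs
dfs
```

This differs from glm, where a:b without the main effects generally gives a different parameterisation from a*b.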
If structural zeros are present, the calculation of degrees of freedom
may not be correct. loglin itself takes no action to allow for
structural zeros. loglm deducts one degree of freedom for each
structural zero, but cannot make allowance for gains in error degrees
of freedom due to loss of dimension in the model space. (This would
require checking the rank of the model matrix, but since iterative
proportional scaling methods are developed largely to avoid
constructing the model matrix explicitly, the computation is at least
difficult.) When structural zeros (or zero fitted values) are present
the estimated coefficients will not be available due to infinite
estimates. The deviances will normally continue to be correct, though.
# The data frames Cars93, minn38 and quine are available
# in the MASS package.
library(MASS)

# Case 1: frequencies specified as an array.
sapply(minn38, function(x) length(levels(x)))
minn38a <- array(0, c(3,4,7,2), lapply(minn38[, -5], levels))
minn38a[data.matrix(minn38[,-5])] <- minn38$f

fm <- loglm(~1 + 2 + 3 + 4, minn38a)  # numerals as names.
deviance(fm)
fm1 <- update(fm, .~.^2)
fm2 <- update(fm, .~.^3, print = TRUE)
anova(fm, fm1, fm2)

# Case 2: frequencies given as a vector in a data frame.
names(quine)
fm <- loglm(Days ~ .^2, quine)
gm <- glm(Days ~ .^2, poisson, quine)  # check glm.
c(deviance(fm), deviance(gm))          # deviances agree
c(fm$df, gm$df.residual)               # resid df do not!
# The loglm residual degrees of freedom are wrong because of
# a non-detectable redundancy in the model matrix.