Data Augmentation for Log-Linear Models

DESCRIPTION:

Methods for data augmentation assuming a log-linear model.

USAGE:

daLoglin.default(object, frequency, margins, subset, prior = 0.5, 
    start = <<see below>>, control = daLoglin.control()) 
daLoglin.preLoglin(object, margins, prior = 0.5, 
    start = <<see below>>, control = daLoglin.control()) 
daLoglin.missmodel(object, margins, prior = 0.5, 
    start = <<see below>>, control = daLoglin.control()) 

REQUIRED ARGUMENTS:

object
for daLoglin.default: a data frame or matrix containing the raw data. When a data frame is input, the table is specified by the levels of the factor variables. When a matrix is input, it is assumed that the levels of a variable form a sequence of integers from one to the maximum value of the variable.

for daLoglin.preLoglin, an object of class "preLoglin" (produced by the preLoglin function).

for daLoglin.missmodel, an object of class "missmodel" containing the results of a previous log-linear analysis. Any of the functions mdLoglin, completeLoglin, emLoglin, or daLoglin may be used to produce the missmodel object.

OPTIONAL ARGUMENTS:

frequency
The frequency of the corresponding row in argument object. If object is a data frame and this is the (unquoted) name of a variable in the data frame, then that variable is used. If omitted, all frequencies are assumed to be 1 (unless specified in argument margins).
margins
the marginal totals to be fit. A margin is described by the factors not summed over. Thus list(1:2, 3:4) would indicate fitting the 1,2 margin (summing over variables 3 and 4) and the 3,4 margin in a four-way table. This same model can be specified using the names of the variables (e.g., list(c("V1", "V2"), c("V3", "V4"))), or using formula notation, as in margins = ~V1:V2 + V3:V4. When formula notation is used, the argument frequency can be included as the dependent variable (as in margins=frequency~V1:V2 + V3:V4). If margins is not specified, a saturated model is fit.

For daLoglin.default: when a matrix is input as argument object, a saturated model is defined as a model with a single interaction term that includes every column in the data matrix. When a data frame is input, a saturated model includes all factor variables in the single interaction term. Cell counts in the table are determined by the frequency variable.

For daLoglin.missmodel: if not given, argument margins defaults to the margins specified in the call statement of the input "missmodel" object.
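
To make the margin notation concrete, the following sketch builds a small four-way table and computes the two margins named by list(1:2, 3:4). It uses only array and apply and does not call daLoglin; the counts and the names tab, m12, and m34 are hypothetical, chosen for illustration only.

```r
# Hypothetical 2 x 2 x 2 x 2 table of counts (values are made up).
tab <- array(1:16, dim = c(2, 2, 2, 2))

# The 1,2 margin keeps variables 1 and 2 and sums over variables 3 and 4.
m12 <- apply(tab, c(1, 2), sum)

# The 3,4 margin keeps variables 3 and 4 and sums over variables 1 and 2.
m34 <- apply(tab, c(3, 4), sum)
```

Each margin is a two-way table whose entries add up to the grand total of tab.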
subset
expression specifying which rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of rows), a numeric vector indicating the observation numbers to be included, or a character vector of the row names to be included. All observations are included by default. If object is a data frame, this expression may use variables in the data frame.
prior
specifies Dirichlet prior hyperparameters. Supply either a character string, or an object of class "priorLoglin", or an array of hyperparameters.

Valid character strings are "ml" (maximum likelihood) or "noninformative". String matching is used, so the characters "m" or "n" are sufficient. The values of the hyperparameters change with the algorithm. For example, "noninformative" means a common value of 1 for EM and a common value of 0.5 for DA.

A class "priorLoglin" object is created by routine priorLoglin.

See argument start for the order to use in specifying a vector of hyperparameters. If a single numeric value is input, its value is replicated for all cells in the table. The hyperparameters for a data-dependent prior (following an independence model) can be generated using routine dataDepPrior.

The default value is "noninformative". When a class "missmodel" object is input, any value specified in a previous call has priority over the default value (but not over any currently specified value).

Structural zeros must be coded as missing (NA) when a vector of hyperparameters is input as argument prior.

For daLoglin.missmodel: if not given, argument prior defaults to the prior specified in the call statement of the input "missmodel" object. If none was specified there, the prior defaults to 0.5.
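
As a sketch of the vector form of prior, the lines below write out what a single value of 0.5 looks like once replicated over all cells, and then mark one cell as a structural zero. The table size, the choice of cell 4, and the name prior.vec are assumptions made for illustration; nothing here calls daLoglin.

```r
# Hypothetical 2 x 3 table, so there are 6 cells.
ncells <- 2 * 3

# A single value such as 0.5 is replicated for all cells; written out:
prior.vec <- rep(0.5, ncells)

# Suppose, for illustration only, that cell 4 is a structural zero.
# Structural zeros are coded as NA in a vector of hyperparameters.
prior.vec[4] <- NA
```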
start
starting values of the parameters. The parameters estimated by daLoglin methods are the cell probabilities. Thus, start is a vector with length equal to the total length of the table containing a probability estimate for each cell in the table. Starting values for cells that are structural zeros in the table should be zero. The default starting values are all equal to one divided by the number of cells in the table. Suppose that the table is defined by the variables X1, X2, and X3. Then the cells in the table are ordered such that the index for variable X1 varies fastest, the index for variable X2 varies next fastest, etc.

For daLoglin.missmodel: If not given and if argument margins is not specified, then argument start defaults to the final estimates in the input "missmodel" object. If argument margins is specified, then argument start must be provided. Also notice that when argument margins is specified, care must be taken to ensure that structural zeros in these final estimates are also structural zeros in the new model.
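
The cell ordering described for start can be checked with a short sketch. It assumes three hypothetical variables X1, X2, and X3 with 2, 3, and 2 levels, and uses expand.grid, which also varies its first factor fastest. The names cells and start.vals are illustrative, and the code does not call daLoglin.

```r
# Hypothetical variables: X1 with 2 levels, X2 with 3 levels, X3 with 2 levels.
cells <- expand.grid(X1 = 1:2, X2 = 1:3, X3 = 1:2)

# The first rows show X1 changing fastest, then X2, then X3.
cells[1:4, ]

# Default-style starting values: one divided by the number of cells.
start.vals <- rep(1 / nrow(cells), nrow(cells))
```

Row i of cells gives the variable levels of the cell that element i of a start vector (or of a vector of prior hyperparameters) refers to.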
control
A list of parameters used to control the algorithm, such as the list produced by daLoglin.control().

For daLoglin.missmodel: if not given, argument control defaults to the control parameters specified in the call statement of the input "missmodel" object, but only if these are of the correct class. If these are not given (or cannot be used), then the argument control defaults to daLoglin.control.

VALUE:

An object of class "missmodel" is returned.

SIDE EFFECTS:

All methods create the data set .Random.seed if it does not already exist; otherwise, they update its value.

DETAILS:

See the help file for for additional details.

EXAMPLES:

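# Default method: raw data in the data frame crime, with the frequency
# variable count given as the dependent variable of the margins formula.
daLoglin.default(object = crime, margins = count~Visit.1:Visit.2,
              control = list(save = 101:500))

# preLoglin method: build the "preLoglin" object first, then run
# data augmentation on it.
attach(crime)
crime.pre <- preLoglin(data = crime, frequency = count,
              control = list(save = 101:500))
daLoglin.preLoglin(crime.pre, margins = ~Visit.1:Visit.2,
              control = list(save = 101:500))

# missmodel method: continue from an EM fit. With margins and start
# omitted, they default to the margins and final estimates of crime.em.
crime.em <- emLoglin(object = crime, margins = count~Visit.1:Visit.2)
daLoglin.missmodel(object = crime.em,
              control = list(save = 101:500))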