Preprocessor for Log-Linear Model Routines

DESCRIPTION:

Sorts and groups the data for subsequent analysis by a log-linear model routine which handles missing values using EM or data augmentation algorithms.

USAGE:

preLoglin(data, frequency, margins, subset) 

REQUIRED ARGUMENTS:

data
a data frame or matrix containing the raw data. When a data frame is input, the table is specified by the levels of the factor variables. When the input is a matrix, the levels of each variable are assumed to be the integers from one to the maximum value of that variable.
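
As an illustration of the matrix convention, the sketch below builds a small data set both ways. The object names (toy.df, toy.mat) and their values are invented for the example. data.matrix is the standard function that replaces each factor by its integer level codes; nothing here calls preLoglin itself.

```r
# Hypothetical two-variable data set; names and values are made up.
toy.df <- data.frame(
    V1 = factor(c("no", "yes", "yes", "no")),
    V2 = factor(c("low", "high", "mid", "low"),
                levels = c("low", "mid", "high"))
)

# Data frame input: the table dimensions come from the factor levels.
sapply(toy.df, function(x) length(levels(x)))   # V1 has 2 levels, V2 has 3

# Matrix input: each column must be coded 1, 2, ..., max.
toy.mat <- data.matrix(toy.df)   # factors become integer level codes
apply(toy.mat, 2, max)           # number of levels assumed for each column
```

A matrix column whose largest code is 3 is therefore treated as a three-level variable, even if one of the codes 1 to 3 never occurs in the data.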

OPTIONAL ARGUMENTS:

frequency
the frequency of the corresponding row in argument data. If data is a data frame and this is the (unquoted) name of a variable in the data frame, then that variable is used. If omitted, all frequencies are assumed to be 1 (unless specified in argument margins).
margins
the marginal totals to be fit. A margin is described by the factors not summed over. Thus list(1:2, 3:4) would indicate fitting the 1,2 margin (summing over variables 3 and 4) and the 3,4 margin in a four-way table. This same model can be specified using the names of the variables (e.g., list(c("V1", "V2"), c("V3", "V4"))), or using formula notation, as in margins = ~V1:V2 + V3:V4. When formula notation is used, the argument frequency can be included as the dependent variable (as in margins = frequency~V1:V2 + V3:V4).

For preLoglin, the margins argument is used only to identify the variables, so different expressions involving the same variables lead to the same preLoglin object. For example, the same preLoglin object results from margins = ~ x + y and from margins = ~ x : y. Similarly, the same preLoglin object can be used in any further analysis involving the same variables. For example, the preLoglin object created using margins = ~ x + y may be used to call emLoglin with margins = ~ x : y.

If margins is not specified: when data is a matrix, then every column is included; when data is a data frame, all factor variables are included. Cell counts in the table are determined by the frequency variable.
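
The three notations for margins can be written side by side. In this sketch the data frame toy, its variables V1 to V4, and the count column are all hypothetical. The preLoglin calls are guarded so that the code can be sourced in a session where the routine is not loaded. The claim that the three results match is the expectation stated above, not a checked output.

```r
# Hypothetical four-way table in long form: one row per cell, with a count.
toy <- expand.grid(V1 = factor(1:2), V2 = factor(1:2),
                   V3 = factor(1:2), V4 = factor(1:2))
toy$count <- rep(c(5, 3), 8)

# Three ways to name the {1,2} and {3,4} margins.
by.index   <- list(1:2, 3:4)
by.name    <- list(c("V1", "V2"), c("V3", "V4"))
by.formula <- count ~ V1:V2 + V3:V4   # frequency given as the response

if (exists("preLoglin")) {
    # All three identify the same variables, so the results should match.
    pre.a <- preLoglin(toy, frequency = count, margins = by.index)
    pre.b <- preLoglin(toy, frequency = count, margins = by.name)
    pre.c <- preLoglin(toy, margins = by.formula)
}
```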
subset
expression specifying which rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of rows), a numeric vector indicating the observation numbers to be included, or a character vector of the row names to be included. All observations are included by default. If data is a data frame, this expression may use variables in the data frame.
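
The forms that subset can take presumably mirror ordinary row indexing. The sketch below uses plain indexing on a hypothetical data frame (toy, with invented row names and values) to show which rows each form would pick; it does not call preLoglin.

```r
# Hypothetical data; names and values are made up.
toy <- data.frame(V1 = factor(c("a", "b", "a", "b")),
                  count = c(4, 0, 7, 2),
                  row.names = c("r1", "r2", "r3", "r4"))

# Logical vector: recycled to the number of rows (here keeps rows 1 and 3).
keep.logical <- toy[rep(c(TRUE, FALSE), length.out = nrow(toy)), ]

# Numeric vector of observation numbers.
keep.numeric <- toy[c(1, 3), ]

# Character vector of row names.
keep.names <- toy[c("r1", "r3"), ]

# An expression using variables in the data frame, such as count > 0,
# evaluates to a logical vector of the first kind.
keep.expr <- toy[toy$count > 0, ]
```

Under that assumption, a call such as preLoglin(toy, frequency = count, subset = count > 0) would use the rows held in keep.expr.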

VALUE:

an object of class "preLoglin".

DETAILS:

This routine performs the preprocessing required before a data set can be analyzed using the data augmentation or EM algorithms. When the data augmentation, EM, or impute routines are called repeatedly on the same data, doing this preprocessing once and reusing the resulting object can significantly speed up the computations.

EXAMPLES:

# The margins formula only identifies the variables, so all three calls
# produce the same preLoglin object.
crime.pre <- preLoglin(data = crime, margins = count ~ Visit.1:Visit.2)
crime.pre <- preLoglin(data = crime, margins = count ~ Visit.1 + Visit.2)  # same
crime.pre <- preLoglin(data = crime, margins = count ~ Visit.1 * Visit.2)  # same