The language and languageImpExample data frames have 279 rows and 12 columns.
These data were collected to compare instruments
for predicting success in the study of foreign languages.
Of particular interest is the Foreign
Language Attitude Scale (FLAS).
Ten of the variables have missing values.
The descriptions below are for language; languageImpExample has the same
columns, but variables with missing values are now of class miVariable,
containing the original data and 10 sets of multiple imputations for the
missing values.
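The statement that ten of the variables have missing values can be checked by counting NAs column by column. The following is a minimal sketch only: toy is a small hypothetical stand-in invented for illustration, not the real data, and the same two lines should apply to language once the data frame is available.

```r
# Hypothetical stand-in for the real data frame (values invented for illustration).
toy <- data.frame(LAN = c(1, 2, NA, 4),
                  AGE = c(NA, 2, 3, 1),
                  SEX = c(1, 2, 1, 2))

# Number of missing values in each column.
nmiss <- sapply(toy, function(x) sum(is.na(x)))
print(nmiss)

# Number of columns that contain at least one missing value.
ncol.miss <- sum(nmiss > 0)
print(ncol.miss)
```

Replacing toy with language should report the per-column counts of missing values for the 12 columns described here.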
These data frames contain the following columns:
Use levels(language$AGE) to see the levels.
Schafer, J. L. (1997), Analysis of Incomplete Multivariate Data, Chapman & Hall, London.
The above book cites Raymond (1987) as the original source of the data,
but does not provide bibliographic details in the References section.
Schimert, J., Schafer, J. L., Clarkson, D. B., Fraley, C., and Hesterberg, T. (2000), Analyzing Data with Missing Values in S-PLUS, Insightful Corporation, Seattle, Washington, Chapter 11. (This manual is available online as the file Missing.pdf.)
# Preprocess to save computation below
language.s <- preCgm(language)

# Categorical variables LAN, AGE, PRI, SEX, GRD specify a 5-dimensional
# contingency table with 4*5*5*2*5 = 1000 cells.
# Specify loglinear model with all main effects and 2-variable associations:
margins.form <- ~ LAN + AGE + PRI + SEX + GRD +
    LAN:AGE + LAN:PRI + LAN:SEX + LAN:GRD +
    AGE:PRI + AGE:SEX + AGE:GRD +
    PRI:SEX + PRI:GRD + SEX:GRD

# Linear contrast
lc <- -2:2
design.form <- ~ LAN + C(AGE,lc,1) + C(PRI,lc,1) + SEX + C(GRD,lc,1)

# Find a starting value for DA, by running EM.
# Set the hyperparameter to 1.05 to ensure a mode in the
# interior of the parameter space (Schafer p. 369).
language.em <- emCgm(language.s, margins = margins.form,
    design = design.form, prior = 1.05,
    control = list(trace = F))

# Data dependent prior with hyperparameters scaled to add to 50 (page 369)
options(object.size = 700000000)
dataDepend <- dataDepPrior(language.s, nPriorObs = 50, algorithm = "da")
language.da <- daCgm(language.em, prior = dataDepend,
    control = list(niter = 1000, save = 100:1000, seed = 620))

# Generate 10 imputations, using a single chain
# with 250 cycles between imputations
languageImpExample <- impCgm(language.da, nimpute = 10,
    control = list(niter = 250, seed = 749))
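The arithmetic behind the comments in the example can be reproduced with a few lines that need neither the data nor the missing-data library. This is only an illustrative check: the level counts are taken from the "4*5*5*2*5" comment, and the object names (nlev, ncells, and so on) are hypothetical.

```r
# Levels per categorical variable, as given in the example's comment.
nlev <- c(LAN = 4, AGE = 5, PRI = 5, SEX = 2, GRD = 5)

# Cells in the full 5-way contingency table.
ncells <- prod(nlev)

# Terms in the loglinear model: main effects plus all two-variable associations.
nmain <- length(nlev)
npairs <- nmain * (nmain - 1) / 2
nterms <- nmain + npairs

# With 279 rows, at least this many cells of the table must be empty.
nobs <- 279
min.empty <- ncells - nobs

print(c(cells = ncells, pairs = npairs, terms = nterms, min.empty = min.empty))
```

The count of two-variable associations matches the ten interaction terms written out in margins.form, and the comparison of 279 rows against 1000 cells shows that the table is necessarily sparse.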