Foreign Language Attitude Scale data

SUMMARY:

The language and languageImpExample data frames have 279 rows and 12 columns. These data were collected to compare instruments for predicting success in studying foreign languages. Of particular interest is the Foreign Language Attitude Scale (FLAS).

Ten of the variables have missing values. The descriptions below are for language; languageImpExample has the same columns, but its variables with missing values are of class miVariable, which holds the original data together with 10 sets of multiple imputations for the missing values.

DATA DESCRIPTION:

These data frames contain the following columns:

AGE
factor with 11 missing values and 5 levels: age group. For this and other factor variables, use, e.g., levels(language$AGE) to see the levels.
PRI
factor with 11 missing values and 5 levels: number of prior foreign language courses.
SEX
factor with 1 missing value and 2 levels: sex of the student.
GRD
factor with 47 missing values and 5 levels: final grade in foreign language course.
LAN
factor with 0 missing values and 4 levels: foreign language studied.
FLAS
integer vector with 0 missing values: score on Foreign Language Attitude Scale.
MLAT
integer vector with 49 missing values: score on Modern Language Aptitude Test.
SATV
integer vector with 34 missing values: Scholastic Aptitude Test, verbal score.
SATM
integer vector with 34 missing values: Scholastic Aptitude Test, math score.
ENG
integer vector with 37 missing values: score on Penn State English placement exam.
HGPA
numeric vector with 1 missing value: high school grade point average.
CGPA
numeric vector with 34 missing values: current college grade point average.

SOURCE:

Schafer, J. L. (1997), Analysis of Incomplete Multivariate Data, Chapman & Hall, London.

The above book cites Raymond (1987) as the original source of the data, but does not provide bibliographic details in the References section.

REFERENCES:

Schimert, J., Schafer, J. L., Clarkson, D. B., Fraley, C., and Hesterberg, T. (2000), Analyzing Data with Missing Values in S-PLUS, Insightful Corporation, Seattle, Washington, Chapter 11. (This manual is available on-line as file Missing.pdf.)

EXAMPLES:

# Preprocess to save computation below
language.s <- preCgm(language)
# Categorical variables LAN, AGE, PRI, SEX, GRD specify a 5-dimensional
# contingency table with 4*5*5*2*5 = 1000 cells 
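# Sketch (not part of the original example): recompute the cell count from
# the factor levels, assuming language still holds these five factor columns.
# catVars and nCells are illustrative names.
catVars <- c("LAN", "AGE", "PRI", "SEX", "GRD")
nCells <- prod(sapply(language[catVars], function(x) length(levels(x))))
nCells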
# Specify loglinear model with all main effects and 2-variable associations:
margins.form <- ~ LAN + AGE + PRI + SEX + GRD + 
             LAN:AGE + LAN:PRI + LAN:SEX + LAN:GRD + 
             AGE:PRI + AGE:SEX + AGE:GRD + 
             PRI:SEX + PRI:GRD + 
             SEX:GRD 
# Linear contrast coefficients for the 5-level factors AGE, PRI and GRD
lc <- -2:2 
design.form <- ~ LAN + C(AGE,lc,1) + C(PRI,lc,1) + SEX + C(GRD,lc,1) 
# Find a starting value for DA by running EM.
# Set the hyperparameter to 1.05 to ensure a mode in the
# interior of the parameter space (Schafer p. 369).
language.em <- emCgm(language.s, margins = margins.form, 
                     design = design.form, 
                     prior = 1.05, control = list(trace = F)) 
# Raise the S-PLUS object size limit before running DA
options(object.size = 700000000)
# Data-dependent prior with hyperparameters scaled to sum to 50 (Schafer p. 369)
dataDepend <- dataDepPrior(language.s, nPriorObs = 50, algorithm = "da")
language.da <- daCgm(language.em, prior = dataDepend, 
                     control = list(niter = 1000, save = 100:1000, 
                       seed = 620)) 
# Generate 10 imputations, using a single chain 
# with 250 cycles between imputations 
languageImpExample <- impCgm(language.da, nimpute = 10, 
                             control = list(niter = 250, seed = 749))
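# Sketch (not part of the original example): count the missing values in each
# column of language, for comparison with the counts given in DATA DESCRIPTION.
# nMissing is an illustrative name.
nMissing <- sapply(language, function(x) sum(is.na(x)))
nMissing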