Principal Components
For investigations involving a large number of observed variables, it is often useful to simplify the analysis by considering a smaller number of linear combinations of the original variables. For example, scholastic achievement tests typically consist of a number of examinations in different subject areas. In attempting to rate students applying for admission, college administrators frequently attempt to reduce the scores from all subject areas to a single, overall score. Principal components is a standard technique for finding optimal linear combinations of the variables.
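As a rough illustration of the idea, the sketch below reduces three subject scores to a single overall score by projecting each student onto the first principal component. It is not the S-PLUS implementation: the data are made up, the function names are hypothetical, and power iteration is used here only because it needs nothing beyond the Python standard library.

```python
import math

def covariance_matrix(rows):
    """Return (column means, sample covariance matrix with divisor n - 1)."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return means, cov

def first_component(cov, iterations=500):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical test scores: one row per student, one column per subject area.
scores = [
    [62.0, 70.0, 58.0],
    [75.0, 82.0, 71.0],
    [88.0, 90.0, 85.0],
    [54.0, 61.0, 50.0],
    [93.0, 95.0, 91.0],
]

means, cov = covariance_matrix(scores)
weights = first_component(cov)  # coefficients of the linear combination

# Overall score: centered subject scores weighted by the first component.
overall = [sum(w * (x - m) for w, x, m in zip(weights, row, means))
           for row in scores]
```

The weights have unit length, and the overall scores are centered at zero, so a positive score indicates above-average performance on the combined scale.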
To perform principal components analysis:
Choose Statistics > Multivariate > Principal Components. The Principal Components Analysis dialog appears.
Model page
In the Principal Components Analysis dialog, the Model page has the following options:
Data
Data (Principal Components)
In the Data group you have the choice of using a data set and then specifying a formula, or using a covariance list to perform the principal components analysis. The appropriate fields become enabled depending upon your choice.
Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.
Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
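The effect of these two row-selection options can be sketched as follows. This is an illustration only: a Python predicate stands in for the S-PLUS subset expression, missing values are represented as None or NaN, and the function names are hypothetical.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def omit_missing(rows):
    """Drop every row that has a missing value in any variable."""
    return [row for row in rows if not any(is_missing(v) for v in row)]

def subset_rows(rows, keep):
    """Keep only the rows for which the predicate `keep` is true."""
    return [row for row in rows if keep(row)]

rows = [
    [1.0, 2.0],
    [None, 3.0],
    [4.0, float("nan")],
    [5.0, 6.0],
]

complete = omit_missing(rows)
selected = subset_rows(complete, lambda row: row[0] > 2)
```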
Use Covariance List as Input
Select this to use a covariance list as model input, instead of a data set. Selecting this enables the Covariance List field and makes the other Data fields and Formula fields unavailable.
Covariance List
Enter the name of a covariance list to be used as alternative model input. This list must have the form of a list returned by cov.wt or cov.mve. It must include the components center and cov. A cor component is not used; however, an n.obs component is used if present.
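To make the shape of such a list concrete, the sketch below builds a Python dictionary with the same component names (center, cov, n.obs) from raw data. It is an analogue for illustration, not the output of cov.wt or cov.mve; the function name is hypothetical, and the n - 1 divisor is an assumption that may not match those functions.

```python
def covariance_list(rows):
    """Build a dict mirroring the components named above: center, cov, n.obs."""
    n = len(rows)
    p = len(rows[0])
    center = [sum(r[j] for r in rows) / n for j in range(p)]
    # Assumption: unweighted sample covariance with divisor n - 1.
    cov = [[sum((r[i] - center[i]) * (r[j] - center[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return {"center": center, "cov": cov, "n.obs": n}

# Hypothetical data: three observations on two variables.
covlist = covariance_list([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
```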
Formula
Formula (Principal Components)
Variables
Choose several variables to include in the principal components analysis.
Formula
The Formula edit field is automatically filled using the variables selected from the Variables dropdown list. There is no response variable for principal components analysis; the formula shows the selected variables additively, following a tilde (~). The formula field may be edited directly.
Model Scaling
Select either Covariance (unscaled) or Correlation (scaled to have unit variance) to define the scaling on which the computation of principal components is based. The default is Covariance.
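The difference between the two scalings can be seen by converting a covariance matrix to a correlation matrix. The following is a minimal sketch with made-up numbers and a hypothetical function name; it shows only the rescaling step, not the full analysis.

```python
import math

def cov_to_cor(cov):
    """Rescale a covariance matrix so that every variable has unit variance."""
    p = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(p)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

# Hypothetical covariance matrix: the second variable has a much larger variance.
cov = [[4.0, 6.0],
       [6.0, 100.0]]
cor = cov_to_cor(cov)

# Share of total variance contributed by each variable under each scaling.
cov_share = [cov[i][i] / (cov[0][0] + cov[1][1]) for i in range(2)]
cor_share = [cor[i][i] / (cor[0][0] + cor[1][1]) for i in range(2)]
```

Under the Covariance scaling the second variable accounts for most of the total variance and so tends to dominate the first component; under the Correlation scaling both variables contribute equally.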
Results page
In the Principal Components Analysis dialog, the Results page has the following options:
Short Output for Principal Components Analysis
Select this to print a summary of the model results in the designated output window. Printed results include sums of squares of the component loadings, the size of the data, the names of the components in the fitted model object, and the call that created the model object.
Component Importance
Select this to include the importance of each component in the printed results.
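Importance is commonly summarized by each component's standard deviation, its proportion of the total variance, and the cumulative proportion. The sketch below computes these from a set of component variances (eigenvalues). The values and the function name are hypothetical, and the exact layout of the printed S-PLUS table is not reproduced here.

```python
import math

def importance(variances):
    """Return (component number, std. deviation, proportion, cumulative proportion)."""
    total = sum(variances)
    rows = []
    cumulative = 0.0
    for k, var in enumerate(variances, start=1):
        proportion = var / total
        cumulative += proportion
        rows.append((k, math.sqrt(var), proportion, cumulative))
    return rows

# Hypothetical component variances, largest first.
table = importance([6.0, 3.0, 1.0])
```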
Loadings
Select this to include the loadings matrix with the printed results.
Loading Options
In the Cutoff Loading Value field enter a number giving the cutoff for printing the loadings. Elements of the loadings matrix whose absolute value is smaller than the cutoff value appear as blanks. This field is only enabled when Loadings is selected.
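The blanking rule can be sketched as below. The cutoff of 0.1, the column width, and the loadings themselves are illustrative assumptions, not S-PLUS defaults, and the function name is hypothetical.

```python
def format_loadings(loadings, cutoff=0.1, width=8):
    """Format a loadings matrix, blanking entries with |value| < cutoff."""
    lines = []
    for row in loadings:
        cells = [f"{v:{width}.3f}" if abs(v) >= cutoff else " " * width
                 for v in row]
        lines.append("".join(cells))
    return lines

# Hypothetical loadings: rows are variables, columns are components.
loadings = [
    [0.707, 0.050],
    [0.705, -0.060],
    [0.040, 0.997],
]
lines = format_loadings(loadings, cutoff=0.1)
```

Blanking small entries makes it easier to see which variables contribute materially to each component.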
Plot page
In the Principal Components Analysis dialog, the Plot page has the following options:
Screeplot
Select this to produce a barplot of eigenvalues for each principal component.
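A text-only stand-in for such a barplot is sketched below: one bar per component, with length proportional to the eigenvalue. The eigenvalues, the component labels, and the function name are hypothetical; the actual plot is graphical.

```python
def text_screeplot(variances, width=40):
    """Return one text bar per component, scaled to the largest eigenvalue."""
    largest = max(variances)
    lines = []
    for k, var in enumerate(variances, start=1):
        bar = "#" * round(width * var / largest)
        lines.append(f"Comp.{k} {bar} {var:.2f}")
    return lines

# Hypothetical eigenvalues, largest first.
bars = text_screeplot([6.0, 3.0, 1.0])
```

A sharp drop in bar length after the first few components suggests that the later components carry little additional variance.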
Biplot
Select this to produce a biplot between two factors of the fitted model (Factor Analysis) or two components of the fitted model (Principal Components Analysis). The biplot shows the relation of the factors or components to both the original variables and the original data. This field is enabled only when the number of factors or components to be fitted is greater than one.
Biplot Options
Biplot Which Scores
Enter the two factors or components to be plotted, in the form c(factor1, factor2). By default, a biplot of the first two factors or components is created. This field is enabled only when Biplot is selected.
Predict page
In the Principal Components Analysis dialog, the Predict page has the following options:
New Data
Enter the name of a matrix or data set to use for computing predictions. It must contain the same names as the terms in the right side of the formula for the model. If omitted, the original data are used for computing predictions.
Save In
Enter the name of a data set in which a part of the analysis, such as fitted values and residuals, predictions, confidence intervals, or standard errors, is saved.
Predictions
Select this to save predictions to the data set specified in Save In.
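For a principal components model, predictions for new data are usually the component scores: each new row is centered with the fitted center and multiplied by the loadings. The sketch below assumes that definition and the Covariance scaling (with the Correlation scaling, each variable would also be divided by its fitted scale). The center, loadings, new data, and function name are all hypothetical.

```python
def predict_scores(new_rows, center, loadings):
    """Project rows onto the components.

    score[k] = sum over j of (x[j] - center[j]) * loadings[j][k]
    """
    p = len(center)
    q = len(loadings[0])
    return [[sum((row[j] - center[j]) * loadings[j][k] for j in range(p))
             for k in range(q)]
            for row in new_rows]

# Hypothetical fitted model: two variables, two components.
center = [3.0, 6.0]
loadings = [[0.6, -0.8],
            [0.8, 0.6]]

# New observations with the same variables, in the same order.
new_data = [[4.0, 6.0], [3.0, 11.0]]
predictions = predict_scores(new_data, center, loadings)
```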
Related S-PLUS language functions for Principal Components Analysis
princomp, princomp.object, loadings, biplot.princomp, screeplot, plot.loadings
Other related S-PLUS language functions
svd, cancor, factanal