Discriminant Analysis
Discriminant analysis is a multivariate technique used to classify observations based on a set of feature data. It is assumed that the feature vectors have a Gaussian distribution.
Choose Statistics > Multivariate > Discriminant Analysis. The dialog shown below appears.
Model page
In the Discriminant Analysis dialog, the Model page has the following options:
Data Set
Choose the data set (training data) containing the feature vectors and the factor identifying group membership.
Weights
Select the column in the data set containing the weights, or enter the name of an existing weight vector or matrix. An expression can be used as well (for example, rep(1,100)). All weights must be positive.
Frequencies
Select the column in the data set containing the frequencies or enter the name of an existing frequencies vector.
Subset Rows
Enter the row indices of the data set to use in the analysis. A range is written with a colon (for example, 1:10), and individual indices or ranges are separated by commas (for example, 1, 2, 3, 4:20).
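As an illustration of how such a row specification expands, here is a minimal Python sketch. It is not S-PLUS code and not how the dialog is implemented; the function name parse_row_spec is hypothetical, and the sketch assumes only the two forms described above (single indices and colon ranges, separated by commas).

```python
def parse_row_spec(spec):
    """Expand a spec such as "1, 2, 3, 4:20" into a list of 1-based row indices."""
    rows = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if ":" in part:
            start, end = (int(p) for p in part.split(":"))
            # A colon range includes both endpoints; a descending range counts down.
            step = 1 if end >= start else -1
            rows.extend(range(start, end + step, step))
        else:
            rows.append(int(part))
    return rows


print(parse_row_spec("1:10"))
print(parse_row_spec("1, 2, 3, 4:20"))
```

Both calls list consecutive rows starting at 1; the second shows that single indices and a range can be mixed in one entry.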
Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
Dependent Variables
Select a column in the Data Set that is the factor that identifies group membership.
Independent Variables
Select the columns in the Data Set that are the feature variables.
Discriminant Analysis Formula
Enter a formula specifying the group variable and feature variables, with the group variable on the left of a ~ operator, and the feature variables, separated by + operators, on the right. The formula is automatically filled if the Dependent and Independent fields are filled first. If a Data Set is given, all names used in the formula should be defined as variables in the data frame.
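The shape of such a formula can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the S-PLUS formula parser: it handles only the simple "group ~ feature + feature" form described above, the helper names split_formula and undefined_names are hypothetical, and the column names in the usage lines are made up for the example.

```python
def split_formula(formula):
    """Split "group ~ x1 + x2" into the group name and a list of feature names."""
    left, sep, right = formula.partition("~")
    if not sep:
        raise ValueError("formula must contain a ~ operator")
    group = left.strip()
    features = [term.strip() for term in right.split("+") if term.strip()]
    if not group or not features:
        raise ValueError("formula needs a group variable and at least one feature")
    return group, features


def undefined_names(formula, columns):
    """Return the names used in the formula that are not columns of the data set."""
    group, features = split_formula(formula)
    return [name for name in [group] + features if name not in columns]


# Hypothetical column names, for illustration only.
columns = ["Species", "Sepal.Length", "Sepal.Width"]
print(split_formula("Species ~ Sepal.Length + Sepal.Width"))
print(undefined_names("Species ~ Sepal.Length + Petal.Length", columns))
```

The second call reports the one name that is not defined in the data set, which is the situation the last sentence above warns against.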
Model
Family Select the family constructor. The choices are classical, common principal component, and canonical.
Covariance Struct. Select the covariance structure of the feature vectors. The choice of covariance structures is dependent on the chosen Family.
Group Prior Select the type of prior for group membership. The choices are proportional, uniform, or none. Alternatively, enter comma-delimited numeric values; if supplied, the values must be positive and sum to one.
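The prior choices can be sketched in Python as follows. This is only an illustration under stated assumptions, not the S-PLUS implementation: "proportional" is taken to mean group frequencies in the training data, "uniform" to mean equal probabilities, and numeric values are assumed to be matched to groups in sorted order. The "none" choice is not handled because the text does not say what it does. The function name group_prior is hypothetical.

```python
import math
from collections import Counter


def group_prior(labels, prior="proportional"):
    """Return a dict mapping each group to its prior probability."""
    counts = Counter(labels)
    groups = sorted(counts)  # assumption: numeric priors follow sorted group order
    if prior == "proportional":
        return {g: counts[g] / len(labels) for g in groups}
    if prior == "uniform":
        return {g: 1.0 / len(groups) for g in groups}
    # Otherwise treat the entry as comma-delimited numeric values.
    values = [float(v) for v in prior.split(",")]
    if len(values) != len(groups):
        raise ValueError("need exactly one prior per group")
    if any(v <= 0 for v in values):
        raise ValueError("priors must be positive")
    if not math.isclose(sum(values), 1.0, abs_tol=1e-8):
        raise ValueError("priors must sum to one")
    return dict(zip(groups, values))


labels = ["a", "a", "b", "c"]  # toy group memberships
print(group_prior(labels, "proportional"))
print(group_prior(labels, "uniform"))
print(group_prior(labels, "0.2, 0.3, 0.5"))
```

An entry such as "0.5, 0.6, 0.1" is rejected by this sketch because the values do not sum to one.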
In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.
Results page
In the Discriminant Analysis dialog, the Results page has the following options:
Printed/Graphic Results
Short Output for Discriminant Analysis
Display the estimates for the group means, covariance matrices, and model coefficients.
Long Output for Discriminant Analysis
Display the estimates for the group means, covariance matrices, model coefficients, and classification error. Compute multivariate tests for the equivalence of the means and tests for normality of the training data.
Plot for Discriminant Analysis
Generate scatter plots of the training data. For the canonical discriminant function, the canonical variates are plotted.
Saved Results for Discriminant Analysis
Save In Enter the name of the data set in which to put the predicted results, or select an existing data set from the dropdown list. If an existing data set is selected, the new columns are appended after its last column. For each observation in the training data, a posterior probability of group membership is estimated. A factor column is also generated that assigns each observation to the group with the highest posterior probability.
Plug-in Select to use the plug-in estimates.
Predictive Select to use the Bayesian approach to discriminant analysis. This option is not available for the canonical discriminant function.
Debiased Select to use a bias correction for the plug-in estimates. This option is not available for the canonical discriminant function.
Crossvalidate Select to use the leave-one-out cross-validated estimates. This option is not available for the proportional, equal correlation, or common principal component covariance models.
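To make the plug-in and leave-one-out ideas concrete, here is a minimal pure-Python sketch. It is not the S-PLUS implementation and makes several assumptions: Gaussian groups sharing one pooled covariance matrix, group means and covariance estimated from the training data and plugged into Bayes' rule, and priors held fixed during cross-validation. The predictive and debiased variants are not shown. All function names and the toy data are hypothetical.

```python
import math


def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_cov(groups):
    """Pooled within-group covariance, with divisor n - k."""
    d = len(next(iter(groups.values()))[0])
    S = [[0.0] * d for _ in range(d)]
    n = 0
    for rows in groups.values():
        m = mean_vec(rows)
        for r in rows:
            dev = [x - mu for x, mu in zip(r, m)]
            for i in range(d):
                for j in range(d):
                    S[i][j] += dev[i] * dev[j]
        n += len(rows)
    return [[s / (n - len(groups)) for s in row] for row in S]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def posteriors(x, groups, priors):
    """Plug-in posterior probability of each group for the feature vector x."""
    S = pooled_cov(groups)
    scores = {}
    for g, rows in groups.items():
        dev = [xi - mi for xi, mi in zip(x, mean_vec(rows))]
        maha = sum(a * b for a, b in zip(dev, solve(S, dev)))  # squared Mahalanobis distance
        scores[g] = math.log(priors[g]) - 0.5 * maha
    top = max(scores.values())  # subtract the maximum for numerical stability
    w = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(w.values())
    return {g: v / total for g, v in w.items()}


def loo_error(groups, priors):
    """Leave-one-out error: refit without each observation, then classify it."""
    wrong = total = 0
    for g, rows in groups.items():
        for i, x in enumerate(rows):
            held = dict(groups)
            held[g] = rows[:i] + rows[i + 1:]
            post = posteriors(x, held, priors)
            wrong += max(post, key=post.get) != g
            total += 1
    return wrong / total


# Toy training data: two well-separated groups with two features each.
groups = {
    "a": [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]],
    "b": [[5.0, 6.0], [5.5, 5.8], [6.0, 6.3], [5.2, 6.6]],
}
priors = {"a": 0.5, "b": 0.5}
post = posteriors([1.0, 2.0], groups, priors)
print(post, max(post, key=post.get))
print(loo_error(groups, priors))
```

Because the covariance is shared across groups, the Gaussian normalizing constant cancels in the posterior, so only the prior and the Mahalanobis distance to each group mean enter the score. The predicted group is the one with the highest posterior probability, matching the factor column described under Save In.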
Related S-Plus language functions
discrim, summary.discrim, plot.discrim, anova.discrim, multicomp.discrim