ANOVA
Analysis of variance (ANOVA) is generally used to explore the influence of one or more categorical variables upon a continuous response. Fixed effects ANOVA differs from linear regression only in the types of summaries desired. The ANOVA and linear regression models are otherwise equivalent.
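The equivalence can be checked directly with the aov and lm functions listed at the end of this topic. The following is a minimal sketch in the S language; the data set dat and its columns yield and treatment are made up for illustration.

```r
# Hypothetical data: a continuous response measured under three treatments.
dat <- data.frame(
  yield = c(4.2, 4.8, 5.1, 5.9, 6.3, 6.0, 7.1, 7.4, 6.8),
  treatment = factor(rep(c("a", "b", "c"), c(3, 3, 3)))
)

# Fit the same model two ways.
fit.aov <- aov(yield ~ treatment, data = dat)
fit.lm <- lm(yield ~ treatment, data = dat)

# The estimated coefficients agree; only the default summaries differ.
coef(fit.aov)
coef(fit.lm)
summary(fit.aov)  # sums of squares table
summary(fit.lm)   # coefficient table
```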
To perform fixed effects analysis of variance
Choose Statistics > ANOVA > Fixed Effects. The dialog shown below appears.
Model page
In the Analysis of Variance dialog, the Model page has the following options:
Data
Data Set
Select a data set from the dropdown list or type the name of a data set. You can also type into the Data Set edit field any expression that evaluates to a data set.
Weights
Enter the column that specifies weights to be applied to all observations used in the analysis. To weight all rows equally, leave this blank.
Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.
Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
Variables
Dependent Variables
Select a variable as the dependent variable in the formula. The variable name will appear in the formula field below, followed by a '~'.
Select one or more variables as the independent variables, or predictors, in the formula. To select more than one variable, Ctrl-click the variables.
In the Formula field, enter a formula specifying the desired model. In its simplest form, a formula consists of the response variable, a tilde (~), and a list of predictor variables separated by plus signs (+). An intercept is included by default.
Create Formula
Click the Create Formula button to open a formula builder dialog used to construct a formula specifying the desired model. See the online Help section Building Formulas for more information.
In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.
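The Model page settings correspond roughly to arguments of aov. The sketch below is an assumption about that mapping rather than the code the dialog generates; the data set dat and the columns yield, treatment, plot.id, and w are hypothetical.

```r
# Hypothetical data with one missing response and a weights column.
dat <- data.frame(
  yield = c(4.2, 4.8, NA, 5.9, 6.3, 6.0, 7.1, 7.4, 6.8, 5.5),
  treatment = factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "a")),
  plot.id = 1:10,
  w = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 1)
)

# Assigning the result to a name plays the role of the Save As field.
fit <- aov(yield ~ treatment,      # formula: response ~ predictors
           data = dat,             # the data set
           weights = w,            # column of observation weights
           subset = plot.id <= 9,  # expression selecting the rows to use
           na.action = na.omit)    # omit rows with missing values

# Rows 1 to 9 are selected and row 3 is dropped for its missing response.
length(resid(fit))
```

The saved object can then be passed to later functions such as summary, coef, or plot.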
Options page
In the Analysis of Variance dialog, the Options page has the following options:
Contrasts
Assign Contrast
Choose contrasts for the factors. By default, Helmert contrasts are assigned to unordered factors and polynomial contrasts are assigned to ordered factors.
to Variable(s)
Select one or more variables to which the contrast selected in Assign Contrast will be assigned.
Contrasts
This field displays the selection and assignment chosen in Assign Contrast and to Variable(s).
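At the command line, contrasts can be inspected and assigned with the standard S functions contr.helmert, contr.treatment, and contrasts. The sketch below uses a made-up data set dat; it illustrates the idea and is not necessarily what the dialog runs.

```r
# Helmert contrast matrix for a 3-level factor: rows are levels,
# columns are the K-1 = 2 contrasts.
contr.helmert(3)

# Hypothetical data with one unordered 3-level factor.
dat <- data.frame(
  yield = c(4.2, 4.8, 5.1, 5.9, 6.3, 6.0, 7.1, 7.4, 6.8),
  treatment = factor(rep(c("a", "b", "c"), c(3, 3, 3)))
)

# Assign treatment contrasts to a single variable, overriding the default.
contrasts(dat$treatment) <- contr.treatment(3)
fit <- aov(yield ~ treatment, data = dat)

# With treatment contrasts the intercept is the mean of the first level.
coef(fit)

# Session-wide defaults for unordered and ordered factors can be set
# with options(); the previous setting is restored afterwards.
old <- options(contrasts = c("contr.helmert", "contr.poly"))
options(old)
```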
Results page
In the Analysis of Variance dialog, the Results page has the following options:
Printed Results
This option is selected by default.
This option is selected by default. The sums of squares decomposition reflects the amount of variance each term contributes to the overall model variation.
Select to print the Type III Sums of Squares.
Estimated Coefficients
Select this to print the estimated coefficients. There are K-1 such coefficients for each K-level factor.
Estimated K Coef for K-Level Factor
Print K coefficients for each K-level factor.
Means
Select to print the mean values.
Adjusted Means
Select to print the adjusted mean values.
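Several of these printed results have command-line counterparts in the functions listed at the end of this topic. The following sketch assumes a hypothetical data set dat with a 3-level factor; the use of tapply for the means is an illustration, not necessarily how the dialog computes them.

```r
# Hypothetical one-way layout with K = 3 levels.
dat <- data.frame(
  yield = c(4.2, 4.8, 5.1, 5.9, 6.3, 6.0, 7.1, 7.4, 6.8),
  treatment = factor(rep(c("a", "b", "c"), c(3, 3, 3)))
)
fit <- aov(yield ~ treatment, data = dat)

summary(fit)     # ANOVA table with the sums of squares decomposition
coef(fit)        # intercept plus K-1 = 2 coefficients for the factor
dummy.coef(fit)  # K = 3 coefficients for the factor, one per level

# Mean response at each factor level.
tapply(dat$yield, dat$treatment, mean)

# The decomposition by hand: total variation splits into the part
# explained by treatment and the residual part.
ss.total <- sum((dat$yield - mean(dat$yield))^2)
ss.resid <- sum(resid(fit)^2)
ss.treatment <- ss.total - ss.resid
```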
Saved Results
Enter the name of a data set in which a part of the analysis, such as fitted values and residuals, predictions, confidence intervals, or standard errors, is saved. If an object with the name you enter does not already exist (in database 1), then it is created.
Fitted Values
Save the fitted values from the model in the object specified in Save In.
Residuals
Save the residuals from the model in the object specified in Save In. These are the ordinary residuals (the response minus the fitted value).
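The same quantities can be extracted from a fitted model with fitted and resid. This is a minimal sketch; the data set dat and the object name saved are hypothetical.

```r
# Hypothetical data and model.
dat <- data.frame(
  yield = c(4.2, 4.8, 5.1, 5.9, 6.3, 6.0, 7.1, 7.4, 6.8),
  treatment = factor(rep(c("a", "b", "c"), c(3, 3, 3)))
)
fit <- aov(yield ~ treatment, data = dat)

# Collect fitted values and ordinary residuals in one data set.
saved <- data.frame(fitted = fitted(fit), residual = resid(fit))

# Ordinary residuals are the response minus the fitted value,
# so the two columns add back up to the response.
all.equal(dat$yield, saved$fitted + saved$residual)
```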
Plot page
In the Analysis of Variance dialog, the Plot page has the following options:
Plots
Residuals vs Fit
Select this to display a plot of the residuals versus the fitted values.
Sqrt Abs Residuals vs Fit
Display a plot of the square root of the absolute values of the residuals versus the fitted values. This plot is useful for checking for the constant variance assumption of the model.
Response vs Fit
Display a plot of the response variable versus the fitted values. The line y = x is also drawn on the graph.
Residuals Normal QQ
Display a normal quantile-quantile plot of the residuals.
Residual-Fit Spread
Display a residual-fit spread plot. This is a visual analog of the multiple R-squared statistic. It compares the spread of the fitted values to the spread of the residuals.
Cook's Distance
Display a plot of Cook's distance values versus the observation number.
Display partial residual plots for all the terms in the model.
Options
Include Smooth
Display a smooth curve, computed with loess.smooth, on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Response vs Fit plots. See the online Help for loess.smooth for details.
Include Rugplot
Display a rugplot on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Response vs Fit plots. A rugplot is a sequence of vertical bars along the x-axis that mark the "observed" x values.
Number of Extreme Points to Identify
Enter the number of extreme points that are identified on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, Residuals Normal QQ, and Cook's Distance plots. The row names from the data set specified on the model page are used to identify the points.
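Some of these diagnostic plots can be drawn by hand from the fitted values and residuals, which makes clear what each one shows. The sketch below uses made-up data with a continuous covariate (moisture) so that the fitted values vary; it approximates the dialog's plots and is not its actual code. Calling plot on the fitted object (plot.lm is listed at the end of this topic) is the more direct route.

```r
# Hypothetical data: 20 observations, a 2-level factor and a covariate.
moisture <- seq(10, 29)
treatment <- factor(rep(c("a", "b"), 10))
yield <- 2 + 0.3 * moisture + as.numeric(treatment == "b") + sin(1:20)
dat <- data.frame(yield = yield, treatment = treatment, moisture = moisture)

fit <- aov(yield ~ treatment + moisture, data = dat)
f <- fitted(fit)
r <- resid(fit)

par(mfrow = c(2, 2))

# Residuals vs Fit, with a loess smooth and a rugplot of the fitted values.
plot(f, r, xlab = "Fitted", ylab = "Residuals")
lines(loess.smooth(f, r))
rug(f)

# Sqrt Abs Residuals vs Fit: a trend suggests non-constant variance.
plot(f, sqrt(abs(r)), xlab = "Fitted", ylab = "sqrt(abs(Residuals))")
lines(loess.smooth(f, sqrt(abs(r))))

# Response vs Fit, with the line y = x.
plot(f, dat$yield, xlab = "Fitted", ylab = "Response")
abline(0, 1)

# Normal quantile-quantile plot of the residuals.
qqnorm(r)
qqline(r)
```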
Partial Residual Plot Options
Include Partial Fit
Include the partial fit for the term on the plot.
Include Rugplot
Display rugplots on the partial residual plots. A rugplot is a sequence of vertical bars along the x-axis that mark the "observed" x values.
Common Y-Axis Scale
Give all the partial residual plots the same vertical units. This is essential for comparing the importance of fitted terms in additive models.
Compare page
In the Analysis of Variance dialog, the Compare page has the following options:
Variable
Levels Of
Select the term in the model to which comparisons will be made. This list is empty until a selection has been made in Model Object.
Comparison Type
Select the type of comparisons to be made among the adjusted means.
mca: all pairwise differences among the adjusted means.
mcc: differences between each adjusted mean and the adjusted mean for the factor level specified in Compare To Level.
none: the adjusted means themselves, without further differencing.
Compare To Level
Select the factor level to which all other levels will be compared. This field is available only when Comparison Type is set to mcc.
Results
Enter the name for the object in which to save the results of the analysis.
Print Results
Select this to print out the results of the analysis in the designated output window.
Plot Intervals
Select this for a graphical representation of the intervals.
Options
Choose a method for critical point calculation from the dropdown list. The following options are available:
Confidence Level for Multiple Comparisons
Enter the joint confidence level desired. This value should be less than 1 and greater than 0.
Bounds
Select upper.and.lower for confidence intervals. For one-sided confidence bounds, select either upper or lower.
Error Type
Select the error rate type. If family-wise is selected, the probability that all bounds hold is the level specified in Confidence Level. If comparison-wise is selected, the probability that any one pre-selected bound holds is the level specified in Confidence Level.
Specify a list of other factors and/or covariates in the model, and specify adjustment values for them.
Contrast Matrix
Enter the name of a contrast matrix. Each column specifies a linear combination to be estimated under the textbook parameterization of the linear model. See the online Help for multicomp or the chapter Multiple Comparisons in the Guide to Statistics for more information.
Critical Point
Enter a value for the critical point used in the confidence intervals/bounds. Use this if none of the methods are suitable.
Simulation Size
Enter the size of the simulation to use. This is available when Method is Simulation or Best. The default value provides intervals or bounds whose actual family-wise error rate is within 10% of the requested rate.
Scheffe Rank
Enter the rank of the design matrix. For example, in a model consisting solely of a sum of continuous predictors, this would be the number of coefficients. This is used by the methods Scheffe, best, and best.fast for computing the Scheffe estimates.
Validity Check
Select this to check the validity of the specified critical point calculation method for the desired comparisons. If the validity check fails, processing stops with an error message.
Estimability Check
Select this to check estimability of the desired linear combinations. If the estimability condition fails, processing stops with an error message.
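To make the distinction between comparison-wise and family-wise error rates concrete, the sketch below computes all pairwise differences (the mca comparisons) by hand for a one-way layout. It uses the Bonferroni adjustment as one standard way to obtain a family-wise critical point; this choice is an assumption for illustration and is not necessarily the method the dialog or multicomp selects. The data set dat is hypothetical.

```r
# Hypothetical one-way layout.
dat <- data.frame(
  yield = c(4.2, 4.8, 5.1, 5.9, 6.3, 6.0, 7.1, 7.4, 6.8),
  treatment = factor(rep(c("a", "b", "c"), c(3, 3, 3)))
)
fit <- aov(yield ~ treatment, data = dat)

means <- tapply(dat$yield, dat$treatment, mean)
n <- tapply(dat$yield, dat$treatment, length)
df <- nrow(dat) - length(means)   # residual degrees of freedom
s2 <- sum(resid(fit)^2) / df      # residual mean square

# All pairwise differences: a-b, a-c, b-c.
idx <- rbind(c(1, 2), c(1, 3), c(2, 3))
est <- as.vector(means[idx[, 1]] - means[idx[, 2]])
se <- as.vector(sqrt(s2 * (1 / n[idx[, 1]] + 1 / n[idx[, 2]])))

level <- 0.95    # joint confidence level, between 0 and 1
k <- nrow(idx)   # number of comparisons in the family

# Two-sided critical points.
crit.cw <- qt(1 - (1 - level) / 2, df)        # comparison-wise
crit.fw <- qt(1 - (1 - level) / (2 * k), df)  # family-wise (Bonferroni)

# Family-wise intervals are wider because all k must hold jointly.
cbind(estimate = est,
      lower = est - crit.fw * se,
      upper = est + crit.fw * se)
```

A value computed this way could also be supplied in the Critical Point field when none of the built-in methods is suitable.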
S-PLUS language functions related to Analysis of Variance
aov, summary.aov, plot.lm, coef, dummy.coef
Other related S-PLUS language functions
lm, manova, raov, multicomp