Generalized Additive Models
Generalized additive models extend linear models and generalized linear models by flexibly modeling additive nonlinear relationships between the predictors and the response. Whereas linear models assume that the response is linear in each predictor, additive models assume only that the response depends on each predictor in a smooth way. The response is modeled as a sum of smooth functions of the predictors, and these functions are estimated automatically using smoothers. Additive models are useful both for obtaining a final fit and for exploring which variable transformations might be appropriate in a standard linear model.
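In symbols, the model described above is conventionally written as follows. The notation is the usual textbook one and is not taken from this Help topic: g is the link function, alpha is an intercept, and each f_j is an unspecified smooth function of one predictor.

```latex
g\bigl(\mathrm{E}[Y \mid x_1, \dots, x_p]\bigr) = \alpha + \sum_{j=1}^{p} f_j(x_j)
```

A linear model is the special case in which g is the identity and each f_j(x_j) is a straight line, f_j(x_j) = \beta_j x_j.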
To fit a generalized additive model
Choose Statistics > Regression > Generalized Additive. The Generalized Additive Models dialog appears.
Model page
In the Generalized Additive Models dialog, the Model page contains the following options:
Data
Data Set
Select a data set from the dropdown list or type the name of a data set. You can also type into the Data Set edit field any expression that evaluates to a data set.
Weights
Enter the column that specifies weights to be applied to all observations used in the analysis. To weight all rows equally, leave this blank.
Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.
Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
Formula
In the Formula field, enter a formula specifying the desired model. In its simplest form, a formula consists of the response variable, a tilde (~), and a list of predictor variables separated by plus signs (+), for example y ~ x1 + x2, where y, x1, and x2 stand for column names in the data set. An intercept is included by default.
Create Formula
Click the Create Formula button to open a formula builder dialog used to construct a formula specifying the desired model. See the online Help section Building Formulas for more information.
Model
Family
Select a distribution family for the model.
Link
Select the link function for the model. The link function of the expected response is modeled as a sum of smooth terms, which may include linear terms. The available link functions depend on the family.
Variance
Select a variance function. This option applies only to the quasi family.
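The role of the link function can be illustrated with a short sketch. This is not S-PLUS code and does not reproduce the dialog's internals; it uses the standard textbook definitions of three common links (identity, log, and logit, the usual choices for the Gaussian, Poisson, and binomial families). The names LINKS and predict_mean are hypothetical and exist only for this illustration.

```python
import math

# Each entry pairs a link function g(mu) with its inverse.
# Standard definitions; the dictionary itself is an illustrative assumption.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}


def predict_mean(link_name, intercept, term_values):
    """Map an additive predictor back to the scale of the mean response.

    term_values holds the value of each fitted term for one observation.
    """
    _, inverse = LINKS[link_name]
    eta = intercept + sum(term_values)  # additive predictor on the link scale
    return inverse(eta)
```

The terms are added on the link scale, and the inverse link then converts the sum to the scale of the mean response. With the logit link, for example, any additive predictor is mapped to a probability between 0 and 1.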
Save Model Object
In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.
Options page
In the Generalized Additive Models dialog, the Options page contains the following options:
Optimization Parameters
Enter a numeric value specifying the maximum number of iterations to perform for the maximum likelihood estimation procedure.
Enter a positive number used as the tolerance for the convergence criterion in the algorithm.
Maximum Backfitting Iterations
Enter the maximum number of backfitting iterations.
Backfitting Convergence Tolerance
Enter the convergence threshold for backfitting iterations.
Print Iteration Trace
Select to print a summary of each iteration.
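The backfitting options above can be made concrete with a minimal sketch of the backfitting loop for a Gaussian additive model. It is written in Python for illustration only and is not the S-PLUS implementation. The running-mean smoother, the stopping rule, and the names running_mean, backfit, max_iter, tol, span, and trace are all assumptions of this sketch. For non-Gaussian families, an outer iteratively reweighted loop normally wraps this inner loop; that outer loop is omitted here.

```python
def running_mean(x, r, span):
    """Smooth r against x: average r over the `span` nearest points in x order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    half = span // 2
    smooth = [0.0] * len(x)
    for pos, i in enumerate(order):
        window = order[max(0, pos - half):pos + half + 1]
        smooth[i] = sum(r[k] for k in window) / len(window)
    return smooth


def backfit(columns, y, max_iter=30, tol=1e-7, span=5, trace=False):
    """Fit y = alpha + f_1(x_1) + ... + f_p(x_p) by backfitting.

    Returns (alpha, term_fits, iterations_used).
    """
    n, p = len(y), len(columns)
    alpha = sum(y) / n
    fits = [[0.0] * n for _ in range(p)]
    for iteration in range(1, max_iter + 1):
        change = 0.0
        for j in range(p):
            # Partial residuals: what is left after removing every other term.
            partial = [
                y[i] - alpha - sum(fits[k][i] for k in range(p) if k != j)
                for i in range(n)
            ]
            new_fit = running_mean(columns[j], partial, span)
            centre = sum(new_fit) / n
            new_fit = [v - centre for v in new_fit]  # keep each term centred
            change += sum((a - b) ** 2 for a, b in zip(new_fit, fits[j]))
            fits[j] = new_fit
        size = sum(v * v for fit in fits for v in fit)
        if trace:
            print(f"iteration {iteration}: squared change = {change:.3g}")
        # Relative-change stopping rule (an assumption of this sketch).
        if change <= tol * (1.0 + size):
            return alpha, fits, iteration
    return alpha, fits, max_iter
```

In this sketch, max_iter plays the role of the maximum number of backfitting iterations, tol plays the role of the backfitting convergence tolerance, and trace=True prints one summary line per iteration. Each pass re-smooths the partial residuals for one term at a time while holding the other terms fixed.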
Results page
In the Generalized Additive Models dialog, the Results page contains the following options:
Printed Results
Short Output for Generalized Additive Model
Display a short summary of the model fit to the designated output window. This includes the model call, the degrees of freedom and the residual deviance.
Long Output for Generalized Additive Model
Display a detailed summary of the model fit to the designated output window.
Saved Results
Save In
Enter the name of a data set in which to save parts of the analysis, such as fitted values, residuals, predictions, confidence intervals, or standard errors.
Fitted Values
Save the fitted values from the model in the object specified in Save In.
Working Residuals
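Save the working residuals in the object specified in Save In. The working residuals are the residuals from the final iteration of the iteratively reweighted fitting algorithm. They are expressed on the scale of the link function rather than on the scale of the response.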
Pearson Residuals
Select to save the Pearson residuals. These are a rescaled version of the working residuals. Their sum of squares is the Pearson chi-squared statistic.
Deviance Residuals
Select to save the deviance residuals. These residuals are reasonable for use in detecting observations with unduly large influence in the fitting process, since they reflect the same criterion as used in the fitting.
Response Residuals
Select to save the response residuals. These are the ordinary residuals (the response minus the fitted value).
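The four residual types can be compared in a small sketch. It assumes a Poisson family with a log link and prior weights of 1, and it uses the standard generalized linear model definitions rather than S-PLUS code. The function name poisson_residuals is hypothetical.

```python
import math


def poisson_residuals(y, mu):
    """Four residual types for a Poisson fit with log link (prior weights of 1).

    y holds the observed counts and mu the fitted means.
    """
    response = [yi - mi for yi, mi in zip(y, mu)]
    # Pearson: response residual divided by the square root of the variance (mu).
    pearson = [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]
    # Working: (y - mu) * d(eta)/d(mu); for eta = log(mu) the derivative is 1/mu.
    working = [(yi - mi) / mi for yi, mi in zip(y, mu)]
    deviance = []
    for yi, mi in zip(y, mu):
        # Unit deviance; the term y*log(y/mu) is taken as 0 when y is 0.
        unit = 2.0 * ((yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi))
        deviance.append(math.copysign(math.sqrt(max(unit, 0.0)), yi - mi))
    return {
        "response": response,
        "pearson": pearson,
        "working": working,
        "deviance": deviance,
    }
```

Under these assumptions each Pearson residual equals the working residual multiplied by the square root of the fitted mean, which is the sense in which one is a rescaled version of the other. Summing the squared Pearson residuals gives the chi-squared statistic, and summing the squared deviance residuals gives the residual deviance.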
Plot page
In the Generalized Additive Models dialog, the Plot page contains the following options:
Plots
Residuals vs Fit
Select this to display a plot of the residuals versus the fitted values.
Sqrt Abs Residuals vs Fit
Display a plot of the square root of the absolute values of the residuals versus the fitted values. This plot is useful for checking the constant variance assumption of the model.
Response vs Fit
Display a plot of the response variable versus the fitted values. The line y = x is also drawn on the graph.
Residuals Normal QQ
Display a normal quantile-quantile plot of the residuals.
Residual-Fit Spread
Display a residual-fit spread plot. This is a visual analog of the multiple R-squared statistic. It compares the spread of the fitted values to the spread of the residuals.
Display partial residual plots for all the terms in the model.
Options
Include Smooth
Display a smooth curve, computed with loess.smooth, on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Response vs Fit plots. See the online Help for loess.smooth for details.
Include Rugplot
Display a rugplot on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Response vs Fit plots. A rugplot is a sequence of vertical bars along the x-axis that mark the "observed" x values.
Number of Extreme Points to Identify
Enter the number of extreme points to identify on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Residuals Normal QQ plots. The row names from the data set specified on the Model page are used to label the points.
Partial Residual Plot Options
Include Partial Fit
Include the partial fit for the term on the plot.
Include Rugplot
Display rugplots on the partial residual plots. A rugplot is a sequence of vertical bars along the x-axis that mark the "observed" x values.
Common Y-Axis Scale
Give all the partial residual plots the same vertical units. This is essential for comparing the importance of fitted terms in additive models.
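The quantities behind a partial residual plot can be sketched as follows. For each term, the partial residuals are that term's fitted values plus the overall residuals, and a common y-axis scale uses one range for every term. This Help topic does not state which residual type the dialog uses, so the sketch takes the residuals as given. The function names and the term names in the usage note are hypothetical.

```python
def partial_residuals(term_fits, residuals):
    """Return partial residuals per term.

    term_fits maps a term name to that term's fitted values, one per row.
    residuals holds the overall model residuals, one per row.
    """
    return {
        name: [f + r for f, r in zip(fit, residuals)]
        for name, fit in term_fits.items()
    }


def common_y_limits(partials):
    """Return one (min, max) range covering every term's partial residuals."""
    values = [v for series in partials.values() for v in series]
    return min(values), max(values)
```

For example, calling partial_residuals({"age": [...], "dose": [...]}, residuals) and passing the result to common_y_limits gives a single range to use as the vertical limits of every panel. On a shared scale, a term whose curve spans more of the axis contributes more to the fit.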
Predict page
In the Generalized Additive Models dialog, the Predict page contains the following options:
New Data
Enter the name of a matrix or data set to use for computing predictions. It must contain columns with the same names as the variables on the right side of the model formula. If omitted, the original data are used for computing predictions.
Save
Save In
Enter the name of a data set in which to save parts of the analysis, such as fitted values, residuals, predictions, confidence intervals, or standard errors.
Predictions
Select this to save predictions to the data set specified in Save In.
Standard Errors
Store the pointwise standard errors for the predictions in the object specified in Save In.
Options
Select the type of prediction to be saved.
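As an illustration of how saved predictions and pointwise standard errors are often combined, the sketch below builds an approximate pointwise band. It assumes a binomial model with a logit link and assumes that both the prediction and its standard error are on the link scale; this Help topic does not list the available prediction types, so treat that as an assumption. The function names and the default multiplier of 2 are choices made for this sketch.

```python
import math


def inverse_logit(eta):
    """Convert a link-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def response_band(link_fit, link_se, multiplier=2.0):
    """Return (lower, prediction, upper) on the probability scale.

    The band is formed on the link scale and then transformed, so the
    limits always stay between 0 and 1.
    """
    lower = inverse_logit(link_fit - multiplier * link_se)
    upper = inverse_logit(link_fit + multiplier * link_se)
    return lower, inverse_logit(link_fit), upper
```

Forming the band on the link scale and transforming afterwards keeps the limits inside the valid range of the response, which adding and subtracting standard errors directly on the response scale does not guarantee.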
S-PLUS language functions related to Generalized Additive Models
gam, plot.gam, plot.glm, predict.gam, summary.gam
Other related S-PLUS language functions
glm, loess, ace, avas