Generalized Additive Models

Generalized additive models extend linear models and generalized linear models by flexibly modeling additive nonlinear relationships between the predictors and the response. Whereas linear models assume that the response is linear in each predictor, additive models assume only that the response is affected by each predictor in a smooth way. The response is modeled as a sum of smooth functions in the predictors, where the smooth functions are estimated automatically using smoothers. Additive models may be useful for obtaining a final fit or for exploring what types of variable transformations might be appropriate for use in a standard linear model.
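In symbols, the model described above is commonly written as shown below. The notation is an assumption made here for illustration and does not appear in the dialog: g is the link function, alpha the intercept, f_j the smooth function for the j-th predictor, and p the number of predictors.

```latex
g\bigl(\mathrm{E}[Y \mid x_1, \dots, x_p]\bigr) = \alpha + \sum_{j=1}^{p} f_j(x_j)
```

If every f_j is restricted to be linear, this reduces to a generalized linear model; with an identity link as well, it reduces to a standard linear model.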

To fit a generalized additive model

Choose Statistics > Regression > Generalized Additive. The Generalized Additive Models dialog appears.

Model page

[Image: Model page of the Generalized Additive Models dialog]

In the Generalized Additive Models dialog, the Model page contains the following options:

Data

Data Set

Select a data set from the dropdown list or type the name of a data set. You can also type into the Data Set edit field any expression that evaluates to a data set.

Weights

Enter the column that specifies weights to be applied to all observations used in the analysis. To weight all rows equally, leave this blank.

Subset Rows

Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.

Omit Rows with Missing Values

Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.

Formula

Formula

In the Formula field, enter a formula specifying the desired model. In its simplest form, a formula consists of the response variable, a tilde (~), and a list of predictor variables separated by plus signs (+). An intercept is included by default.
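As a rough illustration of the formula structure described above, the following Python sketch splits a formula string into its response and its terms. It is not how S-PLUS parses formulas. The function name split_formula and the variable names ozone, wind, temperature, and radiation are hypothetical. The s() and lo() wrappers are shown because gam formulas typically use them to request smoothing-spline and loess terms.

```python
def split_formula(formula):
    """Split 'response ~ term1 + term2' into (response, [terms]).

    Illustrative only: handles a single tilde and top-level plus signs.
    """
    response, rhs = formula.split("~", 1)
    terms, depth, current = [], 0, ""
    for ch in rhs:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        # Only a plus sign outside parentheses separates two terms.
        if ch == "+" and depth == 0:
            terms.append(current.strip())
            current = ""
        else:
            current += ch
    terms.append(current.strip())
    return response.strip(), terms


# Hypothetical formula with two smooth terms and one linear term.
response, terms = split_formula("ozone ~ s(wind) + lo(temperature) + radiation")
print(response)
print(terms)
```

Here the term without a wrapper (radiation) would enter the model linearly, while the wrapped terms would be fitted as smooth functions.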

Create Formula

Click the Create Formula button to open a formula builder dialog used to construct a formula specifying the desired model. See the online Help section Building Formulas for more information.

Model

Model Options

Family

Select a distribution family for the model.

Link

Select the link function for the model. The link function of the expected response is modeled as a sum of smooth functions of the predictors. The available link functions depend on the family.

Variance

For the quasi family, select a variance function.
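To make the role of the link function concrete, the sketch below maps an additive predictor (intercept plus the values of the smooth terms) back to the mean of the response through the inverse link. It is a minimal Python illustration, not S-PLUS code. The dictionary LINKS, the function fitted_mean, and the numeric values are assumptions chosen for the example. The three links shown are the ones conventionally paired with the Gaussian, Poisson, and binomial families.

```python
import math

# Each entry holds (link g, inverse link g^-1).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}


def fitted_mean(intercept, smooth_values, link_name):
    """Return (additive predictor, mean of the response) for one observation."""
    eta = intercept + sum(smooth_values)   # sum of the smooth terms
    inverse_link = LINKS[link_name][1]
    return eta, inverse_link(eta)


# Hypothetical values of two smooth terms at one observation.
eta, mu = fitted_mean(-1.0, [0.8, 0.7], "logit")
print(eta, mu)
```

The additive predictor is unbounded, while the fitted mean respects the range of the family (here a probability between 0 and 1).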

Save Model Object

Save Model Object

In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The saved model object can be passed to other functions later, for example for plotting or prediction.

Options page

[Image: Options page of the Generalized Additive Models dialog]

In the Generalized Additive Models dialog, the Options page contains the following options:

Optimization Parameters

Maximum Iteration

Enter a numeric value specifying the maximum number of iterations to perform for the maximum likelihood estimation procedure.

Convergence Tolerance

Enter a positive number used as the tolerance for the convergence criterion in the algorithm.

Maximum Backfitting Iterations

Enter the maximum number of backfitting iterations.

Backfitting Convergence Tolerance

Enter the convergence threshold for backfitting iterations.

Print Iteration Trace

Select to print a summary of each iteration.
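The backfitting settings above can be pictured with the following Python sketch, which fits an additive model with an identity link by cycling over the predictors and smoothing partial residuals. It is a simplified illustration under stated assumptions, not the S-PLUS implementation. The running-mean smoother, the relative-change stopping rule, the function names, and the toy data are all hypothetical. In gam-style fitters, a loop of this kind typically runs inside an outer iteratively reweighted loop, which is what Maximum Iteration and Convergence Tolerance would control.

```python
import math


def running_mean_smooth(x, values, half_window=3):
    """Smooth `values` against `x` by averaging each point with its neighbors in x order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    smoothed = [0.0] * len(x)
    for pos, i in enumerate(order):
        lo = max(0, pos - half_window)
        hi = min(len(order), pos + half_window + 1)
        window = [values[order[k]] for k in range(lo, hi)]
        smoothed[i] = sum(window) / len(window)
    return smoothed


def backfit(y, predictors, max_iter=20, tol=1e-6, trace=False):
    """Fit y = alpha + sum_j f_j(x_j) by backfitting (identity link only)."""
    n = len(y)
    alpha = sum(y) / n
    f = [[0.0] * n for _ in predictors]
    converged = False
    iterations = 0
    for iterations in range(1, max_iter + 1):
        change = 0.0
        for j, x in enumerate(predictors):
            # Partial residuals: remove the intercept and every other term.
            partial = [
                y[i] - alpha - sum(f[k][i] for k in range(len(f)) if k != j)
                for i in range(n)
            ]
            new_f = running_mean_smooth(x, partial)
            mean_f = sum(new_f) / n
            new_f = [v - mean_f for v in new_f]  # center each term around zero
            change += sum((a - b) ** 2 for a, b in zip(new_f, f[j]))
            f[j] = new_f
        size = sum(v * v for fj in f for v in fj)
        relative_change = math.sqrt(change / size) if size > 0 else 0.0
        if trace:  # plays the role of an iteration trace
            print("iteration", iterations, "relative change", relative_change)
        if relative_change < tol:
            converged = True
            break
    return alpha, f, iterations, converged


# Deterministic toy data: one curved and one linear additive effect.
n = 50
x1 = [i / 10.0 for i in range(n)]
x2 = [((7 * i) % n) / 10.0 for i in range(n)]
y = [math.sin(a) + 0.5 * b for a, b in zip(x1, x2)]

alpha, f, iterations, converged = backfit(y, [x1, x2], max_iter=20, tol=1e-6)
fitted = [alpha + f[0][i] + f[1][i] for i in range(n)]
rss_fit = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
rss_null = sum((yi - alpha) ** 2 for yi in y)
print(iterations, converged, rss_fit, rss_null)
```

Raising max_iter or tightening tol trades computing time for a more fully converged fit; setting trace=True prints one line per backfitting cycle.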

Results page

[Image: Results page of the Generalized Additive Models dialog]

In the Generalized Additive Models dialog, the Results page contains the following options:

Printed Results

Short Output for Generalized Additive Model

Display a short summary of the model fit to the designated output window. This includes the model call, the degrees of freedom and the residual deviance.

Long Output for Generalized Additive Model

Display a detailed summary of the model fit to the designated output window.

Saved Results

Save In

Enter the name of a data set in which to save selected results of the analysis, such as fitted values and residuals.

Fitted Values

Save the fitted values from the model in the object specified in Save In.

Working Residuals

Store the working residuals in the object specified in Save In. The working residuals are the residuals from the final iteration of the weighted fitting algorithm, expressed on the scale of the additive predictor.

Pearson Residuals

Select to save the Pearson residuals. They are a rescaled version of the working residuals. Their sum of squares is the Pearson chi-squared statistic.

Deviance Residuals

Select to save the deviance residuals. These residuals are reasonable for use in detecting observations with unduly large influence in the fitting process, since they reflect the same criterion as used in the fitting.

Response Residuals

Select to save the response residuals. These are the ordinary residuals (the response minus the fitted value).
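The four residual types can be compared on a small example. The Python sketch below computes them for a Poisson family with a log link and unit prior weights. It is an illustration under those assumptions, not the S-PLUS implementation. The function name poisson_residuals and the observed and fitted values are hypothetical.

```python
import math


def poisson_residuals(y, mu):
    """Return the four residual types for one observation (Poisson family, log link)."""
    response = y - mu
    # Working residual: response residual times d(eta)/d(mu), which is 1/mu for the log link.
    working = (y - mu) / mu
    # Pearson residual: response residual divided by sqrt of the variance function V(mu) = mu.
    pearson = (y - mu) / math.sqrt(mu)
    # Deviance residual: signed square root of this observation's deviance contribution.
    term = y * math.log(y / mu) if y > 0 else 0.0
    unit_deviance = 2.0 * (term - (y - mu))
    deviance = math.copysign(math.sqrt(max(unit_deviance, 0.0)), y - mu)
    return {"response": response, "working": working,
            "pearson": pearson, "deviance": deviance}


observed = [0, 2, 5, 9]          # hypothetical counts
fitted = [1.0, 2.0, 4.0, 6.0]    # hypothetical fitted means
rows = [poisson_residuals(y, mu) for y, mu in zip(observed, fitted)]

chi_squared = sum(r["pearson"] ** 2 for r in rows)      # Pearson chi-squared statistic
residual_deviance = sum(r["deviance"] ** 2 for r in rows)
for r in rows:
    print(r)
print(chi_squared, residual_deviance)
```

In this setting the Pearson residual equals the working residual multiplied by the square root of the fitted mean, which is the sense in which it is a rescaled working residual.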

Plot page

[Image: Plot page of the Generalized Additive Models dialog]

In the Generalized Additive Models dialog, the Plot page contains the following options:

Plots

Residuals vs Fit

Select this to display a plot of the residuals versus the fitted values.

Sqrt Abs Residuals vs Fit

Display a plot of the square root of the absolute values of the residuals versus the fitted values. This plot is useful for checking the constant variance assumption of the model.

Response vs Fit

Display a plot of the response variable versus the fitted values. The line y = x is also drawn on the graph.

Residuals Normal QQ

Display a normal quantile-quantile plot of the residuals.

Residual-Fit Spread

Display a residual-fit spread plot. This is a visual analog of the multiple R-squared statistic. It compares the spread of the fitted values to the spread of the residuals.

Partial Residuals

Display partial residual plots for all the terms in the model.

Options

Include Smooth

Display a smooth curve, computed with loess.smooth, on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Response vs Fit plots. See the online Help for loess.smooth for details.

Include Rugplot

Display a rugplot on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Response vs Fit plots. A rugplot is a sequence of vertical bars along the x-axis that mark the "observed" x values.

Number of Extreme Points to Identify

Enter the number of extreme points to identify on the Residuals vs Fit, Sqrt Abs Residuals vs Fit, and Residuals Normal QQ plots. The row names from the data set specified on the Model page are used to label the points.

Partial Residual Plot Options

Include Partial Fit

Include the partial fit for the term on the plot.

Include Rugplot

Display rugplots on the partial residual plots. A rugplot is a sequence of vertical bars along the x-axis that mark the "observed" x values.

Common Y-Axis Scale

Give all the partial residual plots the same vertical units. This is essential for comparing the importance of fitted terms in additive models.

Predict page

[Image: Predict page of the Generalized Additive Models dialog]

In the Generalized Additive Models dialog, the Predict page contains the following options:

New Data

Enter the name of a matrix or data set to use for computing predictions. It must contain columns with the same names as the predictor variables on the right side of the model formula. If this field is left blank, the original data are used for computing predictions.

Save

Save In

Enter the name of a data set in which to save selected results of the prediction, such as predictions and standard errors.

Predictions

Select this to save predictions to the data set specified in Save In.

Standard Errors

Store the pointwise standard errors for the predictions in the object specified in Save In.

Options

Prediction Type

Select the type of prediction to be saved, for example predictions on the scale of the link function or on the scale of the response.

S-PLUS language functions related to Generalized Additive Models

gam, plot.gam, plot.glm, predict.gam, summary.gam

Other related S-PLUS language functions

glm, loess, ace, avas