Generalized Linear Models

Generalized linear models are generalizations of the familiar linear regression model to situations where the response is discrete or the model varies in other ways from the standard linear model. The most widely used generalized linear models are logistic regression models for binary data and log-linear models for count data.

To fit a generalized linear model

Choose Statistics > Regression > Generalized Linear. The dialog shown below appears.

Model page

[Figure: Generalized Linear Models dialog, Model page]

In the Generalized Linear Models dialog, the Model page has the following options:

Data

Data Set

Select a data set from the dropdown list or type the name of a data set. You can also type into the Data Set edit field any expression that evaluates to a data set.

Weights

Enter the column that specifies weights to be applied to all observations used in the analysis. To weight all rows equally, leave this blank.

Subset Rows

Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.

Omit Rows with Missing Values

Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
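
In the S language, these Data options correspond roughly to arguments of glm, one of the functions listed at the end of this topic. The following is a minimal sketch, not the exact call the dialog generates. The data set visits and its columns are hypothetical, and the code is written so that it also runs in R; argument spellings in a particular S-PLUS release may differ.

```r
# Hypothetical count data: one missing response and a column of weights.
visits <- data.frame(
  count = c(2, 3, NA, 7, 8, 9, 10, 12),
  x     = c(1, 2, 3, 4, 5, 6, 7, 8),
  w     = c(1, 1, 1, 2, 2, 2, 1, 1)
)

fit <- glm(count ~ x, family = poisson,
           data      = visits,    # Data Set
           weights   = w,         # Weights: a column of the data set
           subset    = x > 1,     # Subset Rows: a logical expression
           na.action = na.omit)   # Omit Rows with Missing Values

# Rows kept: 7 pass the subset, and 1 of those has a missing response.
n.used <- length(fitted(fit))
```

Leaving out weights and subset corresponds to leaving the Weights and Subset Rows fields blank.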

Variables

Dependent Variables

Select a variable as the dependent variable in the formula. The variable name will appear in the formula field below, followed by a '~'.

Independent Variables

Select one or more variables as the independent variables, or predictors, in the formula. To select more than one variable, Ctrl-click the variables.

Formula

In the Formula field, enter a formula specifying the desired model. In its simplest form, a formula consists of the response variable, a tilde (~), and a list of predictor variables separated by plus signs (+). An intercept is included by default.
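
As an illustration of the formula syntax, the sketch below fits a logistic regression with two predictors, with and without the default intercept. The data set trial and its column names are hypothetical, and the code is written in the S language so that it also runs in R.

```r
# Hypothetical data: a binary outcome and two numeric predictors.
trial <- data.frame(
  cured = c(0, 0, 1, 0, 0, 1, 1, 0, 1, 1),
  dose  = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  age   = c(34, 51, 45, 29, 62, 38, 27, 41, 58, 33)
)

# Response, tilde, then predictors joined by "+". The intercept is implicit.
fit <- glm(cured ~ dose + age, family = binomial, data = trial)

# Appending "- 1" removes the intercept from the model.
fit.noint <- glm(cured ~ dose + age - 1, family = binomial, data = trial)

coef.names <- names(coef(fit))
```

The first fit has three coefficients (the intercept, dose, and age); the second has only the two slopes.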

Create Formula

Click the Create Formula button to open a formula builder dialog used to construct a formula specifying the desired model. See the online Help section Building Formulas for more information.

Model

Model Options

Family

Select a distribution family for the model.

Link

Select the link function for the model. The link function, applied to the expected value of the response, is modeled as a linear combination of the predictor terms. The available link functions depend on the family.

Variance

For the quasi family, select a variance function.
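
The family, link, and variance choices can be sketched in the S language as follows. This is an illustration under stated assumptions: the data set counts is hypothetical, the family constructors are called with quoted link and variance names as accepted by R, and S-PLUS may expect a slightly different spelling.

```r
# Hypothetical count data.
counts <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12),
                     x = c(1, 2, 3, 4, 5, 6, 7, 8))

# Family only: the poisson family uses its default link, the log.
fit.log <- glm(y ~ x, family = poisson, data = counts)

# Family with a non-default link chosen from those the family allows.
fit.sqrt <- glm(y ~ x, family = poisson(link = "sqrt"), data = counts)

# quasi family: link and variance function are both chosen explicitly.
# With a log link and variance "mu", the estimates match the Poisson fit.
fit.quasi <- glm(y ~ x, family = quasi(link = "log", variance = "mu"),
                 data = counts)

same.coef <- all.equal(coef(fit.log), coef(fit.quasi), tolerance = 1e-4)
```

The quasi fit differs from the Poisson fit in how the dispersion is treated, not in the coefficient estimates.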

Save Model Object

In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.

Options page

[Figure: Generalized Linear Models dialog, Options page]

In the Generalized Linear Models dialog, the Options page has the following options:

Optimization Parameters

Maximum Iteration

Enter a numeric value specifying the maximum number of iterations to perform for the maximum likelihood estimation procedure.

Convergence Tolerance

Enter a positive number used as the tolerance for the convergence criterion in the algorithm.

Print Iteration Trace

Select to print a summary of each iteration.
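
These settings correspond roughly to the arguments of glm.control, which is passed to glm through its control argument. The sketch below assumes a hypothetical data set counts, and the numeric values are illustrative, not the dialog's defaults.

```r
# Hypothetical count data.
counts <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12),
                     x = c(1, 2, 3, 4, 5, 6, 7, 8))

fit <- glm(y ~ x, family = poisson, data = counts,
           control = glm.control(maxit   = 25,     # Maximum Iteration
                                 epsilon = 1e-8,   # Convergence Tolerance
                                 trace   = TRUE))  # Print Iteration Trace
```

With trace turned on, the deviance is printed at each iteration, which helps in judging whether a larger iteration limit or a looser tolerance is needed.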

Results page

[Figure: Generalized Linear Models dialog, Results page]

In the Generalized Linear Models dialog, the Results page has the following options:

Short Output for Generalized Linear Model

Display a short summary of the model fit to the designated output window. This includes the model call, the degrees of freedom and the residual deviance.

Long Output for Generalized Linear Model

Display a detailed summary of the model fit to the designated output window.

ANOVA Table

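Display an analysis of deviance table, the generalized linear model analog of the analysis of variance table. The deviance reductions in the table are for the terms added sequentially, in the order in which they appear in the formula.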

Correlation Matrix of Estimates

Display the correlation matrix of the regression coefficients. This option is available only if Long Output is selected.
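
The printed results described above can be approximated from the command line with print, summary, and anova. This is a sketch on a hypothetical data set counts; the correspondence between each call and each dialog option is an assumption, and the code is written so that it also runs in R.

```r
# Hypothetical count data.
counts <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12),
                     x = c(1, 2, 3, 4, 5, 6, 7, 8))
fit <- glm(y ~ x, family = poisson, data = counts)

# Short output: call, degrees of freedom, residual deviance.
print(fit)

# Long output, including the correlation matrix of the estimates.
fit.sum <- summary(fit, correlation = TRUE)
print(fit.sum)

# Sequential table: one row for the null model, one per added term.
fit.aov <- anova(fit, test = "Chisq")
print(fit.aov)

# The correlation matrix is also stored as a component of the summary.
fit.cor <- fit.sum$correlation
```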

Saved Results

Save In

Enter the name of a data set in which a part of the analysis, such as fitted values and residuals, predictions, confidence intervals, or standard errors, is saved. If an object with the name you enter does not already exist (in database 1), then it is created.

Fitted Values

Save the fitted values from the model in the object specified in Save In.

Working Residuals

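Store the working residuals in the object specified in Save In. The working residuals are the residuals from the final iteration of the iteratively reweighted least squares fit. They are the response residuals rescaled to the scale of the link function, so they equal the response minus the fitted value only when the link is the identity.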

Pearson Residuals

Select to save the Pearson residuals. They are a rescaled version of the working residuals. Their sum of squares is the Pearson chi-squared statistic.

Deviance Residuals

Select to save the deviance residuals. These residuals are reasonable for use in detecting observations with unduly large influence in the fitting process, since they reflect the same criterion as used in the fitting.

Response Residuals

Select to save the response residuals. These are the ordinary residuals (the response minus the fitted value).
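
The four residual types can be compared directly with the residuals function. The sketch below uses a hypothetical data set counts and a Poisson model with a log link, for which the relationships in the comments hold; other families and links rescale differently.

```r
# Hypothetical count data.
counts <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12),
                     x = c(1, 2, 3, 4, 5, 6, 7, 8))
fit <- glm(y ~ x, family = poisson, data = counts)
mu  <- fitted(fit)

r.resp <- residuals(fit, type = "response")   # y - mu
r.work <- residuals(fit, type = "working")    # (y - mu) / mu for a log link
r.pear <- residuals(fit, type = "pearson")    # (y - mu) / sqrt(mu) for poisson
r.dev  <- residuals(fit, type = "deviance")

chi.sq <- sum(r.pear^2)   # Pearson chi-squared statistic
dev    <- sum(r.dev^2)    # equals the residual deviance of the fit
```

Only the response residuals are on the scale of the data; the other three are rescaled to account for the link or the variance function.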

Plot page

[Figure: Generalized Linear Models dialog, Plot page]

In the Generalized Linear Models dialog, the Plot page has the following options:

Plots

Residuals vs Fit

Select this to display a plot of the residuals versus the fitted values.

Sqrt Abs Residuals vs Fit

Display a plot of the square root of the absolute values of the residuals versus the fitted values. This plot is useful for checking the constant variance assumption of the model.

Response vs Fit

Display a plot of the response variable versus the fitted values. The line y = x is also drawn on the graph.

Residuals Normal QQ

Display a normal quantile-quantile plot of the residuals.

Partial Residuals

Display partial residual plots for all the terms in the model.
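
The first four plots can be drawn by hand from a fitted model, which makes clear what each one displays. This is a sketch on a hypothetical data set counts. The use of deviance residuals is an assumption, since this topic does not state which residual type the dialog plots, and partial residual plots are omitted.

```r
# Hypothetical count data.
counts <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12),
                     x = c(1, 2, 3, 4, 5, 6, 7, 8))
fit <- glm(y ~ x, family = poisson, data = counts)

mu <- fitted(fit)
r  <- residuals(fit, type = "deviance")   # assumed residual type
sqrt.abs.r <- sqrt(abs(r))

# Residuals vs Fit
plot(mu, r, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Sqrt Abs Residuals vs Fit: a trend suggests non-constant variance
plot(mu, sqrt.abs.r, xlab = "Fitted values", ylab = "sqrt(abs(Residuals))")

# Response vs Fit, with the line y = x
plot(mu, counts$y, xlab = "Fitted values", ylab = "Response")
abline(0, 1)

# Residuals Normal QQ
qqnorm(r)
qqline(r)
```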

Predict page

[Figure: Generalized Linear Models dialog, Predict page]

In the Generalized Linear Models dialog, the Predict page has the following options:

New Data

Enter the name of a matrix or data set to use for computing predictions. It must contain columns with the same names as the variables on the right side of the model formula. If omitted, the original data are used for computing predictions.

Save

Save In

Enter the name of a data set in which a part of the analysis, such as fitted values and residuals, predictions, confidence intervals, or standard errors, is saved.

Predictions

Select this to save predictions to the data set specified in Save In.

Standard Errors

Store the pointwise standard errors for the predictions in the object specified in Save In.

Prediction Type

Select the type of prediction to be saved.
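
From the command line, predictions come from predict, which dispatches to predict.glm for a glm object. The sketch below assumes a hypothetical data set counts and new data frame new.x. The type values "link" and "response" are ones predict.glm accepts; whether the dialog offers exactly these choices is not stated in this topic.

```r
# Hypothetical count data.
counts <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12),
                     x = c(1, 2, 3, 4, 5, 6, 7, 8))
fit <- glm(y ~ x, family = poisson, data = counts)

# New Data: must contain the predictor names used in the formula.
new.x <- data.frame(x = c(9, 10, 11))

# Predictions on the scale of the linear predictor, with standard errors.
p.link <- predict(fit, newdata = new.x, type = "link", se.fit = TRUE)

# Predictions on the scale of the response (expected counts here).
p.resp <- predict(fit, newdata = new.x, type = "response")
```

For the log link used here, exponentiating the link-scale predictions gives the response-scale predictions.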

S-PLUS language functions related to Generalized Linear Models

glm, plot.glm, predict.glm, summary.glm

Other related S-PLUS language functions

gam, lm, loess