Generalized Linear Models
Generalized linear models are generalizations of the familiar linear regression model to situations where the response is discrete or the model varies in other ways from the standard linear model. The most widely used generalized linear models are logistic regression models for binary data and log-linear models for count data.
To fit a generalized linear model
Choose Statistics > Regression > Generalized Linear. The Generalized Linear Models dialog appears.
Model page
In the Generalized Linear Models dialog, the Model page has the following options:
Data
Data Set
Select a data set from the dropdown list or type the name of a data set. You can also type into the Data Set edit field any expression that evaluates to a data set.
Weights
Enter the column that specifies weights to be applied to all observations used in the analysis. To weight all rows equally, leave this blank.
Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.
Select this box to omit from the analysis any rows in the data set that contain missing values for any of the variables in the model.
Variables
Dependent Variables
Select a variable as the dependent variable in the formula. The variable name will appear in the formula field below, followed by a '~'.
Select one or more variables as the independent variables, or predictors, in the formula. To select more than one variable, Ctrl-click the variables.
In the Formula field, enter a formula specifying the desired model. In its simplest form a formula consists of the response variable, a tilde (~), and a list of predictor variables separated by "+"s. An intercept is automatically included by default.
Create Formula
Click the Create Formula button to open a formula builder dialog used to construct a formula specifying the desired model. See the online Help section Building Formulas for more information.
Model
Family
Select a distribution family for the model.
Link
Select the link function for the model. The link function of the expected response is modeled as a sum of linear terms. The possible link functions depend on the family.
Variance
For the quasi family, a variance function can be selected.
In the Save As field, enter the name for the object in which to save the results of the analysis. If an object with this name already exists, its contents are overwritten. The model object can be used in later functions such as plotting.
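The relationship between Family, Link, and Variance can be illustrated with a small sketch. The Python below is not S-PLUS code. The names `LINKS` and `VARIANCES` are hypothetical, the coefficients are made up, and only three common links and variance functions are shown; the choices actually offered by the dialog may differ.

```python
import math

# Each entry pairs a link g(mu) with its inverse; eta = g(mu) is the linear predictor.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}

# Variance functions V(mu): how the response variance depends on the mean.
VARIANCES = {
    "gaussian": lambda mu: 1.0,
    "poisson": lambda mu: mu,
    "binomial": lambda mu: mu * (1.0 - mu),
}

# Hypothetical linear predictor: intercept -1.0 plus slope 0.5 times x = 2.0.
eta = -1.0 + 0.5 * 2.0
link, inverse_link = LINKS["logit"]
mu = inverse_link(eta)                  # fitted mean on the response scale
variance = VARIANCES["binomial"](mu)    # variance implied by the binomial family
```

The linear terms are summed on the link scale (`eta`), and the inverse link maps that sum back to the scale of the response.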
Options page
In the Generalized Linear Models dialog, the Options page has the following options:
Optimization Parameters
Enter a numeric value specifying the maximum number of iterations to perform for the maximum likelihood estimation procedure.
Enter a positive number used as the tolerance for the convergence criterion in the algorithm.
Print Iteration Trace
Select to print a summary of each iteration.
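To show how an iteration limit, a convergence tolerance, and an iteration trace interact, here is a minimal sketch of iteratively reweighted least squares for a logistic regression with one predictor. It is written in Python, not S-PLUS, and is not the S-PLUS implementation. The function name, the default values, the data, and the convergence rule (relative change in deviance) are all assumptions made for illustration.

```python
import math

def binary_deviance(b0, b1, x, y):
    """Deviance of a logistic model for 0/1 responses."""
    total = 0.0
    for xi, yi in zip(x, y):
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        total += yi * math.log(mu) + (1 - yi) * math.log(1.0 - mu)
    return -2.0 * total

def fit_logistic_irls(x, y, max_iter=10, tol=1e-4, trace=False):
    """Fit intercept and slope by IRLS; returns (b0, b1, iterations, converged)."""
    b0, b1 = 0.0, 0.0
    old_dev = None
    for it in range(1, max_iter + 1):
        sw = swx = swxx = swz = swxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)            # IRLS weight for the logit link
            z = eta + (yi - mu) / w        # working response
            sw += w
            swx += w * xi
            swxx += w * xi * xi
            swz += w * z
            swxz += w * xi * z
        # Solve the 2x2 weighted least-squares normal equations.
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
        dev = binary_deviance(b0, b1, x, y)
        if trace:
            print("iteration", it, "deviance", dev)
        # Assumed stopping rule: relative change in deviance below tol.
        if old_dev is not None and abs(dev - old_dev) / (abs(dev) + 0.1) < tol:
            return b0, b1, it, True
        old_dev = dev
    return b0, b1, max_iter, False

# Hypothetical, non-separable data.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, n_iter, converged = fit_logistic_irls(x, y, max_iter=25, tol=1e-8)
```

A smaller tolerance generally costs more iterations. If the iteration limit is reached first, the sketch returns the last estimates with the converged flag set to false.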
Results page
In the Generalized Linear Models dialog, the Results page has the following options:
Short Output for Generalized Linear Model
Display a short summary of the model fit to the designated output window. This includes the model call, the degrees of freedom and the residual deviance.
Long Output for Generalized Linear Model
Display a detailed summary of the model fit to the designated output window.
ANOVA Table
Display an analysis of variance table. The sums-of-squares in the table are for the terms added sequentially (Type I sums-of-squares).
Correlation Matrix of Estimates
Display the correlation matrix of the regression coefficients. This option is available only if Long Output is selected.
Saved Results
Enter the name of a data set in which a part of the analysis, such as fitted values and residuals, predictions, confidence intervals, or standard errors, is saved. If an object with the name you enter does not already exist (in database 1), then it is created.
Fitted Values
Save the fitted values from the model in the object specified in Save In.
Working Residuals
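Store the working residuals in the object specified in Save In. The working residuals are the residuals from the final iteration of the fitting algorithm, expressed on the scale of the linear predictor: the response minus the fitted value, multiplied by the derivative of the link function at the fitted value.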
Pearson Residuals
Select to save the Pearson residuals. They are a rescaled version of the working residuals. Their sum of squares is the Pearson chi-squared statistic.
Deviance Residuals
Select to save the deviance residuals. These residuals are reasonable for use in detecting observations with unduly large influence in the fitting process, since they reflect the same criterion as used in the fitting.
Response Residuals
Select to save the response residuals. These are the ordinary residuals (the response minus the fitted value).
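The four residual types differ only in how the gap between response and fitted value is scaled. As a sketch, the Python below (not S-PLUS code) computes each type for a single Poisson observation under a log link, using the standard textbook definitions. The function name and the example values are hypothetical, and prior weights are assumed to be 1.

```python
import math

def poisson_residuals(y, mu):
    """Return the four residual types for one Poisson observation with a log link."""
    response = y - mu
    # Working residual: (y - mu) times d(eta)/d(mu), which is 1 / mu for the log link.
    working = (y - mu) / mu
    # Pearson residual: (y - mu) / sqrt(V(mu)), with V(mu) = mu for the Poisson family.
    pearson = (y - mu) / math.sqrt(mu)
    # Unit deviance; the y * log(y / mu) term is taken as 0 when y == 0.
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    unit_dev = 2.0 * (log_term - (y - mu))
    # Deviance residual: signed square root of the unit deviance.
    deviance = math.copysign(math.sqrt(max(unit_dev, 0.0)), y - mu)
    return {"response": response, "working": working,
            "pearson": pearson, "deviance": deviance}

# Hypothetical observation: observed count 4, fitted mean 2.
res = poisson_residuals(4.0, 2.0)
```

Summing the squared Pearson residuals over all observations gives the Pearson chi-squared statistic, and summing the squared deviance residuals gives the residual deviance.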
Plot page
In the Generalized Linear Models dialog, the Plot page has the following options:
Plots
Residuals vs Fit
Select this to display a plot of the residuals versus the fitted values.
Sqrt Abs Residuals vs Fit
Display a plot of the square root of the absolute values of the residuals versus the fitted values. This plot is useful for checking the constant variance assumption of the model.
Response vs Fit
Display a plot of the response variable versus the fitted values. The line y = x is also drawn on the graph.
Residuals Normal QQ
Display a normal quantile-quantile plot of the residuals.
Display partial residual plots for all the terms in the model.
Predict page
In the Generalized Linear Models dialog, the Predict page has the following options:
New Data
Enter the name of a matrix or data set to use for computing predictions. It must contain the same names as the terms in the right side of the formula for the model. If omitted, the original data are used for computing predictions.
Save
Enter the name of a data set in which a part of the analysis, such as fitted values and residuals, predictions, confidence intervals, or standard errors, is saved.
Predictions
Select this to save predictions to the data set specified in Save In.
Standard Errors
Store the pointwise standard errors for the predictions in the object specified in Save In.
Prediction Type
Select the type of prediction to be saved.
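Predictions and their pointwise standard errors can be expressed on the scale of the linear predictor or on the scale of the response. The Python sketch below (not S-PLUS code) shows one common way to relate the two for a log link, using a delta-method approximation for the response-scale standard error. The type names "link" and "response", the function name, the coefficients, and the covariance matrix are assumptions for illustration and are not taken from the dialog.

```python
import math

def predict_log_link(coefs, cov, x_row):
    """Predict one observation on both scales, with pointwise standard errors."""
    n = len(x_row)
    # Linear predictor: sum of coefficient times predictor value.
    eta = sum(b * x for b, x in zip(coefs, x_row))
    # Variance of the linear predictor: x' * cov * x.
    var_eta = sum(x_row[i] * cov[i][j] * x_row[j]
                  for i in range(n) for j in range(n))
    se_link = math.sqrt(var_eta)
    # Inverse log link; d(mu)/d(eta) = mu, so the delta method gives mu * se_link.
    mu = math.exp(eta)
    se_response = mu * se_link
    return {"link": (eta, se_link), "response": (mu, se_response)}

# Hypothetical fitted coefficients (intercept, slope) and their covariance matrix.
coefs = [0.5, 0.2]
cov = [[0.04, 0.0], [0.0, 0.01]]
pred = predict_log_link(coefs, cov, [1.0, 3.0])   # new row: intercept term 1, x = 3
```

The new row must supply a value for every term on the right side of the formula, which is why the New Data object needs matching names.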
S-PLUS language functions related to Generalized Linear Models
glm, plot.glm, predict.glm, summary.glm
Other related S-PLUS language functions
gam, lm, loess