Produces an object of class glm, which is a generalized linear fit of the
data. The glm function is generic (see Methods); method functions can be
written to handle specific classes of data. Classes which already have
methods for this function include: model.list.
glm(formula, family = gaussian, data=<<see below>>,
weights, subset=<<see below>>, na.action,
start, control, method = "glm.fit", model = F, x = F, y = T,
contrasts = NULL, ...)
formula: a formula expression of the form response ~ predictors. See the
documentation of lm and formula for details.
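family: a family object, that is, a list of functions and expressions
defining the link and variance functions, initialization and iterative
weights. Families supported are gaussian, binomial, poisson, Gamma,
inverse.gaussian and quasi. Functions like binomial produce a family object,
but can be given without the parentheses. Family functions can take
arguments, as in binomial(link=probit).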
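The three ways of supplying a family described above can be sketched as follows. This is a minimal illustration written for R, the open-source S dialect, which is an assumption; S-PLUS may differ in detail. The data frame d and its columns x and y are made up for the example.

```r
# Assumption: R syntax; d is simulated illustration data.
set.seed(1)
d <- data.frame(x = rnorm(40))
d$y <- rbinom(40, size = 1, prob = plogis(0.3 + d$x))

# Family function given without parentheses.
fit1 <- glm(y ~ x, family = binomial, data = d)

# Family object produced by calling the family function.
fit2 <- glm(y ~ x, family = binomial(), data = d)

# Family function called with an argument to change the link.
fit3 <- glm(y ~ x, family = binomial(link = probit), data = d)

# fit1 and fit2 describe the same model; fit3 uses a probit link instead.
fit1$family$link
fit3$family$link
```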
data: an optional data frame in which to interpret the variables occurring
in the formula, subset and the weights argument. If this is missing, then
the variables in the formula should be on the search list. This may also be
a single number to handle some special cases; see below for details.
na.action: a function to filter missing data. It is applied to the model
frame after any subset argument has been used. The default (with na.fail) is
to create an error if any missing values are found. A possible alternative
is na.omit, which deletes observations that contain one or more missing
values.
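The difference between na.fail and na.omit can be sketched as below. The code is written for R, which is an assumption, and the small data frame d is invented for the example. Note that the default na.action in R is taken from options(), so na.fail is passed explicitly here to mirror the default described above.

```r
# Assumption: R syntax; d is made-up data with one missing response.
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(1, 3, NA, 4, 6, 5))

# na.fail stops with an error as soon as a missing value is found.
res_fail <- tryCatch(
  glm(y ~ x, data = d, na.action = na.fail),
  error = function(e) "error"
)

# na.omit drops the incomplete row and fits on the remaining rows.
fit_omit <- glm(y ~ x, data = d, na.action = na.omit)
n_used <- length(fitted(fit_omit))
```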
control: a list of iteration and algorithmic constants. These can also be
set as arguments to glm itself.
method: character; the name of the fitting method. The default is "glm.fit".
Alternatively, it may indicate that a data structure is to be returned
before fitting. The method "model.frame" returns the model frame, and
"model.list" returns the model list; in either case there is no fitting. If
method="model.list", the fitting method may be included as well, in case the
model list is to be fit later (by a call to glm). For example,
c("model.list", "glm.fit") (the order is not important). This is the only
case in which a vector is recognized.
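Returning the model frame without fitting can be sketched as below. The code is written for R, which is an assumption; R's glm accepts method = "model.frame", while the "model.list" method described above is not assumed to exist there. The data frame d is invented for the example.

```r
# Assumption: R syntax; d is made-up count data.
d <- data.frame(x = 1:8, y = c(2, 1, 4, 3, 6, 8, 7, 11))

# No fitting takes place: the result is the model frame itself.
mf <- glm(y ~ x, family = poisson, data = d, method = "model.frame")

is.data.frame(mf)
names(mf)
```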
model: if TRUE, the model.frame is returned in component model. If this
argument is itself a model.frame, then the formula and data arguments are
ignored, and model is used to define the model.
x: logical flag; if TRUE, the model.matrix is returned in component x.
y: logical flag; if TRUE, the response variable is returned in component y
(default is TRUE).
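The effect of the x and y flags can be sketched as below, again in R as an assumption, with an invented data frame d.

```r
# Assumption: R syntax; d is made-up count data.
d <- data.frame(x = 1:8, y = c(2, 1, 4, 3, 6, 8, 7, 11))

# Ask for the model matrix to be kept; the response is kept by default.
fit <- glm(y ~ x, family = poisson, data = d, x = TRUE)

dim(fit$x)        # model matrix: one row per observation, one column per term
colnames(fit$x)
fit$y             # response variable
```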
...: control arguments may be given directly; see the control argument.
Additional arguments for the fitting routines may also be passed (see
glm.fit). One possibility is qr=TRUE, in which case the QR decomposition of
the model.matrix is returned in component qr.
An object of class "glm" representing the fit, or of class "model.frame" or
"model.list" if signalled by the
method argument. See
,
,
,
and
for details.
The output can be examined by
,
,
,
and
.
Components can be extracted using predict, fitted, residuals, deviance,
formula, and family.
It can be modified using update.
It has all the components of an lm object, with a few more.
Other generic functions that have methods for glm objects are drop1, add1,
step and
preplot. Use
for further details.
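The extractor functions named above can be sketched on a small fit as follows. The code is written for R, which is an assumption, and the data are simulated for the example. The call to update is also an assumption: it is the usual S function for refitting a modified model.

```r
# Assumption: R syntax; d is simulated count data.
set.seed(2)
d <- data.frame(x = runif(30))
d$y <- rpois(30, exp(1 + d$x))

fit <- glm(y ~ x, family = poisson, data = d)

mu   <- fitted(fit)       # fitted means
res  <- residuals(fit)    # deviance residuals (the default type in R)
dev  <- deviance(fit)     # residual deviance
f    <- formula(fit)      # model formula
fam  <- family(fit)       # family object used in the fit
pred <- predict(fit, newdata = data.frame(x = 0.5), type = "response")

# Refit with the x term removed, leaving an intercept-only model.
fit0 <- update(fit, . ~ . - x)
```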
The response variable must conform with the definition of family, for
example factor or binary data if family=binomial is declared.
The model is fit using Iteratively Reweighted Least Squares (IRLS). The
working response and iterative weights are computed using the functions
contained in the family object.
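The IRLS idea can be sketched in a few lines. This is a bare-bones illustration in R under stated assumptions, not the routine glm actually uses: the helper name irls, the starting values y + 0.1, and the convergence test on the coefficients are all choices made for the example. It relies on the linkfun, linkinv, mu.eta and variance components of an R family object. At each step the working response is z = eta + (y - mu) / (dmu/deta) and the iterative weights are w = (dmu/deta)^2 / V(mu).

```r
# Hypothetical helper: a minimal IRLS loop driven by a family object.
irls <- function(X, y, family, maxit = 25, tol = 1e-8) {
  mu   <- y + 0.1                     # assumed starting values (suits counts)
  eta  <- family$linkfun(mu)          # linear predictor implied by the start
  beta <- rep(0, ncol(X))
  for (iter in seq_len(maxit)) {
    dmu <- family$mu.eta(eta)         # derivative of the mean w.r.t. eta
    z   <- eta + (y - mu) / dmu       # working response
    w   <- dmu^2 / family$variance(mu)  # iterative weights
    beta_new <- lm.wfit(X, z, w)$coefficients  # weighted least squares step
    eta <- drop(X %*% beta_new)
    mu  <- family$linkinv(eta)
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

# Usage on simulated Poisson data, compared with glm.
set.seed(3)
d <- data.frame(x = runif(50))
d$y <- rpois(50, exp(0.5 + d$x))
X <- model.matrix(~ x, data = d)

b_irls <- irls(X, d$y, poisson())
b_glm  <- coef(glm(y ~ x, family = poisson, data = d))
```

The two coefficient vectors should agree to within the convergence tolerance, since both iterate the same weighted least squares update.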
GLM models can also be fit using the function gam.
The workhorse of glm is the function glm.fit, which expects an x and y
argument rather than a formula.
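Calling the fitting routine directly with a model matrix and a response can be sketched as below. The code is written for R, which is an assumption; there glm.fit takes x, y and family arguments. The data are simulated for the example.

```r
# Assumption: R syntax; d is simulated count data.
set.seed(4)
d <- data.frame(x = runif(40))
d$y <- rpois(40, exp(0.2 + 1.5 * d$x))

# Formula interface.
fit_formula <- glm(y ~ x, family = poisson, data = d)

# Matrix interface: build the model matrix by hand and pass it with y.
X <- model.matrix(~ x, data = d)
fit_xy <- glm.fit(x = X, y = d$y, family = poisson())

# Both routes give the same coefficients.
fit_xy$coefficients
coef(fit_formula)
```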
NAMES.
Variables occurring in a formula are evaluated differently from arguments to
S-PLUS functions, because the formula is an object that is passed around
unevaluated from one function to another. The functions such as glm that
finally arrange to evaluate the variables in the formula try to establish a
context based on the data argument.
(More precisely, the function model.frame does the actual evaluation,
assuming that its caller behaves in the way described here.)
If the data argument to glm is missing or is an object (typically, a data
frame), then the local context for variable names is the frame of the
function that called glm, or the top-level expression frame if you called
glm directly. Names in the formula can refer to variables in the local
context as well as global variables or variables in the data object.
The data argument can also be a number, in which case that number defines
the local context. This can arise, for example, if a function is written to
call glm, perhaps in a loop, but the local context is definitely not that
function. In this case, the function can set data to sys.parent(), and the
local context will be the next function up the calling stack. A numeric
value for data can also be supplied if a local context is being explicitly
created by a call to new.frame.
Notice that supplying data as a number implies that this is the only local
context; local variables in any other function will not be available when
the model frame is evaluated. This is potentially subtle. Fortunately, it is
not something the ordinary user of glm needs to worry about. It is relevant
for those writing functions that call glm or other such model-fitting
functions.
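A function that calls glm with a formula mixing data columns and a local variable can be sketched as below. The code is written for R, which is an assumption and matters here: R resolves such names through the environment attached to the formula rather than through numbered frames, so the numeric data argument and new.frame described above are not shown. The wrapper name fit_scaled and its argument k are hypothetical.

```r
# Hypothetical wrapper: x and y come from the data frame, k is a local
# variable of the calling function.
fit_scaled <- function(d, k) {
  glm(y ~ I(x / k), family = poisson, data = d)
}

# Assumption: R syntax; d is simulated count data.
set.seed(5)
d <- data.frame(x = runif(40))
d$y <- rpois(40, exp(0.3 + d$x))

fit_k1 <- fit_scaled(d, k = 1)
fit_k2 <- fit_scaled(d, k = 2)

# Dividing x by 2 doubles the slope and leaves the intercept unchanged.
coef(fit_k1)
coef(fit_k2)
```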
McCullagh, P. and Nelder, J. A. (1983), Generalized Linear Models, Chapman and Hall, London.
glm(skips ~ ., family = poisson, data = solder.balance)
glm(Kyphosis ~ poly(Age, 2) + (Number > 5)*Start,
family = binomial, data = kyphosis)
glm(ozone^(1/3) ~ bs(radiation, 5) + poly(wind, temperature, degree = 2),
data = air)