glm
which is a generalized linear
fit of the data. The
glm
function is generic (see Methods); method
functions can be written to handle specific classes of data. Classes
which already have methods for this function include:
model.list
.
glm(formula, family = gaussian, data=<<see below>>, weights, subset=<<see below>>, na.action, start, control, method = "glm.fit", model = F, x = F, y = T, contrasts = NULL, ...)
response ~ predictors
. See the documentation of
and
for details.
link
and
variance
functions, initialization
and iterative weights. Families supported are
,
,
,
,
and
.
Functions like
produce a family object,
but can be given without the parentheses. Family functions can take arguments,
as in
binomial(link=probit)
.
subset
and the
weights
argument.
If this is missing, then the variables in the formula should be on the
search list.
This may also be a single number to handle some special cases -- see
below for details.
subset
argument has been used.
The default (with
na.fail
) is to signal an error
if any missing values are found.
A possible alternative is
na.omit
, which deletes observations
that contain one or more missing values.
glm
itself.
"glm.fit"
.
Or may indicate that a data structure is to be returned
before fitting. The method
"model.frame"
returns the model frame, and
"model.list"
returns the model list;
in either case
there is no fitting. If
method="model.list"
the fitting method may be
included as well, in case the model list is to
be fit later (by a call to
).
For example,
c("model.list", "glm.fit")
(the order is not important). This is the
only case in
which a vector is recognized.
TRUE
, the
is returned in component
model
.
If this argument is itself a
,
then the
formula
and
data
arguments are ignored, and
model
is used to define the model.
TRUE
, the
is returned in component
x
.
TRUE
, the response variable is returned in
component
y
(default is
TRUE
).
control
argument.
May also pass additional arguments for the fitting routines (see
).
One possibility is
qr=TRUE
, in which case the QR
decomposition of the model.matrix is returned in component
qr
.
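The forms of the family argument described above can be compared in a short sketch. The data here are made up for illustration and are not from the original documentation; the code is written in S syntax that is also expected to run in R.

```r
# Made-up binary response, for illustration only.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(0, 0, 1, 0, 1, 0, 1, 1)

# The family may be given as a function name or as a call to it.
fit.a <- glm(y ~ x, family = binomial)
fit.b <- glm(y ~ x, family = binomial())

# Family functions can take arguments, here a non-default link.
fit.c <- glm(y ~ x, family = binomial(link = probit))
```

Under these assumptions, fit.a and fit.b describe the same model, while fit.c differs only in its link function.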
"glm"
representing the fit, or of class
"model.frame"
or
"model.list"
if signalled by the
method
argument. See
,
,
,
and
for details.
The output can be examined by
,
,
,
and
.
Components can be extracted using
predict
,
fitted
,
residuals
,
deviance
,
formula
, and
family
.
It can be modified using
.
It has all the components of an
object, with a few more.
Other generic functions that have methods for
glm
objects are
drop1
,
add1
,
step
and
preplot
. Use
for further details.
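As a rough illustration of the extractor functions named above, the following sketch fits a small model and pulls out some of its components. The data are invented for this example, and the code is written in S syntax that is also expected to run in R.

```r
# Made-up count data, for illustration only.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2, 3, 6, 7, 8, 9, 10, 12)
fit <- glm(y ~ x, family = poisson)

mu.hat <- fitted(fit)      # fitted means, one per observation
res    <- residuals(fit)   # residuals, one per observation
dev    <- deviance(fit)    # residual deviance of the fit
form   <- formula(fit)     # the model formula that was fit
fam    <- family(fit)      # the family object used for the fit
```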
The response variable must conform to the definition of
family
, for example
factor or binary data if
family=binomial
is declared.
The model is fit using
Iteratively Reweighted Least Squares (IRLS). The working response and iterative weights are computed using the functions contained in the
family
object.
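The IRLS iteration can be sketched by hand for one specific case, a Poisson model with log link. This is a minimal illustration, not the code that glm itself runs: the data, the starting values, and the fixed iteration count are all assumptions made for the example, and the working response and weights are written out directly rather than taken from a family object.

```r
# Made-up count data, for illustration only.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2, 3, 6, 7, 8, 9, 10, 12)
X <- cbind(1, x)                  # model matrix with an intercept column

mu  <- y + 0.1                    # simple starting values for the means
eta <- log(mu)                    # linear predictor under the log link
for (i in 1:25) {                 # fixed number of iterations, for simplicity
    z <- eta + (y - mu) / mu      # working response
    w <- mu                       # iterative weights for Poisson with log link
    # Weighted least squares of z on X with weights w
    beta <- solve(t(X) %*% (w * X), t(X) %*% (w * z))
    eta <- as.vector(X %*% beta)
    mu  <- exp(eta)
}

fit <- glm(y ~ x, family = poisson)   # for comparison with beta
```

With these assumptions, beta should agree closely with the coefficients of fit, since both solve the same likelihood equations.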
Generalized linear models can also be fit using the function
.
The workhorse of
glm
is the function
which expects an
x
and
y
argument rather than a formula.
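A hedged sketch of the matrix interface follows. It assumes the fitting function is the one named by the default method = "glm.fit" in the usage above, and that it accepts a model matrix, a response vector, and a family object; the data are made up for illustration.

```r
# Made-up count data, for illustration only.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2, 3, 6, 7, 8, 9, 10, 12)

# Formula interface
fit.formula <- glm(y ~ x, family = poisson)

# Matrix interface: the intercept column must be supplied explicitly
X <- cbind(Intercept = 1, x = x)
fit.matrix <- glm.fit(X, y, family = poisson())
```

Under these assumptions, the coefficients component of fit.matrix should match the coefficients of fit.formula.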
NAMES.
Variables occurring in a formula are evaluated differently from
arguments to S-PLUS functions, because the formula is an object
that is passed around unevaluated from one function to another.
The functions such as
glm
that finally arrange to evaluate the
variables in the formula try to establish a context based on the
data
argument.
(More precisely, the function
does the
actual evaluation, assuming that its caller behaves in
the way described here.)
If the
data
argument to
glm
is missing or is an object (typically, a data frame),
then the local context for
variable names is the frame of the function that called
glm
, or the top-level
expression frame if you called
glm
directly.
Names in the formula can refer to variables in the local context as well
as global variables or variables in the
data
object.
The
data
argument can also be a number, in which case that number defines
the local context.
This can arise, for example, if a function is written to call
glm
, perhaps
in a loop, but the local context is definitely
not that function.
In this case, the function can set
data
to
sys.parent()
, and the local
context will be the next function up the calling stack.
A numeric value for
data
can also be supplied if a local context
is being explicitly created by a call to
new.frame
.
Notice that supplying
data
as a number implies that this is the
only local context; local variables in any other function will not be
available when the model frame is evaluated.
This is potentially subtle.
Fortunately, it
is not something the ordinary user of
glm
needs to worry about.
It is relevant for those writing functions that call
glm
or other
such model-fitting functions.
McCullagh, P. and Nelder, J. A. (1983), Generalized Linear Models, Chapman and Hall, London.
glm(skips ~ ., family = poisson, data = solder.balance)
glm(Kyphosis ~ poly(Age, 2) + (Number > 5)*Start, family = binomial, data = kyphosis)
glm(ozone^(1/3) ~ bs(radiation, 5) + poly(wind, temperature, degree = 2), data = air)