tapply(X, INDICES, FUN=<<see below>>, ..., simplify=T)
bdVector of data to be grouped by indices. Missing values (NAs) are allowed if FUN accepts them.
X.
The elements of the categories define the position in a multi-way array corresponding to each X observation. Missing values (NAs) are allowed. The names of INDICES are used as the names of the dimnames of the result. If a vector or bdVector is given, it will be treated as a list with one component.
If FUN is omitted, tapply returns a vector that can be used to subscript the multi-way array that tapply normally produces. This vector is useful for computing residuals. See the example.
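The residual computation described above can be sketched on a small scale. The data below are invented for illustration, and the code is written for R, where the second argument is named INDEX rather than INDICES, so the arguments are passed by position. It is a sketch of the idea, not part of the original help page.

```r
# Hypothetical data: six observations classified by two factors.
x <- c(10, 12, 20, 24, 11, 30)
a <- factor(c("p", "p", "q", "q", "p", "q"))
b <- factor(c("u", "v", "u", "v", "u", "v"))

# Cell means: a 2 by 2 matrix with one cell per (a, b) combination.
m <- tapply(x, list(a, b), mean)

# With FUN omitted, the result is the position of each observation's
# cell in m, treating m as a vector (column-major order).
pos <- tapply(x, list(a, b))

# Subtract each observation's own cell mean.
resid <- x - m[pos]
```

Because pos has one entry per observation, m[pos] expands the table of cell means back to the length of x.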
FUN.
If FALSE, tapply will always return an array of mode list. If TRUE (the default) and FUN always returns a scalar, tapply returns an array with the mode of the scalar; if that array would be one-dimensional, the dimension is removed to make it a vector. (simplify is ignored if FUN is not supplied.)
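The effect of simplify can be checked with a short comparison. The data are hypothetical and the code assumes R, with the arguments before FUN passed by position.

```r
# Hypothetical data: five observations in two groups.
x <- c(10, 12, 20, 24, 11)
g <- factor(c("a", "a", "b", "b", "a"))

# Default: mean returns a scalar per cell, so the result is numeric.
simplified <- tapply(x, g, mean)

# simplify = FALSE keeps one list component per cell.
as_list <- tapply(x, g, mean, simplify = FALSE)

mode(simplified)
mode(as_list)
```

The list form can be convenient when later code must handle scalar and non-scalar results the same way.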
If FUN is missing, a vector of indices is returned. These are the indices giving the position in the array that would be returned if FUN were not missing.
If FUN is present, tapply calls FUN for each cell that has any data in it. If FUN returns a single atomic value for each cell (e.g. functions mean or var), then tapply returns a multi-way array containing the values. The array has the same number of dimensions as INDICES has components; the number of levels in a dimension is the number of levels in the corresponding component of INDICES. This is a vector if INDICES has only one component.
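A small illustration of how the shape of the result follows the grouping components, using invented data. The code assumes R and passes the grouping list by position. Note that R keeps a single dimension on the one-component result rather than dropping it.

```r
# Hypothetical data: six incomes classified by gender and region.
income <- c(30, 42, 55, 61, 38, 47)
gender <- factor(c("F", "M", "F", "M", "F", "M"))
region <- factor(c("N", "N", "S", "S", "N", "S"))

# Two grouping components give a two-dimensional result.
m <- tapply(income, list(gender, region), mean)
dim(m)        # one extent per component, each equal to its number of levels
dimnames(m)   # the levels of gender and of region

# One grouping component gives one value per level.
v <- tapply(income, gender, mean)
```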
If FUN does not return a single atomic value, tapply returns an array of mode "list", whose components are the values of the individual calls to FUN. Another way of saying this is that the result is a list that has a dim attribute (this prints as a list, but you can subscript it like an array).
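As a sketch of the list-mode case, range returns two values per cell, so the result cannot be simplified. The data are hypothetical and the code assumes R.

```r
# Hypothetical data: five observations in two groups.
x <- c(10, 12, 20, 24, 11)
g <- factor(c("a", "a", "b", "b", "a"))

# range returns c(min, max), which is not a single atomic value.
r <- tapply(x, g, range)

mode(r)    # "list"
dim(r)     # the result still carries a dim attribute
r[["a"]]   # the value of the call to range for group "a"
```

Each component holds exactly what FUN returned for that cell, so components may differ in length when FUN's output depends on the data.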
Evaluates a function, FUN, on data values that correspond to each cell of a multi-way array.
tapply.
tapply(income, list(cut(age, 5), gender), mean)
# 5 by 2 matrix of the mean income for each age-gender combination

# generate mean republican votes for regions of the U.S.
# category that gives the region for each observation
region <- state.region[row(votes.repub)]
election <- category(votes.year)[col(votes.repub)]
mn <- tapply(votes.repub, list(region, election), mean)
round(mn, 1)  # table of mean vote by region and election
positions <- tapply(votes.repub, list(region, election))
# positions is a vector of indices for mn (treated as a vector)
residuals <- votes.repub - mn[positions]