Data Frame Objects

DESCRIPTION:

Data frames are objects of class data.frame.
They combine the behavior of matrices, in the sense that they can be addressed by rows (observations) and columns (variables), with the behavior of lists or frames in S-PLUS, in the sense that the variables can be used like individual objects.

GENERATION:

The functions data.frame and read.table generate data frames.
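
As a minimal sketch of the first route, the example below builds a data frame from two vectors of equal length and then addresses it both ways described above. The variable names (height, weight) and the row names are made up for illustration.

```r
# Hypothetical variables: two numeric vectors with three observations each.
height <- c(1.62, 1.75, 1.80)
weight <- c(55, 72, 80)

# Combine them into a data frame; the row names supplied here are unique.
d <- data.frame(height = height, weight = weight,
                row.names = c("a", "b", "c"))

# Address the object like a matrix (by row and column) ...
d["b", "weight"]

# ... or like a list (one variable at a time).
d$height
```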

METHODS:

Generic functions that have methods for class "data.frame" include:
[, [[, aperm, atan, dbdetach, dim, dimnames, formula, ordered<-, pairs, plot, print, signif, summary, t.
In addition, the groups Math, Ops, and Summary all have methods for "data.frame".

INHERITANCE:

The classes "design" and "pframe" inherit from "data.frame".

STRUCTURE:

Data frames are implemented as lists whose components are vectors or matrices with the same number of observations (length for vectors, number of rows for matrices). The following attributes are required:

row.names
a vector of length equal to the number of observations. The row names must be unique, except as noted below.
names
the variable names. They must exist, be of full length, and be unique.

The following attribute is optional:

dup.row.names
if this is non-null, the row.names need not be unique. This makes creating and subscripting the data frame faster.

DETAILS:

Data frames can be used like lists or frames, for example, by attaching the object to the search list, by setting it up as a frame in the evaluator or the browser, or by passing it to a model-fitting function along with a formula using the variable names in the data frame.
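
The last of these uses, passing a data frame to a model-fitting function, can be sketched as follows. The data values and the variable names x and y are invented for illustration; lm is used only as one example of a model-fitting function that accepts a formula and a data argument.

```r
# Hypothetical data: five observations of two numeric variables.
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(2.1, 3.9, 6.2, 7.8, 10.1))

# The formula refers to variables by their names in the data frame,
# so x and y need not exist as separate objects.
fit <- lm(y ~ x, data = d)

# Extract the fitted intercept and slope.
coef(fit)
```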
Many matrix-like computations are defined as methods for data frames, notably, subsets and the dim and dimnames attributes. Data frames are not ordinary matrices; most importantly, any object can become a variable in the data frame, so long as it is addressable by the observations. In practice, this means that the variables should be one of vectors, matrices, or some other class of objects that can itself be treated as either a vector or matrix (in particular, can be subset like a vector or matrix). If the variable is vector-like, it should have length equal to the number of rows; if matrix-like, it should have the same number of rows as the data frame.
The dimension and the dimnames of a data frame are defined differently from those of a matrix. Every data frame is required to have an attribute "row.names" whose length is, by definition, the number of rows of the data frame. The number of columns is, by definition, the number of variables; that is, the length of the data frame as a list (this is true even if any variable is a matrix). The dimnames list is equivalent to list(row.names(x), names(x)).
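
These definitions can be checked directly on a small object. The sketch below assumes a hypothetical data frame with three observations and two variables; the names r1, r2, r3, x, and y are illustrative only.

```r
# Hypothetical data frame: three observations, two variables.
d <- data.frame(x = c(10, 20, 30),
                y = c("u", "v", "w"),
                row.names = c("r1", "r2", "r3"))

# Number of rows: the length of the row.names attribute.
n.rows <- length(row.names(d))

# Number of columns: the length of the data frame as a list.
n.cols <- length(d)

# dim and dimnames agree with the definitions above.
dim(d)
dimnames(d)
list(row.names(d), names(d))
```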
A data frame is a matrix, but does not inherit from class "matrix". If x is a data frame, is.matrix(x) is TRUE, but inherits(x, "matrix") is FALSE.
