crosstabs(formula, data=sys.parent(), margin=<<see below>>, subset, na.action=na.fail, drop.unused.levels=T, yates=F)
+ operators, on the right of the ~. Each term on the right hand side should be a factor, and will be converted to one if not. If there is a term to the left of the ~, it should be a vector of counts; this is useful for data that has already been tabulated. If the formula is omitted or is ~ . and the data argument is a data frame, then all the variables in data will be cross-tabulated.
formula (and in the subset argument) may be found. If a variable is not found by searching in the data frame or frame given by data, it is expected to be on the search list.
data argument, otherwise they will be looked for on the search list. All observations are included by default.
1 to the number of variables to be cross-tabulated, and repeated values within a vector are not allowed. The names of the list are the labels to put in the legend printed with the table. 1 means to calculate row sums and integer(0) means to calculate the grand sum. The default for a two way cross-tabulation is list("Row%"=1, "Col%"=2, "Total%"=integer(0)) and that for a one way table is list("Total%"=integer(0)). For higher dimensional cross-tabulations, the default results in printing the row and column proportions for each layer: list("N/RowTotal" = setdiff(i, 2), "N/ColTotal" = setdiff(i, 1), "N/Total" = integer(0)), where i is 1:number.of.factors. The margin argument here is similar to that in loglin.
na.action. The default is na.fail, which issues a fatal error message describing the problem. A common alternative is na.exclude, which deletes cases with NAs in any of the variables to be cross-tabulated. na.include will add the level NA to each factor before cross-tabulating them (the formula may also include terms like na.include(x) to do this only for certain variables).
print.crosstabs, and has no effect if no printing is done.
crosstabs. This is an array of counts, suitable for use in functions like loglin. It also has an attribute marginals, a list of arrays of the marginal proportions specified by the margin argument. (These arrays are stacked by the print method for crosstabs so that corresponding entries lie near each other.) It also may have an attribute na.message, giving a message that the na.action function sometimes gives when it deals with missing values in the data (e.g., na.exclude will supply a na.message telling how many cases were ignored).
This function provides a convenient interface to the table and tapply functions, for tabulation (counting the number of observations that fall in each cell in a contingency table). If you want to do other calculations, say compute means or sums for observations in cells, try tapply.
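As a rough illustration of the tapply alternative mentioned above, the sketch below computes a mean for each cell instead of a count. The data frame pets and its Weight column are hypothetical, invented only for this example; tapply itself is the only piece taken from the text.

```s
# Hypothetical data: 'pets' and its 'Weight' column are invented for illustration.
pets <- data.frame(Pet    = c("Dog", "Dog", "Cat", "Cat"),
                   Food   = c("Wet", "Wet", "Dry", "Wet"),
                   Weight = c(20, 30, 4, 5))

# Mean weight in each Pet x Food cell.
# A cell with no observations (here Dog/Dry) comes back as NA.
cell.means <- tapply(pets$Weight, list(pets$Pet, pets$Food), mean)
cell.means
```

Replacing mean with sum gives cell totals in the same layout.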
The printing method, print.crosstabs, will generally add row and column totals for each 2 dimensional layer of the table and will compute an overall chi squared statistic to test independence of all the variables in the table. To omit them, call print.crosstabs directly.
The formula could be used to describe the marginal proportions and tests to perform but does not yet. Hence all terms should be addends in the formula.
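The argument descriptions mention two usages that the examples do not exercise: a vector of counts on the left of the ~, and a user-supplied margin list. The following is a minimal sketch of both, not a tested recipe. The data frame pettab and its Count column are hypothetical, and the legend label passed to margin is an arbitrary choice.

```s
# Hypothetical pre-tabulated data: one row per cell, with the cell count in 'Count'.
pettab <- data.frame(Pet   = c("Cat", "Cat", "Dog"),
                     Food  = c("Dry", "Wet", "Wet"),
                     Count = c(1, 1, 2))

# The term on the left of ~ supplies the counts, so each row of 'pettab'
# stands for 'Count' observations rather than a single one.
crosstabs(Count ~ Pet + Food, data = pettab)

# Request only the proportions of row totals by passing a one-element margin list.
# The list name is the label shown in the legend printed with the table.
crosstabs(Count ~ Pet + Food, data = pettab, margin = list("N/RowTotal" = 1))
```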
crosstabs(~Solder+Opening, data=solder, subset=skips>10)
# Produces the following output:
# Call:
# crosstabs( ~ Solder + Opening, data = solder, subset = skips > 10)
# 158 cases in table
# +----------+
# |N         |
# |N/RowTotal|
# |N/ColTotal|
# |N/Total   |
# +----------+
# Solder |Opening
#        |S      |M      |L      |RowTotl|
# -------+-------+-------+-------+-------+
# Thin   |99     |15     | 9     |123    |
#        |0.805  |0.122  |0.073  |0.78   |
#        |0.805  |0.577  |1.000  |       |
#        |0.627  |0.095  |0.057  |       |
# -------+-------+-------+-------+-------+
# Thick  |24     |11     | 0     |35     |
#        |0.686  |0.314  |0.000  |0.22   |
#        |0.195  |0.423  |0.000  |       |
#        |0.152  |0.070  |0.000  |       |
# -------+-------+-------+-------+-------+
# ColTotl|123    |26     |9      |158    |
#        |0.778  |0.165  |0.057  |       |
# -------+-------+-------+-------+-------+
# Test for independence of all factors
# Chi^2 = 9.18309 d.f.= 2 (p=0.01013719)
# Yates' correction not used
# Some expected values are less than 5, don't trust stated p-value

# Example 2
petfood <- data.frame(Pet=c("Dog","Dog","Cat","Cat","Cat"),
                      Food=c("Wet","Wet","Dry","Wet",NA))
crosstabs(data=petfood, na.action=na.exclude)
# Produces the following output:
# Call:
# crosstabs(data = petfood, na.action = na.exclude)
# 4 cases in table
# Dropping 1 cases because of missing values
# +----------+
# |N         |
# |N/RowTotal|
# |N/ColTotal|
# |N/Total   |
# +----------+
# Pet    |Food
#        |Dry    |Wet    |RowTotl|
# -------+-------+-------+-------+
# Cat    |1      |1      |2      |
#        |0.50   |0.50   |0.5    |
#        |1.00   |0.33   |       |
#        |0.25   |0.25   |       |
# -------+-------+-------+-------+
# Dog    |0      |2      |2      |
#        |0.00   |1.00   |0.5    |
#        |0.00   |0.67   |       |
#        |0.00   |0.50   |       |
# -------+-------+-------+-------+
# ColTotl|1      |3      |4      |
#        |0.25   |0.75   |       |
# -------+-------+-------+-------+
# Test for independence of all factors
# Chi^2 = 1.333333 d.f.= 1 (p=0.2482131)
# Yates' correction not used
# Some expected values are less than 5, don't trust stated p-value