arbor
trees.
The following components must be included in a legitimate arbor object. Of these, only the where component has the same length as the data used to fit the arbor object.
The row.names of frame contain the (unique) node numbers that follow a binary ordering indexed by node depth. Elements of frame include var, the variable used in the split at each node (leaf nodes are denoted by the string <leaf>); n, the size of each node; wt, the sum of the weights given to the observations in each node; dev, the deviance of each node; yval, the fitted value of the response at each node; and splits, a two-column matrix of left and right split labels for each node. All of these are the same as for a tree object.
yval2
.
The first column of yval2 is the same as yval. For the poisson and exponential methods, the second column of yval2 contains the number of events at the node. For classification, the rest of yval2 consists of matrices of class counts and probabilities. Each matrix includes a column for each class. The anova method does not have a yval2.
Also included in the frame are complexity, the complexity parameter at which this split will collapse; ncompete, the number of competitor splits retained; and nsurrogate, the number of surrogate splits retained.
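The binary ordering of the node numbers stored in the row names of frame can be made concrete with a small sketch. It assumes the usual binary-tree convention that the root is node 1 and that node k has children 2k and 2k + 1; the text above does not spell this out, so treat it as an assumption. The helper names node_depth, node_children and node_parent are hypothetical and are not part of arbor.

```r
# Hypothetical helpers (not part of arbor).
# Assumed convention: root = 1, children of node k are 2k (left) and 2k + 1 (right).
node_depth    <- function(node) floor(log2(node))          # root has depth 0
node_children <- function(node) c(left = 2 * node, right = 2 * node + 1)
node_parent   <- function(node) node %/% 2                 # gives 0 for the root (no parent)

# Illustrative node numbers, as as.numeric(row.names(frame)) might return them.
nodes <- c(1, 2, 3, 6, 7, 14, 15)

data.frame(node   = nodes,
           depth  = node_depth(nodes),
           parent = node_parent(nodes))
```

Under this convention the depth of a node and its position in the tree can be recovered from the node number alone, which is why the row names are enough to reconstruct the tree shape.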
frame corresponding to the leaf node that each observation falls into.
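As a rough illustration of how the where component relates observations to frame, the sketch below uses a small hand-built data frame in place of a fitted object. It assumes that where holds row positions into frame (the start of the description is missing here, so this is an assumption), and all values are made up.

```r
# Hand-built stand-in for the frame component (values are made up).
frame <- data.frame(
  var  = c("x1", "<leaf>", "<leaf>"),
  n    = c(5, 2, 3),
  yval = c(3.2, 1.5, 4.3),
  row.names = c("1", "2", "3")          # node numbers
)

# Assumed: where[i] is the row position in frame of the leaf holding observation i.
where <- c(2, 3, 3, 2, 3)

leaf_node <- as.integer(row.names(frame))[where]  # node number for each observation
fitted    <- frame$yval[where]                    # fitted value for each observation

table(leaf_node)   # leaf sizes; these agree with frame$n for the two leaves
```

Indexing frame by where in this way gives one value per observation, which is consistent with where being the only component as long as the fitting data.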
count, the number of observations sent left or right by the split (for competitor splits this is the number that would have been sent left or right had this split been used; for surrogate splits it is the number missing the primary split variable that were decided using this surrogate); ncat, the number of categories or levels for the variable (+/-1 for a continuous variable); improve, the improvement in deviance given by this split or, for surrogates, the concordance of the surrogate with the primary; index, the numeric split point; and adj, a measure of how much the split gains over and above a naive one. For a factor, the index column contains the row number of the csplit matrix. For a continuous variable, the sign of ncat determines whether the subset x<cutpoint or x>cutpoint is sent to the left.
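The role of the sign of ncat for a continuous variable can be sketched as follows. The text does not say which sign corresponds to which direction, so the mapping used here (negative sends x<cutpoint to the left) is an assumption made purely for illustration, and goes_left is a hypothetical helper that is not part of arbor.

```r
# Hypothetical helper (not part of arbor).
# Assumed convention, for illustration only:
#   ncat < 0  ->  observations with x < cutpoint go left
#   ncat > 0  ->  observations with x > cutpoint go left
goes_left <- function(x, ncat, cutpoint) {
  stopifnot(abs(ncat) == 1)             # continuous variables only
  if (ncat < 0) x < cutpoint else x > cutpoint
}

x <- c(0.5, 1.8, 2.5, 3.1)
goes_left(x, ncat = -1, cutpoint = 2)   # first two observations go left
goes_left(x, ncat =  1, cutpoint = 2)   # last two observations go left
```

Checking the actual convention against a fitted object (for example by comparing where with the split variable) is advisable before relying on either direction.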
expression and class term summarizing the formula. Used by various methods, but typically not of direct relevance to users.
control is described in the help files for arbor and arbor.control.
parms varies by method. See the manual for details of usage and a description of the output.
summary and print used by the different methods. See the manual for more details.
update(tree)
.
x), the response vector (y) and the model frame (model). If none of these is requested explicitly, by setting its argument=T, the response variable (y) will be returned by default. The model frame from one run can be used as input into a future run of the arbor function; see the examples in the arbor help file.