This function requires the bigdata library section to be loaded.
bd.split(data, expr, row.language=T)
The data argument is one or more bdFrame(s) or data.frame(s).
The expr argument is evaluated as an S-PLUS expression if the row.language argument is F. In this case, the S-PLUS expression cannot perform any big data operations, or an error is generated.
If row.language is TRUE, expr is evaluated using the row-oriented expression language. If it is FALSE, expr is evaluated as a general S-PLUS expression.
The function returns two data sets: the first contains the rows that satisfy expr, and the second contains the remaining rows. The data sets are either bdFrame or data.frame objects, of the same type as data[[1]].
This function determines whether each row should be output in the first or second output data set by evaluating expr in S-PLUS or the expression language.
The expression language allows expressions and operators in terms of the input column names.
An S-PLUS expression should be one that works on vectors, since the columns referenced by the column names are passed to it as vectors.
The result in both cases should be a logical value.
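The row-by-row decision described above can be pictured with ordinary S subsetting on a small data.frame. This is only a conceptual sketch, not how bd.split is implemented; the `toy` data set and the names `keep`, `first`, and `second` are made up for illustration.

```s
# Conceptual sketch only: split a small data.frame by a vectorized
# logical expression, mimicking the two outputs described above.
toy <- data.frame(Weight  = c(2560, 3200, 2900, 3450),
                  Mileage = c(33, 23, 26, 20))   # hypothetical data

keep   <- toy$Weight > 3000   # the column is a vector, so this gives one logical per row
first  <- toy[keep, ]         # rows where the expression is true
second <- toy[!keep, ]        # the remaining rows
```

The expression works on the whole Weight column at once and returns a logical vector, which is the shape of result an S-PLUS expression passed as expr is expected to produce.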
If data specifies more than one input data set, the columns from the input data sets are referenced by adding "inN$" to the beginning of the column name. For example, the column named "Weight" in the first input data set would be referenced as "in1$Weight", and the column "X" in the third input data set would be referenced as "in3$X".
If there are multiple input data sets, the second, third, etc. inputs are used to evaluate expr, but their rows are not copied to the output.
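A hedged sketch of the multiple-input case follows. It assumes that several inputs are passed as a list (suggested by the reference to data[[1]]) and that the second input has one row for each row of the first; neither point is confirmed here. The `limits` data set and its `MaxWeight` column are hypothetical.

```s
# Assumption: multiple inputs are supplied as a list, and the rows of
# the second input line up with the rows of the first.
limits <- data.frame(MaxWeight = rep(3000, nrow(fuel.frame)))  # hypothetical second input

# in1$ refers to columns of fuel.frame, in2$ to columns of limits.
# Only rows of the first input would appear in the two outputs.
bd.split(list(fuel.frame, limits), "in1$Weight > in2$MaxWeight")
```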
## Return one data set with the rows where Weight>3000,
## and a second data set with the remaining rows.
bd.split(fuel.frame, "Weight>3000")
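A possible variant of the same call with row.language=F is sketched below. It assumes that expr is still given as a character string in this mode, which is not stated here. The expression is vectorized and performs no big data operations, in line with the restrictions described earlier.

```s
## Same split, but with expr evaluated as a general S-PLUS expression
## on whole column vectors instead of in the row-oriented expression language.
## Assumption: expr is still passed as a string when row.language=F.
bd.split(fuel.frame, "Weight>3000", row.language=F)
```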