This function requires the bigdata library section to be loaded.
bd.filter.rows(data, expr, columns=NULL, include=T, row.language=T)
The data argument specifies the input bdFrame(s) or data.frame(s).
The expr argument is evaluated as an S-PLUS expression if the row.language argument is F. In this case, the S-PLUS expression cannot perform any big data operations, or an error is generated.
If columns is NULL, this specifies all of the data set columns.
If there are multiple input data sets, this only refers to the first one.
If include is TRUE, then only the selected rows are included in the output.
If FALSE, then the selected rows are excluded from the output.
If row.language is TRUE, evaluate expr using the row-oriented expression language.
If FALSE, evaluate expr as a general S-PLUS expression.
Returns a bdFrame or data.frame of the same type as data[[1]], with rows selected by expr included or excluded.
This function determines whether each row should be selected by evaluating expr as a general S-PLUS expression or using the row-oriented expression language.
The expression language allows expressions and operators in terms of the input column names.
When you use an S-PLUS expression, it should be one that works on vectors,
because the columns referenced by the column names are passed to the
S-PLUS expression as vectors.
The result in both cases should be a logical value.
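As a rough sketch of the two evaluation modes, the calls below filter fuel.frame on a compound condition, first with the row-oriented expression language and then as a general S-PLUS expression. The Mileage column and the & operator are assumed to be usable as shown. Passing expr as a character string when row.language=F is also an assumption, not something this page states.

```s
## Row-oriented expression language (the default, row.language=T):
## keep rows where both conditions hold.
bd.filter.rows(fuel.frame, "Weight>3000 & Mileage<25")

## General S-PLUS expression (row.language=F): Weight and Mileage
## are passed as vectors, so the vectorized & gives one logical per row.
## Assumption: expr is still supplied as a character string in this mode.
bd.filter.rows(fuel.frame, "Weight>3000 & Mileage<25", row.language=F)
```

In both calls the expression yields a logical result for each row, which is what the function uses to decide whether the row is selected.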
If data specifies more than one input data set, then the columns from the input data sets are referenced by adding "inN$" to the beginning of the column name, where N is the position of the input data set.
For example, the column named "Weight" in the first input data set would be referenced as "in1$Weight", and the column "X" in the third input data set would be referenced as "in3$X".
If there are multiple input data sets, then the second, third, and so on inputs are used to evaluate expr, but their rows are not copied to the output.
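The multiple-input case might look like the sketch below. The second data set, limits, and its MaxWeight column are hypothetical names introduced only for illustration. The sketch also assumes that the inputs are supplied as a list (suggested by the reference to data[[1]] above) and that they have the same number of rows.

```s
## Hypothetical second input: one weight limit per row of fuel.frame.
limits <- data.frame(MaxWeight = rep(3000, nrow(fuel.frame)))

## Columns are prefixed with the position of their input data set:
## in1$ for fuel.frame, in2$ for limits.
## Only rows from the first input (fuel.frame) appear in the output.
bd.filter.rows(list(fuel.frame, limits), "in1$Weight > in2$MaxWeight")
```

Here limits is used only to evaluate the expression, so the result has the same type as fuel.frame.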
## Get only those rows where Weight>3000.
bd.filter.rows(fuel.frame, "Weight>3000")
## Remove rows where Type=="Van".
bd.filter.rows(fuel.frame, "Type=='Van'", include=F)