This function requires the bigdata library section to be loaded.
bd.by.window(data, window, FUN, args=NULL, offset=0,
             drop.incomplete=F, output=T)
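The data argument is the input data set, a bdFrame or data.frame.
The window argument is the number of rows in each data block.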
The FUN argument is an S-PLUS function
that is called to process each data block as a data frame. This function itself cannot
perform any big data operations; if it does, an error is generated.
If the args argument is NULL, then the FUN function should have only one argument,
the input data block.
If it is a list, then its elements are passed as additional arguments
to the FUN function.
If the list elements have names,
these must match argument names of the FUN function.
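The offset argument is the number of rows the window moves between blocks.
The default, offset=0, is treated as offset=window, so each block directly follows
the previous one.
If offset is greater than window,
some rows will be skipped between blocks.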
If the drop.incomplete argument is TRUE, only blocks with exactly
window rows are processed.
If it is FALSE, blocks at the end of the data set are
processed even if they have fewer than window rows.
The output argument controls whether this function returns the data frames
output by the FUN function.
This can be set to FALSE to execute a function
only for its side effects.
If the output argument is TRUE,
this function returns a bdFrame or data.frame,
of the same type as data, formed by appending the data frames output
by the FUN function.
If the output argument is FALSE,
this function returns NULL.
This function applies the S-PLUS function (FUN) to
multiple data blocks within the input data, as defined
by a moving window over the data rows.
Each data block is converted to a data.frame and passed
to the FUN function.
If one of the data blocks is too large to fit in memory, an error occurs.
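The way window, offset, and drop.incomplete interact can be easier to see as row ranges. The following is a minimal sketch in plain S syntax, not the library's implementation. The helper window.rows is hypothetical and is not part of the bigdata library. It assumes that offset=0 stands for offset=window and that a block starts at every offset-th row up to the last row of the data.

```r
## Hypothetical helper (not part of the bigdata library): list the first
## and last row of each block for a data set with n rows (n >= 1).
window.rows <- function(n, window, offset = 0, drop.incomplete = F) {
    ## Assumption: offset=0 (the default) stands for offset=window;
    ## this sketch treats any offset below 1 the same way.
    if (offset < 1) offset <- window
    ## Assumption: a block starts at every offset-th row up to row n.
    first <- seq(1, n, by = offset)
    ## Blocks near the end are cut off at the last row.
    last <- pmin(first + window - 1, n)
    if (drop.incomplete) {
        ## Keep only blocks that contain exactly 'window' rows.
        keep <- (last - first + 1) == window
        first <- first[keep]
        last <- last[keep]
    }
    data.frame(first = first, last = last)
}

window.rows(12, 5)                       # blocks 1-5, 6-10, 11-12
window.rows(12, 5, drop.incomplete = T)  # blocks 1-5, 6-10
window.rows(12, 5, offset = 7)           # blocks 1-5, 8-12; rows 6-7 skipped
```

Under these assumptions, the short trailing block appears only when drop.incomplete is FALSE, and an offset larger than window leaves rows that no block covers.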
## For each distinct block of five rows in fuel.frame,
## calculate the mean of the Weight column.
bd.by.window(fuel.frame, 5,
    function(df)
        data.frame(meanWeight=mean(df$Weight)))
## For a moving window of five rows in fuel.frame,
## with each block adjusted by one row, including the
## short blocks at the end of the dataset, print
## the mean of the Weight column and the number of rows
## in the block.
bd.by.window(fuel.frame, 5,
    function(df)
        cat("mean=",mean(df$Weight),"nrow=",nrow(df),"\n"),
    offset=1, output=F)
## For a moving window of five rows in fuel.frame,
## if the mean of the Weight column is greater than
## the min.mean value passed in via the args argument,
## print the mean of the Weight column in the block.
bd.by.window(fuel.frame, 5,
    function(df, min.mean)
        if (mean(df$Weight)>min.mean)
            cat("mean=",mean(df$Weight),"\n"),
    output=F, args=list(min.mean=3000))