This function requires the bigdata library section to be loaded.
bd.partition(data, train=0.7, test=0.3, seed=NULL)
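data: a bdFrame or data.frame.
train: the training fraction, a value from 0 to 1.
test: the test fraction, a value from 0 to 1.
seed: if NULL, a new random seed is used for sampling every time. If an integer, that value is used as the seed. The default value sets the seed based on the S-PLUS random seed.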
Returns a list of bdFrame(s) or data.frame(s), of the same type as data.
Side effect: creates .Random.seed if it does not already exist; otherwise its value is updated.
This function splits the input into up to three outputs according to the train and test fraction parameters. The length of the returned list depends on the fractions supplied. For example, if train is 1.0 or greater, only one output is generated. If train=.25 and test=.75, a two-element list is returned. If train=.23 and test=.75, the returned list contains three objects; the third, the validation object, holds the remaining 2 percent of the observations.
# Partition fuel.frame into three data sets containing 65%, 20%
# and 15% of the observations:
bd.partition(fuel.frame, 0.65, 0.20)
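The splitting rule described above can be approximated with base functions only. The following is a minimal sketch, not the bigdata library's implementation: sketch.partition is a hypothetical helper, and the rounding of group sizes and the element names train, test, and validate are assumptions made for illustration.

```r
# Hypothetical helper (not part of the bigdata library): splits the rows
# of a data.frame into up to three groups by the train and test fractions.
sketch.partition <- function(data, train = 0.7, test = 0.3) {
    n <- nrow(data)
    # Assumed rounding rule: group sizes are rounded row counts,
    # capped so the groups never exceed the available rows.
    n.train <- min(n, round(n * train))
    n.test <- min(n - n.train, round(n * test))
    n.rest <- n - n.train - n.test
    idx <- sample(n)  # random permutation of the row indices
    groups <- list(
        train = idx[seq(length = n.train)],
        test = idx[n.train + seq(length = n.test)],
        validate = idx[n.train + n.test + seq(length = n.rest)])
    # Drop empty groups, so the list length depends on the fractions.
    groups <- groups[sapply(groups, length) > 0]
    lapply(groups, function(i) data[i, , drop = FALSE])
}

d <- data.frame(x = 1:100)
parts <- sketch.partition(d, train = 0.23, test = 0.75)
sapply(parts, nrow)  # train 23, test 75, validate 2
```

With train=1.0 the sketch returns a one-element list, and with train=.25 and test=.75 a two-element list, matching the behavior described for bd.partition.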