balancedSample(n, size = n, prob = NULL, full.partition = "none")
A sample of length size is generated from the values 1:n.
prob is a vector of length n, or NULL (indicating equal probabilities). The vector is normalized internally to sum to one. Index i is chosen with probability size*prob[i]; when that product exceeds one, it is the expected number of times index i appears.

full.partition is one of "first", "last", or "none". Return the initial (if "first") or final (if "last") size elements of a full sample of size n. If "none", do not generate a full sample. Valid only if size < n; ignored otherwise. See below.
The result is a vector of length size of indices drawn without replacement if possible, i.e. if size<=n and size*max(prob)<=1. In particular, the default values size=n, prob=NULL generate a simple permutation of 1:n.

Otherwise samples are drawn with as little replacement as possible: actual frequencies are rounded up or down from the goal size/n or size*prob, and the probability that result[i]=j is 1/n or prob[j].
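To make "as little replacement as possible" concrete for the equal-probability case, here is a minimal sketch in base R. The helper name balanced_equal is hypothetical and is not part of the package; it only mimics the described behavior (every index appears either floor(size/n) or ceiling(size/n) times) and is not the function's actual implementation.

```r
# Hypothetical helper, equal-probability case only (not the package's code).
balanced_equal <- function(n, size = n) {
  # A random ordering of 1:n decides which indices receive the extra copy
  # when size is not a multiple of n.
  pool <- rep(sample.int(n), length.out = size)
  # Shuffle by position so that repeated indices are not grouped together.
  pool[sample.int(size)]
}

set.seed(1)
x <- balanced_equal(4, 6)
counts <- tabulate(x, nbins = 4)  # each count is 1 or 2, and they sum to 6
```

Indexing with sample.int(size) rather than calling sample(pool) avoids the special case in which sample() treats a single number k as the range 1:k.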
Creates .Random.seed if it does not already exist. Otherwise its value is updated.
The algorithm when prob is supplied uses a random permutation, systematic sampling, and a final random permutation. First, prob is randomly permuted, together with the indices 1:n. The interval (0,1) is divided into n subintervals I[i] with length(I[i]) proportional to prob[i].
Next, size values u[j] are generated on the interval (0,1) using systematic sampling. Let u[1] be random uniform on (0,1/size) and u[j] = u[j-1] + 1/size, for j in 2:size. Thus the u[j] all lie in (0,1) and are equally spaced.
If u[j] is in interval I[i] then the jth component of a temporary result is the ith permuted index. At this point, if I[i] has length greater than k/size then there are k or more consecutive copies of the ith permuted index in the temporary result. A final random permutation of the result ensures that repeats do not always appear together.
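The three steps can be sketched in base R as follows. This is an illustration written from the description above, not the package's source: the name balanced_sketch and its argument checks are assumptions, and it covers only the case where prob is supplied (or defaults to equal weights).

```r
# Illustrative sketch only; balanced_sketch is a hypothetical name.
balanced_sketch <- function(n, size = n, prob = rep(1, n)) {
  stopifnot(size >= 1, length(prob) == n, all(prob >= 0), sum(prob) > 0)

  # Step 1: randomly permute the indices, and prob along with them.
  perm <- sample.int(n)
  upper <- cumsum(prob[perm])
  upper <- upper / upper[n]   # right endpoints of I[1], ..., I[n]; last is 1

  # Step 2: systematic sampling. u[1] is uniform on (0, 1/size) and the
  # remaining points follow at equal spacing 1/size.
  u <- runif(1, 0, 1 / size) + (seq_len(size) - 1) / size

  # Locate the interval I[i] containing each u[j]; pmin() guards against
  # floating-point rounding at the right edge.
  i <- pmin(findInterval(u, upper) + 1, n)
  temp <- perm[i]             # temporary result: repeats sit side by side

  # Step 3: final random permutation so repeats are not always adjacent.
  temp[sample.int(size)]
}

set.seed(2)
x <- balanced_sketch(4, 8, prob = 1:4)
counts <- tabulate(x, nbins = 4)  # goals are 0.8, 1.6, 2.4, 3.2
```

Because the points u[j] are equally spaced, an interval of length L receives either floor(L*size) or ceiling(L*size) of them, which is why each count is the goal size*prob[i] rounded up or down.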
Calling balancedSample with full.partition = "first", size = m and then (after resetting the seed to the same value) with full.partition = "last", size = n-m produces complementary indices which, when concatenated, give results equivalent to calling balancedSample with size = n: with prob present, the concatenated results are the same, up to a permutation, as the results with size = n; with no prob, the concatenated results are identical to the results with size = n.
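The partition idea for the case without prob can be illustrated with base R alone. In this sketch sample.int stands in for balancedSample, so it demonstrates only the principle (same seed, same full sample, complementary pieces) and is not a call into the function itself; the values n = 10 and m = 4 are arbitrary.

```r
n <- 10
m <- 4

# Full sample of size n from a fixed seed.
set.seed(42)
full <- sample.int(n)

# Stand-in for full.partition = "first", size = m:
# the initial m elements of the same full sample.
set.seed(42)
first <- sample.int(n)[seq_len(m)]

# Stand-in for full.partition = "last", size = n - m:
# the final n - m elements, after resetting the seed.
set.seed(42)
last <- sample.int(n)[(m + 1):n]

# The two pieces are complementary and reproduce the full sample.
same <- identical(c(first, last), full)
```

Resetting the seed before each call is what makes the two pieces come from the same underlying full sample; without it they would generally overlap.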
balancedSample(4)                  # random permutation
balancedSample(4, 2)               # two observations chosen without replacement
balancedSample(4, 6)               # each observation once, two observations twice
balancedSample(4, 8)               # each observation twice
balancedSample(4, 8, prob=(1:4))   # expected frequencies .8, 1.6, 2.4, 3.2

# These are equivalent (in the long run; they vary randomly)
balancedSample(5, 100)
sample(rep(1:5, length=100), 100, replace=FALSE)