bdVector of the desired quantiles of the data.
quantile(x, probs = 0:4/4, na.rm = F, ...)
quantile.default(x, probs = 0:4/4, na.rm = F,
alpha = 1, rule = 1, weights = NULL, freq = NULL)
bdVector of data. Missing values are not allowed unless na.rm=TRUE.
bdVector of desired probability levels.
Values must be between 0 and 1 inclusive.
The default produces a "five number summary":
the minimum, lower quartile, median, upper quartile, and maximum of x.
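Rule for handling values of probs that fall outside the range of
probabilities assigned to the order statistics of x.
If rule is 1, NAs are supplied for any such points.
If rule is 2, the extreme values of x are used.
If rule is 3, linear extrapolation is used.
This option is irrelevant if alpha=1.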
Vector of weights the same length as x, or NULL if no weights.
Quantiles are calculated for the weighted distribution with
probabilities proportional to weights on the values of x.
Vector the same length as x, giving frequencies. If supplied, the
results are equivalent to supplying rep(x, freq) instead of x.
The effect is similar to the weights argument, except that values are
actually repeated, so the quantiles returned may be exactly equal
to a repeated value of x rather than interpolated between
adjacent values.
bdVector of empirical quantiles corresponding to the probs levels
in the sorted x data.
The algorithm linearly interpolates between order statistics of x,
assuming that the ith order statistic is the
(i-alpha)/(n+1-2*alpha) quantile if no weights are present,
where n=length(x).
The algorithm uses partial sorting, hence is quickly able to find
a few quantiles even of large datasets.
With no weights and x sorted in increasing order, the interpolation
corresponds to:
approx((1:n - alpha) / (n + 1 - 2 * alpha),
       x, probs, rule=rule)
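The unweighted interpolation can be sketched for ordinary numeric vectors as below. This is an illustration only, not the library's implementation: quantileSketch is a hypothetical helper name, it does a full sort rather than the partial sort described above, it assumes length(x) is at least 2, and it only exercises rule 1 and rule 2.

```r
# Hypothetical helper (not part of the library): unweighted quantiles by
# linear interpolation between order statistics.
quantileSketch <- function(x, probs = 0:4/4, alpha = 1, rule = 1) {
  n <- length(x)
  # Probability assigned to the ith order statistic.
  pos <- (1:n - alpha) / (n + 1 - 2 * alpha)
  # approx returns a list; the interpolated values are in component y.
  approx(pos, sort(x), probs, rule = rule)$y
}

x <- c(7, 1, 5, 3, 9)
quantileSketch(x)                             # 1 3 5 7 9
quantileSketch(x, 0.95, alpha = 0)            # NA: 0.95 is above 5/6
quantileSketch(x, 0.95, alpha = 0, rule = 2)  # 9: extreme value used
```

With alpha=0 the largest order statistic sits at probability 5/6, so 0.95 falls outside the interpolation range and the rule argument decides the result.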
If x contains randomly-generated values from a distribution, then
alpha=1 gives quantiles which are biased (they tend to be too narrow),
alpha=1/3 gives approximately median-unbiased estimates of the
quantiles of the distribution, and alpha=0 matches the correct
probabilities for a new observation "X" from that distribution, i.e.
prob(X < quantile(x, p, alpha=0)) = p
(the relationship is exact if p=k/(n+1) for some integer k and the
distribution is continuous, and approximate otherwise).
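To make the effect of alpha concrete, the short sketch below lists the probabilities assigned to the order statistics for n=9, using the expression (1:n - alpha) / (n + 1 - 2 * alpha) from the approx call. The helper name pos is hypothetical and is used only for this illustration.

```r
# Probabilities assigned to the 9 order statistics for several alpha values.
n <- 9
pos <- function(alpha) (1:n - alpha) / (n + 1 - 2 * alpha)

range(pos(1))    # 0 and 1: every probs value is covered, so rule is irrelevant
range(pos(0))    # 0.1 and 0.9: positions are k/(n+1)
range(pos(1/3))  # strictly inside (0, 1), between the two cases above
```

For alpha less than 1 the positions do not reach 0 and 1, so extreme probs values fall outside the interpolation range and are handled according to rule.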
If weights are present, then alpha=.5 corresponds to
interpolating between the midpoints of segments of the step function
with step widths proportional to weights. For other values of alpha
the horizontal positions of those midpoints are transformed linearly;
for alpha=1 the horizontal positions of the two extreme midpoints
are at 0 and 1.
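A minimal sketch of the weighted case follows, covering only alpha=.5 and alpha=1. It is an assumption-laden illustration rather than the library's code: weightedQuantileSketch is a hypothetical name, x is assumed to be an ordinary numeric vector with no ties, weights are assumed positive, and the alpha=1 branch uses the linear map that places the first and last midpoints at 0 and 1.

```r
# Hypothetical helper (not part of the library): weighted quantiles for
# alpha = 0.5 or alpha = 1, assuming distinct x values and positive weights.
weightedQuantileSketch <- function(x, weights, probs = 0:4/4,
                                   alpha = 0.5, rule = 1) {
  if (alpha != 0.5 && alpha != 1)
    stop("this sketch handles only alpha = 0.5 and alpha = 1")
  ord <- order(x)
  w <- weights[ord] / sum(weights)
  mid <- cumsum(w) - w / 2   # midpoints of the step-function segments
  if (alpha == 1) {
    n <- length(mid)
    # Linear map sending the two extreme midpoints to 0 and 1.
    mid <- (mid - mid[1]) / (mid[n] - mid[1])
  }
  approx(mid, x[ord], probs, rule = rule)$y
}

x <- c(10, 20, 30)
w <- c(1, 2, 1)
weightedQuantileSketch(x, w, c(0.125, 0.5, 0.875))     # 10 20 30
weightedQuantileSketch(x, w, c(0, 0.5, 1), alpha = 1)  # 10 20 30
```

With equal weights the midpoints reduce to (i-.5)/n for alpha=.5 and (i-1)/(n-1) for alpha=1, which agrees with the unweighted positions.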
If weights are present and there are ties in x, then the
corresponding weights are averaged, so that results are independent
of the order of observations.
If both weights and frequencies are supplied, then x and weights
are replicated using the frequencies. This may use a lot of memory.
Hyndman, R. J. and Fan, Y. (1996), "Sample Quantiles in Statistical Packages," The American Statistician, 50, 361-365.
quantile(car.miles) # five number summary
quantile(testscores[,1], c(.33,.67)) # 33% and 67% quantiles of
# data from testscores
diff(quantile(testscores[,1], c(.25, .75))) # interquartile range
# create function iqr
iqr <- function (x) diff(quantile(x, c(.25, .75)))
iqr(car.miles) # returns 23
set.seed(2); x <- runif(9)
probs <- seq(0, 1, length=101)
plot(probs, quantile(x, probs, alpha=1), type="l", ylim=c(-.14,1))
lines(probs, quantile(x, probs, alpha=.5), col=2)
lines(probs, quantile(x, probs, alpha=0), col=3)
lines(probs, quantile(x, probs, alpha=0, rule=3), col=3, lty=3)
# weighted distributions
plot(probs, quantile(sort(x), probs, weights=1:9, alpha=.5),
type="l", ylim=0:1)
w <- 1:9 / sum(1:9)
points(cumsum(w)-w/2, sort(x))
lines(cumsum(w), sort(x), type="S", col=2)
lines(probs, quantile(sort(x), probs, weights=1:9, alpha=1), col=3)
# Frequencies
quantile(rep(x, 1:9)) # For reference
quantile(x, freq = 1:9) # This should match the previous