concomitants(x, ...)
concomitants(x, y, qfun, args.qfun = NULL, qx = NULL,
df = 3, weights = NULL)
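...: further arguments, passed on to the method that is dispatched; the arguments listed below are those of concomitants.default.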
y: a vector the same length as x, containing the empirical distribution to be adjusted.
qfun: a quantile function; qfun(runif(n)) should give random values from the known distribution for x.
args.qfun: a list of additional arguments to pass to qfun. For example, if qfun=qnorm, then this could be list(mean=2, sd=3).
qx: a vector the same length as x. Supply either qx or qfun. If supplied, qx should contain values from the known distribution for x; if omitted, it defaults to qfun(ppoints(n)), where n is the length of x.
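As a rough illustration of the default described above, the following base-R sketch evaluates a quantile function at ppoints(n) and forwards any extra arguments from args.qfun. The helper name default.qx is hypothetical and this is an assumption about the mechanics, not the package's own code.

```r
# Hypothetical helper, for illustration only: evaluate the quantile
# function at ppoints(n), passing along any extra arguments in args.qfun.
default.qx <- function(x, qfun, args.qfun = NULL) {
  n <- length(x)
  do.call(qfun, c(list(ppoints(n)), args.qfun))
}

set.seed(0)
x <- rnorm(100, mean = 2, sd = 3)
qx <- default.qx(x, qnorm, args.qfun = list(mean = 2, sd = 3))
```

Because ppoints(n) is increasing and a quantile function is nondecreasing, the resulting qx is sorted and has the same length as x.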
df: degrees of freedom for the fitted relationship between x and y. This should be at least 2; a linear relationship results if df=2, while a smoothing spline is used for larger values.
weights: NULL (indicating no weights) or a vector the same length as x, containing probabilities for a weighted distribution of x and y.
A vector like y, but adjusted based on the difference between x and qx. The result is essentially y plus (prediction for y given qx) minus (prediction for y given x); in the linear case this reduces to y + beta * (qx - x), where beta is the slope of the fitted line.
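The linear case can be sketched in base R without the concomitants function itself. This is a minimal illustration, not the package's implementation: it assumes that beta is the least-squares slope of y on x and that the sorted quantiles are paired with x by rank. The names qx.o, beta and adj.y are chosen here for illustration.

```r
# Sketch of the linear (df = 2) adjustment using only base R.
set.seed(0)
n <- 100
x <- rnorm(n)
y <- 0.95 * x + sqrt(1 - 0.95^2) * rnorm(n)

# Sorted quantiles of the known distribution of x, rearranged so that the
# i-th smallest x is paired with the i-th smallest quantile.
qx <- qnorm(ppoints(n))
qx.o <- numeric(n)
qx.o[order(x)] <- qx

# Assumed here: beta is the least-squares slope of y on x.
beta <- unname(coef(lm(y ~ x))[2])

# Adjusted values: y + beta * (qx - x), with qx in the order of x.
adj.y <- y + beta * (qx.o - x)
```

Each point is moved along the fitted line by the amount its x value differs from the matching quantile, so the residuals about the line are unchanged.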
Methods may return other objects; in particular, concomitants.bootstrap returns an object of class concomitants.bootstrap that inherits from "bootstrap".
This implementation uses smooth.spline to allow the relationship between x and y to be curvilinear.
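A curvilinear adjustment of this kind might look like the following base-R sketch, which uses smooth.spline and its predict method directly. It is an illustration under stated assumptions (quantiles paired with x by rank, df = 4), not the code used inside concomitants.

```r
# Sketch of a spline-based adjustment using base R only.
set.seed(1)
n <- 100
x <- rnorm(n)
y <- x + x^2 / 9 + 0.05 * rnorm(n)

# Quantiles of the known distribution of x, placed in the order of x.
qx.o <- numeric(n)
qx.o[order(x)] <- qnorm(ppoints(n))

# Fit a smoothing spline of y on x with 4 degrees of freedom.
fit <- smooth.spline(x, y, df = 4)

# y plus (prediction for y given qx) minus (prediction for y given x).
adj.y <- y + predict(fit, qx.o)$y - predict(fit, x)$y
```

The residuals about the fitted curve are carried over unchanged; only the fitted part is re-evaluated at the quantiles.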
The higher the correlation between x and a smooth monotone transformation of y, the more accurate the result is. With a perfect nonlinear relationship (conditional variance of y given x equal to zero) the result would be equal to qfuny(ppoints(n)), where qfuny is the quantile function for y (aside from errors due to imperfect estimation of the nonlinear relationship).
If weights are present, then we presume that x and y were obtained by importance sampling or some other mechanism that yields weighted samples. Let F be the target distribution and G the design distribution for x; i.e. mean(x <= a) = G(a) and mean(x <= a, weights=weights) = F(a). In this case, qfun should be the inverse of F, and the weighted distribution of qx should be approximately F (the unweighted qx values correspond to G). The output consists of values from the weighted distribution for y.
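The roles of F and G can be checked numerically with a small base-R sketch. The choice of G as N(1, 1) and F as N(0, 1) is an assumption made purely for illustration, and the weighted mean is written out as a sum because base R's mean has no weights argument.

```r
# Illustrative importance sample: design G = N(1, 1), target F = N(0, 1).
set.seed(1)
n <- 10000
x <- rnorm(n, mean = 1)              # draws from the design distribution G
w <- dnorm(x) / dnorm(x, mean = 1)   # importance ratio f(x) / g(x)
w <- w / sum(w)                      # normalize so the weights are probabilities

a <- 0
G.hat <- mean(x <= a)                # unweighted: estimates G(a) = pnorm(a, mean = 1)
F.hat <- sum(w * (x <= a))           # weighted: estimates F(a) = pnorm(a)
```

In this setting qfun would be qnorm, the inverse of the target distribution F.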
Do, K. and Hall, P. (1992), "Distribution Estimation using Concomitants of Order Statistics, with Application to Monte Carlo Simulation for the Bootstrap," Journal of the Royal Statistical Society Series B, 54(2), 595-607.
Efron, B. (1990), "More Efficient Bootstrap Computations," Journal of the American Statistical Association, 85, 79-89.
Hesterberg, T.C. (1995), "Tail-Specific Linear Approximations for Efficient Bootstrap Simulations," Journal of Computational and Graphical Statistics, 4, 113-133.
Hesterberg, T.C. (1997), "Fast Bootstrapping by Combining Importance Sampling and Concomitants," Computing Science and Statistics, 29(2), 72-78.
set.seed(0)
x <- rnorm(100)
y <- .95*x + sqrt(1-.95^2) * rnorm(100)
qx <- qnorm(ppoints(100))
adj.y <- concomitants(x, y, qx=qx)
qx.o <- qx
qx.o[order(x)] <- qx
plot(x, y)
arrows(x, y, x2=qx.o, y2=adj.y, size=.05, col=5)
# The arrows run from the original unadjusted points to
# the adjusted values. In the original data there are too
# many large values of x; given the relationship between
# x and y, this probably means that the values of y are
# also too large. The arrowheads are at the adjusted points.

# Show the empirical and theoretical values of x
axis(3, labels=F, at=x)
axis(3, labels=F, at=qx, tck=.02, col=5)
# Show the empirical and adjusted values of y
axis(4, labels=F, at=y)
axis(4, labels=F, at=adj.y, tck=.02, col=5)

# Normal probability plots, for empirical y and adjusted y
par(mfrow=c(2,1))
qqnorm(y); abline(0,1)
qqnorm(adj.y); abline(0,1)
par(mfrow=c(1,1))
# Note that the adjusted values of y are closer to the
# exact distribution

# Nonlinear relationship
set.seed(1)
y <- x + x^2/9 + .05*rnorm(100)
plot(x, y)
adj.y <- concomitants(x, y, qx=qx, df=4)
arrows(x, y, x2=qx.o, y2=adj.y, size=.05, col=5)
# The adjustments follow the curve of the relationship
axis(3, labels=F, at=x)
axis(3, labels=F, at=qx, tck=.02, col=5)
axis(4, labels=F, at=y)
axis(4, labels=F, at=adj.y, tck=.02, col=5)

# Nonlinear relationship, with weights
# (x2, w2, qx2 and qx2.o are not defined in this example; they are
# assumed to exist already: a weighted sample, its weights, and the
# target quantiles in sorted order and in the order of x2)
set.seed(1)
y2 <- x2 + x2^2/9 + .05*rnorm(100)
adj.y2 <- concomitants(x2, y2, weights = w2, qx=qx2, df=4)
plot(x2, y2)
arrows(x2, y2, x2=qx2.o, y2=adj.y2, size=.05, col=5)
# The adjustments follow the curve of the relationship
axis(3, labels=F, at=x2)
axis(3, labels=F, at=qx2, tck=.02, col=5)
axis(4, labels=F, at=y2)
axis(4, labels=F, at=adj.y2, tck=.02, col=5)