Details on tilting formulae

DESCRIPTION:

Here are details of the tilting formulae for exponential and maximum-likelihood tilting, particularly for multiple groups (e.g. two-sample problems or stratified sampling). These formulae are used in multiple functions, including:

USAGE:

tiltMean(tau, L, tilt="exponential", weights, group,  
              lambda = NULL, ...) 
tiltMeanSolve(q, L, tilt="exponential", weights, group, ...) 
               
tiltWeights(tau, L, tilt="exponential", weights, group, lambda, ...) 
 
saddlepointP(tau, L, ..., weights, group, ...) 
saddlepointPSolve(p, L, ..., weights, group, ...) 
 
tiltBootProbs(tau, boot.obj, tilt="exponential", ..., L, group, ...) 
limits.tilt(boot.obj, probs, L, ..., group, ...) 
 
revSaddlepointP(tau, L, tilt = "exponential", weights, group, ...) 
revSaddlepointPSolve(probs, L, tilt = "exponential", weights, group, ...) 
 

Dimensions:

We use the following dimensions below:

N = sample size (number of observations in the original data)
B = number of bootstrap samples
P = number of statistics (e.g. number of regression coefficients); often equal to 1
K = number of tilting parameters, quantiles, or probabilities; e.g. for a simple confidence interval K=2, for lower and upper tails; often equal to 1
G = number of groups (in multiple-sample problems, or strata)

Then for example "vector [N]" is short for "a vector of length N", and "matrix [N,P]" indicates an N by P matrix.

ARGUMENTS:

L
vector [N] of values or matrix [N,1] which determine a discrete distribution, or a matrix [N,P] of values with each column giving a different distribution. Only limits.tilt supports the latter, and it gives results equivalent to calling it with one column at a time.

These are typically empirical influence values; see the relevant help file for the available methods for calculating them.

tau
vector [K] of tilting parameters. Each value determines a single weighted distribution. Positive values of tau place more weight on the larger values of L; negative values place more weight on the smaller values.
tilt
one of "exponential" or "ml", for exponential or maximum likelihood tilting, respectively. Only exponential tilting is used for saddlepoint calculations.
weights
NULL, or vector [N] of weights, if the discrete distribution has unequal probabilities on the values (before tilting).
group
NULL, or vector [N] indicating stratified sampling or multiple-group problems; unique values of this vector determine groups.
lambda
matrix [G,K] (a vector is allowed if K==1) containing normalizing constants for maximum likelihood tilting with groups (see DETAILS, below). These are computed if not provided. A warning is issued if these do not result in tilting weights that sum to 1 in each group. If lambda has row names they must match the unique values of group; otherwise the order of rows should match the sorted unique values of group.
q
vector [K] of desired tilted means (weighted means, with weights determined by tilting); solve for tau.
p
vector [K] of desired probabilities (saddlepoint estimates); solve for tau.
probs
vector [K] of desired probabilities for one-sided bootstrap tilting confidence intervals; solve for tau, calculate tilted weights, then calculate the statistic for the weighted distribution.
...
other arguments to the functions; see the respective help files for these arguments.

DETAILS:

Consider first the simplest case, where the statistic of interest is scalar (P=1) for a single sample with no stratified sampling (G=1), and a single value of tau (K=1). L should be a vector or column matrix with N elements. Then exponential tilting places probability

    c * weights * exp((L-Lbar) * tau) 

on the values in L, where Lbar is the weighted mean of L and c is a normalizing constant. Maximum likelihood tilting places probabilities

    c * weights / (1 - (L-Lbar) * tau) 

For both maximum likelihood and exponential tilting, results are normalized to sum to 1.
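The two single-group formulae above can be sketched in a few lines. This is a minimal illustration in plain Python, not the library's implementation; the function name tilt_weights and the example data are hypothetical, and NaN stands in for the NA that the library returns when tau is out of range for "ml" tilting.

```python
import math

def tilt_weights(L, tau, tilt="exponential", weights=None):
    """Single-group tilting weights for a scalar tau (illustrative sketch)."""
    n = len(L)
    if weights is None:
        weights = [1.0 / n] * n          # equal prior probabilities
    # Weighted mean of L (prior weights need not be normalized here).
    lbar = sum(w * x for w, x in zip(weights, L)) / sum(weights)
    if tilt == "exponential":
        raw = [w * math.exp((x - lbar) * tau) for w, x in zip(weights, L)]
    elif tilt == "ml":
        denom = [1.0 - (x - lbar) * tau for x in L]
        if min(denom) <= 0:
            # tau is outside the range giving positive weights
            return [float("nan")] * n
        raw = [w / d for w, d in zip(weights, denom)]
    else:
        raise ValueError("tilt must be 'exponential' or 'ml'")
    c = 1.0 / sum(raw)                   # normalizing constant
    return [c * r for r in raw]

# Hypothetical data: four influence values.
L = [1.0, 2.0, 4.0, 7.0]
w_exp = tilt_weights(L, 0.3)             # exponential tilting
w_ml = tilt_weights(L, 0.1, tilt="ml")   # maximum likelihood tilting
```

With tau = 0 both forms return the prior weights; positive tau shifts weight toward the larger values of L.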

The "ml" weights are empirical maximum likelihood weights that maximize the product of probabilities subject to the weighted mean matching a specified value.

Note that for the "ml" weights to be positive, tau must lie in the open interval (1/min(L-Lbar), 1/max(L-Lbar)), whose endpoints are given by 1/range(L-Lbar). If tau is outside this range, the corresponding returned weights are set to NA.
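The admissible range for tau follows from requiring every denominator 1 - (L-Lbar) * tau to be positive. The sketch below computes the endpoints in plain Python; the name ml_tau_range is hypothetical, and the code assumes L is not constant (otherwise a deviation of zero would cause a division by zero).

```python
def ml_tau_range(L, weights=None):
    """Endpoints (lower, upper) of the open interval of tau values for which
    all maximum likelihood tilting weights are positive (single group)."""
    n = len(L)
    if weights is None:
        weights = [1.0 / n] * n
    lbar = sum(w * x for w, x in zip(weights, L)) / sum(weights)
    dev = [x - lbar for x in L]
    # min(dev) < 0 < max(dev) when L is not constant and weights are positive,
    # so the lower endpoint is negative and the upper endpoint is positive.
    return 1.0 / min(dev), 1.0 / max(dev)

L = [1.0, 2.0, 4.0, 7.0]      # hypothetical influence values, mean 3.5
lower, upper = ml_tau_range(L)
```

For this example the deviations are -2.5, -1.5, 0.5 and 3.5, so the interval is (-0.4, 1/3.5).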

Multiple Samples or Stratified Sampling:

When group is supplied (for multiple-sample or stratified sampling applications), if there is only one group then results are equivalent to the case without groups.

With multiple groups, exponential tilting weights are equal to

    c[g] * weights[gi] * exp((L[gi] - Lbar[g]) * tau * N / N[g]) 

where g indicates the gth group, N[g] is the size of group g, Lbar[g] the (weighted) mean of group g, [gi] indicates the ith observation in group g, and c[g] are normalizing constants so the tilted weights sum to 1 in each group. (Note that weights and L are still vectors.)

This is equivalent to using a tilting parameter of tau[g]=tau*N/N[g] in group g.
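As an illustration of the group-wise formula, the following plain-Python sketch applies tau[g] = tau*N/N[g] within each group and normalizes separately per group. The function name and data are hypothetical, and this is not the library's code.

```python
import math
from collections import defaultdict

def tilt_weights_exp_groups(L, tau, group, weights=None):
    """Exponential tilting weights with groups; weights sum to 1 per group."""
    n = len(L)
    if weights is None:
        weights = [1.0] * n
    members = defaultdict(list)           # group label -> observation indices
    for i, g in enumerate(group):
        members[g].append(i)
    out = [0.0] * n
    for g, idx in members.items():
        wsum = sum(weights[i] for i in idx)
        lbar_g = sum(weights[i] * L[i] for i in idx) / wsum  # weighted group mean
        tau_g = tau * n / len(idx)        # tau[g] = tau * N / N[g]
        raw = {i: weights[i] * math.exp((L[i] - lbar_g) * tau_g) for i in idx}
        c_g = 1.0 / sum(raw.values())     # normalizing constant c[g]
        for i in idx:
            out[i] = c_g * raw[i]
    return out

# Hypothetical two-sample data: group "a" has 2 observations, "b" has 3.
L = [1.0, 3.0, 2.0, 5.0, 8.0]
group = ["a", "a", "b", "b", "b"]
w = tilt_weights_exp_groups(L, 0.1, group)
```

Here N = 5, so the effective tilting parameter is 0.25 in group "a" and 0.1 * 5/3 in group "b".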

Maximum likelihood weights for the multiple group case use a different parameterization. In this case weights are equal to

    weights[gi] / (lambda[g] - (L[gi] - Lbar[g]) * tau * N / N[g]) 

Here, lambda[g] are normalizing constants chosen so that the tilted weights sum to 1 in each group, and the prior weights weights[gi] must sum to 1 in each group (in other cases prior weights need not be normalized). Newton's method is used to solve for lambda[g] given tau.
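The Newton step for lambda[g] can be sketched as follows. This is an assumption-laden illustration, not the library's algorithm: the function names, the starting value, and the stopping rule are all hypothetical choices. The starting value lies just above the largest value of (L[gi] - Lbar[g]) * tau[g], where the sum of weights is at least 1, so the iterates increase toward the root while keeping every denominator positive. Prior weights are renormalized within each group as a convenience.

```python
from collections import defaultdict

def solve_lambda(dev, w, t, tol=1e-12, max_iter=100):
    """Newton's method for lambda with sum(w / (lambda - dev * t)) == 1.
    Assumes w is positive and sums to 1."""
    shift = [d * t for d in dev]
    m = max(shift)
    lam = m + w[shift.index(m)]       # here the sum is >= 1 (left of the root)
    for _ in range(max_iter):
        terms = [wi / (lam - s) for wi, s in zip(w, shift)]
        f = sum(terms) - 1.0
        fprime = -sum(ti / (lam - s) for ti, s in zip(terms, shift))
        step = f / fprime
        lam -= step
        if abs(step) < tol:
            break
    return lam

def tilt_weights_ml_groups(L, tau, group, weights=None):
    """ML tilting weights with groups, using the lambda parameterization."""
    n = len(L)
    if weights is None:
        weights = [1.0] * n
    members = defaultdict(list)
    for i, g in enumerate(group):
        members[g].append(i)
    out, lambdas = [0.0] * n, {}
    for g, idx in members.items():
        wsum = sum(weights[i] for i in idx)
        w_g = [weights[i] / wsum for i in idx]      # prior weights sum to 1
        lbar_g = sum(wi * L[i] for wi, i in zip(w_g, idx))
        dev = [L[i] - lbar_g for i in idx]
        tau_g = tau * n / len(idx)                  # tau * N / N[g]
        lam = solve_lambda(dev, w_g, tau_g)
        lambdas[g] = lam
        for wi, d, i in zip(w_g, dev, idx):
            out[i] = wi / (lam - d * tau_g)
    return out, lambdas

L = [1.0, 3.0, 2.0, 5.0, 8.0]          # hypothetical data
group = ["a", "a", "b", "b", "b"]
w, lambdas = tilt_weights_ml_groups(L, 0.05, group)
```

With tau = 0 each lambda[g] equals 1 and the tilted weights equal the prior weights.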

The parameterization involving lambda could be used in the single-group case, but is less convenient to work with, as it typically requires numerically solving two equations in two unknowns. The other parameterization requires solving for only a single unknown, tau (there is also a normalizing constant, but it does not require a numerical search). In the multiple group case, the appropriate optimization problem for ML tilting does not yield a closed-form relationship between the tau[g] values for different groups when the usual parameterization is used. Using the lambda parameterization, the relationship is that tau[g]=t/N[g], for some constant t. The same relationship holds for exponential tilting. We then use the form tau[g] = tau / (N[g]/N) for consistency with the case of only a single group.

Vector tau:

Except for tiltWeights, most of the functions listed above are vectorized; tau or the other arguments (q, p, probs) may be vectors [K], with each value determining a different tilted mean, confidence level, etc. The above formulae are applied to one of the K values at a time.

Multivariate Case:

The code does not currently handle the true multivariate case, where the statistic is vector-valued, L is an [N,P] matrix, tau a vector [P], and tilting should be done based on a linear combination of the variables in L.

Most calculations one would perform in this case can be done using matrix multiplication to reduce to the univariate case; e.g. tiltWeights(tau=1, L=myL %*% mytau) computes weights based on the linear combination of influence values myL and tilting vector mytau.
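The reduction to the univariate case can be checked numerically. The plain-Python sketch below forms the linear combination of the columns of a hypothetical influence matrix myL with a hypothetical tilting vector mytau, then applies single-group exponential tilting with tau = 1; the helper names are illustrative only.

```python
import math

def linear_combination(rows, coef):
    """Matrix-vector product: one combined value per observation (row)."""
    return [sum(x * c for x, c in zip(row, coef)) for row in rows]

def exp_tilt(values, tau):
    """Single-group exponential tilting weights with equal prior weights."""
    mean = sum(values) / len(values)
    raw = [math.exp((v - mean) * tau) for v in values]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical [N,P] matrix of influence values (N=4, P=2) and tilting vector [P].
myL = [[1.0, 0.5], [2.0, -1.0], [4.0, 0.0], [7.0, 2.0]]
mytau = [0.2, -0.1]

combined = linear_combination(myL, mytau)   # plays the role of myL %*% mytau
w = exp_tilt(combined, 1.0)                 # univariate tilting with tau = 1
```

The resulting weights are proportional to exp of the sum over columns of (L[i,p] - Lbar[p]) * mytau[p], which is the multivariate exponential tilting form.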

Relationships:

For scalar tau:

tiltMean(tau, L, ...) = colMeans(L, weights = tiltWeights(tau, L, ...)) 

For tiltMeanSolve, tilting parameters (and potentially lambda) are found so that the above equals q.
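The relationship between the tilted mean and the solve step can be illustrated for single-group exponential tilting. The sketch below is plain Python with hypothetical names; it uses bisection, which is an assumption made for simplicity and not necessarily the root-finder used by tiltMeanSolve. Bisection works here because the exponentially tilted mean increases with tau.

```python
import math

def exp_tilt_weights(L, tau):
    """Single-group exponential tilting weights with equal prior weights."""
    lbar = sum(L) / len(L)
    raw = [math.exp((x - lbar) * tau) for x in L]
    total = sum(raw)
    return [r / total for r in raw]

def tilt_mean(L, tau):
    """Weighted mean of L under the tilted weights."""
    return sum(w * x for w, x in zip(exp_tilt_weights(L, tau), L))

def tilt_mean_solve(L, q, n_iter=100):
    """Find tau such that tilt_mean(L, tau) == q, by bisection."""
    if not (min(L) < q < max(L)):
        raise ValueError("q must lie strictly between min(L) and max(L)")
    lo, hi = -1.0, 1.0
    while tilt_mean(L, lo) > q:       # expand the bracket downward
        lo *= 2.0
    while tilt_mean(L, hi) < q:       # expand the bracket upward
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if tilt_mean(L, mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

L = [1.0, 2.0, 4.0, 7.0]              # hypothetical influence values
tau_hat = tilt_mean_solve(L, 5.0)     # tilt so that the weighted mean is 5
```

Since the untilted mean is 3.5, a target of 5 requires a positive tau.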

REFERENCES:

Hesterberg, T.C. (2003), "Tilting Calculations for Resampling Inferences and Approximations", Research Report No. 103.
