Compute Nonparametric Survival Estimates

DESCRIPTION:

Computes Turnbull's generalization of the Kaplan-Meier survival estimate, fitted with an EM algorithm, for right-, left-, or interval-censored data.

USAGE:

kaplanMeier(formula, data=sys.parent(), weights, subset, 
     na.action, se.fit=T, conf.interval="log", coverage=0.95, 
     control=kaplanMeier.control) 

REQUIRED ARGUMENTS:

formula
formula giving the response on the left side of a tilde "~", and if desired, one or more stratification variables on the right side.
data
data frame in which to interpret the variables named in formula, subset and weights. If this argument is missing, the variables should be in the search list.

OPTIONAL ARGUMENTS:

weights
vector of case weights. If supplied, the fit maximizes the weighted log-likelihood, in which each observation's log-likelihood contribution is multiplied by its weight. In this way, the weights behave like frequencies and the degrees of freedom are computed accordingly. The length of weights must equal the length of the response. The weights must be nonnegative, and it is strongly recommended that they be strictly positive, since zero weights are ambiguous. To exclude observations, use the subset argument rather than assigning them zero weights. By default, no weights are included in the model.
subset
expression indicating the subset of rows in data that should be used in the fit. By default, all rows are included.
na.action
missing-data filter function, applied to the model frame after the subset argument has been used. The default filter is options()$na.action.
se.fit
logical value indicating whether standard errors should be returned for right- or left-censored data. By default, se.fit=TRUE. Note that standard errors cannot be calculated for Turnbull's generalization of the Kaplan-Meier estimate with interval-censored data; for more details, see the references listed below.
conf.interval
character string specifying the confidence interval type. Possible values are: 1) "none" for no confidence intervals; 2) "identity" for standard intervals curve +- k*se(curve), where k is determined by coverage; 3) "log" for intervals based on the cumulative hazard or log(survival); and 4) "log-log" for intervals based on the log hazard or log(-log(survival)). Only the "log-log" type is guaranteed to stay within the range 0 to 1. By default, conf.interval="log". The string may be abbreviated to any prefix that identifies it uniquely.
coverage
numeric value specifying the level of a two-sided confidence interval for the survival curve(s). The default value is 0.95.
control
list of three parameters (tolerance, maxit, and maxmsd), as returned by kaplanMeier.control. See kaplanMeier.control for more details.
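
As a rough illustration of how the three conf.interval types differ, the sketch below computes each interval for a single survival estimate and its standard error, using textbook delta-method formulas. The helper km.interval is hypothetical and is not part of kaplanMeier; the formulas kaplanMeier uses internally may differ in detail.

```r
# Hypothetical helper (not part of kaplanMeier): delta-method confidence
# bounds for one survival estimate `surv` with standard error `se`.
km.interval <- function(surv, se, type = "log", coverage = 0.95) {
    # k is the two-sided normal quantile implied by coverage
    k <- qnorm(1 - (1 - coverage) / 2)
    if (type == "identity") {
        # curve +- k * se(curve)
        bounds <- c(surv - k * se, surv + k * se)
    } else if (type == "log") {
        # interval built on log(survival); se of log(S) is se / S
        bounds <- exp(log(surv) + c(-1, 1) * k * se / surv)
    } else if (type == "log-log") {
        # interval built on log(-log(survival));
        # se of log(-log(S)) is se / (S * |log(S)|)
        se.loglog <- se / (surv * abs(log(surv)))
        # a larger log(-log(S)) means a smaller S, so the signs are swapped
        bounds <- exp(-exp(log(-log(surv)) + c(1, -1) * k * se.loglog))
    } else {
        stop("type must be \"identity\", \"log\" or \"log-log\"")
    }
    names(bounds) <- c("lower", "upper")
    bounds
}

# Made-up values: a survival estimate close to 1 with a fairly large se
km.interval(0.9, 0.08, "identity")   # upper bound exceeds 1
km.interval(0.9, 0.08, "log")
km.interval(0.9, 0.08, "log-log")    # both bounds lie strictly between 0 and 1
```

With an estimate this close to 1, the "identity" bound runs past 1, while the "log-log" bound cannot, because it is mapped back through exp(-exp(.)).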

VALUE:

a list containing components "fits" and "call". The fits component is a matrix containing the Kaplan-Meier estimate. In the matrix, time1 is the lower endpoint of a survival interval, time2 is the upper endpoint, and survival is the estimated probability of survival to time2. If present, std.err is the estimated standard error of the survival probability, and lower and upper are the lower and upper confidence bounds for the probability of survival to time2.

REFERENCES:

Turnbull, B. (1974). Nonparametric estimation of a survivorship function with doubly censored data. Journal of the American Statistical Association 69: 169-173.

Turnbull, B. (1976). The empirical distribution function with arbitrarily grouped, censored, and truncated data. Journal of the Royal Statistical Society (Series B) 38: 290-295.

EXAMPLES:

kaplanMeier(censor(days, event) ~ voltage, data = capacitor2,
        weights = weights)
kaplanMeier(censor(time, status) ~ group, data = leukemia)
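
The description of the weights argument says that case weights behave like frequencies. For right-censored data, where Turnbull's estimate reduces to the ordinary Kaplan-Meier product-limit estimate, this can be checked with a small hand-written sketch. The function weighted.km below is a hypothetical helper written only for this illustration, the data are made up, and the code is not how kaplanMeier itself is implemented.

```r
# Hypothetical helper: weighted product-limit estimate for right-censored data.
# time: observed times; status: 1 = event, 0 = censored; w: case weights.
weighted.km <- function(time, status, w = rep(1, length(time))) {
    event.times <- sort(unique(time[status == 1]))
    surv <- numeric(length(event.times))
    s <- 1
    for (i in seq(along = event.times)) {
        tt <- event.times[i]
        at.risk <- sum(w[time >= tt])                # weighted number at risk
        deaths <- sum(w[time == tt & status == 1])   # weighted number of events
        s <- s * (1 - deaths / at.risk)
        surv[i] <- s
    }
    data.frame(time = event.times, survival = surv)
}

# Made-up data: five observations, two of them censored
time <- c(2, 3, 3, 5, 8)
status <- c(1, 1, 0, 1, 0)

# Weight 2 on the second observation ...
a <- weighted.km(time, status, w = c(1, 2, 1, 1, 1))
# ... gives the same curve as entering that observation twice
b <- weighted.km(c(time, 3), c(status, 1))
all.equal(a$survival, b$survival)
```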