psm is a modification of Therneau's survreg function for fitting the
accelerated failure time family of parametric survival models. psm uses
the Design class for automatic anova, fastbw, calibrate, validate, and
other functions. Hazard.psm, Survival.psm, Quantile.psm, and Mean.psm
create S functions that evaluate the hazard, survival, quantile, and mean
(expected value) functions analytically, as functions of time or
probabilities and the linear predictor values.
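For the Weibull case, the closed forms behind such functions can be sketched in base R. This is only an illustration, assuming the survreg parameterization log(T) = lp + scale*W with W following the standard minimum extreme value distribution. The names weibull_survival, weibull_hazard, weibull_quantile, and weibull_mean are hypothetical and are not the functions that psm generates; in particular, this sketch takes q to be the probability of failing by the returned time, which may differ from the convention used by Quantile.psm.

```r
# Hypothetical helpers (not part of psm or Design).
# Assumed model: log(T) = lp + scale * W, W ~ standard minimum extreme value.

# Survival function S(t | lp)
weibull_survival <- function(times, lp, scale = 1)
  exp(-exp((log(times) - lp) / scale))

# Hazard function h(t | lp) = -d/dt log S(t | lp)
weibull_hazard <- function(times, lp, scale = 1)
  exp((log(times) - lp) / scale) / (scale * times)

# Time by which a fraction q has failed; q = 0.5 gives the median
weibull_quantile <- function(q = 0.5, lp, scale = 1)
  exp(lp + scale * log(-log(1 - q)))

# Expected survival time
weibull_mean <- function(lp, scale = 1)
  exp(lp) * gamma(1 + scale)

lp  <- c(-1, 0, 1)
med <- weibull_quantile(0.5, lp, scale = 0.8)
weibull_survival(med, lp, scale = 0.8)  # survival evaluated at each median time
```

Evaluating the survival function at the median times is a quick consistency check between the quantile and survival forms: each value should be 0.5 up to rounding.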
The residuals.psm function exists mainly to compute normalized
(standardized) residuals and to censor them (i.e., return them as Surv
objects) just as the original failure time variable was censored. These
residuals are useful for checking the underlying distributional
assumption (see the examples). To get these residuals, the fit must have
specified y=TRUE. A lines method for these residuals automatically draws
a curve with the assumed standardized survival distribution. A survplot
method runs the standardized censored residuals through survfit to get
Kaplan-Meier estimates, with optional stratification (automatically
grouping a continuous variable into quantiles) and then through
survplot.survfit to plot them. Then lines is invoked to show the
theoretical curve. Other types of residuals are computed by residuals
using residuals.survreg.
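The idea of censored normalized residuals can be sketched with the survival package alone. This is a rough illustration, not the residuals.psm implementation: it fits a log-normal model with survreg rather than psm, uses simulated data, and all variable names are made up for the example. The residual is taken as (log(time) - linear predictor)/scale, and it keeps the original censoring indicator.

```r
library(survival)

set.seed(2)
n      <- 200
x      <- rnorm(n)
t_true <- exp(1 + 0.5 * x + 0.7 * rnorm(n))  # simulated log-normal failure times
cens   <- runif(n, 0, 15)                    # simulated censoring times
event  <- as.numeric(t_true <= cens)
time   <- pmin(t_true, cens)

fit <- survreg(Surv(time, event) ~ x, dist = "lognormal")

# Normalized residuals, censored exactly as the original times were
r <- Surv((log(time) - fit$linear.predictors) / fit$scale, event)

# Kaplan-Meier estimate of the residual survival function
km <- survfit(r ~ 1)
plot(km, conf.int = FALSE, xlab = "Normalized residual")

# Overlay the survival function assumed by the model (standard normal here)
z <- seq(-3, 3, length.out = 100)
lines(z, 1 - pnorm(z), lwd = 3)
```

If the distributional assumption is reasonable, the Kaplan-Meier step function should track the smooth theoretical curve.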
Older versions of survreg used by psm (e.g., on S-Plus 2000) had the
following additional arguments: method, link, parms, fixed. See the
survreg documentation on such systems for details. psm passes those
arguments to survreg.
psm(formula=formula(data),
    data=if (.R.) parent.frame() else sys.parent(), weights,
    subset, na.action=na.delete, dist="weibull",
    init=NULL, scale=0,
    control=if(!.R.) survReg.control() else survreg.control(),
    parms=NULL,
    model=FALSE, x=FALSE, y=TRUE, time.inc, ...)
# dist=c("extreme", "logistic", "gaussian", "exponential",
#        "rayleigh", "t") for S-Plus before 5.0
# dist=c("extreme", "logistic", "gaussian", "weibull",
#        "exponential", "rayleigh", "lognormal",
#        "loglogistic", "t") for R, S-Plus 5,6
# Older versions had arguments method, link, parms, fixed
## S3 method for class 'psm':
print(x, correlation=FALSE, ...)
Hazard(object, ...)
## S3 method for class 'psm':
Hazard(object, ...) # for psm fit
# E.g. lambda <- Hazard(fit)
Survival(object, ...)
## S3 method for class 'psm':
Survival(object, ...) # for psm
# E.g. survival <- Survival(fit)
## S3 method for class 'psm':
Quantile(object, ...) # for psm
# E.g. quantsurv <- Quantile(fit)
## S3 method for class 'psm':
Mean(object, ...) # for psm
# E.g. meant <- Mean(fit)
# lambda(times, lp) # get hazard function at t=times, xbeta=lp
# survival(times, lp) # survival function at t=times, lp
# quantsurv(q, lp) # quantiles of survival time
# meant(lp) # mean survival time
## S3 method for class 'psm':
residuals(object, type="censored.normalized", ...)
## S3 method for class 'residuals.psm.censored.normalized':
survplot(fit, x, g=4, col, main, ...)
## S3 method for class 'residuals.psm.censored.normalized':
lines(x, n=100, lty=1, xlim,
lwd=3, ...)
# for type="censored.normalized"
formula: a model formula whose left-hand side is a Surv object.

object: a fit created by psm. For survplot with residuals from psm,
object is the result of residuals.psm.

fit: a fit created by psm.

data, weights, subset, na.action, dist, init, scale, control: see survreg
(survReg for S-Plus 5 or 6). fixed is used for S-Plus before 5; parms is
used for S-Plus 5, 6, and R. See cph for na.action.

model: set to TRUE to include the model frame in the returned object.

x: set to TRUE to include the design matrix in the object produced by
psm. For the survplot method, x is an optional stratification variable
(character, numeric, or categorical). For
lines.residuals.psm.censored.normalized, x is the result of
residuals.psm. For print it is the result of psm.

y: set to TRUE to include the Surv() matrix.

time.inc: default time spacing, used for the time axis in survplot and
also for confidence bars. Default is 30 if the time variable has
units="Day", 1 otherwise, unless maximum follow-up time < 1, in which
case max time/10 is used as time.inc. If time.inc is not given and
max time/default time.inc is > 25, time.inc is increased.

correlation: set to TRUE to print the correlation matrix for parameter
estimates.

...: arguments passed to survplot from
survplot.residuals.psm.censored.normalized. Ignored for lines.

times, lp: times and linear predictor values at which to evaluate the
hazard or survival function. If both times and lp are vectors, they must
be of the same length.

q: probabilities for which quantiles of survival time are computed. If q
and lp are both vectors, a matrix of quantiles is returned, with rows
corresponding to lp and columns to q.

type: type of residual; the default is "censored.normalized". See survreg
for other types (survReg for S-Plus 6).

n: number of points at which lines.residuals.psm.censored.normalized
evaluates the theoretical curve.

lty: line type for lines, default is 1.

col: colors for the survplot method, corresponding to levels of x (must
be a scalar if there is no x).

main: main title for survplot. If omitted, it is the name or label of x
if x is given. Use main="" to suppress a title when you specify x.
The object survreg.distributions contains definitions of properties of
the various survival distributions. psm does not trap singularity errors
due to the way survreg.fit does matrix inversion. It will trap
non-convergence (thus returning fit$fail=TRUE) if you give the argument
failure=2 inside the control list which is passed to survreg.fit. For
example, use
f <- psm(S ~ x, control=list(failure=2, maxiter=20))
to allow up to 20 iterations and to set f$fail=TRUE in case of
non-convergence. This is especially useful in simulation work.
psm returns a fit object with all the information survreg would store as
well as what Design stores and units and time.inc. Hazard, Survival, and
Quantile return S-functions. residuals.psm with
type="censored.normalized" returns a Surv object which has a special
attribute "theoretical" which is used by the lines routine. This is the
assumed standardized survival function as a function of time or
transformed time.
Frank Harrell
Department of Biostatistics
Vanderbilt University
f.harrell@vanderbilt.edu
n <- 400
set.seed(1)
age <- rnorm(n, 50, 12)
sex <- factor(sample(c('Female','Male'),n,TRUE))
dd <- datadist(age,sex)
options(datadist='dd')
# Population hazard function:
h <- .02*exp(.06*(age-50)+.8*(sex=='Female'))
d.time <- -log(runif(n))/h
cens <- 15*runif(n)
death <- ifelse(d.time <= cens,1,0)
d.time <- pmin(d.time, cens)
f <- psm(Surv(d.time,death) ~ sex*pol(age,2),
         dist=if(.R.)'lognormal' else 'gaussian')
# Log-normal model is a bad fit for proportional hazards data
anova(f)
fastbw(f) # if this deletes sex while keeping age*sex, ignore the result
f <- update(f, x=TRUE,y=TRUE) # so can validate, compute certain resids
validate(f, dxy=TRUE, B=10) # ordinarily use B=150 or more
plot(f, age=NA, sex=NA) # needs datadist to supply ranges for age and sex
survplot(f, age=c(20,60)) # needs datadist since sex is not set here
# latex(f)
S <- Survival(f)
plot(f$linear.predictors, S(6, f$linear.predictors),
     xlab=if(.R.)expression(X*hat(beta)) else 'X*Beta',
     ylab=if(.R.)expression(S(6,X*hat(beta))) else 'S(6|X*Beta)')
# plots 6-month survival as a function of linear predictor (X*Beta hat)
times <- seq(0,24,by=.25)
plot(times, S(times,0), type='l') # plots survival curve at X*Beta hat=0
lam <- Hazard(f)
plot(times, lam(times,0), type='l') # similarly for hazard function
med <- Quantile(f) # new function defaults to computing median only
lp <- seq(-3, 5, by=.1)
plot(lp, med(lp=lp), ylab="Median Survival Time")
med(c(.25,.5), f$linear.predictors)
# prints matrix with 2 columns
# fit a model with no predictors
f <- psm(Surv(d.time,death) ~ 1, dist=if(.R.)"weibull" else "extreme")
f
pphsm(f) # print proportional hazards form
g <- survest(f)
plot(g$time, g$surv, xlab='Time', type='l',
     ylab=if(.R.)expression(S(t)) else 'S(t)')
f <- psm(Surv(d.time,death) ~ age,
         dist=if(.R.)"loglogistic" else "logistic", y=TRUE)
r <- resid(f, 'cens') # note abbreviation
survplot(survfit(r), conf='none')
# plot Kaplan-Meier estimate of
# survival function of standardized residuals
survplot(survfit(r ~ cut2(age, g=2)), conf='none')
# both strata should be n(0,1)
lines(r) # add theoretical survival function
#More simply:
survplot(r, age, g=2)
options(datadist=NULL)