This is the validate function specific to models fitted with cph or psm. Statistics validated include the Nagelkerke R^2, D_{xy}, slope shrinkage, the discrimination index D [(model L.R. chi-square - 1)/L], the unreliability index U = (difference in -2 log likelihood between uncalibrated X beta and X beta with overall slope calibrated to test sample) / L, and the overall quality index Q = D - U. L is -2 log likelihood with beta=0. The "corrected" slope can be thought of as a shrinkage factor that takes overfitting into account.
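For reference, the three likelihood-based indexes described above can be written compactly as follows. This is only a restatement of the verbal definitions. The symbols \ell (log likelihood), \chi^2_{LR} (the model likelihood ratio chi-square) and \gamma (the overall slope calibrated to the test sample) are notation assumed here for illustration and do not appear in the original text.

```latex
\begin{align*}
L &= -2\,\ell(\beta = 0) \\
D &= \frac{\chi^2_{LR} - 1}{L} \\
U &= \frac{\bigl[-2\,\ell(X\hat\beta)\bigr] - \bigl[-2\,\ell(\gamma X\hat\beta)\bigr]}{L} \\
Q &= D - U
\end{align*}
```

Read this way, D rewards discrimination, U penalizes a linear predictor whose slope needs recalibration, and Q combines the two.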
See predab.resample for the list of resampling methods.
# fit <- cph(formula=Surv(ftime,event) ~ terms, x=TRUE, y=TRUE, ...)

## S3 method for class 'cph':
validate(fit, method="boot", B=40, bw=FALSE, rule="aic", type="residual",
         sls=.05, aics=0, pr=FALSE, dxy=FALSE, u, tol=1e-9, ...)

## S3 method for class 'psm':
validate(fit, method="boot", B=40, bw=FALSE, rule="aic", type="residual",
         sls=.05, aics=0, pr=FALSE, dxy=FALSE, tol=1e-12,
         rel.tolerance=1e-5, maxiter=15, ...)
fit: a fit from cph. The options x=TRUE and y=TRUE must have been specified. If the model contains any stratification factors and dxy=TRUE, the options surv=TRUE and time.inc=u must also have been given, where u is the same value of u given to validate.
dxy: set to TRUE to validate Somers' D_{xy} using rcorr.cens, which takes longer.
u: required if the model contains any stratification factors and dxy=TRUE. In that case, strata are not included in X beta and the survival curves may cross. Predictions at time t=u are correlated with observed survival times. Does not apply to validate.psm.
Frank Harrell
Department of Biostatistics, Vanderbilt University
f.harrell@vanderbilt.edu
library(rms)

n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
S <- Surv(dt,e)

f <- cph(S ~ age*sex, x=TRUE, y=TRUE)
# Validate full model fit
validate(f, B=10)               # normally B=150

# Validate a model with stratification.  Dxy is the only
# discrimination measure for such models, but Dxy requires
# one to choose a single time at which to predict S(t|X)
f <- cph(S ~ rcs(age)*strat(sex),
         x=TRUE, y=TRUE, surv=TRUE, time.inc=2)
validate(f, dxy=TRUE, u=2, B=10)   # normally B=150
# Note u=time.inc