Sample Data Sets For Survival Analysis
SUMMARY:
These data sets are included to illustrate the survival analysis
methods in S-PLUS.
ARGUMENTS:
- bladder
-
Study on time to recurrence of bladder cancer from Wei, Lin and
Weissfeld (1989).
The data frame has multiple rows per patient, with columns:
- id
-
patient ID
- rx
-
treatment group (1 = placebo, 2 = thiotepa)
- number
-
the number of initial tumors
- size
-
size of the largest initial tumor
- start
-
entry into the study or the time of last recurrence
- stop
-
time to event (months)
- event
-
indicator of cancer recurrence (1) or censoring (0)
- enum
-
number of recurrences of bladder cancer
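The multiple-rows-per-patient layout can be sketched as follows. This is a minimal illustration, not the code used to build the data frame: the helper name, its inputs, and the example times are hypothetical, and enum is assumed here to number a patient's intervals in order.

```python
def counting_process_rows(patient_id, recurrence_times, followup_end):
    """Build start/stop rows for one patient (hypothetical helper).

    recurrence_times: increasing recurrence times in months.
    followup_end: time of last follow-up in months.
    """
    rows = []
    interval_start = 0  # entry into the study
    for enum, stop in enumerate(recurrence_times, start=1):
        rows.append({"id": patient_id, "start": interval_start,
                     "stop": stop, "event": 1, "enum": enum})
        interval_start = stop  # the next interval begins at this recurrence
    # Add a final censored interval if follow-up extends past the last recurrence.
    if followup_end > interval_start:
        rows.append({"id": patient_id, "start": interval_start,
                     "stop": followup_end, "event": 0,
                     "enum": len(recurrence_times) + 1})
    return rows


# Made-up patient: recurrences at 3 and 9 months, followed for 14 months.
example_rows = counting_process_rows(1, [3, 9], 14)
for row in example_rows:
    print(row)
```

Each row covers one interval from start to stop, so a patient with two recurrences and further follow-up contributes three rows.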
- capacitor
-
A simulated accelerated life test of capacitors, from Meeker and Duke (1982).
A data frame with columns:
- days
-
time to failure
- event
-
indicator of failure (1) or censoring (0)
- voltage
-
voltage at which the test was run
- heart
-
The Stanford heart transplant data from Kalbfleisch and Prentice (1980).
A data frame with each patient represented by two rows.
The first entry for a patient has:
- start
-
= 0
- transplant
-
= 0
- stop
-
time to transplant in days
The second entry for a patient has:
- start
-
time to transplant
- transplant
-
= 1
- stop
-
time to death or censoring
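The two-row layout described above can be sketched as follows. This is an illustration under stated assumptions: the function name and the example values are hypothetical, and the first row is assumed to carry event = 0 because the patient is still alive at the time of transplant.

```python
def split_at_transplant(patient_id, transplant_day, end_day, died):
    """Return the two rows for one transplanted patient (hypothetical helper).

    transplant_day: time to transplant in days.
    end_day: time to death or censoring in days.
    died: True if follow-up ended in death, False if censored.
    """
    if not 0 < transplant_day < end_day:
        raise ValueError("expected 0 < transplant_day < end_day")
    before = {"id": patient_id, "start": 0, "stop": transplant_day,
              "transplant": 0, "event": 0}  # no death before transplant
    after = {"id": patient_id, "start": transplant_day, "stop": end_day,
             "transplant": 1, "event": int(died)}
    return [before, after]


# Made-up patient: transplant on day 30, death on day 120.
heart_rows = split_at_transplant(7, 30, 120, True)
for row in heart_rows:
    print(row)
```

Splitting follow-up at the transplant time lets transplant act as a time-dependent covariate: the patient counts as untransplanted up to day 30 and as transplanted afterwards.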
The other columns are:
- event
-
indicator of death (1) or censoring (0)
- age
-
(age at acceptance in days/365.25) - 48
- year
-
(date of acceptance in days since October 1, 1967)/365.25
- surgery
-
prior surgery (1 = yes, 0 = no)
- id
-
patient ID
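The age and year transformations above are simple rescalings. A minimal sketch, with hypothetical function names and made-up inputs:

```python
from datetime import date

STUDY_ORIGIN = date(1967, 10, 1)  # October 1, 1967, as stated for year
DAYS_PER_YEAR = 365.25


def heart_age(age_in_days):
    """Age at acceptance in years, centered at 48."""
    return age_in_days / DAYS_PER_YEAR - 48


def heart_year(acceptance_date):
    """Years from the study origin to the date of acceptance."""
    return (acceptance_date - STUDY_ORIGIN).days / DAYS_PER_YEAR


# Made-up inputs: a patient aged 50 years (in days), accepted March 15, 1970.
centered_age = heart_age(50 * DAYS_PER_YEAR)
years_since_origin = heart_year(date(1970, 3, 15))
print(centered_age, years_since_origin)
```

Under this coding a negative age means the patient was younger than 48 at acceptance.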
- leukemia
-
Data from Embury et al. (1977) on a trial to evaluate the efficacy of
maintenance chemotherapy for acute myelogenous leukemia.
A data frame with columns:
- time
-
time to remission after chemotherapy (weeks)
- status
-
indicator of remission (1) or censoring (0)
- group
-
treatment group,
"maintained"
or
"nonmaintained"
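A two-group data set with a time and a status column is commonly summarized with one Kaplan-Meier curve per group. The sketch below is a plain product-limit estimator, not the S-PLUS implementation; the function name and the example values are hypothetical and are not taken from the leukemia data.

```python
def kaplan_meier(times, status):
    """Product-limit estimate as a list of (time, survival) pairs.

    times: observed times; status: 1 = event, 0 = censored.
    Only times at which at least one event occurs appear in the result.
    """
    data = sorted(zip(times, status))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Collect every observation tied at time t.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases leave the risk set
    return curve


# Made-up observations for one group (weeks).
example_curve = kaplan_meier([5, 8, 8, 12, 16], [1, 1, 0, 1, 0])
for t, s in example_curve:
    print(t, round(s, 3))
```

Calling the function once for the rows with group "maintained" and once for "nonmaintained" gives the two curves to compare.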
- lung
-
Lung cancer data from the Mayo Clinic (Loprinzi et al. 1994).
A data frame with columns:
- inst
-
code for the institution at which the patient was hospitalized
- time
-
survival time
- status
-
indicator of death (2) or censoring (1)
- age
-
patient's age
- sex
-
1 = male, 2 = female
- ph.ecog
-
physician's estimate of the ECOG performance score (0-4)
- ph.karno
-
physician's estimate of the Karnofsky score, an alternative to
the ECOG performance score
- pat.karno
-
patient's estimate of his/her Karnofsky score
- meal.cal
-
calories consumed at meals excluding beverages and snacks
- wt.loss
-
weight loss in the last six months
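Note that status in lung is coded 2 for death and 1 for censoring, whereas the event indicators in the other data sets use 1 and 0. If a 0/1 indicator is needed, the recoding can be sketched as follows; the function name and the example values are hypothetical.

```python
def recode_lung_status(status):
    """Map the lung coding (1 = censored, 2 = death) to 0/1."""
    mapping = {1: 0, 2: 1}
    # A KeyError here signals a value outside the documented coding.
    return [mapping[s] for s in status]


# Made-up status values.
recoded = recode_lung_status([2, 1, 2, 2, 1])
print(recoded)  # [1, 0, 1, 1, 0]
```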
- ovarian
-
Data from Edmunson et al. (1979) on ovarian cancer.
A data frame with columns:
- futime
-
number of days in study
- fustat
-
indicator of death (1) or censoring (0)
- age
-
patient age in days/365.25
- residual.dz
-
an indicator of the extent of the residual disease
- rx
-
treatment given
- ecog.ps
-
a measure of performance score or functional status using the
Eastern Cooperative Oncology Group's scale
The survival analysis chapter in the S-PLUS documentation describes
these data sets further and illustrates survival analysis methods with
them.
REFERENCES:
Edmunson, J. H., Fleming, T. R., Decker, D. G., Malkasian, G. D.,
Jefferies, J. A., Webb, M. J., and Kvols, L. K. (1979).
Different chemotherapeutic sensitivities and host factors affecting
prognosis in advanced ovarian carcinoma vs. minimal residual disease.
Cancer Treatment Reports
63, 241-247.
Embury, S. H., Elias, L., Heller, P. H., Hood, C. E., Greenberg, P. L., and
Schrier, S. L. (1977).
Remission maintenance therapy in acute myelogenous leukemia.
Western Journal of Medicine
126, 267-272.
Kalbfleisch, J. D. and Prentice, R. L. (1980).
The Statistical Analysis of Failure Time Data.
New York: Wiley.
Loprinzi, C. L., Laurie, J. A., Wieand, H. S., Krook, J. E., Novotny, P. J.,
Kugler, J. W., Bartel, J., Law, M., Bateman, M., Klatt, N. E., Dose, A. M.,
Etzell, P. S., Nelimark, R. A., Mailliard, J. A., and Moertel, C. G.
(1994).
Prospective evaluation of prognostic variables from
patient-completed questionnaires.
Journal of Clinical Oncology
12, 601-607.
Meeker, W. Q., Jr. and Duke, S. D. (1982).
User's Manual for CENSOR - A User-Oriented Computer Program for Life Data
Analysis.
Statistical Laboratory, Iowa State University, Ames, IA 50011.
Wei, L. J., Lin, D. Y. and Weissfeld, L. (1989).
Regression analysis of multivariate incomplete failure time data by
modeling marginal distributions.
Journal of the American Statistical Association
84, 1065-1073.