x and a binary (0-1) variable y, and the corresponding receiver operating characteristic curve area c. Note that Dxy = 2(c-0.5). somers2 allows for a weights variable, which specifies frequencies to associate with each observation.
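The relation between c and Dxy can be illustrated with a minimal base R sketch. It assumes no missing values and treats c as the proportion of (y=1, y=0) pairs in which the y=1 observation has the larger x, with ties in x counted as one half. The helper name c_index_pairs is hypothetical and is not part of Hmisc.

```r
# Brute-force concordance for binary y (illustrative helper, not part of Hmisc)
c_index_pairs <- function(x, y) {
  x1 <- x[y == 1]
  x0 <- x[y == 0]
  # Score every (y = 1, y = 0) pair: 1 if concordant, 0.5 if tied on x, 0 otherwise
  cmp <- outer(x1, x0, function(a, b) (a > b) + 0.5 * (a == b))
  mean(cmp)
}

set.seed(1)
predicted <- runif(200)
dead <- sample(0:1, 200, TRUE)

c_hat <- c_index_pairs(predicted, dead)
dxy <- 2 * (c_hat - 0.5)   # Dxy = 2(c - 0.5)
c(C = c_hat, Dxy = dxy)
```

The pairwise comparison needs memory proportional to the number of pairs, so it is only meant to show the definition on small data.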
somers2(x, y, weights=NULL, normwt=FALSE, na.rm=TRUE)
x: NAs are allowed.
y: coded 0-1. NAs are allowed.
weights: frequencies to associate with each observation.
normwt: set to TRUE to make weights sum to the actual number of non-missing observations.
na.rm: set to FALSE to suppress checking for NAs.
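As a rough illustration of what frequency weights and weight normalization mean, the base R sketch below uses hypothetical data. It assumes that a frequency weight stands for that many replicated observations and that normalization is a simple rescaling of the weights. It does not call somers2 and is not taken from its source code.

```r
# Hypothetical data: four (x, y) patterns, each observed `freq` times
x <- c(0.2, 0.4, 0.6, 0.8)
y <- c(0, 1, 0, 1)
freq <- c(3, 1, 2, 4)

# Assumed reading of frequency weights: each row stands for `freq` identical rows
x_long <- rep(x, times = freq)
y_long <- rep(y, times = freq)
length(x_long)   # 10 expanded observations

# Assumed normalization: rescale weights so they sum to the number of observations
w_norm <- freq * length(freq) / sum(freq)
sum(w_norm)      # equals length(freq) up to floating point
```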
The rcorr.cens function, although slower than somers2 for large sample sizes, can also be used to obtain Dxy for non-censored binary y, and it has the advantage of computing the standard deviation of the correlation index.
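A short comparison of the two functions might look like the sketch below. It assumes the Hmisc package is installed and that rcorr.cens returns a named vector whose elements include "Dxy" and "S.D."; the simulated data are the same as in the example for somers2.

```r
library(Hmisc)

set.seed(1)
predicted <- runif(200)
dead <- sample(0:1, 200, TRUE)

s2 <- somers2(predicted, dead)      # C, Dxy, n, Missing
rc <- rcorr.cens(predicted, dead)   # binary y passed as an uncensored outcome

s2["Dxy"]
rc["Dxy"]    # expected to agree with somers2 for non-censored binary y
rc["S.D."]   # standard deviation of Dxy, which somers2 does not report
```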
Returns a vector with the named elements C, Dxy, n (number of non-missing pairs), and Missing. Uses the formula
C = (mean(rank(x)[y == 1]) - (n1 + 1)/2)/(n - n1),
where n1 is the frequency of y=1.
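The rank formula can be tried by hand on a small data set. The base R sketch below assumes unweighted data with no missing values. The helper name c_index_rank is hypothetical and only mirrors the formula quoted above; it is not the Hmisc implementation.

```r
# Rank-based C for binary y, following the formula quoted above
# (illustrative helper, not part of Hmisc)
c_index_rank <- function(x, y) {
  n <- length(x)
  n1 <- sum(y == 1)   # frequency of y = 1
  (mean(rank(x)[y == 1]) - (n1 + 1) / 2) / (n - n1)
}

x <- c(0.1, 0.4, 0.35, 0.8, 0.4)
y <- c(0, 0, 1, 1, 1)

# rank(x) is 1, 3.5, 2, 5, 3.5 (tied values share the average rank),
# so the mean rank among y = 1 is (2 + 5 + 3.5)/3 = 3.5
c_val <- c_index_rank(x, y)   # (3.5 - 2)/(5 - 3) = 0.75
dxy <- 2 * (c_val - 0.5)      # 0.5
c(C = c_val, Dxy = dxy)
```

Because rank() assigns average ranks to ties, a pair tied on x contributes one half to C, which matches the usual definition of the ROC area.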
Frank Harrell
Department of Biostatistics
Vanderbilt University School of Medicine
f.harrell@vanderbilt.edu
set.seed(1)
predicted <- runif(200)
dead <- sample(0:1, 200, TRUE)
roc.area <- somers2(predicted, dead)["C"]