miss(x, sort = T)
sort: if TRUE or "rc", then rows and columns are first sorted by
the number of missing values, then reordered to put columns and
rows with similar missingness patterns together (see below for
details). If "c", then just columns are sorted; if "r", then just
rows are sorted; and if FALSE, then neither is sorted. If "R" or
"Rc", then rows are reordered to put similar rows together, but
without first sorting by the number missing in each row.
An object of class "miss" with components:
x).
x).
x, and one row for each unique pattern of missing values. The
ordering is determined by the sort argument.
pattern, indicating the number of rows of x with that pattern of
missing values.
x so that all rows with the first missingness pattern are first,
rows with the second pattern are next, etc.
sort is TRUE or contains "c".
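The pattern matrix, the per-pattern counts, and the row ordering described above can be approximated with base functions. The following is only a sketch, not the code used by miss: the helper name missing_patterns and its component names are hypothetical, and patterns are listed in order of first appearance rather than in the order controlled by the sort argument.

```r
# Hypothetical helper, not part of miss(): summarise missingness patterns.
missing_patterns <- function(x) {
  m <- is.na(x)                       # TRUE where a value is NA or NaN
  # One string key per row, e.g. "010" means only the second column is missing
  key <- apply(m, 1, function(r) paste(as.integer(r), collapse = ""))
  first <- !duplicated(key)           # first row showing each pattern
  idx <- match(key, key[first])       # pattern number of every row
  list(
    pattern   = m[first, , drop = FALSE],           # one row per unique pattern
    counts    = tabulate(idx, nbins = sum(first)),  # rows of x per pattern
    row_order = order(idx)                          # groups rows by pattern
  )
}

x <- cbind(a = c(1, NA, 1, NA), b = c(2, 2, 2, 2))
res <- missing_patterns(x)
x[res$row_order, ]   # rows with the first pattern, then the second, etc.
```

Because order() leaves ties in their original order, rows sharing a pattern keep their relative position in x.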
miss requires that missing data are denoted by NAs or NaNs.
The result of this function is normally printed by print.miss,
which provides a formatted display. To see all components of the
result, use print.default(miss(x)).
Columns are sorted first by the number missing in the column.
Then, among columns with an equal number missing,
columns are reordered to form two groups, such that
columns which have nonmissing data in the row with the
most nonmissing observations are in the first group.
These groups are reordered, recursively, according to
missingness in the row with the second (third, ...)
most nonmissing observations. Rows with the same number
of nonmissing observations are used in their original order.
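The recursive grouping of columns described above amounts to a lexicographic sort, which can be sketched with order(). This is an illustration under that reading of the text, not the implementation inside miss; the function name order_columns is hypothetical.

```r
# Hypothetical sketch of the column ordering rule; not the code used by miss().
order_columns <- function(x) {
  m <- is.na(x)
  # Rows from most to fewest nonmissing values; order() keeps ties
  # in their original order, as the text requires.
  row_rank <- order(rowSums(m))
  # Sort keys: number missing per column, then missingness in each ranked row.
  # FALSE sorts before TRUE, so columns with nonmissing data come first.
  keys <- c(list(colSums(m)), lapply(row_rank, function(r) m[r, ]))
  do.call(order, keys)
}

x <- rbind(c(1, NA, 3),
           c(NA, 2, 3),
           c(1,  2, 3))
order_columns(x)   # column indices in their sorted order
```

In this small example the third column has no missing values and comes first; the first two columns tie on the count and are separated by the first row, where only the first column is observed.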
After columns are sorted (or not sorted), rows are sorted
in much the same way the columns were,
first by the number missing in the row (this step is optional),
then by missingness in the first column, second, etc.
In other words,
rows with the same number of missing values
are ordered by the position of their first missing value.
As an example, suppose rows x[i,] and x[j,] both have k missing
values. Let x[i,im] and x[j,jm] be the first missing values in
rows i and j, respectively. Then x[i,] is placed before x[j,] if
im > jm. If im == jm, then the position of the next missing value
is considered, etc.
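The row comparison in this example is equivalent to sorting rows on their missingness indicators column by column, with nonmissing before missing. A minimal sketch follows. The function order_rows and its by_count argument are hypothetical names for illustration, and this is not the code used by miss.

```r
# Hypothetical sketch of the row ordering rule; not the code used by miss().
order_rows <- function(x, by_count = TRUE) {
  m <- is.na(x)
  # One sort key per column: FALSE (observed) sorts before TRUE (missing),
  # so a row whose first missing value occurs later is placed earlier.
  keys <- lapply(seq_len(ncol(m)), function(j) m[, j])
  # Optional first key: the number of missing values in each row.
  if (by_count) keys <- c(list(rowSums(m)), keys)
  do.call(order, keys)
}

x <- rbind(c(NA, 2, 3),    # first missing value in column 1
           c(1,  2, NA),   # first missing value in column 3
           c(1, NA, 3),    # first missing value in column 2
           c(1,  2, 3))    # no missing values
order_rows(x)   # row indices in their sorted order
```

Here the complete row comes first, followed by the rows with one missing value in decreasing order of the column where that value occurs, which matches the im > jm rule.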
    x <- longley.x
    x[runif(96) > .9] <- NA   # random missing data
    M <- miss(x)
    M                         # equivalent to print(M) or print.miss(M)
    print.default(M)          # print all components, no special format
    print(M, all.obs=F)       # omit last part of printout
    plot(M)

    # Other information about missing values can be obtained using e.g.:
    rowSums(is.na(x))                      # number missing in each row
    rowSums(!is.na(x))                     # number not missing in each row
    round(100 * colMeans(is.na(x)), 1)     # percent missing by column
    round(cor(is.na(x)), 2)                # correlation of missingness patterns

    # Missing value codes other than NA should be changed to NA, e.g.
    x[x == -9] <- NA
    # Do this before calling miss(x).