corDesign(design.option, n.layer=1, correlation.matrix=NULL, size, type.layer=rep("exchangeable", n.layer), block, data)
correlation.matrix, a 1-layer generic structured correlation design as specified in type.layer, or a 1-layer block-wise correlation design as specified in block and type.layer.
n.layer is set to 1 for fixed designs, and arguments size, type.layer and block are ignored. A matrix with zero correlation is a special case of fixed designs, and n.layer of such designs is set to zero.
"dimnames" attribute for its row and column names. Such integers are the indexes of record names, which are used to match the record id as specified in the cluster argument of a gee call to identify observations within clusters.
correlation.matrix.
In other cases, the default is a 1-level design, i.e. a (2 x 2) correlation matrix for n.layer=1.
n.layer.
The choices for generic correlation structures are:
type.layer should be a vector of n.layer character strings, each chosen from the generic structures.
type.layer and block are required arguments. In these cases type.layer should be a data frame with n.layer rows. Each row represents a layer of structured correlation, and the sequence of the rows corresponds to the order of the layers. By default, the integer row.names are used for such an order. This numeric order of a layer is the layer's id, layer.id.
For a nested design, layer.id indicates the ordering in the hierarchical structure. That is, the first row, layer.id=1, is the first layer and is the highest layer, which is defined, by default, as the layer closest to the diagonal of the design matrix. The last row is the lowest layer in our definition but is the top level within the hierarchical structure. A higher-order layer by layer.id always takes precedence over a lower-order layer, and the first layer has the highest precedence.
type.layer provides sufficient information and the argument block can be suppressed.
Specification of a correlation structure for a layer.id requires a correlation type. The correlation structures stationary, nonstationary, AR and unstruct are coordinate-dependent, and some designs require a covariate or a parameter to complete the specification. For example, the "AR" or "contAR" structures require additional time variables, and the "stationary" or "nonstationary" structures require a parameter to specify non-zero correlations.
In such cases, type.layer should be a data frame with three variables:
layer.id for nested designs or correlation structures requiring variables or parameters, and is optional for a 1-layer design (n.layer=1).
NA or "NA" is permitted, with each NA indicating that the parameterization of the corresponding layer.id does not require any variable. NAs are replaced by record id's.
x.layer have to be included either in the data or in the search list.
type. This column is optional, and NA is permitted for each row.
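As a minimal sketch, a three-variable type.layer data frame for a 2-layer design might be assembled as below. The column names type, x.layer and par are taken from the last example in this help file; the variable names record and xx are illustrative only, and the code uses base R calls without invoking corDesign itself.

```r
# Sketch only: column names follow the help file's own example;
# 'record' and 'xx' are illustrative variable names, not requirements.
type.layer <- data.frame(
  type    = c("nonstationary", "unstruct"),  # correlation type per layer
  x.layer = c("record", "xx"),               # variable used by each layer
  par     = c(2, NA),                        # parameter; NA where none is needed
  stringsAsFactors = FALSE
)

# Row order defines the layer id: row 1 is layer 1, row 2 is layer 2.
layer.id <- seq_len(nrow(type.layer))

# Every variable named in x.layer should be available in the data.
data.1 <- data.frame(record = rep(1:4, 2), xx = 1:8)
vars.ok <- all(type.layer$x.layer %in% names(data.1))
```

Checking vars.ok before building the design is a cheap way to catch a misspelled variable name early.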
layer.id, begin.row, begin.col, end.row, end.col. The layer.id associates a block with a layer. The other four columns specify the locations of blocks in the upper triangle of the correlation design matrix.
The row.names of this data frame should identify the individual blocks. Note that by default, integers are used for row.names in a data frame.
layer.id should correspond to the id or order of the layer as specified in the argument type.layer, so that each block has an associated layer. A layer.id may be associated with more than one block. Blocks of the same layer.id have the same correlation structure as defined in argument type.layer.
The (begin.row, begin.col) are the (row, column)-coordinates of the beginning cell, and the (end.row, end.col) are the (row, column)-coordinates of the ending cell. The values of these coordinates should lie between one and the argument size.
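The coordinate rule can be checked before calling corDesign. The following is a sketch in base R; check.block is a hypothetical helper written for illustration and is not part of the library.

```r
# Hypothetical helper (not part of the library): verify that a block
# data frame has the expected columns and that every coordinate lies
# between 1 and 'size'.
check.block <- function(block, size) {
  coords <- c("begin.row", "begin.col", "end.row", "end.col")
  stopifnot(all(c("layer.id", coords) %in% names(block)))
  vals <- as.matrix(block[, coords, drop = FALSE])
  all(vals >= 1 & vals <= size)
}

# Three blocks: two belonging to layer 1, one belonging to layer 2.
block <- data.frame(
  layer.id  = c(1, 1, 2),
  begin.row = c(1, 5, 1),
  begin.col = c(2, 6, 5),
  end.row   = c(3, 7, 4),
  end.col   = c(4, 8, 8)
)

ok.8 <- check.block(block, size = 8)  # all coordinates within 1..8
ok.6 <- check.block(block, size = 6)  # coordinates 7 and 8 exceed 6
```

Here ok.8 is TRUE and ok.6 is FALSE, because the second and third blocks reach rows or columns beyond 6.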
data should include all variables specified in type.layer. The number of rows of the data must be the same as the row dimension of the correlation matrix of a complete cluster. The default will generate an index variable record.names, which is the record identification of a complete cluster. This index variable will be used to match the second variable of the cluster argument of a gee or a geeDesign call. Therefore, the records of any incomplete cluster and its correlation matrix can be identified.
"corDesign" is returned. See corDesign.object for details.
"unstruct" requires specification of a variable; the values of the variable have to be consistent across blocks. If a variable is not provided in x.layer (see argument type.layer), the record id in data is used by default. This default is adequate for an "unstruct" layer with only one block and may not be correct for multiblock and multilayer designs.
An "unstruct" layer with multiple blocks requires an indexing variable to specify the (row, column)-coordinate of each parameter. The (row, column)-coordinates in each block have to be the same as those specified by the indexing variable for a pivot block. Any cluster in unbalanced data is a realization of all pivot blocks and can be identified by the record id's and other variables in x.layer.
The output of corDesign is mainly used to specify the correlation argument for the geeDesign function. Such correlation designs are desired when a correct correlation structure is somewhat known or is important in specifying the working correlation in GEE. This is particularly the case for multivariate correlated data and multilevel data. It is also useful to ease the efficiency concern when the number of clusters is small and the cluster size is large and complicated.
A nested design in this context means that a layer of correlation is nested in another layer. This is not exactly the same notion as in experimental designs. In our case, recursively nested structures are allowed.
corDesign is meant for arbitrary unbalanced designs; therefore, the arguments are complicated. It is recommended that users try simple nested designs first and always review the summary of a corDesign object before calling geeDesign and gee.fit. For more details, use as.list to see all components as described in the help file of corDesign.object.
If n.layer is greater than one, the design has a multilayer structured correlation, of which nested designs and unbalanced block designs are examples. These designs have more than one layer of correlation structures, and each layer has an id, a correlation type, and within-layer block information as specified by arguments type.layer and block. Each layer defines a unique parameterization for correlation with a type of structure chosen from the generic structures listed above. A layer may have one or more blocks, and blocks of the same layer have the same parameterization and correlation type. Different layers have distinct parameterizations, so that blocks of different parameterizations have to be considered as different layers.
Cells with zero correlation
in the design matrix may be used to construct the representation of
multilayer designs, but they
do not constitute a layer of correlation parameterization.
The diagonal of the design matrix is also not considered
as a layer of correlation.
For designs with more than one layer, arguments type.layer and block are necessary. If not provided, a nested (2^n.layer) factorial design assuming two levels for each layer is generated. This design has (2^n.layer-1) blocks in the upper triangle of the correlation matrix. See examples.
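The block count can be illustrated with a short sketch in base R. It assumes one plausible layout for a nested two-level design, in which layer k joins adjacent groups of 2^(k-1) records; nested.blocks is a hypothetical helper, and the blocks that corDesign actually generates may be arranged differently.

```r
# Hypothetical helper (not part of the library): enumerate the
# upper-triangle blocks of a nested design with two levels per layer.
# Assumption: layer k pairs adjacent groups of 2^(k-1) records.
nested.blocks <- function(n.layer) {
  out <- NULL
  for (k in seq_len(n.layer)) {
    half     <- 2^(k - 1)          # rows (and columns) in one layer-k block
    n.blocks <- 2^(n.layer - k)    # number of blocks in layer k
    start    <- (seq_len(n.blocks) - 1) * 2 * half + 1
    out <- rbind(out, data.frame(
      layer.id  = k,
      begin.row = start,
      begin.col = start + half,
      end.row   = start + half - 1,
      end.col   = start + 2 * half - 1
    ))
  }
  out
}

blocks  <- nested.blocks(3)
n.total <- nrow(blocks)            # 4 + 2 + 1 blocks over three layers
```

For n.layer = 3 the layers contribute 4, 2 and 1 blocks, which sums to 2^3 - 1 = 7 and spans an 8 x 8 matrix.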
If argument type.layer has a layer variable, x.layer, it must be in the search path or in the data frame when gee is called.
In a gee call with unbalanced data, the record id specified in the argument cluster is used to match the row names of the "dimnames" attribute of the correlation matrix or the design matrix.
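The matching idea can be sketched in base R as follows. This is an illustration of the principle only, not the library's internal mechanism; the cluster size, the correlation value 0.3 and the observed record ids are all made up for the example.

```r
# Complete-cluster correlation matrix with 4 records
# (exchangeable, rho = 0.3; values are illustrative).
size <- 4
R.full <- matrix(0.3, size, size)
diag(R.full) <- 1
dimnames(R.full) <- list(1:size, 1:size)   # record ids stored as row/column names

# An incomplete cluster observed only at records 1, 2 and 4.
record.id <- c(1, 2, 4)

# Match the record ids against the row names and extract the submatrix.
idx   <- match(as.character(record.id), rownames(R.full))
R.sub <- R.full[idx, idx]
```

R.sub is the 3 x 3 correlation matrix for the records actually observed, with its row and column names still carrying the original record ids.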
Chao, E. C. (2003). Structured correlation designs in modeling clustered data. Insightful technical report.
## A nested 2^3 factorial design
corDesign(design.option="nested", n.layer=3)

## A 1-layer design
ex1 <- data.frame(layer.id="1", begin.row=1, begin.col=2, end.row=11, end.col=12)
corDesign(design.option="block", size=12, n.layer=1, type.layer="AR", block=ex1)

## A 2-layer block design
ex2 <- data.frame(layer.id = c(1, 2), begin.row = c(1, 3), begin.col = c(1, 3),
                  end.row = c(2, 5), end.col = c(2, 5))
## An alternative way for large number of layers or blocks
ex2 <- data.frame(c(1,2), t(matrix(c(1,1,2,2,3,3,5,5), 4)))
names(ex2) <- c("layer.id", "begin.row", "begin.col", "end.row", "end.col")
ex2.design <- corDesign(design.option="block", size=5, n.layer=2,
                        type.layer=data.frame(type=c("AR","exchangeable")),
                        block=ex2)
summary(ex2.design)
as.list(ex2.design)

Seizure.Subject <- recordDesign(cluster = "Subject", data = Seizure)
ex2.gee <- geeDesign(y ~ Time + group, cluster = cbind(clusterID, recordID),
                     variance = "glm.scale", family = "poisson", link = "log",
                     correlation = ex2.design, data = Seizure.Subject)
gee.fit(ex2.gee)

## A 1-layer of nonstationary nested in a layer of unstructured correlation
ex1 <- data.frame(t(matrix(c(1,1,2,3,4,1,5,6,7,8,2,1,5,4,8), 5)))
names(ex1) <- c("layer.id", "begin.row", "begin.col", "end.row", "end.col")
type.1 <- data.frame(type=c("nonstationary","unstruct"),
                     x.layer=c("record","xx"), par=c(2,NA))
data.1 <- data.frame(record=rep(1:4,2), xx=1:8)
xx1 <- corDesign(design.option = "block", n.layer = 2, size = 8,
                 type.layer = type.1, block = ex1, data=data.1)
summary(xx1)