Title: | Simultaneous Inference for Multiple Linear Contrasts in GEE Models |
---|---|
Description: | Provides global hypothesis tests, multiple testing procedures and simultaneous confidence intervals for multiple linear contrasts of regression coefficients in a single generalized estimating equation (GEE) model or across multiple GEE models. GEE models are fit by a modified version of the 'geeM' package. |
Authors: | Robin Ristl [aut, cre], Lee McDaniel [ctb] (Author of 'geeM' package), Nick Henderson [ctb] (Author of 'geeM' package), Melanie Prague [ctb] (Contributor to 'geeM' package) |
Maintainer: | Robin Ristl <[email protected]> |
License: | GPL-3 |
Version: | 1.20 |
Built: | 2024-10-25 03:46:33 UTC |
Source: | https://github.com/cran/mmmgee |
A data set was simulated with repeated observations of a continous outcome variable Y.lin, a count data outcome Y.poi and a binary outcome Y.bin. In the simulation, the mean of an outcome variable depends on the binary grouping variable gr.lang and one of the continuos predictors x1, x2 and x3. Observations of all outcomes in the same subject, indicated by the variable id, are correlated. Also x1, x2 and x3 are correlated within subjects. Data are independent between subjects. The example code shows how to fit a marginal GEE model for each outcome variable and how to test, with different methods, the null hypothesis that the grouping variable has no effect on any of the three endpoints.
data(datasim)
data(datasim)
A data frame.
data(datasim) head(datasim) mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable") mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="exchangeable") mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="exchangeable") L1<-L2<-L3<-matrix(c(0,1,0),nrow=1) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="maximum", biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) ## Not run: mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="score",closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="score",type="quadratic", closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="quadratic", biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="quadratic", scaled=TRUE,biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) ## End(Not run)
data(datasim) head(datasim) mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable") mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="exchangeable") mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="exchangeable") L1<-L2<-L3<-matrix(c(0,1,0),nrow=1) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="maximum", biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) ## Not run: mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="score",closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="score",type="quadratic", closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="quadratic", biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(L1,L2,L3),statistic="Wald",type="quadratic", scaled=TRUE,biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) ## End(Not run)
geem2
is a modified version of geem
to fit generalized estimating equation models and to provide objects that can be used
for simultaneous inference across multiple marginal models using mmmgee
and mmmgee.test
.
Like geem
, geem2
estimates coefficients and nuisance parameters using generalized estimating equations. The link and variance functions can be
specified by the user and the syntax is similar to glm
.
geem2(formula, id, waves = NULL, data = parent.frame(), family = gaussian, corstr = "independence", Mv = 1, weights = NULL, corr.mat = NULL, init.beta = NULL, init.alpha = NULL, init.phi = 1, scale.fix = FALSE, nodummy = FALSE, sandwich = TRUE, useP = TRUE, maxit = 20, tol = 1e-05, restriction = NULL, conv.criterion = c("ratio", "difference"))
geem2(formula, id, waves = NULL, data = parent.frame(), family = gaussian, corstr = "independence", Mv = 1, weights = NULL, corr.mat = NULL, init.beta = NULL, init.alpha = NULL, init.phi = 1, scale.fix = FALSE, nodummy = FALSE, sandwich = TRUE, useP = TRUE, maxit = 20, tol = 1e-05, restriction = NULL, conv.criterion = c("ratio", "difference"))
formula |
a formula expression similar to that for |
id |
a vector identifying the clusters. By default, data are assumed to be sorted such that observations in a cluster are in consecutive rows and higher
numbered rows in a cluster are assumed to be later. If |
waves |
a non-negative integer vector identifying components of a cluster. For example, this could be a time ordering. If integers are skipped within a cluster,
then dummy rows with weight 0 are added in an attempt to preserve the correlation structure (except if |
data |
an optional data frame containing the variables in the model. |
family |
will determine the link and variance functions. The argument can be one of three options: a family object, a character string, or a list of functions.
For more information on how to use family objects, see |
corstr |
a character string specifying the correlation structure. Allowed structures are: |
Mv |
for |
weights |
A vector of weights for the inverse of the scale factor each observation.
If an observation is assigned weight 0, it is excluded from the calculations of any parameters.
Observations with a |
corr.mat |
the correlation matrix for |
init.beta |
an optional vector with the initial values of beta. If not specified, then the intercept will be set to |
init.alpha |
an optional scalar or vector giving the initial values for the correlation. If provided along with |
init.phi |
an optional initial scale parameter. If not supplied, initialized to 1. |
scale.fix |
if set to |
nodummy |
if set to |
sandwich |
if |
useP |
if set to |
maxit |
maximum number of iterations. |
tol |
tolerance in calculation of coefficients. |
restriction |
either a contrast matrix or a list of a contrast matrix and a right hand side vector, defining a restriction on the regression coefficients. See details. |
conv.criterion |
convergence criterion, either |
The function is a modification of geem
from the geeM package, such that
additional output is returned that is required for the calculation of covariance
matrix across multiple marginal models. In particular the contributions of each subject to the estimating equation are made available in the output. Internal functions regarding
the calculation of matrix inverses were modified to improve the handling of missing data.
In geem2
, weights are defined as scale weights, similar to most othe software. Note that, in contrast, the current version of geem
(version 0.10.1) uses
residual weights.
The scale parameter phi
is used in estimating the residual working correlation parameters and in estimating the model based (naiv) covariance matrix of the regression coefficients.
Similar as in most other software, requesting scale.fix=TRUE
only has an impact on the latter, while the working correlation is still estimated using an empirical
scale factor for the residuals. In contrast, geem
uses the fixed scale factor also when estimating the working correlation.
geem2
allows for estimation of regression coefficients under linear restrictions , where
is a contrast matrix,
the vector of regression coefficients and
a real valued right hand side vector. Using the argument
restriction
, and
can be specified.
If only
is specified,
is assumed as null vector. The functionality is in particular required to calculate the generalized score test for linear hypotheses about
. Use
conv.criterion="difference"
if any regression coefficient is restricted to 0.
A list with class geem2
, similar to the output
of geem
from the geeM package. The additional slot sandwich.args
contains components to calculate the sandwich variance estimator for the
fitted model and across models if
applied in the multiple marginal model framework.
The option to fit a model with linear restrictions concerning the coefficients is implemented to enable the calculation of a generalized score test.
It may also be used to obtain estimates of the coefficients under restrictions. The model based and robust variance estimates of the restricted
coefficient estimates are found in the slots restricted.naiv.var
and restricted.var
, respectively. Note that the variance of estimates restricted to a single value
is supposed to be zero, however the calculated variance estimate may deviate from zero within machine accuracy.
The geem
function was written by Lee McDaniel and Nick Henderson, modifications for geem2
are by Robin Ristl [email protected]
Lee S. McDaniel, Nicholas C. Henderson, Paul J. Rathouz. Fast pure R implementation of GEE: application of the matrix package. The R journal 5.1 (2013): 181.
data(keratosis) m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence") summary(m1) m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence") summary(m2) geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="exchangeable") # data(datasim) mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable") summary(mod1) mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="unstructured") summary(mod2) mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="user", corr.mat=matrix(c(1,2,3,0, 2,1,2,3, 3,2,1,2, 0,3,2,1),4,4)) summary(mod3)
data(keratosis) m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence") summary(m1) m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence") summary(m2) geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="exchangeable") # data(datasim) mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable") summary(mod1) mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="unstructured") summary(mod2) mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="user", corr.mat=matrix(c(1,2,3,0, 2,1,2,3, 3,2,1,2, 0,3,2,1),4,4)) summary(mod3)
A data set simulated under the planning assumptions for a study comparing four radiation regimens for a photodynamic treatment of actinic keratosis. Each patient receives each treatment in a different skin patch and each patch contains four lesions. Variables are patient identifier (id), treatment (trt), lesion identifier within a patient (lesion), the binary outcome clearance success (1=success, 0=no success) reported for each lesion and the metric outcome pain (larger values indicating more pain) reported for each skin patch. The aim of the study is to compare the treatments B, C and D to the reference treatment A in terms of both outcomes.
data(keratosis)
data(keratosis)
A data frame.
data(keratosis) head(keratosis)
data(keratosis) head(keratosis)
Calculate the covariance matrix for a stacked vector of regression coefficients
from multiple marginal GEE models fitted with geem2
.
mmmgee(x, biascorr = FALSE)
mmmgee(x, biascorr = FALSE)
x |
a list of |
biascorr |
logical, if |
A list with class mmmgee
containing the following components:
beta
The stacked vector of regression coefficient estimates from the models in x
.
V
The estimated covariance matrix of the regression coefficient estimates.
A
The outer component of .
B
The inner component of .
biascorr
The value of the input argument biascorr
(logical).
n
A vector with the number of clusters in each model in x
.
p
A vector with number of regression coefficients in each model in x
.
Robin Ristl, [email protected]
Lloyd A. Mancl, Timothy A. DeRouen. A covariance estimator for GEE with improved small sample properties. Biometrics, 2001, 57(1):126-134.
data(keratosis) m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence") m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence") mmmgee(x=list(m1,m2),biascorr=TRUE)
data(keratosis) m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence") m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence") mmmgee(x=list(m1,m2),biascorr=TRUE)
Global hypothesis tests, multiple testing procedures and simultaneous confidence intervals for multiple linear contrasts of regression coefficients in a single generalized estimating equation (GEE) model or across multiple GEE models.
mmmgee.test(x, L, r = NULL, statistic = c("wald", "score"), type = c("maximum", "quadratic"), asymptotic = TRUE, biascorr = FALSE, closed.test = FALSE, conf.int = FALSE, conf.level = 0.95, alternative = c("undirected", "greater", "less"), denomDF = NULL, scaled.F = FALSE, maxit = 20, tol = 10^(-8), ...)
mmmgee.test(x, L, r = NULL, statistic = c("wald", "score"), type = c("maximum", "quadratic"), asymptotic = TRUE, biascorr = FALSE, closed.test = FALSE, conf.int = FALSE, conf.level = 0.95, alternative = c("undirected", "greater", "less"), denomDF = NULL, scaled.F = FALSE, maxit = 20, tol = 10^(-8), ...)
x |
a |
L |
a contrast matrix defining a contrast for the stacked vector of regression coefficients of the marginal models, or a list of contrast matrices.
In the latter case, the list must contain one matrix for each model listed in |
r |
right hand side vector of the null hypothesis or a list of vectors resembling the right hand side of the null hypothesis. If not specified, |
statistic |
either |
type |
either |
asymptotic |
logical, if |
biascorr |
logical indicating whether the Mancl and DeRouen Bias correction should be used when estimating the joint covariance matrix via |
closed.test |
logical, if |
conf.int |
logical. If |
conf.level |
the nominal simultaneous coverage probability of the confidence intervals. |
alternative |
one of |
denomDF |
Defaults to |
scaled.F |
logical. If |
maxit |
maximal number of iterations to be passed to |
tol |
tolerance limit for the convergence criterion to be passed to |
... |
additional arguments that are passed to |
The null hypothesis is where
L
is a contrast matrix,
the stacked vector of regression coefficients rom the marginal models and
r
a real values right hand side vector.
L
can be specified as matrix or, if it is a block diagonal matrix with each block corresponding to
a contrast for one marginal GEE model, as list of the matrices on the diagonal. The right hand side r
can be speficied as vector or as list of
vectors each corresponding to the part of the right hand side vector for one model.
When choosing statistic="wald"
and type="maximum"
, the maximum of the standardized entries of is used as test statistic and the
p-value is calculated from a multivariate normal or t-distribution (depending on
asymptotic
being TRUE
or FALSE
) with correlation matrix
estimated for . For the t-distribution, denominator degrees of freedom are used as specified in
denomDF
.
When choosing statistic="wald"
and type="quadratic"
, a quadratic form of and the inverse of the estimated covariance matrix
of
is used as test statistic and the p-value is calculated from a
chi-squared distribution or an F-distribution (depending on
asymptotic
being TRUE
or FALSE
).
With statistic="score"
, generalized score tests are calculated by replacing by the first order approximation
where
is the stacked estimating equation (the score) and
is the negative inverse of the matrix
of first derivatives of
, both evaluated at the location of constrained estimates for
under the null hypothesis.
Analogous to the Wald statistic, a maximum-type and a quadratic form score test are available. For the score test the option
asymptotic
is ignored
and the reference distribution is multivariate normal or chi-squared.
A list with class mmmgeetest
containing the following components, if required:
test
Contains a data frame with the test statistic, degrees of freedom (depending in the type of test) and the p-value. If closed test was required,
a further data frame is reported with estimates, right hand side vector, unadjusted p-values and adjusted p-values for each line of .
hypothesis
A list containing the contrast matrix and the right hand side vector
.
conf.int
The simultaneous confidence intervals.
denomDF
The type and value of the denominator degrees of freedom used in the procedure.
mmmgee
The mmmgee
object containing in particular the estimated covariance matrix for the coefficents of the models in x
. See mmmgee
.
A single value for the denominator degrees of freedom is calculated for the covariance matrix estimate across all contrasts. In the closed testing procedure, this value is used for the degrees of freedom associated with the covariance matrix of any subset of contrasts.
Usual linear models or generalized linear models can be regarded as special case of GEE models and can be included in the analysis framework.
Note however that mmmgee.test
always uses the robust sandwich covariance matrix estimate, even if the calculation of the sandwich covariance was suppressed
in the model objects in x
.
Robin Ristl, [email protected]
Dennis D. Boos. On generalized score tests. The American Statistician, 1992, 46(4):327-333.
Lloyd A. Mancl, Timothy A. DeRouen. A covariance estimator for GEE with improved small sample properties. Biometrics, 2001, 57(1):126-134.
data(keratosis) m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence") m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence") L1<-L2<-diag(1,4)[-1,] mmmgee.test(x=m1,L=list(L1),statistic="wald",type="maximum") mmmgee.test(x=m1,L=list(L1),statistic="score",type="maximum") mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",asymptotic=FALSE,biascorr=TRUE) mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",closed.test=TRUE) mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",asymptotic=FALSE, alternative="less",conf.int=TRUE,denomDF=40) mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="quadratic",asymptotic=TRUE) mmmgee.test(x=list(m1,m2),L=list(L1,L2),statistic="score",type="quadratic") mmmgee.test(x=list(m1,m2),L=list(L1,L2),statistic="score",type="maximum") # ## Not run: data(datasim) mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable") mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="exchangeable") mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="exchangeable") Li<-matrix(c(0,1,0),nrow=1) mmmgee.test(list(mod1,mod2,mod3),L=list(Li,Li,Li),statistic="Wald",type="maximum", biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(Li,Li,Li),statistic="score",closed.test=TRUE) ## End(Not run)
data(keratosis) m1<-geem2(clearance~trt,id=id,data=keratosis,family=binomial,corstr="independence") m2<-geem2(pain~trt,id=id,data=keratosis[keratosis$lesion==1,],family=gaussian,corstr="independence") L1<-L2<-diag(1,4)[-1,] mmmgee.test(x=m1,L=list(L1),statistic="wald",type="maximum") mmmgee.test(x=m1,L=list(L1),statistic="score",type="maximum") mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",asymptotic=FALSE,biascorr=TRUE) mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",closed.test=TRUE) mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="maximum",asymptotic=FALSE, alternative="less",conf.int=TRUE,denomDF=40) mmmgee.test(x=list(m1,m2),L=list(L1,L2),type="quadratic",asymptotic=TRUE) mmmgee.test(x=list(m1,m2),L=list(L1,L2),statistic="score",type="quadratic") mmmgee.test(x=list(m1,m2),L=list(L1,L2),statistic="score",type="maximum") # ## Not run: data(datasim) mod1<-geem2(Y.lin~gr.lang+x1,id=id,data=datasim,family="gaussian",corstr="exchangeable") mod2<-geem2(Y.poi~gr.lang+x2,id=id,data=datasim,family="poisson",corstr="exchangeable") mod3<-geem2(Y.bin~gr.lang+x3,id=id,data=datasim,family="binomial",corstr="exchangeable") Li<-matrix(c(0,1,0),nrow=1) mmmgee.test(list(mod1,mod2,mod3),L=list(Li,Li,Li),statistic="Wald",type="maximum", biascorr=TRUE,asymptotic=FALSE,closed.test=TRUE) mmmgee.test(list(mod1,mod2,mod3),L=list(Li,Li,Li),statistic="score",closed.test=TRUE) ## End(Not run)