The analysis of binary longitudinal data with time-dependent covariates


Justine Shults, Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, 423 Guardian Drive, 632 Blockley Hall, Philadelphia, PA 19104, U.S.A.



We consider longitudinal studies with binary outcomes that are measured repeatedly on subjects over time. The goal of our analysis is to fit a logistic model that relates the expected value of the outcomes to explanatory variables that are measured on each subject. However, additional care must be taken to adjust for the association between the repeated measurements on each subject. We propose a new maximum likelihood method for covariates that may be fixed or time varying. We also implement two other approaches and compare them with our method: generalized estimating equations, which may be more robust to misspecification of the true correlation structure, and alternating logistic regression, which models association via odds ratios that are subject to less restrictive constraints than are correlations. The proposed estimation procedure yields consistent and asymptotically normal estimates of the regression and correlation parameters if the correlation between consecutive measurements on a subject is correctly specified. Simulations demonstrate that our approach can yield improved efficiency in estimation of the regression parameter; for equally spaced and complete data, the gains in efficiency were greatest for the parameter associated with a time-by-group interaction term and for stronger correlations. For unequally spaced data with dropout according to a missing-at-random mechanism, the proposed method (MARK1ML) with correctly specified consecutive correlations yielded substantial improvements in both bias and efficiency. We present an analysis to demonstrate application of the methods we consider. We also offer an R function for easy implementation of our approach. Copyright © 2012 John Wiley & Sons, Ltd.