Marginal Analyses of Clustered Data When Cluster Size Is Informative

Authors

  • John M. Williamson,

    1. Division of HIV/AIDS Prevention, National Center for HIV, STD and TB Prevention, Centers for Disease Control and Prevention, MS E-37, 1600 Clifton Road, NE, Atlanta, Georgia 30333, U.S.A.
    2. email: jow5@cdc.gov
  • Somnath Datta,

    1. Department of Statistics, University of Georgia, Athens, Georgia 30602, U.S.A.
  • Glen A. Satten

    1. Division of Laboratory Science, National Center for Environmental Health, Centers for Disease Control and Prevention, MS F-24, 1600 Clifton Road, NE, Atlanta, Georgia 30333, U.S.A.

Abstract

Summary. We propose a new approach to fitting marginal models to clustered data when cluster size is informative. This approach uses a generalized estimating equation (GEE) in which each observation is weighted by the inverse of its cluster's size. We show that our approach is asymptotically equivalent to within-cluster resampling (WCR) (Hoffman, Sen, and Weinberg, 2001, Biometrika 73, 13–22), a computationally intensive approach in which replicate data sets containing a randomly selected observation from each cluster are analyzed, and the resulting estimates averaged. Using simulated data and an example involving dental health, we show the superior performance of our approach compared to unweighted GEE, the equivalence of our approach with WCR for large sample sizes, and the superior performance of our approach compared with WCR when sample sizes are small.
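
To make the contrast in the summary concrete, the following minimal sketch compares the three estimators in the simplest setting we could think of: an intercept-only marginal model with identity link and an independence working correlation. That setting, the toy data, and the function names (`unweighted_mean`, `cluster_weighted_mean`, `wcr_mean`) are illustrative assumptions and are not taken from the paper. Under these assumptions the unweighted GEE solution reduces to the pooled mean of all observations, the cluster-size-weighted solution reduces to the mean of the cluster means, and WCR averages the means of replicate data sets holding one randomly chosen observation per cluster.

```python
import numpy as np


def unweighted_mean(clusters):
    # Solves sum_i sum_j (y_ij - mu) = 0: every observation counts equally,
    # so large clusters dominate the estimate.
    return float(np.concatenate(clusters).mean())


def cluster_weighted_mean(clusters):
    # Solves sum_i (1 / n_i) sum_j (y_ij - mu) = 0: every cluster counts
    # equally, which gives the mean of the cluster means.
    return float(np.mean([c.mean() for c in clusters]))


def wcr_mean(clusters, n_resamples=2000, seed=0):
    # Within-cluster resampling: draw one observation per cluster, estimate
    # mu on that replicate data set, and average over the replicates.
    rng = np.random.default_rng(seed)
    estimates = [
        np.mean([rng.choice(c) for c in clusters]) for _ in range(n_resamples)
    ]
    return float(np.mean(estimates))


# Hypothetical toy data in which cluster size is informative:
# larger clusters have a lower proportion of ones.
clusters = [
    np.array([1.0]),
    np.array([0.0, 0.0, 1.0]),
    np.array([1.0, 1.0]),
    np.array([0.0, 0.0, 0.0, 1.0]),
]

print("unweighted GEE (pooled mean):", unweighted_mean(clusters))
print("cluster-weighted GEE:        ", cluster_weighted_mean(clusters))
print("WCR (Monte Carlo):           ", wcr_mean(clusters))
```

In this toy example the pooled mean is pulled toward the large clusters, while the cluster-weighted estimate and the WCR average agree up to Monte Carlo error, and the weighted estimate needs no resampling.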
