Candidate-gene association studies with pedigree data: Controlling for environmental covariates



Case-control studies provide an important epidemiological tool for evaluating candidate genes, and many study designs are available. We focus on a more recently proposed design, which we call a multiplex case-control (MCC) design. This design compares allele frequencies between related cases, each of whom is sampled from a multiplex family, and unrelated controls. Because genotypes are correlated within families, statistical methods must take this correlation into account. Moreover, methods are needed that simultaneously control for potential confounders in the analysis. Generalized estimating equations (GEE) are one approach to analyzing this type of data; however, this approach can have singularity problems when estimating the correlation matrix. To allow for modeling of other covariates, we extend our previously developed method to a more general model-based approach. Our proposed methods use the score statistic, derived from a composite likelihood. We propose three different approaches to estimating the variance of this statistic. Under random ascertainment of pedigrees, score tests have correct type I error rates; however, pedigrees are not randomly ascertained. We therefore use simulations to assess the validity and power of the score tests under different ascertainment schemes, and we illustrate our methods with data from a prostate cancer study. We find that our robust score statistic has estimated type I error rates within the expected range in all situations we considered, whereas the other two statistics have inflated type I error rates under nonrandom ascertainment schemes. We also find that GEE fails at least 5% of the time in every simulation configuration; in some configurations, the failure rate exceeds 80%. In summary, our robust method may be the only regression analysis method currently available for MCC data. Genet Epidemiol 24:273–283, 2003. © 2003 Wiley-Liss, Inc.