Is It Rare or Common?


Contract grant sponsor: National Institutes of Health; Contract grant numbers: R01MH081862, R01MH087590, U01HL089897, and U01HL089856.

Correspondence to: Kaustubh Adhikari, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA. E-mail:


Many genome-wide association studies (GWAS) report signals of unknown etiology. This paper addresses the question: is such an association signal caused by rare or by common variants that increase disease risk? For a genomic region implicated by a GWAS, we use single nucleotide polymorphism (SNP) data in a case-control setting to infer, through a Bayesian analysis, how many common or rare variants are present. Our objective is to compute posterior probabilities for configurations of rare and/or common variants. We use ancestral recombination graphs, an extension of coalescent trees, to model the genealogical history of the samples based on marker data. Because we expect SNPs to be in linkage disequilibrium with common disease variants, we expect the trees to reflect the type of variants. To demonstrate the application, we apply our method to candidate gene sequencing data from a German case-control study on nonsyndromic cleft lip with or without cleft palate.

Genet. Epidemiol. 36:419-429, 2012. © 2012 Wiley Periodicals, Inc.