A coalescent simulation of marker selection strategy for candidate gene association studies


  • Please cite this article as follows: Cole SM, Long JC. 2007. A Coalescent Simulation of Marker Selection Strategy for Candidate Gene Association Studies. Am J Med Genet Part B 147B:86–93.


Recent efforts have focused on the challenges of finding alleles that contribute to health-related phenotypes in genome-wide association studies. However, in candidate gene studies, where the genomic region of interest is small and recombination is limited, the factors that affect the ability to detect disease-susceptibility alleles remain poorly understood. In particular, it is unclear how the power to detect a candidate gene is influenced by the number of markers on a haplotype, the type of marker (e.g., single nucleotide polymorphism (SNP) or short tandem repeat (STR)), the inclusion of the causative site (cs) as a genetic marker, or population demographics. We evaluated the power of association tests using coalescent-based computer simulations. The results show that the effective number of markers on a haplotype depends on whether the cs is included as a marker. When the analyses include the cs, the highest power is achieved with a single-marker association test. However, when the cs is excluded from the analyses, adding more nonfunctional SNPs to the haplotype increases power up to a certain point under most scenarios. We find that a rapidly expanding population always has lower power than a population of constant size, although using markers with a frequency of at least 5% improves the chance of detecting an association. Comparing the mutational properties of a nonfunctional SNP and an STR, a multi-allelic STR provides power greater than or comparable to that of a bi-allelic SNP unless SNP frequencies are constrained to 10% or more. Similarly, including an STR with SNPs on a haplotype improves power unless SNP frequencies are 5% or more. © 2007 Wiley-Liss, Inc.
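
To make the phrase "coalescent simulation" concrete, the following is a minimal sketch of how a sample of haplotypes might be generated under a standard coalescent. It is not the authors' simulator: it assumes a constant-size population, infinite-sites mutation for SNP-like markers, and no recombination (loosely matching the small candidate-gene region described above), and it omits population expansion, STR mutation, phenotype assignment, and the association test itself. The function names `poisson` and `simulate_haplotypes` and the parameter `theta` (the scaled mutation rate 4Nμ) are illustrative choices, not taken from the article.

```python
import random


def poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_haplotypes(n, theta, rng):
    """Simulate n haplotypes under a constant-size coalescent (assumed model).

    Time is measured in units of 2N generations, so k lineages coalesce at
    rate k(k-1)/2 and each lineage mutates at rate theta/2.
    Returns a list of n haplotypes, each a list of 0/1 alleles (1 = derived).
    """
    # Each active lineage is the set of sampled haplotypes that descend from it.
    lineages = [frozenset([i]) for i in range(n)]
    haplotypes = [[] for _ in range(n)]
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time until the next pair of lineages coalesces.
        t = rng.expovariate(k * (k - 1) / 2)
        # Infinite sites: every mutation creates a new segregating site that
        # is carried by all samples descending from the mutated lineage.
        for lineage in lineages:
            for _ in range(poisson(theta / 2 * t, rng)):
                for h in range(n):
                    haplotypes[h].append(1 if h in lineage else 0)
        # Merge two lineages chosen uniformly at random.
        a, b = rng.sample(range(k), 2)
        merged = lineages[a] | lineages[b]
        lineages = [l for i, l in enumerate(lineages) if i not in (a, b)]
        lineages.append(merged)
    return haplotypes


if __name__ == "__main__":
    sample = simulate_haplotypes(n=10, theta=5.0, rng=random.Random(1))
    for hap in sample:
        print("".join(str(allele) for allele in hap))
```

In a power study of the kind summarized here, one would repeat such a simulation many times, designate one site as the causative site, assign case or control status accordingly, and record how often a chosen association test rejects the null hypothesis; those steps are left out of this sketch because the abstract does not specify them.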