SNPs, haplotypes, and model selection in a candidate gene region: The SIMPle analysis for multilocus data
Article first published online: 12 NOV 2004
© 2004 Wiley-Liss, Inc.
Volume 27, Issue 4, pages 429–441, December 2004
How to Cite
Conti, D. V. and Gauderman, W. J. (2004), SNPs, haplotypes, and model selection in a candidate gene region: The SIMPle analysis for multilocus data. Genet. Epidemiol., 27: 429–441. doi: 10.1002/gepi.20039
- Issue published online: 12 NOV 2004
- Manuscript Accepted: 31 AUG 2004
- Manuscript Received: 17 MAY 2004
- Zilkha Neurogenetic Institute
- NIH. Grant Numbers: ES10421, 5P30ES07048
- Bayes model averaging
- association analysis
Modern molecular techniques make discovery of numerous single nucleotide polymorphisms (SNPs) in candidate gene regions feasible. Conventional analysis relies on either independent tests of each variant or the use of haplotypes in association analysis. The first technique ignores the dependencies between SNPs. The second, though it may increase power, often introduces uncertainty by estimating haplotypes from population data. Additionally, as the number of loci in a haplotype grows, it becomes increasingly ambiguous which underlying genetic components drive a detected association. Here, we present a genotype-level analysis that jointly models the SNPs via a SNP interaction model with phase information (SIMPle) to capture the underlying haplotype structure. This analysis estimates both the risk associated with each variant and the importance of phase between pairwise combinations of SNPs. Thus, rather than selecting between genotype- or haplotype-level approaches, the SIMPle method frames the analysis of multilocus data in a model selection paradigm, with the aim of determining which SNPs, phase terms, and linear combinations best describe the relation between genetic variation and a trait of interest. To avoid unstable estimation due to sparse data and to incorporate both the dependencies among terms and the uncertainty in model selection, we propose a Bayes model averaging procedure. This procedure highlights key SNPs and phase terms and yields a set of best representative models. Using simulations, we demonstrate the utility of the SIMPle model to identify crucial SNPs and underlying haplotype structures across a variety of causal models and genetic architectures. Genet. Epidemiol. © 2004 Wiley-Liss, Inc.
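To make the model selection idea concrete, the following is a minimal sketch of model averaging over SNP main effects and one pairwise phase term. It is not the authors' procedure: it assumes a quantitative trait fitted by ordinary least squares, approximates posterior model probabilities with BIC weights under a uniform prior over models, and simulates haplotypes directly so that phase is known, whereas the abstract describes a genotype-level analysis. The names `bic_linear`, `model_average`, `snp1`, `snp2`, and `phase12`, the coding of the phase term as the number of haplotypes carrying both variant alleles, and all simulation settings are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def bic_linear(X, y):
    """BIC of a Gaussian least-squares fit; X must already contain the intercept."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + k * np.log(n)


def model_average(terms, y):
    """Fit every subset of candidate terms and weight models by exp(-BIC / 2).

    Assumption: a uniform prior over models, so the normalized BIC weights
    serve as approximate posterior model probabilities.
    """
    names = list(terms)
    n = len(y)
    models, bics = [], []
    for r in range(len(names) + 1):
        for subset in itertools.combinations(names, r):
            cols = [np.ones(n)] + [terms[t] for t in subset]
            models.append(subset)
            bics.append(bic_linear(np.column_stack(cols), y))
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))  # subtract the minimum for stability
    weights /= weights.sum()
    # Inclusion probability of a term: total weight of the models containing it.
    inclusion = {
        t: float(sum(w for m, w in zip(models, weights) if t in m)) for t in names
    }
    return models, weights, inclusion


# Hypothetical simulation: two biallelic SNPs on two chromosomes per person.
rng = np.random.default_rng(0)
n = 1000
hap = rng.integers(0, 2, size=(n, 2, 2))  # axes: person, chromosome, SNP
g1 = hap[:, :, 0].sum(axis=1)             # variant-allele count at SNP 1
g2 = hap[:, :, 1].sum(axis=1)             # variant-allele count at SNP 2
# Assumed phase coding: haplotypes carrying the variant allele at both SNPs.
cis = (hap[:, :, 0] * hap[:, :, 1]).sum(axis=1)
y = 1.0 * cis + rng.normal(size=n)        # trait driven by the phase term only

terms = {"snp1": g1, "snp2": g2, "phase12": cis}
models, weights, inclusion = model_average(terms, y)

for name, prob in inclusion.items():
    print(f"{name}: inclusion probability {prob:.3f}")
best = models[int(np.argmax(weights))]
print("highest-weight model:", best)
```

In this toy setting the trait depends only on the phase term, so most of the weight should fall on models that contain `phase12`. Exhaustive enumeration is practical only for a handful of terms; with many SNPs and all pairwise phase terms the model space grows exponentially, and some form of stochastic search over models would be needed instead.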