Powerful multi-marker association tests: unifying genomic distance-based regression and logistic regression



To detect genetic association with common and complex diseases, many statistical tests have been proposed for candidate gene or genome-wide association studies with the case-control design. Owing to linkage disequilibrium (LD), multi-marker association tests can gain power over single-marker tests with a Bonferroni multiple testing adjustment. Among the many existing multi-marker association tests, most aim to detect only one of many possible aspects of distributional differences between the genotypes of cases and controls, such as allele frequency differences, while a few newer ones target two or three aspects; all of these can be implemented in logistic regression. In contrast to logistic regression, a genomic distance-based regression (GDBR) approach aims to detect some high-order genotypic differences between cases and controls. A recent study has confirmed the high power of GDBR tests. At present, the popular logistic regression and the emerging GDBR approaches are completely unrelated; for example, one has to choose between the two. In this article, we reformulate GDBR as logistic regression, opening an avenue to constructing other powerful tests while overcoming some limitations of GDBR. For example, asymptotic distributions can replace time-consuming permutations for deriving P-values, and covariates, including gene-gene interactions, can be easily incorporated. Importantly, this reformulation facilitates combining GDBR with other existing methods in a unified framework of logistic regression. In particular, we show that Fisher's P-value combining method can boost statistical power by incorporating information from allele frequencies, Hardy–Weinberg disequilibrium, LD patterns, and other higher-order interactions among multiple markers as captured by GDBR. Genet. Epidemiol. 34:680–688, 2010. © 2010 Wiley-Liss, Inc.
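
As a rough illustration of the Fisher combining step mentioned above, the following minimal sketch computes the textbook statistic X = -2 Σ ln p_i and refers it to a chi-square distribution with 2k degrees of freedom. This is not necessarily the procedure used in the article: the function name `fisher_combine` and the example P-values are hypothetical, and the chi-square reference distribution assumes the component tests are independent, which may not hold for tests computed on the same genotype data (a permutation-based reference could be used instead).

```python
import math


def fisher_combine(p_values):
    """Combine P-values with Fisher's method, assuming independent tests.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i).
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")

    k = len(p_values)
    # Half of Fisher's statistic; the full statistic is chi-square
    # with 2k degrees of freedom under the null (independence assumed).
    half_stat = -sum(math.log(p) for p in p_values)

    # Closed-form upper tail of a chi-square with even df 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, tail = 1.0, 0.0
    for i in range(k):
        tail += term
        term *= half_stat / (i + 1)

    return 2.0 * half_stat, math.exp(-half_stat) * tail


if __name__ == "__main__":
    # Hypothetical P-values from four component tests (illustrative only):
    # allele frequency, Hardy-Weinberg disequilibrium, LD pattern, GDBR.
    component_p = [0.04, 0.20, 0.11, 0.03]
    stat, combined = fisher_combine(component_p)
    print(f"Fisher statistic = {stat:.3f}, combined P = {combined:.4f}")
```

The closed-form tail keeps the sketch dependency-free; where SciPy is available, `scipy.stats.combine_pvalues` with `method="fisher"` offers the same combination under the same independence assumption.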