Relationship between genomic distance-based regression and kernel machine regression for multi-marker association testing

Authors

  • Wei Pan

    Corresponding author
    1. Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota
    • Division of Biostatistics, MMC 303, School of Public Health, University of Minnesota, Minneapolis, MN 55455–0392

Abstract

To detect genetic association with common and complex diseases, two powerful yet quite different multimarker association tests have been proposed: genomic distance-based regression (GDBR) (Wessel and Schork [2006] Am J Hum Genet 79:821–833) and kernel machine regression (KMR) (Kwee et al. [2008] Am J Hum Genet 82:386–397; Wu et al. [2010] Am J Hum Genet 86:929–942). GDBR relates a multimarker similarity metric for a group of subjects to variation in their trait values, while KMR is based on nonparametric estimates of the effects of the multiple markers on the trait through a kernel function or kernel matrix. Because the two approaches are both powerful and general yet appear quite different, it is important to understand how they are related. In this report, we show that, when there are no other covariates, there is a striking correspondence between the two approaches for a quantitative or a binary trait: if the same positive semi-definite matrix is used as the centered similarity matrix in GDBR and as the kernel matrix in KMR, the F-test statistic in GDBR and the score test statistic in KMR are equal (up to some ignorable constants). The result is based on the connections of both methods to linear or logistic (random-effects) regression models. Genet. Epidemiol. 35:211–216, 2011. © 2011 Wiley-Liss, Inc.
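
The correspondence described above can be checked numerically. The sketch below is illustrative only and rests on assumptions not stated in the abstract: the GDBR pseudo-F statistic is taken in the form tr(HGH) / tr((I − H)G(I − H)) with H the projection onto the centered trait, the KMR score statistic is taken as the quadratic form (y − ȳ)'K(y − ȳ), degrees-of-freedom constants are dropped, and a centered linear kernel on simulated genotypes stands in for the shared positive semi-definite matrix. All function and variable names are hypothetical. Under these assumptions F is a monotone function of the score statistic Q, so the two statistics order permuted data sets identically.

```python
import numpy as np

def center(K):
    """Double-center a similarity matrix: J K J with J = I - 11'/n."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

def gdbr_f(G, y):
    """Pseudo-F statistic (assumed form, constants dropped)."""
    yc = y - y.mean()
    H = np.outer(yc, yc) / (yc @ yc)   # projection onto the centered trait
    R = np.eye(len(y)) - H             # residual projection
    return np.trace(H @ G @ H) / np.trace(R @ G @ R)

def kmr_score(K, y):
    """Score-type statistic (assumed form): quadratic form in the centered trait."""
    yc = y - y.mean()
    return yc @ K @ yc

rng = np.random.default_rng(0)
n, p = 50, 8
X = rng.integers(0, 3, size=(n, p)).astype(float)  # simulated genotypes coded 0/1/2
y = rng.normal(size=n)                             # simulated quantitative trait

G = center(X @ X.T)  # one PSD matrix used as both similarity and kernel matrix

F = gdbr_f(G, y)
Q = kmr_score(G, y)

# Closed-form link: tr(HGH) = Q / s and tr(RGR) = tr(G) - Q / s, with s = yc'yc.
s = ((y - y.mean()) ** 2).sum()
F_from_Q = (Q / s) / (np.trace(G) - Q / s)

# Permuting y leaves s and tr(G) unchanged, so F and Q rank permutations alike.
perms = [rng.permutation(y) for _ in range(100)]
F_perm = np.array([gdbr_f(G, yp) for yp in perms])
Q_perm = np.array([kmr_score(G, yp) for yp in perms])
same_order = np.array_equal(np.argsort(F_perm), np.argsort(Q_perm))
```

Because s and tr(G) are fixed under permutation of the trait, a permutation p-value computed from F would coincide with one computed from Q in this setting, which is one way to read "equal up to some ignorable constants".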
