This article is a US Government work and, as such, is in the public domain in the United States of America.
Genetic background comparison using distance-based regression, with applications in population stratification evaluation and adjustment†
Article first published online: 12 JAN 2009
Published 2009 Wiley-Liss, Inc.
Volume 33, Issue 5, pages 432–441, July 2009
How to Cite
Li, Q., Wacholder, S., Hunter, D. J., Hoover, R. N., Chanock, S., Thomas, G. and Yu, K. (2009), Genetic background comparison using distance-based regression, with applications in population stratification evaluation and adjustment. Genet. Epidemiol., 33: 432–441. doi: 10.1002/gepi.20396
- Issue published online: 12 JUN 2009
- Manuscript Accepted: 14 NOV 2008
- Manuscript Revised: 7 OCT 2008
- Manuscript Received: 6 JUL 2008
Keywords
- population stratification
- pseudo F statistic
- distance-based regression
Population stratification (PS) can lead to an inflated rate of false-positive findings in genome-wide association studies (GWAS). The commonly used approach of adjusting for a fixed number of principal components (PCs) could reduce power when the selected PCs are equally distributed in cases and controls, or when covariates already included in the association analyses, such as self-identified ethnicity or recruitment center, already capture the major axes of genetic heterogeneity. We propose a computationally efficient procedure, PC-Finder, to identify a minimal set of PCs that still permits an effective correction for PS. A general pseudo F statistic, derived from a non-parametric multivariate regression model, can be used to assess whether PS exists or has been adequately corrected by a set of selected PCs. Empirical data from two GWAS conducted as part of the Cancer Genetic Markers of Susceptibility (CGEMS) project demonstrate the application of the procedure. Furthermore, simulation studies show the power advantage of the proposed procedure in GWAS over currently used PS correction strategies, particularly when the PCs with substantial genetic variation are distributed similarly in cases and controls and therefore do not induce PS.
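To make the idea of a pseudo F statistic from distance-based regression concrete, the following is a minimal sketch of the textbook form of that statistic, applied to simulated data. It is not taken from the article and may differ from the exact statistic the authors use. The assumptions are: pairwise genetic distance is plain Euclidean distance on genotype counts, the distance matrix is Gower-centered, significance is judged by permuting the predictor, and the function names (`gower_center`, `pseudo_f`, `permutation_pvalue`) and the toy simulation are hypothetical.

```python
import numpy as np


def gower_center(dist):
    """Gower-center a pairwise distance matrix: G = C (-D^2 / 2) C."""
    n = dist.shape[0]
    a = -0.5 * dist ** 2
    c = np.eye(n) - np.ones((n, n)) / n
    return c @ a @ c


def pseudo_f(g, x):
    """Pseudo F for predictors x given the centered matrix g."""
    n = g.shape[0]
    x = np.asarray(x, dtype=float).reshape(n, -1)
    design = np.column_stack([np.ones(n), x])   # intercept + predictors
    m = design.shape[1]                         # assumes full column rank
    hat = design @ np.linalg.pinv(design)       # projection onto the design
    resid = np.eye(n) - hat
    explained = np.trace(hat @ g @ hat) / (m - 1)
    unexplained = np.trace(resid @ g @ resid) / (n - m)
    return explained / unexplained


def permutation_pvalue(g, x, n_perm=199, seed=0):
    """Permutation p-value: shuffle the predictor rows, recompute pseudo F."""
    rng = np.random.default_rng(seed)
    n = g.shape[0]
    x = np.asarray(x, dtype=float).reshape(n, -1)
    observed = pseudo_f(g, x)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        if pseudo_f(g, x[perm]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (an assumption for illustration): two subpopulations with
# different allele frequencies, and case status fully confounded with
# subpopulation, so stratification is present by construction.
rng = np.random.default_rng(1)
n_per_pop, n_snps = 30, 200
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, n_snps), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_pop, n_snps)),
    rng.binomial(2, freq_b, size=(n_per_pop, n_snps)),
]).astype(float)
status = np.repeat([0.0, 1.0], n_per_pop)       # 0 = control, 1 = case

# Euclidean distance between every pair of individuals.
dist = np.linalg.norm(geno[:, None, :] - geno[None, :, :], axis=2)
g_matrix = gower_center(dist)
f_obs, p_val = permutation_pvalue(g_matrix, status)
print(f"pseudo F = {f_obs:.2f}, permutation p = {p_val:.3f}")
```

In this sketch a small p-value suggests that genetic background differs between cases and controls. To check whether a chosen set of PCs has absorbed that difference, one possible variation is to recompute the distances after regressing the selected PCs out of the genotypes and test case status again.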