Original Article
Comparison of statistical tests for disease association with rare variants
Article first published online: 18 JUL 2011
DOI: 10.1002/gepi.20609
© 2011 Wiley Periodicals, Inc.
Additional Information
How to Cite
Basu, S. and Pan, W. (2011), Comparison of statistical tests for disease association with rare variants. Genet. Epidemiol., 35: 606–619. doi: 10.1002/gepi.20609
Publication History
- Issue published online: 17 OCT 2011
- Manuscript Accepted: 3 JUN 2011
- Manuscript Revised: 23 MAR 2011
- Manuscript Received: 30 NOV 2010
Funded by
- NIH. Grant Numbers: R21DK089351, R01HL65462, R01HL105397
Keywords:
- C-alpha test
- kernel machine regression
- logistic regression
- model selection
- permutation
- pooled association tests
- random-effects models
- SSU test
- Sum test
- statistical power
Abstract
In anticipation of the availability of next-generation sequencing data, there is increasing interest in investigating associations between complex traits and rare variants (RVs). Because RVs have low frequencies, common wisdom suggests that existing statistical tests developed for common variants (CVs) might not work well for them. This has motivated the recent development of several new tests for analyzing RVs, most of which are based on the idea of pooling/collapsing RVs. However, there is a lack of evaluations of, and thus guidance on the use of, existing tests. Here we provide a comprehensive comparison of various statistical tests using simulated data. We consider both independent and correlated rare mutations, and representative tests for both CVs and RVs. As expected, if there are no or few non-causal (i.e. neutral or non-associated) RVs in a locus of interest, and the effects of causal RVs on the trait are all (or mostly) in the same direction (i.e. either protective or deleterious, but not both), then the simple pooled association tests (without selecting RVs and their association directions) and a new test called kernel-based adaptive clustering (KBAC) perform similarly and are the most powerful. KBAC is more robust than the simple pooled association tests in the presence of non-causal RVs. However, as the number of non-causal CVs increases and/or in the presence of opposite association directions, the winners are two methods originally proposed for CVs and a new test, the C-alpha test, proposed for RVs; each of the three can be regarded as testing a variance component in a random-effects model. Interestingly, several methods based on sequential model selection (i.e. selecting causal RVs and their association directions), including two new methods proposed here, perform robustly and often have statistical power between those of the above two classes. Genet. Epidemiol. 35:606-619, 2011. © 2011 Wiley Periodicals, Inc.
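To make the idea of pooling/collapsing RVs concrete, the following is a minimal sketch of one simple collapsing test with a permutation p-value. It is an illustration under stated assumptions, not the implementation of any specific test compared in the article: the function names (`collapse`, `carrier_rate_diff`, `pooled_permutation_test`), the choice of statistic (difference in carrier proportion between cases and controls), and the toy data layout (one row of minor-allele counts per subject, disease status coded 1 for case and 0 for control) are all hypothetical.

```python
import random


def collapse(genotypes):
    """Collapse each subject's RV genotypes into a single carrier indicator.

    Returns 1 if the subject carries at least one minor allele at any
    variant in the locus, else 0.
    """
    return [1 if any(g > 0 for g in row) else 0 for row in genotypes]


def carrier_rate_diff(carrier, status):
    """Carrier proportion in cases minus carrier proportion in controls."""
    cases = [c for c, s in zip(carrier, status) if s == 1]
    controls = [c for c, s in zip(carrier, status) if s == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def pooled_permutation_test(genotypes, status, n_perm=1000, seed=0):
    """Two-sided permutation test on the collapsed carrier indicator.

    Disease labels are shuffled while genotypes stay fixed, so the numbers
    of cases and controls are preserved in every permutation.
    """
    rng = random.Random(seed)
    carrier = collapse(genotypes)
    observed = abs(carrier_rate_diff(carrier, status))
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(carrier_rate_diff(carrier, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Because this sketch pools all variants without selecting them or their association directions, it illustrates why such a statistic can lose power when many pooled RVs are non-causal or when causal effects point in opposite directions: carriers of neutral or opposite-effect variants dilute the case-control difference.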

