Comparison of methods and sampling designs to test for association between rare variants and quantitative traits



Genome-wide association studies have succeeded in finding genetic variants associated with various phenotypes, but a large portion of the predicted genetic contribution to many traits remains unknown. One plausible explanation is that some of the missing variation is due to rare variants. The latest sequencing technology facilitates the investigation of such rare variants, but their statistical analysis remains challenging. For quantitative traits, a commonly used approach is to contrast the frequency of putatively functional rare variants between subjects in the two tails of the trait distribution. The contrast is usually performed with Fisher's exact test or a similar test. These tests are conservative because they discard trait rank information, and they are most useful under the unrealistic homogeneity assumption (i.e., that variants have similar effects). We propose, and investigate via simulations, various designs for resequencing studies and statistical methods that incorporate information about rank and predicted function and that allow for heterogeneity of effects. We propose designs that accommodate heterogeneity by sequencing both tails and the middle of the trait distribution, and novel statistical tests for trend, for heterogeneity, and for a combination of the two. The conclusions of the simulations are fourfold: (1) sequencing both tails and the middle of the trait distribution is desirable when heterogeneity is suspected, (2) trend and heterogeneity statistics should be used alongside other methods, (3) using rank information improves power over Fisher's exact test when the number of rare variants is not very large, and (4) due to high misclassification rates, incorporating current predictions of a variant's function does not improve power. Genet. Epidemiol. 35: 226-235, 2011.  © 2011 Wiley-Liss, Inc.
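As an illustration of the baseline two-tail contrast described in the abstract, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of rare-variant carriers versus non-carriers in the upper and lower tails of the trait. It is not the authors' code, and it does not implement their proposed trend or heterogeneity tests. The function name `fisher_exact_two_sided` and the carrier counts are hypothetical, chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: upper-tail and lower-tail subjects.
    Columns: carriers and non-carriers of a rare variant.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts (assumed, not taken from the paper):
# 100 subjects sequenced in each tail of the trait distribution.
upper_carriers, upper_noncarriers = 12, 88
lower_carriers, lower_noncarriers = 3, 97

p_value = fisher_exact_two_sided(
    upper_carriers, upper_noncarriers, lower_carriers, lower_noncarriers
)
print(f"Two-sided Fisher's exact p-value: {p_value:.4f}")
```

Note that the test sees only which tail a subject falls in, not the subject's rank within the trait distribution, which is the information loss the abstract refers to.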