Robust and Powerful Tests for Rare Variants Using Fisher's Method to Combine Evidence of Association From Two or More Complementary Tests

Authors

  • Andriy Derkach

    1. Department of Statistics, University of Toronto, Toronto, Ontario, Canada
  • Jerry F. Lawless

    1. Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
    2. Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  • Lei Sun

    Corresponding author
    1. Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
    2. Department of Statistics, University of Toronto, Toronto, Ontario, Canada

Correspondence to: Lei Sun, Dalla Lana School of Public Health, 155 College Street, University of Toronto, Toronto, ON M5T 3M7, Canada. E-mail: sun@utstat.toronto.edu

Abstract

Many association tests have been proposed for rare variants, but the choice of a powerful test is uncertain when there is limited information on the underlying genetic model. Proposed methods use either linear statistics, which are powerful when most variants are causal and have the same direction of effect, or quadratic statistics, which are more powerful in other scenarios. To achieve robustness, it is natural to combine the evidence of association from two or more complementary tests. To this end, we consider the minimum-p and Fisher's methods of combining P-values from linear and quadratic statistics. Extensive simulation studies show that both methods are robust across models with varying proportions of causal, deleterious, and protective rare variants, allele frequencies, and effect sizes. When the majority (>75%) of the causal effects are in the same direction (deleterious or protective), Fisher's method consistently outperforms the minimum-p and the individual linear and quadratic tests, as well as the optimal sequence kernel association test, SKAT-O. When the individual test has moderate power, Fisher's test has improved power for 90% of the ∼5000 models considered, with >20% relative efficiency gain for 40% of the models. The maximum absolute power loss is 8% for the remaining 10% of the models. An application to the GAW17 quantitative trait Q2 data based on sequence data of the 1000 Genomes Project shows that, compared with linear and quadratic tests, Fisher's test has comparable power for all 13 functional genes and provides the best power for more than half of them.
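
The two combination rules named in the abstract can be illustrated with a minimal sketch. It assumes the P-values being combined are independent under the null hypothesis, which is an assumption of this sketch rather than something stated in the abstract. The function names `fisher_combine` and `min_p_combine` are hypothetical, and the input P-values are placeholders, not results from the paper.

```python
import math


def _check(pvalues):
    # Both rules need at least one P-value, each in (0, 1].
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("P-values must be non-empty and lie in (0, 1]")


def fisher_combine(pvalues):
    """Fisher's method: W = -2 * sum(log p_i) ~ chi-square with 2k df
    under the null, ASSUMING the k P-values are independent."""
    _check(pvalues)
    k = len(pvalues)
    half_w = -sum(math.log(p) for p in pvalues)  # W / 2
    # Exact chi-square survival function for even df = 2k:
    #   P(X > W) = exp(-W/2) * sum_{i=0}^{k-1} (W/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_w / i
        total += term
    return math.exp(-half_w) * total


def min_p_combine(pvalues):
    """Minimum-p method: the smallest P-value, adjusted for taking the
    minimum of k tests, ASSUMING independence under the null."""
    _check(pvalues)
    k = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** k


if __name__ == "__main__":
    # Placeholder P-values standing in for a linear and a quadratic test.
    p_linear, p_quadratic = 0.05, 0.05
    print(fisher_combine([p_linear, p_quadratic]))
    print(min_p_combine([p_linear, p_quadratic]))
```

With both placeholder P-values equal to 0.05, Fisher's rule gives a combined P-value of about 0.0175, while the minimum-p rule gives 0.0975. This shows how Fisher's method can gain when both tests carry some signal, whereas the minimum-p rule depends only on the stronger of the two.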
