Multiple Testing in the Context of Haplotype Analysis Revisited: Application to Case-Control Data

Authors


Corresponding author: Dr. T. Becker, Institute for Medical Biometry, Informatics and Epidemiology, University of Bonn, Sigmund-Freud-Str. 25, D-53105 Bonn (Germany), Phone: ++49-228-287-5564, Fax: ++49-228-287-5854.
E-mail: becker@imbie.meb.uni-bonn.de

Summary

We recently presented a testing procedure for family data that accounts for the multiple testing problem induced by the enormous number of different marker combinations that can be analyzed in a set of tightly linked markers. Most methods of haplotype-based association analysis already require simulations to obtain an uncorrected P value for a specific marker combination. As shown before, it is nevertheless not necessary to carry out nested simulations to obtain a global P value that properly corrects for the multiple testing of different marker combinations without neglecting the dependency of the tests. We have now implemented this approach for case-control data in our program FAMHAP, as this data structure currently plays a dominant role in the field. We consider different ways to deal with phase ambiguities and two different statistical tests for the underlying single marker combinations to obtain uncorrected P values. One test statistic is chi-square based; the other is a haplotype trend regression. The performance of these tests in the multiple testing situation is investigated in a large simulation study. For all suggested test statistics, our global P values yield a considerable gain in power over Bonferroni-corrected P values. Good power was obtained both with the haplotype trend regression approach and with the simpler chi-square based test. Furthermore, we conclude that the better strategy for dealing with phase ambiguities is to assign to each individual its list of weighted haplotype explanations rather than only its most likely haplotype explanation. Finally, we demonstrate the usefulness of our approach with a real data example.
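The idea of obtaining a global P value without nested simulations can be illustrated with a minimal Python sketch. It is not the FAMHAP implementation; it assumes a generic single-layer min-P scheme in which the same set of permutation replicates serves both as the reference distribution for each uncorrected P value and as the reference distribution for the minimum P value. The function name `global_min_p` and the input layout (row 0 holds the observed statistics, every further row holds one replicate, one column per marker combination, larger values being more extreme) are assumptions made for illustration only.

```python
def global_min_p(stats):
    """Single-layer min-P correction (illustrative sketch, not FAMHAP code).

    stats[0]  : observed test statistics, one per marker combination
    stats[1:] : statistics from permutation replicates (hypothetical input)
    Returns (smallest uncorrected P value of the observed data, global P value).
    """
    n_rows = len(stats)
    n_tests = len(stats[0])

    # Uncorrected P value of every row for every test, using all rows
    # (observed data plus replicates) as the reference distribution.
    pvals = [[0.0] * n_tests for _ in range(n_rows)]
    for k in range(n_tests):
        column = [row[k] for row in stats]
        for r in range(n_rows):
            at_least_as_extreme = sum(1 for v in column if v >= column[r])
            pvals[r][k] = at_least_as_extreme / n_rows

    # Smallest uncorrected P value per row, taken over all marker combinations.
    min_p = [min(row) for row in pvals]

    # Global P value: share of rows whose minimum P is as small as the observed one.
    global_p = sum(1 for m in min_p if m <= min_p[0]) / n_rows
    return min_p[0], global_p


# Toy input: 1 observed row, 3 replicates, 2 marker combinations.
toy = [[10, 1], [1, 2], [2, 3], [3, 4]]
best_p, corrected_p = global_min_p(toy)
```

Because every replicate is reused in both steps, no second layer of simulations is needed per replicate, and the dependency between tests is retained because each row keeps the statistics of all marker combinations together.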
