An evaluation of the replicate pool method: quick estimation of genome-wide linkage peak p-values

Authors

  • Janis E. Wigginton

    Corresponding author
    1. Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, MI
    • Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI
  • Gonçalo R. Abecasis

    1. Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, MI

Abstract

The calculation of empirical p-values for genome-wide non-parametric linkage tests continues to present significant computational challenges for many complex disease mapping studies. The gold standard approach is to use gene dropping to simulate null genome scans. Unfortunately, this approach is too computationally expensive for many data sets of interest. An alternative, more efficient method for sampling null genome scans is to pre-calculate pools of family-specific statistics and then resample from these replicate pools to generate “pseudo-replicate” genome scans. In this study, we use simulations to explore properties of the replicate pool p-value estimator RP and show that it provides an excellent approximation to the traditional gene-dropping estimator with significantly less computational effort. While the computational efficiency of the replicate pool estimator is noticeable in almost all data sets, by applying the replicate pool method to several previously characterized data sets, we show that savings in computational effort can be especially significant (on the order of 10,000-fold or more) when one or more large families are analyzed. We also estimate replicate pool p-values for the schizophrenia data described by Abecasis et al. and show that RP closely approximates gene-drop p-values for all linkage peaks reported for this study. Lastly, we expand upon Song et al.'s previous work by deriving a conservative estimator of the variance for RP that can easily be computed in practical settings. We have implemented the replicate pool method along with our variance estimator in a new program called Pseudo, which is the first widely available automated implementation of the replicate pool method. Genet. Epidemiol. 30, 2006. © 2006 Wiley-Liss, Inc.
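
The resampling idea described above can be illustrated with a minimal Python sketch. It is not the Pseudo program and makes several assumptions that the abstract does not specify: each family's pool is taken to be a list of pre-computed null genome scans (one Z-score per map position), family scores are combined as an equally weighted sum divided by the square root of the number of families, and the p-value is estimated as the fraction of pseudo-replicate scans whose genome-wide maximum reaches the observed peak. The function and parameter names are hypothetical.

```python
import math
import random


def pseudo_replicate_peak(pools, rng):
    """Build one pseudo-replicate genome scan and return its maximum statistic.

    pools[f] is the replicate pool for family f: a list of null scans,
    each scan being a list of per-position Z-scores (assumed layout).
    """
    # Draw one whole pre-computed null scan per family, so that the
    # correlation between neighbouring positions within a family is kept.
    draws = [rng.choice(family_pool) for family_pool in pools]
    n_positions = len(draws[0])
    scale = math.sqrt(len(pools))
    # Assumed combination rule: equally weighted sum of family Z-scores.
    return max(
        sum(scan[pos] for scan in draws) / scale
        for pos in range(n_positions)
    )


def replicate_pool_pvalue(pools, observed_peak, n_scans, seed=0):
    """Estimate a genome-wide p-value for an observed linkage peak."""
    rng = random.Random(seed)
    hits = sum(
        pseudo_replicate_peak(pools, rng) >= observed_peak
        for _ in range(n_scans)
    )
    return hits / n_scans
```

In this sketch the expensive step, simulating null scans for each family, happens once when the pools are built; each additional pseudo-replicate costs only one random draw per family and a sum over positions, which is consistent with the savings the abstract reports when large families are present.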
