Optimal designs for two-stage genome-wide association studies

Authors

  • Andrew D. Skol,

    Corresponding author
    1. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan
    2. Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, Illinois
    • Department of Medicine, Section of Genetic Medicine, University of Chicago, 5841 South Maryland Avenue, W611A – MC6091, Chicago, Illinois 60637
  • Laura J. Scott,

    1. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan
  • Gonçalo R. Abecasis,

    1. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan
  • Michael Boehnke

    1. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan

Abstract

Genome-wide association (GWA) studies require genotyping hundreds of thousands of markers on thousands of subjects, and are expensive at current genotyping costs. To conserve resources, many GWA studies are adopting a staged design in which a proportion of the available samples is genotyped on all markers in stage 1, and a proportion of these markers is genotyped on the remaining samples in stage 2. We describe a strategy for designing cost-effective two-stage GWA studies. Our strategy preserves much of the power of the corresponding one-stage design and minimizes the genotyping cost of the study while allowing for differences in per genotype cost between stages 1 and 2. We show that the ratio of stage 2 to stage 1 per genotype cost can strongly influence both the optimal design and the genotyping cost of the study. Increasing the stage 2 per genotype cost shifts more of the genotyping and study cost to stage 1, and increases the cost of the study. This higher cost can be partially mitigated by adopting a design with reduced power while preserving the false positive rate, or by increasing the false positive rate while preserving power. For example, reducing the power preserved in the two-stage design from 99% to 95% of that of the one-stage design decreases the two-stage study cost by ∼15%. Alternatively, the same cost savings can be had by relaxing the false positive rate by 2.5-fold, for example from 1/300,000 to 2.5/300,000, while retaining the same power. Genet. Epidemiol. 2007. © 2007 Wiley-Liss, Inc.
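The cost side of the two-stage design described in the abstract can be illustrated with a small sketch. It assumes a simple accounting in which genotyping cost is the number of genotypes times a per genotype cost, and expresses the two-stage cost as a fraction of the one-stage cost. The function name and the parameter names `pi_samples`, `pi_markers`, and `cost_ratio` are hypothetical labels introduced here, not notation taken from the abstract, and the numeric inputs are illustrative only. The sketch does not compute power or search for the optimal design.

```python
def relative_two_stage_cost(pi_samples, pi_markers, cost_ratio):
    """Two-stage genotyping cost as a fraction of the one-stage cost.

    Assumed (hypothetical) parameters:
      pi_samples -- proportion of samples genotyped on all markers in stage 1
      pi_markers -- proportion of markers carried forward to stage 2
      cost_ratio -- stage 2 per genotype cost divided by stage 1 per genotype cost
    """
    if not (0.0 <= pi_samples <= 1.0 and 0.0 <= pi_markers <= 1.0):
        raise ValueError("proportions must lie in [0, 1]")
    if cost_ratio < 0.0:
        raise ValueError("cost_ratio must be non-negative")

    # Stage 1: a fraction of the samples, all markers, at the stage 1 price.
    stage1 = pi_samples
    # Stage 2: the remaining samples, a fraction of the markers,
    # at cost_ratio times the stage 1 price.
    stage2 = (1.0 - pi_samples) * pi_markers * cost_ratio
    return stage1 + stage2


if __name__ == "__main__":
    # Illustrative inputs only: 30% of samples in stage 1, 1% of markers followed up.
    for ratio in (1.0, 10.0, 20.0):
        cost = relative_two_stage_cost(0.30, 0.01, ratio)
        print(f"cost ratio {ratio:>4}: {cost:.3f} of the one-stage cost")
```

For a fixed design, the relative cost grows linearly with the stage 2 to stage 1 cost ratio. Choosing the proportions that minimize cost subject to a power constraint would additionally require the power calculation, which is not sketched here.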
