The Impact of Improved Microarray Coverage and Larger Sample Sizes on Future Genome-Wide Association Studies

Authors

  • Karla J. Lindquist,

    1. Department of Epidemiology and Biostatistics, University of California, San Francisco, California
    2. Institute for Human Genetics, University of California, San Francisco, California
  • Eric Jorgenson,

    1. Kaiser Permanente Division of Research, Oakland, California
  • Thomas J. Hoffmann,

    1. Department of Epidemiology and Biostatistics, University of California, San Francisco, California
    2. Institute for Human Genetics, University of California, San Francisco, California
  • John S. Witte

    Corresponding author
    1. Institute for Human Genetics, University of California, San Francisco, California
    2. Department of Epidemiology and Biostatistics, University of California, San Francisco, California

  • Contract grant sponsor: National Institutes of Health; Grant numbers: R01CA88164, U01CA127298, and R25CA112355.

Correspondence to: John S. Witte, Helen Diller Family Comprehensive Cancer Center, University of California at San Francisco, 1450 3rd Street, Box 3110, San Francisco, CA 94158-9001. E-mail: jwitte@ucsf.edu

ABSTRACT

Genome-wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with complex traits. However, the genetic heritability of most of these traits remains unexplained. To help guide future studies, we address the crucial question of whether future GWAS can detect new SNP associations and explain additional heritability given the new availability of larger GWAS SNP arrays, imputation, and reduced genotyping costs. We first describe the pairwise and imputation coverage of all SNPs in the human genome by commercially available GWAS SNP arrays, using the 1000 Genomes Project as a reference. Next, we describe the findings from 6 years of GWAS of 172 chronic diseases, calculating the power to detect each reported association while taking array coverage and sample size into account. We then calculate the power to detect these SNP associations under different conditions using improved coverage and/or sample sizes. Finally, we estimate the percentages of SNP associations and heritability previously detected and detectable by future GWAS under each condition. Overall, we estimate that previous GWAS have detected less than one-fifth of all GWAS-detectable SNPs underlying chronic disease. Furthermore, increasing sample size has a much larger impact than increasing coverage on the potential of future GWAS to detect additional SNP-disease associations and heritability.
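
To illustrate how sample size and array coverage can enter a power calculation together, the following is a minimal sketch and not the authors' method. It assumes a 1-degree-of-freedom association test whose noncentrality parameter is approximately n × r² × q, where n is the sample size, r² is the linkage disequilibrium between the causal SNP and its best tag on the array (a stand-in for coverage), and q is the fraction of trait variance explained by the causal SNP. The function name `gwas_power`, its parameters, and the default genome-wide threshold of 5e-8 are illustrative assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def gwas_power(n, var_explained, r2=1.0, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    Assumption (simplified model, not taken from the paper):
    noncentrality parameter = n * r2 * var_explained.
    """
    ncp = n * r2 * var_explained
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = ncp ** 0.5
    # A 1-df noncentral chi-square is the square of a shifted standard
    # normal, so its tail probability can be written with normal CDFs.
    upper = 1 - _STD_NORMAL.cdf(z_crit - shift)
    lower = _STD_NORMAL.cdf(-z_crit - shift)
    return upper + lower


if __name__ == "__main__":
    # Hypothetical SNP explaining 0.1% of trait variance.
    q = 0.001
    print("baseline        :", gwas_power(20_000, q, r2=0.8))
    print("perfect coverage:", gwas_power(20_000, q, r2=1.0))
    print("doubled sample  :", gwas_power(40_000, q, r2=0.8))
```

Under this simplified model, n and r² enter only through their product, so a tag SNP with r² = 0.5 needs twice the sample size of the causal SNP itself to reach the same power. Because r² cannot exceed 1 while n has no such ceiling, gains from coverage are bounded in a way that gains from sample size are not.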
