Detecting false-positive signals in exome sequencing

Authors

  • Karin V. Fuentes Fajardo,

    1. NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, Maryland
  • David Adams,

    1. NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, Maryland
    2. Medical Genetics Branch, NHGRI, NIH, Bethesda, Maryland
  • NISC Comparative Sequencing Program,

    1. NIH Intramural Sequencing Center, NIH, Bethesda, Maryland
  • Christopher E. Mason,

    1. Department of Physiology and Biophysics, Weill Medical College, Cornell University, New York, New York
    2. HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Medical College, Cornell University, New York, New York
  • Murat Sincan,

    1. Medical Genetics Branch, NHGRI, NIH, Bethesda, Maryland
  • Cynthia Tifft,

    1. NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, Maryland
    2. Medical Genetics Branch, NHGRI, NIH, Bethesda, Maryland
  • Camilo Toro,

    1. NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, Maryland
  • Cornelius F Boerkoel,

    1. NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, Maryland
  • William Gahl,

    1. NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, Maryland
    2. Medical Genetics Branch, NHGRI, NIH, Bethesda, Maryland
    3. Office of the Clinical Director, NHGRI, NIH, Bethesda, Maryland
  • Thomas Markello

    Corresponding author
    1. Office of the Clinical Director, NHGRI, NIH, Bethesda, Maryland
    • Thomas C Markello, Medical Genetics Branch, NIH/NHGRI, 10 Center Drive, Building 10/10C103, Bethesda, MD 20892

  • For the Focus on the NIH Undiagnosed Diseases Program

Abstract

Disease gene discovery has been transformed by affordable sequencing of exomes and genomes. Identification of disease-causing mutations requires sifting through a large number of sequence variants. A subset of these variants is unlikely to contain good candidates for disease causation based on one or more of the following criteria: (1) being located in genomic regions known to be highly polymorphic, (2) having characteristics suggesting assembly misalignment, and/or (3) being labeled as variants based on misleading reference genome information. We analyzed exome sequence data from 118 individuals in 29 families seen in the NIH Undiagnosed Diseases Program (UDP) to create lists of variants and genes with these characteristics. Specifically, we identified several groups of genes and variant positions that are candidates for provisional exclusion during exome analysis, including 23,389 positions with excess heterozygosity suggestive of alignment errors and 1,009 positions in which the hg18 human genome reference sequence appeared to contain a minor allele. Exclusion of such variants, which we provide in supplemental lists, will likely enhance identification of disease-causing mutations using exome sequence data. Hum Mutat 33:609–613, 2012. © 2012 Wiley Periodicals, Inc.
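
As an illustration of two of the signals named in the abstract (excess heterozygosity and a reference sequence that carries the minor allele), the following is a minimal Python sketch. It is not the authors' method: the abstract does not state the tests or thresholds used. The `GenotypeCounts` type, the function names, and the `0.3` cutoff are hypothetical, and the comparison against Hardy-Weinberg expectation is an assumption that is only approximate in a cohort of related individuals.

```python
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    """Genotype counts at one position across a cohort (hypothetical type)."""
    hom_ref: int  # individuals homozygous for the reference allele
    het: int      # heterozygous individuals
    hom_alt: int  # individuals homozygous for the non-reference allele


def alt_allele_frequency(counts: GenotypeCounts) -> float:
    """Frequency of the non-reference allele among all called alleles."""
    n = counts.hom_ref + counts.het + counts.hom_alt
    if n == 0:
        return 0.0
    return (counts.het + 2 * counts.hom_alt) / (2 * n)


def heterozygosity_excess(counts: GenotypeCounts) -> float:
    """Observed minus expected heterozygote fraction.

    The expected fraction is 2pq under Hardy-Weinberg proportions
    (an assumption of this sketch, not a claim about the paper).
    """
    n = counts.hom_ref + counts.het + counts.hom_alt
    if n == 0:
        return 0.0
    q = alt_allele_frequency(counts)
    p = 1.0 - q
    return counts.het / n - 2.0 * p * q


def has_excess_heterozygosity(counts: GenotypeCounts, cutoff: float = 0.3) -> bool:
    """Flag a position whose heterozygote excess exceeds a hypothetical cutoff."""
    return heterozygosity_excess(counts) > cutoff


def reference_is_minor_allele(counts: GenotypeCounts) -> bool:
    """True when the non-reference allele is the more common one in the cohort."""
    return alt_allele_frequency(counts) > 0.5


if __name__ == "__main__":
    # Every individual heterozygous: a pattern that can arise when reads
    # from two similar loci are aligned to a single position.
    all_het = GenotypeCounts(hom_ref=0, het=118, hom_alt=0)
    print(has_excess_heterozygosity(all_het))

    # Nearly everyone carries the non-reference allele.
    mostly_alt = GenotypeCounts(hom_ref=2, het=10, hom_alt=106)
    print(reference_is_minor_allele(mostly_alt))
```

Positions flagged this way would be candidates for provisional exclusion rather than definitive artifacts; a real pipeline would also need to handle missing genotypes, sex chromosomes, and family structure.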
