For the Focus on the NIH Undiagnosed Diseases Program
Methods
Detecting false-positive signals in exome sequencing
Article first published online: 5 MAR 2012
DOI: 10.1002/humu.22033
© 2012 Wiley Periodicals, Inc.
Issue

Human Mutation
Special Issue: Focus on the NIH Undiagnosed Diseases Program
Volume 33, Issue 4, pages 609–613, April 2012
Additional Information
How to Cite
Fuentes Fajardo, K. V., Adams, D., NISC Comparative Sequencing Program, Mason, C. E., Sincan, M., Tifft, C., Toro, C., Boerkoel, C. F., Gahl, W. and Markello, T. (2012), Detecting false-positive signals in exome sequencing. Hum. Mutat., 33: 609–613. doi: 10.1002/humu.22033
Publication History
- Issue published online: 12 MAR 2012
- Accepted manuscript online: 31 JAN 2012 12:00AM EST
- Manuscript Accepted: 2 DEC 2011
- Manuscript Received: 11 AUG 2011
Funded by
- NHGRI intramural funding; Undiagnosed Diseases Program (UDP) through the Office of the Director, NIH
Keywords:
- Exome sequencing
- inherited disease
- false positives
- next generation sequencing
- genomics
- Illumina
- sequencing errors
- alignment errors
- WES
- SureSelect Human All Exon
Abstract
Disease gene discovery has been transformed by affordable sequencing of exomes and genomes. Identification of disease-causing mutations requires sifting through a large number of sequence variants. A subset of the variants are unlikely to be good candidates for disease causation based on one or more of the following criteria: (1) being located in genomic regions known to be highly polymorphic, (2) having characteristics suggesting assembly misalignment, and/or (3) being labeled as variants based on misleading reference genome information. We analyzed exome sequence data from 118 individuals in 29 families seen in the NIH Undiagnosed Diseases Program (UDP) to create lists of variants and genes with these characteristics. Specifically, we identified several groups of genes that are candidates for provisional exclusion during exome analysis: 23,389 positions with excess heterozygosity suggestive of alignment errors and 1,009 positions in which the hg18 human genome reference sequence appeared to contain a minor allele. Exclusion of such variants, which we provide in supplemental lists, will likely enhance identification of disease-causing mutations using exome sequence data. Hum Mutat 33:609–613, 2012. © 2012 Wiley Periodicals, Inc.
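The two position-level criteria named in the abstract (excess heterozygosity, and a reference sequence that carries the minor allele) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `GenotypeCounts` type, both function names, and the thresholds (`min_ratio=1.5`, `min_alt_freq=0.5`) are hypothetical. The Hardy-Weinberg expectation used here is also only a rough guide for a cohort of related individuals.

```python
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    """Cohort genotype counts at one position (hypothetical container)."""
    hom_ref: int  # individuals homozygous for the reference allele
    het: int      # heterozygous individuals
    hom_alt: int  # individuals homozygous for the non-reference allele

    @property
    def total(self) -> int:
        return self.hom_ref + self.het + self.hom_alt

    @property
    def alt_freq(self) -> float:
        # Each individual carries two alleles at an autosomal position.
        if self.total == 0:
            return 0.0
        return (self.het + 2 * self.hom_alt) / (2 * self.total)


def excess_heterozygosity(c: GenotypeCounts, min_ratio: float = 1.5) -> bool:
    """Flag positions with many more heterozygotes than 2pq*N predicts.

    min_ratio is an illustrative cutoff, not a value from the article.
    """
    q = c.alt_freq
    expected_het = 2 * q * (1 - q) * c.total
    if expected_het == 0:
        return False
    return c.het / expected_het >= min_ratio


def reference_is_minor_allele(c: GenotypeCounts, min_alt_freq: float = 0.5) -> bool:
    """Flag positions where the non-reference allele is the common one."""
    return c.alt_freq > min_alt_freq


if __name__ == "__main__":
    # Every individual heterozygous: typical of reads from paralogous
    # regions being aligned to the same locus.
    print(excess_heterozygosity(GenotypeCounts(0, 118, 0)))
    # Nearly everyone homozygous non-reference: the reference likely
    # carries the minor allele.
    print(reference_is_minor_allele(GenotypeCounts(2, 16, 100)))
```

Positions flagged by either check would be candidates for provisional exclusion rather than outright deletion, in keeping with the abstract's wording.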
