Rapid Multiplexed Genotyping of Simple Tandem Repeats using Capture and High-Throughput Sequencing

Authors

  • Audrey Guilmatre

    1. Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
    Current affiliation:
    1. Institut Pasteur, Human Genetics and Cognitive Functions Unit, Paris, France
  • Gareth Highnam

    1. Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia
  • Christelle Borel

    1. Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
    Current affiliation:
    1. Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
  • David Mittelman

    1. Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia
    2. Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia
  • Andrew J. Sharp

    Corresponding author
    1. Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
    • Address for correspondence: Andrew Sharp, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, 1470 Madison Avenue, Room S8-116, Box 1498, New York, NY 10029. Email: andrew.sharp@mssm.edu


  • Contract Grant Sponsors: National Institutes of Health (DA033660, HG006696, and HD073731); Alzheimer's Association (69983).

  • Communicated by Graham R. Taylor

ABSTRACT

Although simple tandem repeats (STRs) comprise ∼2% of the human genome and represent an important source of polymorphism, this class of variation remains understudied. We have developed a cost-effective strategy for performing targeted enrichment of STR regions that utilizes capture probes targeting the flanking sequences of STR loci, enabling specific capture of DNA fragments containing STRs for subsequent high-throughput sequencing. Utilizing a capture design targeting 6,243 STR loci <94 bp and multiplexing eight individuals in a single Illumina HiSeq2000 sequencing lane, we were able to call genotypes in at least one individual for 67.5% of the targeted STRs. We observed a strong relationship between (G+C) content and genotyping rate. STRs with moderate (G+C) content were recovered with >90% success rate, whereas only 12% of STRs with ≥80% (G+C) were genotyped in our assay. Analysis of a parent-offspring trio, complete hydatidiform mole samples, repeat analyses of the same individual, and Sanger sequencing-based validation indicated genotyping error rates between 7.6% and 12.4%. The majority of such errors involved a difference of a single repeat unit at mono- or dinucleotide repeats. Altogether, our STR capture assay represents a cost-effective method that enables multiplexed genotyping of thousands of STR loci suitable for large-scale population studies.
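
As an illustration of the (G+C) measure referred to above, the following minimal sketch computes the (G+C) fraction of a repeat sequence and flags loci at or above the 80% level mentioned in the abstract. The function names and example sequences are hypothetical and are not taken from the study's own pipeline; how the authors computed (G+C) content (for example, over the repeat alone or including flanking sequence) is not stated here, so this is only one plausible reading.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases (assumed definition)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        # No unambiguous bases (e.g. empty string or all N): report 0.0.
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def is_high_gc(seq: str, threshold: float = 0.80) -> bool:
    """True if the sequence is at or above the (G+C) threshold.

    The default of 0.80 mirrors the >=80% (G+C) class named in the abstract;
    the helper itself is hypothetical.
    """
    return gc_fraction(seq) >= threshold


if __name__ == "__main__":
    # Hypothetical example repeats, not loci from the capture design.
    for name, repeat in [("CGG x4", "CGGCGGCGGCGG"), ("CA x4", "CACACACA")]:
        print(name, round(gc_fraction(repeat), 2), is_high_gc(repeat))
```

Binning loci by this fraction and tabulating the proportion genotyped in each bin would be one simple way to examine the relationship between (G+C) content and genotyping rate.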
