SNPMeta: SNP annotation and SNP metadata collection without a reference genome

Authors

  • Thomas J. Y. Kono,

    1. Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN, USA
  • Kiran Seth,

    1. Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN, USA
  • Jesse A. Poland,

    1. Hard Winter Wheat Genetics Research Unit, USDA ARS, 2021 Claflin Road, 4008 Throckmorton Hall, Kansas State University, Manhattan, KS, USA
    2. Department of Agronomy, 2021 Claflin Road, 2004 Throckmorton Plant Sciences Center, Kansas State University, Manhattan, KS, USA
  • Peter L. Morrell

    Corresponding author
    1. Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN, USA

Abstract

The growing availability of resequencing data is greatly accelerating SNP discovery and has facilitated the development of SNP genotyping assays. This, in turn, is increasing interest in the annotation of individual SNPs. Currently, such annotations are available only through curation or through comparison with a reference genome. Many species lack a reference genome but are still important genetic models, or are significant in agricultural production or natural ecosystems. For these species, it is possible to annotate SNPs through comparison with cDNA or with data from well-annotated genes in public repositories. We present SNPMeta, a tool that gathers information about SNPs by comparison with sequences present in GenBank databases. SNPMeta is able to annotate SNPs from the contextual sequence in SNP assay designs, as well as SNPs discovered through genotyping by sequencing (GBS) approaches. However, SNPs discovered through GBS occur throughout the genome rather than only in gene space, and therefore do not annotate at high rates. SNPMeta can be used to annotate SNPs in nonmodel species or in species that lack a reference genome. Annotations generated by SNPMeta are highly concordant with annotations that would be obtained from a reference genome.
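The idea of annotating a SNP from its contextual sequence by comparison with an annotated coding sequence can be illustrated with a minimal sketch. This is not SNPMeta's implementation. It assumes a bracketed context format such as `GCTGA[A/G]GGT`, replaces the database search with an exact string match against a single in-frame coding sequence, and reports a synonymous versus nonsynonymous call as one example of an annotation; the abstract does not specify any of these details. The names `parse_context` and `annotate_snp` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def parse_context(context):
    """Split 'LEFT[A/G]RIGHT' into (left flank, (allele1, allele2), right flank).

    The bracketed format is an assumption made for this sketch.
    """
    m = re.fullmatch(r"([ACGT]*)\[([ACGT])/([ACGT])\]([ACGT]*)", context.upper())
    if m is None:
        raise ValueError("expected contextual sequence like 'ACGT[A/G]ACGT'")
    left, a1, a2, right = m.groups()
    return left, (a1, a2), right


def annotate_snp(context, cds):
    """Classify a SNP against an in-frame coding sequence (frame starts at index 0).

    Exact substring matching stands in for a real similarity search.
    Returns None when neither allele's context is found in the sequence.
    """
    left, (a1, a2), right = parse_context(context)
    cds = cds.upper()

    # Try each allele as the one carried by the reference sequence.
    for ref, alt in ((a1, a2), (a2, a1)):
        start = cds.find(left + ref + right)
        if start != -1:
            break
    else:
        return None

    pos = start + len(left)            # 0-based SNP position in the sequence
    codon_start = pos - pos % 3        # first base of the affected codon
    ref_codon = cds[codon_start:codon_start + 3]
    if len(ref_codon) < 3:
        return None                    # SNP falls in a trailing partial codon

    offset = pos - codon_start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE.get(ref_codon, "X")
    alt_aa = CODON_TABLE.get(alt_codon, "X")
    return {
        "position": pos,
        "ref_codon": ref_codon,
        "alt_codon": alt_codon,
        "ref_aa": ref_aa,
        "alt_aa": alt_aa,
        "effect": "synonymous" if ref_aa == alt_aa else "nonsynonymous",
    }


if __name__ == "__main__":
    toy_cds = "ATGGCTGAAGGTTAA"  # made-up sequence: Met-Ala-Glu-Gly-stop
    print(annotate_snp("GCTGA[A/G]GGT", toy_cds))
    print(annotate_snp("GCTG[A/C]AGGT", toy_cds))
```

The sketch also shows why SNPs outside gene space do not annotate: when the contextual sequence matches no coding sequence, the function has nothing to compare against and returns `None`.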
