UNIT 13.23 PepArML: A Meta-Search Peptide Identification Platform for Tandem Mass Spectra

  1. Nathan J. Edwards

Published Online: 12 DEC 2013

DOI: 10.1002/0471250953.bi1323s44

Current Protocols in Bioinformatics

How to Cite

Edwards, N. J. 2013. PepArML: A Meta-Search Peptide Identification Platform for Tandem Mass Spectra. Current Protocols in Bioinformatics. 13.23.1–13.23.23.

Author Information

  1. Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, D.C.

Abstract

The PepArML meta-search peptide identification platform for tandem mass spectra provides a unified search interface to seven search engines; a robust cluster, grid, and cloud computing scheduler for large-scale searches; and an unsupervised, model-free, machine-learning-based result combiner, which selects the best peptide identification for each spectrum, estimates false-discovery rates (FDRs), and outputs identifications in pepXML format. The meta-search platform supports Mascot; Tandem with native, k-score, and s-score scoring; OMSSA; MyriMatch; and InsPecT with MS-GF spectral probability scores. It reformats spectral data and constructs search configurations for each search engine on the fly. The combiner selects the best peptide identification for each spectrum based on search engine results and features that model enzymatic digestion, retention time, precursor isotope clusters, mass accuracy, and proteotypic peptide properties, requiring no prior knowledge of feature utility or weighting. The PepArML meta-search peptide identification platform often identifies two to three times more spectra than individual search engines at 10% FDR. Curr. Protoc. Bioinform. 44:13.23.1-13.23.23. © 2013 by John Wiley & Sons, Inc.
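
As a rough illustration of the two combiner outputs named above (one best identification per spectrum, plus an FDR estimate), the following Python sketch may help. It is not PepArML's implementation. It assumes that every candidate identification already carries a comparable confidence value, which stands in for the output of the machine-learning combiner, and that decoy hits are labeled so that FDR can be approximated as decoys divided by targets. The abstract states neither of these. The names `Candidate`, `best_per_spectrum`, and `estimate_fdr` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple


class Candidate(NamedTuple):
    """One peptide identification reported by one search engine (hypothetical record)."""
    spectrum_id: str
    engine: str
    peptide: str
    confidence: float  # ASSUMPTION: comparable across engines, higher is better
    is_decoy: bool     # ASSUMPTION: decoy hits are labeled


def best_per_spectrum(candidates: Iterable[Candidate]) -> Dict[str, Candidate]:
    """Keep the highest-confidence candidate for each spectrum."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        current = best.get(cand.spectrum_id)
        if current is None or cand.confidence > current.confidence:
            best[cand.spectrum_id] = cand
    return best


def estimate_fdr(selected: Iterable[Candidate], threshold: float) -> float:
    """Approximate FDR as decoys / targets among identifications at or above threshold."""
    accepted = [c for c in selected if c.confidence >= threshold]
    decoys = sum(1 for c in accepted if c.is_decoy)
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0


# Made-up example data: three spectra searched by several engines.
candidates = [
    Candidate("s1", "mascot", "PEPTIDEK", 0.91, False),
    Candidate("s1", "omssa", "PEPTIDER", 0.40, False),
    Candidate("s2", "tandem", "DECOYSEQK", 0.55, True),
    Candidate("s2", "myrimatch", "ELVISK", 0.35, False),
    Candidate("s3", "omssa", "LIVESK", 0.80, False),
]
best = best_per_spectrum(candidates)
fdr_loose = estimate_fdr(best.values(), threshold=0.5)
fdr_strict = estimate_fdr(best.values(), threshold=0.6)
```

In this toy data, lowering the threshold admits the decoy hit for spectrum `s2`, so the estimated FDR rises. A real combiner would derive the confidence values from the search engine results and the features listed in the abstract.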

Keywords:

  • proteomics
  • tandem mass spectra
  • machine learning
  • cloud computing