How many signal peptides are there in bacteria?
Article first published online: 4 APR 2013
Published 2013. This article is a U.S. Government work and is in the public domain in the USA
Volume 15, Issue 4, pages 983–990, April 2013
How to Cite
Ivankov, D. N., Payne, S. H., Galperin, M. Y., Bonissone, S., Pevzner, P. A. and Frishman, D. (2013), How many signal peptides are there in bacteria? Environmental Microbiology, 15: 983–990. doi: 10.1111/1462-2920.12105
- Issue published online: 4 APR 2013
- DFG International Training and Research Group RECESS (Regulation and Evolution of Cellular Systems)
- NSF. Grant Number: EF-0949047
- NIH Intramural Research Program at the National Library of Medicine
- NIAID. Grant Number: IAA-Y1-A1-8401
Over the last 5 years, proteogenomics (using mass spectrometry to identify proteins predicted from genomic sequences) has emerged as a promising approach to the high-throughput identification of protein N-termini, which remains a problem in genome annotation. Comparing the experimentally determined N-termini with those predicted by sequence analysis tools allows identification of signal peptides and therefore supports conclusions on the cytoplasmic or extracytoplasmic (periplasmic or extracellular) localization of the respective proteins. We present here the results of a proteogenomic study of the signal peptides in Escherichia coli K-12 and compare them with the available experimental data and with predictions by software tools such as SignalP and Phobius. A single proteogenomics experiment recovered more than a third of all signal peptides that had been experimentally determined during the past three decades and confirmed at least 31 additional signal peptides, mostly in known exported proteins, which had previously been predicted but not validated. Filtering the putative signal peptides by peptide length, the presence of an eight-residue hydrophobic patch, and a typical signal peptidase cleavage site proved sufficient to eliminate the false-positive hits. Surprisingly, the results of this proteogenomics study, as well as a re-analysis of the E. coli genome with the latest version of the SignalP program, show that the fraction of proteins containing signal peptides is only about 10%, or half of previous estimates.
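The three-part filter described above (peptide length, an eight-residue hydrophobic patch, and a typical signal peptidase cleavage site) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not give the length bounds, the hydrophobic residue set, or the cleavage-site rule, so the values below are hypothetical, as are the function names. The cleavage check uses the common rule that small residues occupy positions -1 and -3 relative to the cleavage site, and the example peptide is illustrative.

```python
# Hypothetical sketch of a signal-peptide filter; all thresholds and
# residue sets are assumptions, not values taken from the article.
HYDROPHOBIC = set("AVLIMFWC")  # assumed hydrophobic alphabet
SMALL = set("AGSTC")           # assumed residues allowed at -1 and -3


def has_hydrophobic_patch(peptide: str, window: int = 8) -> bool:
    """True if any `window` consecutive residues are all hydrophobic."""
    for start in range(len(peptide) - window + 1):
        if all(aa in HYDROPHOBIC for aa in peptide[start:start + window]):
            return True
    return False


def has_typical_cleavage_site(peptide: str) -> bool:
    """True if the last (-1) and third-from-last (-3) residues are small."""
    return len(peptide) >= 3 and peptide[-1] in SMALL and peptide[-3] in SMALL


def looks_like_signal_peptide(peptide: str,
                              min_len: int = 15,
                              max_len: int = 45) -> bool:
    """Apply the length, hydrophobic-patch, and cleavage-site checks in turn."""
    if not (min_len <= len(peptide) <= max_len):
        return False
    return has_hydrophobic_patch(peptide) and has_typical_cleavage_site(peptide)


if __name__ == "__main__":
    # Illustrative candidate: the residues removed from a protein N-terminus.
    candidate = "MKKTAIAIAVALAGFATVAQA"
    print(looks_like_signal_peptide(candidate))
```

Here the candidate is the stretch between the predicted start of the protein and its experimentally observed N-terminus. A candidate that fails any one of the three checks is rejected, which mirrors the idea of using the filter to discard false-positive hits.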