Forbidden penta-peptides
Article first published online: 1 JAN 2009
DOI: 10.1110/ps.073067607
Copyright © 2007 The Protein Society
Additional Information
How to Cite
Tuller, T., Chor, B. and Nelson, N. (2007), Forbidden penta-peptides. Protein Science, 16: 2251–2259. doi: 10.1110/ps.073067607
Publication History
- Issue published online: 1 JAN 2009
- Article first published online: 1 JAN 2009
- Manuscript Revised: 9 JUL 2007
- Manuscript Accepted: 9 JUL 2007
- Manuscript Received: 12 JUN 2007
Keywords:
- short peptides;
- proteomes;
- evolutionary selection;
- protein grammar;
- phylogenetic groups
Abstract
There are 3,200,000 amino acid sequences of length 5 (penta-peptides). Statistically, we expect to see a distribution of penta-peptides that is determined by the frequency of the participating amino acids. We show, however, that not only are there thousands of such penta-peptides that are absent from all known proteomes, but many of them are coded for multiple times in the non-coding genomic regions. This suggests a strong selection process that prevents these peptides from being expressed. We also show that the characteristics of these forbidden penta-peptides vary among different phylogenetic groups (e.g., eukaryotes, prokaryotes, and archaea). Our analysis provides the first steps toward understanding the “grammar” of the forbidden penta-peptides.
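The count in the abstract follows from 20 standard amino acids and 5 positions: 20^5 = 3,200,000. The following is a minimal sketch of the kind of comparison the abstract describes, not the authors' actual pipeline. It assumes an independence model in which the expected count of a peptide is the product of its residue frequencies times the number of windows scanned. The function names (`count_kmers`, `residue_frequencies`, `expected_count`, `absent_peptides`) are hypothetical, and the input sequences are assumed to be plain one-letter protein strings.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (one-letter codes)


def count_kmers(sequences, k=5):
    """Count every length-k window made only of standard residues."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if all(c in AMINO_ACIDS for c in kmer):
                counts[kmer] += 1
    return counts


def residue_frequencies(sequences):
    """Relative frequency of each standard residue across all sequences."""
    counts = Counter(c for seq in sequences for c in seq if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def expected_count(peptide, freqs, n_windows):
    """Expected occurrences under the (assumed) independence model."""
    p = 1.0
    for c in peptide:
        p *= freqs[c]
    return p * n_windows


def absent_peptides(sequences, k=5):
    """Yield every possible k-peptide that never occurs in the sequences."""
    seen = count_kmers(sequences, k)
    for residues in product(AMINO_ACIDS, repeat=k):
        peptide = "".join(residues)
        if peptide not in seen:
            yield peptide
```

A peptide is a candidate "forbidden" sequence in this sketch when `absent_peptides` yields it even though `expected_count` for it is well above zero. Enumerating all 3,200,000 penta-peptides is feasible in memory, though a real proteome-scale scan would likely need a more compact representation.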
