Large-scale, classification-driven, rule-based functional annotation of proteins
Part 4. Bioinformatics
4.3. Protein Function and Annotation
Short Specialist Review
Published Online: 15 NOV 2005
Copyright © 2005 John Wiley & Sons, Ltd
Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics
How to Cite
Natale, D. A., Vinayaka, C. R. and Wu, C. H. 2005. Large-scale, classification-driven, rule-based functional annotation of proteins. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics.
Experimentally verified information on protein function lags far behind the rapid accumulation of protein sequences. The simple approach to propagating information from characterized proteins to unknown proteins (namely, sequence similarity search against databases of individual proteins) may fail to produce accurate results and is typically used to transfer only protein name information. A more accurate, consistent, and comprehensive approach to large-scale automated annotation uses rules driven by protein family classification. An unannotated protein that satisfies the set of conditions for a particular rule can be annotated with the information appropriate for that rule. The approach leads to facile, accurate prediction and functional inference for uncharacterized proteins, allows systematic detection of genome annotation errors, and provides sensible propagation and standardization of protein annotation, including position-specific sequence features, protein names and synonyms, and Gene Ontology terms. Rule-based annotation is discussed in the context of the PIRSF protein classification system, the PIRNR Name Rule system, and the PIRSR Site Rule system.
- protein classification;
- rule-based annotation;
- sequence analysis
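
The rule mechanism summarized in the abstract (a protein that satisfies a rule's conditions receives that rule's annotation) can be illustrated with a minimal Python sketch. It is not the PIRSF, PIRNR, or PIRSR implementation. The `Protein` and `Rule` classes, the `apply_rules` and `has_residue` helpers, the family identifier `FAM_A`, and all names, terms, and sequences are hypothetical placeholders chosen for illustration. The sketch assumes that each rule requires membership in one family plus an optional extra condition, such as a specific residue at a given position.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Protein:
    accession: str
    family: Optional[str]  # hypothetical family assignment; None if unclassified
    sequence: str
    annotations: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Rule:
    rule_id: str
    family: str                                 # family membership the rule requires
    extra_condition: Callable[[Protein], bool]  # any further condition to satisfy
    annotation: Dict[str, List[str]]            # information propagated when the rule fires


def has_residue(position: int, residue: str) -> Callable[[Protein], bool]:
    """Build a condition: the protein has `residue` at 1-based `position`."""
    def check(protein: Protein) -> bool:
        return (len(protein.sequence) >= position
                and protein.sequence[position - 1] == residue)
    return check


def apply_rules(protein: Protein, rules: List[Rule]) -> List[str]:
    """Annotate `protein` in place; return the IDs of the rules that fired."""
    fired = []
    for rule in rules:
        if protein.family == rule.family and rule.extra_condition(protein):
            for key, values in rule.annotation.items():
                existing = protein.annotations.setdefault(key, [])
                for value in values:
                    if value not in existing:  # avoid duplicate annotation entries
                        existing.append(value)
            fired.append(rule.rule_id)
    return fired


# Hypothetical rules: a name rule that needs only family membership, and a
# site rule that also needs a cysteine at position 3.
EXAMPLE_RULES = [
    Rule("NAME_RULE_1", "FAM_A", lambda p: True,
         {"name": ["Hypothetical example hydrolase"],
          "go_terms": ["GO:PLACEHOLDER"]}),
    Rule("SITE_RULE_1", "FAM_A", has_residue(3, "C"),
         {"features": ["active site: Cys-3"]}),
]

if __name__ == "__main__":
    query = Protein("X1", "FAM_A", "MKCLH")  # made-up sequence
    print(apply_rules(query, EXAMPLE_RULES))
    print(query.annotations)
```

Keeping the family requirement separate from the extra condition mirrors the distinction the abstract draws between classification-driven propagation of names and propagation of position-specific features: a family member that lacks the required residue still receives the name but not the site feature.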