Predicting protease types by hybridizing gene ontology and pseudo amino acid composition


  • Guo-Ping Zhou,

    Corresponding author
    1. Center for Vascular Biology Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts
    • Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, Division of Molecular and Vascular Medicine, E/RW 759, Boston, MA, 02115
  • Yu-Dong Cai

    1. Department of Chemistry, College of Sciences, Shanghai University, Shanghai, China
    2. Biomedical Science Department, University of Manchester of Science and Technology, Manchester, United Kingdom


Proteases play a vitally important role in regulating most physiological processes. Different types of proteases perform different functions in different biological processes. It is therefore highly desirable to develop a fast and reliable means of identifying the type of a protease from its sequence, or even of identifying whether a protein is a protease at all. The avalanche of protein sequences generated in the postgenomic era has made this challenge even more critical and urgent. By hybridizing the gene ontology approach and the pseudo amino acid composition approach, a powerful predictor called the GO-PseAA predictor was introduced to address these problems. To avoid redundancy and bias, demonstrations were performed on a dataset in which no protein has ≥ 25% sequence identity to any other. The overall success rate obtained by the jackknife cross-validation test was 91.82% in distinguishing proteases from nonproteases, and 85.49% in identifying the protease type among the following five types: (1) aspartic, (2) cysteine, (3) metallo, (4) serine, and (5) threonine. The high jackknife success rates obtained on such a stringent dataset indicate that the GO-PseAA predictor is very powerful and might become a useful tool in bioinformatics and proteomics. Proteins 2006. © 2006 Wiley-Liss, Inc.
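
To illustrate what a jackknife (leave-one-out) success rate measures, the following is a minimal sketch and not the GO-PseAA predictor itself. It assumes plain 20-component amino acid composition as a stand-in for the hybrid gene ontology and pseudo amino acid composition features, and a simple nearest-neighbor rule as the classifier. The function names, the toy sequences, and the labels are all hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Plain amino acid composition: fraction of each standard residue.

    Assumption: a simplified stand-in for the paper's hybrid features.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def sq_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def jackknife_success_rate(features, labels):
    """Leave each sample out in turn, predict it from all the others
    with a nearest-neighbor rule, and return the fraction predicted correctly."""
    correct = 0
    for i, query in enumerate(features):
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: sq_distance(query, features[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(features)


# Hypothetical toy data: made-up sequences and class labels, not real proteases.
toy = [
    ("AAAAAC", "class_x"),
    ("AAAACC", "class_x"),
    ("WWWWWY", "class_y"),
    ("WWWWYY", "class_y"),
]
features = [composition(seq) for seq, _ in toy]
labels = [label for _, label in toy]
rate = jackknife_success_rate(features, labels)
print(f"jackknife success rate: {rate:.2%}")
```

Because every sample is predicted by a model that never saw it, the jackknife rate is less prone to optimistic bias than a resubstitution test, which is why it is paired here with a low-identity dataset.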
