Prediction of functional sites by analysis of sequence and structure conservation

Authors

  • Anna R. Panchenko,

    1. Computational Biology Branch, National Center for Biotechnology Information (NCBI), National Institutes of Health (NIH), Bethesda, Maryland 20894, USA
  • Fyodor Kondrashov,

    1. Section of Evolution and Ecology, University of California, Davis, California 95616, USA
  • Stephen Bryant

    Corresponding author
    1. Computational Biology Branch, National Center for Biotechnology Information (NCBI), National Institutes of Health (NIH), Bethesda, Maryland 20894, USA
    • Computational Biology Branch, NCBI, Bldg. 38A, Rm. 8N805, NIH, Bethesda, MD 20894, USA; fax (301) 435-7794.

Abstract

We present a method for predicting functional sites in a set of aligned protein sequences. The method selects sites that are both well conserved and clustered together in space, as inferred from the 3D structures of proteins included in the alignment. We tested the method using 86 alignments from the NCBI CDD database, where the sites of experimentally determined ligand and/or macromolecular interactions are annotated. In agreement with earlier investigations, we found that functional site predictions are most successful when overall background sequence conservation is low, such that sites under evolutionary constraint become apparent. In addition, we found that averaging conservation values across spatially clustered sites improves predictions under certain conditions: that is, when overall conservation is relatively high and when the site in question involves a large macromolecular binding interface. Under these conditions it is better to look for clusters of conserved sites than to look for individual conserved sites.
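
The idea of averaging conservation across spatially clustered sites can be illustrated with a minimal sketch. This is not the authors' implementation: the entropy-based conservation score, the distance cutoff, the use of one coordinate per alignment column, and the function names `column_conservation` and `spatially_averaged` are all assumptions made for illustration, and the toy alignment and coordinates are invented.

```python
import math
from collections import Counter


def column_conservation(column):
    """Score one alignment column as 1 - normalized Shannon entropy.

    Assumption: the abstract does not specify a conservation measure;
    entropy over the 20 amino acids is used here only as a stand-in.
    Gap characters ('-') are ignored.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / n) * math.log(k / n) for k in counts.values())
    return 1.0 - entropy / math.log(20)


def spatially_averaged(scores, coords, cutoff=10.0):
    """Replace each site's score by the mean over sites within `cutoff`.

    Assumption: `coords` holds one 3D point per alignment column (for
    example a C-alpha position from a structure in the alignment), and
    the 10 Angstrom default cutoff is arbitrary.
    """
    averaged = []
    for ci in coords:
        nearby = [s for s, cj in zip(scores, coords) if math.dist(ci, cj) <= cutoff]
        averaged.append(sum(nearby) / len(nearby))  # a site is always its own neighbor
    return averaged


# Toy data (hypothetical): three aligned sequences, four columns.
alignment = ["ACDK", "ACEK", "ACFR"]
columns = list(zip(*alignment))
scores = [column_conservation(col) for col in columns]

# Hypothetical coordinates: the first three sites are close, the fourth is isolated.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
smoothed = spatially_averaged(scores, coords, cutoff=5.0)
```

In this sketch a variable column that sits next to highly conserved neighbors receives a higher smoothed score than its raw score, while an isolated site keeps its raw score. That mirrors, in a simplified way, the preference for clusters of conserved sites over individual conserved sites.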
