Toward a “Structural BLAST”: Using structural relationships to infer function

Authors

  • Fabian Dey,

    1. Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, New York, New York 10032
  • Qiangfeng Cliff Zhang,

    1. Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, New York, New York 10032
  • Donald Petrey,

    1. Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, New York, New York 10032
  • Barry Honig

    Corresponding author
    1. Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, New York, New York 10032
    • Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, New York 10032

  • Barry Honig is the recipient of the Protein Society 2012 Christian B. Anfinsen Award

Abstract

We outline a set of strategies for inferring protein function from structure. The overall approach depends on extensive use of homology modeling, the exploitation of a wide range of global and local geometric relationships between protein structures, and the use of machine learning techniques. The combination of modeling with broad searches of protein structure space defines a “structural BLAST” approach to inferring function with high genomic coverage. We describe applications to the prediction of protein–protein and protein–ligand interactions. In the context of protein–protein interactions, our structure-based prediction algorithm, PrePPI, achieves accuracy comparable to that of high-throughput experiments. An essential feature of PrePPI is the use of Bayesian methods to combine structure-derived information with non-structural evidence (e.g., co-expression) to assign a likelihood to each predicted interaction. This, combined with a structural BLAST approach, significantly expands the range of applications of protein structure in the annotation of protein function, including systems-level biological applications where it has previously played little role.
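
The Bayesian combination of evidence mentioned in the abstract can be illustrated with a minimal sketch. It assumes a naive Bayes model in which each evidence source contributes an independent likelihood ratio; the abstract does not specify how PrePPI itself combines evidence, so this is a generic illustration rather than the authors' implementation. The function names, the evidence labels, the likelihood-ratio values, and the prior are all hypothetical.

```python
from math import prod


def combined_likelihood_ratio(likelihood_ratios):
    """Multiply per-source likelihood ratios.

    Assumes the evidence sources are conditionally independent
    (the naive Bayes assumption); this is an illustrative choice.
    """
    return prod(likelihood_ratios)


def posterior_probability(prior_probability, likelihood_ratios):
    """Convert a prior probability and likelihood ratios into a posterior."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = prior_odds * combined_likelihood_ratio(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical likelihood ratios for one candidate protein pair.
evidence = {
    "structural_modeling": 50.0,  # structure-derived evidence (made-up value)
    "co_expression": 4.0,         # non-structural evidence (made-up value)
}

# Hypothetical prior: 1 in 1000 random protein pairs interact.
probability = posterior_probability(0.001, evidence.values())
print(f"combined LR = {combined_likelihood_ratio(evidence.values()):.1f}")
print(f"posterior probability = {probability:.3f}")
```

Under these assumptions, weak evidence from several sources multiplies into a stronger combined score, which is why adding non-structural evidence can raise or lower the confidence assigned to a structure-based prediction.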
