A simple probabilistic model of multibody interactions in proteins

Authors

  • Kristoffer Enøe Johansson

    1. Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
  • Thomas Hamelryck

    Corresponding author
    1. Section for Computational and RNA biology, Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
    • Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Copenhagen N, Denmark

Correspondence to: Thomas Hamelryck, Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, DK-2200, Copenhagen N, Denmark. E-mail: thamelry@binf.ku.dk

Abstract

Protein structure prediction methods typically use statistical potentials, which rely on statistics derived from a database of known protein structures. In the vast majority of cases, these potentials involve pairwise distances or contacts between amino acids or atoms. Although some potentials beyond pairwise interactions have been described, the formulation of a general multibody potential is seen as intractable due to the perceived limited amount of data. In this article, we show that it is possible to formulate a probabilistic model of higher order interactions in proteins, without arbitrarily limiting the number of contacts. The success of this approach is based on replacing a naive table-based approach with a simple hierarchical model involving suitable probability distributions and conditional independence assumptions. The model captures the joint probability distribution of an amino acid and its neighbors, local structure and solvent exposure. We show that this model can be used to approximate the conditional probability distribution of an amino acid sequence given a structure using a pseudo-likelihood approach. We verify the model by decoy recognition and site-specific amino acid predictions. Our coarse-grained model is compared to state-of-the-art methods that use full atomic detail. This article illustrates how the use of simple probabilistic models can lead to new opportunities in the treatment of nonlocal interactions in knowledge-based protein structure prediction and design. Proteins 2013; 81:1340–1350. © 2013 Wiley Periodicals, Inc.
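
As a reading aid for the pseudo-likelihood step mentioned in the abstract, the generic form of a pseudo-likelihood approximation is sketched below. This is the textbook construction, not necessarily the exact formulation used in the article. The notation is assumed here for illustration: a = (a_1, ..., a_N) is the amino acid sequence, X is the structure, and a_{-i} is every residue except the one at position i.

```latex
% Generic pseudo-likelihood approximation (notation assumed, not taken from the article):
% the joint conditional over the whole sequence is replaced by a product of
% per-site conditionals, each conditioned on all other residues and the structure.
P(a \mid X) \;\approx\; \prod_{i=1}^{N} P\left(a_i \mid a_{-i}, X\right)
```

In a model of an amino acid and its neighborhood, each factor would presumably depend only on the residues and structural features near position i, which is what makes the product tractable to evaluate.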
