• reduced energy;
  • protein simulation;
  • statistical potential;
  • generalized ensembles;
  • empirical force field;
  • fold recognition


A coarse-grained potential for protein simulations and fold ranking is presented. The potential is based on a two-point model of individual amino acids and a specific implementation of hydrogen bonding. Parameters are determined for distance dependent pair interactions, pseudo bonds, angles, and torsions. A scaling factor for a hydrogen bonding term is also determined. Iterative sampling for 4867 proteins reproduces distributions of internal coordinates and distances observed in the Protein Data Bank. The adjustment of the potential and resampling are in the spirit of the generalized ensemble approach. No native structure information (e.g., secondary structure) is used in the calculation of the potential or in the simulation of a particular protein. The potential is subject to two tests as follows: (i) simulations of 956 globular proteins in the neighborhood of their native folds (these proteins were not used in the training set) and (ii) discrimination between native and decoy structures for 2470 proteins with 305,000 decoys and the “Decoys ‘R’ Us” dataset. In the first test, 58% of tested proteins stay within 5 Å from the native fold in Molecular Dynamics simulations of more than 20 nanoseconds using the new potential. The potential is also useful in differentiating between correct and approximate folds providing significant signal for structure prediction algorithms. Sampling with the potential consistently regenerates the distribution of distances and internal coordinates it learned. Nevertheless, during Molecular Dynamics simulations structures are found that reproduce the learned distributions but are far from the native fold. Proteins 2009. © 2009 Wiley-Liss, Inc.