Using distances between α-carbons to predict protein structure

Authors

  • Christina R. Crecca

    1. Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611
  • Adrian E. Roitberg

    Corresponding author
    1. Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611

Abstract

Knowledge of a protein's structure is important in understanding its function. The usual experimental structure determination methods can be costly and time-consuming. We present an idea for a fast and inexpensive protein structure prediction method that combines modeling with less expensive experimental data. Our method involves three steps: (1) building a decoy set, (2) measuring inter-residue distances in a target protein, and (3) comparing the measured distances with those calculated in each decoy. We postulate that structures with a small number of similar inter-residue distances will also have similar three-dimensional structure. We further hypothesize that the minimum number of distances needed to determine structure is much smaller than the total number of inter-residue distances in the protein. To develop our protocol, we apply our method to target proteins whose structures have been solved experimentally but have not been included in the decoy set. We simulate experimental data by calculating α-carbon distances from the experimentally determined structures of our target proteins. We have created a large, generalized decoy set using most of the structures in the Protein Data Bank. It can be used to study any protein composed of 100 residues or fewer. Using this decoy set, we searched for four proteins; our predicted structures ranged in RMSD from 3.6 to 7.7 Å. We have also analyzed the RMSD distributions of the decoys using the search proteins as references and found the distributions to be similar for each protein. Of the nearly 5,000 Cα-Cα distances in a 100-residue protein, knowledge of only twenty-five distances will usually result in predicting a reliable model. © 2008 Wiley Periodicals, Inc. Int J Quantum Chem, 2008
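
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of a structure as a list of Cα (x, y, z) coordinates, and the scoring rule (root-mean-square deviation between measured and decoy distances) are all assumptions made for illustration, since the abstract does not state how the comparison in step (3) is scored. The sketch also checks the pair count behind the "nearly 5,000" figure: a chain of N residues has N(N-1)/2 distinct Cα-Cα pairs.

```python
import math
import random
from itertools import combinations


def pair_count(n_residues):
    """Number of distinct Calpha-Calpha pairs in a chain of n_residues."""
    return n_residues * (n_residues - 1) // 2


def ca_distance(coords, i, j):
    """Euclidean distance between the Calpha atoms of residues i and j."""
    return math.dist(coords[i], coords[j])


def simulate_measurements(target_coords, n_pairs, seed=0):
    """Step 2 (simulated): pick n_pairs residue pairs at random and
    'measure' their distances from a known target structure.
    Random selection is an assumption, not the authors' protocol."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(len(target_coords)), 2))
    chosen = rng.sample(all_pairs, n_pairs)
    return {(i, j): ca_distance(target_coords, i, j) for i, j in chosen}


def distance_score(measured, decoy_coords):
    """Step 3 (assumed scoring): RMS deviation between the measured
    distances and the same distances computed in one decoy."""
    squared = [
        (ca_distance(decoy_coords, i, j) - d) ** 2
        for (i, j), d in measured.items()
    ]
    return math.sqrt(sum(squared) / len(squared))


def rank_decoys(measured, decoys):
    """Return decoy indices ordered from best to worst agreement."""
    return sorted(
        range(len(decoys)),
        key=lambda k: distance_score(measured, decoys[k]),
    )


if __name__ == "__main__":
    print(pair_count(100))  # 4950, the "nearly 5,000" distances

    # Toy data only: straight chains, not real protein structures.
    target = [(float(i), 0.0, 0.0) for i in range(10)]
    stretched = [(1.5 * i, 0.0, 0.0) for i in range(10)]
    measured = simulate_measurements(target, n_pairs=5)
    print(rank_decoys(measured, [stretched, target]))
```

In this toy run the decoy identical to the target scores zero and is ranked first. With real data, the decoys would be Cα traces taken from Protein Data Bank structures, and the number of measured pairs would be the quantity under study (about twenty-five in the abstract's conclusion).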
