Reduced representation model of protein structure prediction: Statistical potential and genetic algorithms


  • Shaojian Sun

    Corresponding author
    • Department of Biophysical Science, State University of New York at Buffalo, Buffalo, New York 14214
    • Department of Pharmaceutical Chemistry, University of California at San Francisco-Laurel Heights Campus, 3333 California Street, Room 102, San Francisco, California 94118–1204


A reduced representation model, described in previous reports, was used to predict the folded structures of proteins from their primary sequences and random starting conformations. The molecular structure of each protein was reduced to its backbone atoms (with ideal, fixed bond lengths and valence angles), and each side chain was approximated by a single virtual united atom. The coordinate variables were the backbone dihedral angles φ and ψ. A statistical potential function, which included local and nonlocal interactions and was computed from known protein structures, was used in the structure minimization. A novel approach, employing the concepts of genetic algorithms, was developed to optimize a population of conformations simultaneously. Using only the primary sequence and the radius of gyration of the crystal structure, and starting from randomly generated initial conformations, I was able to fold melittin, a protein of 26 residues, with high computational convergence. On average, the computed structures have a root mean square error of 1.66 Å (distance matrix error = 0.99 Å) relative to the crystal structure. Similar results were obtained for avian pancreatic polypeptide inhibitor, a protein of 36 residues. Application of the method to apamin, an 18-residue polypeptide with two disulfide bonds, shows that it folds this peptide into native-like conformations with the correct disulfide bonds formed.
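
The idea of optimizing a population of conformations over backbone dihedral angles can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the truncation selection, the one-point crossover at a residue boundary, the Gaussian mutation, and the parameter values are generic genetic-algorithm choices, not the operators reported for this model. In particular, `toy_energy` is a placeholder that merely rewards helix-like angles; it is not the statistical potential described above.

```python
import random

def wrap(angle):
    """Wrap an angle in degrees into the range [-180, 180]."""
    return (angle + 180.0) % 360.0 - 180.0

def toy_energy(conf):
    """Placeholder energy (assumption): squared angular distance of every
    (phi, psi) pair from a helix-like target. NOT the paper's potential."""
    target_phi, target_psi = -57.0, -47.0
    return sum(wrap(phi - target_phi) ** 2 + wrap(psi - target_psi) ** 2
               for phi, psi in conf)

def random_conformation(n_res, rng):
    """One random (phi, psi) pair per residue."""
    return [(rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
            for _ in range(n_res)]

def crossover(a, b, rng):
    """One-point crossover: N-terminal part from a, C-terminal part from b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(conf, rng, rate=0.1, sigma=20.0):
    """Perturb each residue's angles with probability `rate`."""
    return [(wrap(phi + rng.gauss(0.0, sigma)), wrap(psi + rng.gauss(0.0, sigma)))
            if rng.random() < rate else (phi, psi)
            for phi, psi in conf]

def evolve(energy, n_res, pop_size=50, generations=100, seed=0):
    """Minimize `energy` over a population of dihedral-angle conformations.

    Returns the best conformation and the best energy seen per generation.
    """
    rng = random.Random(seed)
    pop = [random_conformation(n_res, rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=energy)
        history.append(energy(pop[0]))
        parents = pop[:pop_size // 2]          # truncation selection
        children = [mutate(crossover(rng.choice(parents),
                                     rng.choice(parents), rng), rng)
                    for _ in range(pop_size - 1)]
        pop = [pop[0]] + children              # elitism: keep the current best
    pop.sort(key=energy)
    history.append(energy(pop[0]))
    return pop[0], history

if __name__ == "__main__":
    # 26 residues, the length of melittin; the energy is still the toy one.
    best, history = evolve(toy_energy, n_res=26)
    print(f"initial best energy: {history[0]:.1f}")
    print(f"final best energy:   {history[-1]:.1f}")
```

Because the best conformation is carried over unchanged into each new generation, the best energy in `history` never increases. A real application would replace `toy_energy` with a potential evaluated on coordinates built from the dihedral angles.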