Protein structure modeling for CASP10 by multiple layers of global optimization



In the template-based modeling (TBM) category of CASP10 experiment, we introduced a new protocol called protein modeling system (PMS) to generate accurate protein structures in terms of side-chains as well as backbone trace. In the new protocol, a global optimization algorithm, called conformational space annealing (CSA), is applied to the three layers of TBM procedure: multiple sequence-structure alignment, 3D chain building, and side-chain re-modeling. For 3D chain building, we developed a new energy function which includes new distance restraint terms of Lorentzian type (derived from multiple templates), and new energy terms that combine (physical) energy terms such as dynamic fragment assembly (DFA) energy, DFIRE statistical potential energy, hydrogen bonding term, etc. These physical energy terms are expected to guide the structure modeling especially for loop regions where no template structures are available. In addition, we developed a new quality assessment method based on random forest machine learning algorithm to screen templates, multiple alignments, and final models. For TBM targets of CASP10, we find that, due to the combination of three stages of CSA global optimizations and quality assessment, the modeling accuracy of PMS improves at each additional stage of the protocol. It is especially noteworthy that the side-chains of the final PMS models are far more accurate than the models in the intermediate steps. Proteins 2014; 82(Suppl 2):188–195. © 2013 Wiley Periodicals, Inc.