## Introduction

Tertiary structural information is critical for our understanding of a protein's biological function. However, experimental structure determination is far too expensive and time-consuming to be applied to all proteins of interest. Computational approaches are thus expected to play a major role in determining protein structures in the future.1 Over the last two decades, great strides have been made in exploiting distant evolutionary relationships to known structures to derive spatial restraints for comparative models.2–4 One of the major remaining challenges is the refinement of such models to near-experimental accuracy.5 This challenge, in turn, demands the development of more accurate force fields that can be deployed in molecular mechanics simulations.

Physicochemical force fields such as CHARMM,6 AMBER,7 and GROMOS,8 parameterized for use in protein simulations, are routinely applied to the refinement of comparative models. However, these methods have not achieved an overall improvement in the accuracy of comparative models.5 Knowledge-based potential energy functions are derived either from statistical analysis of observed protein structures9–16 or from optimization of parameters such that native structures are discriminated from non-native decoys.17–20 They usually outperform9, 21 physicochemical force fields that lack some physical terms such as cation-π interactions and entropic effects. However, the discrete nature of statistical energy functions makes them difficult to use directly in energy minimization or molecular dynamics for protein-structure refinement. Moreover, most knowledge-based energy functions derived from parameter optimization are coarse grained (i.e., at the residue level or using simplified side chains) to minimize the number of adjustable parameters. Parameter optimization was considered inappropriate for deriving distance-dependent energy functions for all atom types,13 let alone orientation-dependent ones. Thus, it has been more practical to optimize a small number of weights for mixing physicochemical terms with statistics-based potentials.22, 23 As more and more experimental protein structures become available, knowledge-based potential energy functions derived from parameter optimization may prove optimal even for all-atom force fields.

Any complicated function, including the force field between atoms in a protein, can be expanded as a mathematical series. For example, power series expansions of a diatomic potential energy function are the most useful means for its analytical representation in quantum chemistry.24 Miyazawa and Jernigan used series expansions in spherical harmonic functions to represent the fully anisotropic distribution of the relative orientation of two residues and thereby increased the discrimination power in fold recognition.25 Here, we expanded atomic force fields as series. The parameters were optimized by maximizing the energy gap between native and non-native side-chain conformations and by minimizing the root mean square deviation (RMSD) of low-energy rotamers. A total of 5798 nonhomologous proteins were used for optimizing 1889 parameters. The energy functions with optimized parameters were then used to predict side-chain conformations for 218 independent test proteins. The prediction accuracies of χ_{1} and χ_{1 + 2} were improved by 2.2 and 4.0%, respectively, compared with the next best side-chain modeling program. Because the expansions used here are continuous, the resulting energy functions can be used directly in gradient-based search algorithms to address the comparative model refinement problem.
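To make the idea of a series-expanded, continuous energy function concrete, the following minimal sketch may help. It is not the expansion used in this work: the inverse-power basis, the function names (`basis`, `energy`, `denergy_dr`, `gap`), and the coefficient values are all illustrative assumptions, and orientation dependence is omitted. It shows only that a truncated series is linear in its coefficients, so that the native/decoy energy gap is linear in the parameters, and that the energy has an analytic derivative suitable for gradient-based search.

```python
import numpy as np

def basis(r, n_terms):
    """Hypothetical basis: inverse powers 1/r^k for k = 1..n_terms."""
    r = np.asarray(r, dtype=float)
    k = np.arange(1, n_terms + 1, dtype=float)
    return r[..., None] ** (-k)

def energy(coeffs, r):
    """Pair energy as a truncated series: E(r) = sum_k c_k / r^k."""
    return basis(r, len(coeffs)) @ coeffs

def denergy_dr(coeffs, r):
    """Analytic derivative dE/dr = sum_k (-k) c_k / r^(k+1)."""
    r = np.asarray(r, dtype=float)
    k = np.arange(1, len(coeffs) + 1, dtype=float)
    return (-k * r[..., None] ** (-k - 1)) @ coeffs

def gap(coeffs, native_d, decoy_d):
    """Energy gap E(decoy) - E(native), summed over pair distances.

    Because E is linear in coeffs, the gap is a dot product between
    coeffs and a difference of summed basis features.
    """
    n = len(coeffs)
    phi_native = basis(native_d, n).sum(axis=0)
    phi_decoy = basis(decoy_d, n).sum(axis=0)
    return (phi_decoy - phi_native) @ coeffs

# Illustrative values only (not fitted parameters).
coeffs = np.array([-1.0, 0.5, 0.0, 2.0])
native = np.array([1.5, 2.0, 3.0])   # made-up pair distances
decoy = native + 0.5
print(gap(coeffs, native, decoy), denergy_dr(coeffs, native))
```

Under these assumptions, maximizing the gap over many native/decoy pairs reduces to an optimization over `coeffs`, while `denergy_dr` supplies the derivative that a minimizer or molecular dynamics integrator would need.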