Motional timescale predictions by molecular dynamics simulations: Case study using proline and hydroxyproline sidechain dynamics

We propose a new approach for force field optimizations which aims at reproducing dynamics characteristics using biomolecular MD simulations, in addition to improved prediction of motionally averaged structural properties available from experiment. As the source of experimental data for dynamics fittings, we use 13C NMR spin-lattice relaxation times T1 of backbone and sidechain carbons, which allow to determine correlation times of both overall molecular and intramolecular motions. For structural fittings, we use motionally averaged experimental values of NMR J couplings. The proline residue and its derivative 4-hydroxyproline with relatively simple cyclic structure and sidechain dynamics were chosen for the assessment of the new approach in this work. Initially, grid search and simplexed MD simulations identified large number of parameter sets which fit equally well experimental J couplings. Using the Arrhenius-type relationship between the force constant and the correlation time, the available MD data for a series of parameter sets were analyzed to predict the value of the force constant that best reproduces experimental timescale of the sidechain dynamics. Verification of the new force-field (termed as AMBER99SB-ILDNP) against NMR J couplings and correlation times showed consistent and significant improvements compared to the original force field in reproducing both structural and dynamics properties. The results suggest that matching experimental timescales of motions together with motionally averaged characteristics is the valid approach for force field parameter optimization. Such a comprehensive approach is not restricted to cyclic residues and can be extended to other amino acid residues, as well as to the backbone. Proteins 2014; 82:195–215. © 2013 Wiley Periodicals, Inc.


INTRODUCTION
Molecular dynamics (MD) simulations are widely employed for structural and dynamics characterizations of peptides and proteins. 1-4 These simulations rely mainly on classical force field parameters, such as AMBER, 5-7 CHARMM, 8 GROMOS, 9,10 and OPLS-AA. 11,12 Amongst different types of force field parameters, backbone, and sidechain torsional potentials have been the subject of extensive reoptimizations, leading to improved modifications of AMBER 13-16 and CHARMM 17,18 force fields. Based on a number of detailed benchmark studies, 19-33 AMBER99SB 14 has emerged as one of the force fields which reproduces experimentally measured parameters with better accuracy compared to other force fields. This force field has undergone further useful refinements in recent years. 15,16 To predict the correct balance of secondary structure propensities in proteins, a simple backbone energy correction was introduced to reproduce the fraction of helix measured in short peptides at 300 K, with the modified force field known as AMBER99SB*. 15 Recently, the AMBER99SB force field has been improved further (known as AMBER99SB-ILDN) 16 by refitting the amino acid sidechain torsion potentials of the AMBER99SB force field for four residues: isoleucine, leucine, aspartic acid, and asparagine.
One of the important properties not exploited in the force field optimizations for biomolecular MD simulations is the timescale of motion for a given backbone or sidechain fragment. As a result, while the motionally averaged experimental NMR parameters can be reproduced well by new force fields, the timescale over which this averaging is achieved may deviate significantly from experiment. The reason for the lack of timescale verifications is that either experimental data is not available or it is not clear how the force field parameters can be modified to reproduce better the experimental data. To explore the possibilities that involve experimentally known motional timescales in force field optimizations, we have selected a relatively simple example of the proline (Pro) sidechain dynamics in this work. The simplicity of the Pro dynamics arises from the fact that unlike other amino acid residues the Pro residue has a unique cyclic structure, which interconverts continuously between two conformers, known as C g -endo and C gexo. 34 Another factor in favor of the Pro residue is that numerous theoretical and experimental studies have been undertaken in the past focusing mainly on the pyrrolidine ring dynamics. Furthermore, the torsional parameters of the Pro residue have not been optimized in the past and standard force field parameters obtained for open chain fragments are used for proline. The result is that the predicted geometry of the pyrrolidine ring by AMBER force fields is relatively flat compared to singlecrystal X-ray diffraction data or quantum-mechanical (QM) calculations, as judged by the value of the endocyclic torsion v 2 ( Fig. 1) or the pseudorotation amplitude v m , also known as the maximum puckering angle. 34 In the first approximation, the nonplanarity of the pyrrolidine ring can be assessed by how far atoms C b and C g are placed from the plane formed by the remaining three atoms. The further they are from the plane, the higher the absolute values of v 2 and v m are. We note that changes in geometry of the ring have also further energetic implications, and, as shown previously, the larger the maximum puckering angle the larger the pyrrolidine ring interconversion barrier in Pro and hydroxyproline (Hyp) residues. 35 The increase in the energy barrier implies less frequent transitions or longer motional timescales. Based on these considerations, force field optimizations may potentially improve the accuracy of MD simulations for predicting both the structure and dynamics of the Pro residue in proteins. Note that one of the important attributes of the Pro residue is its hinge-like function, which enhances the probability of b-turns in proteins. Therefore, accurate predictions of the proline structure and dynamics may have critical implication on the outcome of MD descriptions of proteins.
Returning to the original problem of force field optimizations, we expect that the introduction of an additional dynamics constraint into force field optimizations should be advantageous from a methodological point of view, as multiple solutions are often found in force field optimizations which fit equally well experimental data. This is not surprising, as the experimental data consists of motionally averaged values of NMR J couplings and chemical shifts, which are dependent on the relative populations of conformers, but not on how fast they exchange. Timescale fittings combined with fittings of NMR J couplings and/or chemical shifts are expected to select a correct solution in such cases. Unlike previous optimizations based on the quantum-mechanical calculations, we will use experimentally measured NMR Jcouplings in our initial re-optimization of the Pro sidechain torsion potentials. The approach used by us is based on either simple grid search or iterative fittings of experimental NMR data, in which a figure-of-merit function is evaluated using MD trajectories calculated for each trial set of parameters. Once torsional force fields reproducing experimental NMR J-couplings have been identified, we will probe MD-predicted timescales of motions which best match experimental data. 13 C NMR spin-lattice relaxation times will be used to estimate both overall and intramolecular timescales of motions. In addition to the Pro residue, we will also reoptimize torsional force field parameters for the trans-4-hydroxy-L-proline residue (Hyp) to match experimental dynamics data.

NMR Data
Apart from Ace-Hyp-NHMe (AHM) and Ace-Hyp-Gly (AHG), all other peptides were used as received from Sigma Aldrich and Cambridge Bioscience. The synthesis of AHM and AHG is described in Supporting Infromation. Experimental values of proton 3 J HH couplings and internuclear proton distances for N-acetyl-L-proline (NAcPro), Gly-Pro-Gly-Gly (GPGG) and Val-Ala-Pro-Gly (VAPG) in D 2 O solutions at 298 K were taken from Refs. 34-36. The experimental data for angiotensin II, AHM and AHG was determined in this work (see below) using full lineshape analysis. 34 Unless otherwise specified, the trans-orientation about the amide bond preceding the Pro (or Hyp) residue is assumed for a given peptide. For the values of 3 J-couplings determined from the full lineshape analysis, the standard deviation was estimated to be <0.1 Hz. 34 13 C chemical shifts were typically better than 60.01 ppm. Unless otherwise specified, NMR measurements were carried out at 298 K. High (>300 K) and low (<300 K) temperature calibrations were carried out using standard samples of 80% 1,2ethanediol in DMSO-d 6 and 4% CH 3 OH in CD 3 OD, respectively.
The 13 C spin-lattice relaxation times were measured for solutions of peptides in either D 2 O or H 2 O:D 2 O (9:1) using a standard inversion-recovery technique with the 13 C observation in the presence of proton decoupling. To minimize errors associated with low signal-to-noise ratios, these experiments were carried out on a 600 spectrometer with a dual channel 1 H/ 13 C cryoprobe with the sensitivity optimised for 13 C measurements. From five independent measurements carried out at probe ambient temperature (293 K) for the 214 mM solution of GPGG in D 2 O at different dates over 60 days, the standard deviations for 13 C T 1 measurements were within 0.4-1.4% of the corresponding mean values. From three independent measurements carried out at 298 K for the 77 mM VAPG in H 2 O:D 2 O (9:1), the standard deviations for 13 C T 1 measurements were within 0.2-1.1% of the corresponding mean values.
Chemical shift anisotropies (Dr, in ppm) of aliphatic carbons were measured using slow MAS measurements (2.5 kHz) on a Bruker AVANCE III 850 spectrometer equipped with a 4 mm CPMAS probe and a solid sample of L-proline. The estimated Dr values (243 ppm for C a and 230 ppm for C g of L-proline) were used in calculations of correlation times using 13 C T 1 values. From the 13 C T 1 calculations, at Dr 5 243 ppm, the dipolar relaxation mechanism remains the dominant factor determining 13 C T 1 relaxation at 14.1 T, while chemical shift anisotropy accounts for <1% of T 1 values.

MD calculations and simplex fittings
All MD simulations were carried out using GROMACS (version 4.5.5). 39 One molecule of NAcPro molecule (terminated with CO 2 2 and with a Na 1 cation added for neutralization) was solvated with 147 water molecules in a dodecahedral box with a volume of 4.7 nm 3 in MD simulations. Periodic boundary conditions and the TIP3P water model 40 were employed in all MD simulations. An integration step of 2 fs was used and neighbor lists were updated every 5th step. The particle mesh Ewald (PME) 41 method was employed for the electrostatics with fourth-order interpolation. The neighbor list and the real-space cutoff distances were set to 0.9 nm, which is similar to that used in optimizations of the original force field and its recent modifications. [5][6][7][13][14][15][16] The van der Waals interactions in all MD simulations were treated with a twin-range cutoff method using the neighbor list and van der Waals cutoff distances. The value of the van der Waals cutoff distance was 0.9 nm. 5-7,13-16 The temperature at 298 K was controlled using velocity rescaling with a stochastic term (V-rescale) 42 and a time constant of 0.1 ps. A Parrinello-Rahman scheme was employed for pressure control at 1 bar using a coupling constant of 2 ps and an isothermal compressibility of 4.5 3 10 -5 bar 21 . 43 Prior to production MD runs, including those implemented within downhill simplex optimizations, 44,45 the system was minimized using steepest-descent and conjugate gradient algorithms. Minimization steps were followed by four steps of equilibration. The system was equilibrated for 40 ps with the positionally restrained solute molecule to allow water molecules to equilibrate around it, followed by a NVT molecular dynamics for 100 ps, NPT dynamics for 200 ps and another NVT dynamics for 200 ps. Reproducible production MD simulations at each step of simplex fittings were performed for 7.5-40.5 ns using NVT ensemble, the first 0.5 ns of which was discarded from the calculations of averaged NMR parameters. For the selected set of parameters from simplex fittings additional 200 ns long MD simulations were carried out.
The vicinal 3 J couplings of the five-membered pyrrolidine ring in NAcPro (as well as in other peptides, see below) in each frame of MD simulations were calculated using empirically optimized Karplus-type equations 8C and 8D of Haasnoot et al. 46 These equations contain terms accounting for the differences in electronegativities of aand b-substituents, and hence are better suited for the analysis of the 3 J couplings of the pyrrolidine ring than the original Karplus equation. 47 The precision of equation 8C of Haasnoot et al. (expressed as the rms deviation) for a structural fragment containing 2 substituents (-CH 2 X-CH 2 Y-) is estimated as 0.367 Hz using a set of 45 experimental 3 J HH couplings. 46 The precision of equation 8D of Haasnoot et al. for a structural fragment containing 3 substituents (-CHXY-CH 2 Z-) is estimated as 0.485 Hz using a set of 100 experimental 3 J HH couplings. 46 To analyze MD trajectories, including those obtained at each step of simplex fittings, dihedral angles were extracted for each frame recorded every 0.01 ps during the MD simulation. The calculated values of 3 J couplings using the corresponding dihedral angles in each frame were used to calculate the averaged values of 3 J couplings over the duration of the MD simulation. The rms deviation defined as for the 3 J HH -couplings of the pyrrolidine ring) was used as a figure-of-merit function in simplex fittings, where J i exp and J i calc are conformationally averaged experimental and calculated couplings, respectively, and N is the number of different J couplings available (N 5 10 for the Pro sidechain). As simplex may in principle lead to a local minimum of the merit function, 44,45 it is important to consider several sets of starting values of the optimized parameters x j . This was achieved by varying the factor c, by which one of the optimized parameters x j is varied within the first n 1 1 steps of the simplex run using the following expression: x j 1 c x j (i.e., at step n 5 1 the initial values of x j from the original AMBER99SB force field are used followed by x 1 1c x 1 , x 2 , . . . x n at step n 5 2, etc.). Several simplex fittings were considered with c varied between 0.2 and 5 (see the main text for further details). In addition, for jcj < 1, both positive and negative values were considered. An additional constraint requiring x j > 0 was imposed in simplex fittings.
For further optimization and validation of newly derived force field parameters, 800 ns MD simulations of GPGG, VAPG, Gly-Pro-Phe (GPF), 1.5 ls MD simulations of angiotensin II, 1 ls MD simulations of human ubiquitin (PDB entry 1UBQ), 48 600 ns and 1.5 ls MD simulations of AHM and 1.5 ls MD simulations of AHG were carried out. One molecule of zwitterionic GPGG was solvated with 253 water molecules in a dodecahedral box with a volume of 8.3 nm 3 . For VAPG, one molecule of zwitterionic peptide was solvated with 260 water molecules in a dodecahedral box with a volume of 8.4 nm 3 . In the case of GPF, one molecule of zwitterionic peptide was solvated with 292 water molecules in a dodecahedral box with a volume of 9.3 nm 3 . Similarly, one molecule of angiotensin II (with a Clanion added for neutralization) was solvated with 1201 water molecules in a dodecahedral box with a volume of 40.8 nm 3 . One molecule of ubiquitin (with six Na 1 cations added for neutralization) was solvated with 2605 water molecules in a cubic box with a volume of 91.1 nm 3 . For the Hyp parameter optimizations, one molecule of AHM was solvated with 225 water molecules in a dodecahedral box with a volume of 7.4 nm 3 and one molecule of AHG (with a Na 1 cation added for neutralization) was solvated with 300 water molecules in a dodecahedral box with a volume of 9.4 nm 3 . Other conditions and parameters of MD simulations were the same as described above for NAcPro. Frames recorded every 1 ps were used in estimating averaged 3 J-couplings from MD simulations of GPGG and ubiquitin.
The calculated 3 J HH couplings are expected to depend on the length of the MD simulation. To estimate the significance of this dependence, we have considered MD simulations of varying lengths. Calculations of 3 J HH couplings using 600, 700, and 800 ns long MD simulations of GPGG using the modified force field (referred to as (25), Table I) showed the largest variation of less than 60.023 Hz in the calculated 3 J HH values over 200 ns change in the length of the MD simulation (<0.5% of the value of the 3 J HH coupling). Two MD simulation of GPGG with 800 ns and 3 ls lengths were available for the parameter set (19), with the third largest value of V 3 considered (6.92437 kJ mol 21 , Table I). These were used for error estimates in MDpredicted quantities. The changes were (see Tables (I-IV (19) is higher than that in the final selected set (25), hence requiring longer MD simulations for better convergence in calculated parameters in the case of (19).
The motionally averaged 3 J-couplings of the peptide backbone of GPGG and ubiquitin were calculated using quantum-mechanically derived Karplus relationships 31,49 and empirically parameterized Karplus equations. 50,51 Interatomic distances from the MD simulations of GPGG were calculated in a manner similar to that used in NMR measurements 36 : (i) internuclear distances (r i ) for pairs of hydrogen atoms were calculated in each MD frame i; (ii) a quantity equal to r 26 was calculated as a measure of the expected NOE in each frame, g i ; (iii) the sum of r i 26 were used as a measure of the expected total NOE over the full length of the MD run; (iv) using r 5 2.4 Å as the reference H a -H b3 distance in the Pro residue, 36 internuclear distances for other proton pairs were calculated using the g $r 26 relationship.
As shown by Tropp, 52 when overall molecular motions are relatively slow and intramolecular motions are relatively fast, NOEs may show a r 23 dependence, for example, in globular macromolecules. In the case of the tetrapeptide GPGG used in this work for the NOE analysis, timescales of overall and intramolecular motions are both relatively fast. We have therefore used the r 26 dependence of NOEs. This is consistent with the simplified growth rates method used widely for interproton distance measurements in small molecules. [53][54][55][56][57] To determine autocorrelation times for the intramolecular motions of the C-H bond from MD simulations, the corresponding internal autocorrelation functions were calculated using the following equation 58 : where h:::i denotes an average over the MD trajectory,l is a unit vector along the CAH bond direction and P 2 is the second order Legendre polynomial. Prior to the CðtÞ calculations, the overall rotational and translational motions of the solute molecule were removed from the MD trajectory. This was accomplished by superimposing the sequence of four bonded peptide backbone atoms C(Pro)-C a (Pro)-N(Pro)-C(i11) on the corresponding atoms of the snapshot at the midpoint of the production run, chosen as the reference structure. A similar approach was used by Showalter and Br€ uschweiler in their detailed analysis of NMR relaxation data (for a detailed discussion see Section 2.3 of Ref. 25). The Lipari-Szabo model was used to fit the initial 20 ns of the autocorrelation CðtÞ functions 59 : In Eq. (2) above, S 2 denotes the order parameter and s e is the autocorrelation time for the intramolecular CAH bond reorientations. 59

Quantum-mechanical calculations
All quantum-mechanical calculations were carried out using Gaussian 09. 60 Geometry optimizations were carried using various combinations of QM methods and basis sets, as described in the main text. The "nosymm" keyword of Gaussian 09 was employed to carry out QM calculations with the symmetry of molecules disabled. For DFT M06-2X 61,62 geometry optimizations, the ultrafine numerical integration grid (with 99 radial shells and 590 angular points per shell) was used, combined with the "verytight" convergence condition (requesting the root-mean-square forces to be smaller than 1 3 10 26  Hartree Bohr 21 ). Additional frequency calculations were also undertaken to verify that the optimized geometries correspond to true minima. The reaction field method IEFPCM 63,64 was used to account for water solvent effects. The jump angles Dh of the CAH bonds as a result of the pyrrolidine ring interconversion were determined using Python Molecular Viewer (version 1.5.4). 65 Calculations employing MP2 and M06-2X methods were also carried out in which a selected dihedral angle was incremented or decremented in 5 steps. Basis sets considered are specified in the main text. At each step the selected dihedral angle was fixed with all the remaining degrees of freedom optimized using MP2 or M06-2X QM calculations. A relaxed 1D potential energy surface scan was performed in this manner and minimized QM energies at each step were obtained. The QM-optimized structures were then used in molecular mechanics (MM) calculations using AMBER99SB force field to obtain the corresponding MM energies (see the main text for further details).

Conformational notation
The original conformational notation proposed by Haasnoot et al. for L-prolines are used in this work. 66 The exo-and endo-orientations of the Pro ring carbon C g are defined relative to the substituent (COO or CONH groups) at the C a carbon of the Pro ring. The definition of endo-and exocyclic torsional angles is shown in Figure 1.
The pseudorotation phase angle, P, which identifies a given conformation on the pseudorotation circle, 66 and the pseudorotation amplitude v m , which is the maximum value attained by v 1 -v 5 . 66 The calculations of P and v m were done using equations by Westhof-Sundaralingam 67 : where and B52 2 5 Note that 180 is added to the calculated value of P if v 2 < 0. From the distributions of endocyclic torsional angles, a two-site exchange between C g -endo and C g -exo conformations of the pyrrolidine ring of Pro and Hyp residues was observed in MD simulations of the peptides considered. The populations of these ring conformations are denoted as x endo and x exo (in % with x endo 1 x exo 5 100%).

Initial simplexed MD fittings of experimental NMR data
In our initial revision of the AMBER99SB force field we undertook simplex fittings of 3 J HH -couplings, which comprised the optimization of the C-C-C-C dihedral parameters for the endocyclic carbons in the Pro residue of Nacetyl-L-proline (NAcPro) and Gly-Pro-Gly-Gly (GPGG). The choice here is dictated by the fact that accurate experimental data is available for NAcPro and GPGG. 34-36 In particular, full lineshape analysis was employed to derive accurate experimental values of 3 J HH -couplings in D 2 O solutions, with the estimated standard deviation 0.03 Hz for vicinal couplings. 34 As for the choice of the force field, the analysis of >10 different force fields applied to GPGG, identified AMBER99SB as the force field which reproduces best experimentally measured NMR parameters in aqueous solutions. 31 Thus, further improvement of this force field presents a challenging task for the simplex fittings of 3 J HH couplings.
While AMBER99SB predicts satisfactorily the relative energies of C g 2exo and C g 2endo conformations (as judged by their populations predicted by AMBER99SB MD simulations and those determined experimentally from least squares fittings of 3 J HH -couplings), 31,35,36 the predicted number of v 2 transitions in the Pro-2 residue of GPGG is nearly four times higher than the number of the backbone w transitions of Gly-3 (see Table II in Ref. 31). This is in disagreement with the available experimental data. In particular, from the auto-correlation times and activation parameters reported for GPGG in water based on 13 C spin-lattice relaxation time measurements, 68 the frequency of the torsional transitions involving the C g atom of Pro-2 is of similar order of magnitude as the frequency of the torsional transitions involving the C a atom of Gly-3 (see Tables I and II in  Ref. 68). Thus, the Pro force field parameters must be optimized such that they reproduce experimentally observed timescale of the Pro sidechain motions. As discussed above, apart from dynamics aspects, there is also need for improving the predicted structure of the pyrrolidine ring. The geometry of the pyrrolidine ring as predicted by AMBER99SB MD simulations is flatter (v m % 35 , where v m is approximately the same as the largest of the ring endocyclic torsions v 1 2v 5 , which is usually v 2 ) compared to NMR, X-ray and QM calculations (v m 5 37 242 ). 34, 35 The reason for such a difference is that the same set of dihedral CACACAC parameters is used in AMBER force fields for both the cyclic (e.g., C a 2C b 2C g 2C d in Pro corresponding to the endocyclic torsion v 2 ) and open chain systems (see Ref. 69 for details of how the CACACAC parameter was derived). For our initial simplex optimizations, a standard AMBER dihedral energy term of the following form was used: where V n represents dihedral force constant (amplitude), n is dihedral periodicity and g n with the value of either 0 or 180 is a phase of the dihedral angle u. The dihedral force constants, V n , were optimized to obtain the best agreement between experimental and MD-predicted values of 3 J-couplings of NAcPro. These are optimized for the angle v 2 (Fig. 1), which is usually the largest amongst the endocyclic dihedral angles v 1 2v 5 for the Pro sidechain in peptides and proteins. There are three non-zero V n values (V 1 , V 2 , and V 3 ) for the v 2 5 CT-CT-CT-CT torsion (CT denotes tetrahedral carbon) in the original AMBER99SB force field. Thus, three parameters V 1 , V 2 , and V 3 were optimized in our simplex fittings, each step of which consisted of MD simulation followed by the calculation of the MD-averaged 3 J HH  Prior to deciding the length of MD simulation within simplex fittings, we examined the convergence of the population of endo conformation (x endo , in %) using a 500 ns long MD simulation (Fig. S1, Supporting Information). The results indicate that after the initial $10 ns the populations of the two conformers have converged sufficiently. In particular, after 10 ns MD run the population of the endo conformer is 56.6% compared to 56.3% after 20 ns, 56.5 after 100 ns and 56.7% after 500 ns. Even in the region between 1.5 and 10 ns, the population deviations are within less than 62.0% (Fig. S1, Supporting Information). We have therefore used 7.5 ns long MD simulations at each step in our simplex fittings. The first 0.5 ns were considered as equilibration period and the corresponding data were discarded from calculations of averaged 3 J HH -couplings. Up to 10 different simplexed MD simulations were carried out using different scaling factors c between 20.5 and 5, with 50-200 steps of 7.5 ns long MD simulations in each case.
The original AMBER99SB values of force field parameters, together with those derived from our simplex fittings of experimental 3 J HH -couplings are shown in Table I. Five sets of optimized parameters (1)-(5) were selected from simplex fittings, showing the rms deviations from the experimental 3 J HH -couplings (rms Jp , in Hz) less than 0.8 Hz based on 7 ns long MD simulations. For comparison, rms Jp 5 0.96 Hz for the original AMBER99SB force field. Considering that the increase in force constants during simplex optimizations may lead to longer convergence times, we used additional 200-ns long MD simulations for final estimates of merit functions (rms Jp ) for parameter sets (1)-(5) and AMBER99SB. The results of these simulations are summarized in Table I.
As can be seen from Table I, parameter sets (1)-(5) obtained from simplexed MD simulations show 5-14% improvements in rms values compared to the original AMBER99SB force field. The v m values in (1)-(5) have slightly increased compared to that in the original force field, which are in better agreement with the NMR, XRD and QM results (37)(38)(39)(40)(41)(42). 34,35 From the E dih (v 2 ) graphs for the CT-CT-CT-CT fragment (Fig. S2, Supporting Information), it can be seen that the E dih (v 2 ) graphs for the parameter sets (1)-(4) show higher maxima at v 2 5 0 , the values of which correspond to the value of V 3 , since n 5 1 and n 5 2 terms of Eq. (5) are zero at v 2 5 0 , as c 1 5 c 2 5 180 . In the transition state between the C g -endo and C g -exo conformations of the pyrrolidine ring, the value of v 2 is 0 . Thus, the increase of the V 3 value here corresponds to the increase of the activation energy of the ring interconversion. Based on the Arrhenius relationship, the increase of the activation energy is expected to lead to the decrease of the frequency of transitions between the C g -endo and C g -exo states.
The above results suggest that relatively short MD simulations combined with subsequent long MD simulations using selected sets can be applied for the refinement of force field parameters provided that the force constants do not increase significantly. Note that the simplex fittings described in this work generate a new MD trajectory for each trial set of parameters to evaluate the rms deviation between experimental and MD-predicted NMR data, that is, new conformations are created at each step of fittings (see Single Trajectory Reweighting Approach section below). However, the disadvantage of the current method is that it is computationally expensive and relatively large increase in optimized parameters may not be described adequately by short MD simulations used in simplex fittings.

QM optimizations of force field parameters
After initial simplexed MD simulations, we considered QM optimizations of force field parameters followed by iterative MD simulations for further refinement of the force field parameters obtained from QM fittings. Four sets of QM calculations were considered to estimate the dependence of the results on the choice of the basis set The sum of V 4 , V 5 and V 6 is shown.
A.E. Aliev et al. and the QM method, as well as to assess the level of uncertainty involved: M06-2X/def2-TZVP, M06-2X/6-31G(d,p), M06-2X/cc-pVTZ and MP2/6-311G(d). Based on previous studies, [35,70] these QM methods and basis sets reproduce relative conformational energies and geometries in good agreement with experimental data. Calculations of 31 conformers of NAcPro were carried out in which the C a -C b -C g -C d dihedral angle (v 2 ) was varied in 5 steps between 275 and 175 . The QM predicted energy profiles in the gas phase and in water (using IEFPCM) 63,64 as a function of v 2 are compared in Figure 2. Considering relative energies of C g -endo and C g -exo conformers (with the corresponding v 2 values at $240 and 140 , respectively), the experimentally measured ratio of two conformers in water (x endo 561% and x exo 539%) 34 are best reproduced by IEFPCM(H 2 O) MP2/6-311G(d) and M06-2X/def2-TZVP calculations [ Fig. 2(c)]. The predicted populations of the C g -endo were 66 and 71%, respectively, by IEFPCM(H 2 O) MP2/ 6-311G(d) and M06-2X/def2-TZVP calculations. Thus, the results from these two sets of calculations were used in our further analysis.
The following merit function of Lindorff-Larsen et al. was used in our fittings 16 : where   AMBER99SB energy, E A99SB , plus a new torsion term, that replaces the existing AMBER99SB torsion, V A99SB (u): where k 0 is a constant, the V n s are force constants in the cosine expansion including N terms and the g n s are corresponding phases of the dihedral angle u. Simulated annealing fittings were employed to minimize U as a function of u 5 v 2 with N 5 3 by varying V n and c n values in the torsional force-field term. In line with the approach used to modify the AMBER99SB backbone potential, 14 we have assumed that V n ! 0 kJ mol 21 and g n is either 0 or 180 . However, on fitting the gas phase data the predicted values of V 1 , V 2 and V 3 were 0 kJ mol 21 for both the MP2/6-311G(d) and M06-2X/def2-TZVP data. We therefore consider only the IEFPCM(H 2 O) data below and any further reference to MP2/6-311G(d) and M06-2X/def2-TZVP calculations assumes the use of the IEFPCM(H 2 O) method.
The values of the merit function U for the original AMBER99SB force field compared to the MP2/6-311G(d) and M06-2X/def2-TZVP profiles were 2.84 and 2.24 kcal mol 21 (after k 0 -corrections according to Eq. (7)). On using simulated annealing fittings, these reduce to 2.39 kcal mol 21 for the parameter set (6) obtained from fittings to the MP2/6-311G(d) profile and 1.93 kcal mol 21 for the parameter set (7) obtained from fittings to the M06-2X/def2-TZVP profile (Table I and Fig.  2). In both cases, V 1 5 V 2 5 0 and V 3 6 ¼ 0 kJ mol 21 . Such a result with only V 3 6 ¼ 0 kJ mol 21 is not surprising, considering that v 2 in the pyrrolidine ring varies between $240 and 140 . Only the V 3 term (with g 3 5 0 ) will have a maximum equal to V 3 kJ mol 21 at v 2 5 0 , while the V 1 and V 2 terms (with g 1 5 g 2 5180 ) will show minima equal to 0 kJ mol 21 at v 2 5 0 . The value of V 3 increases significantly compared to the original force field, which is in qualitative agreement with earlier results from simplexed MD simulations (parameter sets (1)-(5)) indicating to better agreement with experiment on increasing V 3 . From 200 ns MD simulations of NAc-Pro (Table I), the parameter set (7) from M06-2X calculations shows significantly better agreement with experiment than (6) derived from MP2 calculations.
Using the QM-derived parameter set (7) as a starting point, simplexed MD simulations were carriedout to optimize the value of V 3. Initially, 40 ns long MD simulations were used at each step of simplexed MD simulations for merit function calculations. Parameter sets (8)-(23) were selected from these fittings with lowest merit function values for further 200 ns long MD simulations (Table I).
On increasing the length of MD runs from 40 ns to 200 ns, the rms Jp values increase from 0.65-0.85 Hz to 0.73-1.08 Hz for parameter sets (8)- (23). The MD-averaged v m values predicted by these parameter sets (Table I) show better agreement with the experimental NMR value compared to the original AMBER99SB force field. However, it is likely that at relatively high values of V 3 the short MD simulations used in simplexed MD fittings were not converged sufficiently. We therefore retain all new parameter sets (6)- (23) in our further analysis, as these provide sufficiently fine distribution of V 3 values between 1.9 and 9.3 kJ mol 21 . In addition, parameter sets (1)-(5) were also included in our further analysis.

Single trajectory reweighting approach
The first application of the method relying on the energy-based reweighting approach 71-74 to fittings of 3 J-couplings of NAcPro with optimizations of three parameters V 1 , V 2 , and V 3 led to unusually large values of V 2 and V 3 on using a 500-ns long MD trajectory with frames recorded every 0.04 ps: V 1 5 0.0419, V 2 5 22.3835, and V 3 5 22.5864 kJ mol 21 with the rms of the fitting 0.79 Hz. As the predicted value of V 3 is very high, significantly smaller number of the pyrrolidine ring transitions are expected in MD simulations compared to, for example, the number of peptide backbone transitions, which does not agree with experiment. 68 As discussed by Li and Br€ uschweiler, 73 the effectiveness of the reweighting scheme critically depends on the degree of overlap between the parent and the reweighted trajectories, since the reweighted procedure does not create any new conformations. On introducing a collectivity parameter j with the requirement j > 50% (see Eq. (2) and the discussion following it in Ref. 73), a physically plausible solution was obtained from the 500 ns long parent trajectory of NAcPro in water: V 1 5 0, V 2 5 0.0009, and V 3 5 2.3891 kJ mol 21 . This set of parameters is essentially the same as (14) (Table I) and therefore is not included into our further analysis. On increasing the number of terms from three to six in Eq. (5), an alternative set of parameters was derived using the reweighting approach, which is included in Table I as (24). Based on 200 ns MD simulations of NAcPro, this set of parameters performs slightly better than AMBER99SB and is therefore included into our further analysis.

MD simulations of Gly-Pro-Gly-Gly
For further examination, we carried out MD simulations of GPGG (Fig. 3) using force field parameters (1)-(24) and the original AMBER99SB force field. Note that AMBER99SB* or AMBER99SB-ILDN simulations would be the same in this case as the AMBER99SB simulation, as there are only Gly and Pro residues in GPGG. The recent study verifying different force fields using GPGG used 2 ls long MD simulations, which were sufficient for the majority of the force fields considered. 31 However, on examination of the convergence of the population of the folded form against the length of the MD run for the AMBER99SB force field (Fig. 6 in Ref. 31), it is clear that no significant change occurs in the population of the folded form after 600 ns. Thus, we carried out 800 ns long MD simulations for our analysis.
From the results obtained for the Pro ring in GPGG (Table II) We have also analyzed NMR parameters dependent on the backbone conformation of GPGG. In particular, based on the analysis of NOE data for GPGG internuclear distances for seven proton pairs were measured previously. 36 Averaged values of internuclear distances from MD runs were estimated over 800 ns time length for each of three MD simulations. The rms deviation between experiment and the MD predictions of distances (rms d ) were calculated (Table SI, Supporting Information). In addition, four 3 J CH and two 3 J HH were available from NMR measurements for the GPGG backbone, 31,36 which were used for NMR versus MD comparisons. As described previously, 31 two empirical (corresponding to rms Je1 and rms Je2 in Table SI Table SI confirm that modifications of the Pro v 2 dihedral parameters do not cause any significant changes in the backbone conformations as there is a very good agreement for all the MD simulations when considering parameters averaged over backbone conformations. Similarly, the population of the U-shaped folded conformation of GPGG 36 (p f , %) and the mean terminal N. . .C 0 distance (d ter , Å ) predicted by new parameter sets are in agreement with those predicted by the original AMBER99SB force field (Table III).

Matching relative motional timescales from MD simulations and experiment
To identify which of the new parameter sets is likely to reproduce both structural and dynamics properties of Pro residues more accurately, we have considered timescales of motions in GPGG. First, we consider the number of the w 2 , / 3 , w 3 (see definitions of angles in Fig. 3), v 1 and v 2 (see definitions of angles in Fig. 1) torsional transitions per nanosecond (N w2 , N /3 , N w3 , N v1 , and N v2 in Table III). As expected, the backbone transition numbers (N w2 , N /3 and N w3 ) are not affected by the change of the Pro torsional parameters, whereas moderate (N v2 % 41-66) and significant (N v2 % 1-37) decrease in the N v2 values are observed for parameter sets (1)- (5) and (6)-(24), respectively, compared to the original force field (N v2 % 81). For new force field parameter sets containing only a single V 3 term there is a linear relationship ln (N v2 ) versus V 3 (Fig. 4), as well as ln (N v2 ) versus v m (Fig. S3, Supporting Information) and v m versus V 3 (Fig.  S4, Supporting Information). Thus, we can adjust the Pro sidechain torsional force field such that the timescale of the sidechain dynamics matches that from experiment.
Using 13 C spin-lattice relaxation times measured for GPGG in water at 303 K, 68 Mikhailov et al. have estimated that the auto-correlation time of the C g AH bonds    of Pro-2 is 27 6 1.5 ps. As the accuracy of this type of measurements is critically dependent on the signal-tonoise ratio, we have repeated 13 C spin-lattice relaxation time measurements of GPGG using a higher-field NMR spectrometer (14.1 T, 600 MHz 1 H frequency) and a cryoprobe (Tables SII and SIII, Supporting Information). For the analysis of the T 1 values and deriving correlation times, we have used the approach developed by Ernst et al., 82,83 which is different to that used by Mikhailov et al. 68 The following equations were used to derive the correlation times for the overall (s c ) and the intramolecular ring interconversion (s e ) processes from the measured T 1 relaxation times 82-84 : J ðx; sÞ5 2s 11ðxsÞ 2 (11) where Du is the jump angle of the CAH bond on conformational transition, g H and g C are gyromagnetic ratios of 1 H and 13 C, É is Planck's constant divided by 2p, r CH 5 1.09 Å is the CAH bond length, Dr is the chemical anisotropy of the 13 C nucleus considered (see Experimental), N is the number of H atoms attached to the C atom. Note that in Eq. (8)), the sum of populations x endo and x exo is 1 (not in %). The correlation time s c can be determined using the NT 1 value (where N is the number of H atoms bonded to C) of the backbone C a carbons, which are least affected by the intramolecular motions, hence better describe the overall motion of the molecule. 82-86 In GPGG, NT 1 values of C a carbons are 1.146 s (Gly-1), 0.995 s (Pro-2), 1.106 s (Gly-3), and 1.836 s (Gly-4) (Table SIII, Supporting Information). The end residue backbone C a carbons of Gly-1 and Gly-4 show the largest values, which suggest additional intramolecular dynamics for this carbon compared to mid-chain C a carbons of Pro-2 and Gly-3. The minimum value of NT 1 is observed for the C a carbon of Pro-2, therefore we have used T 1 of this backbone carbon to determine the correlation time s c for the overall motion. The likely intramolecular motion that can influence the T 1 value for this carbon is the pyrrolidine ring interconversion. However, as estimated previously the jump angle Du is <5 for the C a carbon of the pyrrolidine ring (see Table IX in Ref. 82). Using Eqs. (8)- (11), it can be estimated that Du 5 5 leads to only $0.4% increase in the T 1 value and therefore can be neglected. From the T 1 value of 995 6 6 ms for the C a carbon of Pro-2 in GPGG measured at 298 K for the 57 mM solution in D 2 O, the correlation time s c is 48.2 6 0.3 ps. This value was used in the analysis of the T 1 value for the C g carbon of Pro-2 in GPGG to determine the correlation time s e for the intramolecular ring interconversion (see below).
In Eq. (8), two terms are weighted by factors dependent on the populations of C g -endo and C g -exo ring conformers (x endo and x exo , with x endo 1 x exo 51) and on the jump angle Du for a given CAH bond direction on changing the ring conformation. The largest jump angles are expected for C g carbon of the pyrrolidine ring. Thus, the T 1 relaxation times of C g carbons (Tables SII and SIII, Supporting Information) were used for s e determinations. Madi et al. determined Du values using dihedral angles, which they estimated using the Karplus relationship. 82 Because the accuracy of the Karplus relationship for predicting dihedral angles is relatively poor, we have taken a different approach, in which QM predicted geometries are used. Such an approach is supported by the finding that in the absence of relatively strong intermolecular interactions QM geometries reproduce accurately experimental molecular geometries derived from X-ray and neutron diffraction measurements. 70 We used the two lowest energy conformations of NAcPro from M06-2X/def2-TZVP IEFPCM(H 2 O) calculations described above, the geometries of which were optimized without any restrictions. Additional frequency calculations were carried out to verify that the final structures correspond to true minima. The obtained structures correspond to C g -endo-and C g -exo-conformations of the pyrrolidine ring with P/v m values of 171.5 /39.3 and 16.5 /39.0 , respectively. As discussed previously, 82,83 the most rigid part of the Pro ring in peptides is the C-N-C a -C fragment, where Cs are carbonyl carbons of COMe and COO in the case of NAcPro (see Fig. 5). We therefore overlaid the C g -endo and C g -exo conformations such that the rms deviations in the positions of four atoms of the C-N-C a -C fragment are minimal (Fig. S5, Supporting Information). 65 The angle Du was then estimated as the angle between the corresponding C g -H bond directions in two conformations. The values of Du determined for the C g -H g2 and C g -H g3 bonds were 82.65 and 82.47 with the average value of 82.56 , which was used as a fixed value of Du in our fittings using T 1 relaxation times of C g carbons. The populations of C g -endo and C g -exo ring conformers are known from the analysis of 3 J HH coupling constants measured at 298 K (Table II) 35 and are assumed to be temperature independent. With these restrictions in place, the correlation time s e for the intramolecular ring interconversion process were determined using the measured T 1 values for C g carbons at different temperatures. From the comparison of the above Eq. (8) and Eq. (37) of Lipari and Szabo, 59 the generalized order parameter is dependent on the populations of conformers and the jump angle Du in the case of the two-site jump model and can be calculated using the following relationship: For x endo 50.543 and Du582.56 , the calculated experimental value of S 2 is 0.27.
Using the measured T 1 values for C a and C g carbons of Pro in GPGG for the 214 mM solution of D 2 O (Table  SII,  Our MD simulations were carried out at 298 K. Using the T 1 value of 898 6 4 ms for the C g carbon of Pro-2 in GPGG measured at 298 K for the diluted 57 mM solution of GPGG in D 2 O, we have estimated the correlation time s e for the C g AH bond reorientations in Pro-2 of GPGG as a result of the pyrrolidine ring interconversion as 29.7 6 0.4 ps, which is slightly smaller than the value of s e calculated as 30.3 ps using the activation parameters reported above for the 214 mM solution. As higher concentrations may in principle lead to partial self-associations of peptides, 87 we have used the experimental value of s e 5 29.7 ps at 298 K as a reference point for our MD simulations. From the analysis of s e calculated for 14 parameter sets with a single non-zero  Tables (I-V)) is selected as the final solution which reproduces the experimental structural (Tables II  and III) and dynamic properties (Tables III and IV, Fig.  S7) of the sidechain of the Pro residue significantly better than the original AMBER99SB force field.

Force field phase variations
In another set of optimizations we considered variations of both the V 3 force constant and the phase g 3 . The value of V 3 was varied between 1 and 5 kJ mol 21 with a step of 1 kJ mol 21 , while the value of g 3 was varied between 250 and 50 with a step of 10 . The results of 700 ns long MD simulations for each pair of V 3 and c 3 values are summarized in Tables SV-SVIII in Supporting Information. Over four parameters considered (rms Jp , x endo , s e and S 2 ), the force field with V 3 5 4.0 kJ mol 21 and c 3 5 0 shows the best agreement with experiment. This additional grid search analysis allowed us to confirm that the above optimization leading to V 3 5 4.3474 kJ mol 21 and c 3 5 0 is the unique solution in the two-dimensional (V 3 ,g 3 )-parameter space.

Influence on the backbone conformation
To examine the influence of the new sidechain parameter set on the protein backbone conformations and dynamics, we have carried out 1-ls long MD simulations of ubiquitin. Three Pro residues of ubiquitin-Pro-19, Pro-37, and Pro-38-were considered, conformational  Unlike Pro-19 and Pro-37, the pyrrolidine ring of Pro-38 in ubiquitin is in predominantly C g -exo conformation according to MD simulations (Table SIX), which is in agreement with the finding that in Xaa-Yaa-Gly triplets of collagen the Pro ring prefers the endo pucker (i.e., C g -endo conformation) in the X position, while in the Y position it prefers the exo pucker. 89,90 In principle, this can be verified experimentally by measuring accurate values of 3 J HH -couplings of the pyrrolidine rings in ubiquitin. However, pyrrolidine cyclic protons usually show strongly-coupled 1 H NMR spectra due to small chemical shift differences for methylene protons in b and g positions. 34 Accurate measurements of J HH -couplings would therefore require a full lineshape analysis, which is complicated by strongly overlapping spectra in the case of proteins.
The values of N v2 in ubiquitin prolines are in good agreement with those predicted for the Pro residue in GPGG, although the number of v 2 transitions decreases significantly in Pro-38, which is likely caused by the Pro-37 residue preceding Pro-38. We have compared three experimental 3 J(C 0 ,H a ) couplings of 1.22 Hz (Pro-19), 1.71 Hz (Pro-37), and 1.06 Hz (Pro-38) in ubiquitin 37,38 with those calculated from MD simulations of ubiquitin using Karplus parameters, derived empirically 50 and from DFT B3LYP/EPR-III calculations. 31 Compared to the AMBER99SB*-ILDN calculations, the parameter set (25) lead to only small variations in 3 J values (Table SIX, Supporting Information). This result confirms that the changes in the sidechain dynamics interchanging the C g atom position below and the above the C a -N-C g plane cause only small changes in the torsional angle H a -C a -N-C ( Fig. 5 and Fig. S5).
Finally, the performance of parameter sets AMBER99SB*-ILDN and (25) were compared using experimental values of five different types of backbone 3 Jcouplings, each of which has been determined for 60-67 amino acid residues in ubiquitin. 37,38 On calculating the MD-predicted averaged 3 J-couplings we have considered up to four different sets of Karplus parameters for each type of 3 J coupling. 49,50 From the results summarized in Table SX (Supporting Information), both force fields reproduce 3 J couplings equally well, confirming that the new Pro torsion potential does not cause undesirable side effects on the backbone conformations compared to the original force field, the performance of which has been verified extensively. 3,4,14,15,19-33

Force field validation
As an independent test, we have used NMR data and MD simulations of Val-Ala-Pro-Gly (VAPG). In Table V, we compare conformational populations and geometries of the Pro ring in VAPG in water as predicted by NMR and by 800-ns long MD simulations. The rms Jp values relative to experimental values of 10 3 J HH -couplings show that the new force field (25) reproduces better the experimentally measured values than the original force field. The value of v m serves as a measure of nonplanarity of the five-membered ring. The results confirm that the new force field (25) leads to significantly improved agreement with experiment compared to the original force field AMBER99SB.
In terms of motional dynamics, the predicted values of the correlation time and generalized order parameter for  (4)  Although we have primarily focused on force field optimizations for the trans-rotamer about the bond preceding the Pro residue, it would be interesting to verify whether the new force field would offer any improvements for the cis-rotamer compared to the original force field. In the case of cis-VAPG (with the cis-orientation of the CH 2 group of Gly and the CO group of Pro), the MD-predicted 3 J HH couplings by the new force field (25) show improved agreement with experimental values of 3 J HH couplings compared to the original force field as judged by the rms Jp values: 1.00 Hz and 1.32 Hz for force fields (25) and AMBER99SB. However, the agreement with the experiment is not as good as for the trans-VAPG considered above due to the lower value of the predicted population of the C g -endo conformer by the new force field (74%, as opposed to the experimental value of 83%). The difference in the predicted population by the new force field is further amplified in the predicted value of S The change of the amino acid residue proceeding the Pro residue to Phe has been shown to lead to the increased population of the C g -endo conformer. 91 We have redetermined conformational characteristics of the Pro residue in Gly-Pro-Phe (GPF) using experimental values of all ten 3 J HH couplings reported by Anteunis et al. 91 and the least squares fitting procedure described previously. 34 The results summarized in Table V confirm that the content of the C g -endo conformer increases in GPF (x endo 5 68.0%) compared to that in GPGG and VAPG. However, the degree of change is not as significant as previously predicted (x endo 5 85%) using Karplus relations of Pogliani et al. 92 In Table V, we compare conformational populations and geometries of the Pro ring in GPF in water from 800 ns long MD simulations and experiment. As in the case of tetrapeptide VAPG above, the rms Jp values relative to experimental values of ten 3 J HH -couplings show that the new force field (25) reproduces better the experimentally measured values than the original force field. The higher values of x endo and v m compared to the original force field are also in better agreement with experiment (Table V).
We have also analyzed NMR data and MD simulations of octapeptide angiotensin II (Asp-Arg-Val-Tyr-Ile-His-Pro-Phe, Fig. S8 in Supporting Information). After initial assignments of peaks in 1 H and 13 C spectra of 16 mM solution of angiotensin II in D 2 O using 2D NMR spectra (Tables SXII and SXIII, Supporting Information), full 1 H NMR lineshape analysis was carried out to determine vicinal 3 J HH couplings of the Pro-7 sidechain ( Fig. S9 and Table SXIV, Supporting Information), which were subsequently analyzed to estimate conformational characteristics of the pyrrolidine ring of Pro-7 in angiotensin II. In addition, 13 C spin-lattice relaxation times were measured at 298 K (Table SXV in Supporting Information), which allowed to measure values of S 2 and s e . As in the case of GPGG and VAPG discussed above, the T 1 values of the backbone C a carbons show clear decrease towards the mid-chain residues (in ms next page): The minimum value of T 1 observed for the C a carbon of Tyr-4 suggests that this site is least affected by intramolecular motions. It is therefore best suited for determining the correlation time s c of the overall molecular motion. From Eqs. (8)- (11), the value of s c corresponding to T 1 5 310 6 3 ms is 246 6 6 ps. This value was used in the analysis of the T 1 value for the C g carbon of Pro-7 in angiotensin II to determine the correlation time s e for the intramolecular ring interconversion (see below).
To estimate the jump angle Du in angiotensin, we have used M06-2X/def2-TZVP calculations of GPF with the Phe residue following Pro as in angiotensin II. After overlaying the C g -endo-and C g -exo-conformations of GPF such that the rms deviations in the positions of four atoms of the C-N-C a -C fragment are minimal, the jump angle Du was determined as 83.16 (82.97 for C g -H g2 and 83.34 for C g -H g3 ), which was used as a fixed value of Du in our fittings T 1 relaxation data.
In Table V (Table V), the timescale of motion is reproduced significantly better by the new force field (25). The corresponding values of s e are 8.4, 33.1 and 32 6 4 ps for AMBER99SB, the new force field (25) and experiment, respectively.
Finally, the relative experimental values of overall and internal correlation times s c /s e were 48.2 ps/29.7 ps in GPGG, 82.8 ps/30.7 ps in VAPG and 246 ps/32 ps in angiotensin II. These clearly show that despite the fivefold increase in the correlation time of the overall motion, the timescale of the internal motion remains essentially unchanged in these peptides of varying size. Thus, it is likely that the overall molecular motions and the intramolecular dynamics of the Pro ring are independent in the peptides considered. has been shown to reproduce the experimentally observed preference of the C g -exo conformer in Hyp over the C gendo conformer better than that of Mooney et al. 89 Our MD simulations carried out for Ace-Hyp-NHMe (AHM, Fig. 6) are in agreement with these findings (Table VI). The predicted population of the C g -endo conformer is 51.4% on using parameters of Mooney et al., while the smaller value of 6.7% predicted by the Hyp parameters of Park et al. is in good agreement with the experimental value of 12%. Similarly, the experimental 3 J HH couplings   (Table VI). Furthermore, the predicted motional characteristics of the ring dynamics by both parameter sets are in sharp contrast with experiment, showing significantly higher frequencies of ring interconversions. In particular, the correlation times of the ring interconversions (s e ) are 7. We have optimized the force field parameters for the hydroxyproline N-C d -C g -O torsional angle (denoted as v h ) to better match the dynamics characteristics of the Hyp sidechain. The new force field (25) for the CACACAC (v 2 ) torsion was used as a fixed constant (V 3 5 4.3474 kJ mol 21 and g 3 5 0 ) in these optimizations for the Hyp residue. In the original AMBER99SB force field V 3 5 0.65084 kJ mol 21 and g 3 5 0 for the hydroxyproline N-C d -C g -O (v h ) torsion. Initially, 1.5-ls MD simulations were considered in which the value of V 3 for v h was gradually increased (Table VI). This showed that the population x endo approaches the experimental value at only very high values of V 3 (see Table  VI), at which even 1.5 ls MD simulations may not be sufficient for the convergence of the predicted population.
Similar to the Pro residue considered above, we used QM calculations to fit the v h parameters in Hyp. The  Table VI, which predicts very high value of x endo compared to experiment. Therefore, no new MD simulations were carried out.
In a new set of optimizations we considered variations of both the V 3 force constant and its phase g 3 . The results of 600-ns long MD simulations for each pair of V 3 and g 3 values are summarized in Tables SXVI-SXIX  (Supporting Information). Over four parameters considered (rms Jp , x endo , s e and S 2 ), the force field with V 3 5 5.3 kJ mol 21  ps. This value of V 3 together with the phase g 3 5 30 was used for our further verifications (referred to as parameter set (h13)). A 1.5-ls long MD simulation using force field (h13) for v h of Hyp (with force field (25) for the v 2 potential) confirmed the improvement of the parameterization of the v h potential, as S 2 is 0.69 and s e 5 77.6 ps compared to the original AMBER99SB force field with S 2 5 0.34 and s e 5 7.8 ps and the experimental values of S 2 5 0.69 and s e % 83 ps (Table VII). Also, the predicted x endo population is 9.6%, which is in close agreement with the experimental value of 11.9%. In addition, the v m value increases from 35.0 for AMBER99SB to 39.5 for (h13), which compares better to the experimental estimate of 42 6 2 . As expected,  Further independent validation for the hydroxyproline parameters was carried out using 1.5-ls long MD simulations of N-acetyl-4-hydroxy-L-proline-glycine (Ace-Hyp-Gly, AHG, Fig. 6; full NMR data is included in Tables SXX-SXXII, Supporting Information). The new force field (h13) for the v h torsion together with the force field (25) for the v 2 endocyclic torsion shows a much improved agreement with experiment compared to the original force field AMBER99SB (Table VII) For comparison, v m 5 42 6 2 based on the analysis of the experimental NMR data. The predicted value of x endo also shows improved agreement with experiment, that is, the experimental value of 13.8% 6 0.5% is reproduced as 9.4% by the new force field. This is also reflected in the reduced rms Jp value which is 0.64 Hz (Table VII). By far the largest improvement is obtained for dynamics characteristics of the hydroxyproline ring interconversion.  (Table VII).

DISCUSSION
We propose a new approach for force field optimizations which aims at reproducing experimental dynamics characteristics using biomolecular MD simulations, in addition to improved prediction of motionally averaged structural properties available from experiment. As the source of experimental data for dynamics fittings, we use 13 C NMR spin-lattice relaxation times T 1 of various backbone and sidechain carbon atoms, which allow to selectively determine correlation times of both overall molecular reorientations and intramolecular motions. For relative conformational stability and structural fittings, we use motionally averaged experimental values of NMR 3 J couplings over three bonds. The proline residue and its derivative 4-hydroxyproline with relatively simple structure and sidechain dynamics were chosen for the assessment of the new approach in this work. Initially, the grid search and simplexed MD simulations identified large number of parameter sets which fit equally well experimental J couplings. Using the Arrhenius-type exponential relationship between the force constant and the correlation time, the available MD data for a series of different parameter sets were analyzed to determine the value of the force constant that best reproduces experimental timescale of the sidechain dynamics. Verification of the new force-field parameters against NMR J couplings and correlation times showed consistent and significant improvements compared to the original force field in reproducing both structural and dynamics properties. These results suggest that matching experimental timescales of motions together with motionally averaged characteristics is a valid and robust approach for force field parameter optimization. Such a comprehensive approach is not restricted to cyclic proline and 4hydroxyproline residues and can be extended to sidechain structure and dynamics of other amino acid residues, as well as to the protein backbone. In cases more complex than the Pro or Hyp sidechain dynamics, QM methods may also prove successful in providing information regarding the barrier heights of conformational changes, especially when the interpretation of the NMR relaxation data is not straightforward.