Water Mediation Is Essential to Nucleation of β-Turn Formation in Peptide Folding Motifs

Water-mediated bond formation: The structure of the peptide GPG-NH2 has been investigated in aqueous solution to understand the role of water in the formation of a β-turn. Using a combination of neutron diffraction enhanced by isotopic substitution, NMR spectroscopy, and computer simulations, it was found that water is an essential component to initiate folding in solution.


Sample preparation
Glycyl-L-prolyl-glycinamide·HCl (GPG·HCl) was purchased from Bachem (Bubendorf, Switzerland) and used without further purification. For the samples that required deuterium labeling, GPG·HCl was dissolved in 99.8% D2O and then lyophilized. This process was repeated three times to ensure an adequate level of deuteration of the exchangeable hydrogens.
For the NMR experiments, all samples were prepared under an inert N2 atmosphere, either in a glove box or using a gas/vacuum manifold. D2O (99.8%) was obtained from Acros Organics and ultrapure H2O from a Millipore water purification system; both were thoroughly degassed with N2 prior to use.
Deuterium-exchanged GPG (dGPG) was dissolved in D2O, H2O, or a mixture of the two at two concentrations: a 'dilute' sample at around 0.09 M and a 'concentrated' sample at the same solute:solvent ratio used in the NDIS measurements, namely a molecular ratio of 1:58 GPG:water (0.95 M). Both dilute and concentrated samples were also prepared using fully protiated GPG and H2O in order to measure density and pH. The pH was 5.3 for the dilute sample and 4.9 for the concentrated sample; the concentrated sample had a measured density of 1.08 g/ml in H2O.

Neutron diffraction (NDIS) experiments and EPSR
The use of NDIS for hydrogen-containing systems is particularly effective because the neutron scattering length b of hydrogen is relatively large (-3.74 fm) and very different from that of deuterium (6.67 fm). [1] By measuring isotopically unique yet chemically equivalent species, it is possible to obtain a variety of diffraction patterns for the liquids in question. The diffraction pattern (or static structure factor) for a liquid or solution, F(Q), can be written as
$$F(Q) = \sum_{\alpha,\,\beta \geq \alpha} (2 - \delta_{\alpha\beta})\, c_\alpha c_\beta\, b_\alpha b_\beta \left( S_{\alpha\beta}(Q) - 1 \right) \qquad \text{(S1)}$$
where c_α and b_α are the relative concentration and scattering length of atom type α, respectively, δ_αβ is the Kronecker delta, and Q is the magnitude of the scattering vector, Q = (4π/λ) sin θ, with λ the neutron wavelength and 2θ the scattering angle. Eq. S1 describes the sum of all of the partial structure factors S_αβ(Q) for each unique atom pair. The partial structure factors are related by Fourier transform to the radial distribution functions (RDFs) g_αβ(r), which describe the atomic distances in real space on the Å (10^-10 m) scale, via
$$S_{\alpha\beta}(Q) = 1 + \frac{4\pi\rho}{Q} \int_0^\infty r \left( g_{\alpha\beta}(r) - 1 \right) \sin(Qr)\, dr \qquad \text{(S2)}$$
where ρ is the atomic number density of the sample (in atoms/Å^3) and g_αβ(r) is the RDF between atoms α and β. Further, the average number of β atoms around a central α atom (the coordination number, CN) at distances between r_1 and r_2 can be calculated from these RDFs using
$$n_\alpha^\beta = 4\pi\rho\, c_\beta \int_{r_1}^{r_2} g_{\alpha\beta}(r)\, r^2\, dr \qquad \text{(S3)}$$
Neutron diffraction measurements were performed on the SANDALS instrument located at the ISIS Facility (STFC, UK) on GPG-NH3+·Cl− (see Fig. 1 main text) using four isotopically substituted water solvents, each at a concentration of ≈ 1 M (1:58 GPG:water ratio) at 298 K.
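The scattering-vector definition and the coordination-number integral described in this section can be sketched numerically as follows. This is a minimal illustration, not the analysis code used in this work: the function names, the trapezoidal integration and the test inputs (a structureless fluid with g(r) = 1, ρ = 0.1 atoms/Å^3, c_β = 0.5) are assumptions chosen only so that the result can be checked against the analytic value.

```python
import math


def scattering_vector(wavelength, two_theta_deg):
    """Return Q = (4*pi/wavelength) * sin(theta) for scattering angle 2*theta (degrees)."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi / wavelength * math.sin(theta)


def coordination_number(r, g, rho, c_beta, r1, r2, tol=1e-9):
    """Trapezoidal estimate of 4*pi*rho*c_beta * integral of g(r)*r^2 from r1 to r2.

    r and g are equal-length sequences (r ascending, in Å); rho is the atomic
    number density in atoms/Å^3; c_beta is the atomic fraction of species beta.
    Only grid intervals lying fully inside [r1, r2] are included.
    """
    total = 0.0
    for i in range(len(r) - 1):
        a, b = r[i], r[i + 1]
        if a < r1 - tol or b > r2 + tol:
            continue
        total += 0.5 * (g[i] * a * a + g[i + 1] * b * b) * (b - a)
    return 4.0 * math.pi * rho * c_beta * total


# Illustrative check with a structureless fluid, g(r) = 1 (made-up inputs, not
# data from this study): the result should approach (4/3)*pi*rho*c_beta*r2^3.
r_grid = [0.01 * i for i in range(301)]  # 0 to 3 Å in 0.01 Å steps
g_flat = [1.0] * len(r_grid)
cn = coordination_number(r_grid, g_flat, rho=0.1, c_beta=0.5, r1=0.0, r2=3.0)
exact = 4.0 / 3.0 * math.pi * 0.1 * 0.5 * 3.0 ** 3
print(cn, exact)
```

In practice r_1 and r_2 are usually taken as the bounds of the first peak of g_αβ(r), so the integration limits are a choice made per atom pair rather than a fixed property of the method.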
All samples for NDIS measurements were prepared by weight and then transferred to flat-plate vanadium cells coated with an ≈ 0.1 mm layer of PTFE. Vanadium containers were used because the scattering of neutrons from vanadium is predominantly incoherent, which leads to a more tractable data analysis. Although the PTFE itself contributes to the container background, the very low quantity present does not interfere with the scattering signal of the sample itself. For each measurement, the raw data were converted to F(Q) after appropriate corrections for absorption, multiple scattering and inelasticity effects using the program GUDRUN, which is available at ISIS [2] and based on the ATLAS package. [3] The corrected F(Q) data are shown in Fig. S1 for the measured data sets listed in Table S1.
Figure S1. Data (points) and fits (black line). The differences between data and fits are shown as grey lines.
Empirical potential structure refinement (EPSR) modeling is a reverse Monte Carlo technique used to augment the information obtained by NDIS on solutions. [4] EPSR simulations were performed with a box that contained 20 GPG-NH3+ molecules, 20 Cl− ions and 1160 water molecules. The 'seed' potentials for the GPG-NH3+ molecules were adapted from the CHARMM force field [5,6] and modified TIP3P potentials were used for the water molecules. [7,8] The peptide bonds were constrained to be planar and the cis-trans ratio was fixed to the value obtained by NMR.
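As a consistency check on the box composition, the water:peptide ratio, the total atom count and a rough atomic number density can be estimated from the numbers quoted in the text. The sketch below is illustrative only. It assumes the molecular formula C9H17N4O3 for the GPG-NH3+ cation and standard atomic masses, and it reuses the 1.08 g/ml density measured for the protiated H2O sample; the density actually used in the EPSR box is not stated here, so the resulting values are estimates, and all names in the code are hypothetical.

```python
AVOGADRO = 6.02214076e23  # 1/mol
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.45}  # g/mol

# Assumed formulas (not stated in the text): protonated Gly-Pro-Gly-NH2,
# chloride counter-ion, and light water.
GPG_CATION = {"C": 9, "H": 17, "N": 4, "O": 3}
CHLORIDE = {"Cl": 1}
WATER = {"H": 2, "O": 1}

# Box composition quoted in the text: 20 peptides, 20 Cl- ions, 1160 waters.
BOX = [(GPG_CATION, 20), (CHLORIDE, 20), (WATER, 1160)]


def atoms_in(formula):
    """Number of atoms in one molecule or ion."""
    return sum(formula.values())


def molar_mass(formula):
    """Molar mass in g/mol from the element counts."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def number_density(box, mass_density_g_per_ml):
    """Estimate atoms/Å^3 for a box of the given composition and mass density."""
    n_atoms = sum(count * atoms_in(f) for f, count in box)
    mass_per_mol_of_boxes = sum(count * molar_mass(f) for f, count in box)  # g/mol
    volume_cm3 = mass_per_mol_of_boxes / mass_density_g_per_ml / AVOGADRO
    volume_A3 = volume_cm3 * 1e24  # 1 cm^3 = 1e24 Å^3
    return n_atoms / volume_A3


waters_per_peptide = 1160 // 20
total_atoms = sum(count * atoms_in(f) for f, count in BOX)
rho = number_density(BOX, 1.08)
print(waters_per_peptide, total_atoms, round(rho, 4))
```

Under these assumptions the estimate comes out close to 0.10 atoms/Å^3, similar to the atomic number density of pure water, which is the expected order of magnitude for a dilute aqueous solution.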
EPSR begins with a standard Monte Carlo simulation using a set of atomic reference potentials. After this initial step, these potentials are refined iteratively until the simulated structure fits all of the provided neutron data, ultimately resulting in a model that is consistent with the set of isotopically unique, chemically equivalent data. It should be noted that while EPSR provides a model that is consistent with the diffraction data, this model is not necessarily unique. [4,9]

Absence of GPG-GPG aggregation in solution
Small angle neutron scattering (SANS) measurements were performed on protiated GPG in D2O (the exchangeable hydrogens were deuterated in this instance) using the SANS2d instrument at the ISIS facility (STFC, UK) to assess any macromolecular structure formation in solution. SANS is useful in that it can detect longer-range structures, typically those above around 50 Å. [10] Fig. S2 shows the SANS measurement of GPG in solution, where it is evident that no long-range structures are formed, as aggregation of molecules would lead to a strong rise of the scattering intensity at low Q. Possible smaller-scale association between GPG molecules in solution was also assessed in both EPSR and MD simulations. The intermolecular g(r) between the GPG oxygen and hydrogen atoms on different GPG molecules are shown in Fig. S3 and the corresponding coordination numbers are listed in Tables S2, S3 and S4. Neither the pure MD simulation nor the EPSR simulation of the NDIS data shows an appreciable amount of aggregation.
Figure S3. Intermolecular radial distribution functions of the N-H hydrogens around the C=O oxygens. The corresponding coordination numbers are given in Tables S2, S3 and S4.

Details of the MD simulations
The molecular dynamics simulations were performed with GROMACS. [11] For each system, a series of position restraints was first used to eliminate any clashes between atoms that resulted from the building of the initial configurations. The temperature was then equilibrated by running a 2 ns simulation in the NVT ensemble at 300 K. Next, the pressure was allowed to equilibrate in a 2 ns simulation in the NPT ensemble at a temperature of 300 K and a pressure of 1 atm. Finally, the production simulations were conducted in the NPT ensemble at 300 K and 1 atm and were run for 40 ns using a 2 fs timestep. The configuration of the system was saved every 10 ps for the analysis. In all simulations, the temperature was controlled using the Nosé-Hoover thermostat [12,13] and the pressure was controlled using the Martyna-Tuckerman-Tobias-Klein (MTTK) barostat. [14] The van der Waals interactions were cut off at 14 Å, while the particle mesh Ewald (PME) algorithm [15,16] was used to compute the long-range Coulomb interactions. The CHARMM force field used to model the GPG molecules in the MD simulations included the CMAP term, which is a grid-based correction for the φ-ψ angular dependence of the energy. This correction has been shown to provide significant improvements in the residue-location-specific distribution of dihedral angles in simulations of proteins in solution. [17] In keeping with the standard implementation of the CMAP term, the correction is not applied to either of the two terminal Gly residues.
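The run lengths quoted above translate into step and frame counts as in the following sketch. It only restates the arithmetic implied by the text (2 fs timestep, 2 ns equilibration stages, 40 ns production, one saved configuration every 10 ps); the function and variable names are hypothetical and do not correspond to GROMACS input parameters.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def md_schedule(length_ns, timestep_fs, save_interval_ps):
    """Return (integration steps, steps between saved frames, saved frames).

    Integer femtosecond arithmetic is used to avoid floating-point rounding.
    The frame count excludes the initial configuration at t = 0.
    """
    steps = (length_ns * FS_PER_NS) // timestep_fs
    save_every = (save_interval_ps * FS_PER_PS) // timestep_fs
    frames = steps // save_every
    return steps, save_every, frames


# Values taken from the text: 2 fs timestep, 10 ps between saved configurations.
production = md_schedule(length_ns=40, timestep_fs=2, save_interval_ps=10)
equilibration = md_schedule(length_ns=2, timestep_fs=2, save_interval_ps=10)

# The 14 Å van der Waals cutoff expressed in nm, the length unit GROMACS uses.
vdw_cutoff_nm = 14 / 10

print(production)      # (20000000, 5000, 4000)
print(equilibration)   # (1000000, 5000, 200)
print(vdw_cutoff_nm)   # 1.4
```

Each production trajectory therefore provides about 4000 configurations for the analysis, assuming every saved frame is used.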

Details of the NMR experiments
NMR spectra were acquired using home-built 500 MHz and 750 MHz spectrometers at the University of Oxford, which are controlled with GE/Omega software and equipped with a home-built triple-resonance pulsed-field-gradient probehead. For all the experiments, the sample temperature was set to 20 °C. The full 1H spectrum for GPG in H2O is shown in Fig. S4; the concentration of the GPG sample was the same as in the neutron measurements. The distinct cis and trans peaks can be seen in this figure. The spectra from two more dilute solutions of GPG in water were also obtained to ensure that the cis/trans ratio was not concentration dependent.
Figure S5. Naming convention used for the NMR spectra shown in Figs. S4 and S6. The NH3+ group is also named for clarity although it is not visible in the NMR spectra.