Probing Arginine Side-Chains and Their Dynamics with Carbon-Detected NMR Spectroscopy: Application to the 42 kDa Human Histone Deacetylase 8 at High pH

Arginine side-chains play a distinct role because of their high pKa and perpetual positive charge. An NMR method is presented, based on carbon-detected 13Cζ–15Nε correlation spectra, which allows probing the arginine side-chains and their dynamics at neutral-to-high pH. The methodology is demonstrated on human histone deacetylase 8.


Theoretical aspects of probing arginine side-chain dynamics
Determining arginine 15Nε spin relaxation rates
The nuclear spin relaxation of 15Nε can be treated as described previously [1] and is similar to the relaxation of the amide nitrogen in the protein backbone. The relaxation of 15Nε is caused by the 15Nε–1Hε dipole-dipole interaction and the chemical shift anisotropy (CSA), ΔσN = 114 ppm [2], assuming axial symmetry of the CSA tensor. Due to the weak scalar coupling between 13Cζ and 15Nε (1JCN ≈ 20 Hz), we prefer to derive the motional parameters from the anti-phase relaxation rate constants, R1(2CzNz) and R2(2CzNx). By adopting this approach instead of measuring the in-phase nitrogen relaxation rates, we limit the time the magnetization spends in the transverse plane, because the extra refocusing step that would otherwise be required can be omitted and additional loss of coherence is avoided. As shown below, the measurement of anti-phase relaxation rates does not affect the accuracy of the derived motional parameters. The contributions from cross-correlations involving 15Nε and 13Cζ can be neglected because of the weak 15Nε–13Cζ dipole-dipole interaction (see below) and we therefore obtain:
$$R_2(N_x) \approx R_2(2C_zN_x) - R_1(C_z) \tag{S1}$$
$$R_1(N_z) \approx R_1(2C_zN_z) - R_1(C_z) \tag{S2}$$
where R2(Nx) and R1(Nz) are the transverse and longitudinal 15Nε relaxation rates defined below and R1(Cz) is the longitudinal relaxation rate of 13Cζ, which we determine experimentally. Apart from possible interference from cross-correlation effects, the above approximations neglect higher-frequency spectral density terms in the dipolar contributions to the relaxation rates. These approximations are reasonable due to the weak 15Nε–13Cζ dipole-dipole interaction, as shown below.
Theoretical simulations were used to verify the approximations made in Eqs S1 and S2 and in particular to investigate the effect of the cross-correlation between the 15Nε–13Cζ dipole-dipole and 13Cζ CSA or 15Nε CSA relaxation mechanisms. Since R2(Nx) in most instances is significantly larger than R1(Nz), a verification of the approximation made in Eq. S2 also verifies Eq. S1. To investigate the cross-correlation between the 15Nε–13Cζ dipole-dipole and 13Cζ CSA or 15Nε CSA relaxation mechanisms we considered a basis of four normalized operators in the product operator formalism: {E/2, Cz, Nz, 2CzNz}, where E is the identity operator, Cz is the longitudinal operator for 13Cζ, Nz is the longitudinal operator for 15Nε, and 2CzNz is the longitudinal two-spin order operator. Auto-relaxation rates and transition rates between the four operators were calculated using a 13Cζ–15Nε distance of 1.33 Å, a 13Cζ axial CSA of 78 ppm [3], and a 15Nε CSA of 114 ppm. [2] A contribution of 0.3 s−1 was added to the auto-relaxation rates of Cz and 2CzNz to account for the 13Cζ–1Hε dipolar interactions, and the relaxation of 15Nε caused by the 1Hε was explicitly added to the auto-relaxation rates of Nz and 2CzNz.
The experimentally extracted relaxation rates R1(Cz) and R1(2CzNz) were simulated by (1) integrating the homogeneous master equation [4] over the corresponding relaxation delays [5], Trelax in Figures S2 and S3, for values of Trelax used experimentally and (2) fitting the resulting intensities to single exponential decays. As a measure of the accuracy of Eq. S2 we have plotted in Figure S1 the fractional error introduced by our assumptions:
$$\frac{R_1(N_z) - \{R_1(2C_zN_z) - R_1(C_z)\}}{R_1(N_z)}$$
where R1(Nz) is the auto-relaxation rate of Nz calculated above. Overall, the total error introduced by neglecting the higher-frequency spectral density terms in the dipolar contributions to the auto-relaxation and the cross-correlation between the 15Nε–13Cζ dipole-dipole and 13Cζ CSA or 15Nε CSA relaxation mechanisms is always less than 4.5 %.
Figure S1: Contour plot of the fractional error introduced by the assumption made in Eq. S2 for probing arginine side-chain dynamics. The fractional error, [R1(Nz) − {R1(2CzNz) − R1(Cz)}]/R1(Nz), is shown for different values of the order parameter, S², and local correlation times, τe, assuming an overall rotational correlation time of 11.6 ns and a static magnetic field strength of 16.4 T.
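The subtraction that underlies this section can be illustrated with a short Python sketch. It assumes that each in-phase 15Nε rate is approximated by the corresponding anti-phase rate minus R1(Cz), which is the form that appears in the fractional-error expression of Figure S1. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
def in_phase_rates(r1_2CzNz, r2_2CzNx, r1_Cz):
    """Approximate in-phase 15N rates (s^-1) from anti-phase rates.

    Assumption: in-phase rate = anti-phase rate - R1(Cz), i.e. the
    contribution of 13C longitudinal relaxation is simply subtracted.
    """
    r1_Nz = r1_2CzNz - r1_Cz  # longitudinal in-phase rate
    r2_Nx = r2_2CzNx - r1_Cz  # transverse in-phase rate
    return r1_Nz, r2_Nx


def fractional_error(r1_Nz_exact, r1_2CzNz, r1_Cz):
    """Fractional error of the approximation, as plotted in Figure S1."""
    return (r1_Nz_exact - (r1_2CzNz - r1_Cz)) / r1_Nz_exact


if __name__ == "__main__":
    # Hypothetical rates in s^-1, chosen for illustration only.
    r1_nz, r2_nx = in_phase_rates(r1_2CzNz=1.5, r2_2CzNx=12.3, r1_Cz=0.3)
    print(r1_nz, r2_nx)
    print(fractional_error(r1_Nz_exact=1.25, r1_2CzNz=1.5, r1_Cz=0.3))
```

Because the same R1(Cz) is subtracted from both rates, the relative error is smaller for the larger transverse rate, which is why checking the longitudinal case is the more demanding test.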

Extraction of motional parameters
In general, motions of a specific bond-vector on the pico-to-nanosecond timescale are accessible by experimentally measuring nuclear relaxation rates that, in turn, can be interpreted in terms of meaningful physical parameters using different model-dependent [6] and model-independent treatments. [7] For a given IS spin-system, such as a 1H–15N spin-pair, the longitudinal relaxation rate, R1(Nz), the transverse relaxation rate, R2(Nx), and the steady-state nuclear Overhauser effect, {1H}–15N NOE, can be expressed in terms of the spectral density function, J(ω) [8] (Eqs. S3 to S5), where J(0), J(ωN) and J(ωh) are the values of the spectral density function evaluated at frequencies of 0 rad s−1, ωN, and ωh, respectively, ωN is the 15N Larmor frequency and ωh = 0.87ωH is the effective proton Larmor frequency. [9] The contribution to the transverse relaxation that originates from fluctuations of the Zeeman Hamiltonian due to chemical exchange processes is Rex. In Eqs. S6 and S7, Δσ is the chemical shift anisotropy (in ppm), assuming axial symmetry for the 15N chemical shift tensor (Δσ = 160 to 172 ppm [1] for backbone 15N and Δσ = 114 ppm for arginine side-chain 15Nε [2]), µ0 is the permeability of free space, h is Planck's constant, γH and γN are the magnetogyric ratios of 1H and 15N, respectively, and rNH = 1.02–1.04 Å is the N–H bond length. [1] The spectral density function, J(ω), can be expressed in the model-free formalism as a function of the generalized order parameter, S², the overall rotational correlation time of the protein, τR, and the correlation time of the internal motion, τe (Eq. S8), where 1/τe' = 1/τR + 1/τe. The overall rotational correlation time, τR, is determined below from the backbone R1(Nz) and R2(Nx) rates using only those residues with limited internal dynamics and no contributions to the transverse relaxation from chemical exchange.
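To make the relationship between the motional parameters and the observables concrete, the following Python sketch evaluates R1, R2 and the NOE from S², τe and τR. It is an illustration under stated assumptions, not a transcription of the original equations: it uses the widely used model-free spectral density with a 2/5 normalization and the reduced-spectral-density form of the rate expressions, in which all high-frequency terms are evaluated at ωh = 0.87ωH. The function names, the default bond length of 1.02 Å and the example parameters are choices made here.

```python
import math

# Physical constants (SI units)
MU0 = 4.0e-7 * math.pi      # permeability of free space, T m A^-1
H_PLANCK = 6.62607015e-34   # Planck's constant, J s
GAMMA_H = 2.6752218744e8    # 1H magnetogyric ratio, rad s^-1 T^-1
GAMMA_N = -2.7126180e7      # 15N magnetogyric ratio, rad s^-1 T^-1


def j_model_free(omega, s2, tau_r, tau_e):
    """Model-free spectral density (assumed 2/5 normalization)."""
    tau_ep = 1.0 / (1.0 / tau_r + 1.0 / tau_e)  # 1/tau_e' = 1/tau_R + 1/tau_e
    return 0.4 * (s2 * tau_r / (1.0 + (omega * tau_r) ** 2)
                  + (1.0 - s2) * tau_ep / (1.0 + (omega * tau_ep) ** 2))


def relaxation_rates(s2, tau_e, tau_r, b0, csa_ppm=114.0,
                     r_nh=1.02e-10, r_ex=0.0):
    """Return (R1, R2, NOE) for a 1H-15N pair at field b0 (tesla).

    Assumed reduced-spectral-density form with omega_h = 0.87 * omega_H.
    """
    omega_n = abs(GAMMA_N) * b0
    omega_h = 0.87 * GAMMA_H * b0
    # Dipolar and CSA interaction constants (rad s^-1)
    d = MU0 * H_PLANCK * GAMMA_H * abs(GAMMA_N) / (8.0 * math.pi ** 2 * r_nh ** 3)
    c = omega_n * csa_ppm * 1.0e-6 / math.sqrt(3.0)
    j0 = j_model_free(0.0, s2, tau_r, tau_e)
    jn = j_model_free(omega_n, s2, tau_r, tau_e)
    jh = j_model_free(omega_h, s2, tau_r, tau_e)
    r1 = d ** 2 / 4.0 * (3.0 * jn + 7.0 * jh) + c ** 2 * jn
    r2 = (d ** 2 / 8.0 * (4.0 * j0 + 3.0 * jn + 13.0 * jh)
          + c ** 2 / 6.0 * (4.0 * j0 + 3.0 * jn) + r_ex)
    noe = 1.0 + (d ** 2 / 4.0) * (GAMMA_H / GAMMA_N) * 5.0 * jh / r1
    return r1, r2, noe


if __name__ == "__main__":
    # tau_R = 11.6 ns and 16.4 T as in Figure S1; S2 and tau_e are hypothetical.
    print(relaxation_rates(s2=0.85, tau_e=50e-12, tau_r=11.6e-9, b0=16.4))
```

Only the squares of the interaction constants enter R1 and R2, so the sign of Δσ does not affect those two rates; the sign of γN matters only for the NOE.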

Spin Relaxation Measurements
All the carbon-detected pulse sequences for probing arginine side-chains presented here are based on the template 13Cζ–15Nε sequence shown in Figure 1. Specifically, transfers from 13Cζ to 15Nη are avoided by applying a selective RE-BURP pulse [10] centred at 84 ppm, and with a length of 5.… [11] Proton decoupling is applied during acquisition to remove the small 13Cζ–1Hε scalar coupling and 15N decoupling is applied to remove 13Cζ–15N coupling evolution. Carbon decoupling during the indirect chemical shift evolution of 15Nε is implemented using a broadband adiabatic sweep with a sweep width of 80 kHz, centred at 100 ppm [12], which decouples both the 15Nε–13Cζ and the 15Nε–13Cδ couplings. The modules included in Figure 1 to allow for relaxation measurements are shown in Figure S2.
The flanking alignment sequences [13] have the following delays = 1/(2 SL ) 4 / and /, where is the 15N 90° pulse width. The phase cycles are as follows: x y x, x (x), 2(x) (y), 4(y), rec x, 2(x), x, x, 2(x), x.
Figure S2a shows the element implemented in the pulse sequence for measuring the relaxation rate R1(2CzNz) of the longitudinal two-spin order 2CzNz. Cross-correlation effects between the 15Nε–1Hε dipole interaction and 15Nε CSA are suppressed during the relaxation delay by application of proton 180° pulses at intervals of 100 ms, while cross-correlation effects between the 15Nε–13Cζ dipole interactions and the 15Nε CSA and 13Cζ CSA are neglected as justified above.
The transverse relaxation rate of 15Nε is obtained via the corresponding relaxation rate in the rotating frame. The direction of the effective field for the 15N nucleus during the application of a spin-lock field, ωSL, along x is ẑ' = sin(θ)x̂ + cos(θ)ẑ, where x̂ and ẑ are direction vectors in the rotating frame, tan(θ) = ωSL/ΩN, and ΩN is the offset of the 15N nucleus from the RF carrier. The relaxation rate of the anti-phase coherence in the rotating frame is then given by [9]
$$R_1(2C_zN_z') = \cos^2\theta \, R_1(2C_zN_z) + \sin^2\theta \, R_2(2C_zN_x) \tag{S9}$$
which is measured with the scheme shown in Figure S2b. The pulse sequence elements before and after the spin-lock period in Figure S2b serve to align the magnetization with the effective spin-lock field, ẑ', and return it to ẑ as suggested previously. [13] Cross-correlations between the 15Nε–1Hε dipole interaction and 15Nε CSA are suppressed by application of a single proton 180° pulse in the middle of the spin-lock period. [14] The transverse anti-phase relaxation rate, R2(2CzNx), is subsequently calculated from the spin-lock field strength, ωSL, the offset ΩN, R1(2CzNz), and R1(2CzNz') using Eq. S9. One advantage of obtaining the transverse relaxation rate via the relaxation rate in the rotating frame, R1(2CzNz'), is that contributions from the exchange of 1Hε with the solvent are minimized. [15] A pulse sequence to measure the longitudinal relaxation rate of 13Cζ is shown in Figure S3.
This sequence is very similar to the sequences shown in Figures 1 and S2, except that the decay of longitudinal in-phase 13Cζ magnetization is encoded at the beginning of the sequence and is then followed by a 13Cζ–15Nε HSQC (Figure 1). Although the obtained R1(Cz) rate is affected by 13Cζ–1Hε dipole-dipole cross-correlation effects, we show below that these rates are adequate for determining the in-phase 15Nε relaxation rates.
Figure S3: … RE-BURP, E: E-BURP-2 [10], C: smoothed CHIRP [12]). Phases are x unless stated otherwise by indicating a phase cycle: …
Overall, we derive the in-phase 15Nε relaxation rates from the anti-phase relaxation rates and R1(Cz) as described above (Eqs S1 and S2), and subsequently use these in-phase relaxation rates to derive motional parameters for the arginine side-chains.
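The conversion from a rotating-frame rate to a transverse rate can be sketched in Python as follows. The sketch assumes the standard tilted-frame relation, in which the rotating-frame rate is the cos²θ-weighted longitudinal rate plus the sin²θ-weighted transverse rate, with tan θ = ωSL/ΩN. The function names and the numerical values are hypothetical.

```python
import math


def tilt_angle(omega_sl, offset):
    """Angle between the effective field and the z axis (rad).

    omega_sl: spin-lock field strength (rad/s)
    offset:   offset of the nucleus from the RF carrier (rad/s)
    """
    return math.atan2(omega_sl, offset)


def r1rho(r1, r2, omega_sl, offset):
    """Rotating-frame rate for given longitudinal and transverse rates."""
    theta = tilt_angle(omega_sl, offset)
    return r1 * math.cos(theta) ** 2 + r2 * math.sin(theta) ** 2


def r2_from_r1rho(r1rho_value, r1, omega_sl, offset):
    """Invert the tilted-frame relation to recover the transverse rate."""
    theta = tilt_angle(omega_sl, offset)
    return (r1rho_value - r1 * math.cos(theta) ** 2) / math.sin(theta) ** 2


if __name__ == "__main__":
    omega_sl = 2.0 * math.pi * 1500.0   # spin-lock field, rad/s
    offset = 2.0 * math.pi * 300.0      # hypothetical offset, rad/s
    measured = r1rho(1.5, 12.3, omega_sl, offset)
    print(r2_from_r1rho(measured, 1.5, omega_sl, offset))
```

The correction grows with the offset: on resonance the rotating-frame rate equals the transverse rate, whereas far off resonance sin²θ becomes small and the recovered rate becomes sensitive to errors in the longitudinal rate.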

Chemical Shift Assignments
For chemical shift assignments of the 13Cζ–15Nε arginine side-chain resonances in proteins at neutral and high pH, we designed a 3D CCNeCz-TOCSY pulse sequence (see Figure S4).
Figure S4: … WALTZ64 [16] with a field strength of 4 kHz (carrier at 7 ppm) for proton and 1 kHz (carrier at 3 ppm) for deuterium decoupling and GARP4 [17] with a field strength of 0.7 kHz (carrier at 78 ppm) for nitrogen decoupling. …
Briefly, the 3D CCNeCz-TOCSY pulse sequence starts with excitation and indirect chemical shift evolution of aliphatic 13C magnetization. The evolution time is restricted to 10 ms to limit carbon-carbon scalar coupling evolution. Following TOCSY mixing via the FLOPSY-16 scheme [19], 13Cδ magnetization is transferred to 15Nε via a constant-time period whose length is tuned to give zero net 13Cδ–13Cγ coupling evolution. The 15Nε chemical shift is encoded using a semi-constant-time [20] evolution period during which the 15Nε–13Cδ coupling is refocused. Concomitant evolution of the 15Nε–13Cζ coupling leads to generation of magnetization that is anti-phase with respect to 13C…
Based on our success here with T4L L99A, where ten of the 13 arginine side-chains could be assigned using this method, we anticipate that the carbon-detected sequences are suitable for assignment of arginine side-chains of per-deuterated proteins up to ~20 kDa. Alternatively, the 13Cζ–15Nε resonances can be assigned by site-directed mutagenesis, as shown below.
The codon-optimized coding sequence of HDAC8 was obtained from GenScript (Piscataway, USA) in a pET-29b+ vector containing an N-terminal His-NusA-tag [22] separated from the HDAC8 coding sequence by a linker that contains a specific TEV cleavage site (ENLYFQG). The R223K mutation was introduced by the QuikChange protocol. The wild-type and mutant constructs of … Table S1. Final sample concentrations were ~0.15–0.3 mM.

NMR experiments
All proton-detected backbone 15 Table S1 for further details).
Backbone 15N relaxation rates (R1 and R1ρ) were measured at 500 MHz (11.74 T) using established proton-detected experiments [23] based on a gradient-selected, sensitivity-enhanced, refocused 15N HSQC sequence. [24] The water signal was preserved using selective water pulses (2 ms sinc shape) and weak bipolar gradients were applied during the indirect chemical shift evolution to maintain the H2O magnetization along z. In the R1ρ sequence, cross-correlation effects were suppressed by application of random-phase proton CW during the nitrogen spin-lock, [23] and magnetization was explicitly aligned with the spin-lock field. [13] Relaxation delays were 20, 200, 400, 700, 1000 and 1400 ms for R1 measurements and 10, 20, 40, 60, 80 and 100 ms for R1ρ measurements. For the R1 measurements, 15N–1H cross-correlation pathways were suppressed by application of amide-selective IBURP-1 pulses (1.92 ms at 500 MHz, centred at 8.27 ppm) at intervals of 10 ms during the relaxation delay. [10,25] For measurement of the arginine 15Nε relaxation rates, the backbone 15…
Water magnetization was preserved in the reference spectrum as described above. Saturation of proton magnetization was achieved using a 5 s train of high-power 120° pulses applied at 5 ms intervals. The reference and saturated spectra were recorded in an interleaved fashion; therefore, to ensure full recovery of the water magnetization at the start of each increment of the reference experiment, a long recycle delay of 15 s was used.
For the comparison of signal-to-noise ratios between the proton-detected and the carbon-detected HSQC of T4 lysozyme L99A, both spectra were processed with the same shifted squared-sine window function. Noise levels and peak intensities were determined with NMRPipe [26] to calculate an average signal-to-noise ratio for each spectrum, which was further normalized by the acquisition time of the experiment.
A comprehensive list of all experiments including sample details, experimental conditions, and recording parameters is provided separately (see Table S1).

Calculation of order parameters
Rotating-frame R1ρ relaxation rates were converted to R2 relaxation rates using Eq. S9. R2/R1 ratios were calculated for those backbone amides that have limited flexibility and minimal chemical exchange. [27] Subsequently the R2/R1 ratios were used as inputs to calculate the overall diffusion tensor, D, and local correlation times (τR,local) of specific arginine backbone 1H–15N and side-chain 1Hε–15Nε bond-vectors of T4L L99A (PDB: 3dmv [28] with protons added) using the program quadric_diffusion [29] and assuming an axially symmetric diffusion tensor.
Order parameters were calculated from 15Nε side-chain R2 and R1 relaxation rates (from carbon-detected and proton-detected side-chain experiments) by using the calculated τR as a constraint and solving Eqs S3, S4, and S8 numerically for S² and τe' using the Octave numerical software package (http://www.gnu.org/software/octave/).
Side-chain order parameters were also calculated by including heteronuclear NOEs measured by proton-detected experiments. With this additional dataset, order parameters were calculated by a χ² minimization in Octave using Eqs. S3, S4, S5 and S8, again constraining τR as above. These order parameters obtained from three parameters (R1, R2, NOE) were compared to those obtained from two parameters (R1, R2) and showed good agreement (see below, Figure S9). The propagation of errors was calculated by Monte Carlo simulation [30] using at least 10 randomly generated and normally distributed datasets. The error for the input parameters (the standard deviation of the randomly generated datasets) was set to the experimentally determined error or 2% of the parameter value, whichever was larger. In the case of the NOE data the uncertainty was set to 2% of the highest parameter value.
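The error-propagation scheme described above can be sketched in Python as follows. The sketch draws normally distributed datasets around the measured values, applies the 2% floor to each input uncertainty, and reports the standard deviation of the derived quantity. The derived quantity used here (an R2/R1 ratio), the function names, the number of datasets and all numerical values are hypothetical stand-ins; in the actual analysis the function would be the model-free fit.

```python
import random
import statistics


def effective_sigma(value, measured_error, floor_fraction=0.02):
    """Input uncertainty: measured error or 2% of the value, whichever is larger."""
    return max(measured_error, floor_fraction * abs(value))


def monte_carlo_sd(func, values, errors, n_datasets=500, seed=0):
    """Standard deviation of func(*values) under Gaussian input noise."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    sigmas = [effective_sigma(v, e) for v, e in zip(values, errors)]
    results = []
    for _ in range(n_datasets):
        trial = [rng.gauss(v, s) for v, s in zip(values, sigmas)]
        results.append(func(*trial))
    return statistics.stdev(results)


def rate_ratio(r1, r2):
    """Stand-in for the derived quantity (hypothetical example)."""
    return r2 / r1


if __name__ == "__main__":
    # Hypothetical rates (s^-1) and experimental errors.
    print(monte_carlo_sd(rate_ratio, values=[1.2, 12.0], errors=[0.01, 0.1]))
```

The 2% floor prevents unrealistically small fitted errors from producing overconfident uncertainties in the derived parameters.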

Assignment of R223 and potassium titrations
The 13Cζ–15Nε correlation spectra of the HDAC8 mutant R223K were recorded in the same potassium-phosphate buffer as the wild-type (see Table S1) and spectra of wild type and mutant were overlaid (see Figure 3). HDAC8 has been proposed to be regulated allosterically by potassium ions [31] and consequently mutations might cause perturbations in other parts of the enzyme. No other arginines are in the close vicinity of R223, the closest arginine to R223 in the crystal structure (PDB code: 2V5W) being approximately 20 Å away. We therefore conclude that the obvious disappearance of a dispersed peak as shown in Figure 3 is due to the absence of an arginine residue at position 223 and attribute slight changes of the intensities seen in the random-coil region of the spectrum to secondary effects caused by the mutation.
Potassium titrations of HDAC8 were performed in Tris buffer (25 mM Tris pH 8.0, 0.5 mM TCEP, 1 mM NaN3, 0.001% DSS and 10% D2O) with 1, 10, 100 and 200 mM KCl and approximately 0.2 mM HDAC8. Two-dimensional 13Cζ–15Nε HSQC spectra were recorded using the sequence presented in Figure 1. The experimental time for one titration point was approximately 19 h. The R223 cross-peak was hardly visible at low concentrations of potassium and its intensity increased as the concentration of potassium was increased. Thus, the binding/release of the potassium ion is in the slow exchange regime [32], i.e. on the order of or slower than ~10 ms, since a change in intensity of the R223 peak is observed rather than a change in peak position. The fact that we could not observe an isolated peak disappearing in the 13Cζ–15Nε HSQC spectrum as the concentration of potassium was increased suggests that the arginine side-chain of the potassium-free state of HDAC8 is either disordered or undergoing millisecond chemical exchange such that the corresponding resonance either becomes masked in the random-coil region of the spectrum or is broadened beyond detection, respectively.
Peak volumes and errors were extracted from a fit of a Gaussian line-shape as implemented in Sparky. [33] Furthermore, these were corrected for differences in protein concentration as estimated from the relative area of the methyl regions in 1D 1H-detected spectra. To correct for the loss of sensitivity due to higher ionic strength, the proton signals of the Tris buffer, which is constant in concentration during the titration, were used as references. Subsequently, the obtained normalized intensities were fitted to a hyperbolic binding curve,
$$I([K^+]) = I_0 + (I_{max} - I_0)\,\frac{[K^+]}{K_D + [K^+]}$$
where I([K+]) is the corrected peak volume at a given potassium concentration [K+], I0 and Imax are the (corrected) peak volumes at zero and saturating potassium concentrations, respectively, and KD is the dissociation constant of potassium binding. The data in Figure 3d are normalized such that Imax = 1.0.
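A minimal Python sketch of such a fit is given below. It assumes the hyperbolic form I = I0 + (Imax − I0)[K+]/(KD + [K+]) and, to stay within the standard library, scans KD over a logarithmic grid while solving for I0 and Imax by linear least squares at each grid point. The fitting strategy, the function names and the synthetic intensities are illustrative choices and do not reproduce the original analysis; only the four potassium concentrations are taken from the titration described above.

```python
def binding_curve(k, i0, i_max, kd):
    """Hyperbolic binding curve (assumed form)."""
    return i0 + (i_max - i0) * k / (kd + k)


def log_grid(lo, hi, n):
    """n logarithmically spaced values between lo and hi."""
    return [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]


def fit_binding(conc, intensity, kd_grid):
    """Grid search over KD with a linear least-squares solve for I0 and Imax."""
    n = len(conc)
    best = None
    for kd in kd_grid:
        # For fixed KD the model is linear in x = [K+]/(KD + [K+])
        x = [k / (kd + k) for k in conc]
        sx, sy = sum(x), sum(intensity)
        sxx = sum(a * a for a in x)
        sxy = sum(a * b for a, b in zip(x, intensity))
        slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        i0 = (sy - slope * sx) / n
        sse = sum((i0 + slope * a - b) ** 2 for a, b in zip(x, intensity))
        if best is None or sse < best[0]:
            best = (sse, i0, i0 + slope, kd)
    return best[1], best[2], best[3]  # I0, Imax, KD


if __name__ == "__main__":
    conc = [1.0, 10.0, 100.0, 200.0]  # mM KCl, as in the titration
    # Synthetic, noise-free intensities generated with hypothetical parameters.
    data = [binding_curve(k, 0.05, 1.0, 20.0) for k in conc]
    print(fit_binding(conc, data, log_grid(0.1, 1000.0, 2001)))
```

With only four titration points and three free parameters, the fit has a single degree of freedom, so the uncertainty of KD should be assessed separately, for example by the Monte Carlo approach used for the order parameters.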

Isotope shifts in the 13Cζ–15Nε spectrum
The peak shape in the 13Cζ–15Nε HSQC spectrum of T4L L99A (Figure 2b) is not fully symmetrical and for the intense signals we observe some residual upfield-shifted peaks (see for example at 159.5 ppm and 83.6 ppm in Figure 2b). These weak residual peaks are caused by the 10% D2O in the samples, which results in a distribution of isotopomers and consequently isotope shifts. [34] In this study we recorded the spectrum in 90% H2O/10% D2O to allow comparison of the motional parameters derived from the carbon-detected experiments with those derived from conventional proton-detected experiments. For applications at neutral-to-high pH, a 100% D2O buffer could often be preferable. For comparison, Figure S5 shows a spectrum of T4L L99A in ~100% D2O. In our study, however, the small residual isotopomer peaks did not hamper either the chemical shift assignment or the relaxation measurements of T4L L99A.

Chemical shift dispersion of the 13Cζ–15Nε spectrum
The 13Cζ–15Nε and 1Hε–15Nε spectra show similar levels of chemical shift dispersion (Figure S5), indicating the feasibility of the carbon-detected method for probing individual arginine side-chains.
As shown in Figure S6 below, the numbers of expected overlapped peaks in the two spectra, 13Cζ–15Nε and 1Hε–15Nε HSQC, are very similar. Briefly, we randomly generated two-dimensional spectra using peak positions published in the BMRB database [35] and linewidths extracted from the spectra in Figure 2b and Figure S5. For different numbers of peaks we calculated the most probable number of overlaps and found that the number of predicted overlaps is very similar for the 13Cζ–15Nε and 1Hε–15Nε spectra. For example, for a protein with 13 arginines, there will be on average 0.95 ± 1.1 overlaps of peaks in the 1Hε–15Nε spectrum, while 1.05 ± 1.1 overlaps are expected in the 13Cζ–15Nε spectrum. Although these simulations might be biased by the peak positions published in the BMRB database, they demonstrate that a similar overlap of peaks is expected in the two spectra.
Figure S6: … shifts were taken randomly from the BMRB database. [35] Linewidths used for the simulation of the …
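The overlap estimate can be illustrated with a small Python simulation. The sketch draws a given number of peaks at random from a pool of two-dimensional peak positions and counts the pairs that lie within one linewidth of each other in both dimensions. The overlap criterion, the synthetic shift pool and the linewidths are assumptions made here; the original analysis used shifts from the BMRB database and experimental linewidths.

```python
import random
import statistics


def count_overlaps(peaks, lw_x, lw_y):
    """Number of peak pairs closer than one linewidth in both dimensions."""
    n = 0
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            if (abs(peaks[i][0] - peaks[j][0]) < lw_x
                    and abs(peaks[i][1] - peaks[j][1]) < lw_y):
                n += 1
    return n


def simulate(pool, n_peaks, lw_x, lw_y, trials, seed=0):
    """Mean and standard deviation of the overlap count over random spectra."""
    rng = random.Random(seed)
    counts = [count_overlaps(rng.sample(pool, n_peaks), lw_x, lw_y)
              for _ in range(trials)]
    return statistics.mean(counts), statistics.stdev(counts)


if __name__ == "__main__":
    # Hypothetical pool of (13C, 15N) shifts in ppm standing in for database entries.
    rng = random.Random(42)
    pool = [(rng.gauss(159.5, 0.6), rng.gauss(84.5, 1.5)) for _ in range(2000)]
    # Hypothetical linewidths in ppm for the two dimensions.
    print(simulate(pool, n_peaks=13, lw_x=0.05, lw_y=0.15, trials=1000))
```

Running the same function with a pool and linewidths appropriate for the proton-detected spectrum would give the corresponding estimate for comparison.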

Consistency of relaxation rates derived from carbon-detected and proton-detected spectra
As described above, we used 13Cζ–15Nε correlated spectra to determine the 15Nε relaxation rates and subsequently derive motional parameters for arginine side-chains. More specifically, the anti-phase spin relaxation rates R1(2CzNz) and R1(2CzNz') were measured for the peaks shown in Figure 2b. In an independent experiment, we measured R1(Cz) in order to calculate the pure in-phase 15Nε relaxation rates, R1(Nz) and R1(Nz'), according to Eqs S1 and S2. As an initial validation of our strategy to quantify molecular motions of arginine side-chains using carbon-detected experiments, we compared these rates obtained from the 13Cζ–15Nε-type experiments with the corresponding R1(Nz) and R1(Nz') rates obtained using standard proton-detected experiments. [25] This comparison is shown in Figure S7 where the 15…
In Figure 2d we calculate the side-chain order parameters solely from the R1 and R2 rates, whereas order parameters are often calculated from the three rates, R1, R2, and {1H}–15N NOEs. Figure S9 shows that arginine side-chain order parameters calculated solely from the R1 and R2 rates are in good agreement with the corresponding order parameters calculated from the three rates, thus justifying our approach.

Comparison of side-chain and backbone order parameters
The derived order parameters for the arginine side-chains of T4L L99A range from approximately 0.1 to 0.9, showing that some arginine side-chains (S² ~ 0.1) are nearly completely uncoupled from the overall tumbling while other arginine side-chains (S² ~ 0.9) are as rigid as the backbone. This is in agreement with previous results obtained for methyl-bearing side-chains, where the range of order parameters correlates with the number of degrees of freedom for the side-chain. [36] The fact that we see this wide range of order parameters for the arginine side-chain is therefore in agreement with possible motions around the four side-chain dihedral angles (χ1, χ2, χ3, χ4).
In general, the side-chains of amino acids probe a different environment from that of the backbone. We have compared the side-chain order parameters, S², of T4L L99A with those derived from the backbone. This comparison is shown in Figure S10. Although the results confirm earlier studies [2,37] in showing that the side-chain and the backbone motions are largely uncoupled, it once again underlines the need to probe the side-chain moieties specifically in order to gain insight into their dynamics.
Figure S10: … experiments. The R2 rates were not corrected for the contribution from chemical exchange, Rex, because the Rex contributions calculated under the applied conditions [27,38] were in general small (<0.5 s−1, <3 %). The contribution from chemical exchange was suppressed, because the R1ρ rates used to calculate R2 were measured using a spin-lock field (ωSL = 2π·1500 rad/s) that is larger than the chemical exchange rate (kex ~ 1000 s−1). [38] The solid line represents y = x. It is clear that the side-chain order parameters and thus dynamics of the side-chains are very different from the backbone dynamics, as has also been observed for other side-chains.
Table S1 (partial rows):
… [24]; T4 L99A; 13C-15N-2H; 2.5 mM; A; 11.7 T; 47.7 ms; 8; 19h23min; Figure S9, S10
1Hε–15Nε R1ρ(Nz') [23,24]; T4 L99A; 13C-15N-2H; 2.5 mM; A; 11.7 T; 47.7 ms; 16; 1d15h; Figure S9, S10