Centerband‐Only Detection of Exchange NMR with Natural‐Abundance Correction Reveals an Expanded Unit Cell in Phenylalanine Crystals

Abstract The NMR pulse sequence CODEX (centerband-only detection of exchange) is a widely used method to report on the number of magnetically inequivalent spins that exchange magnetization via spin diffusion. For crystals, this rules out certain symmetries, and the rate of equilibration is sensitive to distances. Here we show that for 13C CODEX, consideration of natural abundance spins is necessary for crystals of high complexity, demonstrated here with the amino acid phenylalanine. The NMR data rule out the C2 space group that was originally reported for phenylalanine, and are only consistent with a larger unit cell containing eight magnetically inequivalent molecules. Such an expanded cell was recently described based on single crystal data. The large unit cell dictates the use of long spin diffusion times of more than 200 seconds, in order to equilibrate over the entire unit cell volume of 1622 Å³.

Despite the small size of the molecule, the structure of the common polymorph of phenylalanine that forms upon evaporation from water was reported only in 2014, based on high-quality single crystals. [1] The polymorph was previously described as C2 with two inequivalent molecules in the unit cell, and now stands corrected with an expanded unit cell containing eight molecules. The corrected structure is also in agreement with a DFT report in which the geometry of a unit cell with eight molecules was optimized. [2] Magic-angle spinning (MAS) NMR spectra of phenylalanine resolved two inequivalent molecules based on isotropic chemical shifts, consistent with C2 symmetry, but showed linewidths that are significantly larger than those of the phenylalanine hydrochloride salt (Figure S1). The new interpretation of a P21 polymorph explains this additional broadening. Nevertheless, the isotropic chemical shift spectrum cannot resolve the 8 molecules in the unit cell, a degeneracy of chemical shifts that is well known in NMR crystallography. [3-5]
The CODEX pulse sequence is based on magnetization exchange among nearby spins and is more sensitive than multiple quantum [6-9] approaches for determining the oligomeric number or the number of inequivalent molecules in crystals. The sequence was first introduced by Schmidt-Rohr et al. for the investigation of slow dynamics [10,11] and later demonstrated by Hong et al. for determining oligomeric numbers and intermolecular distances. [12,13] By isotope labelling only a single site and encoding the orientation-dependent chemical shift anisotropy (CSA), the CODEX sequence can detect spin exchange [14,15] between chemically equivalent but orientationally inequivalent spins. Such exchange is not detected in 2D exchange spectra such as proton-driven spin diffusion (PDSD), [14,15] since the exchanged signal also occurs on the diagonal.
At sufficiently long mixing times, the initial magnetization is equally distributed among the different orientations in the cluster, and since only the starting orientation results in signal, it follows that the signal plateaus at the inverse of the number of spins in the molecular cluster. In a crystal, translation of the lattice results in an identical orientation, and the signal is retained, allowing the determination of the number of inequivalent molecules in the unit cell. [13] In addition, if the lattice geometry or oligomer symmetry is known, distances among spins can be deduced from the spin diffusion rate. [13,14]
Due to its robustness in long distance restraint determination, CODEX has been widely used. Hong and colleagues applied 13C and 19F CODEX to the determination of spin diffusion rates in single-site labelled glycine, leucine, phenylalanine hydrochloride and tryptophan crystals [13,16] and to distance determination in membrane proteins such as the tetrameric Influenza A M2 and HIV gp41 in lipid bilayers. [17-19] Schmidt-Rohr and Hu quantified the distribution of strongly bonded citrates on the surface of hydroxyapatite using 13C CODEX and REDOR. [20] Kong and coworkers applied 13C CODEX to the partitioning of surface ligands on nanocrystals and established a correlation with the solubility of nanoparticles. [21] In the same year, Kong and Wang applied 13C CODEX to measure the distribution of acetate on metal organic framework (MOF) surfaces and to describe the potential defects caused by acetate in the MOF structure. [22] However, previous reports of 13C CODEX data considered only spin diffusion between singly labelled carbon sites, without consideration of the magnetization exchange to 13C spins at unlabelled sites, which are present at a natural abundance of 1 %.
Consideration of spins at 1 % abundance is not needed for small molecules, so the previous studies remain valid. However, the effect of unlabelled sites accumulates as the molecular size and the number of distinct molecular orientations increase. We take the common amino acid phenylalanine as an example. Apart from the singly labelled site, there are 8 other carbon sites in phenylalanine. Once unlabelled sites are considered, the number of spins over which magnetization can equilibrate (M') therefore averages 1.08 rather than 1 for a single molecule (M = 1). Consequently, for a cluster of 10 phenylalanine molecules, the average number of spins to consider is 10.8, which is very close to 11. Unless the natural abundance spins are considered, the equilibrium magnetization after a long CODEX mixing time would therefore lead to the incorrect conclusion of an 11-molecule cluster. This impact is much larger than the almost negligible influence of the 99 percent isotope enrichment level (Table S1).
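The arithmetic behind M' can be checked with a short calculation. The following is a minimal sketch, assuming that each unlabelled carbon contributes on average 0.01 spins, that the labelled site is fully enriched, and that the long-time plateau is 1/M'; the function names are illustrative and are not taken from the analysis code of this work.

```python
# Sketch: average number of 13C spins per cluster (M') when
# natural-abundance carbons are counted. Names are illustrative.

NATURAL_ABUNDANCE_13C = 0.01  # ~1 %, the value used in the text


def effective_cluster_size(n_molecules, n_unlabelled_carbons,
                           abundance=NATURAL_ABUNDANCE_13C):
    """Return M' for a cluster of n_molecules.

    Each molecule contributes one labelled site (assumed fully enriched)
    plus, on average, `abundance` spins per unlabelled carbon.
    """
    spins_per_molecule = 1.0 + n_unlabelled_carbons * abundance
    return n_molecules * spins_per_molecule


def codex_plateau(n_spins):
    """Long-mixing-time plateau when magnetization is shared equally."""
    return 1.0 / n_spins


if __name__ == "__main__":
    examples = [
        ("single phenylalanine molecule", 1, 8),
        ("10-molecule phenylalanine cluster", 10, 8),
    ]
    for name, n_mol, n_other in examples:
        m_eff = effective_cluster_size(n_mol, n_other)
        print(f"{name}: M = {n_mol}, M' = {m_eff:.2f}, "
              f"plateau = {codex_plateau(m_eff):.4f} "
              f"(uncorrected {codex_plateau(n_mol):.4f})")
```

For the 10-molecule example this gives M' = 10.8, so an analysis that ignores natural abundance would read the plateau as belonging to roughly 11 molecules.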
Here we implement a two-part strategy to accurately account for natural abundance. First, we introduce a premixing period to the original CODEX scheme in order to equilibrate the level of magnetization across all sites, which are initially polarized unequally due to site-specific cross polarization (CP) efficiency. [23] Second, in the data analysis step, we calculate the average total number of spins, including natural abundance spins at 1 percent. The scheme is applied to the analysis of phenylalanine crystals labelled at either 13C' or 13Cγ. The use of very long spin diffusion times of up to 512 seconds allows transfer over unprecedented distances between 13C spins.
By comparing CODEX curves of three different crystals of increasing complexity (Figure 1), we found that consideration of natural abundance spins becomes important for M = 8. In Figure 1A, CODEX curves are shown for glycine and for two polymorphs of phenylalanine. Experimental details and verification of the phenylalanine polymorph are found in the SI text and Figures S2-S4. The signal (S) and the reference signal (S0), for which the τm and τz values are swapped, can be seen in Figure 1B. The equilibration of initial polarization via long CP and τeq (Figure S5) ensures that the reference S0 signal accurately compensates for differential T1 relaxation or CP efficiency. The T1 determination is shown in Figure S6. For glycine (C2H5NO2), each molecule contains one other carbon site, which is 13C at natural abundance. Considering two crystal orientations, the average number of 13C spins in the cluster (M') is 2.02 (with 1 % natural abundance) instead of 2, and the expected equilibrium value is therefore 0.495 instead of 0.5. Experimentally we determined a value of 0.493 ± 0.015 by fitting to a stretched exponential function. The same comparisons are applied for the three crystals in Table 1.
Figure 1. A) CODEX decay curves for glycine (blue), phenylalanine hydrochloride (black) and phenylalanine (red) crystals. Error bars (2σ) were estimated with a Monte Carlo approach and the spectral noise. S is the CODEX signal, while S0 is the signal used to compensate for T1 relaxation during CODEX by exchanging τeq and τz. B) Example 1D carbon spectra after 256 s spin diffusion and the labelling scheme for phenylalanine and phenylalanine hydrochloride in A). C) A modified version of the CODEX pulse sequence and labelling scheme for phenylalanine and phenylalanine hydrochloride in A). A premixing time (τeq) of 16 s was used for phenylalanine, and 5 s for glycine. A long CP of 2.8 ms was used. Spectra were acquired on a 600 MHz spectrometer with 8 kHz MAS at 100 Kelvin.
For glycine and phenylalanine hydrochloride, the results are trivial. For phenylalanine, however, the equilibrium value is 0.109 ± 0.007, and the cluster size appears close to 9 if we do not consider natural abundance. With the correction, the expected and measured equilibrium values agree within experimental error. Equilibration over the large unit cell indicates relayed transfer over 1622 Å³, or a distance of at least 12 Å. The stretched exponential fit accurately determines the equilibrium value, and could even give a distribution of distance restraints. [24] However, this assumes direct transfer and is not valid for long range spin diffusion, where relayed spin diffusion needs to be considered. This can be done with a rate matrix approach.
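The comparison between the measured plateau and the values expected for candidate cluster sizes can be made explicit. The sketch below assumes that the plateau equals 1/M', that phenylalanine has 8 unlabelled carbons at 1 % abundance, and that the quoted ± value can be treated as a simple tolerance; the helper names are hypothetical.

```python
# Sketch: test candidate cluster sizes against a measured CODEX plateau.
# Helper names and the "within tolerance" criterion are assumptions.

def expected_plateau(n_molecules, n_unlabelled_carbons,
                     abundance=0.01, correct=True):
    """Expected plateau 1/M', with or without natural-abundance spins."""
    spins_per_molecule = 1.0
    if correct:
        spins_per_molecule += n_unlabelled_carbons * abundance
    return 1.0 / (n_molecules * spins_per_molecule)


def plateau_check(n_molecules, n_unlabelled_carbons, measured, tolerance,
                  abundance=0.01, correct=True):
    """Return (agrees, expected) for one candidate cluster size."""
    expected = expected_plateau(n_molecules, n_unlabelled_carbons,
                                abundance, correct)
    return abs(expected - measured) <= tolerance, expected


if __name__ == "__main__":
    measured, tolerance = 0.109, 0.007  # phenylalanine values from the text
    for n_mol, correct in [(8, False), (9, False), (8, True)]:
        agrees, expected = plateau_check(n_mol, 8, measured, tolerance,
                                         correct=correct)
        label = "corrected" if correct else "uncorrected"
        print(f"M = {n_mol} ({label}): expected {expected:.4f}, "
              f"agrees with measurement: {agrees}")
```

Under these assumptions, an uncorrected 8-molecule cluster (plateau 0.125) falls outside the measured range, while both an uncorrected 9-molecule cluster and a corrected 8-molecule cluster fall inside it, which is the ambiguity that the natural abundance correction resolves.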

Determination of F(0) with the Rate Matrix
Although the spin diffusion rate depends on both orientation and distance, in practice, variation due to orientation has a relatively small impact on extracted distances. Standard practice is to fit rates to known crystal structures, and to apply the same calibration of the parameter F(0) (see below) to the determination of unknown distances. F(0) is the overlap integral describing the probability that single-quantum transitions occur at the same frequency for two spins, and depends on the details of the CSA, the isotropic chemical shift, the dipolar coupling to protons, the magnetic field, and the spinning frequency. [13,25] Although F(0) could in principle be calculated, here and in previous implementations [13,14,16-19] it is treated as a calibration factor that is set based on a known crystal form. In the case of phenylalanine, the crystal forms are known, and the data provide a useful indication of variations in the spin diffusion rate in different crystal orientations, as has been reported for the α and γ glycine polymorphs. [26,27] Using equations 1-3, [13,15,28] the spin diffusion rate, represented in the form of F(0), can be calculated using the known atomic coordinates.
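For orientation, a form of these relations that is consistent with the symbols defined in the following paragraph and with the rate-matrix treatment in the cited CODEX literature [13,15,28] is sketched here. It is given as an assumption rather than a verbatim reproduction: in particular, the orientation-dependent factor of the dipolar coupling is omitted, and ω_ij is taken as an angular frequency.

```latex
\begin{align}
\frac{d\mathbf{M}(t)}{dt} &= -\mathbf{K}\,\mathbf{M}(t) \tag{1} \\
k_{ij} &= \frac{\pi}{2}\,\omega_{ij}^{2}\,F_{ij}(0) \tag{2} \\
\omega_{ij} &= \frac{\mu_{0}}{4\pi}\,\frac{\gamma^{2}\hbar}{r_{ij}^{3}} \tag{3}
\end{align}
```

In this convention the off-diagonal elements of K are -k_ij and each diagonal element is the sum of the k_ij in its row, so that the total magnetization is conserved.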
M stands for the magnetization, and K is the rate matrix with elements k_ij. F_ij(0) is the overlap integral describing the probability that single-quantum transitions occur at the same frequency for spins i and j. The rate matrix approach applied to phenylalanine hydrochloride is further described in Figure S7, and the least-squares fitting of F(0) using the crystal structure coordinates, performed with home-written MATLAB code (see SI), is shown in Figure 2.
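Relayed transfer under a rate matrix can be illustrated numerically. The following is a minimal sketch, assuming first-order exchange dM/dt = -K M with a symmetric K built from pairwise rates k_ij; the rates are hypothetical placeholder values rather than fitted ones, and the integration is written in plain Python, not the MATLAB code referred to above.

```python
# Sketch: magnetization exchange among n spins under dM/dt = -K M.
# Pairwise rates below are hypothetical, chosen only for illustration.

def build_rate_matrix(pairwise):
    """Build K from symmetric pairwise rates k_ij (s^-1).

    K_ij = -k_ij for i != j and K_ii = sum_j k_ij, so that the total
    magnetization is conserved.
    """
    n = len(pairwise)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                K[i][j] = -pairwise[i][j]
                K[i][i] += pairwise[i][j]
    return K


def propagate(K, m0, t_total, n_steps=2000):
    """Integrate dM/dt = -K M from 0 to t_total with fixed-step RK4."""
    n = len(m0)
    dt = t_total / n_steps

    def deriv(m):
        return [-sum(K[i][j] * m[j] for j in range(n)) for i in range(n)]

    m = list(m0)
    for _ in range(n_steps):
        k1 = deriv(m)
        k2 = deriv([m[i] + 0.5 * dt * k1[i] for i in range(n)])
        k3 = deriv([m[i] + 0.5 * dt * k2[i] for i in range(n)])
        k4 = deriv([m[i] + dt * k3[i] for i in range(n)])
        m = [m[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(n)]
    return m


if __name__ == "__main__":
    # Four spins in a chain: fast nearest-neighbour, slow long-range rates.
    pairwise = [[0.0 if i == j else (0.2 if abs(i - j) == 1 else 0.02)
                 for j in range(4)] for i in range(4)]
    K = build_rate_matrix(pairwise)
    for t in (1.0, 10.0, 200.0):
        m = propagate(K, [1.0, 0.0, 0.0, 0.0], t)
        # The CODEX signal corresponds to what remains on the starting spin.
        print(f"t = {t:6.1f} s: remaining on spin 0 = {m[0]:.4f}")
```

At long times the magnetization on the starting spin approaches 1/n, the plateau discussed above, while the intermediate decay reflects both direct and relayed transfer steps.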
For 13Cγ labelled phenylalanine, we compare F(0) for the two phenylalanine polymorphs in Figure 2. To account for the many longer distances present in the crystal, the coupling ω_ij² of Eqn. 2 is replaced by the sum over all spins j within 30 Å of i, where the effective coupling has converged (Figure S8). The closest distance between labelled carbons is 6.09 and 5.00 Å in the hydrochloride [29] and neutral forms [1] (CSD entries PHALNC01 and QQQAUJ05), and the dipolar coupling is 33.6 and 60 Hz, respectively. We find F(0) values of 11.4 and 6.5 μs.
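The quoted dipolar couplings follow from the internuclear distances. The sketch below assumes the distance-only 13C-13C coupling constant (μ0/4π)γ²ħ/(2π r³), with no orientation factor, and standard values of the physical constants; the function name is illustrative.

```python
# Sketch: 13C-13C dipolar coupling constant (in Hz) from a distance.
# Assumes the distance-only form with no orientation factor.
import math

MU0_OVER_4PI = 1.0e-7        # T m A^-1
HBAR = 1.054571817e-34       # J s
GAMMA_13C = 6.728284e7       # rad s^-1 T^-1


def dipolar_coupling_hz(r_angstrom, gamma=GAMMA_13C):
    """Homonuclear dipolar coupling constant in Hz for distance r (Å)."""
    r_m = r_angstrom * 1.0e-10
    return MU0_OVER_4PI * gamma ** 2 * HBAR / (2.0 * math.pi * r_m ** 3)


if __name__ == "__main__":
    for label, r in [("hydrochloride", 6.09), ("neutral form", 5.00)]:
        print(f"{label}: r = {r:.2f} Å, "
              f"coupling = {dipolar_coupling_hz(r):.1f} Hz")
```

With these constants, 6.09 Å corresponds to about 33.6 Hz and 5.00 Å to about 60.8 Hz, in line with the values quoted above.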
A different CODEX equilibrium value close to 0.2 is observed for carbonyl labelling (Figure 3), which clearly does not capture the full structure detected at the rings. This can be explained by the packing of phenylalanine molecules in the crystal structure, which results in all the carbonyls lying close to two planes, with a large distance between the planes. Four magnetically inequivalent phenylalanine molecules lie in each plane. The closest carbonyl-carbonyl distance from one plane to the other is 12.5 Å, substantially longer than for the aromatic labelling. Owing to the size and complexity of the phenylalanine unit cell, these data establish an upper limit to the maximum spin diffusion distance in 13C CODEX, even for particularly long diffusion times of several minutes, using a moderate MAS frequency on a 600 MHz spectrometer. Indeed, there is a 13C-13C separation of 9.4 and 12.5 Å for the ring and carbonyl labelling, respectively, across which spin diffusion must occur to allow full equilibration (Figure 3B). Incomplete equilibration for the carbonyl is also consistent with the range of F(0) values observed (Figure S9, Table S2). Spin pair CODEX curves are shown for the fitted carbonyl F(0) value of 1.2 μs in Figure S10.
We have shown that careful consideration of natural abundance spins is required to correctly model the large spin clusters that occur in the complex unit cell of phenylalanine. The data rule out the original description of phenylalanine crystals with 4 magnetically inequivalent molecules, and are consistent with recent single crystal data that indicate an expanded unit cell with 8 magnetically inequivalent molecules. With carbonyl labelling, we showed that 12.5 Å is an upper limit to carbon spin diffusion at 8 kHz MAS on a 600 MHz spectrometer. This example shows how CODEX data can be useful for the refinement of structures when ambiguity exists in diffraction data, and such data will also be useful for the analysis of molecular clusters in materials applications.