Non-uniform sampling for NOESY? A case study on spiramycin

To date, most nuclear magnetic resonance (NMR)-based 3-D structure determinations of both small molecules and of biopolymers utilize the nuclear Overhauser effect (NOE) via NOESY spectra. The acquisition of high-quality NOESY spectra is a prerequisite for quantitative analysis providing accurate interatomic distances. As the acquisition of NOE build-ups is time-consuming, acceleration of the process by the use of non-uniform sampling (NUS) may seem beneficial; however, the quantitativity of NOESY spectra acquired with NUS has not yet been validated. Herein, NOESY spectra with various extents of NUS have been recorded, artificial NUS spectra with two different sampling schemes created, and by using two different NUS reconstruction algorithms the influence of NUS on the data quality was evaluated. Using statistical analyses, NUS is demonstrated to influence the accuracy of quantitative NOE experiments. The NOE-based distances show an increased error as the sampling density decreases. Weak NOE signals are affected more severely by NUS than more intense ones. The application of NUS with NOESY comes at two major costs: the interatomic distances are determined with lower accuracy and long-range correlations are lost.


1 | INTRODUCTION
Being the most widely applied tool of modern nuclear magnetic resonance (NMR)-based structure determination, the importance of the nuclear Overhauser effect (NOE) cannot be overstated. Its unique capability to provide interatomic distance information makes it invaluable for 3-D structure determination, description of conformational ensembles and identification of relative configurations, independent of molecular size and flexibility. [1][2][3][4] NOEs are extensively utilized in virtually all fields of chemical and biological research, including natural product, synthetic organic and medicinal chemistry, drug development and structural biology.
Using either initial rate approximation or full relaxation matrix analysis approaches, [5,6] transient NOESY experiments allow the determination of interatomic distances with high accuracy. [1,2,7] However, because it is faster, calculating interatomic distances from a single NOESY spectrum is not unusual either. Such calculations presume that relaxation mechanisms other than dipolar interactions contribute negligibly at the applied mixing time. The average error of the NOE-determined interproton distances has been determined to be 3.0%-6.9%, with the accuracy being dependent on solvent viscosity. [1,2,7-9] More viscous solvents, such as DMSO-d6, lead to a larger error in the NOE-based distances. [1] In order to acquire the most intense NOE signals, the use of a relaxation delay (d1) of 3-5 times T1, which allows for near full recovery of the magnetization (99.33% at 5 times T1), is often recommended. The T1 relaxation times of the protons of biopolymers are usually below 0.5 s, whereas for molecules with a mass of a couple of hundred daltons they fall between 0.5 and 5.5 s. [10][11][12][13] Recording a series of 2-D NOESY spectra to map the build-up curves would take over 2 weeks for a small molecule with T1 times around 5 s, which is not realistic. Instead, using a relaxation delay of 1-2 times T1, which creates a steady state, allows quantitative NOE analysis despite not allowing full recovery of the magnetization. [14] The NOE is a time-averaged phenomenon. Accordingly, as most (small) molecules of real-life relevance are flexible, the experimentally observed NOEs are typically the population-weighted averages of the NOEs of their conformers and do not represent a single geometry. [15]
A number of algorithms, such as MEDUSA, [16] CPA, [17] PEPFLEX-II, [18] FINGAR, [19] DISCON, [20] NAMFIS, [21] GENFOLD [22] and StereoFitter, [23] have been developed to deconvolute the population-averaged NOE data into the interatomic distances of a set of conformations with their corresponding molar fractions, taking the distance dependency of the NOE intensity into consideration. Despite being readily available, these are not yet extensively used by non-experts. Whereas accurate interproton distances are commonly determined for small molecules using the initial rate approximation, for the calculation of biomolecular structures usually only upper and lower distance limits are determined for various atom pairs, based on a single NOESY spectrum. This is certainly less accurate, yet quicker and easier to automate. For biopolymers, typically 3-D NOESY spectra are obtained. [9] As these are time-consuming, distances are commonly derived from a single spectrum using the full relaxation matrix approach [6] instead of acquiring build-up curves, which would necessitate acquisition of several spectra with a series of mixing times. Independently of whether NOEs are collected for structure elucidation of small molecules, biopolymers or their complexes, the weak NOE signals corresponding to longer interatomic distances provide by far the most valuable piece of information.
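The statement that population-averaged NOEs do not represent a single geometry can be illustrated with a minimal sketch. It assumes fast exchange between conformers, identical correlation times, and simple r^-6 averaging; the function name and the two-conformer numbers are hypothetical and are not taken from this study or from any of the cited deconvolution programs.

```python
import numpy as np

def noe_effective_distance(distances, populations):
    """Effective distance reported by an r^-6 population-weighted NOE average."""
    r = np.asarray(distances, dtype=float)
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()  # normalise the molar fractions
    return float(np.sum(p * r ** -6) ** (-1.0 / 6.0))

# Hypothetical proton pair: 2.5 Å in 20% and 5.0 Å in 80% of the ensemble
r_eff = noe_effective_distance([2.5, 5.0], [0.2, 0.8])
r_mean = 0.2 * 2.5 + 0.8 * 5.0  # plain arithmetic mean, for comparison
print(f"r^-6 averaged distance: {r_eff:.2f} Å, arithmetic mean: {r_mean:.2f} Å")
```

Under these assumptions the apparent distance is dominated by the minor conformer with the short contact, which is why a single-structure interpretation of such data can be misleading.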
Multidimensional NMR experiments usually require considerable spectrometer time. Conventional sampling schemes use an incremental, stepwise collection of data points, and accordingly, data collection of the first dimension is quick in comparison to that of further dimensions. Non-uniform sampling (NUS) with varying increment steps was suggested in the late 1980s to speed up data collection in the indirect dimension(s) of an NMR experiment and has been increasingly applied over the past decade. [24] A variety of sampling schemes have been proposed, such as exponentially weighted, [24] uniformly random, [25] Poisson-gap distribution, [26] random-shuffle, [8] radial [27] and concentric [28] sampling. Prior to Fourier transformation (FT), data collected using NUS have to be reconstructed. Several reconstruction methods have been developed for this purpose, including Compressed Sensing (CS), [29] Multidimensional Decomposition (MDD), [30] Maximum Entropy (ME), [24] Iterative Soft Thresholding (IST), [31] Scrupulous CLEANing to Remove Unwanted Baseline Artefact (SCRUB), [32] and NESTerov's Algorithm (NESTA-NMR). [33] As spectra recorded with 50% or 25% NUS use only half or a quarter of the time that the same experiment would need using a uniform sampling scheme, the implementation of NUS schemes is becoming the standard way of data acquisition for the investigation of biopolymers. [6,9,34] NUS has so far scarcely been applied in small molecule NMR studies, [35][36][37] but could obviously be beneficial for shortening the time requirement of NOE build-ups, for example. Overall, NUS is currently seen as a powerful time-saving tool with little to no downsides.
The influence of NUS schemes on the quantitative performance of NMR experiments has been systematically studied. Among others, the random error introduced by NUS [38] into COSY and TOCSY spectra, [39,40] and into relaxation measurements [41,42] has been assessed; these studies reported the reconstruction of weak signals in particular to be cumbersome. Whereas the impact of this observation in the context of metabolomics, for instance, has been extensively discussed, so far the influence of NUS on NOE-based structure determination, and thus in particular on the quantitativity of NOE experiments, has scarcely been estimated.
A recent report provided a qualitative assessment, concluding that strong cross-peaks are better reproduced than weaker ones. [43] Another report [8] that assessed the impact of NUS on NOEs quantitatively analysed short-distance NOEs only, ≤2.60 Å, for which strong cross-peak intensities are expected. Such short-range NOEs are typically less valuable for 3-D structure elucidation than the long-range correlations that are sought for the description of the orientation of different protein strands, protein-ligand and protein-protein interactions. [6] Long-range NOEs also provide the key information for the description of conformational ensembles of natural products, peptides and macrocycles, for example. [44][45][46][47][48] Thus, the influence of NUS on the intensity of such long-range correlations is the truly relevant question for real-life applications. This report used strychnine as model compound, which is popular for NMR method development due to its peculiar rigidity and small size. However, typical drug candidates, and even biopolymers in their most ordered form, show a much higher degree of flexibility.
Herein, we assess the applicability of NUS for 2-D NOESY experiments, taking both relaxation delay and the number of sampled data points in the indirect dimension into consideration. Out of the plethora of available options, the random-shuffle [49] and the Poisson-gap [26] sampling schemes along with the Modified Iterative Soft Thresholding (MIST) [50] reconstruction method are applied as representative examples. In addition to MIST, the MDD [30] reconstruction method, originally developed for spectra of higher dimensionality, has also been explored as it is one of the newer reconstruction methods and is available in TopSpin, ACD Labs and NMRPipe. MIST is implemented in MestReNova. Both are thus readily available and widely applied. Importantly, the aim of this work is neither to find the best sampling scheme and reconstruction algorithm nor to evaluate all available options and their combinations but rather to demonstrate a challenge of broad impact on representative examples, thereby facilitating future studies and developments. The best sampling scheme and reconstruction algorithm for the acquisition of a given spectrum cannot currently be predicted and are typically selected by trial and error. Therefore, we chose to evaluate two of the most commonly used algorithms for both sampling and reconstruction, without presuming these to be better than any other possible options. Experimental parameters were chosen to match those typically used in small molecule NMR studies. An identical experimental setup was also used in an independent study that described the solution ensembles of spiramycin (Figure 1), [3] a 16-membered macrolide originally isolated from Streptomyces ambofaciens in 1954, and marketed as an antibiotic and antiparasitic agent in the year 2000. [51,52] It is a typical drug of our time and thus an optimal model system for evaluating a method for real-life applicability.

2 | EXPERIMENT

2.1 | Data acquisition
NMR spectra were obtained on a 4-mM sample of spiramycin dissolved in DMSO-d6 at 25 °C on a 600-MHz BRUKER Avance NEO spectrometer equipped with a 5-mm TCI cryogenic probe. The assignment of spiramycin was based on 1H and 13C NMR, COSY, TOCSY, HSQC, HMBC and NOESY spectra.
T1 relaxation times were determined using the inversion-recovery pulse sequence, [53,54] acquiring a series of 11 1H NMR spectra with relaxation delays varying from 0.0 to 20.0 s. The t1ir standard Bruker pulse programme was used with four transients, 3.6045 s acquisition time and 32,768 data points per spectrum. T2 relaxation times were determined using the CPMG pulse sequence, [55,56] acquiring a series of 14 spectra with loop counts varying between 2 and 1800, with four transients, 3.6045 s acquisition time and 32,768 points per spectrum.
Four series of NOESY spectra with different sampling densities (100%-25% NUS, vide infra) with seven mixing times each (100-700 ms) were recorded. Interproton distances were determined using the initial rate approximation, by analysis of build-up curves. A series of spectra allowing for full recovery of the magnetization, by using a relaxation delay corresponding to 5 times T1 of the slowest relaxing spin, has also been recorded, as well as spectra using an increased number of data points sampled in the indirect dimension.
FIGURE 1 (a) Spiramycin has a 16-membered macrocyclic core, is flexible and allows detection of long-range nuclear Overhauser effects (NOEs), hence providing an optimal model system for the evaluation of the influence of non-uniform sampling (NUS) on the quantitativity of NOESY. (b) The solution ensemble of spiramycin previously determined based on NOESY build-ups [3]

2.2 | NUS sampling scheme
Non-uniformly sampled spectra were acquired with the random-shuffle sampling scheme [8] without repetition until 70% NUS, as implemented in TopSpin 4.0.6. Due to the comparably long T1 and T2 relaxation times of the protons, no additional weighting was used for the sampling. Artificial NUS data were generated from the 500-ms uniformly sampled spectrum (Build-up_NoNUS series, Table 1), using either the Poisson-gap sampling scheme, applying the schedule generator algorithm (v. 3.0) [57] of Wagner and co-workers, [50] or the random-shuffle sampling scheme as implemented in TopSpin. [49] In total, 20 copies for each scheme, with both 25% and 50% NUS, were created to allow for analysis of the error that comes from the NUS schemes. The random-shuffle [49] and the Poisson-gap [26] sampling schemes were selected as these are among the most commonly applied schemes in the literature and, being implemented in TopSpin (Bruker), are easily available to users.
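The generation of artificial NUS copies from a uniformly sampled dataset can be sketched as follows. This is only an illustration under stated assumptions: the schedule is a plain uniform random selection of increments without repetition, which is neither the TopSpin random-shuffle implementation nor the Poisson-gap generator; the function name `random_schedule`, the seed and the placeholder data matrix are hypothetical.

```python
import numpy as np

def random_schedule(n_grid, density, rng):
    """Pick round(density * n_grid) distinct increments from a uniform grid.

    Assumption: uniform random choice without repetition, with the first
    increment always kept. This is not the TopSpin implementation.
    """
    n_keep = int(round(density * n_grid))
    rest = rng.choice(np.arange(1, n_grid), size=n_keep - 1, replace=False)
    return np.sort(np.concatenate(([0], rest)))

rng = np.random.default_rng(seed=0)

# 20 independent schedules each for 50% and 25% NUS on a 512-increment grid
schedules = {d: [random_schedule(512, d, rng) for _ in range(20)]
             for d in (0.50, 0.25)}

# Artificial NUS: zero the unsampled rows of a uniformly sampled data matrix
uniform = rng.standard_normal((512, 8))  # placeholder data, not a real FID
mask = np.zeros(512, dtype=bool)
mask[schedules[0.25][0]] = True
artificial = np.where(mask[:, None], uniform, 0.0)
```

Creating several independent schedules per sampling density, as done in the text with 20 copies, makes it possible to separate the scatter caused by the schedule itself from systematic effects of the sampling density.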

2.3 | Datasets
The NOESY datasets recorded for the detailed analysis of the influence of NUS are referred to as 'Build-up' (standard parameters), '5 T1' (long d1 delay) and '512-points' (extended number of increments in f1) spectra. The spectra recorded with NUS were acquired using the random-shuffle sampling scheme. An overview of the parameters used for the various experimental datasets is given in Table 1, whereas a detailed description is provided below.
The 'Build-up' dataset consists of 28 spectra, divided into four series of seven NOESY spectra. These spectra were recorded with 16 transients, 2.5-s relaxation delay, 0.3113-s acquisition time, 4096 points in the direct dimension, and seven mixing times varying from 100 to 700 ms in 100-ms increments. The datasets titled Build-up_75%, Build-up_50% and Build-up_25% were recorded with NUS, and their indirect dimension was reconstructed using the MIST algorithm, as implemented in MestReNova 14.1.1, and with MDD (multidimensional decomposition), implemented as an extension to NMRPipe. [30,31] The MIST reconstruction was performed using a maximum of 100 iterations and a threshold level of 0.750. MDD reconstruction was accomplished with a maximum of 50 iterations, 25 components per subregion and a scaling factor of 0.7 for the residuals as they are added to the reconstructed spectrum. The number of sampled points in the indirect dimension was 512 for the series Build-up_NoNUS, 384 for the series Build-up_75%, 256 for Build-up_50% and 128 for Build-up_25%.
The '5 T1' dataset consists of three NOESY spectra recorded with 16 transients, 10.5-s relaxation delay, 0.3113-s acquisition time, 4096 points in the direct dimension and 500-ms mixing time. Within this, the 5T1_NoNUS spectrum was acquired with 512 sampled points in the indirect dimension, the 5T1_75% with 384 and the 5T1_50% with 256. The spectra 5T1_75% and 5T1_50% were recorded with NUS and reconstructed as described above.
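The difference between the 2.5-s and the 10.5-s relaxation delays can be put into numbers with a short sketch. It assumes simple mono-exponential recovery from zero magnetization and deliberately ignores the acquisition and mixing periods, which also contribute to recovery, as well as steady-state effects; the T1 value is the longest one reported for spiramycin in this work (2.09 s), and the function name is hypothetical.

```python
import math

def recovered_fraction(delay_s, t1_s):
    """Fraction of equilibrium magnetization regained after a delay,
    assuming mono-exponential recovery starting from zero."""
    return 1.0 - math.exp(-delay_s / t1_s)

T1_SLOWEST = 2.09  # s, longest proton T1 reported for spiramycin in this study

for d1 in (2.5, 10.5):
    frac = recovered_fraction(d1, T1_SLOWEST)
    print(f"d1 = {d1:4.1f} s ({d1 / T1_SLOWEST:.1f} x T1): {100 * frac:.1f}% recovered")
```

Within this simplified picture, the 10.5-s delay corresponds to about 5 times T1 of the slowest relaxing spin, whereas the 2.5-s delay of the 'Build-up' series corresponds to roughly 1.2 times T1.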
The '512-points' dataset consists of two NOESY spectra recorded with 16 transients, 2.5-s relaxation delay, 0.3113-s acquisition time, 4096 points in the direct dimension, 512 sampled points in the indirect dimension and 500-ms mixing time. The spectra 512-points_50% and 512-points_25% were recorded with 50% and 25% NUS, respectively, and were reconstructed as described above. These spectra thus contain twice (1024, 2n) or four times (2048, 4n) the number of data points in the indirect dimension as compared to the uniformly sampled spectrum (512, n), and their overall acquisition time is equal to that of the uniformly sampled one.

2.4 | Data reconstruction and processing
NUS recorded spectra were reconstructed in parallel by the MIST and MDD algorithms. Artificial NUS spectra were reconstructed using the MIST algorithm only. All spectra were processed in MestReNova (version 14.1.1) using forward linear prediction in the indirect dimension to 2048 points. The phase error was corrected manually for each dataset, and baseline correction was performed using the Whittaker smoother algorithm. Identical apodization was applied for all spectra: first point in f1 at 0.50 and an exponential weighting function of 5 Hz in both the f1 and the f2 dimension. The absolute intensities of the cross-peaks and diagonal peaks of interest were obtained by integration. To keep the integration area consistent, the 700-ms spectrum of the Build-up_NoNUS dataset was integrated manually and an integration template was created and was applied subsequently to all NOESY spectra to obtain the absolute intensities for each.
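To make the reconstruction step more tangible, a basic iterative soft thresholding (IST) loop is sketched below for a one-dimensional synthetic signal. This is not the MIST implementation of MestReNova, whose details are not reproduced here; only the threshold level (0.75) and the maximum number of iterations (100) are borrowed from the parameters stated above. The function name `ist_reconstruct`, the test signal and the random 50% schedule are hypothetical.

```python
import numpy as np

def ist_reconstruct(fid_sampled, mask, threshold=0.75, max_iter=100):
    """Basic IST for a 1-D time-domain signal with zeros at unsampled points.

    mask is True where a point was measured. Returns the reconstructed spectrum.
    """
    residual = np.where(mask, fid_sampled, 0.0).astype(complex)
    spectrum = np.zeros(residual.shape, dtype=complex)
    for _ in range(max_iter):
        res_spec = np.fft.fft(residual)
        magnitude = np.abs(res_spec)
        cutoff = threshold * magnitude.max()
        # Soft threshold: keep only the part of each peak above the cutoff
        excess = np.maximum(magnitude - cutoff, 0.0)
        phase = res_spec / np.where(magnitude > 0.0, magnitude, 1.0)
        kept = excess * phase
        spectrum += kept
        # Subtract the kept part from the measured points only
        residual = np.where(mask, residual - np.fft.ifft(kept), 0.0)
    return spectrum

# Synthetic signal (hypothetical): two non-decaying resonances on a 256-point grid
n = 256
t = np.arange(n)
fid = np.exp(2j * np.pi * 20 * t / n) + 0.3 * np.exp(2j * np.pi * 90 * t / n)

rng = np.random.default_rng(seed=1)
mask = np.zeros(n, dtype=bool)
mask[0] = True
mask[rng.choice(np.arange(1, n), size=n // 2 - 1, replace=False)] = True  # 50% NUS

recon = ist_reconstruct(np.where(mask, fid, 0.0), mask)
peaks = np.argsort(np.abs(recon))[-2:]  # indices of the two largest peaks
```

Because only the part of the spectrum above the threshold is carried over in each iteration, weak signals enter the reconstruction late, which is one plausible reason why weak cross-peaks are more sensitive to the sampling density than intense ones.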

2.5 | NOE build-up analyses
The analysis of the 'Build-up' dataset included seven separate build-up curve evaluations: one of the uniformly sampled data and two of each NUS dataset, using the two different reconstruction algorithms (Table S3). Interproton distances were determined by analyses of the build-up curves using the initial rate approximation, with the geminal methylene protons at Position 7 (Figure 1) as an internal distance reference (1.78 Å). [1] For one of the seven datasets, these methylene protons gave a nonlinear build-up, and therefore, the geminal methylene protons at Position 18 (Figure 1) were used for this specific analysis instead. Further, geminal methylene protons within the molecule offered a quality control for the calculated distances and were observed to provide interproton distances fitting the expected 1.78 Å. Cross-peak intensities were normalized for protons H_a and H_b according to the equation normalized intensity = (cross-peak_ab × cross-peak_ba)/(diagonal-peak_a × diagonal-peak_b), and plotted against the corresponding mixing times to yield build-up curves. Only interproton NOEs with linear build-ups with R² ≥ 0.95 in the uniformly sampled dataset 'Build-up_NoNUS' were included in the detailed analyses, whereas those showing nonlinearity were omitted. Interproton distances were calculated from the slopes of the build-up curves according to $r_{ab} = r_{ref}\,(\sigma_{ref}/\sigma_{a,b})^{1/6}$, where r_ab is the distance between protons H_a and H_b in Ångström, r_ref is 1.78 Å (methylene), and σ_ref and σ_a,b are the slopes of the build-up curve for the reference and the H_a-H_b proton pair, respectively. Distances obtained from the uniformly sampled dataset were used as reference in the error estimation of the distances derived from the other datasets with non-uniformly sampled acquisition. Besides the error on the calculated distances, the error on the normalized intensities was also investigated. For this, a reference build-up curve was created by taking the slope of the linear fit for the uniformly sampled data and shifting its position so that it intersects with the origin (0,0) of the graph. Points corresponding to 100- to 700-ms mixing times were taken from this reference curve and were used for the statistical analyses of normalized intensities of cross-peaks in single NOESY spectra.
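The build-up analysis described in this section can be sketched in a few lines. The sketch assumes the usual initial-rate relation, in which the distance scales with the sixth root of the ratio of the reference slope to the slope of the proton pair, and uses an ordinary least-squares line fit; the function names and the synthetic intensities are hypothetical and do not come from the measured data.

```python
import numpy as np

R_REF = 1.78  # Å, geminal methylene reference distance used in the text

def fit_buildup(mixing_times, intensities):
    """Least-squares line through a build-up curve; returns (slope, R^2)."""
    t = np.asarray(mixing_times, dtype=float)
    y = np.asarray(intensities, dtype=float)
    slope, _intercept = np.polyfit(t, y, 1)
    r = np.corrcoef(t, y)[0, 1]  # Pearson correlation coefficient
    return slope, r ** 2

def distance_from_slope(sigma_ab, sigma_ref, r_ref=R_REF):
    """r_ab = r_ref * (sigma_ref / sigma_ab)^(1/6)."""
    return r_ref * (sigma_ref / sigma_ab) ** (1.0 / 6.0)

mixing = np.linspace(0.1, 0.7, 7)  # 100-700 ms, in seconds

# Synthetic, noise-free normalized intensities (hypothetical values)
ref_curve = 0.064 * mixing   # reference methylene pair
pair_curve = 0.001 * mixing  # a weaker, longer-range correlation

sigma_ref, r2_ref = fit_buildup(mixing, ref_curve)
sigma_ab, r2_ab = fit_buildup(mixing, pair_curve)
r_ab = distance_from_slope(sigma_ab, sigma_ref)
print(f"R^2 = {r2_ab:.3f}, distance = {r_ab:.2f} Å")
```

In this example the slope ratio of 64 corresponds to a doubling of the distance relative to the reference. A build-up would be accepted as linear only if its R² reaches the 0.95 cutoff stated above.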

2.6 | Single NOESY spectrum analyses
The NOE correlations that yielded a linear NOE build-up curve (Section 2.5) were reanalysed using the data obtained from the spectrum acquired with 500-ms mixing time only. In total, nine such spectra were obtained, of which seven were recorded with NUS. The NUS spectra were reconstructed by the two reconstruction algorithms, as described above, giving rise to a total of 16 NOESY spectra (Table S3). Interproton distances were determined from the normalized intensities of the correlations of interest within a single spectrum. Peak normalization and internal referencing were done as described above, extracting the reference intensity corresponding to 500 ms from the reference curves. Interproton distance determination for the artificial MIST-reconstructed NUS spectra was performed in the same fashion as for the single NOESY spectra.
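The single-spectrum analysis can be sketched analogously, replacing the slopes by the normalized intensities at one mixing time. The sketch assumes that the same sixth-root relation is applied to the intensity ratio; the function names and the numerical intensities are hypothetical.

```python
def distance_from_intensity(i_ab, i_ref, r_ref=1.78):
    """Distance from normalized intensities in a single NOESY spectrum,
    assuming r_ab = r_ref * (I_ref / I_ab)^(1/6)."""
    return r_ref * (i_ref / i_ab) ** (1.0 / 6.0)

def percent_error(value, reference):
    """Absolute error of a value relative to its reference, in percent."""
    return 100.0 * abs(value - reference) / reference

# Hypothetical normalized intensities at 500-ms mixing time
i_ref = 0.0320   # reference methylene pair
i_true = 0.0005  # intensity taken from the reference build-up curve
i_nus = 1.2 * i_true  # same cross-peak with a 20% intensity error

r_true = distance_from_intensity(i_true, i_ref)
r_nus = distance_from_intensity(i_nus, i_ref)
print(f"reference: {r_true:.2f} Å, perturbed: {r_nus:.2f} Å, "
      f"error: {percent_error(r_nus, r_true):.1f}%")
```

The sixth-root dependence dampens intensity errors considerably, so a distance error of several percent already implies a large error in the underlying cross-peak intensity.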

2.7 | Statistical analyses
The quality of the datasets may be assessed by quantifying their accuracy in terms of the slope of the build-up, the variance of the points from linearity and the offset. Evidently, among the acquired datasets, the uniformly sampled one, containing the largest number of observed data points, is of highest quality, whereas the NUS data, which contain a lower number of experimental data points, may at best be equally accurate. Possible deviations of the NOE build-up rates from that of the gold standard are of particular interest as these would be detrimental for the purpose of acquiring quantitative NOEs. Slopes of the data obtained by NUS compared to those obtained with uniform sampling (Supporting Information) selectively quantify this vastly important parameter.
We selectively assessed the variance of data points within a build-up using the square of the Pearson product-moment correlation coefficient (PPMCC), commonly termed the R² value (Equation S1). It quantifies deviation from linearity, that is, the 'random noise' on the data. A high-quality dataset is characterized by an R² close to 1.0. The less accurate a dataset is, the lower its R². The use of the R² value is advantageous as it is independent of the alteration of the slope or offset and can be calculated for each individual NOE build-up without having to rely on previously obtained data or on any further hypothesis than the linearity of the initial NOE build-up rate. Moreover, the R² values of the build-ups obtained from various US- and NUS-sampled data are easy to compare (Tables S4-S10). [5,58] For an overall statistical assessment of the quality of the datasets (including slope, variance and offset), we used Lin's concordance correlation coefficient, r_C (Equation S2). [59] This statistical descriptor has been developed for comparison of measurement methods to a gold standard method. It quantifies how well a set of bivariate data compares to a gold standard dataset; here, the reference build-up, created for each proton pair, crosses the origin (0,0), has R² = 1.00 and possesses the slope of the uniformly sampled dataset. According to the classification of McBride, [60] r_C values ≥ 0.99 indicate excellent concordance, 0.95-0.99 substantial, 0.90-0.95 moderate, and <0.90 poor concordance, whereas values near 0 indicate no concordance and values near −1 discordance.
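The concordance analysis can be illustrated with a minimal sketch of Lin's coefficient in its standard form, r_C = 2 s_xy / (s_x² + s_y² + (mean_x − mean_y)²), computed with population (1/n) moments. Whether Equation S2 uses exactly this form is an assumption, as the Supporting Information is not reproduced here; the function names and the three-point example are hypothetical, and the class limits follow the classification quoted in the text.

```python
import numpy as np

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient with 1/n moments."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    s_xy = np.mean((x - mx) * (y - my))
    return float(2.0 * s_xy / (x.var() + y.var() + (mx - my) ** 2))

def classify_concordance(r_c):
    """Concordance classes as quoted in the text for r_C."""
    if r_c >= 0.99:
        return "excellent"
    if r_c >= 0.95:
        return "substantial"
    if r_c >= 0.90:
        return "moderate"
    return "poor"

# Hypothetical example: an observed curve offset from its reference
reference = np.array([-1.0, 0.0, 1.0])
observed = reference + 1.0  # same slope and scatter, constant offset
r_c = lin_ccc(reference, observed)
print(f"r_C = {r_c:.3f} ({classify_concordance(r_c)})")
```

The example shows why r_C complements R²: the offset curve is perfectly linear, so its R² is 1, yet its concordance with the reference is poor because the offset enters the denominator.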

| Distance determination from NOE build-up curves
NMR spectra were acquired for spiramycin in DMSO-d6 solution, as this solvent, due to its viscosity, is known to be among the more challenging ones for the determination of NOE-based distances. [1] The assignment of spiramycin is given in Table S1. Its protons showed T1 (0.55-2.09 s) and T2 (between 0.047 and 0.758 s) times typical for small molecules (Table S2). NOESY spectra were acquired with seven mixing times (100-700 ms), and build-ups with R² ≥ 0.95 were considered to be linear. [5,58] The analysis of the uniformly sampled NOESY spectra gave rise to 62 interproton correlations, for which distances (1.71-4.85 Å) could be determined with high accuracy [1] using the initial rate approximation, following commonly accepted routines of small molecule NMR studies. [3,44,45] The same correlations were analysed for the datasets collected with 25%, 50% or 75% NUS, acquired with the random-shuffle sampling scheme. The results for analyses of the MIST-reconstructed data are presented and discussed in the following sections. In addition, the same data were also reconstructed using MDD and analysed. Similar overall trends were found for the MIST- and the MDD-reconstructed data, and therefore, the latter are given in the Supporting Information.
When comparing the uniformly and the non-uniformly sampled datasets, a significant decrease in the number of linear NOE build-ups upon utilization of NUS is observed (Table 2). The build-up curves for the NOE correlations of five selected proton pairs are shown in Figure 2. The trends of significance are that (i) NOE build-ups originating from more sparsely sampled spectra are less linear than those from more densely sampled data and (ii) build-ups derived from data points belonging to long-range correlations show less linearity than those belonging to shorter distance correlations. Build-up curves for all 62 correlations obtained from the seven different datasets are shown in Figures S8-S69. The linearity of the NOE build-up curves obviously depends on the accuracy of the individual data points, that is, the intensity of the individual cross-peaks in the analysed spectra. Therefore, we calculated the average absolute error on the normalized intensity for the 62 NOE correlations at each mixing time for the uniformly sampled data as well as for the reconstructed NUS datasets, with the result of the analysis being summarized for the three MIST-reconstructed datasets in Figure 3. The intensities for each data point, at each mixing time, are compared to the reference intensities extracted from the reference build-up curves. The complete analysis, that is, the average absolute errors found on the normalized intensity for five distance ranges for the MIST-reconstructed and those for the [...] (Figure 3) that NOESY spectra recorded with longer mixing times show lower average absolute error, most plausibly because of the better signal-to-noise ratio. A larger extent of NUS results in larger error, independent of mixing time. Whereas the errors of the data collected with 75% and 50% NUS are comparable, that of the 25% NUS dataset is disproportionately larger.
The error of the cross-peak intensities as a function of the interproton distance is shown in Figure 4. The better an individual data point matches the linear fit of its build-up curve, the higher the accuracy of the corresponding distance. In our hands, the R² value decreases as the interproton distance increases (Figure 5). The observation that the average absolute error for the normalized intensity is lowest for the shorter distance correlations is likely explained by these cross-peaks being the most intense and thereby least affected by an increased noise level. The data quality, as expressed by the R² value, declines as the spectra are more sparsely sampled (75-50-25% NUS). In Figure 4, the MIST-reconstructed data are shown; however, similar trends were observed for the MDD-reconstructed NUS data as well (Figure S4). It is important to note that the uniformly sampled data are of high quality at each of the seven mixing times, whereas the data collected with NUS are of acceptable quality only at interproton distances <3 Å with 75% and <2.5 Å with 50% NUS. This observation is in line with a previous report [8] that found NUS applicable for quantitative NOEs for interatomic distances ≤2.60 Å. The long-range NOE correlations that are most important for structure determination are not well represented in any of the datasets acquired with NUS.
FIGURE 3 The average absolute errors on the normalized intensity for the 62 nuclear Overhauser effect (NOE) correlations at seven mixing times. In black, the error on the uniformly sampled, in red the error of the 75% non-uniform sampling (NUS), in green the error of the 50% NUS and in blue the error of the 25% NUS data is given. The errors shown in this graph concern Modified Iterative Soft Thresholding (MIST)-reconstructed NUS data only, whereas those of the MDD-reconstructed data are shown in Figure S3. The average absolute error, given on the y axis, was estimated by calculating the error for each of the 62 correlations as compared to the reference intensities, obtained from the reference build-up curves. The sum of the absolute value of these errors, divided by the number of correlations, is the average absolute error shown in this graph
FIGURE 4 The average R² values of the individual data points as compared to the reference nuclear Overhauser effect (NOE) build-up for the uniformly sampled data are shown in black, whereas those for the Modified Iterative Soft Thresholding (MIST)-reconstructed data with 75% non-uniform sampling (NUS) in red, with 50% NUS in green and with 25% NUS in blue. The black line is placed at R² = 0.95, the cutoff for considering build-ups linear
The above analysis estimates data quality by assessing only the variance of individual data points (R²), while neglecting the error in the slope and in the offset of the build-ups. An overall assessment using Lin's concordance correlation coefficient, r_C, [59] indicates that the uniformly sampled data show reasonable concordance to the reference curve (Table 3). The NUS datasets show poor concordance, independent of the reconstruction algorithm and distance range. Hence, the comprehensive assessment of the errors of the experimentally obtained NOE build-ups suggests that only uniform sampling results in a dataset of sufficient quality (Figure 5). As expected based on the r⁻⁶ dependence of the NOE phenomenon, the data points corresponding to shorter distances show overall better concordance. This trend is clear despite the averaged concordance values being influenced by the number of NOE correlations available per distance range, which makes the dependence of r_C on distance appear non-monotonic.
To validate the accuracy of interproton distances, we investigated the NOEs belonging to the geminal methylene proton pairs at Positions 7 and 18 (Figure 1), which have a well-defined and short interproton distance (1.78 Å), and whose error is accordingly expected to be well below 6.9%. [1,2] The error of the calculated distance for these correlations, with or without NUS, is in line with this expectation (see Table S11 for details). The error for such short distances is expected to be below 4%, which is the case for the uniformly sampled and, probably by chance, for the 25% MIST-reconstructed NUS dataset (Table S11).
Next, we determined the errors on the interproton distances obtained from the NUS datasets as compared to the reference distances. As expected, the trends on the individual distances follow the trends found for the accuracy of the linear fits, namely, (i) the longer the interproton distance and (ii) the sparser the data sampling, the larger the error. The interproton distances are both overestimated and underestimated in all NUS datasets. Our analysis suggests that the lower the sampling density, the less frequently distances are underestimated. The spread in the error of the MIST-reconstructed spectra for interproton distances ranging between 3.50 and 3.99 Å is shown in Figure 6. The error increases as the extent of NUS increases, and the error range increases as the distance increases (for further details, see Table S12 and Figures S5 and S6).
FIGURE 5 The average Lin's concordance correlation coefficient (r_C) for the nuclear Overhauser effect (NOE) build-up curves as a function of distance range. The uniformly sampled data are shown in black, whereas those for the Modified Iterative Soft Thresholding (MIST)-reconstructed data with 75% non-uniform sampling (NUS) in red, with 50% NUS in green and with 25% NUS in blue. The black line is placed at r_C = 0.90, below which concordance to the reference curve is considered to be poor. The r_C value plotted as a function of distance ranges for the Multidimensional Decomposition (MDD)-reconstructed datasets is shown in Figure S5
TABLE 3 Lin's concordance correlation coefficient, r_C, given for seven build-up datasets acquired with the random-shuffle sampling scheme (Tables 1 and S4-...) [60] MDD was originally designed for reconstruction of spectra of higher dimensionality and therefore, as our data confirm, should not be used for ...

| Distance determination from a single NOESY spectrum
Whereas determination of interatomic distances is most reliable when a full build-up curve analysis is performed, in many situations this is not a realistic option due to the limitations of spectrometer time. Interproton distances are then estimated based on the normalized intensities of NOE cross-peaks within a single spectrum. To evaluate the influence of NUS on this type of distance estimation, we analysed the error of the normalized intensities for the 62 NOE correlations discussed above in the spectra acquired with 500-ms mixing time, from each of the seven build-up datasets (Tables S13-S15). Back-calculated intensities from the reference build-up curve, generated as described above, were used as reference intensities. As the NOE cross-peaks of spiramycin, when acquiring the spectra in DMSO-d6 solution at 600 MHz, are all negative, the NOE correlations that gave positive cross-peaks in these spectra are obviously erroneous and would result in unreliable distances. The number of valid correlations as a function of NUS, reconstruction algorithm and interproton distance range is given in Table 4. The chief observation here is that more valid correlations were found for the data originating from more densely sampled spectra. Analysing the absolute errors of the normalized cross-peak intensities in the NOESY spectra acquired with 500-ms mixing time, we found smaller average absolute errors for the shorter distances and for more densely sampled spectra (Figure 7). We observed similar trends for the average absolute error on the normalized intensity for the MIST- and MDD-reconstructed 75% and 50% NUS data (see Table S71 for details).
The average absolute error of the interproton distances derived from the normalized intensities of the seven 500-ms mixing time datasets is shown in Figure 8. Distances determined from the uniformly sampled data stay well below 6.9% average error, which has been reported to be typical for measurements performed on DMSO-d 6 solutions by Butts et al. [1] As expected, the error increases as the interatomic distance increases. Distances determined from a single 500-ms mixing time spectrum show a larger error than those determined from a build-up curve. We observe that the error on the distance systematically increases as the sampling density decreases. The difference in average absolute error between the distances derived from NOE build-ups and from a single NOESY spectrum is largest for more intense NOEs, that is, shorter distances. For low-intensity NOEs, the average absolute errors of the two approaches are more comparable and are generally large. There is no situation where distance determination based on the normalized intensity of a single NOESY spectrum would give consistently more accurate results than that based on the analysis of the build-up curves.

FIGURE 6 The error on the interproton distances 3.50-3.99 Å determined from Modified Iterative Soft Thresholding (MIST)-reconstructed non-uniform sampling (NUS) datasets. The errors found for distances originating from spectra collected with 75% NUS are shown in red, those from spectra collected with 50% NUS in green and those from spectra collected with 25% NUS in blue.

Note: The number of correlations found with correct phases of the cross and diagonal peaks for the MIST-reconstructed spectra are given. The analysis of the MDD-reconstructed data is given in Tables S6, S8

| The sampling scheme
To investigate whether the choice of the sampling scheme has an impact on the outcome, the 500-ms uniformly sampled NOESY spectrum was artificially reconstructed to 50% NUS data using the Poisson-gap sampling scheme, and the resulting NOE peak intensities were compared to those obtained by the random-shuffle sampling scheme (50% NUS) as well as by uniform sampling. Twenty different copies of the artificial NUS spectra were created for each sampling scheme, and the averaged outcomes of these copies following MIST reconstruction were analysed (Table 5).

FIGURE 7 The average absolute error on the normalized intensities of each dataset as a function of interproton distance. Uniformly sampled data in black, 75% non-uniform sampling (NUS) data in red, 50% NUS data in green and 25% NUS data in blue. All NUS data concern Modified Iterative Soft Thresholding (MIST)-reconstructed data.

FIGURE 8 The average absolute error for distances determined from the slope of build-up curves (BU) or from the peak intensities of a single NOESY spectrum (1S). For all distance ranges, the average absolute error on the distances of each dataset is plotted: uniformly sampled data in black, 75% non-uniform sampling (NUS) data in red, 50% NUS data in green and 25% NUS data in blue. Only Modified Iterative Soft Thresholding (MIST)-reconstructed NUS data are given in this figure. The black line is placed at 6.9%, the average error of uniformly sampled NOESY data according to Butts et al. [1]
To validate the artificial reconstruction, first, the artificially reconstructed 50% NUS spectra obtained using the random-shuffle sampling scheme were compared to those obtained experimentally using the same sampling scheme and MIST reconstruction. Out of the 62 distances, only 1 was underestimated, 34 had an error larger than 6.9% and 2 gave an incorrect phase. This is, as expected, comparable to the number of valid distances derived from the experimentally obtained data, using the same sampling scheme (Table 5).
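The three counts reported for each dataset can be tallied as in the sketch below. It assumes that a distance marked `None` stands for a cross-peak with incorrect phase, that "underestimated" means smaller than the reference distance, and that the categories "underestimated" and "error larger than 6.9%" are not mutually exclusive; these are assumptions of the sketch, and the example distances are hypothetical.

```python
def tally_distance_errors(distances, reference_distances, threshold_percent=6.9):
    """Count underestimated, large-error and incorrect-phase distances.

    distances: estimated distances in Å, or None for an incorrect phase.
    reference_distances: matching reference distances in Å.
    """
    counts = {"underestimated": 0, "error_above_threshold": 0, "incorrect_phase": 0}
    for d, d_ref in zip(distances, reference_distances):
        if d is None:
            counts["incorrect_phase"] += 1
            continue
        if d < d_ref:
            counts["underestimated"] += 1
        if abs(d - d_ref) / d_ref * 100.0 > threshold_percent:
            counts["error_above_threshold"] += 1
    return counts


# Hypothetical example with four correlations.
estimated = [2.4, 3.9, None, 3.0]
reference = [2.5, 3.5, 4.0, 3.0]
print(tally_distance_errors(estimated, reference))
```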
A similar analysis of the dataset obtained with the Poisson-gap sampling scheme was performed. We obtained a larger number of distances as compared to the data from the random-shuffle scheme. For a thorough comparison of sampling schemes, their inherent sensitivity would need to be matched, and the linewidths and signal-to-noise ratios carefully compared. [61] However, the accuracy of the data obtained by Poisson-gap sampling still cannot compete with that of the data obtained from a single NOESY spectrum using uniform sampling (Table 5). The quality of the data derived from the spectra acquired with relaxation delays of 2.5 s and 10.5 s (5 × T 1 ) is comparable (Table S14). This indicates that the low quality of the NUS-derived distances above is not caused by the use of a much too short relaxation delay.
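The two sampling schemes compared above can be illustrated with the sketch below, which draws increment schedules from a uniform grid of 512 points. It is not the schedule generator used in this work: the random-shuffle function simply picks distinct increments without repetition, and the Poisson-gap function is a simplified version of the sine-weighted Poisson-gap idea in which the gap scale is adjusted until the requested number of points is reached. All function names and seeds are hypothetical.

```python
import math

import numpy as np


def random_shuffle_schedule(n_total, n_keep, rng):
    """Pick n_keep distinct increments at random (no repetition)."""
    return np.sort(rng.permutation(n_total)[:n_keep])


def poisson_gap_schedule(n_total, n_keep, rng, max_tries=10000):
    """Simplified sine-weighted Poisson-gap schedule.

    Gaps between sampled increments are Poisson distributed, with a mean
    that grows along the indirect dimension; the scale is nudged by 2%
    per attempt until exactly n_keep increments are selected.
    """
    scale = 2.0 * (n_total / n_keep - 1.0)
    for _ in range(max_tries):
        picked = []
        i = 0
        while i < n_total:
            picked.append(i)
            i += 1
            weight = math.sin((i + 0.5) / (n_total + 1) * math.pi / 2.0)
            i += int(rng.poisson(scale * weight))
        if len(picked) == n_keep:
            return np.array(picked)
        # Too many points: widen the gaps; too few: narrow them.
        scale *= 1.02 if len(picked) > n_keep else 1.0 / 1.02
    raise RuntimeError("no schedule with the requested number of points found")


# Twenty artificial 50% schedules per scheme, one seed per copy.
shuffle_copies = [random_shuffle_schedule(512, 256, np.random.default_rng(s)) for s in range(20)]
poisson_copies = [poisson_gap_schedule(512, 256, np.random.default_rng(s)) for s in range(20)]
print(len(shuffle_copies), len(poisson_copies))
```

An artificial NUS dataset then follows by keeping only the rows of the uniformly sampled time-domain data that are listed in a schedule, for example `fid[schedule]` for an array `fid` with one row per increment.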

| Number of increments and number of transients
NOESY spectra recorded with NUS show increased t 1 noise, which is known as one of the reasons for the decreased accuracy of NUS-recorded data. [8] We recorded the uniformly sampled data with 512 (n) data points in the indirect dimension. Accordingly, originally 256 and 128 increments were recorded when using the 50% and 25% NUS schemes, resulting in 50% and 75% shorter overall acquisition times as compared to the uniformly sampled dataset. In order to ensure that the error of the distances derived from the data sampled with NUS does not originate from lower resolution, we acquired spectra with 1024 (2n) and 2048 (4n) increments along with 50% and 25% NUS, providing 512 sampled increments, identical to the uniformly sampled data. In other words, this provides uniformly and non-uniformly sampled spectra acquired with equal overall acquisition time. The increased number of increments did not improve the data quality (Table S15). These spectra continue to suffer from t 1 noise to a larger extent than the spectra acquired without NUS.
We created four further artificial datasets using the Poisson-gap and the random-shuffle sampling schemes, with 25% NUS and 64 transients (4n), and 50% NUS and 32 transients (2n). The overall acquisition time of these spectra corresponds to that of the uniformly sampled spectrum; however, in this case, it is achieved by increasing the number of transients instead of the number of increments, as compared to that acquired with uniform sampling. This did not significantly improve the spectrum quality either (Tables S22-S26).
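The bookkeeping behind these equal-time comparisons can be checked with the short sketch below. It assumes that the overall acquisition time scales with the number of sampled increments times the number of transients, and that the uniformly sampled reference used 16 transients, a value inferred here from 32 and 64 transients being described as 2n and 4n. The function name is hypothetical.

```python
def sampled_increments_and_time(grid_points, sampling_fraction, transients,
                                ref_points=512, ref_transients=16):
    """Return (sampled increments, acquisition time relative to the reference).

    Time is approximated as sampled increments x transients, relative to
    a uniformly sampled reference with ref_points x ref_transients.
    """
    sampled = round(grid_points * sampling_fraction)
    relative_time = sampled * transients / (ref_points * ref_transients)
    return sampled, relative_time


cases = {
    "50% NUS, 512 grid, 16 transients": (512, 0.50, 16),
    "25% NUS, 512 grid, 16 transients": (512, 0.25, 16),
    "50% NUS, 1024 grid, 16 transients": (1024, 0.50, 16),
    "25% NUS, 2048 grid, 16 transients": (2048, 0.25, 16),
    "50% NUS, 512 grid, 32 transients": (512, 0.50, 32),
    "25% NUS, 512 grid, 64 transients": (512, 0.25, 64),
}
for label, args in cases.items():
    sampled, rel = sampled_increments_and_time(*args)
    print(f"{label}: {sampled} increments, {rel:.2f} x reference time")
```

The last four cases all match the acquisition time of the uniformly sampled spectrum, either through a larger grid in the indirect dimension or through more transients per increment.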
TABLE 5 Number of distances that were underestimated, had an error larger than 6.9%, or were based on a correlation involving a cross-peak with incorrect phase. Values are given for the uniformly sampled experimental NOESY spectrum acquired with a mixing time of 500 ms, the experimental 50% NUS NOESY spectrum obtained with random-shuffle, without repetition sampling (mixing time of 500 ms), the average for 20 artificially non-uniformly sampled NOESY spectra with 50% Poisson-gap sampling and the average for 20 artificially non-uniformly sampled NOESY spectra using 50% random-shuffle, without repetition sampling

| CONCLUSION AND DISCUSSION
In our hands, the quality of NOE-based distance determination is influenced by the use of NUS. This is equally true when the distances are derived from the analysis of an NOE build-up curve or from the normalized intensity of a single NOESY spectrum acquired with an adequate mixing time. The introduction of NUS results in a lower number of NOE correlations that can be used for distance determination, in addition to larger errors on these. Whether the Poisson-gap or the random-shuffle sampling scheme is applied does not make a substantial difference for the quantitativity of the NOE correlations. Under all conditions tested in this work, the accuracy of the reconstructed NUS data remains inferior to that obtained by standard processing of uniformly sampled data. Neither full recovery of the magnetization in between transients (d 1 ≥ 5 × T 1 ) nor increased resolution in the indirect dimension, by recording 2n and 4n points in the indirect dimension, compensates for the decreased quality of the distances determined from spectra acquired with NUS. Hence, independent of the way the data are recorded and reconstructed, implementation of NUS schemes resulted, in our hands, in a loss of observed NOE correlations and a decreased accuracy. The NUS sampling scheme and reconstruction algorithm are known to influence the quality of the spectra; however, the combination that provides the highest quality spectrum is currently not predictable, leaving users to find the best option by trial and error among numerous possible combinations. Hence, extensive spectral processing and the use of other sampling schemes and reconstruction algorithms may yield improvements. However, without a reference system at hand, the uniformly sampled dataset in our case, it is impossible to assess whether an optimal combination of sampling scheme and reconstruction algorithm was used.
In our hands, neither different sampling schemes, reconstruction algorithms, relaxation delays nor alteration of the number of recorded points in the indirect dimension resulted in acceptable quality, comparable to that obtained without NUS.
From our data, we can observe that weak NOE cross-peaks, and accordingly longer interatomic distances that are the most important in structure determination of small molecules, biomolecules and complexes, suffer the most from NUS. Assessing the build-ups by R 2 analysis, short distances (≤3.0 Å) derived from strong cross-peaks might be determined with a larger but still <6.9% error when recorded with 75% NUS; however, these are typically expected sequential NOEs. Evaluating the data comprehensively using Lin's concordance correlation coefficient (r C), the NUS datasets show poor concordance to the gold standard, independent of the sampling scheme, reconstruction algorithm and distance range.
For the evaluation of the influence of NUS on NOE quantitativity, we used spiramycin, a commercially available drug. It shows conformational flexibility and is a representative model for compounds of interest for, for example, real-life drug development. There is no indication that the conclusion drawn from this investigation would depend on molecular size or would for any reason be irrelevant for the evaluation of NOE spectra in general. In this work, 2-D NOESY spectra of a small molecule were evaluated; however, we cannot exclude that the quantitativity of 3-D NOESY spectra of biomolecules could be influenced by NUS. Hence, this should be assessed. Distance inaccuracies resulting from the implementation of NUS for NOESY experiments are of ubiquitous importance for virtually every solution NMR-based structure determination, independent of research field. As a cautionary tale, our work aims to prompt the cautious use of NUS in quantitative NMR applications until sampling and reconstruction algorithms that reliably provide quantitative NOE cross-peak intensities have been developed and proven, and guidelines for their use have been disclosed.