Synthetic Receptors for the High‐Affinity Recognition of O‐GlcNAc Derivatives

Abstract
The combination of a pyrenyl tetraamine with an isophthaloyl spacer has led to two new water‐soluble carbohydrate receptors (“synthetic lectins”). Both systems show outstanding affinities for derivatives of N‐acetylglucosamine (GlcNAc) in aqueous solution. One receptor binds the methyl glycoside GlcNAc‐β‐OMe with Ka ≈ 20 000 M−1, whereas the other binds an O‐GlcNAcylated peptide with Ka ≈ 70 000 M−1. These values substantially exceed those usually measured for GlcNAc‐binding lectins. Slow exchange on the NMR timescale enabled structural determinations for several complexes. As expected, the carbohydrate units are sandwiched between the pyrenes, with the alkoxy and NHAc groups emerging at the sides. The high affinity of the GlcNAcyl–peptide complex can be explained by extra‐cavity interactions, raising the possibility of a family of complementary receptors for O‐GlcNAc in different contexts.

Table S2. Chemical shifts (δ in ppm) in 9:1 H2O:D2O of GlcNAc-β-OMe (2) unbound and when bound to receptors 4, 5, or 9. The values in brackets denote the change in chemical shift (Δδ) on complex formation. All CH protons of the sugar's core are much more shielded when bound by bis-pyrenyls 5 and 9 (up to -4.73 ppm).

Synthetic Procedures
General Experimental
Commercial reagents were purchased from Sigma-Aldrich, Alfa-Aesar or Acros Organics and were used without further purification unless otherwise specified. Carbohydrates employed in binding studies were purchased from Sigma-Aldrich or Carbosynth Ltd.
Solvents were utilised as supplied unless otherwise stated. Anhydrous THF and CH2Cl2 were dried by passing through alumina, using a system manufactured by Anhydrous Engineering. Anhydrous DMF and DMSO were dried by distillation from P2O5 and CaH2, respectively.
1H, 13C, 19F and 31P NMR spectra were acquired on Varian 400-MR, Jeol Eclipse (300 MHz), Varian VNMRS500a, Varian VNMRS500b or Varian VNMRS600 Cryo spectrometers. All spectra were referenced to their residual internal solvent peak and were acquired at 298 K, unless otherwise specified.
DOSY NMR spectra were run on a Varian VNMRS500a spectrometer, using a one-shot pulse sequence set with gradient length 2 ms, diffusion time 100 ms, maximum gradient strength 25000 and minimum gradient strength 2000. The data was then processed using DOSY Toolbox.1
Mass spectra were recorded in the Mass Spectrometry Facility of the University of Bristol. ESI-MS spectra were recorded on an Orbitrap Elite (Thermo Scientific), a Synapt G2S (Waters), an Apex IV (Bruker Daltonics) or a MicrOTOF II (Bruker Daltonics). MALDI-MS spectra were recorded on either an UltrafleXtreme (Bruker Daltonics) or a 4700 Proteomics Analyser (Applied Biosystems).

HPLC chromatography was performed using a Waters 600 Controller with a Waters 2998 Photodiode Array Detector. For analytical runs an XSELECT CSH C18 5 µm (4.6 x 150 mm) column was used, and for preparative runs an XSELECT CSH Prep C18 5 µm OBD (19 x 250 mm) column was utilised, normally with an acetone-water solvent mixture.
Peptide synthesis. Fmoc-L-amino acid derivatives and resins were purchased from NovaBioChem and ACS grade N,N-dimethylformamide (DMF) from Fisher Chemicals. All other chemical reagents were purchased from Aldrich, Acros, Alfa Aesar and Fisher and used without further purification. All solvents employed were reagent grade. Reversed phase high performance liquid chromatography (RP-HPLC) was performed on an Agilent 1100 series system equipped with an auto-injector, fraction collector and UV detector (detecting at 220 nm) using an Agilent Zorbax Eclipse™ C18 semi-preparative column (5 μm, 10 x 250 mm) at a flow rate of 1.5 mL/min. All runs were performed using linear gradients of 0-100% solvent B over 40 min; solvent A = 50 mM ammonium bicarbonate in water, solvent B = acetonitrile. High resolution mass spectra were obtained [...] cyano-4-hydroxycinnamic acid as an internal standard matrix.

O-Protected receptors 6 and 10
A solution of pyrene "half-receptor" 12 (94 mg, 12.6 µmol) and the hydrochloride salt of 1,3,6,8-tetrakis(aminomethyl)pyrene (5.87 mg, 12.6 µmol) in THF (5.18 mL) and H2O (1.14 mL) was prepared. A solution of DIPEA (16.7 µL, 0.101 mmol) in THF (5.18 mL) and H2O (1.14 mL) was then added dropwise with a syringe pump over 6 hours. The resulting yellow solution was left stirring for 16 hours, after which the solvent was removed under reduced pressure. The resulting solid was dissolved in DCM and washed with NH4Cl (sat., 30 mL), brine (30 mL) and H2O (30 mL). The combined aqueous phases were extracted with DCM (3 x 30 mL), and the combined organic phases were dried over MgSO4, filtered and concentrated under reduced pressure. The sample was dissolved in 18% H2O in THF, filtered (0.2 µm syringe filter) and purified by HPLC (acetone:H2O, 81:19 to 92:8 over 130 min; flow rate: 17 mL/min).5 The components eluting at 113 min and 117 min were collected, the volatiles were evaporated and the samples were freeze-dried to yield the "staggered" cage 10 (20 mg, 2.85 µmol, 23%; from the faster-eluting peak) and the "eclipsed" cage 6 (30 mg, 4.27 µmol, 34%; from the slower-eluting peak). The structures were initially assigned as explained in Figure S13, and the assignments were confirmed as part of the structural determinations discussed in the final section of this Supporting Information.
Data for 10 (eluting at 113 min);
Figure S13. Top: stacked 1H NMR spectra of 10, 6 and a ~1:1 mixture of both. Bottom: {1H-1H} NOESY of the ~1:1 mixture with integrals of the p2/p4 cross peaks for 10 and 6. The p2/p4 integral for 10 is about twice that for 6, despite the former being slightly less concentrated. This supports the assignment, as the distance between p2 and p4 on different pyrene units is expected to be smaller for 10 than for 6. Monte Carlo Molecular Mechanics conformational searches on 10 and 6 (MMFF force field) yielded global energy minima in which the average p2/p4 distances (considering all possible combinations) were 9.59 and 9.30 Å for 6 and 10, respectively. Spectra were recorded at 500 MHz in CD3OD.

Water-soluble receptors 5 and 9
The protected receptor 10 or 6 (5 mg, 0.71 µmol) was dissolved in DCM (2 mL) and the solution cooled to 0 °C, before the dropwise addition of TFA (1 mL). The solution was then warmed to RT and stirred for 24 hours, before the volatiles were removed under reduced pressure. The product was then suspended in water (2 mL) before freeze-drying to give a pale yellow solid. The solid was suspended in H2O (2 mL) and the pH adjusted to 7.0 by addition of sodium hydroxide. The resulting solution was then filtered (0.45 µm syringe filter) and freeze-dried, affording the deprotected receptor (4.1 mg, 0.71 µmol, 100%) as a pale yellow solid.
The above NMR spectra were obtained at 358 K because spectra at lower temperatures were broadened and did not reflect the symmetry of the cage structures. For details of the NMR studies on 5 and 9 see Section 2 below.
[...] containing free amine group on N-terminus; the remaining steps performed manually. The resin was then treated with 60% hydrazine in methanol for 2 h. The resin was washed thoroughly with DMF (5 mL x 2), DCM (5 mL x 2) and MeOH (5 mL x 2) and then dried in vacuo. The resin was swelled in DCM (5 mL [...]
A solution of the glycopeptide in water was neutralised by additions of aqueous HCl, while monitoring with a pH meter. Once pH 7.0 was reached the solution was lyophilized to provide neutralised glycopeptide.
Figure S24. 1H NMR spectra of staggered receptor 9 at a range of concentrations, 15.6 µM to 500 µM in D2O at 298 K.
Figure S25. 1H NMR spectra of staggered receptor 9 at a range of temperatures, 25 °C to 85 °C in 10 °C steps before returning to 25 °C, 750 µM in D2O. Peaks in the spectra are observed to sharpen and resolve as temperature increases, then return to their original state on cooling.
[...] used in the experiment, were prepared and allowed to equilibrate overnight before use. Aliquots were then added to an NMR tube containing 400 µL of receptor solution.
The receptor concentration was thereby held constant while the carbohydrate concentration was increased. The sample tube was shaken after each addition and 1H NMR spectra were acquired at 298 K. If the receptor bound saccharide slower than the NMR sample rate ("slow exchange"), the Ka was determined by analysing the NMR integral of a peak assigned to the complex. The variable X was defined as the integral of an isolated resonance of the complex (typically in the aromatic region) divided by the integral of all the related resonances (typically the whole aromatic region). As X is proportional to the fraction of host in the bound state, the change in X could be plotted as a function of the guest concentration to give a curve which could be fitted to a 1:1 binding model to yield the association constant Ka. Mathematically, the fitting process is essentially identical to that employed for binding with fast exchange, except that the integral of a peak due to the complex replaces the chemical shift of a peak due to bound + unbound receptor. The calculation was performed using a non-linear least squares curve-fitting programme implemented within Excel. The programme yields the binding constant Ka and the limiting value of X (Xlim) as output. Ka values are listed in Table 1 (main paper) and Table S1 below. An estimated error for Ka was obtained from individual data points by assuming the determined Ka and Xlim. These errors are reported in Table S1 and are typically well below 5%.
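The fitting step described above can be illustrated with a minimal sketch. It is not the authors' Excel programme: the function names, the search strategy (a coarse grid in log10 Ka followed by golden-section refinement, with Xlim obtained in closed form for each trial Ka) and the synthetic data are all assumptions made for illustration. The model itself is the standard 1:1 isotherm with the host concentration held constant.

```python
import math


def fraction_bound(ka, host, guest):
    """Fraction of host in the 1:1 complex; total concentrations in M, Ka in M^-1."""
    b = host + guest + 1.0 / ka
    complex_conc = (b - math.sqrt(b * b - 4.0 * host * guest)) / 2.0
    return complex_conc / host


def profile_xlim(ka, host, guests, xs):
    """Best limiting X for a fixed Ka (X = Xlim * fraction is linear in Xlim)."""
    fs = [fraction_bound(ka, host, g) for g in guests]
    return sum(f * x for f, x in zip(fs, xs)) / sum(f * f for f in fs)


def sum_sq(log_ka, host, guests, xs):
    """Sum of squared residuals at a given log10(Ka), with Xlim profiled out."""
    ka = 10.0 ** log_ka
    xlim = profile_xlim(ka, host, guests, xs)
    return sum((x - xlim * fraction_bound(ka, host, g)) ** 2
               for g, x in zip(guests, xs))


def fit_one_to_one(host, guests, xs):
    """Return (Ka, Xlim) from a grid scan in log10(Ka) plus golden-section refinement."""
    grid = [1.0 + 0.1 * i for i in range(61)]  # log10(Ka) from 1 to 7 (assumed range)
    best = min(grid, key=lambda v: sum_sq(v, host, guests, xs))
    a, b = best - 0.1, best + 0.1
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sum_sq(c, host, guests, xs) < sum_sq(d, host, guests, xs):
            b = d
        else:
            a = c
    ka = 10.0 ** ((a + b) / 2.0)
    return ka, profile_xlim(ka, host, guests, xs)


# Synthetic, noise-free demonstration data (hypothetical, not taken from the paper).
HOST = 0.15e-3  # M, receptor concentration held constant
GUESTS = [0.0, 0.05e-3, 0.1e-3, 0.2e-3, 0.4e-3, 0.8e-3, 1.6e-3]  # M
XS = [0.3 * fraction_bound(2.0e4, HOST, g) for g in GUESTS]

ka_fit, xlim_fit = fit_one_to_one(HOST, GUESTS, XS)
print(f"Ka = {ka_fit:.0f} M^-1, Xlim = {xlim_fit:.3f}")
```

Because Xlim enters the model linearly, only Ka needs a numerical search, which keeps the sketch free of external dependencies. With real, noisy integrals a general-purpose least-squares routine would serve equally well.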
Receptor 5 bound two substrates in slow exchange, the β-GlcNAc glycosides 2 and 3. In both cases the limiting value of X was anomalously low and some signals due to receptor remained unchanged throughout the titration. However, on leaving 5 in the presence of either 2 or 3 for a long period (6 and 2 months respectively) the spectra evolved towards those attributed to the complex (see Figures S30 and S34). These results imply that a proportion of 5 is present in an inactive form, presumably a conformational isomer, which interconverts with the active form on a timescale of months. The titrations with 2 and 3 were performed within 24 hours, and showed no indication that significant conversion from inactive to active forms was occurring over this time scale (fits to the 1:1 binding model were good, with no sign of an upward drift in X towards the end of the experiment). This being the case, the presence of the inactive form should not affect the titration, which can be analysed to give the association constants of the two complexes. Although the concentration of active 5 was not known accurately, the analysis was insensitive to this parameter.7
In cases where the receptor bound saccharide at similar or faster rates than the NMR sample rate ("medium/fast exchange"), no attempt was made to estimate binding constants due to the complex nature of the receptor spectra. Nonetheless, binding was clearly indicated in some cases, whereas in others there appeared to be little or no interaction between receptor and carbohydrate.
Spectra from the NMR titrations, and analysis curves where relevant, are included among Figures S28-S67 below. Commentary on individual cases is given in the Figure [...]
[...] under identical conditions. The heats of dilution were then subtracted from the binding data, and the resulting peaks integrated to give the heat evolved by each addition. The total heat evolved (ΔH) was then plotted against the total concentration of guest. The data was then fitted to a 1:1 binding model using a nonlinear least squares curve-fitting program implemented within Excel, to give a binding constant (Ka). The Gibbs free energy of binding (ΔG) can then be derived from the binding constant (Ka), and the entropy of binding (ΔS) can be derived from ΔH and ΔG. The fitting procedure also yields errors in Ka, as in the case of NMR described above. This method consistently produced more accurate fits than fitting the data to an S-curve, as in the MicroCal software (S-curves are typically not observed for binding constants below ~10⁴-10⁵ M⁻¹).
It must be noted that, in the case of receptor 5, the presence of inactive receptor should introduce errors to ΔH (and thus also ΔS). The Ka and derived ΔG values remain reliable, as discussed above for the 1H NMR titrations.
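The derivation of ΔG and ΔS from the fitted quantities follows the standard relations ΔG = -RT ln Ka and ΔS = (ΔH - ΔG)/T. A short sketch is given below, assuming a 1 M standard state and T = 298 K. The function names are illustrative, and the ΔH value in the example is a hypothetical placeholder, not a measured result.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def delta_g(ka, temperature=298.0):
    """Free energy of binding in J/mol from Ka in M^-1 (1 M standard state assumed)."""
    return -R * temperature * math.log(ka)


def delta_s(delta_h, ka, temperature=298.0):
    """Entropy of binding in J mol^-1 K^-1 from dH (J/mol) and Ka (M^-1)."""
    return (delta_h - delta_g(ka, temperature)) / temperature


# Ka taken from the ITC fit quoted for receptor 9 with substrate 2 (Figure S49).
dg = delta_g(16625.0)
print(f"dG = {dg / 1000.0:.2f} kJ/mol")

# Hypothetical dH of -30 kJ/mol, used only to show the arithmetic.
ds = delta_s(-30000.0, 16625.0)
print(f"dS = {ds:.1f} J/(mol K)")
```

For Ka = 16625 M⁻¹ at 298 K this gives ΔG of about -24.1 kJ/mol.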
ITC output and analysis curves are included among Figures S28-S67. An overview of the binding data, including thermodynamic quantities and errors, is given in Table S1 below.
Table S1. Summary of binding results for receptors 5 and 9 with carbohydrate substrates in aqueous solution at 298 K, including estimated errors derived from the fitting procedure and thermodynamic quantities from the ITC measurements.
[a] Unreliable due to presence of inactive receptor.
Figure S28. 1H NMR binding study of eclipsed receptor 5 (0.15 mM) titrated with methyl N-acetyl-β-D-glucosaminide (2) (28.3 mM) in D2O at 298 K. Spectra imply binding with slow exchange, allowing analysis of integrals to give Ka (see below).
Figure S42. Partial 1H NMR spectra from the titration of eclipsed receptor 5 (0.15 mM) with D-galactose (525 mM) in D2O at 298 K. No evidence for binding can be detected.
Figure S43. Partial 1H NMR spectra from the titration of eclipsed receptor 5 (0.15 mM) with D-cellobiose (255 mM) in D2O at 298 K. No evidence for binding can be detected.
Figure S44. Partial 1H NMR spectra from the titration of eclipsed receptor 5 (0.15 mM) with N-acetyl-D-galactosamine (482 mM) in D2O at 298 K. Spectra show minor changes consistent with binding with medium/fast exchange, which could not however be quantified from this study.
Figure S45. Partial 1H NMR spectra from the titration of eclipsed receptor 5 (0.15 mM) with N-acetyl-D-mannosamine (489 mM) in D2O at 298 K. Spectra show minor changes consistent with binding with medium/fast exchange, which could not however be quantified from this study.
Figure S46. Partial 1H NMR spectra from the binding study of eclipsed receptor 5 (0.2 mM) titrated with N,N'-diacetyl-D-chitobiose (102 mM) in D2O at 298 K. No evidence for binding can be detected.
Figure S47. 1H NMR binding study of staggered receptor 9 (0.15 mM) titrated with methyl N-acetyl-β-D-glucosaminide (2) (18.4 mM) in D2O at 298 K. Spectra imply binding with slow exchange, allowing analysis of integrals to give Ka (see below).
Figure S48. 1H NMR binding study of staggered receptor 9 (0.15 mM) titrated with methyl N-acetyl-β-D-glucosaminide (2) [...]
Figure S49. ITC binding study of staggered receptor 9 (0.25 mM) titrated with methyl N-acetyl-β-D-glucosaminide (2) (10 mM) in H2O. The sum of heat evolution (µcal) was plotted as a function of the concentration of carbohydrate (mM) and fitted to a 1:1 binding model indicating Ka = 16625 ± 548 M⁻¹ (3.3%) and r = 0.9988 (thermodynamic data also given in the figure).
Figure S50. Partial 1H NMR spectra from the binding study of staggered receptor 9 (0.25 mM) titrated with glycopeptide 3 (5.12 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could be relatively strong given that changes occur at low concentrations. However, quantification was not possible from this study.
Figure S51. Partial 1H NMR spectra from the binding study of staggered receptor 9 (0.15 mM) titrated with methyl N-acetyl-α-D-glucosaminide (13) (202 mM) in D2O at 298 K. Spectra imply binding with slow exchange, allowing analysis of integrals to give Ka (see below).
Figure S60. Partial 1H NMR spectra from the titration of staggered receptor 9 (0.15 mM) with D-glucose (16) (592 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.
Figure S61. ITC binding study of staggered receptor 9 (0.50 mM) titrated with D-glucose (16) (20 mM) in H2O. The sum of heat evolution (µcal) was plotted as a function of the concentration of carbohydrate (mM) and fitted to a 1:1 binding model indicating Ka = 194 ± 1 M⁻¹ (0.6%) and r = 0.99997 (thermodynamic data also given in the figure).
Figure S62. Partial 1H NMR spectra from the titration of staggered receptor 9 (0.15 mM) with D-mannose (494 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.
Figure S63. Partial 1H NMR spectra from the titration of staggered receptor 9 (0.15 mM) with D-galactose (524 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.
Figure S64. Partial 1H NMR spectra from the titration of staggered receptor 9 (0.15 mM) with D-cellobiose (248 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.
Figure S65. Partial 1H NMR spectra from the titration of staggered receptor 9 (0.15 mM) with N-acetyl-D-galactosamine (506 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.
Figure S66. Partial 1H NMR spectra from the titration of staggered receptor 9 (0.15 mM) with N-acetyl-D-mannosamine (495 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.
Figure S67. Partial 1H NMR spectra from the binding study of staggered receptor 9 (0.13 mM) titrated with N,N'-diacetyl-D-chitobiose (18.4 mM) in D2O at 298 K. Spectra imply binding with medium/fast exchange, which could not be quantified from this study.

Structural studies on receptor-guest complexes
General procedure to obtain 3D NMR structures
NMR structures were obtained for the four complexes for which slow exchange between free and bound states resulted in well-resolved spectra of the complexes (i.e. 5•2, 5•3, 9•2 and 9•13). Clear cross peaks were observed in {1H-1H}-NOESY, -TOCSY and -COSY spectra of samples in 10% D2O in H2O.8 Those involving the sugar protons and the protons of the pyrenyl cage (i.e. all except the dendrimeric side-chains) were assigned to obtain a self-consistent interpretation of each NMR data-set. To minimize the risk of spin diffusion, the NOESY spectra were all recorded with a short mixing time (150 ms).9 Isolated nOe cross-peaks were integrated and each integral (Ix) was converted to an interatomic distance (dx) using the two spin approximation:10
$d_x = d_{\mathrm{ref}} \sqrt[6]{I_{\mathrm{ref}} / I_x}$
One of the cross peaks between pyrenyl protons p4/p5 or p9/p10 was chosen as the reference integral (Iref), assuming that the average interatomic distance between such pyrene protons, as found in pyrene-containing crystal structures, can be considered a reliable reference (i.e., dref = 2.28 Å).11 The obtained distances were used as constraints in an initial model, which was energy-minimised with Batchmin v10.3 (accessed via Maestro 9.7) using the MMFFs force field, GBSA water solvation and 20% tolerances on the constrained distances. A few constraints which appeared unrealistically short, possibly due to overlaps in the NOESY spectra, were ignored during the minimisations. Annotated spectra, distance tables and images of the structures are given in the following pages. Final coordinates are presented as .cif files accompanying this Supporting Information, with core (assigned) protons labelled as herein.
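The conversion of nOe integrals to distances can be sketched as follows. Under the two-spin approximation the cross-peak integral scales with the inverse sixth power of the interproton distance, so dx = dref (Iref/Ix)^(1/6). The function names are illustrative, and reading the "20% tolerances" as a symmetric ±20% window around each distance is an assumption rather than something stated in the procedure.

```python
def noe_distance(i_x, i_ref, d_ref=2.28):
    """Distance in angstrom from an nOe integral, scaled against a reference pair.

    d_ref defaults to the pyrene p4/p5 (or p9/p10) reference distance of 2.28 A.
    """
    if i_x <= 0 or i_ref <= 0:
        raise ValueError("integrals must be positive")
    return d_ref * (i_ref / i_x) ** (1.0 / 6.0)


def restraint_window(distance, tolerance=0.20):
    """Lower and upper bounds for a restraint (symmetric tolerance is an assumption)."""
    return distance * (1.0 - tolerance), distance * (1.0 + tolerance)


# Hypothetical integrals: a cross peak 64 times weaker than the reference
# corresponds to twice the reference distance.
d = noe_distance(i_x=1.0, i_ref=64.0)
low, high = restraint_window(d)
print(f"d = {d:.2f} A, window = {low:.2f}-{high:.2f} A")
```

The sixth-root dependence makes the derived distance fairly insensitive to integration error: a factor of two in the integral changes the distance by only about 12%.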
The chemical shift changes observed for GlcNAc-β-OMe (2) on complexation with 5, 9 and (for comparison) 4 are listed in Table S2 below. It can be seen that all carbohydrate protons undergo substantial upfield shifts on complexation, and that these are generally much larger for pyrene-based 5 and 9.
8 Data was collected on a 600 MHz VNMRS spectrometer equipped with a 5 mm cryogenically cooled probe or, when further resolution was required, at 900 MHz.
Eclipsed receptor 5 with methyl N-acetyl-β-D-glucosaminide (2)
Figure S68. Structures of eclipsed receptor 5 and methyl N-acetyl-β-D-glucosaminide with the numbering used for structural assignment.

Images of molecular model of 5 + GlcNAc-β-OMe (2)
Eclipsed receptor 5 with glycopeptide 3
Figure S76. Structures of eclipsed receptor 5 and the sugar fragment of glycopeptide 3 with attached residue. The numbering used for structural assignment is also given.