Molecular Recognition of Disaccharides in Water: Preorganized Macrocyclic or Adaptive Acyclic?

Abstract
When faced with the choice between a preorganized and an adaptive design for new biomimetic receptors for carbohydrates, researchers have most often chosen shape‐persistent macrocyclic structures to achieve effective recognition of neutral saccharides in water. In contrast, acyclic architectures have seldom been explored, even though they are potentially simpler and more easily accessible. In this work, comparison of the binding properties of two structurally related diaminocarbazolic receptors, featuring a macrocyclic and an acyclic tweezer‐shaped architecture, highlighted the advantages provided by the acyclic receptor in terms of selectivity in the recognition of 1,4‐disaccharides of biological interest. Selective recognition of GlcNAc₂, the core fragment of N‐glycans exposed on the surface of enveloped viruses, stands as an emblematic example. NMR spectroscopic data and molecular modeling calculations were used to ascertain the differences in binding mode and to shed light on the origin of recognition efficacy and selectivity.


Introduction
Among the plethora of biologically relevant oligosaccharides, those connected by a glycosidic 1,4-linkage are plentiful in nature. Lactose and maltose are two of the most common 1,4-disaccharides, whereas cellulose and chitin, constituted by repeating units of cellobiose (Glc₂, CeB) and of N,N'-diacetylchitobiose (GlcNAc₂), respectively, both connected by 1,4-glycosidic linkages, are among the most abundant biopolymers in nature. [1,2] Glycosidic 1,4-linkages are also very common in glycan structures. For example, the disaccharide GlcNAc₂ is a part of the GlcNAc₂Man₃ fragment, highly conserved in the core of N-glycans exposed on the surface of enveloped viruses, some of which are particularly hazardous for human health, including, among others, coronaviruses and retroviruses. [3,4] Molecular recognition of disaccharides of biomedical relevance by biomimetic receptors in physiological media represents a major challenge of current research, [5,6] because selective recognition of neutral saccharides in water must cope with a highly competitive solvent. [7] Nevertheless, in the last few years significant steps forward have been made by developing biomimetic receptors based on rigid macrocyclic architectures. [8] Although this approach has been quite successful for the recognition of several mono- and oligosaccharides, it is hampered by lengthy multistep syntheses of low overall yields, due to the critical macrocyclization step. [9] On the other hand, examples of effective recognition of neutral saccharides in water by acyclic receptors are extremely rare in the literature, [10,11] even though acyclic flexible architectures can take advantage of being more easily adaptable to the guest, while featuring simpler structures suitable for further optimization.
We have recently reported two biomimetic receptors (1 [12] and 2, [13] Figure 1) effectively recognizing carbohydrates in water. The two receptors share a common tridentate diaminocarbazole hydrogen-bonding motif, equipped with phosphonate hydrosolubilizing groups, and two anthracene groups, providing extended CH-π interactions with the saccharidic backbone. [14] Receptor 1 features a preorganized macrocyclic structure possessing a hydrophobic cavity lined with H-bonding groups, whereas receptor 2 possesses a flexible, acyclic, tweezer-shaped architecture featuring analogous binding motifs. Receptor 1, easily available in six steps with 30 % overall yield, effectively binds monosaccharides in water, selectively recognizing the β anomer of glucose with a 1.3 mM affinity (expressed as intrinsic median binding concentrations, BC₅₀⁰), and the α anomers of glucose, galactose, and fucose with affinities of 3.12, 1.19 and 360 μM, respectively. [12] Although extensively investigated toward monosaccharides, the binding properties of receptor 1 toward disaccharides had not yet been explored. On the other hand, receptor 2, which has been shown to effectively recognize 1,4-disaccharides, with a marked affinity (160 μM) and selectivity for the methyl β-glycoside of GlcNAc₂, [13a] did not bind to monosaccharides at all.
In order to assess the role of the architecture in saccharide recognition, in this work we investigated the binding affinities of receptor 1 toward the set of glucose-containing disaccharides used to test receptor 2, to compare their binding properties and ascertain the effect of macrocyclic (preorganized) versus acyclic (adaptive) structures on recognition ability. [15] NMR-based molecular modeling calculations were used to give a three-dimensional description of the complexes of the two receptors with a common guest, which revealed the substantial role of CH-π interactions.

Results and Discussion
In a preliminary screening by ¹H NMR spectroscopy, the binding ability of 1 was tested toward a set of disaccharides containing at least one glucose unit, for which the receptor showed good affinities, including cellobiose (CeB), lactose (Lac), maltose (Mal), trehalose (Tre), and sucrose (Suc; Figure 2). Binding ability was qualitatively evaluated by monitoring the shifts of the proton signals of the sugar upon addition of an equimolar amount of 1. Although for Suc and Tre no variations were observed, a marked upfield shift was detected for CeB, Mal and Lac, reasonably due to the shielding effect of the anthracene moieties in the binding cavity, larger for the β than for the α anomers. A concomitant broadening of signals, larger for the β anomers, indicated slow chemical exchange, most likely due to strong binding (Figures S1-S3 in Supporting Information).
For a quantitative determination of the binding ability of 1, ¹H NMR titrations of methyl-β-glycosides of cellobiose (MeβCeB), lactose (MeβLac), and maltose (MeβMal) were carried out in D₂O (pD 7.4) at 298 K, additionally including MeβGlcNAc₂, for which receptor 2 showed high affinity (Figure 2). To avoid ambiguities in the definition of the binding model, the cumulative association constants reported in Table 1 were obtained by the simultaneous fit of all available signals from two independent titrations, run at different reactant concentrations. Because multiple complex species were found for all systems, the overall affinities reported in Table 1 were determined by the intrinsic median binding concentration parameter (BC₅₀⁰), [16] which was calculated from the measured binding constants. ¹H NMR titrations with MeβMal were also duplicated at pD 11 (Table S1) and fitted to the association model obtained at pD 7.4. While protonation of the aromatic amino groups is not expected in the investigated range of pD, the degree of protonation of the phosphonate groups does not affect the binding ability of receptor 1, as previously observed for binding to monosaccharides [12] and confirmed by the comparable affinities obtained at different pD values.
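As a rough illustration of how a titration curve relates to an association constant, the sketch below evaluates the simplest possible case, a single 1 : 1 complex observed under fast exchange. This is a deliberate simplification: the systems described here involve multiple complex species and were fitted with a multi-equilibrium model, which this sketch does not reproduce. The function names and all numerical inputs are hypothetical.

```python
import math


def bound_fraction_1to1(k_assoc, receptor_total, guest_total):
    """Fraction of guest bound in a single 1:1 complex, R + G <=> RG.

    k_assoc is in M^-1; total concentrations are in M.
    Assumption: no other species (dimers, 2:1 complexes) are present.
    """
    if guest_total <= 0:
        raise ValueError("guest_total must be positive")
    b = receptor_total + guest_total + 1.0 / k_assoc
    # Physically meaningful root of x^2 - b*x + Rt*Gt = 0, with x = [RG]
    complex_conc = (b - math.sqrt(b * b - 4.0 * receptor_total * guest_total)) / 2.0
    return complex_conc / guest_total


def observed_shift(delta_free, delta_bound, k_assoc, receptor_total, guest_total):
    """Population-weighted guest chemical shift (fast exchange is assumed)."""
    f = bound_fraction_1to1(k_assoc, receptor_total, guest_total)
    return delta_free + f * (delta_bound - delta_free)


# Hypothetical values for illustration only, not data from this work
for receptor_mm in (0.0, 0.5, 1.0, 2.0, 5.0):
    shift = observed_shift(4.50, 3.70, 1000.0, receptor_mm * 1e-3, 1e-3)
    print(f"[R]t = {receptor_mm:.1f} mM -> delta_obs = {shift:.3f} ppm")
```

In an actual analysis the association constant and the bound-state shift would be the fitted parameters, refined simultaneously against every monitored signal.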
Chemistry-A European Journal
Full Paper doi.org/10.1002/chem.202101238

[Table 1 (caption fragment; table body not recovered): … [b] for receptor to glycoside (R : G) complexes of 1 and 2 with methyl glycosides, measured at 298 K from NMR data in D₂O at pD 7.4. [c] Receptor …]

As with monosaccharides, Table 1 shows multiple binding constants for receptor 1 with disaccharides. Strong self-association, with a dimerization constant of log β_dim = 3.84 ± 0.20, favors complex species in which the receptor is dimeric, featuring two binding cavities and giving rise to complexes with stoichiometries higher than 1 : 1. Results show that receptor 1 effectively binds to MeβCeB, MeβMal, and MeβLac with good affinities, though with lack of selectivity, but does not recognize MeβGlcNAc₂, for which no significant variations of chemical shifts were detected (Figure S8). Thus, receptor 1 can distinguish 1,4 from 1,1' disaccharides (Suc/Tre), which are not bound at all, and MeβCeB from the N-acetylated amino analogue MeβGlcNAc₂, but cannot discriminate among glucose-containing 1,4-disaccharides, proving to be insensitive to the configuration of the anomeric linkage (MeβCeB/MeβMal) and to the presence of axial substituents (MeβLac). Surprisingly, these 1,4-disaccharides are bound with an affinity very close to that previously observed for MeβGlc, indicating lack of selectivity between mono- and disaccharides.
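To give a feel for what a dimerization constant of log β_dim = 3.84 implies, the following sketch solves the monomer-dimer mass balance for the free receptor alone. It assumes that β_dim is defined as [R₂]/[R]² and that no guest is present; the total concentrations used are hypothetical and are not the titration conditions of this work.

```python
import math


def monomer_concentration(log_beta_dim, receptor_total):
    """Free monomer concentration (M) for 2 R <=> R2.

    Assumes beta_dim = [R2] / [R]^2, so the mass balance is
    receptor_total = [R] + 2 * beta_dim * [R]^2.
    """
    beta = 10.0 ** log_beta_dim
    return (-1.0 + math.sqrt(1.0 + 8.0 * beta * receptor_total)) / (4.0 * beta)


def dimer_fraction(log_beta_dim, receptor_total):
    """Fraction of receptor molecules that sit in the dimer."""
    return 1.0 - monomer_concentration(log_beta_dim, receptor_total) / receptor_total


# Hypothetical total receptor concentrations, for illustration only
for total in (1e-5, 1e-4, 1e-3):
    print(f"{total:.0e} M -> dimer fraction {dimer_fraction(3.84, total):.2f}")
```

Under these assumptions most of the receptor is dimeric at millimolar concentration, consistent with the statement that dimeric species dominate the speciation.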
Comparison of binding properties between receptors 1 and 2, as obtained by ¹H NMR titrations, quantifies the selectivity advantage achieved through the adaptive architecture. Indeed, in contrast to 1, receptor 2 not only strongly binds to MeβGlcNAc₂, but also discriminates among the investigated glucose-containing 1,4-disaccharides. Receptor 2 shows preference for the all-equatorial MeβCeB, which is bound with an affinity very close to that observed for 1, whereas MeβMal and MeβLac are bound with an affinity more than one order of magnitude smaller.
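An order-of-magnitude difference in affinity can be translated into a free-energy difference with ΔG° = RT ln K_d. The sketch below does this under the assumption that a BC₅₀⁰ value may be treated as an apparent dissociation constant (1 M standard state), which is strictly valid only when a single 1 : 1 complex is formed. The function name is hypothetical; the 160 μM input is the affinity of receptor 2 for the methyl β-glycoside of GlcNAc₂ quoted in the Introduction.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant, kJ mol^-1 K^-1


def binding_free_energy_kj(dissociation_conc_molar, temperature_k=298.0):
    """Apparent standard binding free energy (kJ/mol) from a K_d-like value.

    Assumption: the input (for example a BC50 value) behaves like a
    dissociation constant for a 1:1 complex.
    """
    return R_KJ * temperature_k * math.log(dissociation_conc_molar)


# 160 uM affinity of receptor 2 for the methyl beta-glycoside of GlcNAc2
print(f"{binding_free_energy_kj(160e-6):.1f} kJ/mol")

# Free-energy cost of a tenfold weaker affinity at 298 K
per_decade = binding_free_energy_kj(1e-3) - binding_free_energy_kj(1e-4)
print(f"{per_decade:.2f} kJ/mol per order of magnitude")
```

On this reading, a drop in affinity of more than one order of magnitude corresponds to a loss of more than roughly 5.7 kJ/mol of apparent binding free energy at 298 K.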
To shed light on the origin of this unexpected difference between 1 and 2, a description of the binding mode characterizing the receptor-disaccharide complexes in solution was attempted by combining NMR techniques with molecular modeling calculations, following the approach previously adopted to study the complex between MeβGlcNAc₂ and receptor 2, which provided an informative picture of the interaction. [13a] Because MeβGlcNAc₂ did not bind to 1, the investigation was carried out on MeβCeB, MeβMal and MeβLac.
Chemical shift variations of the anomeric protons H-1 and H'-1 upon formation of the 1 : 1 complex with receptor 1, as calculated by nonlinear regression analysis of titration data, showed an upfield shift for all three disaccharides caused by the aromatic shielding effect (Figure 3a and Table S2), which is more pronounced on the H-1 proton of the methyl-β-glucoside unit (Δδ = 0.80-0.82) than on the H'-1 proton (Δδ = 0.24-0.32), suggesting a closer contact of the former with the aromatic moieties. On the other hand, from the analysis of chemical shift variations in the 1 : 1 complexes with receptor 2 (Figure 3b and Table S3), a stronger shielding effect is apparent, more marked for MeβCeB, indicating a closer proximity of the entire disaccharide to the aromatic rings. This evidence suggests that 2 can adapt better than 1 to the disaccharidic guests, showing preference for the all-equatorial MeβCeB. The chemical shift differences (CSDs) for the H-1 and H'-1 protons of the latter are very similar, suggesting a fit of the entire disaccharide into the cleft of the receptor. The CSDs of MeβMal and MeβLac, significantly smaller and with the H-1 CSD predominant, suggest a less comfortable fit in the cleft, in agreement with the corresponding lower affinities.
The complexes of receptors 1 and 2 with MeβCeB were then selected as representative examples, and their binding modes were studied by NOESY spectroscopy at pD 11, a medium in which the receptors are fully deprotonated species. From NOESY spectra run on the 1 : 1 mixture of 1 and MeβCeB, a strong intramolecular NOE contact was found between the H'-1 and the H-4 protons (Figure S10), suggesting that in the complex the disaccharide is in the conformation usually found in solution. Unambiguous intermolecular NOE contacts were also identified (Figures S11 and S12), the strongest of which were those between the OCH₃ protons and both the H-C and H-D protons of the anthracene ring (Figure 1), and between the H'-1/H'-5 protons and the H-D protons.
NOESY spectra recorded on an equimolar mixture of 2 and MeβCeB showed unambiguous intermolecular NOE contacts between both saccharidic units of MeβCeB and the anthracene protons of 2 (Figure S15). The NOESY map shows a strong NOE cross peak between H-2 and H-C, and an NOE contact of H'-2 with the H-F located on the opposite side of the anthracene ring. Moreover, the OCH₃ protons show NOE contacts with the H-C, H-D and H-E protons.
Based on NOESY NMR evidence, molecular mechanics calculations were carried out on the 1 : 1 complex of 1 with MeβCeB, on the assumption that, although prevalently dimeric, the receptor would feature two independent binding sites. A conformational search, using a well-tested unconstrained molecular mechanics protocol, [17] returned a family of conformers within 5.19 kJ mol⁻¹ from the global minimum that was in very good agreement with NMR spectroscopic data. The minimum energy structure depicted in Figure 4a and b shows MeβCeB partially located inside the receptor cavity, with the methyl glycoside unit nested inside the cavity and the other unit protruding outward, in a geometry that agrees with the strongest NOE contacts observed in NOESY maps (Table S4) and with the shift differences observed from titration experiments (Figure 4a). All O···H interatomic distances shorter than the sum of the van der Waals radii and compliant with hydrogen bonding criteria were calculated from the above model, and several hydrogen-bonding interactions were found involving the methyl glucoside unit exclusively (Figure 4b). An additional contribution to binding is provided by several CH-π interactions showing short distances, established between the methyl glucoside unit and the anthracene rings (Table S6).
The binding geometry obtained from calculations supports the observed affinities. Indeed, because the disaccharide is bound through the methyl glycoside unit exclusively, lack of selectivity among the investigated set of disaccharides can be easily anticipated, irrespective of the α/β glycosidic linkage to the second unit. This evidence also explains the closely similar affinities observed between the disaccharides and the monosaccharide MeβGlc. Likewise, the 1,1' disaccharides, featuring a bulky substituent in place of the methyl group, and MeβGlcNAc₂, featuring the N-acetyl groups, can hardly fit into the receptor cavity. Thus, despite the good affinities observed, lack of selectivity between glucose-containing 1,4-disaccharides can be ascribed to the size of the macrocyclic cavity, unable to accommodate the entire disaccharide.
The conformational search carried out on the 1 : 1 complex between 2 and MeβCeB resulted in a single family of minimum energy conformers within 8.73 kJ mol⁻¹ from the global minimum. The minimum energy structure depicted in Figure 4c shows the MeβCeB entirely located inside the binding cleft between the two anthracene faces, in a geometry closely similar to that previously observed in the complex with MeβGlcNAc₂, [13a] and in agreement with the proximities inferred by strong NOE contacts (Table S5).
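The energy windows quoted for the two conformer families (5.19 kJ mol⁻¹ for the complex of 1 and 8.73 kJ mol⁻¹ for the complex of 2) can be put in perspective with a Boltzmann factor. The sketch below is only an illustrative estimate: it assumes 298 K, treats the molecular mechanics energy differences as if they were free-energy differences, and ignores degeneracy, none of which is stated in this work. The function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant, kJ mol^-1 K^-1


def relative_population(delta_e_kj_mol, temperature_k=298.0):
    """Boltzmann weight of a conformer relative to the global minimum.

    Assumption: the energy gap is used as a stand-in for a free-energy gap.
    """
    return math.exp(-delta_e_kj_mol / (R_KJ * temperature_k))


# Upper edges of the conformer families reported for the two complexes
for label, gap in (("1 with MeβCeB", 5.19), ("2 with MeβCeB", 8.73)):
    print(f"{label}: weight at {gap} kJ/mol = {relative_population(gap):.3f}")
```

With these assumptions, a conformer at the upper edge of either window would carry a weight of roughly 12 % or 3 % of that of the global minimum, respectively.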
Hydrogen bonding interactions could be calculated from the above model (Figure 4d) and, analogously to 1, four hydrogen bonds were found between the diaminocarbazole unit and MeβCeB. However, in contrast to 1, a significant enhancement to binding could result from the extensive network of CH-π interactions that can be established between the axial protons of both the saccharidic units and the anthracenes (Table S6).
The above three-dimensional descriptions show that the acyclic structure of 2 can adapt to the disaccharidic guest better than the macrocyclic structure of 1, giving rise to comparable affinity despite the lack of a hydrogen-bonding unit. The evidence indicates that the latter is effectively compensated for by a tighter fit and by extensive CH-π interactions. Such compensation is not fully achieved with MeβMal and MeβLac because axial substituents hamper a tight fit into the cleft, causing a drop in affinity. In contrast, the lack of preorganization and absence of a hydrogen-bonding unit cause a severe drop in the affinity of 2 for monosaccharides, which is not compensated for by additional interactions; this results in undetectable binding. Thus, macrocyclic receptor 1 appears to be well preorganized for binding a monosaccharidic but not a disaccharidic guest, whereas acyclic receptor 2 can take advantage of its adaptive structure to establish more extensive attractive interactions than its macrocyclic counterpart.
The enhanced binding of 2 to MeβGlcNAc₂ compared to MeβCeB could be explained by the additional hydrogen bonding and CH-π interactions involving the N-acetyl group that the former can establish with the receptor.

Conclusion
Together, the results presented demonstrate that a flexible acyclic structure can be an effective alternative to the widely studied macrocyclic architectures for the molecular recognition of neutral disaccharides in water, provided that a suitably designed combination of hydrogen bonding and CH-π interactions can be established with the saccharidic guest. The structurally simple tweezer-shaped receptor 2 presents significant advantages over its macrocyclic counterpart 1, accommodating the disaccharidic guest within the binding cleft and selectively recognizing the methyl-β-glycoside of GlcNAc₂ over a set of monosaccharides and structurally related 1,4-disaccharides. Because of its simple structure, easy synthetic availability, and potential for structural modifications, the tweezer-shaped architecture of receptor 2 opens the way to the design of acyclic receptors for the recognition of saccharides in water.