Inducing Social Self‐Sorting in Organic Cages To Tune The Shape of The Internal Cavity

Abstract
Many interesting target guest molecules have low symmetry, yet most methods for synthesising hosts result in highly symmetrical capsules. Methods of generating lower-symmetry pores are thus required to maximise the binding affinity in host–guest complexes. Herein, we use mixtures of tetraaldehyde building blocks with cyclohexanediamine to access low-symmetry imine cages. Whether a low-energy cage is isolated can be correctly predicted from the thermodynamic preference observed in computational models. The stability of the observed structures depends on the geometrical match of the aldehyde building blocks. One bent aldehyde stands out as unable to assemble into high-symmetry cages, and the same aldehyde generates low-symmetry socially self-sorted cages when combined with a linear aldehyde. We exploit this finding to synthesise a family of low-symmetry cages containing heteroatoms, illustrating that pores of varying geometries and surface chemistries may be reliably accessed through computational prediction and self-sorting.


Introduction
Controlled host-guest recognition is of crucial importance to biological processes and artificial supramolecular systems alike. [1,2] Cage-like compounds have been developed to exploit such host-guest interactions to achieve pollutant remediation, [3] gas storage, [4] anion binding, [5] biomimetic guest recognition, [6] molecular separations, [7] and nanoparticle templation. [8,9] Advantages of organic cage hosts include their improved solubility over framework materials, making them excellent candidates for both liquid- and solid-phase applications. [10][11][12][13][14] Furthermore, cages offer synthetic handles that can be used to finely tune their cavity shape and electronic properties, and hence potential interactions with guest molecules. [7][15][16][17][18][19] However, the synthesis of molecular cages is often challenging, particularly where multiple bonds must be formed selectively. To avoid this problem, many molecular cage syntheses take advantage of dynamic covalent chemistry, in which reversible reactions provide an error correction mechanism to ensure the thermodynamic cage product is obtained. [20][21][22] In the case of imine-based cages, multiple amines and aldehydes must react to yield a single cage species instead of oligomeric mixtures of imines. [23][24][25][26] Ideally, high-fidelity self-sorting biases the formation of the cage product over the many other possible products, enabling selective isolation of the target molecule. [27][28][29]
A limitation of self-sorting strategies is that they often result in the formation of highly symmetrical products. [22,30,31] Reducing the symmetry of the host may induce anisotropy in the solid state, improve the binding of low-symmetry guests, or enable more controlled and directional post-synthetic modification. [32][33][34][35] For example, fine-tuning of the cavity of an organic cage has been shown to afford precise control over the selectivity of the resultant solid-state material. [7] Stepwise syntheses exploiting orthogonal reactivities can afford low-symmetry organic cages, [6,36,37] but this limits the scalability of the resulting materials. Alternatively, low-symmetry architectures may be obtained by purification of complex mixtures, but this is a laborious process and may be unachievable on a preparative scale due to reconfiguration of the desired products. [38][39][40][41] A recent study showed that low-symmetry cages can be formed using a lower-symmetry aldehyde precursor, but the presence of multiple structural isomers precluded the unambiguous characterisation of the cage products. [42] We sought to avoid these problems by designing an alternative single-step route to low-symmetry imine-based cages. We used mixtures of multiple aldehyde precursors with different geometries to investigate their self-sorting behaviour, screening for combinations that led to the selective formation of low-symmetry cages.
We recently reported a series of Tet3Di6 tubular organic cages that were prepared through imine formation between three pseudo-linear tetratopic aldehydes ("Tet") and six ditopic amines ("Di"), and selected the linear tetraaldehydes as a starting point for these studies. [43,44] Reacting a mixture of two tetraaldehydes with a single diamine can produce three distinct sorting outcomes (see Figure 1i): narcissistic self-sorting, in which only cages incorporating a single aldehyde precursor are observed; [45][46][47][48] social self-sorting, in which only cages incorporating both aldehyde precursors are observed; [27,49] and scrambling, in which a mixture of different sorting outcomes is observed. [50] However, it is extremely hard to predict, either intuitively or computationally, which outcome will be observed for a given pair of aldehyde reactants. For simple cases, such as two linear aldehydes, one could expect narcissistic self-sorting to be likely, owing to the mismatch in the aldehyde lengths and the strain in the resultant cages. [48][51][52][53] It is much more difficult to predict the outcome when a linear aldehyde is combined with a bent aldehyde, such as B1 (see Figure 1ii). We hypothesised that the greater conformational degrees of freedom of non-linear aldehydes compared to linear aldehydes would aid social self-sorting or scrambling by accommodating a wider range of options for the cage geometry. [54][55][56]
To test our hypothesis, we studied imine-based cages formed from two bent tetratopic aldehydes B1-B2 and four linear aldehydes L1-L4 of varying length (see Figure 1iii). First, we sought to confirm that all the aldehydes individually form cages with (1R,2R)-trans-1,2-cyclohexanediamine (R,R-CHDA) in the presence of trifluoroacetic acid catalyst. Each bent aldehyde was then allowed to react sequentially with the series of linear aldehyde partners and R,R-CHDA to assess their cage-forming and self-sorting behaviour. All reactions were characterised by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and ¹H NMR spectroscopy. Where cage species could be isolated, crystal structures were sought to confirm their identities and assess their stable conformations. The sizes of isolable low-symmetry cages were further investigated in solution by diffusion-ordered spectroscopy (DOSY NMR) and ion-mobility spectrometry-mass spectrometry (IMS-MS). Aldehyde B1 was found to induce social self-sorting in the studied cages; thus, heteroatom-containing analogues of B1 were synthesised and reacted using the same methods to test whether the self-sorting behaviour was retained.
Figure 1. i) Illustration of narcissistically and socially self-sorted systems as opposed to non-sorted scrambled outcomes for an imine-based organic cage forming reaction using linear (orange) and bent (green) aldehydes in the presence of (1R,2R)-trans-1,2-cyclohexanediamine (blue); ii) structures of the bent (B1, B2) and iii) linear (L1-L4) tetraaldehydes used in this work.
We previously used density functional theory (DFT) formation energies to explain the thermodynamically preferred cage topologies in dynamic imine-based self-assembly processes. [41][57][58][59] Comparing the thermodynamic stabilities of potential cage products can be a good guide to selectivity, but the reaction outcome can also be affected by factors such as reaction kinetics, [39][60][61][62][63][64] solvent effects, [65][66][67][68] and the solubilities of the species involved in the equilibrium. [33,40,47] In parallel with the synthetic efforts, we used computational techniques to predict the stability of the different homo- and heteroleptic structures originating from aldehydes B1-B2 and L1-L4. The experimentally observed outcomes agreed with the relative gas-phase formation energies of the possible Tet3Di6 products, showing the predictive power of the simple model for the self-sorting behaviour of imine-based organic cages.

Results and Discussion
Single aldehyde systems
Aldehydes B1-B2 and L1-L4 were synthesised via Pd-catalysed cross-coupling reactions (see Supporting Information Section S3 for the synthetic details). The reactions of L2 and L3 with R,R-CHDA have been reported to give tubular covalent cages [3L2] and [3L3], respectively (see Figure 2 for the single-crystal structures; details and CCDC numbers are in the Supporting Information). [44] The reaction of L1 with R,R-CHDA afforded a complex mixture of imine condensation products that could be purified by recrystallisation to yield [3L1]. Reactions of L4 and B2 with R,R-CHDA both resulted in single cage products, [3L4] and [3B2], respectively.
The single-crystal structures of [3L1] and [3B2] could be elucidated. Unlike for the other aldehydes, reaction products of B1 with either R,R- or S,S-CHDA could not be identified and attempts to grow single crystals from such reaction mixtures were unsuccessful. However, co-crystallisation of the opposite-handed reaction products led to re-equilibration of the building blocks and provided a pseudo-C3h-symmetric [3B1-RS] cage. Investigation of the crystal structure of [3B1-RS] revealed incorporation of equal amounts of each enantiomer of CHDA into the cage structure, which is reminiscent of the CHDA self-sorting observed in a previously reported organic cage CC3-RS. [69] This result prompted us to computationally explore the thermodynamic preference for the formation of pseudo-C3h cages incorporating both CHDA enantiomers against the corresponding enantiopure Tet3Di6 cages (see Section S2 for the computational details). Indeed, the DFT formation energy for [3B1-RS] was 20 kJ mol⁻¹ lower than the formation energy for the enantiopure [3B1] at the M06-2X/6-311G(3df,3pd) level of theory. B1 was the only studied aldehyde for which any such preference was observed. As the reaction of B1 with R,R-CHDA only formed the enantiopure [3B1] cage in trace amounts, we postulated that more stable cages may result from the addition of linear tetraaldehydes, enabling access to socially self-sorted lower-symmetry architectures. By contrast, cages including linear aldehydes and B2 would compete with the favourable formation of [3B2], leading to narcissistic self-sorting or statistical mixtures of all possibilities.

Mixed aldehyde systems
To explore self-sorting in the system, we combined aldehydes from the L and B families in single-pot reactions with R,R-CHDA under cage-forming conditions. We expected the Tet3Di6 topology to be favoured in all cases based on our previous work. [44] This assumption reduced the space of the possible structures to a number that could be systematically explored by computational methods. As imine formation is reversible under the reaction conditions used here, the observed product distributions are expected to relate to the thermodynamic minima. Therefore, formation energies can be predictive of the range of products seen. If all the possible cages have similar formation energies, we predict that multiple cage products will be formed or that the self-sorted products will be selected by solvation and entropic effects. Conversely, if one or more cages are much lower in energy than the other possibilities, we predict that those structures will dominate the product distribution.
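The qualitative rule in the paragraph above can be illustrated with a Boltzmann weighting of relative energies. This is only a minimal sketch under strong assumptions: it treats the relative formation energies as if they were free energies of directly competing species, ignores solvation, entropy and mass balance between different stoichiometries, and uses hypothetical energies rather than values from Table 1 or Table 2. The function name `boltzmann_populations` is introduced here for illustration.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_populations(formation_energies, temperature=298.15):
    """Return equilibrium fractions for species with the given energies (kJ/mol)."""
    lowest = min(formation_energies.values())
    # Weight each species by exp(-dE/RT), with dE measured from the lowest energy.
    weights = {
        name: math.exp(-(energy - lowest) / (R_KJ_PER_MOL_K * temperature))
        for name, energy in formation_energies.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical relative energies in kJ/mol (assumptions, not data from the paper).
similar = boltzmann_populations({"cage_A": 0.0, "cage_B": 2.0})
separated = boltzmann_populations({"cage_A": 0.0, "cage_B": 20.0})
print(similar)    # both species carry a substantial share
print(separated)  # the lower-energy species dominates
```

In this idealised picture a gap of a few kJ/mol leaves both species populated, whereas a gap of tens of kJ/mol leaves essentially one product, which mirrors the distinction drawn in the text between mixtures and dominant cages.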
For each reaction studied, we manually constructed the Tet3Di6 cages in all possible stoichiometries of the linear and bent aldehydes. We then applied a workflow consisting of high-temperature molecular dynamics simulations with the OPLS3e force field, [70] followed by further geometry optimisations at the PBE-GD3/TZVP-MOLOPT-GTH level of theory [71][72][73][74][75][76][77][78] to find the expected gas-phase conformations of the resulting cages (see Section S3 for more details). Single point energies were calculated for the modelled structures at the M06-2X/6-311G(3df,3dp) level of theory and the resulting formation energies are summarised in Table 1 (for B1) and Table 2 (for B2). These calculations are performed on isolated molecules in the gas phase and do not consider solvent effects; hence, large energetic differences are needed to predict solution-phase structures with confidence.
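The set of candidate compositions for one linear and one bent aldehyde is small enough to list by hand, but it can also be sketched programmatically. The snippet below is an illustrative assumption rather than the authors' workflow: it enumerates only compositions (how many of each aldehyde occupy the three tetraaldehyde positions), not conformers or 3D models, and the helper names `cage_label` and `tet3di6_stoichiometries` are hypothetical.

```python
def cage_label(n_linear, linear, n_bent, bent):
    """Format a composition in the bracket notation used in the text, e.g. [L1 + 2B1]."""
    parts = []
    for count, name in ((n_linear, linear), (n_bent, bent)):
        if count == 0:
            continue
        # A count of one is written without a numeric prefix.
        parts.append(name if count == 1 else f"{count}{name}")
    return "[" + " + ".join(parts) + "]"


def tet3di6_stoichiometries(linear, bent, n_tet=3):
    """List every way to fill the tetraaldehyde positions with two aldehydes."""
    return [cage_label(n_tet - k, linear, k, bent) for k in range(n_tet + 1)]


candidates = tet3di6_stoichiometries("L1", "B1")
print(candidates)
```

Each label would then correspond to one model to build, sample and optimise; the six diamines are omitted from the label, as in the text.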
In parallel, we attempted to address the problem synthetically by targeting the cage stoichiometry of [L + 2B] (+ 6 R,R-CHDA omitted for clarity). A deuterated variant of aldehyde L3 was available from a previous study and was used in reactions with B1 to allow discrimination of the resultant cages by mass in the UPLC-MS. [79] No such deuterated analogue was available for the mixture of L4 and B2, and the reaction was characterised primarily by ¹H NMR chemical shifts and UPLC retention times. Table 1 and Table 2 summarise which structures were experimentally observed.
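Why an isotopic label helps can be seen from simple mass bookkeeping. A Tet3Di6 cage forms from three tetraaldehydes and six diamines through twelve imine condensations, each releasing one water molecule. The sketch below assumes neutral monoisotopic masses and uses hypothetical aldehyde masses and a hypothetical label shift; only the CHDA and water masses are real values. It shows that two aldehydes of equal mass give cages that cannot be told apart by mass, while a mass label on one aldehyde separates every composition.

```python
import math

CHDA_MASS = 114.1157   # monoisotopic mass of C6H14N2 in Da
WATER_MASS = 18.0106   # monoisotopic mass of H2O in Da


def cage_mass(n_linear, linear_mass, n_bent, bent_mass):
    """Neutral monoisotopic mass of a Tet3Di6 cage.

    Three tetraaldehydes and six diamines condense through twelve imine
    bonds, releasing one water molecule per bond.
    """
    if n_linear + n_bent != 3:
        raise ValueError("a Tet3Di6 cage contains exactly three tetraaldehydes")
    return (n_linear * linear_mass + n_bent * bent_mass
            + 6 * CHDA_MASS - 12 * WATER_MASS)


# Hypothetical values (assumptions for illustration, not data from the paper).
ALDEHYDE_MASS = 500.0  # shared mass of two isomeric aldehydes, Da
LABEL_SHIFT = 4.0      # mass increase of a deuterated linear aldehyde, Da

isomeric = [cage_mass(3 - k, ALDEHYDE_MASS, k, ALDEHYDE_MASS) for k in range(4)]
labelled = [cage_mass(3 - k, ALDEHYDE_MASS + LABEL_SHIFT, k, ALDEHYDE_MASS)
            for k in range(4)]
print(isomeric)  # four identical masses
print(labelled)  # masses step down by LABEL_SHIFT per bent aldehyde incorporated
```

Observed m/z values would additionally depend on the charge state and adduct, which this sketch does not model.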
Reactions of B1 with aldehydes L1-L4 and R,R-CHDA all resulted in cage compounds corresponding to the entries marked with asterisks in Table 1 (see Section S3.4 for screening details and raw spectra). For the shortest linear aldehyde L1, the major cage product was a pseudo-D3 low-symmetry [L1 + 2B1] cage, which was readily isolated via recrystallisation. The structure of this compound was elucidated by single-crystal X-ray diffraction (see Figure 4) and was the lowest energy structure predicted for that system. When the elongated L2 was used instead, the two major products observed by UPLC-MS were a low-symmetry [2L2 + B1] cage and the previously described homoleptic [3L2], which again were the two lowest-energy predicted structures. For the even longer aldehyde L3-d, a complex mixture was observed by UPLC-MS (see Figures S16-S21 and Tables S3, S4). The major product was identified as the [3L3-d] cage, in agreement with the computational models. Lower-mass peaks corresponding to [2L3-d + B1] and [L3-d + 2B1] could also be detected, both structures being of comparable DFT formation energies. The relative proportions of the products could not be determined due to insufficient chromatographic separations. For the longest aldehyde trialled, L4, the major product was the symmetrical tubular [3L4] cage, which was of significantly lower DFT energy than any competing structure in this system. In all cases, other species could be detected by mass spectrometry as trace products, including the chiral cage [3B1], but could not be isolated. The distribution of the products is affected by the length of the linear aldehydes L1-L4 in a seemingly unpredictable way, but the observed structures agree with predicted trends in the DFT formation energies. The length of L1 appears to be suitable to relieve strain in a cage containing two B1 moieties; L2 is of suitable length to relieve strain in a cage containing one B1 moiety, but L3 and L4 seem to be too long and do not form stable mixed Tet3Di6 cages with B1.
Table 1: Side- and top-views of the DFT-optimised structures (PBE-GD3/TZVP-MOLOPT-GTH) of the possible Tet3Di6 outcomes for the reactions using mixtures of B1 and L1-L4 under cage-forming conditions. Underneath are the single point formation energies (M06-2X/6-311G(3df,3dp)) in kJ mol⁻¹. Entries marked with asterisks are the experimentally observed outcomes. Building blocks are coloured according to Figure 1, nitrogen atoms are dark blue, hydrogen atoms are omitted.
Reactions involving aldehyde B2 and aldehydes L1-L4 also all resulted in the formation of cage compounds (see Table 2 and Section S3.4). For aldehydes L2 and L3, the outcomes of the reactions were scrambled and all possible Tet3Di6 cages were observed, which were all of comparable DFT formation energies. For L1 and L4, we observed narcissistic self-sorting (see Table S5). However, as all products in this system have the same mass, it was not possible to unambiguously characterise the self-sorting behaviour with mass spectrometry. While the formation energies of [3L1] and [L1 + 2B2] are comparable, and the formation energy of [3L4] is higher than that of the socially sorted cages [nL4 + mB2], we propose that the clean narcissistic self-sorting in those cases is a result of antagonistic coupling between the homoleptic and the heteroleptic cages in these libraries.
The reaction of L1 and B1 with R,R-CHDA stands out as the only combination which produces a low-symmetry heteroleptic organic cage as the only product detected by NMR.

Heteroatom-containing [L1 + 2B1X] systems
To investigate whether social self-sorting would also be observed with analogues of B1, we synthesised thiophene (B1S) and pyridyl (B1N) derivatives that have structurally related geometries (see Figure 3 for the chemical structures and Section S3.2 for the synthetic details). Due to the incorporation of heteroatoms and differently sized rings in their cores, these aldehydes were expected to produce cages with different pore geometries and electronic properties. [48,81,82]
Table 2: Side- and top-views of the DFT-optimised structures (PBE-GD3/TZVP-MOLOPT-GTH) of the possible Tet3Di6 outcomes for the reactions using mixtures of B2 and L1-L4 under cage-forming conditions. Underneath are the single point formation energies (M06-2X/6-311G(3df,3dp)) in kJ mol⁻¹. Entries marked with asterisks are the experimentally observed outcomes. Building blocks are coloured according to Figure 1, nitrogen atoms are dark blue, hydrogen atoms are omitted.
In both cases, the reactions of B1S and B1N with L1 and R,R-CHDA gave analogues of [L1 + 2B1] as the major product, accompanied by significant formation of the corresponding homoleptic [3B1X] cages (see Section S3.5 for the synthetic details). Computational modelling predicts comparable formation energies for [L1 + 2B1X] and [3B1X] in both cases, with a slight preference for the heteroleptic structures (see Section S2 for the computational details). Inspection of the computational models suggests that [L1 + 2B1] and its analogues exhibit similar shapes and sizes. It was possible to obtain a single-crystal X-ray structure of [L1 + 2B1S] (see Figure 4), and comparison of this structure to that of [L1 + 2B1] supports this hypothesis. Unfortunately, however, crystallisation experiments of [L1 + 2B1N] were unsuccessful. Thus, we investigated the size of the cages in solution by DOSY NMR experiments, which demonstrated that the hydrodynamic radii are similar for all three structures (see Section S4). Further evidence was obtained from IMS-MS experiments, which indicated that the drift times for the three cages are similar (see Section S5), supporting the conclusion that the subtle differences in the linker structures have little effect on the overall molecular size in these systems. [83,84]
We analysed the shapes and electronic structures of the internal cage cavities to probe the effect of using different aldehydes. Cage geometries were optimised as described previously. These structures were used to calculate the total electron density and the electrostatic potential at the M06-2X/6-311+G(d,p) level of theory. We found the 0.0004 a.u. density isosurface and selected a subsurface approximating the internal cavity of each cage (see Section S2.5 for the algorithm and the implementation). Figure 5 shows the mapping of the electrostatic potential onto the cavity surface. The potential around the main window between the two non-linear aldehydes is most affected by the neighbouring heteroatoms, while the entire cavity surface becomes narrower and elongated in the case of the more expanded thiophene linker in [L1 + 2B1S]. The heteroatoms themselves have little effect on the shape of the void, but do affect its electronic properties, providing a subtle yet important distinction that may have consequences for guest binding and selectivity.
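For readers unfamiliar with how DOSY data relate to molecular size, a hydrodynamic radius is commonly estimated from a measured diffusion coefficient with the Stokes-Einstein relation, r = k_B T / (6 π η D). The text does not state which model was used for the cages, so the snippet below is only a sketch under that assumption, treating the molecule as a sphere in a continuum solvent. The diffusion coefficient and viscosity are hypothetical placeholders, not the values reported in Section S4.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def hydrodynamic_radius(diffusion_coefficient, viscosity, temperature=298.15):
    """Stokes-Einstein radius in metres.

    diffusion_coefficient is in m^2/s, viscosity in Pa s, temperature in K.
    """
    return BOLTZMANN * temperature / (6 * math.pi * viscosity * diffusion_coefficient)


# Hypothetical inputs (assumptions for illustration only).
radius = hydrodynamic_radius(diffusion_coefficient=5.0e-10, viscosity=5.4e-4)
print(f"{radius * 1e9:.2f} nm")
```

Because the radius scales inversely with the diffusion coefficient, cages with near-identical diffusion coefficients in the same solvent and at the same temperature are assigned near-identical radii, which is the kind of comparison described above.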

Conclusion
Four new tetraaldehydes, two linear (L1, L4) and two non-linear (B1, B2), were synthesised and treated with R,R-CHDA to form three new Tet3Di6 organic cages: [3L1], [3L4], and [3B2]. Aldehyde B1 did not form the expected cage [3B1] when treated with R,R-CHDA or S,S-CHDA. However, upon co-crystallisation of reaction mixtures containing B1 and both enantiomers of CHDA, the formation of the pseudo-C3h-symmetric cage [3B1-RS] was detected. We exploited the lack of a stable homochiral cage [3B1] to form low-symmetry cage compounds containing mixtures of B1 and linear aldehydes L1-L4 of varying lengths. Computed formation energies of the possible cages rationalised the experimentally observed products. In particular, the heteroleptic cage [L1 + 2B1] was predicted to be more stable than the corresponding homoleptic cages [3L1] and [3B1], and was indeed preferentially formed. Two heteroatom-containing analogues of [L1 + 2B1] were formed using this strategy, demonstrating the generality of the social self-sorting approach to the synthesis of organic cages of low symmetry. The slight change in the aldehyde geometry and the incorporation of heteroatoms did not affect the overall size of the cage molecules, while allowing for tuning of the shape and electronic properties of the internal cavity. We hope that these results will aid the design of more anisotropic organic cages for challenging separations and the selective encapsulation of biologically relevant low-symmetry guests.

Conflict of interest
The authors declare no conflict of interest.