Experimental and theoretical rationalization for the base pairing abilities of inosine, guanosine, adenosine, and their corresponding 8‐oxo‐7,8‐dihydropurine, and 8‐bromopurine analogues within A‐form duplexes of RNA

Abstract Inosine is an important RNA modification. Furthermore, RNA oxidation has gained interest due, in part, to its potential role in the development/progression of disease as well as its impact on RNA structure and function. In this report we established the base pairing abilities of the purine nucleobases G, I, and A, as well as their corresponding 8‐oxo‐7,8‐dihydropurine (common products of oxidation at the C8‐position of purines) and 8‐bromopurine (used as probes to explore conformational changes) derivatives, namely 8‐oxoG, 8‐oxoI, 8‐oxoA, 8‐BrG, and 8‐BrI. Dodecamers of RNA were obtained using standard phosphoramidite chemistry via solid‐phase synthesis, and were used as models to establish the impact that each of these nucleobases has on the thermal stability of duplexes when base pairing to canonical and noncanonical nucleobases. Thermal stabilities were obtained from thermal denaturation transition (Tm) measurements, via circular dichroism (CD). The results were then rationalized using models of base pairs between two monomers, via density functional theory (DFT), which allowed us to better understand potential contributions from H‐bonding patterns arising from distinct conformations. Overall, some of the important results indicate that: (a) an anti‐I:syn‐A base pair provides thermal stability, due to the absence of the exocyclic amine; (b) 8‐oxoG base pairs like U, and does not induce destabilization within the duplex when compared to the pyrimidine ring; (c) a U:G wobble‐pair is only stabilized by G; and (d) 8‐oxoA displays an inherited base pairing promiscuity in this sequence context. Gaining a better understanding of how these oxidatively generated lesions potentially base pair with other nucleobases will be useful to predict various biological outcomes, as well as in the design of biomaterials and/or nucleotide derivatives with biological potential.

Therefore, it is important to understand how a distinct H-bonding pattern, arising from the corresponding lesion, may lead to destabilization or conformational changes that potentially impact RNA structure and/or function. To this end, we explored the base pairing capabilities of purines oxidized at the C8-position and compared them to their corresponding canonical analogues, as control experiments, using duplexes of RNA as model structural motifs.
While our initial focus was on exploring the outcome of oligonucleotides containing 8-oxoG, we were also interested in probing the impact of the C2-exocyclic amine in guanine, or 8-oxoguanine, which led us to explore duplexes containing inosine or 8-oxoI. To this point, inosine is an important modification that is observed in many biologically relevant processes, [22] and that has been reported to code as G, A, or U in a context-dependent manner, [23] highlighting the importance of establishing its base pairing abilities. On the other hand, 8-oxoI is not expected to be biologically relevant, given that the oxidation potential of inosine is higher than that of A. [24] However, this chemical modification can be used to learn about potential H-bonding patterns and/or conformational changes around the glycosidic bond, as well as about the role of the C2-exocyclic amine. [25,26] It is known that unmodified nucleosides exist in an equilibrium that favors the anti-conformation and results in the H-bonding patterns shown in Figure 1 (Watson-Crick face).
As depicted in Figure 1A, the lack of an exocyclic amine (for I) reduces the number of H-bonds between the purine derivative and its potential Watson-Crick base pair cytidine (C), which can be expected to result in decreased thermal denaturation transitions. [27] On the other hand, functionalization of the C8-position is known to switch the equilibrium in favor of the syn-isomer and leads to a distinct H-bonding pattern (Figure 1B). [28] This conformational change is a result of steric hindrance between the C8-group/atom and the C5′-H atoms. With this in mind, we decided to use the corresponding 8-bromo functionalized nucleosides to explore H-bonding interactions, where the preferred syn-conformation also exhibits a different H-bonding pattern (Figure 1C). [29] It is worth noting that both the C8-oxo and C8-Br substituted nucleosides are also capable of forming WC base pairs, [30] at the expense of disfavored interactions between this group and the C5′-position, potentially resulting in overall thermal destabilization of a duplex containing the modified nucleotides. Lastly, the same behavior can be expected for the corresponding adenosine nucleosides, where the pattern for A will differ from that expected for 8-oxoA or 8-BrA (Figure 1D). Oligonucleotides containing the 8-BrA derivative could not be obtained in our hands (vide infra).
Overall, establishing the patterns and preferences for base pairing of the modified nucleosides explored herein is of potential biological relevance, and can also be of use in the design of other nucleoside-based structural motifs or biomaterials. In fact, our laboratory is interested in probing the various base pairing abilities of these and other chemically modified nucleosides to generate aptamers of RNA with distinct selectivities.

| General
The syntheses of the phosphoramidites of 8-oxoG, [19] 8-oxoA, [31] and 8-oxoI/8-BrI [32] were previously reported by us, and the same methodology was used to prepare all oligonucleotides via solid-phase synthesis. The synthesis of oligonucleotides of RNA containing 8-BrA was not possible in our hands due to its transformation to the corresponding 8-methylamine derivative (upon AMA-deprotection of the synthesized oligonucleotide, Figure S-4). It is possible that oligonucleotides containing this modified nucleoside can be attained by varying the deprotection conditions; however, we have not yet been successful.
UV-vis spectroscopy of all small molecules was carried out on a Perkin Elmer λ-650 UV/vis spectrometer using quartz cuvettes (1 cm pathlength). All experiments described herein were carried out in triplicate.

| RNA synthesis
All oligonucleotides were obtained via solid-phase synthesis using a 394 ABI DNA/RNA synthesizer. CPG supports and 2′-O-TBDMS phosphoramidites of U, A, C, and G were purchased from Glen Research.

| UV-vis spectroscopy
All oligonucleotides were quantified via UV-vis using a 1 mm pathlength with 1 μL volumes (Thermo Scientific NanoDrop ND-1000 UV-vis spectrometer). Origin 9.1 was used to plot the spectra of monomers and oligonucleotides for comparison. shown in Tables S1-S10.
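The quantification step relies on the Beer-Lambert law, A = ε·c·l. The following is a minimal sketch of that arithmetic, not the instrument's own routine. The function name, the extinction coefficient, and the absorbance value are hypothetical placeholders; real extinction coefficients depend on the sequence. The sketch assumes the absorbance is the raw reading at the stated 1 mm (0.1 cm) pathlength; some instruments report a value normalized to 1 cm, in which case `path_cm` would be 1.0.

```python
def concentration_molar(absorbance: float, epsilon_m_cm: float, path_cm: float = 0.1) -> float:
    """Beer-Lambert law: A = epsilon * c * l, hence c = A / (epsilon * l).

    absorbance   : dimensionless absorbance at the chosen wavelength
    epsilon_m_cm : molar extinction coefficient in M^-1 cm^-1 (sequence dependent)
    path_cm      : optical pathlength in cm (0.1 cm corresponds to 1 mm)
    """
    if epsilon_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and pathlength must be positive")
    return absorbance / (epsilon_m_cm * path_cm)


# Hypothetical placeholder values, for illustration only.
example_conc = concentration_molar(absorbance=1.2, epsilon_m_cm=120_000.0)
print(f"{example_conc * 1e6:.1f} micromolar")
```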

| RESULTS
The sequence of the dodecamers (shown in Table 1; Figure 2) was chosen based on that of a previous report from our group that displays thermal denaturation transitions in the 70 °C range, [19] a value that allowed us to record increases or drops in the corresponding thermal denaturation transition (Tm) values accurately. In addition, thermodynamic parameters of RNA duplexes containing I:C base pairs were recently reported and showed that flanking Gs provided increased stability. [33] The obtained values

All experiments were carried out in triplicate. Values denoted with an asterisk (*) were measured and matched reported values. [19] A (>>) sign was given to values with differences greater than 5 °C, and (≈) to differences < 1 °C.

varies from those measured for G or 8-oxoG, while a 8-BrG:C base pair is the most stable of the family, the formation of a 8-BrG:G or 8-BrG:I base pair displayed a relative stability within this family.
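The marking scheme stated for the tabulated Tm values (a ">>" sign for differences greater than 5 °C and "≈" for differences below 1 °C) can be sketched as a small helper. This is an illustrative sketch only. The function name is hypothetical, and the excerpt does not state how intermediate differences (1 to 5 °C) were marked, so the helper returns `None` for them.

```python
from typing import Optional


def classify_tm_difference(tm_a_celsius: float, tm_b_celsius: float) -> Optional[str]:
    """Return the comparison mark for two Tm values (in degrees Celsius).

    '>>' : absolute difference greater than 5 degrees
    '≈'  : absolute difference smaller than 1 degree
    None : intermediate difference (marking not specified in the text)
    """
    difference = abs(tm_a_celsius - tm_b_celsius)
    if difference > 5.0:
        return ">>"
    if difference < 1.0:
        return "≈"
    return None


# Placeholder Tm values, not measured data.
print(classify_tm_difference(70.0, 63.5))
print(classify_tm_difference(70.0, 69.4))
```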
Next, we explored the thermal stabilities with the model oligonucleotide containing I (10) opposite G/U/A/C/I/8-oxoG (2/3/4/5/6/7, respectively). Interestingly, upon close inspection of the literature we discovered that thermal denaturation transitions in RNA are only available for a handful of scenarios, specifically those involving an I:U [35] or an I:C [33] base pair, where I:U base pairs have been shown to distort the RNA duplex. As shown in Figure 2B, pairing ability of dI has been reported to be C > A > T ≈ G > I in two different sequence contexts. [36,37] Most notable is the change involving a preference for U in RNA and A in DNA as the second most stable base pair, possibly explained by overall structural changes within the duplex (A-form vs B-form), [38] although more examples are necessary to assign this as a general trend. Another important trend can be observed upon comparison with values within the G-family, where the discrepancy between these two nucleosides indicates the impact of the C2-exocyclic amine on the duplex overall. All of the proposed base pairs can be justified with 8-oxoI existing as a syn-isomer (vide infra). In addition, 8-oxoG and U can be seen as two nucleobases with similar H-bonding patterns, a fact that has been observed before in their mode of binding, [39] thus justifying the trends with A or C ≈ A [40] ], an aspect that requires further inspection in other sequence contexts given that the trends between RNA:RNA and RNA:DNA, which form A-form duplexes, can be expected to be similar.

| Theoretical models-H-bonding contributions
The contribution from hydrogen bonding was investigated by applying electronic structure calculations, which were performed using the quantum chemical program packages Gaussian G16 [41] and Q-Chem 5. [42] The H-bonding energy was evaluated as the free energy difference between the dimer and the sum of the two monomers, all with fully optimized structures. Geometry optimizations were carried out employing the hybrid functional B3LYP with Grimme's empirical dispersion correction DFT-D3(BJ), [43,44] and the 6-31+G* basis set. To account for the free energy correction, standard normal mode analysis and frequency calculations were performed at the same level of theory. The solvation free energies were obtained using the polarizable continuum model (PCM) with water as the solvent. In addition, to calculate accurate single point electronic energies, second order Møller-Plesset (MP2) perturbation theory [45] and a larger basis set, 6-311++G**, were used. In the SI we include some other results employing various DFT functionals and basis sets. Using MP2 theory as a gauge for a limited number of compounds, we decided that the level of theory used here is the best compromise between accuracy and computational cost.
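The free energy difference described above (dimer minus the sum of the two monomers) can be illustrated with a short sketch. The way the three contributions are summed into one free energy per species (single point energy + thermal free energy correction + solvation free energy) is an assumption made for illustration; the text does not spell out the exact combination scheme. All class, field, and function names are hypothetical, and the numbers in the usage lines are placeholders, not computed values.

```python
from dataclasses import dataclass

HARTREE_TO_KCAL_MOL = 627.5095  # standard conversion factor


@dataclass(frozen=True)
class Species:
    """Energy components for one optimized structure, all in hartree."""
    e_single_point: float  # single point electronic energy
    g_correction: float    # thermal free energy correction from the frequency job
    dg_solvation: float    # solvation free energy from the continuum model

    @property
    def free_energy(self) -> float:
        # Assumed composite scheme: simple sum of the three terms.
        return self.e_single_point + self.g_correction + self.dg_solvation


def hbond_free_energy(dimer: Species, monomer_a: Species, monomer_b: Species) -> float:
    """Free energy of base pair formation in kcal/mol.

    Negative values indicate that the dimer is more stable than the
    separated monomers.
    """
    delta_hartree = dimer.free_energy - (monomer_a.free_energy + monomer_b.free_energy)
    return delta_hartree * HARTREE_TO_KCAL_MOL


# Placeholder numbers chosen only to exercise the arithmetic.
base_a = Species(e_single_point=-1.0, g_correction=0.10, dg_solvation=-0.010)
base_b = Species(e_single_point=-2.0, g_correction=0.20, dg_solvation=-0.020)
pair = Species(e_single_point=-3.02, g_correction=0.31, dg_solvation=-0.025)
print(f"{hbond_free_energy(pair, base_a, base_b):.2f} kcal/mol")
```

Keeping the three terms separate makes it straightforward to swap in a different single point method or solvation model while holding the rest of the scheme fixed.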
Despite active research on the subject, treating hydrogen bonds accurately with DFT remains a challenging task. [46] Using this methodology, we considered the following in order to establish a plausible/preferred base pair, where lower energies indicate more stable base pairs: (a) structural information (planarity; C1′-C1′ distance; and number of H-bonds) as the major means of estimating stability; (b) the calculated free energies of formation for base pairs, which serve as a partial (sometimes major) reference contribution to the overall base pair stability; and (c) the effect of backbone and π-stacking is neglected in the model. Planarity was determined by measuring dihedral angles among the atoms participating in H-bonding interactions (0° or 180°), where most of the base pairs failing this category displayed distortions that were visibly out-of-planarity. C1′-C1′ distances were measured and all reasonable base pairs fell in the 10-11 Å range, in agreement with a base pair's ability to fit within a regular helix. [47] Base pairs that displayed distances outside of this range were considered less probable. H-bonding interactions were measured and qualified as those closer than 2 Å between the donor and the acceptor, where reasonable base pairs contained two or more such interactions. It is known that interactions between Watson-Crick pairs and other biopolymers require two H-bonds to achieve fidelity, and that recognition from the minor groove side is not affected by base pair reversals. [48] It is important to note that the energetic contri-

We initiated our analyses by building a G:C WC base pair, and explored anti-/syn-conformations, which were then compared to their corresponding purine derivative analogues (Figure 3).
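The three structural criteria listed above (planarity, a C1′-C1′ distance of 10-11 Å, and at least two H-bonds shorter than 2 Å) can be sketched as a simple screening routine. This is a hedged illustration rather than the procedure actually used. The function names are hypothetical, and the planarity tolerance (15° by default) is an assumed parameter, since the text does not give a numerical cutoff for deviation from 0° or 180°.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle (degrees, -180..180) defined by four Cartesian points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def planarity_deviation_deg(dihedral):
    """Smallest deviation of a dihedral angle from 0 or 180 degrees."""
    d = abs(dihedral) % 360.0
    if d > 180.0:
        d = 360.0 - d
    return min(d, 180.0 - d)


def evaluate_base_pair(hbond_dihedrals_deg, c1_c1_distance, hbond_lengths,
                       planarity_tol_deg=15.0):
    """Apply the three structural checks to one modeled base pair.

    hbond_dihedrals_deg : dihedral angles among atoms involved in H-bonding
    c1_c1_distance      : C1'-C1' distance in angstrom
    hbond_lengths       : candidate donor/acceptor contact distances in angstrom
    planarity_tol_deg   : assumed tolerance, not taken from the text
    """
    planar = all(planarity_deviation_deg(d) <= planarity_tol_deg
                 for d in hbond_dihedrals_deg)
    distance_ok = 10.0 <= c1_c1_distance <= 11.0
    n_hbonds = sum(1 for length in hbond_lengths if length < 2.0)
    return {
        "planar": planar,
        "distance_ok": distance_ok,
        "n_hbonds": n_hbonds,
        "plausible": planar and distance_ok and n_hbonds >= 2,
    }


# Placeholder geometry values, for illustration only.
print(evaluate_base_pair([2.0, -178.0], 10.5, [1.8, 1.9, 2.4]))
```

Returning the individual checks alongside the overall verdict mirrors the way the text discusses base pairs that fail one specific category, for example a low-energy pair that is out of plane.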
Gratifyingly, the modeling that was carried out validated our approach as follows: it is reasonable to expect that discrepancies between experimental results (thermal denaturation transitions) and the modeling arise from structural changes imposed by these groups on the overall duplex.
Furthermore, we explored syn-conformations on both purine and pyrimidine rings, and observed that all options failed at least one of the three categories suggesting a stable base pair. Notably, the syn-8-oxoG:anti-C base pair (entry 9) displayed a low energy that would correspond to a stable interaction; however, the base pair is notably out-of-planarity and the C1′-C1′ distance is shorter than the optimal range. Similarly, a syn-I:syn-C base pair (entry 11) displayed planarity but with a higher energy than its anti:anti analogue (entry 2).

To this end, we combined the experimental data with modeling, via DFT. We took into consideration the cis-orientation as the preferred geometry. [49] In addition, we took established C1′-C1′ internucleotidyl distances from previous reports, with distances between 10 and 11 Å as likely base pair geometries/conformers. [50]

WC base pairing (G:C). As expected, comparing the thermal stability of RNA duplexes containing purine rings lacking the C2-exocyclic amine to their analogues containing this functional group led to decreased Tm values in each case (Figure 7A). Furthermore, the impact arising from an additional H-bonding interaction was in the 1 kcal·mol⁻¹ range and is within previously reported experimental values of approximately 2 kcal·mol⁻¹. [33] The difference between the base pairs, able to adopt various geometries in a sequence dependent manner, [51] which warrants probing different sequences to establish this trend as general. Another noteworthy observation regards the duplexes containing a G:8-oxoG base pair, which displayed similar values to those measured on the G:U analogues and highlights the ability of an 8-oxoG lesion to mimic the base pairing of U (Figure 7B). Figure 7C).
Besides the importance of G:A base pairs in various biological contexts, [52] our laboratory recently reported on a case where reverse transcription allowed for the incorporation of dA opposite I, but not opposite G; [32] this case was therefore of particular interest to us. The has been observed within crystals of DNA duplexes, [53] as well as in other reports. [54] Gratifyingly, all experimental and modeling data Figure 7D). This suggests that a combination of an anti- and a syn-conformation gives rise to this interaction. This arrangement has been previously observed in crystalline duplexes of RNA [55] and established in disease models. [56] Since the I:G base pair is the only one not within the range, this suggests that the exocyclic amine plays a role in this base pairing family, where G may enable an easier anti-syn conformational change. As illustrated in Figure 7B, H-bonding in an 8-oxoG:I base pair can be rationalized by having the syn-conformation of 8-oxoG; however, the G:I base pairing requires flipping of I toward its least stable syn-conformational isomer, which renders a base pair that is thermodynamically less stable. Anti-G:syn-8-oxoG base pairs have been reported to have some increased stability in duplexes of DNA, even more stability than an A:8-oxoG base pair. [57] We then carried out calculations on models containing the expected geometries; however, we were surprised to find that an anti-G:syn-G arrangement did not lead to base pairs with a planar geometry (entries 44, 52). Interestingly, the 8-oxopurine:syn-G derivatives were found to display the expected H-bonding interactions. These results suggest that there are factors arising from the presence of the C8-carbonyl that contribute to the formation of a planar structure. Overall, the G:G base pair provides some thermal stability, compared to other base pairs, while not being able to form planarized structures, thus providing a degree of destabilization on the duplex.

| CONCLUSION
Overall, it is important to note that the model does not take into consideration stacking interactions and other conformational changes, thus limiting the amount of information that can be drawn from this data, which is in agreement with other models. [58] However, it does provide an important picture in some cases and also yielded good evi-