Synthesis and Incorporation of k 2 U into RNA

Lysidine (k 2 C) is one of the most heavily modified pyrimidine RNA bases. It is a cytidine nucleoside in which the 2-oxo functionality of the heterocycle is replaced by the ɛ-amino group of the amino acid lysine. As such, lysidine is an amino acid-containing RNA nucleoside that directly combines genotype (C-base) with phenotype (lysine amino acid). This makes the compound particularly important for theories about the origin of life, especially those that address the origin of translation. Here, we report the total synthesis of the U-derivative of lysidine (k 2 U), which should have the same base pairing characteristics as k 2 C if it exists in the isoC-like tautomeric form. To investigate this question, we developed a phosphoramidite building block for k 2 U, which allows its incorporation into RNA strands. Within RNA, k 2 U can base pair with the counter bases U and isoG, confirming that k 2 U prefers an isoC-like tautomeric structure that is also known to dominate for k 2 C. The successful synthesis of a k 2 U phosphoramidite and its use for RNA synthesis now pave the way for the preparation of a k 2 C phosphoramidite and RNA strands containing k 2 C.


Introduction
RNA contains a vast variety of modified nucleosides. [1] Many of the modified bases are just methylated versions of the canonical nucleosides A, C, G and U, but others are highly modified. This is achieved with the help of dedicated biosynthesis machineries. [2] From a prebiotic point of view, in which genotype-phenotype discussions address the question of whether life started with nucleic acids or peptides, the most interesting modified nucleosides are those which are modified with amino acids. [3] These molecules are chemical structures between genotype and phenotype. They directly merge the properties of nucleic acids with those of amino acids. As such, RNA containing these amino acid-modified bases could have been a central element for the origin of life and for the origin of translation, as already discussed by Grosjean and others. [4-7] RNA containing these amino acids could establish catalytic properties alongside its information-encoding functions. Catalytic RNAs in turn are the central elements in all discussions about how life emerged from simple starting materials. [8,9] One of the most interesting amino acid-modified nucleosides is lysidine (k 2 C), [10,11] which is a cytidine base to which a lysine amino acid is attached via its ɛ-amino group at the C(2) position (Figure 1). Agmatidine (agm 2 C) [12,13] is a close relative, which features the guanidinium group found in the amino acid arginine (Figure 1). We also want to mention the amino acid-modified RNA base puromycin, in which the amino acid is attached to the C(3')-OH group. This base is in use for ribosomal studies. [14,15] In our effort to investigate the properties of RNA containing amino acid-modified nucleosides as units that potentially equip such RNA with peptide-like properties, we were interested in studying the lysine-modified uridine base. This compound is a close derivative of lysidine, which we name k 2 U.
k 2 U is in principle the deamination product of k 2 C, a reaction that is likely to have occurred under prebiotic conditions, where the bases were exposed to potentially hot aqueous environments. [16,17] Under these conditions, deamination of k 2 C to k 2 U is an expected process. k 2 U can in principle exist in two tautomeric states. In one it has U-type base pairing properties, while in the second it should behave like an isoC, similar to k 2 C, which also exists in a quinoid-like tautomeric structure (Figure 1). [10,18-23] In this latter scenario, k 2 U and k 2 C are expected to have similar base pairing properties. A major difference could be that while k 2 C is protonated, k 2 U is likely not. k 2 C protonation is thought to occur at N(6) and hence does not change the base pairing properties. [10,24]

Results and Discussion
To investigate how k 2 U affects the structure and properties of RNA, we set out to synthesize k 2 U as its phosphoramidite building block (k 2 U-PA) and investigated procedures that allow its incorporation into RNA. The synthesis of the k 2 U-PA is depicted in Scheme 1. The synthesis started with carboxybenzyl (Cbz)-protected lysine 1, which we converted with 1-({[2-(trimethylsilyl)ethoxy]carbonyl}oxy)pyrrolidine-2,5-dione (Teoc-OSu) into the Cbz- and Teoc-protected lysine compound 2. Subsequent protection of the carboxy group with 2-(trimethylsilyl)ethanol furnished the fully protected lysine amino acid 3. After Cbz-deprotection, we obtained the amino acid coupling partner 4. On the nucleoside side, we treated uridine 5 under Mitsunobu conditions with DIAD and PPh 3 to obtain the literature-known 2',3'-protected cyclouridine compound 6. [25] Coupling of this intermediate with the lysine building block 4 in the presence of LiCl and DBU furnished the protected lysine-coupled uridine derivative 7. We next protected the primary 5'-OH group with DMTCl and the secondary 2'-OH group with TBSCl to give the intermediate 9. [26-28] Compound 9 was finally converted into the phosphoramidite building block using a standard procedure. [29] Purification of the k 2 U-PA was difficult due to its high polarity. We needed to use a rather polar mixture of dichloromethane/acetone (8 : 3) as the mobile phase for the chromatographic purification. Nevertheless, this provided the target compound k 2 U-PA in a total yield of 12 % over just eight steps, in good but not excellent purity.
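To put the overall yield into perspective, the following minimal sketch converts the reported 12 % over eight steps into the uniform per-step yield that would give the same result. The individual step yields are not taken from the text; the geometric mean is only an illustrative summary, and the variable names are hypothetical.

```python
# Illustrative arithmetic only: the text reports the overall yield (12 %)
# and the number of steps (eight), not the yield of each individual step.
overall_yield = 0.12
n_steps = 8

# Geometric mean: the uniform per-step yield that would reproduce
# the reported overall yield after n_steps consecutive steps.
mean_step_yield = overall_yield ** (1 / n_steps)

print(f"Average yield per step: {mean_step_yield:.1%}")
```

Under this assumption, each step would need to proceed in roughly 77 % yield on average.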
To study the properties of k 2 U in RNA, we next inserted the compound into an RNA strand using solid phase RNA synthesis. [30-33] The oligonucleotide synthesis was performed using standard coupling conditions. This allowed us to achieve a coupling yield of 30 % for k 2 U. Importantly, the incorporation of the subsequent canonical bases, e.g., coupling of uridine to k 2 U, was not affected. We achieved elongation yields that typically reached 95-98 %. Altogether, the method provided enough material for all further studies. Optimization of the k 2 U-incorporation yield was consequently not performed. In addition, we noted that the purity of the obtained strands was high, allowing rapid separation of the target k 2 U-containing oligonucleotide. After the RNA synthesis, we removed the Teoc- and TMSE-protecting groups from the lysine moiety with saturated ZnBr 2 solution in isopropanol/nitromethane (1 : 1) at r.t. overnight. It is interesting that the RNA strand is stable under these quite Lewis acidic conditions. Our observation, however, agrees with an earlier report. [34] Our own attempts to cleave the Teoc- and TMSE-protecting groups with HF·Et 3 N complex provided only partial deprotection, which gave a mixture of products. In our hands, the reported ZnBr 2 method gave superior results. [33] RNA degradation was not observed.
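The effect of one low-yielding coupling on the amount of full-length strand can be estimated with a short calculation. The sketch below assumes that coupling yields simply multiply and uses a hypothetical number of standard couplings (12); the actual strand length is given in Figure 2 and is not reproduced here. The function name is likewise an illustration.

```python
def full_length_fraction(n_standard, standard_yield, modified_yield):
    """Fraction of chains reaching full length when one coupling uses the
    modified building block and n_standard couplings are canonical."""
    return modified_yield * standard_yield ** n_standard

# Reported values: 30 % for the k2U coupling, 95-98 % for canonical couplings.
# Hypothetical value: 12 canonical couplings (the real length is not assumed here).
N_STANDARD = 12
for standard_yield in (0.95, 0.98):
    fraction = full_length_fraction(N_STANDARD, standard_yield, 0.30)
    print(f"canonical yield {standard_yield:.0%}: full-length fraction {fraction:.1%}")
```

In this simple model the single k 2 U coupling caps the full-length fraction at 30 %, which is consistent with the statement that the material obtained was sufficient without further optimization.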
We subsequently cleaved the oligonucleotide from the solid support and removed the protecting groups from the canonical bases with a mixture of methylamine and ammonia (1 : 1) at 65°C for 5 min. Figure 2 shows the sequence of the prepared oligonucleotide together with the raw HPLC chromatogram and the MALDI-TOF spectrum. These data prove the integrity and the high purity of the synthesized k 2 U-containing RNA strand.
Helv. Chim. Acta 2020, 103, e2000016
We finally studied how the k 2 U base affects the stability of the RNA duplex. Figure 3 shows the melting curves of the k 2 U-containing oligonucleotide (k 2 U:A base pair, red) in comparison to the unmodified RNA duplex containing a U : A base pair (black). The table in Figure 3 lists all melting points measured for k 2 U facing any of the four canonical bases, together with all possible combinations of canonical bases as reference strands. While the U : A reference strand melts at 42°C, replacement of U by k 2 U reduces the melting temperature (k 2 U:A) to just 32°C, which is a dramatic destabilization. This is unexpected and cannot be explained with a U-type tautomeric structure, because the lysine residue can in principle point out of the large shallow groove of the RNA duplex in A-conformation. Because the C(2)-O atom, which is replaced by the lysine residue, does not take part in the H-bonding to the A counterbase as depicted in Figure 1, the lysine residue should not affect base pairing so strongly if k 2 U existed in the U-tautomeric form. Further melting point studies, in which we systematically exchanged the counter base, showed that k 2 U is unable to undergo any productive base pairing. The only slight stabilization that we detected is in the k 2 U:U situation (38°C), which is stabilized by 4°C compared to the U : U situation (34°C).
Scheme 1. Synthesis of k 2 U and of its phosphoramidite building block k 2 U-PA.
This rather large global destabilization is best explained if we assume that the k 2 U base indeed exists not in the typical U-tautomer but in the hemiquinoid tautomeric state known from isoC (Figure 4). As such, k 2 U behaves like an isoC tautomer, which prevents k 2 U from forming productive base pairs with any of the other canonical bases, particularly with the purine bases A and G. The slight interaction with U is in this scenario explained by the fact that the U counterbase can engage with isoC-tautomeric k 2 U through two H-bonds. To investigate this isoC-type structure of k 2 U in more detail, we created a duplex in which k 2 U faces isoG as the counterbase. Indeed, in this situation we measured a higher melting point of 37°C (Figure 4), close to the k 2 U:U situation (38°C), in full agreement with the idea that k 2 U is a lysine-modified isoC derivative that has a k 2 C-like tautomeric structure. Further support for the idea that the k 2 U base exists predominantly in the quinoid tautomeric structure comes from NMR data. In our compounds with the k 2 U base, the typical NH 1 H-NMR signal around δ = 9.5 ppm was not observed. Instead, we observed the NH signal at around δ = 7.0 ppm, in line with the quinoid tautomeric structure of k 2 U (Figure 4). This shift to around δ = 7.0 ppm agrees with literature data about such compounds. [35]
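The melting point comparisons discussed above can be summarized as differences relative to a reference duplex. The following sketch uses only the melting temperatures quoted in the text; the dictionary layout and the helper name `delta_tm` are illustrative choices, and the remaining entries of the table in Figure 3 are not reproduced.

```python
# Melting temperatures in °C as quoted in the text (not the full Figure 3 table).
tm = {
    "U:A": 42.0,
    "k2U:A": 32.0,
    "U:U": 34.0,
    "k2U:U": 38.0,
    "k2U:isoG": 37.0,
}

def delta_tm(pair, reference):
    """Change in melting temperature of `pair` relative to `reference` (°C)."""
    return tm[pair] - tm[reference]

print("k2U:A vs U:A:", delta_tm("k2U:A", "U:A"))
print("k2U:U vs U:U:", delta_tm("k2U:U", "U:U"))
print("k2U:isoG vs k2U:A:", delta_tm("k2U:isoG", "k2U:A"))
```

Negative values indicate destabilization relative to the reference pair; the k 2 U:A pair is the only comparison here that falls below its canonical counterpart by 10°C.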

Conclusions
Here, we report the development of a phosphoramidite building block of k 2 U, which is a close deamination-based relative of the non-canonical base k 2 C. We show that the modified base can be incorporated into RNA strands using standard phosphoramidite chemistry in combination with a three-stage deprotection protocol, in which we first cleave the protecting groups at the lysine residue, followed by deprotection of the nucleobases and cleavage from the resin. Melting point studies and NMR data show that k 2 U exists in a tautomeric state that resembles the situation in isoC. As such, k 2 U destabilizes RNA duplexes dramatically, allowing only limited interactions with U and, most importantly, isoG as counterbases.

General Methods
Chemicals were purchased from Sigma-Aldrich, TCI, Fluka, ABCR, Carbosynth or Acros Organics and used without further purification. Strands containing canonical bases and isoG were purchased from Metabion. The solvents were of reagent grade or purified by distillation, unless otherwise specified. Reactions and chromatography fractions were monitored by qualitative thin-layer chromatography (TLC) on silica gel F254 TLC plates from Merck KGaA. Flash chromatography was performed on Geduran® Si60 (40-63 μm) silica gel from Merck KGaA. NMR spectra were recorded on Bruker AVIIIHD 400 spectrometers (400 MHz). 1 H-NMR shifts were calibrated to the residual solvent resonance.
Oligonucleotides were detected at a wavelength of 260 nm. Melting profiles were measured on a JASCO V-650 spectrometer. Concentrations were calculated with the aid of the software OligoAnalyzer 3.0. For strands containing artificial bases, the extinction coefficient of the corresponding canonical-only strand was employed without corrections. Matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) mass spectra were recorded on a Bruker Autoflex II. For MALDI-TOF measurements, the samples were desalted on a 0.025 μm VSWP filter (Millipore) against ddH 2 O and co-crystallized in a 3-hydroxypicolinic acid matrix (HPA).
Cbz- and Teoc-protected lysine (2). The reaction was performed according to the published procedure. [36] To a stirred suspension of Cbz-protected lysine (1) (3 g, 10.7 mmol, 1 equiv.) in a dioxane/water (1 : 1) mixture, TEA (2.23 ml, 16 mmol, 1.5 equiv.) was added. To the resultant solution, Teoc-OSu (3.2 g, 12.3 mmol, 1.1 equiv.) was added and the mixture was left to stir overnight at room temperature. Afterwards, the mixture was diluted with water, acidified with saturated KHSO 4 solution and extracted with diethyl ether. The combined organic layers were washed with water, dried over Na 2 SO 4 and evaporated in vacuo. Yield: 85 %.
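The amounts quoted for the preparation of compound 2 can be cross-checked with a short stoichiometry calculation. This is a sketch under stated assumptions: the molar masses and the density of triethylamine below are not given in the text, and compound 1 is assumed to be the free acid of mono-Cbz-protected lysine (C14H20N2O4). The helper names are hypothetical.

```python
def mmol_from_mass(mass_g, molar_mass):
    """Amount of substance in mmol from a mass in g and a molar mass in g/mol."""
    return mass_g / molar_mass * 1000.0

def mmol_from_volume(volume_ml, density_g_per_ml, molar_mass):
    """Amount of substance in mmol for a neat liquid measured by volume."""
    return mmol_from_mass(volume_ml * density_g_per_ml, molar_mass)

# Assumed constants (not stated in the text):
MW_CBZ_LYS = 280.32    # g/mol, C14H20N2O4, assumed identity of compound 1
MW_TEOC_OSU = 259.33   # g/mol, C10H17NO5Si
MW_TEA = 101.19        # g/mol, triethylamine
DENSITY_TEA = 0.726    # g/ml, triethylamine

lysine = mmol_from_mass(3.0, MW_CBZ_LYS)
tea = mmol_from_volume(2.23, DENSITY_TEA, MW_TEA)
teoc_osu = mmol_from_mass(3.2, MW_TEOC_OSU)

for name, amount in (("Cbz-lysine 1", lysine), ("TEA", tea), ("Teoc-OSu", teoc_osu)):
    print(f"{name}: {amount:.1f} mmol, {amount / lysine:.2f} equiv.")
```

With these assumed constants the calculated amounts agree with the quoted 10.7, 16 and 12.3 mmol; the Teoc-OSu ratio comes out slightly above the stated 1.1 equiv., which is within rounding of the quoted masses.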

Synthesis and Purification of Oligonucleotide
The oligonucleotide was synthesized on a 1 μmol scale using an automated synthesizer (Applied Biosystems 394 DNA/RNA Synthesizer) with standard phosphoramidite chemistry. The phosphoramidites of canonical ribonucleotides were purchased from Glen Research and Sigma-Aldrich. The oligonucleotide containing the k 2 U nucleoside was synthesized in DMT-OFF mode using phosphoramidites (Bz-A, Dmf-G, Ac-C, U) with BTT in MeCN as an activator, DCA in CH 2 Cl 2 as a deblocking solution and Ac 2 O in pyridine/THF as a capping reagent. After the synthesis, the solid support was treated with a saturated solution of ZnBr 2 in i PrOH/MeNO 2 (1 : 1, v/v, 1 ml) and left overnight at room temperature. Then the beads were washed with water and 0.1 M EDTA solution. The cleavage and deprotection of the CPG-bound oligonucleotide were performed with aq. NH 4 OH/MeNH 2 solution (1 : 1, v/v, 1 ml) for 5 min at 65°C. The resin was removed by filtration and washed with H 2 O, and the solution was evaporated under reduced pressure. The residue was subsequently heated with a solution of triethylamine trihydrofluoride (125 μl) in DMSO (50 μl) at 65°C for 1.5 h. Upon cooling in an ice bath, AcONa (3 M, 25 μl) and BuOH (1 ml) were added. The resulting suspension was vortexed and cooled in a freezer (−80°C) for 1 h. After centrifugation, the supernatant was removed, and the remaining oligonucleotide pellet was dried under vacuum. The oligonucleotide was analyzed and purified using RP-HPLC. The structural integrity of the RNA was analyzed by MALDI-TOF mass measurement.