Isoxazole Nucleosides as Building Blocks for a Plausible Proto‐RNA

Abstract The question of how RNA, as the principal carrier of genetic information, evolved is fundamentally important for our understanding of the origin of life. The RNA molecule is far too complex to have formed in one evolutionary step, suggesting that ancestral proto‐RNAs (the first ancestors of RNA) may have existed and evolved over time into the RNA of today. Here we show that isoxazole nucleosides, which are quickly formed from hydroxylamine, cyanoacetylene, urea and ribose, are plausible precursors for RNA. The isoxazole nucleoside can rearrange within an RNA strand to give cytidine, which leads to an increase in pairing stability. If the proto‐RNA contains a canonical seed‐nucleoside with defined stereochemistry, the seed‐nucleoside can control the configuration of the anomeric center that forms during the in‐RNA transformation. The results demonstrate that RNA could have emerged from evolutionarily primitive precursor isoxazole ribosides after strand formation.


General Experimental Methods
Chemicals were purchased from Sigma-Aldrich, TCI, Fluka, ABCR, Carbosynth or Acros Organics and used without further purification. The solvents were of reagent grade or purified by distillation. Reactions and chromatography fractions were monitored by qualitative thin-layer chromatography (TLC) on silica gel F254 TLC plates from Merck KGaA. Flash column chromatography was performed on silica gel 60 (40-63 μm) from Macherey-Nagel. Reactions were conducted under a positive pressure of dry nitrogen in oven-dried glassware at ambient temperature, unless otherwise specified. NMR spectra were recorded on Bruker AVIIIHD 400 (400 MHz) or Bruker Avance III (800 MHz) spectrometers. ¹H NMR shifts were calibrated to the residual solvent resonances: DMSO-d6 (2.50 ppm), CD3OD (4.87 ppm), Acetone-d6 (2.05 ppm), CDCl3 (7.26 ppm). ¹³C NMR shifts were calibrated to the solvent resonances: DMSO-d6 (39.52 ppm), CD3OD (49.00 ppm), CDCl3 (77.16 ppm), Acetone-d6 (29.84 ppm). All NMR spectra were analyzed using the program MestReNova 10.0.1 from Mestrelab Research S. L. Normal resolved mass spectra were measured on a LTQ FT-ICR by Thermo Finnigan GmbH. High resolution mass spectra were measured by the analytical section of the Department of Chemistry of the Ludwig-Maximilians-Universität München on the following spectrometers (ionization mode in brackets): MAT 95 (EI) and MAT 90 (ESI) from Thermo Finnigan GmbH, unless otherwise specified. IR spectra were recorded on a PerkinElmer Spectrum BX II FT-IR system.

Synthesis and purification of oligonucleotides
Phosphoramidites of canonical ribonucleosides (Bz-A-CE, Dmf-G-CE, Ac-C-CE and U-CE) were purchased from LinkTech and Sigma-Aldrich. Oligonucleotides were synthesized on a 1 μmol scale on RNA SynBase™ CPG 1000/110 solid supports using an automated synthesizer (Applied Biosystems 394 DNA/RNA Synthesizer) with standard phosphoramidite chemistry. Oligonucleotides were synthesized in DMT-OFF mode using DCA in CH2Cl2 as deblocking agent, Activator 42® in MeCN as activator, Ac2O in pyridine/THF as capping reagent and I2 in pyridine/H2O as oxidizer. Cleavage and deprotection of the CPG-bound oligonucleotides were performed with a 1:1 mixture (0.6 mL) of aqueous 30% NH4OH and 40% MeNH2. The suspension was heated at 65 °C for 5 min (SynBase™ CPG 1000/110). Subsequently, the supernatant was collected, and the beads were washed with water (2×0.5 mL). The combined aqueous solutions were concentrated under reduced pressure using a SpeedVac concentrator. The crude product was then dissolved in DMSO (100 μL) and triethylamine trihydrofluoride (125 μL) was added. The solution was heated at 65 °C for 1.5 h. Finally, the oligonucleotides were precipitated by adding 3 M NaOAc in water (25 μL) and n-BuOH (1 mL). The mixture was kept at -80 °C for 2 h and centrifuged at 4 °C for 1 h. The supernatant was removed, and the white precipitate was lyophilized. The oligonucleotides were further purified by semi-preparative reverse-phase HPLC using a 1260 Infinity II Manual Preparative LC System from Agilent (G7114A detector) equipped with a VP 250/10 Nucleodur 100-5 C18ec column from Macherey-Nagel. A flow rate of 5 mL/min with varying gradients between 0-15% and 0-40% of buffer B in 45 min was applied for the purifications. The following buffer system was used: buffer A: 100 mM NEt3/HOAc (pH 7.0) in H2O and buffer B: 100 mM NEt3/HOAc in 80% (v/v) acetonitrile.
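The eluent composition at any point of such a gradient can be estimated with a short calculation. The following Python sketch is only an illustration: it assumes that the gradient is linear over the 45 min window (the gradient shape is not stated above) and that buffer A contains no acetonitrile; the function names are hypothetical.

```python
def percent_b(t_min, start_b=0.0, end_b=40.0, duration_min=45.0):
    """Percentage of buffer B at time t_min, assuming a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient window")
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(pct_b, mecn_in_b=80.0):
    """Acetonitrile content (% v/v) of the mixed eluent.

    Buffer B is 80% (v/v) acetonitrile; buffer A is assumed to contain none.
    """
    return pct_b * mecn_in_b / 100.0


if __name__ == "__main__":
    # 0-40% B in 45 min, sampled at the start, midpoint and end.
    for t in (0.0, 22.5, 45.0):
        b = percent_b(t)
        print(f"t = {t:4.1f} min: {b:5.2f}% B, {percent_acetonitrile(b):5.2f}% MeCN")
```

For the 0-15% gradient, the same functions can be called with end_b=15.0.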
The purified oligonucleotides were analyzed by analytical RP-HPLC on a 1260 Infinity II LC System from Agilent (G7165A detector) equipped with an EC 250/4 Nucleodur 100-3 C18ec column from Macherey-Nagel using a flow rate of 1 mL/min and a gradient of 0-15% or 0-20% of buffer B in 45 min. Finally, the purified oligonucleotides were desalted using a C18 RP-cartridge from Waters. The absorbance of the synthesized oligonucleotides in H2O solution was measured using an IMPLEN NanoPhotometer® N60/N50 at 260 nm. The extinction coefficients of the oligonucleotides were calculated using the OligoAnalyzer Version 3.0 from Integrated DNA Technologies. For strands containing mainly isoxazole, extinction coefficients were calculated by the base composition method at 223 nm using estimated extinction coefficients (e.g., 10726 M⁻¹ cm⁻¹ for IO3 and 7616 M⁻¹ cm⁻¹ for cytidine). The structural integrity of the synthesized oligonucleotides was analyzed by MALDI-TOF mass measurement. For this purpose, the synthesized oligonucleotides (2-3 μL) were desalted on a 0.025 μm VSWP filter (Millipore), co-crystallized in a 3-hydroxypicolinic acid matrix (HPA, 1 μL) and measured on a Bruker Autoflex II. UV spectra, melting profiles and the concentrations of purified oligonucleotides were measured on a JASCO V-650 spectrometer.
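The base composition estimate at 223 nm can be illustrated with the following minimal Python sketch. It assumes that the strand extinction coefficient is the plain sum of the per-nucleoside values (no nearest-neighbor or hypochromicity correction) and that the absorbance refers to a 1 cm path length. Only the IO3 and cytidine coefficients are taken from the text; the strand composition and the absorbance in the example are hypothetical, and the function names are illustrative.

```python
# Per-nucleoside extinction coefficients at 223 nm (M^-1 cm^-1).
# Only the IO3 and cytidine values are quoted in the text; values for
# other nucleosides would have to be supplied by the user.
EPSILON_223 = {"IO3": 10726.0, "C": 7616.0}


def strand_epsilon(composition, table=EPSILON_223):
    """Sum the per-nucleoside coefficients (base composition method)."""
    missing = [n for n in composition if n not in table]
    if missing:
        raise KeyError(f"no extinction coefficient for: {missing}")
    return sum(count * table[n] for n, count in composition.items())


def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    return absorbance / (epsilon * path_cm)


if __name__ == "__main__":
    composition = {"IO3": 5, "C": 1}  # hypothetical strand
    eps = strand_epsilon(composition)
    conc = concentration_molar(0.50, eps)  # hypothetical absorbance
    print(f"epsilon(223 nm) = {eps:.0f} M^-1 cm^-1")
    print(f"concentration   = {conc * 1e6:.2f} uM")
```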

In-strand cytidine formation reactions
Stock solutions of boric acid (pH 9.7, 100 mM) and Na2CO3 (1000 mM) were prepared in water. The oligonucleotides S1-S7 (2-10 nmol) were each mixed with buffer, Na2CO3, and water. The Fe²⁺ source (FeS or FeS2) and DTT were added to the mixture. The final concentrations of the components were: 100 μM oligonucleotide, 50 mM buffer, 100 mM Na2CO3, 100 mM Fe²⁺, 300 mM DTT. The mixture was heated at 90 °C for 2 h in a TAdvanced Thermocycler by Biometra. After cooling to room temperature, the solids were removed by centrifugation and washed with water (2×0.2 mL); residual solids were removed using a syringe filter (0.20 μm, PTFE membrane). The reaction mixture was concentrated by lyophilization and subsequently analyzed by reverse-phase HPLC. The yields of the reactions were calculated by integration of the chromatographic peaks of the products using calibration curves of the synthetically prepared products.
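The volumes and amounts needed to reach these final concentrations follow from simple dilution arithmetic. The Python sketch below is only an illustration under stated assumptions: the Fe²⁺ source and DTT are treated as additives whose volume is neglected, the Fe²⁺ amount is nominal (FeS and FeS2 are solids), and the function and key names are hypothetical.

```python
def reaction_plan(oligo_nmol,
                  oligo_final_uM=100.0,
                  buffer_stock_mM=100.0, buffer_final_mM=50.0,
                  carbonate_stock_mM=1000.0, carbonate_final_mM=100.0,
                  fe_final_mM=100.0, dtt_final_mM=300.0):
    """Return volumes (uL) and amounts (umol) for one reaction mixture."""
    # nmol / uM gives mL; multiply by 1000 to obtain uL.
    total_uL = oligo_nmol / oligo_final_uM * 1000.0
    # Dilution rule c1 * V1 = c2 * V2 for both stock solutions.
    buffer_uL = total_uL * buffer_final_mM / buffer_stock_mM
    carbonate_uL = total_uL * carbonate_final_mM / carbonate_stock_mM
    return {
        "total_uL": total_uL,
        "buffer_stock_uL": buffer_uL,
        "carbonate_stock_uL": carbonate_uL,
        # Volume left for the oligonucleotide solution plus water.
        "oligo_plus_water_uL": total_uL - buffer_uL - carbonate_uL,
        # mM * uL gives nmol; divide by 1000 to obtain umol.
        "fe_umol": fe_final_mM * total_uL / 1000.0,
        "dtt_umol": dtt_final_mM * total_uL / 1000.0,
    }


if __name__ == "__main__":
    for key, value in reaction_plan(5.0).items():  # 5 nmol of oligonucleotide
        print(f"{key:>20s}: {value:.2f}")
```

With the 100 mM boric acid stock, half of the final volume is buffer stock, so the oligonucleotide has to be supplied in a sufficiently concentrated solution to fit into the remaining volume.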

Digestion and LC-HESI-MS analysis
Reaction buffer (10X) and enzyme mix were purchased as the Nucleoside Digestion Mix kit (M0649S, New England BioLabs Inc.). The purified oligonucleotide (250-500 ng in 46 μL) was incubated with reaction buffer 10X (5 μL) and enzyme mix (1 μL) at 37 °C for 1.5 h.
The mixture was subsequently diluted to 90 μL and analyzed by LC-HESI-MS on a Thermo Finnigan LTQ Orbitrap XL coupled to a Dionex Ultimate 3000 HPLC system. All chromatographic separations except for nucleotides were performed on an Interchim YMC-Triart C18 column with a flow rate of 0.15 mL/min and a constant column temperature of 30 °C. The following buffer system was used: buffer A: 2 mM HCOONH4 in H2O (pH 5.5) and buffer B: 2 mM HCOONH4 in 80% (v/v) acetonitrile (pH 5.5).
The elution was monitored at 223 nm and 260 nm (Dionex Ultimate 3000 Diode Array Detector). The chromatographic eluent was directly injected into the ion source without prior splitting. Ions were scanned in positive polarity mode over a full-scan range of m/z 80-500 with a resolution of 30000. Nucleotides were scanned in negative polarity mode over a full-scan range of m/z 120-1000 with a resolution of 30000. The synthetic standards for the co-injection experiments were either synthesized in our lab (see synthetic procedures, or according to the reported literature [3]) or purchased.
Figure S8. HPLC-MS chromatograms of strands S1-S7 (left) and their respective N-O cleavage and cyclization products S1red-S7red (right).
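When assigning peaks in such full scans, it can help to compute the expected m/z of a nucleoside beforehand and to check that it falls inside the scan window. The following Python sketch assumes singly charged [M+H]+ and [M-H]- ions and uses standard monoisotopic masses; the function names are hypothetical, and cytidine (C9H13N3O5) serves only as an example.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.973762,
}
PROTON = 1.00727647  # mass of a proton (u)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in formula.items())


def ion_mz(formula, polarity):
    """m/z of the singly charged [M+H]+ or [M-H]- ion (assumed ion types)."""
    mass = neutral_mass(formula)
    if polarity == "positive":
        return mass + PROTON
    if polarity == "negative":
        return mass - PROTON
    raise ValueError("polarity must be 'positive' or 'negative'")


def in_scan_range(mz, low, high):
    """True if mz lies within the full-scan window [low, high]."""
    return low <= mz <= high


if __name__ == "__main__":
    cytidine = {"C": 9, "H": 13, "N": 3, "O": 5}
    mz = ion_mz(cytidine, "positive")
    print(f"cytidine [M+H]+: m/z {mz:.4f}")
    print("within m/z 80-500:", in_scan_range(mz, 80.0, 500.0))
```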

Molecular modelling of IO-containing RNA oligonucleotides
For the structure calculation, the parameters for the modified base first needed to be incorporated into the topology and parameter files used by CNSsolve to generate an RNA structure. [5] The bond lengths, bond angles and dihedral angles were determined from an optimized structure of the modified nucleoside. This optimization was performed using a standard Hartree-Fock method with a 6-31G basis set. After the integration of the modified base, the three-dimensional structure was determined by an in-silico annealing procedure. The restraint data used as input for the calculation were derived directly from the ¹H-¹H NOESY spectra. Because nucleic acids contain a limited number of protons, the data obtained from the ¹H-¹H NOESY alone are not sufficient to provide the software with enough input for a de novo structure calculation. For this reason, the dihedral angles of the sugar-phosphate backbone were set according to the literature values for an A-form RNA, which was assumed because the fingerprint region is consistent with this conformation. For all canonical bases (meaning all the bases except for IO3), the dihedral angles, bond lengths and bond angles embedded in the CNSsolve program were used. Besides the ¹H-¹H distances extracted from the NOESY experiments, the distances between carbon, oxygen and nitrogen atoms were obtained by measuring a 3D model of the canonical structure using Chimera 1.14. These distances were added to the distance restraint file to supplement the NOESY data. Approximate ¹H-¹H distances were defined from the NOESY spectra recorded at different mixing times (40-320 ms). The distances could be divided into two categories based on cross-peak intensities: protons close in space are visible at short mixing times, whereas protons further apart only appear at longer mixing times.
The extracted peak heights were converted to distances taking the cross-peak between C7H5 and C7H6 as a reference, because the distance of 2.421 Å between these two protons is fixed and known. The values were generated according to the relationship that the intensity is inversely proportional to the sixth power of the distance, I ∼ 1/r⁶. [6] Overlapping signals were excluded from the analysis and did not contribute to the molecular modelling procedure.
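The conversion of a cross-peak height into a distance with the 1/r⁶ relationship and the C7 H5-H6 reference can be sketched as follows. This minimal Python example assumes that peak heights are compared within the same spectrum (same mixing time) and that no further corrections, for example for spin diffusion, are applied; the function name and the example intensities are hypothetical.

```python
REF_DISTANCE = 2.421  # Å, H5-H6 distance of C7 used as reference in the text


def noe_distance(intensity, ref_intensity, ref_distance=REF_DISTANCE):
    """Distance from a NOESY cross-peak height using I ~ 1/r^6.

    r = r_ref * (I_ref / I) ** (1/6)
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


if __name__ == "__main__":
    ref_height = 1.0  # hypothetical height of the H5-H6 reference cross-peak
    for height in (1.0, 0.25, 0.05):  # hypothetical cross-peak heights
        r = noe_distance(height, ref_height)
        print(f"relative height {height:4.2f} -> {r:.2f} Å")
```

Because of the sixth-root dependence, even a large error in the peak height translates into a comparatively small error in the derived distance.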