The Profile and Dynamics of RNA Modifications in Animals

Abstract More than a hundred distinct modified nucleosides have been identified in RNA, but little is known about their distribution across different organisms, their dynamic nature and their response to cellular and environmental stress. Mass‐spectrometry‐based methods have been at the forefront of identifying and quantifying modified nucleosides. However, they often require synthetic reference standards, which do not exist in the case of many modified nucleosides, and this therefore impedes their analysis. Here we use a metabolic labelling approach to achieve rapid generation of bio‐isotopologues of the complete Caenorhabditis elegans transcriptome and its modifications and use them as reference standards to characterise the RNA modification profile in this multicellular organism through an untargeted liquid‐chromatography tandem high‐resolution mass spectrometry (LC‐HRMS) approach. We furthermore show that several of these RNA modifications have a dynamic response to environmental stress and that, in particular, changes in the tRNA wobble base modification 5‐methoxycarbonylmethyl‐2‐thiouridine (mcm5s2U) lead to codon‐biased gene‐expression changes in starved animals.

More than ah undred distinct modified nucleosides have been identified in RNA, but little is known about their distribution across different organisms, their dynamic naturea nd their response to cellular and environmental stress. Mass-spectrometry-based methods have been at the forefront of identifying and quantifying modifiedn ucleosides.H owever, they often require synthetic reference standards, which do not exist in the case of many modified nucleosides,and this therefore impedes their analysis. Here we use am etabolic labelling approacht o achieve rapid generation of bio-isotopologues of the complete Caenorhabditis elegans transcriptome and its modifications and use them as reference standards to characterise the RNA modification profile in this multicellular organism through an untargeted liquid-chromatography tandem high-resolution mass spectrometry (LC-HRMS) approach. We furthermore show that severalo ft heseR NA modificationsh ave ad ynamic response to environmental stress and that, in particular,c hanges in the tRNA wobble base modification 5-methoxycarbonylmethyl-2thiouridine (mcm 5 s 2 U) lead to codon-biased gene-expression changes in starved animals.
Canonical nucleobases,e speciallyt hose of ribonucleic acids, are naturally subject to diversem odification. The number of identified modifiedr ibonucleosides has grown to over ah undred, and more than half of them can be found in eukaryotic RNA. [1] Althought hey have been described in ribosomal RNAs (rRNAs),m essenger RNAs (mRNAs) and various noncoding RNA species, the highest frequency and greatest chemical diversity of modifiedn ucleosides can be found in transfer RNAs (tRNAs). [2] There, their presence contributes to the correct functioning of the protein synthesis machinery by providing stability,s tructure and adding diversity in molecular recognition. [2,3] In the cases of bacteria andy east, almostacompletep icture of genes and pathways leading to tRNA modifications exists and many homologous proteins have been described in other, higher eukaryotes. [4] The recent development of sequencing-based detection methodsh as facilitated the identification of modifications in mRNAs. Examples include 5-methylcytidine (m 5 C), pseudouridine (y), N 1 -methyladenosine (m 1 A) and N 6 -methyladenosine (m 6 A). [5] The last of thesew as recentlyi dentified as the first reversible RNA modification and provided direct evidenceo ft he existence of dynamic nucleoside modification processes. [6] The idea of RNA modifications as part of ad ynamic process is not completely new or limited to mRNA, and af ew observations have collectively implicated ad ynamic mechanistic role for RNA modifications. They include the presence of tissuedependentl evels of modifications,t he existence of variable methylation of specific ribosomal base residues,l inks between specific tRNA modifications and disease and progressive RNA modification throughout neural cell ageing. [7] Furthermore, RNA modifications have been linked to stress response in multiple organisms. [8,9] It is thus becoming more evident that temporal and spatialc ontrol of RNA modifications might be ubiquitousa nd important forc orrect functioning of the RNA. The majority of studies on the dynamics of modified ribonucleosides characterise single or af ew modificationsa tatime due to technical limitations. Most sequencing-based methods rely either on antibodies that can recognise certain modifications or on reverset ranscription errors that can be interpreted for the presence of modified nucleosides,b oth of which limit the number of RNA modificationst hat can be analysed simultaneously.M ass-spectrometry-based methods require chemically synthesised nucleosides that can be used as reference standards during the measurements to establish the presence and identityo fa ny given modification. This limits the analysist o modifications for which reference standards are readily avail-able. There are af ew studies that describeM S-based methods that overcome the need for syntheticr eferences and enable the characterisation of multiple nucleoside modificationsa t the same time. [10,11] In vivo, RNA modification landscapes of animalsa nd their dynamic nature under differents tress conditions are still unknown. Here we describe am ethod that enables the untargeted and automated characterisation of the most abundantR NA modifications in am ulticellular organism under multiple stress conditions.
In this approach, we use metabolic labelling to generate bio-isotopologues of the Caenorhabditis elegans transcriptome and apply them as reference standards in aH RMS-based methodf or the identification of RNA modifications and their dynamic changes upon heat shock and starvation. Our results show that several RNA modificationse xhibit changes in abundance levels upon stress induction, in ar eversible manner.I n particular, changes in the tRNA wobble base modification 5methoxycarbonylmethyl-2-thiouridine (mcm 5 s 2 U) lead to codon-biased gene-expression changes in starved animals. We based our method on the powerful software toolM etExtract, originally developed for the automated extractiono fm etabolite-derived LC-MS signals from LC-HRMS full-scand ata obtained from isotopically labelled biological samples. [12] We reasoned that this software could be adapted for the untargeted identification of nucleoside modifications in RNA from isotopically labelled whole organismss uch as C. elegans. d-[ 13 C 6 ]-Glucoseh ad previously been used to isotopically label the bacterial transcriptome as it feeds into the pentose phosphate pathway ( Figure S1 Ai nt he Supporting Information). [10] We set out to culture C. elegans in the presence of isotopically labelled bacteria,s erving as food, to transfer the isotopes from the Escherichiac oli transcriptome to the C. elegans transcriptome and to generate bio-isotopologues of every modified ribonucleoside present in the animal.W ef irst grew HT115 E. coli in the presence of d-[ 13 C 6 ]glucoseb ya daption of previously reported methods. [10] In particular,w ea dditionally added the amino acids from Dulbecco's modified Eagle's medium (DMEM) to the culture mediumt oa void 13 Ci ncorporation into amino acids involved in de novo nucleotide synthesis and thus to achieve more uniform 13 C 6 -labelling rather than 13 C n -labelling (Figure S1 B, C).
We then cultured C. elegans in the presence of the labelled bacteria for three generations, after which the nematodes were harvested for total RNA isolation ( Figure 1). The degree of success of isotope incorporation into the C. elegans transcriptomew as then assessedb yL C-MS/HRMSa nalysis of enzymatically digested total RNA samples ands ubsequent assessment of the high-resolution mass spectrum of adenosine (Figure S1 C). Having established the 13 C-labelling of C. elegans RNA, we aimed to determine itsm ost abundant RNA modifications. We thusf ractionated total RNA into small (< 200 nt) and large (> 200 nt) RNAs ( Figure S2)t oi ncrease the sensitivity for the less abundant-primarily tRNA-modificationsi nthe small-RNAf raction.
A2:1 mixtureo fl abelled and unlabelled RNA (for each fraction) was then measured by HPLC-HRMS in the 250-500 m/z range ( Figure 1). Unlabelled RNA was obtained from control nematodes grownu nder standard conditions with unlabelled bacteria as af ood source. The recorded data weret hen analysed by using an ucleoside-adaptedv ersion of the MetExtract software.T he raw LC-HRMS dataw ere mined for all ions coeluting and showingam ass shift correspondingt on ative and 13 C 6 -labelled isotopologues to produce an m/z list containing high-resolution massesl ikely to correspondt ot he different RNA nucleosides present in the mixture and readily detectable. These masses were then manually screeneda gainst masses from known RNA modifications listed in RNA modification databases to obtain potentialstructures. [1] Thus, at this stage we had identified RNA modifications by their accurate masses and those of their corresponding, coeluting, 13 C 6 isotopologues. (Complete structural assignment was later achieved as described in the next section.) For this reasons ome residues weref ound more than once:t he mass of methylcytidine, for instance, was found three times, presumably m 5 C, 3-methylcytidine (m 3 C) and 2'-O-methylcytidine (Cm). Altogether,w ei dentified 21 and 26 modificationsi nt he C. elegans large-and small-RNA fractions, respectively (Table 1). In the small-RNA fractionw eo bserved the m/z value corresponding to 3-(3-amino-3-carboxypropyl)uridine (acp 3 U) and its corresponding 13 C 6 isotopologue, am odificationt hat has not been previously described in eukaryotic RNA, thus demonstrating the potential of our approach for the discoveryofm odified ribonucleosides in ac ompletely untargeted fashion.
Next, we aimed to quantify the RNA modification landscape of C. elegans under physiological stress to improveu nderstanding of the relationship between modified RNAs and stress response pathways. Heat stress and starvation are two stress conditions that lead to large-scale gene-expression changesi n animals. [13] We either subjected C. elegans larvae to heat shock by shiftinga dult animals from 20 to 37 8Cf or 4hor we starved the animals by removing their food source for 4h at 20 8C. Within 4heither of heat stress or of starvation, adult C. elegans animalss howed gene-expression changest hat were specific to each stress condition:u pon heat stress several heat-shockf actors were strongly upregulated ( Figure S4 A), whereas geneexpression changes upon starvation strongly overlapped with previousstarvation data on C. elegans (Figure S4 B).
Having established the stress conditions, we then aimed to quantify the changes in RNA modifications under these conditions of heat shock and starvation.W ef ocusedo nR NA modifications that we had profiled earlier (Table 1) andc oulde asily detect.B yu sing the HPLC-MS/HRMS comparativeq uantitation method, we tried to capture the changes in the RNA modification landscape of C. elegans before and after stress induction. This was achieved by spiking in digested 13 C 6 -labelled RNA at 5%, v/v into unlabelled digested RNA obtained from stressed (heat-shocked or starved) or controla nimalsa nd subjecting the mixture to targeted LC-MS/HRMS analysis.W here possible, standards from commercial and synthetic sourcesw ere used to identify the presence of most residues unequivocally (m 1 A, m 6 A, i 6 A, N 4 -acetylcytidine (ac 4 C), m 5 C, Cm, Gm, N 2 ,N 2 -dimethylguanosine (m 2 2 G), 7-methylguanosine (m 7 G), I, Um, 5-methyluridine (m 5 U), mcm 5 s 2 U);m oreover,w ith the added selectivity of tandemm asss pectrometry (MS/HRMS), other residues for which standards are not available could also be assigned by taking the co-elution of their fragments masses into account (i.e.,t 6 A, ms 2 t 6 A, acp 3 U). We were unable to resolve putative m 1 Ga nd m 2 G. However,b ecause these residues do not change we did not proceed further in structurally identifying them and refer to these residues as mG (base-methylated guanosine). The peak areas of each modified nucleoside and its corresponding 13 C 6 bio-isotopologues, presenti nt he spike-in,w ere then determined in each sample, and their ratio was calculated (A condition /A SIL ,S IL = stable isotopel abelling) to obtain an ormal- Table 1. List of all modifications that could be assigned to known ribonucleoside modificationsf rom al ist of high-resolution masseso btained from the MetExtract algorithm.
Next, we compared the normalised peak areas of stressexposed nematodes with those of control animals for every modification (Figure2A, B). We observedo nly two significant (p < 0.05) RNA modification changes in heat-shocked animals: the N 2 ,N 2 ,7-trimethylguanosine (m 2,2,7 G) residue showed a strong reduction in the large-RNA fraction,a nd mcm 5 s 2 Ul evels showedamodest decrease in the small-RNA fraction. On the other hand, starvation induced numerousR NA modification changes in both the large-and the small-RNAp opulations. Interestingly,i nt he large-RNA fraction in particular, we observed that several base methylations increased upon starvation; these included m 5 C, m 1 A, m 7 G, m 5 U, m 2,2,7 Gand m 2 2 G. Observable RNA modificationc hanges upon starvation in the small-RNA fraction were constrained to two known tRNA wobble base modifications:a c 4 Ca nd mcm 5 s 2 Ul evels both decreased in starved animals. We validated our results by absoluteq uantificationo ft wo of the RNA modifications by am ethod previously reported by us [14] in which m 5 Cl evels in the large-RNA fraction were indeed found to be significantly increased upon starvation (Figure S5 A, p < 0.05), whereas m 6 Al evels in the large-RNAf raction showedn oc hange between control and starved animals (FigureS5B). Both these absolutequantitations are in line with the relative quantification method based on bio-isotopologues presented here. Overall,t his allowed us to determine the relative abundance of multiple RNA modifications in whole animals upon stress induction.
To gain afull picture of the dynamic nature of RNA modifications, we tried to determine whether or not the stress-induced changes in RNA modificationsa re reversible. To this end we rescued starved animals by reintroducing food and compared the changes in RNA modifications to those in age-matched control animals. All RNA modifications that were significantly altered upon starvation showed ag eneralt rend of reversal upon starvationr escue, with, for example, m 5 Ca nd m 5 Ui nt he large-RNA fraction ( Figure S6)a nd mcm 5 s 2 Ui nt he small-RNA fraction reversing significantly.A ss hown in detail for mcm 5 s 2 U in Figure 2C the levels of this modification were significantly loweredu pon starvation, whereas upon rescue (p value < 0.01) its levelsn ol onger differed from those in the control population. Absolute quantification of m 5 Ca nd m 6 Ai nt he large-RNA fractionso fr escued animals and control animals ( Figure S5 C, D) were again in line with our relative quantification measurements:t hat is, we measured no differenceb etween control and rescued populations for thesem odifications. Our results show that several RNA modificationsn ot only respond to stress conditions, buta re also dynamically regulated between stress and normal growth conditions of C. elegans. We were particularly interested in the tRNA wobble base modification mcm 5 s 2 U, which showedl evels that were significantly reduced in starved animals but were then restored to wild-type levels upon reintroduction of food. mcm 5 s 2 Ui sf ound in tRNA-Lys UUU , tRNA-Glu UUC and tRNA-Gln UUG at positionU 34 of the anticodon loop andi ti sr equired for cell viability and animal development. [15] Loss of mcm 5 s 2 Uh as been linked to inefficient trans-lation, ribosomes talling and protein misfolding in cases of genes that are enriched for codons AAA, GAA and CAA. [9] To test whether starvation-induced reduction of mcm 5 s 2 U levels leads to codon-specificc hangesi ng ene expression, we analysed the codon enrichment of AAA, GAA and CAA in comparisonw itha ll other codons, among genes that show differential expression upon starvation ( Figure S4). Surprisingly,t he codons AAA and GAA show significant enrichment in differentially expressed genes after starvation ( Figure 3A). Previously, such codon effects on translation efficiency were observed upon performing either ribosomal profiling or proteomics experiments. [9] Our resultss how that such codon-biased effects of tRNA modificationsc an be captured by sequencing RNA, most likely due to stabilisation of thesem RNAs on ribosomes. [9a] We did not see any such codon enrichmenta mong the differentially expressed genes in heat-shocked animals (Figure 3B), thus indicating that the enrichmento ft he codons AAA and GAA is specific to starvation response.
In summary,w eu sed stable isotopes to label the transcriptome of the multicellular model organism C. elegans. We were able to identify the most abundant and readily detectable modifications in the large-and small-RNA fractions in an automated and untargeted fashion by using the labelled RNA in LC-MS/HRMS experiments. Furthermore, we used these bioisotopologues for relative LC-MS/HRMS quantification to show that some of these modificationse xhibit dynamic changes in their globall evels as ar esult of heats hock or dietary restriction. By combining RNA modificationm easurements with transcriptome analysis, we show that the wobble base modification mcm 5 s 2 Ul evels correlate with codon-biased gene expression. Currently,i ti sn ot easy to distinguish changes in RNA modification levels from changes in RNA levels where ac ertain modification exists. Nevertheless, by using bio-isotopologues to measureR NA modification levels and combining this with transcriptome-wide analysiso fg ene expression it is possible to uncover important links between RNA modifications and animal physiology.I tw ill be important to explore the mecha-

Experimental Section
Starter cultures of E. coli HT115 strain were grown overnight in M9 minimal medium supplemented with d-glucose ( 12 Co r 13 C, 0.2 %, w/v)a nd MgSO 4 (1 mm). Starter cultures were used for growing fresh E. coli HT115 cultures in M9 minimal medium to an OD 600 of 0.8-1.0. Bacterial cultures were pelleted for RNA isolation or C. elegans culture. Resuspended bacteria were seeded to NGM-N agarose plates, and the plates were left to dry overnight. Te nl arval stage 1a nimals were placed on seeded NGM-N agarose plates with the labelled bacteria and left to grow for two generations. Adult F1-generation animals were bleached to obtain as ynchronous population of F2-generation animals. F2 animals were placed on freshly seeded NGM-N plates with labelled bacteria and grown to adult stage. Adult animals were washed off the plates and pelleted before DNA and RNA isolation. Synchronised populations of L1 animals were grown until the young adult stage. For heat stress, animals were transferred to 37 8Ci ncubators for 4h with food. After 4h,a nimals were washed off the plates, cleaned from bacteria by washing in M9 buffer (3 )a nd stored in TRIsure (Bioline)r eagent for subsequent RNA isolation. For starvation experiments, young adult stage animals were washed off the plates using M9 buffer and plated either on food plates for control or on empty plates for starvation. Animals were left at 20 8Cf or 4h.A fter 4h,a nimals were washed off the plates, cleaned by washing in M9 buffer (3 )a nd stored in TRIsure reagent for RNA isolation. For rescue experiments, control and starved animals were plated on plates with food and left for 8hat 20 8C.
Isolated total RNA (see the Supporting Information) was fractionated into "small" and "large" RNAs with the aid of aQ uick-RNA Mini-Prep kit (Zymo Research)a ccording to the manufacturer's instructions.
Enzymatic digests of size-fractionated RNA were mixed in a1 :2 ratio (m/m based on RNA amounts). For the < 200 nt fraction the unlabelled RNA (1.5 mg) was added to the 13 CS IL small RNA (3 mg). For the > 200 nt fractions, 3a nd 6 mg, respectively,w ere used. LC-HRMS was performed with aT hermo Ultimate 3000 UHPLC system equipped with aW aters HSS-T3 column and coupled to aT hermo Qexactive hybrid mass spectrometer.L Cc onditions were as follows:H 2 O/MeCN solvent system (formic acid, 0.1 %);H RMS was performed in Full MS-SIM mode, resolution 35 000, scan range 250-500 m/z.
Using the absolute concentration of rC as ar eference for the amount of RNA in each sample (Table S2) we adjusted the samples' RNA concentrations. 13 C-Labelled RNA digest (5 %, v/v)w as added as the internal reference standard. With the same experimental setup, LC-MS/HRMS was then performed with the machine operated in MRM mode;r esolution 35 000 and inclusions as listed in Ta ble S3.
Complete experimental procedures are provided in the Supporting Information.