Targeted Detection of G-Quadruplexes in Cellular RNAs

The G-quadruplex (G4) is a non-canonical nucleic acid structure which regulates important cellular processes. RNA G4s have recently been shown to exist in human cells and be biologically significant. Described herein is a new approach to detect and map RNA G4s in cellular transcripts. This method exploits the specific control of RNA G4–cation and RNA G4–ligand interactions during reverse transcription, by using a selective reverse transcriptase to monitor RNA G4-mediated reverse transcriptase stalling (RTS) events. Importantly, a ligation-amplification strategy is coupled with RTS, and enables detection and mapping of G4s in important, low-abundance cellular RNAs. Strong evidence is provided for G4 formation in full-length cellular human telomerase RNA, offering important insights into its cellular function.

G-quadruplex (G4) nucleic acid structures play pivotal roles in the regulation of a myriad of cellular processes, [1] and have been demonstrated to be versatile scaffolds for ligand and biosensor development. [2] Recently, G4s were visualized using G4-specific antibodies in human cells and tissues, and G4 ligands can stabilize such structures in cells. [3] Approaches for RNA G4 identification in cellular RNA have been largely limited to computational predictions, using algorithms such as Quadparser and QGRS. [4] Recent discoveries have started to specifically reveal important roles for RNA G4s in cells. [5] It is now essential to establish ways to explicitly map RNA G4 formation and location in cellular transcripts to establish their role(s) and facilitate small-molecule intervention strategies.
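The computational predictions mentioned above are typically pattern searches. As a minimal sketch, assuming the commonly cited Quadparser-style consensus of four runs of at least three guanines separated by loops of one to seven nucleotides, a scan for putative G4 motifs could look like the following. The exact pattern, the function name `find_putative_g4s`, and the flanking bases in the example are illustrative assumptions and are not taken from this work; the core sequence is the NRAS 5'UTR G4 quoted later in the text.

```python
import re

# Assumed Quadparser-style consensus: G(3+) N(1-7) G(3+) N(1-7) G(3+) N(1-7) G(3+).
# This is a common folding rule, not necessarily the exact rule used by any tool.
G4_PATTERN = re.compile(r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}")


def find_putative_g4s(rna):
    """Return (start, end, motif) tuples with 1-based, inclusive coordinates.

    Matches are non-overlapping, so overlapping alternative G4s are not enumerated.
    """
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start() + 1, m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# NRAS 5'UTR G4 sequence from the text, placed between hypothetical flanks.
nras_g4 = "GGGAGGGGCGGGUCUGGG"
hits = find_putative_g4s("AAAA" + nras_g4 + "CCCC")
print(hits)
```

A pattern match only flags a candidate sequence. It says nothing about whether the structure forms in a full-length cellular transcript, which is the gap the experimental approach described here addresses.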
G4s form from guanine-rich sequences which self-assemble by stacked G-tetrads that are further stabilized by cations such as K+ (Figures 1 and 2A). [6] G4 structures exhibit characteristic features in circular dichroism, UV-thermal melting analysis, fluorescence, and NMR spectroscopy. [7] The synthetic or in vitro transcribed (IVT) RNA can be subjected to in-line probing to detect the presence of RNA G4s. [8] Compared to the abundant rRNAs and tRNAs, most cellular transcripts are of lower abundance, therefore it is challenging to adapt existing approaches to probe the formation and location of G4s in full-length cellular RNAs.
Herein, we show that RNA G4-mediated reverse transcriptase stalling (RTS) can be rationally controlled in a cation- and G4 ligand-dependent fashion (Figure 1). Notably, we have integrated RTS with hybridization-based ligation-mediated PCR (RTS-HBLMPCR) to demonstrate and positionally map RNA G4 formation in cellular transcripts. This new approach has the sensitivity to enable G4 mapping of functionally important transcripts at natural abundance levels (Figure 1).
To identify a suitable reverse transcriptase and conditions for RNA G4 detection and mapping, we rationally designed the IVT RNA construct such that the RNA of interest is within a structure cassette, which contains a 5' hairpin that serves as internal RTS control, and a 3' hairpin for primer binding (Figure 2B). [9] Using a well-studied RNA G4 which is present in the 5'UTR of human NRAS, [10] GGGAGGGGCGGGUCUGGG (see Table S1 in the Supporting Information), we evaluated a series of commercially available reverse transcriptases such as SuperScript III (Life Technologies) and AMV (Roche) using the reverse transcription buffer provided (containing K+) and observed strong RTS near the RNA G4 (see Figure S1 in the Supporting Information). This observation is similar to a previous report, [11] thus confirming that in the presence of K+, reverse transcriptases stall at G4 sites frequently. A closer inspection of the stalling position revealed that it usually occurs one nucleotide (nt) before the 5' end of the G4 of the reverse transcribed cDNA, and corresponds to one nt after the 3' end of G4 in RNA (Figures 1 and 2C).

Figure 1. Targeted detection of G4s using RTS or RTS-HBLMPCR. IVT or total cellular RNA was used. Reverse transcription was conducted with Cy5-labeled/unlabeled gene-specific primer (green) for RTS/RTS-HBLMPCR, respectively. For ligand treatment experiments, ligand was added prior to reverse transcription (not explicitly illustrated). After reverse transcription, RNA was hydrolyzed, and the Cy5-labeled cDNAs were analyzed by denaturing PAGE. For RTS-HBLMPCR, the unlabeled cDNAs were ligated to a single-stranded DNA (ssDNA) linker (blue and orange). The ligated cDNA was amplified by PCR with linker-specific unlabeled forward primer (blue) and gene-specific Cy5-labeled reverse primer (pink). The PCR products were analyzed by denaturing PAGE.
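The positional rule stated above (stalling one nt after the 3' end of the G4 in the RNA) can be turned into an expected cDNA length for a given primer, which is what a denaturing gel reports. The sketch below is illustrative only. It assumes 1-based RNA coordinates, that the stalled cDNA includes the nucleotide copied at the stall position, and that the primer's 5'-most base pairs with RNA position `primer_5p_site`. The function names and the example coordinates are hypothetical and do not describe the actual constructs.

```python
def expected_stall_position(g4_end):
    """RNA position (1-based) where RT is expected to stall: one nt 3' of the G4."""
    return g4_end + 1


def stalled_cdna_length(g4_end, primer_5p_site):
    """Length of the stalled cDNA, primer included.

    Assumes the cDNA covers RNA positions stall..primer_5p_site inclusive, where
    primer_5p_site is the 3'-most RNA position paired with the primer.
    """
    stall = expected_stall_position(g4_end)
    if primer_5p_site < stall:
        raise ValueError("primer must anneal 3' of the G4")
    return primer_5p_site - stall + 1


# Hypothetical construct: G4 spanning nt 30-47, primer covering nt 81-100.
print(expected_stall_position(47))   # 48
print(stalled_cdna_length(47, 100))  # 53
```

Under these assumptions, a stall band that runs one nt longer or shorter than predicted would indicate a different stop position relative to the G4 boundary.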
Next, we sought conditions to alleviate RTS at G4s. Given that G4 stability has a strong cation dependence, [12] in the order K+ > Na+ > Li+, we performed ion-dependent reverse transcription and showed that one of the enzymes, SuperScript III reverse transcriptase, stalled at the G4 site in a 150 mM K+-containing buffer, but not in either a 150 mM Na+- or Li+-containing buffer (same ionic strength; Figure S1). Specifically, 16-fold higher stalling was observed in the presence of K+ versus Li+ (Figure 2C; see Table S2 in the Supporting Information), where we define the RTS effect as the fraction of stalled product relative to the total RT events under the condition employed (e.g. K+ in this case), divided by the same fraction under the Li+ condition (Figure 2C, lanes 1 and 8). Moreover, we demonstrated that the RTS effect in NRAS G4 is progressively suppressed with decreasing [K+] (Figure 2C, lanes 1-7), thus corroborating that RTS correlates to G4 stability. Further experiments confirmed the RTS effect with other well-known and validated RNA G4s from human genes including TRF2, [13] MT3, [14] BCL2, [15] and others (see Figures S2-S8 in the Supporting Information), [16] whereas control RNA structural motifs, such as the hairpin (HP) and pseudoknot (PK), did not display an RTS effect (see Figures S9-11 in the Supporting Information). These data strongly indicate that the observed RTS is G4-specific. These results show that ion-dependent RTS can detect and map RNA G4s in the context of extended transcripts.
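The RTS effect defined above is a ratio of two stalling fractions. A minimal sketch of that arithmetic follows, assuming band intensities have already been quantified from the gel. The function names and all numerical values are hypothetical and are not data from this work.

```python
def stalling_fraction(stall_signal, total_signal):
    """Fraction of reverse transcription events that end at the stall site."""
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    return stall_signal / total_signal


def rts_effect(stall_cond, total_cond, stall_li, total_li):
    """Stalling fraction under the test condition divided by that under Li+."""
    li_fraction = stalling_fraction(stall_li, total_li)
    if li_fraction == 0:
        raise ValueError("Li+ stalling fraction is zero; ratio is undefined")
    return stalling_fraction(stall_cond, total_cond) / li_fraction


# Hypothetical band intensities (arbitrary units) for a test lane and a Li+ lane.
effect = rts_effect(stall_cond=300, total_cond=1000, stall_li=50, total_li=1000)
print(f"RTS effect: {effect:.1f}-fold")
```

Normalizing each lane to its own total signal before taking the ratio makes the value independent of loading differences between lanes, which is presumably why the definition is phrased in terms of fractions rather than raw band intensities.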
The targeting of G4 structures using small molecules offers the potential to intervene with biological processes. [3a,b, 17] We were therefore interested to see whether RTS could probe RNA G4 targeting by a stabilizing ligand, as this could validate the molecular target and provide insights into ligand selectivity. We started by using a strong G4-stabilizing ligand called pyridostatin (PDS; Figure 3A). [17] Using the NRAS G4 RNA system under Li+ conditions, which do not promote G4 formation, we observed a PDS concentration-dependent increase in RTS (Figure 3B, lanes 2-7), thus yielding a fivefold RTS effect at 1 mm PDS (see Table S2). In a control experiment we introduced an excess of a DNA G4 competitor (30 mm c-MYC) for PDS binding and observed a reduction in RTS (Figure 3B, lane 8), thus verifying that the ligand effect was a result of G4 recognition and stabilization. We also demonstrated that the excess DNA G4 spiked in did not affect the reverse transcription (Figure 3B, lane 9). To provide insights into the effect of different G4 ligands on RNA G4, we tested other G4 ligands such as cPDS, [2a] PhenDC3, [18] and TMPyP4 [19] under the same dosage at 1 mm (see Figure S12 in the Supporting Information). We noted that higher G4 ligand concentration started to inhibit the activity of reverse transcriptase (data not shown). We found that cPDS (see Figure S12A) has a similar structure to that of PDS and showed almost the same RTS effect, whereas PhenDC3 and TMPyP4 (see Figure S12A), which have quite different structures from PDS, exhibited either weaker or no RTS effect, respectively (see Figure S12B). We also performed the same experiments using telomeric TERRA RNA G4, [20] and observed an identical trend for the RTS effect, that is, PDS ≈ cPDS > PhenDC3 > TMPyP4 (see Figure S12C). These findings suggest that these G4 ligands each have a different potential for targeting RNA G4s in extended transcripts.

Angewandte Communications
Whilst the above experiments were performed on IVT RNAs, detecting and mapping an RNA G4 within a specific transcript in total cellular RNA is more challenging. For low-abundance cellular transcripts, the cDNAs generated by reverse transcription are scarce and cannot be detected by the RTS assay without further amplification. To achieve this, we coupled the RTS assay with ligation-mediated PCR [21] (LMPCR), in which single-stranded DNA (ssDNA) ligation was first performed to ligate the cDNAs to a known DNA linker, and then followed by PCR (Figure 1). For the ssDNA ligation step, we implemented a hybridization-based (HB) ligation strategy [21b] and designed an ssDNA linker containing a stable hairpin with a degenerate hexamer (N6) tail for efficient hybridization and ligation to incoming cDNAs (Figure 1, ssDNA ligation step; and see Figure S13 in the Supporting Information). Our ligation result showed that the ligation yield was up to about 90 % in 2 hours (see Figure S13), and shows that HB ligation is robust and nearly quantitative.
We then applied RTS-HBLMPCR to investigate RNA G4 formation in human telomerase RNA (hTERC), a 451 nt long noncoding RNA (lncRNA) critical for the regulation of telomere length. [22] hTERC RNA is present at only about 1000 copies/cell in most human cells, [23] thus making it a challenging test case for our method. The low abundance of hTERC relative to 5.8S rRNA (ca. 200-fold higher) and actin ACTB mRNA (ca. 10-fold higher) in HeLa cells was evident by qRT-PCR (see Figure S14 in the Supporting Information). Cation- and PDS-dependent RTS-HBLMPCR were conducted (Figure 5A), and revealed a number of key findings. First, a strong RTS effect was detected near the 5' end of hTERC RNA, where a close examination of the sequence alone suggested putative G4 formation (Figure 5). Thus, our RTS-HBLMPCR result here provides robust and direct experimental evidence for the formation of G4s in full-length hTERC in total cellular RNA. Second, two major stalling bands were observed (Figure 5, asterisks), thus indicating that more than one RNA G4 was formed, which can be rationalized by the fact that the region has nine G-tracts available for involvement in G4 formation (Figure 5B, bolded). Based on the stalling location, it is most likely that two major RNA G4s (nt 1-17 and nt 21-34) are formed in full-length hTERC. Recently, the nt 1-17 RNA G4 has been solved by NMR spectroscopy, [24] which supports our finding here. These results also suggest that RTS and RTS-HBLMPCR can detect multiple or tandem RNA G4s in a single experiment, and could be useful given the prevalence of RNA G4s in the human transcriptome. [4b] Lastly, the result for full-length cellular hTERC (Figure 5) was comparable to that for hTERC IVT RNA (see Figure S8), thus showing that the hTERC RNA G4s are stable and the flanking sequences did not preclude G4 formation.
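Relative abundances from qRT-PCR, such as the fold differences quoted above, are commonly estimated from differences in threshold cycle (Ct). The text does not state how the fold differences were calculated, so the following is only a sketch of the usual ΔCt arithmetic, assuming perfect doubling per cycle. The function name `fold_higher` and the Ct values are hypothetical.

```python
def fold_higher(ct_target, ct_reference, efficiency=2.0):
    """How many fold more abundant the reference is than the target.

    Assumes each PCR cycle multiplies the product by `efficiency` (2.0 = perfect
    doubling), so a lower Ct means a more abundant transcript.
    """
    return efficiency ** (ct_target - ct_reference)


# Hypothetical Ct values, chosen only for illustration (not measured data).
ct = {"low-abundance target": 25.0, "mRNA reference": 22.0, "rRNA reference": 18.0}
for ref in ("mRNA reference", "rRNA reference"):
    fold = fold_higher(ct["low-abundance target"], ct[ref])
    print(f"{ref}: {fold:.0f}-fold higher than the target")
```

With real primer pairs the amplification efficiency is rarely exactly 2.0, so the `efficiency` parameter would need to be measured for each amplicon before the fold values are interpreted quantitatively.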
The phylogenetic RNA secondary structure of hTERC shows that some of the G-tracts are involved in the formation of a P1 helix (Figure 5B), a key element defining the template boundary of reverse transcription in telomerase. [25] UV melting experiments on the hTERC G4 sequence (nt 1-41) gave a melting temperature (Tm) of greater than 80 °C at 100 mM K+. [26] The addition of the complementary sequence (nt 184-208) yielded a Tm of 73 °C for the duplex. [26] Combined with our RTS-HBLMPCR result here (Figure 5A), it is likely that the G4s and P1 helix coexist in full-length hTERC, with G4s being more stable than the P1 helix (weak GU pairs and internal loop) under physiological conditions which have high [K+], about 150 mM. We cannot rule out the possibility that both structures could form simultaneously with a quadruplex comprising a C bulge (nt 1-17). [24] Recent protein binding and RNA mutational studies have suggested that the RNA G4-specific helicase DHX36 is involved in the binding and unwinding of a putative RNA G4 located near the 5' end of hTERC, [27] and that the G4 may facilitate the accumulation of mature hTERC needed for telomere maintenance. [27a] Our finding here that G4s are stable and detectable in full-length cellular hTERC supports the idea that enzymes which can resolve G4s in RNA, such as DHX36, [27] may play a role in ensuring an active hTERC RNA conformation for telomerase function.
In summary, we introduce the RTS-HBLMPCR approach for probing G4 formation and location in full-length low-abundance cellular RNAs. We exemplify the approach by detecting and mapping G4 formation in the biologically important cellular lncRNA, hTERC. Moreover, we show that the approach is applicable to detect and validate RNA targets for G4 ligands. In the future, we will adapt the approach described here to map G4s throughout the transcriptome.