Advances in bacterial transcriptome understanding: From overlapping transcription to the excludon concept

Abstract In the last decade, the implementation of high‐throughput methods for RNA profiling has uncovered that a large part of the bacterial genome is transcribed well beyond the boundaries of known genes. Therefore, the transcriptional space of a gene very often invades the space of a neighbouring gene, creating large regions of overlapping transcription. The biological significance of these findings was initially regarded with scepticism. However, mounting evidence suggests that overlapping transcription between neighbouring genes conforms to regulatory purposes and provides new strategies for coordinating bacterial gene expression. In this MicroReview, considering the discoveries made in a pioneering transcriptome analysis performed on Listeria monocytogenes as a starting point, we discuss the progress in understanding the biological meaning of overlapping transcription that has given rise to the excludon concept. We also discuss new conditional transcriptional termination events that create antisense RNAs depending on the metabolite concentrations and new genomic arrangements, known as noncontiguous operons, which contain an interspersed gene that is transcribed in the opposite direction to the rest of the operon.


| INTRODUCTION
It has been a decade since the publication of the first complete unbiased transcriptome analysis in the bacterial pathogen Listeria monocytogenes (Toledo-Arana et al., 2009). This seminal study used high-resolution tiling microarrays to investigate the transcriptional profiles of wild-type and transcriptional regulatory mutants of Listeria monocytogenes grown in several conditions: (a) in vitro (exponential and stationary phase, hypoxia and low temperature); (b) ex vivo (human blood); and (c) in vivo (intestine of axenic mice). The results of the study anticipated a complex scenario in the bacterial genome transcription that has been confirmed by further studies in different bacteria (Cohen et al., 2016; Conway et al., 2014; Dornenburg, DeVita, Palumbo, & Wade, 2010; Kröger et al., 2012; Mitschke et al., 2011; Sharma et al., 2010; Thomason & Storz, 2010; Wade & Grainger, 2014).
The bacterial transcriptome contains a substantial fraction of RNA sequences that overlap with other RNAs. For example, transcriptomes contain hundreds of noncoding regions, including trans-acting small RNAs (sRNAs), cis-acting antisense RNAs (asRNAs) and long 5′ and 3′ untranslated regions (5′ and 3′ UTRs) whose transcription start sites (TSSs) or transcription termination sites (TTSs) are often located inside the coding sequence of the neighbouring gene. This review will describe how the initial observations showing frequent overlapping between transcripts of the neighbouring genes in the Listeria transcriptome (Toledo-Arana et al., 2009) have been enriched with new studies in other bacteria that are paving the way to the understanding of antisense transcription as a new mechanism to coordinate the bacterial gene expression.

| Riboswitch-dependent regulation of antisense RNAs
Riboswitches are regulatory elements that sense the fundamental metabolites or ions to control the expression of the genes encoding proteins involved in the metabolism or homoeostasis of these molecules (Winkler & Breaker, 2005). Despite the wide diversity of molecules that riboswitches are able to recognise, the regulatory activity of most of them is dedicated to modulating either transcription or translation by changing mutually exclusive RNA conformations. Regarding transcription attenuation, one of the alternative RNA structures serves as a Rho-independent terminator while the other forms anti-terminator hairpins that allow transcription. Analogously, translation could be inhibited or activated by alternative RNA structures that sequester or release ribosome-binding sites (RBS), respectively. In both cases, the RNA structures are reorganised upon metabolite binding to select the appropriate one that will allow activation/inhibition of the required gene to respond according to the metabolite concentration (Serganov & Nudler, 2013).
The initial L. monocytogenes transcriptome analysis uncovered the transcription of a large number of long 5′ UTRs containing riboswitches. As expected, most were located upstream of a coding sequence but, interestingly, some were transcribed in the opposite direction to a coding gene (Toledo-Arana et al., 2009). One example of this configuration is rli39, a vitamin B12 riboswitch that is positioned downstream of the lmo1149 gene and in the opposite and convergent orientation to the next adjacent gene, pocR (lmo1150) (Toledo-Arana et al., 2009). Depending on the vitamin B12 concentration, this riboswitch controls the transcription of an asRNA (aspocR) that overlaps pocR mRNA (Mellin et al., 2013) (Figure 1a).
PocR is a transcription factor that, in the presence of propanediol, mediates propanediol catabolism by activating the pduCDE genes. Propanediol catabolism requires a B12-dependent diol dehydratase. Binding of vitamin B12 to the riboswitch prevents aspocR transcription and, consequently, PocR protein is expressed, promoting propanediol catabolism. In contrast, when there is not enough vitamin B12, the asRNA is expressed, thereby inhibiting PocR expression. Therefore, asRNA regulation by this riboswitch ensures that pdu genes are only expressed when the vitamin B12 cofactor required for propanediol catabolism is present (Mellin et al., 2013). A similar scenario where a riboswitch regulates the expression of an antisense transcript was previously described in Clostridium acetobutylicum (Andre et al., 2008) (Figure 1b). In this case, the conversion of methionine to cysteine is carried out by the proteins encoded in the ubiGmccBA operon, whose expression is controlled by two functional convergent promoters associated with transcriptional antitermination systems, a cysteine-specific T-box and an S-box riboswitch, respectively. The S-box riboswitch modulates the transcription of an asRNA overlapping the ubiG operon. Because the expression of this asRNA in trans did not affect the expression of the ubiG operon, the authors proposed a cis-acting regulatory model via transcription interference at the ubiG locus (Andre et al., 2008).
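The regulatory logic of the B12 riboswitch, aspocR and PocR described above can be summarised as a small truth table. The following is a minimal sketch, assuming that each regulatory step can be treated as an on/off switch (in the cell, these responses depend on concentrations and are graded); the function names are hypothetical and chosen only for illustration.

```python
def aspocr_transcribed(b12_present: bool) -> bool:
    """B12 binding to the riboswitch prevents transcription of the asRNA aspocR."""
    return not b12_present


def pocr_expressed(b12_present: bool) -> bool:
    """aspocR overlaps the pocR mRNA and inhibits PocR expression."""
    return not aspocr_transcribed(b12_present)


def pdu_genes_active(b12_present: bool, propanediol_present: bool) -> bool:
    """PocR activates the pdu genes only in the presence of propanediol."""
    return pocr_expressed(b12_present) and propanediol_present


def truth_table() -> dict:
    """Map every (B12, propanediol) combination to the pdu expression state."""
    return {
        (b12, propanediol): pdu_genes_active(b12, propanediol)
        for b12 in (False, True)
        for propanediol in (False, True)
    }


if __name__ == "__main__":
    for (b12, propanediol), active in truth_table().items():
        print(f"B12={b12!s:5} propanediol={propanediol!s:5} -> pdu active: {active}")
```

Under this simplification, the pdu genes are active only when both the substrate (propanediol) and the cofactor (vitamin B12) are available, which is the outcome attributed to the riboswitch-controlled asRNA in the text.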
Another important insight in the riboswitch field came from the application of Term-seq methodology to L. monocytogenes. This method enables the quantitative mapping of all exposed RNA 3′ ends and allows the unbiased genome-wide identification of genes that are regulated by premature transcription termination, including riboswitches. The application of Term-seq to L. monocytogenes revealed that many of the previously annotated sRNAs were indeed cis-acting regulatory 5′ UTRs. In particular, two sRNAs of unknown function, rli53 and rli59, were found to function as antibiotic-responsive riboregulators that control the expression of the lmo0919 and lmo1652 genes, respectively, both encoding ABC transporters of unknown function (Figure 1c). Inspection of the regulatory 5′ UTR sequence of lmo0919, which is highly specific to lincomycin, revealed two alternative stem-loop structures, a transcriptional terminator and an antiterminator, respectively. Deletion of eight nucleotides from the antiterminator kept the regulator in a constitutively 'closed' state, even in the presence of lincomycin antibiotic, rendering the bacteria sensitive. In contrast, the deletion of eight nucleotides from the anti-antiterminator released the antiterminator to interfere with the terminator structure. As a result, this mutation produced a constitutive read-through ('open' state), even in the absence of antibiotics, increasing resistance to lincomycin. A three-amino-acid upstream open reading frame (uORF) exactly overlapping the inhibitory anti-antiterminator sequence forms the basis for attenuation-mediated regulation. The association of a ribosome with the antibiotic leads the ribosome to stall on the uORF, releasing the antiterminator to interfere with terminator folding and, thus, allowing read-through into the antibiotic resistance gene.
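The behaviour of the lmo0919 leader and of the two deletion mutants described above can be captured in a few lines. This is a minimal sketch, assuming an all-or-nothing outcome for each leader variant; the `Leader` enum and the `reads_through` function are hypothetical names introduced for illustration only.

```python
from enum import Enum


class Leader(Enum):
    WILD_TYPE = "wild type"
    ANTITERMINATOR_DELETION = "deletion in the antiterminator"
    ANTI_ANTITERMINATOR_DELETION = "deletion in the anti-antiterminator"


def reads_through(leader: Leader, lincomycin: bool) -> bool:
    """Return True if transcription continues into the resistance gene."""
    if leader is Leader.ANTITERMINATOR_DELETION:
        # The antiterminator cannot form, so the terminator always folds
        # (constitutively 'closed' state).
        return False
    if leader is Leader.ANTI_ANTITERMINATOR_DELETION:
        # The antiterminator is always free to disrupt the terminator
        # (constitutively 'open' state).
        return True
    # Wild type: the antibiotic-bound ribosome stalls on the uORF, which
    # releases the antiterminator and allows read-through.
    return lincomycin


if __name__ == "__main__":
    for leader in Leader:
        for lincomycin in (False, True):
            state = "read-through" if reads_through(leader, lincomycin) else "terminated"
            print(f"{leader.value:38} lincomycin={lincomycin!s:5} -> {state}")
```

Only the wild-type leader responds to the antibiotic in this model; the two deletions lock the leader into the 'closed' and 'open' states, respectively, matching the sensitive and resistant phenotypes reported in the text.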
The application of Term-seq in other model organisms (Bacillus subtilis and Enterococcus faecalis) and human oral microbiomes identified numerous riboswitches, suggesting that termination-based regulation in response to antibiotics and other metabolites is very common in Gram-positive bacteria. These considerations must be taken into account when studying sRNAs because, similar to what happens with rli39, rli53 and rli59, it is likely that some of the annotated sRNAs are indeed riboswitches whose transcription terminates under the specific environmental conditions in which the sRNA has been detected, while in a different condition, transcription continues, generating a productive mRNA or asRNA.

| Overlapping transcription between neighbouring genes
Another intriguing finding from the L. monocytogenes transcriptome was that long 5′ or 3′ UTRs of well-annotated genes often overlap with neighbouring genes or UTRs (Toledo-Arana et al., 2009). An example illustrating 5′ overlapping transcription corresponds to a long 5′ UTR of the mogR-lmo0673 operon (Figure 2). L. monocytogenes is highly flagellated and motile at low temperatures (30°C and below) but non-motile at host-related temperatures (37°C). Flagella biosynthesis requires dozens of genes included in a large operon (from lmo0673 to lmo0718). Most of these genes are encoded on the positive DNA strand, with the exception of lmo0673 and mogR (lmo0674), which are transcribed opposite to them. MogR is a transcriptional repressor that is essential for the temperature-dependent transcription of motility genes (Gründling, Burrack, Bouwer, & Higgins, 2004). Listeria transcriptome data showed that the MogR protein is expressed from two alternative mRNAs that are transcribed from promoters P1 and P2 located 1,697 and 45 nucleotides upstream

This was confirmed by the analysis of the short RNA fraction of an RNase III mutant where the number of short RNA reads was drastically reduced. An important conclusion of this finding is that both sense and antisense overlapping transcripts have to be present simultaneously in the cytoplasm of the cell (Lasa et al., 2011; Lasa & Villanueva, 2014). Further, a study on E. coli indicated that overlapping sense/antisense transcripts are digested by RNase III (Lybecker, Zimmermann, Bilusic, Tukhtubaeva, & Schroeder, 2014).
In this study, the authors used a monoclonal antibody that recognises double-stranded RNA (dsRNA) molecules to pull them down from a total RNA sample extracted from E. coli and its corresponding RNase III mutant. Sequencing of the purified dsRNAs showed that the transcripts of the dsRNA regions remain protected and more stable in the absence of an active RNase III. The majority of overlapping regions identified in this study (50%) correspond to the 5′ region of genes, whereas only 0.5% of the overlapping transcripts correspond to the 3′ region. Contrary to this, overlapping between 3′ UTRs of contiguous genes in S. aureus was found to be more frequent than overlapping between 5′ UTRs (Lasa et al., 2011; Ruiz de Los Mozos et al., 2013). Overlapping between SigB-dependent asRNA

(Yan, Boitano, Clark, & Ettwiller, 2018), named SMRT-Cappable-seq, on E. coli has shown that 40% of transcription termination sites have read-through that can alter the gene content of the defined operons (http://biocomputo2.ibt.unam.mx/OperonPredictor/) (Taboada, Estrada, Ciria, & Merino, 2018). When the downstream genes were in the same direction, the extended transcript included at least one additional gene. This situation occurred in 34% of the known operons. In contrast, when the downstream genes were in the opposite direction, the extended transcript overlapped them, generating an antisense transcript.
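The two outcomes of terminator read-through described above (an extended operon when the downstream gene lies in the same direction, an antisense transcript when it lies in the opposite direction) can be illustrated with simple coordinate arithmetic. This is a minimal sketch, assuming 1-based inclusive genome coordinates; the `Gene` class, the function names and the example genes are hypothetical and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate (1-based, inclusive)
    end: int     # rightmost coordinate (inclusive)
    strand: str  # "+" or "-"


def overlap_length(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def classify_read_through(upstream: Gene, extended_to: int, downstream: Gene) -> str:
    """Classify the effect of a transcript that starts at `upstream` and
    continues past its terminator up to genome position `extended_to`."""
    # The extended transcript runs rightwards on "+" and leftwards on "-".
    if upstream.strand == "+":
        t_start, t_end = upstream.start, extended_to
    else:
        t_start, t_end = extended_to, upstream.end
    if overlap_length(t_start, t_end, downstream.start, downstream.end) == 0:
        return "no overlap"
    if downstream.strand == upstream.strand:
        return "extended operon"    # same direction: gene joins the transcript
    return "antisense overlap"      # opposite direction: transcript is antisense


if __name__ == "__main__":
    gene_a = Gene("geneA", 100, 500, "+")
    same_dir = Gene("geneB", 600, 900, "+")
    opposite = Gene("geneC", 600, 900, "-")
    print(classify_read_through(gene_a, 950, same_dir))  # extended operon
    print(classify_read_through(gene_a, 950, opposite))  # antisense overlap
    print(classify_read_through(gene_a, 550, opposite))  # no overlap
```

For simplicity, the sketch reports an extended operon as soon as the transcript reaches into the downstream gene; a stricter version could require the whole coding sequence to be covered.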
The levels of pervasive read-through transcription have been shown to be affected by the presence of the transcription terminator factor Rho (Bidnenko & Bidnenko, 2018). In B. subtilis, transcriptional and physiological studies demonstrated that the absence of Rho impairs the bacterial motility due to the extended transcription of genes that generate transcripts that overlap with neighbouring genes important for flagella apparatus, biofilm formation and sporulation (Bidnenko et al., 2017). Because the levels of Rho can fluctuate between cells and temporally, within a single cell, Rho-dependent overlapping transcription can be a source of transcriptional noise in the bacterial population. In conclusion, the existence of overlapping transcription between 5′ and 3′ UTR of contiguous genes together with the existence of an RNase III-dependent mechanism to process overlapping transcripts provides new evidence that the gene location in bacterial genomes obeys, in many cases, gene regulation criteria.

| The excludon concept
The finding that overlapping between 5′ and 3′ UTRs of contiguous genes is common in bacterial genomes, together with the fact that RNase III digests sense/antisense overlapping transcripts, inspired the Cossart and Sorek groups to propose a new paradigm of regulation based on overlapping transcription, termed 'excludon' (Sesto, Wurtzel, Archambaud, Sorek, & Cossart, 2013). The excludon con-

The synthesis of menaquinone, a component of the electron-transport system (Bentley & Meganathan, 1982), illustrates well how the noncontiguous operon genetic arrangement may play a key role in the capacity of pathogenic bacteria to grow inside cells.
Inhibition of the synthesis of menaquinone (or haemin) produces small colony variants (SCVs) in S. aureus (Eiff et al., 1997). SCVs are usually isolated from patients experiencing chronic infections because bacteria showing this phenotype are able to persist better in mammalian cells and are less susceptible to aminoglycosides than their wild-type counterparts (Proctor et al., 2014). The molecular mechanisms underlying the generation of SCVs remain, nonetheless, poorly understood because the subcultivation of SCVs in the laboratory reverts their phenotype to normal colony growth. The rapid switch between SCVs and normal cells strongly suggests that the phenotype is transient and is not mediated by genetic changes (Proctor et al., 2014).

Figure 5). Interestingly, a second noncontiguous operon configuration is found among men genes. Specifically, MW0924 is co-transcribed with menFDHB genes forming a long polycistronic transcript that overlaps the menA mRNA, which is encoded between MW0924 and menF in the opposite direction (Sáenz-Lahoya et al., 2019). It is tempting to speculate that both antisense

FIGURE 3 The excludon concept. Schematic representation of putative gene organisations that produce overlapping transcripts in bacteria. (a) Long 5′ UTR overlapping. (b) Long 3′ UTR overlapping among convergent genes that lack a transcriptional terminator between them. (c) Long 3′ UTR overlapping generated by transcriptional termination read-through events. If a transcriptional terminator exists between two convergent genes, the RNA polymerase occasionally reads through the terminator signal generating long overlapping 3′ UTRs. (d) Noncontiguous operons that contain an interspersed gene that is transcribed in the opposite direction, generating two overlapping mRNAs that are reciprocally regulated. Chart drawing references are included. Different putative mRNA transcripts are represented as dashed arrows. (+) and (−) indicate the forward and reverse DNA strands, respectively. Pfw and Prv represent promoters encoded at the forward and reverse DNA strands, respectively

with the noncontiguous operon structure (Yan et al., 2018). Moreover, transcriptome analyses of different phages of S. aureus revealed that noncontiguous operon arrangements are also present in phage genomes (Chen et al., 2018; Quiles-Puchalt et al., 2013). Together, these data indicate that noncontiguous organisation may be widespread in both Gram-positive and Gram-negative bacteria, as well as in bacteriophages.

| Advantages of overlapping transcription-mediated regulation
The antisense regulatory mechanisms derived from the initial ob-

It is also important to highlight that a few genomic changes are sufficient to create novel antisense regions. For example, a few nucleotide mutations can create or modify promoters and transcriptional terminator signals. Another advantage of the regulation mediated by overlapping transcription is that it permits the evolution of each of the overlapping genes by nucleotide changes that simultaneously affect both mRNA transcripts without altering their binding affinity (Brantl, 2015). From an evolutionary perspective, this has important consequences because it allows changes in the genome that affect, for instance, the promoter region of one of the partners while preserving the regulatory mechanisms.
Unsurprisingly, bacteria have taken advantage of such versatility and make widespread use of overlapping transcription to coordinate gene expression.

| FINAL REMARKS
Breakthroughs in methods to analyse total bacterial RNA content (tiling array and RNA-seq sequencing technologies) have led to the complete characterisation of transcriptomes with a precision previously unimaginable. A limitation of these methods, due to the requirement of a minimal amount of RNA for the analysis, is that they are conducted at a 'population level', with the resulting transcriptome being an average of the transcriptomes of millions of prokaryotic cells (Kang, McMillan, Norris, & Hoang, 2015; Saliba, Santos, & Vogel, 2017; Saliba, Westermann, Gorski, & Vogel, 2014). Therefore, specific patterns of gene expression that occur in one cell (for example, the correlation of the expression of the sense/antisense mRNAs in noncontiguous operons) are diluted by cell-to-cell heterogeneity within the whole population. We foresee that the next technological breakthrough to progress the knowledge in antisense-mediated regulation will be related to the capacity to analyse bacterial transcriptomes at a single-cell level.

FIGURE 5 Genomic organisation of the genes required for menaquinone biosynthesis in S. aureus. The genes required for menaquinone biosynthesis are distributed in three operons (A, B and C) that are distantly encoded in the S. aureus genome (represented as a circle). A and B genomic regions are organised as noncontiguous operons. Triangles and rectangles represent putative promoters and transcriptional terminators, respectively. Different putative transcripts are shown as dashed arrows. (+) and (−) indicate forward and reverse DNA strands, respectively. The menaquinone biosynthetic pathway is represented at the right of the figure

to A.T-A. and Agencia Española de Investigación/Fondo Europeo de Desarrollo Regional, European Union (BIO2017-83035-R) to I.L.

CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.