There is a significant body of work suggesting that sRNA-mediated post-transcriptional regulation is a conserved mechanism among pathogenic bacteria to modulate bacterial virulence and survival. Porphyromonas gingivalis is recognized as an etiological agent of periodontitis and implicated in contributing to the development of multiple inflammatory diseases including cardiovascular disease. Using NimbleGen microarray analysis and a strand-specific method to sequence cDNA libraries of small RNA-enriched P. gingivalis transcripts using Illumina's high-throughput sequencing technology, we identified putative sRNA and generated sRNA expression profiles in response to growth phase, hemin availability after hemin starvation, or both. We identified transcripts that mapped to intergenic sequences as well as antisense transcripts that mapped to open reading frames of the annotated genome. Overall, this approach provided a comprehensive way to survey transcriptional activity to discover functionally linked RNA transcripts, responding to specific environmental cues, that merit further investigation.
Porphyromonas gingivalis is a nonmotile asaccharolytic Gram-negative obligate anaerobe that requires heme (iron and protoporphyrin IX) for growth. It is recognized as a major pathogen of severe adult periodontitis (Griffen et al., 1998) and implicated in systemic inflammatory conditions including cardiovascular disease (Desvarieux et al., 2005; Rosenfeld & Campbell, 2011; Belstrom et al., 2012), rheumatoid arthritis (Hitchon et al., 2010; Routsias et al., 2011), preeclampsia (Barak et al., 2007; Lachat et al., 2011), and preterm delivery (Lachat et al., 2011). Porphyromonas gingivalis is able to invade multiple cell types, including human vascular and oral cell lines (Lamont et al., 1995; Deshpande et al., 1999; Progulske-Fox et al., 1999; Dorn et al., 2000). Invasion of host cells is very likely a key virulence factor for this bacterium as it provides (1) a ‘privileged niche’ with access to host protein (nutritional) and iron substrates; (2) a sequestration from the humoral and cellular immune response; and (3) a means for persistence that is essential for a chronic pathogen. As a host-associated pathogen that can cause disease pathology using various adaptive mechanisms, whether forming tissue-disseminating biofilm communities in the oral cavity and other sites in the body or residing in host cells as an intracellular pathogen, P. gingivalis must be able to sense and rapidly respond to changes in its environment. For example, heme limitation in subgingival plaque transitions to heme excess during active disease in vivo. Studies have shown that the expression of virulence factors increases after hemin starvation in P. gingivalis (Genco et al., 1995; Kesavalu et al., 2003; Dashper et al., 2009). Furthermore, studies assessing gene expression profiles of P. gingivalis strain W83 have suggested that there is a regulatory switch that initiates periodontal pathogenesis during mid-log phase under hemin limitation after hemin starvation (Kiyama-Kishikawa et al., 2005).
However, the regulatory systems that P. gingivalis uses to rapidly respond to these and other environmental cues remain unclear.
Regulation of gene expression by small noncoding regulatory RNA (sRNA) is an exciting and rapidly growing field in biology. This mechanism of gene regulation is now recognized as a common mechanism employed by bacteria to rapidly respond to dynamic environmental cues. Furthermore, there is a significant body of work suggesting that sRNA-mediated post-transcriptional regulation is a conserved mechanism among pathogenic bacteria to modulate bacterial virulence and survival (Bardill & Hammer, 2012; Mann et al., 2012; Ortega et al., 2012). Most chromosomally located sRNAs currently characterized in bacteria are regulated by Hfq, a sRNA chaperone conserved across many bacterial species (Beisel & Storz, 2010; Richards & Vanderpool, 2011; Storz et al., 2011; Vanderpool et al., 2011; Bossi et al., 2012; Richter & Backofen, 2012; Shao & Bassler, 2012; Sobrero & Valverde, 2012). However, P. gingivalis is an Hfq-negative bacterium, with as yet uncharacterized sRNA regulatory systems. Moreover, the only sRNA in the Bacteroidetes phylum currently characterized is RteR (Waters & Salyers, 2012).
Materials and methods
Porphyromonas gingivalis strain W83 was cultured overnight in hemin-rich (5 μg mL−1) modified TSB media (TSB with 5 μg mL−1 yeast extract, 0.5 μg mL−1 L-cysteine hydrochloride, 1 μg mL−1 vitamin K1) prior to hemin starvation for 2 days under anaerobic conditions at 37 °C. The cultures were then split and grown in hemin-rich modified TSB or hemin-limiting (0.001 μg mL−1) modified TSB media to mid-log or stationary phase. The cells were collected and processed to extract RNA.
Porphyromonas gingivalis W83 small RNA-enriched extracts (< 200 nt) were prepared using mirVana™ miRNA isolation kit (Ambion, Foster City, CA). As bacterial regulatory sRNAs average 100 nt in size, the resulting small RNA-enriched samples greatly reduce the complexity of the sample and minimize the ‘noise’ of mRNA expression, thus simplifying both microarray and sequencing readout. The improved resolution of this method was verified by the ability to distinguish two P. gingivalis tRNAs found in the same intergenic sequence (IGS) separated by only 30 bp.
Microarray cDNA library construction and analysis
Microarray analysis was performed on libraries constructed from three of the four culture conditions: 2-day hemin-starved bacterial cells subsequently cultured to mid-log or stationary phase in hemin-rich media, or subsequently cultured to stationary phase in hemin-limiting media. DNase-treated small RNA from each condition was C-tailed at the 3′ end using poly(A) polymerase (TaKaRa Bio Inc., Shiga, Japan). The first strand of the cDNA was synthesized using Superscript III reverse transcriptase (Invitrogen, Carlsbad, CA) and a custom oligonucleotide containing a 3′ oligo G tail, primer (P1) sequence, and a NotI site at the 5′ end. A double-stranded DNA linker (L/P2) was ligated to the double-stranded RNA/cDNA molecules, which were then digested with NotI restriction endonuclease to remove linkers that had ligated to the 3′ ends. P1 and P2 primers and Phusion high-fidelity PCR Master Mix (New England Biolabs, Ipswich, MA) were then used to generate double-stranded cDNA to probe custom-designed, high-density, whole-genome oligonucleotide arrays synthesized by NimbleGen Systems (Roche NimbleGen Inc., Madison, WI). Oligonucleotide sequences are provided in the supplement (Supporting Information, Table S1). The arrays contained 65–75 mer oligonucleotide probes, synthesized in duplicate, on glass slides using a maskless array synthesizer. Starting at bp 1 on the P. gingivalis W83 genome (GenBank accession: AE015924), 180 071 probes overlapping every 10–13 bp were generated to provide full coverage of the 2.34 Mb genome.
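As a rough consistency check on the tiling design described above, the average spacing between probe start positions can be estimated from the reported genome size and probe count. The sketch below is illustrative only: it uses the rounded 2.34 Mb figure from the text, assumes that the 180 071 probes are unique tiling positions, and the fixed-step layout function `probe_starts` is a hypothetical helper, not the design procedure used by NimbleGen.

```python
# Rough consistency check of the tiling design (illustrative sketch only).
GENOME_BP = 2_340_000  # 2.34 Mb, the rounded value given in the text
N_PROBES = 180_071     # number of probes reported for the array


def mean_probe_step(genome_bp: int, n_probes: int) -> float:
    """Average distance (bp) between consecutive probe start positions."""
    return genome_bp / n_probes


def probe_starts(genome_bp: int, step: int, probe_len: int) -> list:
    """1-based start positions for probes tiled at a fixed step.

    Hypothetical layout: the real array used a variable 10-13 bp step
    and 65-75 mer probes, which this helper does not reproduce.
    """
    # The last probe must end at or before the final base of the genome.
    return list(range(1, genome_bp - probe_len + 2, step))


step = mean_probe_step(GENOME_BP, N_PROBES)
print(f"mean step between probe starts: {step:.1f} bp")
```

Under these assumptions the mean step is close to 13 bp, which agrees with the stated 10–13 bp spacing.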
Illumina cDNA library construction and analysis
To obtain sequence information of the small RNA transcripts while confirming our microarray data, we used a strand-specific method to sequence cDNA libraries of small RNA-enriched P. gingivalis W83 transcripts using Illumina's high-throughput sequencing technology. DNase-treated small RNA-enriched samples from each condition (n = 2) were used to create eight cDNA libraries using various combinations of 16 custom primers, each containing a unique 6-bp barcode for paired-end Illumina sequencing. The mean size of the P. gingivalis RNA molecules in the DNase-treated small RNA-enriched samples was found to be c. 70 nt (Bioanalyzer; Agilent Technologies, Santa Clara, CA). Bacterial rRNA transcripts are processed to yield 5′ monophosphate rRNAs, while bacterial mRNA has a 5′ triphosphate cap and a relatively short half-life. To prepare the full-length cDNA libraries of P. gingivalis sRNA for paired-end Illumina sequencing, DNase-treated sRNA-enriched samples were treated with Terminator 5′-Phosphate-dependent exonuclease (TIP; Epicentre Biotechnologies, Madison, WI) to selectively remove ribosomal RNA. TIP will also digest 5′ phosphorylated ssDNA and dsDNA. The samples were then treated with Tobacco Acid Pyrophosphatase (Epicentre Biotechnologies) to remove 5′ caps and convert triphosphate RNA to 5′ monophosphorylated RNA. A 46-bp custom-designed RNA oligonucleotide adaptor (5′-IlluminaSeq-NNNN-BarCode-3′) was then ligated to the 5′ end of the sRNA samples using T4 RNA ligase (Epicentre Biotechnologies). The sRNA was then C-tailed at the 3′ end using poly(A) polymerase (TaKaRa). The first strand of the cDNA was synthesized using a 59-bp custom DNA oligonucleotide primer (5′-IlluminaSeq-NNNN-Barcode-oligoG(15)-3′) and Superscript III reverse transcriptase (Invitrogen). E. coli RNase H was used to nick/degrade the RNA strand, leaving single-strand cDNA.
PCR was used to amplify the single-strand cDNA samples to make double-strand cDNA libraries using custom primers P1 and P2 and Phusion high-fidelity PCR Master Mix (New England Biolabs) for Illumina sequencing. Oligonucleotide sequences are provided in the supplement (Table S2). The majority of the final cDNA product was observed to be 150–250 nt long using denaturing 5% Urea-PAGE electrophoresis. The quality and average length of the cDNA product was assessed using Bioanalyzer (Agilent) before the eight cDNA libraries were pooled for direct Illumina sequencing. The cDNA pool was loaded on two lanes to obtain sequencing run replicates.
Each cDNA library generated for RNA-seq analysis was designed to include unique 6-bp barcodes, specific to the 5′ or 3′ end of the RNA transcripts, to determine the strand specificity of each mapped small RNA and to differentiate libraries during data analysis (Table S3). The raw sequencing read data were converted to fastq format files and sorted by lane, 3′ or 5′ end, and barcode. The sequencing reads were mapped against the genomic reference sequences using the software maq (Li et al., 2008). The maq-mapped and mapview-generated text output files were converted into SQLite database files, and the data were mined in Base (Mac OS X application). Because of the length of the adaptors and the short read length (35 cycles), none of the 3′ reads could be stringently mapped. However, the 5′ reads could be mapped. We identified the 5′ end sequence of putative sRNA and generated sRNA expression profiles in response to growth phase and/or hemin availability after hemin starvation (Table 1, Table S3).
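The barcode-sorting step described above can be sketched as follows. This is a minimal illustration, not the scripts used in this study. It assumes that each read begins with the four random bases (NNNN) of the adaptor followed by the 6-bp barcode, and the barcode-to-library table passed to `demultiplex` is hypothetical (the actual barcodes are listed in Table S3).

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

RANDOM_LEN = 4   # NNNN spacer of the adaptor (assumed to be sequenced first)
BARCODE_LEN = 6  # 6-bp library barcode


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header[1:], seq, qual


def demultiplex(
    records: Iterable[Tuple[str, str, str]],
    barcodes: Dict[str, str],
) -> Dict[str, List[Tuple[str, str]]]:
    """Group reads by library and trim the NNNN spacer and barcode.

    Reads whose barcode is not in the table go to 'unassigned'.
    """
    bins: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for name, seq, _qual in records:
        tag = seq[RANDOM_LEN:RANDOM_LEN + BARCODE_LEN]
        library = barcodes.get(tag, "unassigned")
        insert = seq[RANDOM_LEN + BARCODE_LEN:]  # remaining transcript-derived bases
        bins[library].append((name, insert))
    return dict(bins)
```

With 35-cycle reads, trimming 10 bases under this assumed layout would leave about 25 nt of transcript sequence per read for mapping.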
Table 1. List of expressed small transcripts mapped to IGS (> 100 reads)
Sequence relative to the Porphyromonas gingivalis W83 reference genome. If encoded on the negative strand, the 5′ end sequence of the sRNA transcript is the complement of the sequence listed.
Extremely high reads may possibly result from processed product of a single transcript of a highly expressed multigene operon.
Biotinylated probes to sRNA 42 and 101 were prepared from PCR fragments generated from genomic P. gingivalis W83 using the primers described in Table S4. The PCR products were biotinylated using a BrightStar Psoralen-Biotin Nonisotopic Labeling Kit (Ambion). Small RNA-enriched P. gingivalis RNA was fractionated in 10% TBE-urea acrylamide gels and transferred by electro-blotting onto BrightStar-Plus Nylon membranes (Ambion). Probe hybridization onto the blots was carried out in UltraHyb (Ambion) medium overnight at 42 °C. Blots were washed using NorthMax Low Stringency Buffer (Ambion) at room temperature (2 × 15 min) and with NorthMax High Stringency Buffer (Ambion) at 42 °C (2 × 30 min). Northern blots were developed using alkaline phosphatase and streptavidin according to manufacturer's protocols (BrightStar BioDetect, Ambion).
Results and discussion
Prokaryotic post-transcriptional regulation by and large employs mechanisms that function to stabilize or destabilize mRNA. Regulation of gene expression by small noncoding regulatory RNAs (sRNA) is an emerging paradigm in microbiology. The current understanding of sRNA regulatory systems, described in several recent reviews (Vogel & Wagner, 2007; Sharma & Vogel, 2009; Beisel & Storz, 2010; Richards & Vanderpool, 2011; Storz et al., 2011; Vanderpool et al., 2011; Bossi et al., 2012; Richter & Backofen, 2012; Shao & Bassler, 2012; Sobrero & Valverde, 2012), holds that most bacterial sRNAs (like eukaryotic micro-RNA) regulate expression at the post-transcriptional level, usually by binding multiple target mRNA(s). Typically, the size of bacterial sRNAs ranges from 50 nt to 250 nt. Bacterial sRNA transcripts typically contain their own promoters and Rho-independent terminators. Cis-encoded sRNAs are fully complementary antisense regulatory RNAs encoded within the target mRNA's open reading frame (ORF), leader, or trailer sequence. Trans-encoded (intergenic) sRNAs bind to target mRNA(s) by imperfect base pairing, dependent on the sRNA's secondary structure. A single sRNA may repress and/or activate translation. Most chromosomally located sRNAs currently characterized in bacteria are trans-encoded, and their mRNA targets cannot be determined simply by sequence analysis or relative loci within the genome.
Porphyromonas gingivalis encounters many different environmental conditions during colonization and growth within the host that require integration of sensory input and coordination of complex mechanisms of gene regulation. We hypothesized that P. gingivalis employs sRNA regulatory elements to rapidly respond to environmental cues. We used NimbleGen microarray analysis to identify transcripts found in cDNA libraries generated from small RNA-enriched P. gingivalis W83 samples expressed in response to growth phase and/or hemin availability, after 2 days of hemin starvation. These conditions were selected based on published studies that suggest that periodontal pathogenesis is initiated during mid-log phase under hemin limitation after hemin starvation (Genco et al., 1995; Kesavalu et al., 2003; Kiyama-Kishikawa et al., 2005; Dashper et al., 2009). Following hybridization, microarrays were scanned and the median signal intensity for each probe on the array was calculated by NimbleGen. A region was considered to contain a small transcript if four or more probes in succession were > 10-fold above background (signal intensity > 10 000). This corresponds to RNA of about 75 nt or larger. Each small transcript was analyzed for the presence of a transcription terminator based on the prediction program TransTermHP (http://transterm.cbcb.umd.edu/). Transcripts containing a predicted terminator were analyzed to determine the location on the genome and the presence of an ORF, and their length was estimated based on the first and last positive probe. Only those transcripts that were larger than 70 nt, contained a predicted terminator, and did not contain an ORF on the same strand were considered candidate sRNA. Eight percent of the 180 071 probes covering the entire 2.34 Mbp P. gingivalis W83 genome were highly expressed. Thirty-seven putative sRNAs whose terminators could be predicted were identified and are summarized as follows (Table S5): thirty mapped to IGSs; three mapped to IGSs with slight ORF 5′ overlap on the opposite strand based on terminator prediction; and four mapped to ORFs on the opposite strand based on terminator prediction. All but thirteen putative sRNAs were regulated in response to limited hemin availability after hemin starvation. Those transcripts identified by microarray that also had significant counts detected by RNA-seq analysis are indicated in Table S5.
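The probe-run criterion described above (four or more successive probes with signal intensity > 10 000) can be sketched as a simple scan over the ordered probe signals. This is a minimal illustration, not the analysis pipeline used here. The function name and its parameters are hypothetical, and the sketch covers only the run-calling step, not terminator prediction or ORF filtering.

```python
from typing import List, Sequence, Tuple


def call_transcript_regions(
    signals: Sequence[float],
    threshold: float = 10_000.0,
    min_run: int = 4,
) -> List[Tuple[int, int]]:
    """Find runs of consecutive probes whose signal exceeds the threshold.

    `signals` holds median probe intensities in genome order for one strand.
    Returns (first_index, last_index) pairs, 0-based and inclusive, for every
    run of at least `min_run` successive positive probes.
    """
    regions: List[Tuple[int, int]] = []
    run_start = None
    for i, value in enumerate(signals):
        if value > threshold:
            if run_start is None:
                run_start = i  # a new run of positive probes begins
        else:
            if run_start is not None and i - run_start >= min_run:
                regions.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last probe.
    if run_start is not None and len(signals) - run_start >= min_run:
        regions.append((run_start, len(signals) - 1))
    return regions
```

The first and last indices of each region correspond to the first and last positive probe, from which the transcript length was estimated.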
Mobile genetic elements contribute to the genetic plasticity of bacterial species (Tribble et al., 2007; Naito et al., 2011; Watanabe et al., 2013). There are numerous short and long mobile elements, including complete and degenerate (truncated) insertion sequences (IS) and transposons, found throughout the genome of P. gingivalis. Studies to date have shown that P. gingivalis has a high degree of genetic variation among strains. For example, conjugal transfer is proposed to be the mechanism by which P. gingivalis undergoes allele exchange, contributing to genetic variation (Tribble et al., 2007; Naito et al., 2011). Using RNA-seq analysis, we found that all five ISPg3 and all ten ISPg4 transposase ORFs encoded highly expressed cis-encoded antisense transcripts to the 5′ untranslated region of these genes with a 2 nt overlap of the coding region. We also identified cis-encoded antisense transcripts to both ORFs (PG0827, PG1446) encoding putative MatE family multidrug efflux pumps (Fig. 1). These genes were located within regions flanked by integrases. Sequence analyses indicate that these regions are located within large multigene transposons. We also identified highly expressed transcripts of CRISPR regulatory sRNA trans-encoded in the IGS upstream of CRISPR-associated (CAS) gene arrays. A recent study analyzing the spacer content among the genomes of 60 P. gingivalis isolates has indicated that IS-mediated transposition may be limited by CRISPR interference (Watanabe et al., 2013). Spacer analysis showed a high degree of similarity to P. gingivalis genome sequence, most of which matched sequences within ORFs. We found that inverted repeats of ISPg1-type transposases appear associated with CRISPRs in W83. The identification of these highly expressed sRNA transcripts indicates another mechanism that P. gingivalis employs to limit genetic exchange among a population with a genotype clearly predisposed to a high degree of genetic exchange. CRISPR expression was detected only during logarithmic growth and not during stationary phase, indicating that P. gingivalis uses this mechanism to limit genetic exchange only during active growth. In contrast, the cis-encoded antisense transcripts to the ISPg3 and ISPg4 UTRs and the MatE efflux pump ORFs were expressed under all conditions assayed.
Overall, we identified 186 sites containing 5′ end sequence reads, other than tRNA or rRNA, with significant counts (> 5) that mapped to IGSs or were clearly antisense to identified ORFs on the P. gingivalis W83 genome. Transcripts from 49% of these sites (not encoding tRNA/rRNA) were highly expressed (> 10 counts). Of the sequenced reads within the pooled cDNA, consisting of six libraries, 85% of the mapped reads were to tRNA/rRNA for all culture conditions (e.g. 107773/141115 of library containing 5′ reads with barcode #1; Supporting Information). Under these parameters, read depth of the sample was limited; thus, > 10 counts of the same transcript were considered highly expressed. The expression profile was determined by normalizing the sequencing read counts of each analyzed transcript to the relative tRNA/rRNA counts within each library. Twenty-three putative sRNAs were clearly regulated in response to growth phase. Twenty-two putative sRNAs located within IGSs were clearly up-regulated in response to hemin after hemin starvation (Table S6). Northern analysis was performed on several of the small RNA transcripts that had high microarray signal intensity. For example, a transcript (sRNA 101) estimated to be 118 nt long, encoded on the negative strand of the IGS between PG2089 and PG2090 (both encoded on the positive strand), was detected by microarray. This transcript had a maximal count of 88 5′ end sequence reads (ATAAGCCGCACTGTTAGATCGGGG) as detected by RNA-seq analysis. Northern analysis confirmed the expression of this IGS-encoded transcript (Fig. 2). In contrast, microarray analysis detected a transcript mapping to the positive strand of the IGS between PG0715 and PG0717 (sRNA 42) that had high signal intensity, while none was detected by RNA-seq or by Northern analysis (Fig. 2). We believe RNA-seq analysis is an improved method to determine relative expression profiles of microbial sRNA compared with microarray analysis.
We propose that an additional alteration in the methodology to reduce tRNA in library generation would improve sRNA transcript identification by RNA-seq analysis, but may eliminate the ability to generate comparable relative expression profiles normalized to tRNA read counts.
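The normalization of transcript read counts to the tRNA/rRNA counts within each library can be illustrated with a short sketch. The function below is a hypothetical helper written for illustration and is not the analysis code used in this study; library names and counts in any call to it would be placeholders rather than reported data.

```python
from typing import Dict


def normalize_to_reference(
    counts: Dict[str, int],
    reference_counts: Dict[str, int],
) -> Dict[str, float]:
    """Express a transcript's read count relative to the tRNA/rRNA reads.

    `counts` maps library name to the 5' end read count of one transcript;
    `reference_counts` maps library name to the tRNA/rRNA read count of the
    same library. The ratio makes libraries of different depth comparable.
    """
    normalized: Dict[str, float] = {}
    for library, count in counts.items():
        reference = reference_counts[library]
        if reference <= 0:
            raise ValueError(f"no tRNA/rRNA reads recorded for library {library!r}")
        normalized[library] = count / reference
    return normalized
```

Because the ratio depends on the tRNA/rRNA reads retained in each library, depleting tRNA during library preparation would remove the denominator used here, which is the trade-off noted above.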
By employing RNA-seq analysis to identify and characterize the expression profile of small RNA transcripts expressed in P. gingivalis, we have generated a list of possible regulatory transcripts. This technology will now allow further dissection of the factors, whether virulence related or associated with other survival pathways, controlling gene expression in P. gingivalis in response to environmental signals (e.g. hemin availability or growth phase).
The data reported in this manuscript were generated from an NIH grant-funded study (#R21 DE01986).