Summary

Natural genetic transformation in Streptococcus pneumoniae is controlled in part by a quorum-sensing system mediated by a peptide pheromone called competence-stimulating peptide (CSP), which acts to coordinate transient activation of genes required for competence. To characterize the transcriptional response and regulatory events occurring when cells are exposed to competence pheromone, we constructed DNA microarrays and analysed the temporal expression profiles of 1817 among the 2129 unique predicted open reading frames present in the S. pneumoniae TIGR4 genome (84%). After CSP stimulation, responsive genes exhibited four temporally distinct expression profiles: early, late and delayed gene induction, and gene repression. At least eight early genes participate in competence regulation including comX, which encodes an alternative sigma factor. Late genes were dependent on ComX for CSP-induced expression, many playing important roles in transformation. Genes in the delayed class (third temporal wave) appear to be stress related. Genes repressed during the CSP response include ribosomal protein loci and other genes involved in protein synthesis. This study increased the number of identified CSP-responsive genes from approximately 40 to 188. Given the relatively large number of induced genes (6% of the genome), it was of interest to determine which genes provide functions essential to transformation. Many of the induced loci were subjected to gene disruption mutagenesis, allowing us to establish that among 124 CSP-inducible genes, 67 were individually dispensable for transformation, whereas 23 were required for transformation.


Introduction

Bacteria commonly utilize multiple sigma factors, together with a single RNA polymerase core enzyme, to regulate differential transcription of distinct sets of genes. The majority of operons in a genome are controlled by a single sigma factor which recognizes one class of promoters, whereas alternative sigma factors allow activation of specialized gene sets, e.g. heat-shock genes and sporulation genes, in response to specific environmental signals. Among the several completely sequenced streptococcal genomes, the number of recognizable alternative sigma factors is small. Recently, a new alternative sigma factor, ComX, was identified in S. pyogenes and S. pneumoniae (Lee and Morrison, 1999; Opdyke et al., 2001; Luo and Morrison, 2003). Although apparently orthologous genes are found in L. lactis, S. mutans and S. gordonii, the biological role of these proteins is uncertain. In S. mutans, for example, comX mutants form atypical biofilms (Li et al., 2002). In S. gordonii (Challis) corresponding mutants are defective in production of the STH1 and STH2 bacteriocins and are also deficient in competence development (N. Heng, pers. comm.). Similarly, comX mutants of S. pneumoniae and S. mutans are defective in competence for genetic transformation (Lee and Morrison, 1999; Li et al., 2002). Finally, Wood and Buttaro (pers. comm.) have obtained evidence that comX is expressed in stationary phase in S. pyogenes, but no phenotype has yet been associated with a comX mutation in this species.

The best studied context for comX-dependent gene expression is during competence for genetic transformation in S. pneumoniae. In this species, competence develops for a brief period during exponential growth, under coordination of a quorum-sensing circuit that employs a 17-aa peptide pheromone (CSP) as a cell–cell signalling molecule (Håvarstein et al., 1995) that causes a transient global shift in the gene transcription and protein synthesis patterns. The response to a sudden dose of pheromone is synchronized, in partial recapitulation of the events occurring during endogenous competence induction, with maximal competence at 20 min and a decline nearly to zero by 30–40 min (Morrison and Baker, 1979; Morrison et al., 1980; Alloing et al., 1998; Peterson et al., 2000; Rimini et al., 2000). In two overlapping waves of transcription, five genes of the quorum sensing circuit are induced early, followed 3 min later by induction of at least 16 ‘late’ competence genes. These two waves of gene expression are linked through the activity of comX, which is regulated as an early gene by the quorum sensing circuit but is required for induction of many late genes previously identified in this class, including many known to be required for DNA processing (Lee and Morrison, 1999; Peterson et al., 2000). Both populations of mRNA decay in parallel, dropping nearly to basal levels by 25–35 min after the initiating exposure to CSP.

Before this report, approximately 40 genes activated during the response to CSP were identified. In addition to mutagenesis studies, bioinformatic approaches were reported, searching the genome for an 8-bp consensus shared by the promoters of several late genes. Empirical tests of such candidates using DNA microarrays or reporter fusions identified novel induced genes and promoters (Campbell et al., 1998; Peterson et al., 2000). A promoter trap screen identified 12 CSP responsive promoters (Bartilson et al., 2001). Finally, an array of overlapping cloned DNA fragments covering 4301 genomic segments was used together with probes derived from RNA samples extracted at 5-min intervals during the first 30 min of the CSP response. Several (18) additional genes were found to be induced in patterns consistent with those of known early or late genes (Rimini et al., 2000). Two additional response patterns were reported for a few other genes: three induced genes reached maximal expression after the late genes, whereas seven others were repressed during competence (Rimini et al., 2000). As each of these surveys failed to identify many of the CSP responsive genes reported in the others, it appears that none of the surveys was carried close to saturation. Here, we report the results of a new DNA microarray survey of CSP-responsive loci carried out on a genome-wide scale, which confirms nearly all of the reports mentioned above, extends the roster of ComX-dependent genes, and indicates that the number of CSP responsive genes is much larger than previously realized.

Results

We used a DNA microarray containing PCR products designed to target 95% of the non-repetitive ORFs in the sequenced genome of strain TIGR4 (Tettelin et al., 2001), as described in Experimental procedures, to determine whether non-competent wild-type and comX mutant strains were transcriptionally comparable. Direct hybridization of a mixture of differentially labelled cDNAs to DNA microarrays revealed no discernable expression differences between log phase cultures of the two strains (Fig. 1). This result is consistent with the finding that comX is expressed at a very low or zero level unless induced by CSP (Luo and Morrison, 2003), and supports the view that ComX participates in transcription only under special conditions.

Figure 1. Competence pheromone-dependent expression of selected S. pneumoniae genes. A. mRNA levels measured with the 15-fold-redundant microarray SPMAv2 (details of data available in Supplementary material Tables S7 and S8). Left, induction of wild type with CSP, averaged over four replicate hybridizations; right, induction of an isogenic comX double deletion mutant with CSP, averaged for two replicate hybridizations. Results for 384 genes were ordered vertically by similarity of temporal patterns (TIGR Multi-Experiment Viewer, MEV); most temporally invariant genes are omitted. Colour codes represent RNA levels at indicated times after administering CSP, over the range from 0.12 (bright green) to 8 (bright red). Reference RNA was a mixture of all samples from the wild type culture. Expansion panels show expression patterns for representative genes. B. Comparison of gene expression in isogenic wild type and comX deficient strains before CSP treatment. Microarray SPMAv3 was used in five replicate hybridizations (Supplementary material Table S10). C. Comparison of RNA level changes estimated using microarrays with determinations made by real-time PCR, as described in Experimental procedures. D. Kinetics of competence induction in culture sampled for RNA preparations analysed in A. Transformation was determined after 2.0-min exposures of culture samples to donor DNA.

To obtain a more complete description of the regulatory events accompanying development of competence, the one condition in which comX is known to be expressed in S. pneumoniae, and to define further the role of ComX in the response to the competence pheromone, we used DNA microarrays to identify genes displaying distinctive temporal expression patterns during competence induction. RNA was prepared from cultures of the wild type and of an isogenic comX deficient strain, CPM8, harvested at short intervals after treatment with CSP. Determination of relative mRNA levels for individual genes was carried out using an internal hybridization reference cDNA prepared from an equimass mixture of all kinetic RNA samples. This reference provides a reliable means for establishing expression ratios for all genes, even those with very low basal expression. Gene expression patterns were compiled from several independent induction experiments and multiple microarray hybridization assays using two distinct microarrays of different design structure, gene coverage and spotting replicates.

The significance of signals obtained from microarray hybridizations was verified in several ways. Direct comparison of the array expression results to quantitative real-time PCR determinations for 20 induced genes revealed a strong correlation of results for the two methods (Fig. 1). For ∼20 other genes, CSP-dependent expression was verified by reporter studies using lacZ transcriptional fusions in vivo. Additionally, concordant patterns for 64 array elements representing 16 different copies of IS1167 (Supplementary material, Table S9) illustrated that replicate hybridizations regularly detected a twofold late induction pattern. To determine the statistical significance of the microarray results, we calculated the coefficient of variation (cv) for expression ratios obtained using SPMAv2 and SPMAv3 (Supplementary material). More than 90% of the calculated ratios resulted in coefficients of variation between 0 and 60%. Finally, we used SAM (Significance Analysis of Microarrays), an algorithm that determines statistical significance of expression profiles, to validate our identification of expression patterns that are distinct from those of genes whose expression is not CSP-regulated.
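As an illustration of the coefficient of variation mentioned above, the following Python sketch computes a cv (standard deviation divided by the mean, as a percentage) for replicate expression ratios of one array element. The function name and the replicate values are hypothetical, and the use of the sample standard deviation is an assumption; the text does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(ratios):
    """Return the cv (in percent) of replicate expression ratios.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    if len(ratios) < 2:
        raise ValueError("need at least two replicate ratios")
    m = mean(ratios)
    if m == 0:
        raise ValueError("mean ratio is zero; cv is undefined")
    return 100.0 * stdev(ratios) / m


# Hypothetical replicate ratios for one array element (not data from the study).
replicates = [7.6, 8.4, 8.0, 8.0]
cv = coefficient_of_variation(replicates)
print(f"cv = {cv:.1f}%")
```

Under this definition, an element whose replicates agree closely, as in the hypothetical example, falls well inside the 0 to 60% range reported for most ratios.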

Expression patterns during response to competence pheromone

Upon sudden exposure to CSP, non-competent cells displayed, after a 10 min lag period, a rapid increase in competence, reaching a maximum at 20 min. This competent state also decayed rapidly, dropping below 10% of maximum by 30 min (Fig. 1). Consistent with previous reports, we observed several distinct temporal gene expression patterns during 20 min of CSP treatment, as illustrated by the clustering of gene expression patterns shown in Fig. 1. The majority of genes produced transcripts varying by less than a factor of two. However, transcription levels for 188 unique genes varied twofold or more, in one of a few specific and consistent patterns. For genes in one class (Early), a low basal signal increased almost immediately upon CSP exposure, reached a maximum between 7.5 min and 10 min, and then declined toward basal levels. In the comX mutant background, these genes were induced similarly, but the sharp decline in mRNA levels was strongly alleviated. The expression pattern of a second class of genes (Late) was generally similar to that of the first, but was distinguished by a lag of ∼5 min before induction and a delay of about 5 min in reaching a maximum. Genes in this class depended on comX for induction of expression. In a third pattern, which we term Delayed, mRNA levels increased continually during the first 15–17 min of CSP treatment; this increase was generally modest, typically four to eightfold overall, and was similar in the comX background. Finally, some genes were transiently repressed during the response to CSP; for these, RNA levels dropped during competence induction, usually two to fourfold, and recovered to pretreatment levels by 20 min. Expression patterns for genes of specific loci typical of the three temporal patterns of induction are displayed and compared with patterns for flanking invariant genes in Fig. 2. 
Corresponding figures for all other CSP-responsive genes characterized here and the complete underlying data sets are available in Supplementary material.
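The temporal classes described above can be approximated by a simple rule based on fold change and time of peak expression. The Python sketch below is illustrative only: the function name, the sampling times, the example profiles and the 10 min and 15 min cut-offs are assumptions chosen to mirror the description in the text. It is not the clustering procedure used in the study, and it ignores comX dependence, which the text also uses to distinguish the classes.

```python
def classify_profile(times, ratios, fold=2.0,
                     early_peak_max=10.0, late_peak_max=15.0):
    """Assign a time course to a temporal class (hypothetical heuristic).

    times  -- minutes after CSP addition, starting at 0
    ratios -- expression ratios at those times
    """
    base = ratios[0]
    if base <= 0:
        raise ValueError("basal ratio must be positive")
    folds = [r / base for r in ratios]
    peak = max(folds)
    if peak < fold:
        # No induction: either transiently repressed or invariant.
        return "repressed" if min(folds) <= 1.0 / fold else "invariant"
    peak_time = times[folds.index(peak)]
    if peak_time <= early_peak_max:
        return "early"
    if peak_time <= late_peak_max:
        return "late"
    return "delayed"


# Hypothetical sampling times and profiles (not data from the study).
times = [0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20]
profiles = {
    "early-like": [1, 4, 12, 30, 28, 15, 8, 4, 2],
    "late-like": [1, 1, 1.2, 4, 16, 32, 30, 12, 5],
    "delayed-like": [1, 1.2, 1.6, 2, 2.6, 3.2, 3.8, 4, 3.9],
    "repressed-like": [1, 0.8, 0.5, 0.3, 0.3, 0.4, 0.6, 0.9, 1.0],
    "invariant-like": [1, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 1.0, 0.9],
}
for name, ratios in profiles.items():
    print(name, "->", classify_profile(times, ratios))
```

A peak-time rule of this kind cannot separate a late gene from a delayed gene whose expression happens to peak near 15 min; in the study, comparison with the comX mutant background provides that additional discrimination.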

Figure 2. Expression profiles and organization of genes at three CSP-induced loci. A. SP2234-SP0001, including early and delayed loci with flanking invariant genes. B. SP0953-59, a late locus. Fluorescence signals averaged from four hybridizations to SPMAv2 are expressed as log2 ratios to signals from reference RNA mixture (Supplementary material Tables S7 and S8). Expression patterns are displayed for isogenic ComX+ and comX mutant backgrounds, as indicated. C and D. Gene organization and expression pattern summary. C, invariant; IE, early; IL, late; ID, delayed. com, combox; DR, ComE direct repeat. Data for characterization of invariant genes SP0001 and SP0953 are in Supplementary material Table S9.

Organization of genes induced or repressed at competence

Nearly all of the CSP inducible genes detected in this study are organized in clusters of adjacent genes that lie in co-transcribed orientations and exhibit similar temporal expression patterns. Clusters of adjacent genes regulated coordinately provide experimental evidence to define presumptive operons. Figure 3 shows the clusters of genes for which such parallel expression patterns were observed, as well as some isolated CSP-induced genes for which independent corroboration of CSP induction is available. For clusters identified for the first time in this study, inclusion in the figure reflects consistent induction patterns in at least two array hybridization experiments. The physical limits of nearly all of these clusters were established by the occurrence of flanking genes with an invariant pattern of expression, as illustrated in Fig. 2. Like the induced genes, many transiently repressed genes are organized into apparent operons (see below). Altogether, these experimental results increased the number of known CSP responsive genes approximately fourfold.

Figure 3. Organization and functional characterization of CSP-induced genes. Genes required for competence for transformation by chromosomal donor DNA, in red, those that are dispensable for competence, in green (Tables 1–3), and essential or otherwise unanalysed genes, in grey. ORF orientation within each cluster is indicated by the direction of the pentagons (not to scale); presumptive combox promoter or direct repeat ComE binding sites from Fig. 4 are marked as ‘c’ or ‘DR.’ Vertical lines indicate location of putative terminators (Supplementary material Table S3). Arrows indicate sense of transcription for individual genes, as determined by reporter fusions in this study or elsewhere (Tables 1 and 2). Green ‘c-boxes’, octamers deleted without compromising competence (Supplementary material Table S2).

Cis-acting elements adjacent to CSP-induced gene clusters

Noting the presence of a conserved octamer, tacgaata, upstream of several late gene loci, Campbell et al. (1998) suggested that this element acts as a promoter. They showed that mutations in and near the octamer associated with the cinArecA operon abolish the induction of these genes at competence, and further demonstrated in one case that the octamer was about 6 bp upstream of an mRNA start site. Supporting this hypothesis, Opdyke et al. (2001) and Luo and Morrison (2003) have shown that ComX-directed polymerase initiates near such sites in vitro. At most of the late gene clusters defined in this study, a similar octamer, referred to as a combox or cinbox, lies upstream of the gene at the 5′ end of the cluster (Figs 3 and 4). This suggests that most late genes are directly regulated by ComX. However, one similar octamer (ttagaata) proposed to be a late gene promoter (Bartilson et al., 2001) may not be active as a late promoter, as it is associated with the early gene SP0018. Several others that appear to be inactive are located within apparent ORFs (described in Supplementary material). These occurrences suggest that a specific sequence context (proximity to other promoter elements or flanking sequences) is required for cis-activity of these elements. It may be significant, for example, that only the active combox elements are associated with both upstream regions rich in pyrimidines and downstream regions rich in purines. Several late genes with no discernible combox element may be regulated only indirectly by ComX, or may reflect an alternative ComX promoter specificity. As for early genes, it is known that ComE binds to a direct repeat (CanTT-16-CanTT), found upstream of three early genes, SP1717, comA and comC. It has been proposed that the protein acts as a transcriptional activator at such sites near sigma-A promoters (Ween et al., 1999). 
A similar direct repeat site was found upstream of most of the newly identified early gene clusters, lending further support to the proposal that this consensus is a site of ComE activity (Fig. 4) and suggesting that many early genes are directly regulated by ComE. Finally, although CSP-induced transcripts have not been analysed in detail, the majority of clusters of induced genes reported here are associated with stem–loop structures at their 3′ ends, most of which were characterized as likely terminator signals by Unniraman et al. (2002) (Fig. 3).
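A scan for the two kinds of cis-acting element discussed above can be sketched with regular expressions. The Python example below is a minimal illustration under stated assumptions: the function name and the demonstration sequence are hypothetical, "CanTT-16-CanTT" is read as two CAnTT pentamers separated by 16 arbitrary bases (the spacing convention is an assumption), only exact matches to the tacgaata octamer are reported, and only the given strand is searched. It does not reproduce the search procedure used in the study.

```python
import re

# Exact combox octamer quoted in the text (Campbell et al., 1998).
COMBOX = re.compile(r"TACGAATA")

# ComE direct repeat, read here as CAnTT, 16 arbitrary bases, CAnTT.
# The lookahead lets overlapping candidate sites be reported.
COME_REPEAT = re.compile(r"(?=(CA[ACGT]TT[ACGT]{16}CA[ACGT]TT))")


def find_sites(seq):
    """Return (combox_starts, repeat_starts) as 0-based positions."""
    seq = seq.upper()
    comboxes = [m.start() for m in COMBOX.finditer(seq)]
    repeats = [m.start() for m in COME_REPEAT.finditer(seq)]
    return comboxes, repeats


# Hypothetical sequence built to contain one site of each kind.
demo = "GG" + "CAGTT" + "A" * 16 + "CATTT" + "GGG" + "TACGAATA" + "CC"
comboxes, repeats = find_sites(demo)
print("combox starts:", comboxes)
print("direct repeat starts:", repeats)
```

As the text notes, a sequence match alone does not establish activity: some matching octamers lie within ORFs and appear inactive, so hits from a scan like this are candidates that still need expression data or reporter assays.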

Figure 4. Putative regulatory sites at early and late CSP-induced loci. A. Alignment of DNA sequences upstream of clusters of early genes. Presumptive canonical promoter −10 sites underlined; bases matching the direct repeat consensus of Ween et al. (1999) in bold. The gene for lcn972 is found only in strain G54. B. Alignment of DNA sequences found upstream of late gene clusters. Distances to the start codon of first ORF are shown. Matches to combox consensus octamer (Campbell et al., 1998) in bold. C. Inactive candidate combox sites, from Peterson et al. (2000). Matches to consensus in bold; locations shown in Supplementary material Fig. S2.

Early genes

Early genes, organized in 13 apparent operons (Table 1), are rich in a few functional classes of proteins: transporters (comAB, SP1110, SP1717 and SP1918), bacteriocin-related genes (SP0530 and SP0545–46) and regulators (comDE, comX and SP1942). Several early genes, including comAB, comCDE and comX, were previously characterized as being strongly induced in response to CSP and as having important functions in competence regulation (Alloing et al., 1998; Lee and Morrison, 1999; Peterson et al., 2000). Twelve genes were added to this class. One early gene, SP0018, has been proposed to be a comX-dependent gene (Bartilson et al., 2001). However, its temporal expression pattern was that typical of a strongly induced early gene and it was also induced in a comX deficient mutant, as is typical of early genes. The adjacent gene, purA (SP0019), followed a parallel early pattern, although a canonical pneumococcal promoter between SP0018 and SP0019 suggests that purA may also be regulated independently of SP0018.

Table 1. Early CSP induced genes.
Gene no. (a) | Orientation (a) | Expression pattern (b), WT | Expression pattern (b), comX | Other induction reports (c) | Competence in the mutant strain (d) | Description of product (a); orthologue in B. subtilis | Other names for gene or locus (e)
SP0014 | v | IE30 | | 5, 7, 9 | NO (5) | Alternate sigma factor | comX1
SP0018 | v | IE12 | IE28 | 9, 11 | NO, NO (9) | Hypothetical protein | comW
SP0019 | v | IE16 | IE34 | 11 | YES | Adenylosuccinate synthetase | purA
SP0042 | v | IE32 | IE133 | 5, 7, 9 | NO* | Transport ATP-binding protein | comA, CPIP782
SP0043 | v | IE32 | IE171 | 8 | NO* | Transport protein ComB | comB
SP0530 | ^ | IE3 | IE4 | | YES | Antibiotic ABC exporter | blpA
SP0545 | v | IE6 | | | YES | Immunity protein | blpY
SP0546 | v | IE4 | | | YES | Protein, fusion | blpZ
SP0547 | v | IE16 | | | | Conserved domain protein |
SP1110 | v | IE8 | | | | Macrolide-efflux protein |
SP1548 | ^ | IE8 | IE11 | | YES | Hypothetical protein |
SP1549 | ^ | IE8 | IE11 | 9 | YES | Polypeptide deformylase | CPIP57
SP1716 | ^ | IE8 | IE14 | 8 | YES | Conserved hypothetical protein; yhaP |
SP1717 | ^ | IE8 | IE17 | 8 | YES | ABC transporter; yhaQ |
SP1918 | ^ | IE8 | | 9 | YES | ABC transporter, ATP-binding protein |
SP1942 | ^ | IE8 | IE10 | | YES | Transcriptional regulator, putative |
SP1943 | ^ | IE8 | IE13 | | YES | Acetyltransferase, GNAT family |
SP1944 | ^ | IE8 | IE14 | | YES | Conserved hypothetical protein |
SP1945 | ^ | IE16 | IE44 | | YES | Hypothetical protein |
SP2006 | ^ | IE32 | | 5, 7, 9 | NO (5) | Alternate sigma factor | comX2, CPIP663
SP2156 | ^ | IE2 | | | | SPFH domain/Band 7 family |
SP2235 | ^ | IE16 | IE54 | 7 | NO | Response regulator | comE
SP2236 | ^ | IE64 | IE62 | 7 | NO | Putative sensor histidine kinase | comD
SP2237 | ^ | IE4 | | 2, 6, 10, 11 | NO (10) | Competence stimulating peptide 2 | comC

Late genes

The largest group of CSP-responsive genes exhibited the late expression pattern (Table 2). This class contains, in eight of its operons, all the non-regulatory genes previously recognized as both induced by CSP and necessary for transformation. Several of these, SP0124, SP0125, SP1266 (dalA), SP1908 (ssbB) and SP1937-SP1941 (cinArecA operon) were reported previously to be CSP inducible and were confirmed as such in this study without alteration to the number of operon members. The late class was enlarged, principally by identifying additional induced genes adjacent to individual genes or sites previously reported to be CSP responsive, but also by discovery of five clusters of induced genes that were not known to be associated with competence or CSP responses. In total, the expression profiles presented here define 81 late genes organized in approximately 21 clusters. However, induction of the corresponding protein products has not yet been extensively surveyed and a few of the genes were transcribed as antisense products. At the cgl locus, where cglA, C, and F were previously shown to be induced in the late pattern, five additional genes (SP2052, SP2050, SP2047, SP2046 and SP2045) were found to be strongly induced, with the same expression pattern. CSP induced expression of SP2047 and SP2046 was confirmed by use of a lacZ reporter in vivo (data not shown). Mutational analysis (below) showed SP2047 to be required for transformation, but deletion of SP2046 and SP2045 had no effect on competence. Several of the other late genes, although dispensable for competence (below), do have predicted functions which might affect cell wall remodelling (pepF, pgdA, lytA). Interestingly, one of the late genes encodes a homologue of the E. coli protein RadA, which has redundant DNA repair and recombination activities that are revealed by recG mutations (Beam et al., 2002). 
The late gene SP1088 is orthologous to a gene of unknown function (radC), formerly thought to have a function in repair and recombination (Lombardo and Rosenberg, 2000).

Table 2. Late CSP induced genes.
Gene no. (a) | Orientation (a) | Expression pattern (b), WT | Expression pattern (b), comX | Other induction reports (c) | Competence in the mutant strain (d) | Description of product (a); orthologue in B. subtilis | Other names for gene or locus (e)
SP0021 | v | IL10 | C | 7 | YES | dUTPase | ccs19
SP0022 | v | IL10 | C | | YES | Conserved hypothetical protein |
SP0023 | v | IL10 | C | | YES | DNA repair protein | radA
SP0024 | v | IL8 | C | | YES | Conserved hypothetical protein |
SP0025 | v | IL4 | C | | YES | Hypothetical protein |
SP0026 | v | IL8 | C | | YES | Hypothetical protein |
SP0029 | ^ | IL4 | | | YES | Hypothetical protein |
SP0030 | ^ | IL32 | | | YES | Competence-induced protein |
SP0031 | ^ | IL16 | | 7 | YES | Hypothetical protein | ccs16
SP0124 | ^ | IL32 | | 8 | YES | GG peptide | orf62
SP0125 | ^ | IL64 | | 7, 8 | YES | GG peptide | ccs1, orf51
SP0200 | v | IL16 | C | 7 | YES | Hypothetical protein | ccs4
SP0201 | v | IL8 | C | | | Hypothetical protein |
SP0782 | v | IL8 | C | | YES | Conserved hypothetical protein |
SP0954 | v | IL32 | C | 3, 5, 7, 8 | NO | Competence protein; comEA | cilE, celA
SP0955 | v | IL64 | C | 4, 5 | NO | Competence protein; comEC | celB
SP0956 | v | IL16 | | | YES | Hypothetical protein |
SP0957 | v | IL16 | C | | YES | ABC transporter, ATP-binding protein |
SP0958 | v | IL16 | C | | YES | Hypothetical protein |
SP0978 | v | IL16 | C | 4, 7, 8, 9 | NO | Competence protein CoiA | CPIP104, coiA
SP0979 | v | IL8 | C | | YES | Oligoendopeptidase F | pepF
SP0980 | v | IL6 | C | | YES | O-methyltransferase |
SP0981 | v | IL4 | C | | YES | Protease maturation protein, putative |
SP1065 | ^ | IL16 | | | | Hypothetical protein |
SP1072 | v | IL8 | C | | | DNA primase | dnaG
SP1073 | v | IL8 | C | 7 | | RNA polymerase sigma-70 factor | rpoD
SP1074 | v | IL8 | C | | | Conserved hypothetical protein |
SP1088 | v | IL32 | C | 8 | YES | DNA repair protein RadC |
SP1089 | ^ | IL32 | C | 8 | YES | Glutamine amidotransferase, class I |
SP1090 | ^ | IL10 | C | | YES | Conserved hypothetical protein |
SP1092 | ^ | IL4 | C | | | Hypothetical protein |
SP1093 | ^ | IL4 | C | | | Hypothetical protein |
SP1094 | ^ | IL4 | C | | | Aminotransferase, class-V |
SP1095 | ^ | IL4 | C | | YES | Ribose-phosphate pyrophosphokinase | prsA
SP1096 | ^ | IL8 | C | 7 | YES | Conserved hypothetical protein | ccs38
SP1097 | v | IL8 | C | | YES | Conserved hypothetical protein |
SP1098 | v | IL8 | C | | | Conserved hypothetical protein |
SP1099 | v | IL8 | C | | | Ribosomal LS pseudouridine synthase |
SP1100 | v | IL4 | C | | | Phosphate acetyltransferase | pta
SP1264 | ^ | IL4 | C | | | Conserved domain protein |
SP1266 | ^ | IL64 | C | 3, 7, 8 | NO | DNA processing protein DprA, putative | dprA, dalA, cilB
SP1478 | ^ | IL8 | C | | YES | Oxidoreductase, aldo/keto reductase family |
SP1479 | ^ | IL16 | C | | YES | Peptidoglycan N-acetylglucosamine deacetylase A | pgdA
SP1480 | ^ | IL8 | C | | YES | Hypothetical protein |
SP1808 | v | IL16 | C | 3, 7 | NO | Type IV prepilin peptidase, putative; comC | cclA, cilC
SP1809 | ^ | IL32 | C | 8 | YES | Transcriptional regulator, HTH_3 family, putative |
SP1810 | ^ | IL32 | C | | YES | Hypothetical protein |
SP1811 | ^ | IL2 | C | | YES | Tryptophan synthase, alpha subunit |
SP1897 | ^ | IL4 | | | | Sugar ABC transporter, binding protein |
SP1908 | ^ | IL64 | | 3, 5, 7, 8 | NO* | Single strand binding protein; ssb | ssbB, cilA
SP1937 | ^ | IL16 | IE4 | 7, 8 | YES | Autolysin | lytA
SP1939 | ^ | IL16 | IE5 | 8 | YES | MATE efflux family protein DinF | dinF
SP1940 | ^ | IL16 | IE5 | 1, 7, 8 | NO | RecA protein | recA
SP1941 | ^ | IL32 | IE10 | 3, 7, 8, 9 | YES | Competence-inducible protein CinA | cinA, CPIP116
SP1980 | ^ | IL8 | C | | YES | cmp-bf-1 DNA binding protein | cbf1
SP1981 | ^ | IL8 | | 7, 9 | YES, YES (9) | Competence-induced protein | CPIP767, ccs50
SP2013 | ^ | IL6 | C | | | Conserved hypothetical protein |
SP2014 | ^ | IL4 | C | | | IS630-Spn1, transposase Orf2 | IS630
SP2015 | ^ | IL4 | C | | | IS630-Spn1, transposase Orf1 | IS630
SP2016 | ^ | IL16 | C | | YES | Nicotinate-nucleotide pyrophosphorylase | nadC
SP2017 | v | IL16 | C | 7 | YES | Membrane protein | ccs12
SP2018 | v | IL4 | | | | Transposase, IS3 family, degenerate |
SP2019 | ^ | IL4 | | 9 | YES (9) | ABC transporter, ATP-binding protein, truncation | CPIP788
SP2045 | ^ | IL8 | | | YES | Conserved hypothetical protein |
SP2046 | ^ | IL45 | C | 11 | YES | Ubiquinone methyltransferase |
SP2047 | ^ | IL64 | C | 11 | NO | Conserved domain protein | cglG
SP2048 | ^ | IL64 | C | 8 | | Conserved hypothetical protein | cglF
SP2049 | ^ | IL64 | C | 4 | | Conserved hypothetical protein | cglE
SP2050 | ^ | IL64 | C | | NO (10) | Competence protein; comGD | cglD
SP2051 | ^ | IL64 | C | 8 | NO (10) | Competence protein; comGC | cglC
SP2052 | ^ | IL64 | C | | NO (10) | Competence protein; comGB | cglB
SP2053 | ^ | IL64 | C | 3, 5, 7, 8, 9 | NO (10) | Competence protein; comGA | cglA, CPIP186, cilD
SP2196 | ^ | IL8 | C | | YES | ABC transporter, ATP-binding protein |
SP2197 | ^ | IL16 | C | | YES | Pyrimidine precursor biosynthesis enzyme |
SP2198 | ^ | IL16 | C | | YES | ABC transporter, permease protein |
SP2199 | ^ | IL4 | C | | YES | Conserved hypothetical protein |
SP2200 | v | IL8 | | | | Hypothetical protein |
SP2201 | ^ | IL64 | C | 7, 8 | YES, NO (8) | Choline binding protein D | cbpD, cbp3, ccs46
SP2206 | ^ | IL32 | C | | YES | Ribosomal subunit interface protein | yfiA
SP2207 | ^ | IL64 | C | 5, 7 | NO | Competence protein, putative; comFB | cflB
SP2208 | ^ | IL128 | C | 5, 8, 9 | NO (10) | Helicase, putative; comFA | cflA, CPIP901

a. Gene number, orientation and product from Tettelin et al. (2001); v = clockwise and ^ = counter clockwise.
b. Change in expression in wild type (SPMAv3) or in comX mutant (SPMAv2) as listed in Supplementary material Table S1. IEn, ILn: early or late n-fold induction; C = invariant pattern. Individual microarray datasets are available in Supplementary material Tables S7, S8 and S9.
c. Refer to Table 1 legend for respective references.
d. Competence data from this work (Table S2): NO, less than 1% of normal competence; YES, more than 70% of normal competence; YES/NO with an associated number, competence data from the source cited.
e. ccs(n) and CPIP(n) locus names were assigned in references 7 and 9 respectively.
∼, Genes or mutants not analysed; *, mutant with 70% reduction in competence.

The expression of rpoD (SP1073, sigma-A) is induced 10-fold in a manner dependent on ComX (Peterson et al., 2000). Both flanking genes, SP1072 (dnaG) and SP1074, shared this pattern with rpoD. Failure to identify a combox upstream of these genes suggests that the induction may reflect an indirect response to the presence of the alternative sigma factor, ComX. Uniquely among the late genes, the cinArecA operon exhibited a reduced but non-zero degree of induction by CSP in the comX background (10-fold in comX versus 60-fold in wild type; Supplementary material Fig. S1); this induction parallels the pattern of early genes in the comX background and may represent read-through by RNA polymerase transcribing the adjacent early genes, SP1942-5.

Antisense transcription of late genes

Polymerase chain reaction amplicon-based microarrays do not reveal the polarity of RNA transcripts. Because many operons lack a strong terminator, induction of one operon in a densely organized bacterial genome can give rise to an induction signal for downstream genes and independent data are necessary to evaluate the biological meaning of each signal. In the special case of antiparallel genes, antisense transcription may reveal instances wherein an observed hybridization signal is not indicative of gene expression. The orientation of CSP-induced transcription of many genes has been verified previously using lacZ reporter fusions. To evaluate the sense of induced transcription for additional sites, new reporter insertion mutations were constructed. Near SP2019, where a CSP-inducible promoter of known orientation was recovered in a promoter trap vector (Bartilson et al., 2001), six genes, SP2013–SP2018, were expressed in the late pattern. We found that at SP2017, which lies in a reversed orientation relative to neighbouring genes, the CSP-induced mRNA was copied from the non-template strand (Fig. 3; data not shown). A dozen genes downstream of SP1088 may represent a larger case of antisense transcription, as 11 genes in this region are induced in parallel with SP1088 but seven of them are in reversed orientation. This possibility is supported by our finding that SP1089 is read as an antisense CSP induced message (data not shown). The late gene cclA (SP1808) is an orthologue of the Bacillus comC gene and type IV prepilin peptidases. It abuts head-to-head four genes (SP1809–SP1812) of opposing polarity that are also induced up to 32-fold by CSP in parallel with SP1808. As suggested previously (Rimini et al., 2000), this also may reflect reverse transcription from the cilC (cclA) promoter. A small late gene (SP2200) adjacent to SP2201 may represent a similar case of reverse transcription. 
In view of the organization of these late gene loci, a minimum of two and a maximum of 14 of the late genes may be transcribed as antisense RNA. These instances may reveal interesting cases of antisense regulation, but more likely reflect mistaken ORF identifications or simply an absence of efficient terminators downstream of some ComX-dependent gene clusters.
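The orientation argument for the region downstream of SP1088 can be checked directly against the orientations listed in Table 2. The short Python sketch below is illustrative: the dictionary simply transcribes the v (clockwise) and ^ (counter clockwise) symbols given in Table 2 for SP1088 to SP1100, and the variable names are hypothetical. It counts how many of the co-induced downstream genes lie in the opposite orientation to SP1088 and are therefore candidates for antisense transcription.

```python
# Orientations as listed in Table 2 (v = clockwise, ^ = counter clockwise).
cluster = {
    "SP1088": "v",
    "SP1089": "^", "SP1090": "^", "SP1092": "^", "SP1093": "^",
    "SP1094": "^", "SP1095": "^", "SP1096": "^",
    "SP1097": "v", "SP1098": "v", "SP1099": "v", "SP1100": "v",
}

lead = "SP1088"
downstream = [gene for gene in cluster if gene != lead]

# Genes whose orientation differs from that of the lead gene.
reversed_genes = [g for g in downstream if cluster[g] != cluster[lead]]

print(f"{len(reversed_genes)} of {len(downstream)} downstream genes "
      f"are in reversed orientation: {', '.join(reversed_genes)}")
```

The count obtained this way is 7 of 11, in agreement with the numbers quoted in the text. Orientation alone only identifies candidates; as noted above, the strand actually transcribed was established with reporter fusions.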

Delayed genes

Nineteen genes exhibited the delayed expression pattern, in which mRNA continued to accumulate from the first minute of pheromone treatment until after expression peaked for both the early and late classes of genes. Expression of some of these genes also differed from the early and late patterns in being largely unaffected by inactivation of comX. This class includes the entire dnaK operon and eight additional operons (Table 3). For four operons (SP0338, SP0785-87, SP0798-99 and SP1380), induced expression in the comX background was not determined. The delayed expression pattern was reported by Rimini et al. (2000) for three genes in strain G54: dnaK (SP0517), ypjC (SP1111) and ‘orf190.’ We did not observe induction of SP1111 and orf190 is unique to G54.

Table 3. Delayed CSP-induced genes.

| Gene no. (a) | Orientation (a) | Expression pattern (b): WT | Expression pattern (b): comX | Other induction reports (c) | Competence in the mutant strain (d) | Description of product (a) |
|---|---|---|---|---|---|---|
| SP0338 | v | ID4 | | | YES | ATP-dependent Clp protease, ATP-binding subunit; ClpL |
| SP0515 | v | ID3 | ID4 | | | Heat shock transcription repressor HrcA |
| SP0516 | v | ID3 | ID4 | | | Heat shock protein GrpE |
| SP0517 | v | ID3 | ID3 | 1, 2 | | DnaK protein |
| SP0519 | v | ID2 | ID3 | | | DnaJ protein |
| SP0785 | v | ID2 | | | | Conserved hypothetical protein |
| SP0786 | v | ID2 | ∼ | | ∼ | ABC transporter, ATP-binding protein |
| SP0787 | v | ID2 | ∼ | | ∼ | ABC transporter, ATP-binding protein |
| SP0798 | v | ID2 | ∼ | | ∼ | DNA-binding response regulator CiaR |
| SP0799 | v | ID2 | | | | Sensor histidine kinase CiaH |
| SP1027 | v | ID6 | ID6 | | YES | Conserved hypothetical protein |
| SP1029 | v | ID4 | ID4 | | YES | RNA methyltransferase, TrmA family |
| SP1380 | ^ | ID4 | | | | Hypothetical protein |
| SP1714 | v | ID6 | C | | YES | Transcriptional regulator, GntR family |
| SP1715 | v | ID6 | C | | YES | ABC transporter, ATP-binding protein |
| SP1906 | ^ | ID4 | ID3 | | | GroEL, chaperonin, 60 kDa |
| SP1907 | ^ | ID4 | ID3 | | | GroES, chaperonin, 10 kDa |
| SP2239 | v | ID8 | ID10 | | YES | Serine protease |
| SP2240 | v | ID8 | ID9 | | YES | SpspoJ protein |

a. Gene number, orientation and product from Tettelin et al. (2001); v = clockwise and ^ = counter clockwise.

b. Change in expression in wild type (SPMAv3) or in comX mutant (SPMAv2) as listed in Supplementary material, Table S1. Individual microarray datasets are available in Supplementary material, Tables S7, S8 and S9. IDn = Delayed n-fold induction; C = Invariant pattern.

c. References cited: 1, Peterson et al. (2000); 2, Rimini et al. (2000).

d. Competence data from this work (Supplementary material, Table S2): YES, more than 70% of normal competence; ∼, Genes or mutants not analysed.

One delayed operon, the two-component regulatory system ciaRH (SP0798 and SP0799), is known to have a strong effect on competence regulation (Guenzi et al., 1994; Zahner et al., 1996; Giammarinaro et al., 1999; Sebert et al., 2002). As most ciaR disruption mutations stimulate competence, it may be that the delayed induction of ciaR contributes to establishing the CSP-refractory state that follows competence (Echenique et al., 2000; Martin et al., 2000; Mascher et al., 2003). Four delayed operons included chaperones or proteases. The DnaK heat-shock operon includes genes for the DnaK-DnaJ-GrpE chaperone machine (Hartl, 1996) and HrcA, the heat-shock repressor. A second delayed operon encodes the GroEL/GroES chaperone and a third contains the gene for the chaperone protein HtrA (SP2239) (Sebert et al., 2002). Fourth, ClpL (SP0338), a Clp protease subunit and chaperone, was also induced as a delayed gene. The role of these products in competence may be complex, as the chaperone machinery is known to perform several functions including refolding misfolded proteins (Hartl, 1996), folding of nascent peptides (Teter et al., 1999) and controlling the degradation of some proteins (Georgopoulos and Welch, 1993). Two other delayed gene clusters included ABC transporters. Uniquely among genes analysed, SP1714-15 were induced about 10-fold in the delayed gene pattern in wild type, but were induced very little in the comX mutant. Although the mechanisms of regulation of delayed genes have not yet been identified, the prominence of chaperones and proteases among them suggests that their induction may be related to the gross shift in protein synthesis associated with competence development.

Repressed genes

Whereas message levels for many genes increased transiently during the response to CSP, they decreased for several specific sets of genes (Table 4). Among seven genes previously reported to behave this way in strain G54 (Rimini et al., 2000), four exhibited invariant expression in CP1250. The remaining three are adjacent ribosomal protein genes in a cluster of 24 genes that all exhibited identical profiles in this study: mRNA decreased abruptly but briefly to about half the precompetence levels at 12.5 or 15 min after CSP treatment (a pattern we designate RI) (Fig. 5). Several genes displayed a distinct repressed pattern (RII) in which message levels appeared to decrease continuously from the start of CSP treatment, and recovered to basal levels only after 15 min. Many of the 18 RII genes appear to be involved in carbohydrate or amino acid metabolism. In a distinct class (RIII) were two genes, SP0285 and SP2026, whose RNA levels decreased immediately on CSP exposure and remained low for 15 min. Both encode alcohol dehydrogenases. For SP2026, the decrease in level of mRNA was severe, 16-fold, but for SP0285 it was more modest. Whereas the mechanisms of repression are not known, the differences in the kinetics of RNA loss among the three classes of repressed genes suggest distinct regulatory mechanisms. For class RI genes, message minima coincided with the maximum of ComX sigma factor protein, for example, whereas class RIII message was severely reduced by 2 min, when the only elements of the regulatory circuits known to be active are ComD and ComE.

Table 4. CSP-repressed genes.

| Gene no. (a) | Orientation (a) | Expression pattern (b): WT | Expression pattern (b): comX | Other repression reports (c) | Description of product (a) |
|---|---|---|---|---|---|
| SP0208 | v | RI-2 | | | Ribosomal protein S10 |
| SP0209 | v | RI-2 | C | | Ribosomal protein L3 |
| SP0210 | v | RI-2 | | | Ribosomal protein L4 |
| SP0211 | v | RI-2 | | | Ribosomal protein L23 |
| SP0212 | v | RI-2 | | | Ribosomal protein L2 |
| SP0213 | v | RI-2 | | | Ribosomal protein S19 |
| SP0214 | v | RI-2 | | | Ribosomal protein L22 |
| SP0215 | v | RI-2 | C | | Ribosomal protein S3 |
| SP0216 | v | RI-2 | | | Ribosomal protein L16 |
| SP0217 | v | RI-2 | | | Ribosomal protein L29 |
| SP0218 | v | RI-2 | | | Ribosomal protein S17 |
| SP0219 | v | RI-2 | | | Ribosomal protein L14 |
| SP0220 | v | RI-2 | | 1 | Ribosomal protein L24 |
| SP0221 | v | RI-2 | C | | Ribosomal protein L5 |
| SP0222 | v | RI-2 | | 1 | Ribosomal protein S14 |
| SP0224 | v | RI-2 | | 1 | Ribosomal protein S8 |
| SP0225 | v | RI-2 | | | Ribosomal protein L6 |
| SP0226 | v | RI-2 | | | Ribosomal protein L18 |
| SP0227 | v | RI-2 | | | Ribosomal protein S5 |
| SP0228 | v | RI-2 | | | Ribosomal protein L30 |
| SP0229 | v | RI-2 | | | Ribosomal protein L15 |
| SP0230 | v | RI-2 | | | Preprotein translocase, SecY subunit |
| SP0231 | v | RI-2 | | | Adenylate kinase |
| SP0232 | v | RI-2 | | | Translation initiation factor IF-1 |
| SP0268 | ^ | RII-4 | | | Alkaline amylopullulanase, putative |
| SP0282 | ^ | RII-4 | | | PTS system, mannose-specific IID component |
| SP0283 | ^ | RII-4 | | | PTS system, mannose-specific IIC component |
| SP0284 | ^ | RII-4 | | | PTS system, mannose-specific IIAB components |
| SP0285 | ^ | RIII-4 | | | Alcohol dehydrogenase, zinc-containing |
| SP0287 | v | RII-2 | | | Xanthine/uracil permease family protein |
| SP0288 | v | RII-2 | | | Conserved hypothetical protein |
| SP0289 | v | RII-2 | | | Dihydropteroate synthase |
| SP0290 | v | RII-2 | | | Dihydrofolate synthetase |
| SP0291 | v | RII-2 | | | GTP cyclohydrolase I |
| SP0292 | v | RII-2 | | | Bifunctional folate synthesis protein |
| SP0501 | v | RII-4 | | | Transcriptional regulator, MerR family |
| SP0502 | v | RII-4 | C | | Glutamine synthetase, type 1 |
| SP1123 | v | RII-4 | | | Glycogen biosynthesis protein GlgD |
| SP1174 | ^ | RII-2 | | | Conserved domain protein |
| SP1175 | ^ | RII-2 | | | Conserved domain protein |
| SP1241 | v | RII-2 | C | | Amino acid ABC transporter, permease protein |
| SP1242 | v | RII-2 | C | | Amino acid ABC transporter, ATP-binding protein |
| SP1243 | v | RII-2 | C | | Glucose-6-phosphate 1-dehydrogenase |
| SP1244 | ^ | RII-2 | C | | Signal recognition particle-docking protein FtsY |
| SP1249 | ^ | RII-6 | AN | | Conserved hypothetical protein |
| SP1415 | ^ | RII-2 | C | | Glucosamine-6-phosphate isomerase |
| SP1429 | ^ | RII-2 | C | | Peptidase, U32 family |
| SP1631 | v | RII-4 | | | Threonyl-tRNA synthetase |
| SP1999 | ^ | RII-2 | AN | | CcpA, catabolite control protein A |
| SP2026 | ^ | RIII-16 | | | Alcohol dehydrogenase, iron-containing |
| SP2070 | ^ | RII-2 | | | Glucose-6-phosphate isomerase |
| SP2078 | v | RII-2 | | | Arginyl-tRNA synthetase |
| SP2091 | v | RII-2 | | | Glycerol-3-phosphate dehydrogenase (NAD(P)+) |
| SP2092 | v | RII-2 | | | UTP-glucose-1-phosphate uridylyltransferase |
| SP2108 | v | RII-2 | | | Maltose/maltodextrin ABC transporter, binding protein |
| SP2109 | v | RII-2 | | | Maltodextrin ABC transporter, permease protein |
| SP2110 | v | RII-2 | | | Maltodextrin ABC transporter, permease protein |
| SP2111 | v | RII-2 | | | MalA protein |
| SP2112 | v | RII-2 | | | Repressor protein |
| SP2132 | v | RII-4 | | | Conserved hypothetical protein |
| SP2148 | v | RII-2 | | | Arginine deiminase |
| SP2150 | v | RII-2 | | | Ornithine carbamoyltransferase |
| SP2151 | v | RII-2 | | | Carbamate kinase |
| SP2152 | v | RII-2 | | 1 | Conserved hypothetical protein |

a. Gene number, orientation and product from Tettelin et al. (2001); v = clockwise and ^ = counter clockwise.

b. Change in expression in wild type (SPMAv3) or in comX mutant (SPMAv2) as listed in Supplementary material, Table S1. Individual microarray datasets are available in Supplementary material, Tables S8 and S9. RI-n, RII-n and RIII-n = Repressed patterns I, II and III with n-fold reduction, as defined in the text. C = Invariant pattern; AN = Anomalous pattern.

c. Reference cited: 1, Rimini et al. (2000).
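
The pattern codes used in Tables 3 and 4 (IDn, RI-n, RII-n, RIII-n, C and AN) can be converted to signed log2 values when the tabulated data are compared across classes. The following is a minimal Python sketch, assuming the codes follow exactly the conventions given in the table footnotes; the function name `parse_pattern` and the returned tuple layout are illustrative choices and were not part of the original analysis.

```python
import math
import re

# Codes defined in the footnotes of Tables 3 and 4:
#   IDn                 delayed n-fold induction
#   RI-n, RII-n, RIII-n repressed patterns I, II, III with n-fold reduction
#   C                   invariant pattern
#   AN                  anomalous pattern
# Longer prefixes are listed first so that "RIII" is not read as "RI".
_PATTERN = re.compile(r"^(ID|RIII|RII|RI)-?(\d+)$")


def parse_pattern(code):
    """Return (pattern_class, log2_change) for one table cell.

    log2_change is positive for induction, negative for repression,
    0.0 for an invariant pattern, and None when no magnitude is given
    (anomalous pattern or empty cell).
    """
    code = code.strip()
    if code == "":
        return (None, None)
    if code == "C":
        return ("C", 0.0)
    if code == "AN":
        return ("AN", None)
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognized pattern code: {code!r}")
    pattern_class = match.group(1)
    fold = int(match.group(2))
    sign = 1.0 if pattern_class == "ID" else -1.0
    return (pattern_class, sign * math.log2(fold))
```

For example, `parse_pattern("RIII-16")` gives the class `"RIII"` with a log2 change of -4.0, corresponding to the 16-fold reduction listed for SP2026.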

Figure 5. Expression profiles of genes at selected CSP-repressed loci. Fluorescence signal ratios from duplicate elements of microarray SPMAv3 averaged over two hybridizations, expressed as ratios to signals from reference RNA mixture (Supplementary material, Table S9). A. Class RI genes SP0218-SP0222. B. Class RII genes SP2108-SP2112 and Class RIII gene SP2026.

Mobile elements

Interpretation of temporal expression patterns for pneumococcal transposable elements is complicated by the circumstance that multiple copies of nine different IS elements are distributed throughout the genome (Tettelin et al., 2001). Most probes containing these elements displayed invariant signals during the CSP response. However, every IS1167 array element displayed a pattern of twofold late induction (data not shown; Supplementary material). Similarly, IS630-Spn-1 elements displayed a late fourfold induction pattern. As all copies of each of these two elements are approximately 99% similar in sequence, hybridization results represent the average expression change of all contributing elements. The two IS induction patterns may reflect location of a copy of each IS in physical proximity to a late promoter. Indeed, the R6 chromosome (Hoskins et al., 2001) contains one IS630 copy within the SP2016 (SPR1829) (late) locus. As two different sites of possible competence specific transcription were found for IS1167 [adjacent to the SP1100 (SPR1008) – SP1101 (late), and SP1905 (SPR1721) – SP1906 (delayed) loci], the expression pattern of IS1167 may represent a combination of both late and delayed temporal patterns.
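
The averaging argument for multicopy IS elements can be illustrated numerically. The sketch below assumes that every copy of an element contributes equally to the basal hybridization signal and that only some copies are induced; the copy numbers used in the usage note are hypothetical and are not taken from the genome annotation.

```python
def observed_fold(n_copies, n_induced, true_fold):
    """Apparent fold change for a probe that hybridizes equally to all copies.

    Each copy is assumed to contribute one unit of basal signal; induced
    copies contribute `true_fold` units after CSP treatment.
    """
    if n_copies < 1 or not 0 <= n_induced <= n_copies:
        raise ValueError("need n_copies >= 1 and 0 <= n_induced <= n_copies")
    invariant = n_copies - n_induced
    return (invariant + n_induced * true_fold) / n_copies


def required_true_fold(n_copies, n_induced, observed):
    """Fold change the induced copies must show to give `observed` overall."""
    if n_induced < 1 or n_induced > n_copies:
        raise ValueError("need 1 <= n_induced <= n_copies")
    invariant = n_copies - n_induced
    return (observed * n_copies - invariant) / n_induced
```

Under these assumptions, if an element were present in five copies and only one lay downstream of a late promoter, a twofold change on the array would correspond to a sixfold induction of that single copy (`required_true_fold(5, 1, 2.0)`).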

Analysis of induced loci by directed mutagenesis

To begin to assess the importance of CSP-inducible genes with respect to competence, targeted deletions were constructed at most of the induced loci that had not previously been subjected to genetic analysis, as described in Experimental procedures. The transformation efficiency of each mutant strain was determined by treatment of non-competent log phase cells with CSP in the presence of NovR donor DNA. The results are summarized in Tables 1–3 and details are available in the Supplementary material. Interestingly, with one exception (SP2047), most inducible genes identified as such for the first time in this study proved to be individually dispensable both for induction of competence by synthetic pheromone and for subsequent DNA uptake and integration. In addition to deletions of individual CSP-inducible genes, in several cases deletions of the putative combox were also constructed to abolish the pheromone response of entire sets of genes in induced operons (green ‘c’ boxes in Fig. 3). In the SP0200 locus, SP0201 was not individually included in our mutagenesis assay, but its putative combox was deleted to abolish CSP-dependent expression. SP1092-94 were also indirectly assayed using the same strategy.

As the CSP-induced expression of early genes persists much longer in a comX mutant than in wild type, it has been suggested that a late gene might participate in the shutoff of early gene expression (Lee and Morrison, 1999; Claverys and Håvarstein, 2002). Therefore, many of the mutants were examined more closely to determine whether the kinetics of competence appearance or decay were altered. In none of the mutants examined in this way was the temporal profile of competence decay significantly altered (Supplementary material, Table S2).

Two specific loci were scrutinized in light of previous gene disruption data. An insertion-duplication mutation disrupting SP0018 downstream of amino acid residue 51 was reported to abolish plasmid transformation in strain D39 (Bartilson et al., 2001). To determine which of the genes at this early locus is involved in transformation, each ORF within the operon was individually targeted for deletion analysis. Deletion of SP0018 abolished chromosomal transformation, whereas deletion of SP0019 (purA) resulted in purine auxotrophy but did not block transformation (Supplementary material, Table S2). Induction of both of these genes by CSP was also confirmed by lacZ fusion reporters (Supplementary material, Table S2). Because SP0018 was expressed in the early gene pattern and the gene is required for competence development, we propose that it be designated comW. A mutation that truncated cbp3 (SP2201) after codon 236 and introduced a terminator signal between that gene and the downstream genes blocked transformation in strain G54 (Rimini et al., 2000). In contrast, deletion of 90% of cbp3 allowed a normal level of CSP-induced competence in the Rx-1 background, as did the deletion of four downstream genes together (SP2196–SP2199) (Table 2; Supplementary material, Table S2). These contrasting results show that an intact cbp3 gene is not required for high-efficiency transformation and suggest either that the cbp3 mutant in strain G54 failed to transform because of an independent defect outside cbp3, or that competence depends on induced expression of some part of a region, comprising the last 102 bp of cbp3 and 510 bp downstream, that was not deleted by our mutation.

Discussion

We examined the role of ComX in gene regulation under two circumstances. During the exponential growth phase in a rich medium, inactivation of comX had little detectable effect on any of the ORFs surveyed, consistent with the very low levels of ComX in such cells. In the special context of cells responding to the pheromone CSP, ComX is present at elevated levels and proved to be required for the induced expression of a majority of the CSP-induced genes. It remains possible that ComX also participates in transcription of different genes under other circumstances. The expression data reported here revealed that genes regulated in response to the pheromone signal are considerably more numerous than previously recognized, and can be regarded as comprising at least four regulons: the early, late, delayed and repressed classes. Five of the early operons are required for transformation, whereas six are not, and two have not been tested. All late genes examined in the comX background were dependent on comX for inducible expression. Being independent of comX, delayed genes may respond to stress signals arising from the sudden induction of early genes or to cross-talk between competence and stress response pathways. Finally, several sets of genes were transiently repressed during the response to CSP, although a specific linkage to ComX activity was not established.

Power and limitations of expression surveys

Various potentially ‘global’ expression strategies have been applied to identify sets of genes differentially regulated as cells develop competence for genetic transformation in S. pneumoniae. As none of the strategies was pursued to saturation, it is not surprising that different but overlapping sets of regulated genes were identified. In the current study, DNA microarrays were used to make quantitative estimates of RNA level changes for nearly all ORFs, making possible the definition of the physical limits and identity of many clusters of co-regulated genes. The experience reported here illustrates several strengths of the DNA microarray as a tool for bacterial gene expression surveys. First, the hybridization results indicated that 90% of assayed ORFs exhibited little expression variation over the 20 minute duration of experimental CSP treatment. Second, changes in gene expression were faithfully reproduced with an apparent time resolution on the scale of minutes and a sensitivity sufficient to reveal twofold changes in expression. Third, the microarray assay was capable of quantitative accuracy over a dynamic range in excess of 100-fold, as established by comparison to expression profiles generated using real-time PCR.

However, there are also limitations in the use of expression surveys. As one gene is often transcribed and regulated from several promoters, the measure of ‘fold change’ for any gene is by definition related to its basal level of expression in a given context. It is probable that some genes have various expression states depending on the particular combination of promoters active at any given time. One must therefore use some caution in even the simple comparison of the magnitudes of expression of genes assayed in different environments. It is also apparent from our real-time PCR results that for some CSP-induced genes with very low basal levels of expression, the microarray data underestimated the degree of induction (by two- to 16-fold). This apparent dampening may be related to low-level non-specific hybridization (cross-hybridization) causing an overestimate of basal expression levels. Improvements in microarray design and hybridization chemistry may help to increase the quantitative dynamic range of the DNA microarray assay. Finally, microarrays constructed with PCR amplicons represent both strands of the target and so do not distinguish RNA polarity; interpretation of the biological relevance of a measured expression change must therefore be regarded as somewhat provisional, pending more detailed investigation of each gene of interest.
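
The dampening effect described above follows from simple arithmetic if cross-hybridization is treated as a constant background added to both the basal and the induced signal. The Python sketch below is a simplified model under that assumption; the function name and all numerical values are hypothetical and are not measurements from this study.

```python
def apparent_fold(true_fold, basal, background):
    """Fold change seen on the array when a constant background is present.

    true_fold  : real ratio of induced to basal specific signal
    basal      : specific signal before induction (arbitrary units, > 0)
    background : non-specific (cross-hybridization) signal, assumed equal
                 before and after induction
    """
    if basal <= 0 or background < 0:
        raise ValueError("basal must be > 0 and background must be >= 0")
    induced = true_fold * basal
    return (induced + background) / (basal + background)
```

In this model a gene with a very low basal signal is affected most: with a basal signal of 1 unit and a background of 1 unit, a true 31-fold induction appears as a 16-fold change, whereas the same background has almost no effect on a gene whose basal signal is 100 units.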

It is common that microarrays based on a sequenced reference strain are used in conjunction with RNA isolated from other strains. It is obvious that differences in genomic content between the strains will affect the number of genes one can study. In the present study, for example, genes specific to either R6 or TIGR4, such as that for the competence-specific methylase DpnA (Lacks et al., 2000), were not analysed for this reason. Less clear is the impact of sequence divergence between target and probe on quantitative expression data and its interpretation. Finally, even when changes in RNA levels are accurately measured, the biological significance of a given increase in mRNA remains an experimental question. In the case of recA, for example, mRNA has been estimated to increase up to 30-fold in competent cells, whereas the RecA protein itself increases only fourfold; yet, the fourfold increase in protein is responsible for 95% of the yield of genetic recombinants (Mortier-Barrière et al., 1998; Peterson et al., 2000).

Regulation of early genes

The distinction between early and late CSP responsive genes was first described by Alloing et al. (1998), who found that comDE mRNA was maximal about 5 min before that of recA. Subsequently, SP1706–07 (yhaPQ), comAB and comX were added to the class (Peterson et al., 2000; Rimini et al., 2000). The present results identify additional early operons, expanding the number of known early operons from five to 13. The mechanism by which early genes are regulated is not fully understood. Ten of the early operons are associated with a direct repeat similar to those described by Ween et al. (1999) as ComE binding sites (Fig. 4), and it may be that ComE acts similarly at most or all of them. However, no RNA start sites have been mapped for these genes, nor has ComE dependent activation of transcription been demonstrated in vitro. Furthermore, many other genes appear to affect this regulation in unknown ways (Martin et al., 2000; Claverys and Håvarstein, 2002), and the basal expression levels of several of these genes are critical determinants of the relation between cell number and endogenous competence induction. The three early operons without discernable direct repeat sites may be regulated indirectly or may reflect a broader binding specificity of ComE than has been previously recognized.

A new regulatory link?

The coordinated temporal expression of the late genes suggests they share a common regulatory mechanism. However, although they are linked by a shared dependence on several early gene products for quorum sensing and on ComX as a sigma factor, regulation of ComX dependent genes is also not completely understood. A critical piece of the competence regulatory circuit apparently remains to be identified, as recent results (Luo et al., 2003) showed that induction of transcription of comX was not by itself sufficient to achieve normal levels of competence. This implies that some other product of the response to CSP is also required for effective ComX activity or for post-transcriptional regulation of ComX. The identity of this missing factor is unknown, but it is clear that ComE-dependent early genes would be reasonable candidates and it is interesting that the new early gene comW (SP0018) is in fact essential for competence. Sequence comparisons of its predicted small protein product provide no clue to its function. As depicted in Fig. 6, theoretical possibilities include a direct role in DNA processing in parallel with late genes, a regulatory role in parallel with ComX, perhaps to activate a separate set of ‘late’ genes, or a role in the synthesis or the activity of ComX.

Figure 6. Summary of CSP-dependent gene regulation. Regulation in the autocatalytic quorum-sensing circuit, blue solid arrows. Known links to late genes and DNA processing proteins via comX, red solid arrows. Green dotted arrows indicate hypothetical links of early gene products to comX or late genes, to repressed genes, and to delayed class genes. Dashed red arrow, hypothetical retro-regulation.

Roles of the CSP-responsive genes

Some early genes act in setting the basal level production of CSP and in the autocatalytic circuit by which the level of CSP is amplified (Martin et al., 2000). Another early gene product, ComX, links the pheromone signal to late gene expression, and to the rapid shutoff of both early and late genes. Yet another early gene is also needed for competence development, but it is not yet known how it acts. Altogether, 14 late genes are also known to be required for competence. However, the majority of the induced genes identified by expression analysis are dispensable for competence (Fig. 3). Their presence as the majority class in CSP induced operons may reflect a loose organization of competence operons, with read-through past competence genes creating adventitious induced genes that may persist during evolution when not deleterious to fitness. Antisense transcription of several genes that reside next to induced late gene operons also indicates that the size of CSP dependent transcripts is not strictly controlled. Indeed, if the competence regulon is expressed rarely and briefly, there may be little cost associated with co-transcribing superfluous competence induced genes. The extent to which the extant genome achieves a highly conservative gene organization to maximize co-regulation of genes with common activities is complex but may be thought partly to represent a balance between selective disadvantages stemming from ‘loose’ organization and the value that a high frequency of genome rearrangements may provide.

Competence-induced genes that are not required for efficient transformation may have roles in transformation that our assays would not detect or roles that are masked by functional redundancy, or they may contribute to other traits dependent on the same quorum sensing regulation that controls competence. Many of them are hypothetical or conserved hypothetical proteins without orthologues of known function. As the ‘trigger’ for development of competence is not the presence of external DNA, but culture population density combined with aspects of the metabolic state of the cells, it would not be surprising if genes for traits other than competence were activated under the control of the same quorum-sensing circuit. Indeed, although none of the well-studied pneumococcal virulence factor genes appears in Table 1 or Table 2, comD mutations have been repeatedly reported to reduce virulence in murine models of pathogenesis, leading to the suggestion that some CSP-regulated genes may have important roles in virulence (Bartilson et al., 2001; Lau et al., 2001; Hava and Camilli, 2002).

Bacteriocins and competence

Because genetic exchange in natural genetic transformation must depend on a source of bacterial donor DNA, it is interesting that production of bacteriocins has recently been associated with competence both in S. sanguis (N. Heng, pers. comm.) and in the pneumococcal strain G54 (Rimini et al., 2000) and that DNA release in late stages of pneumococcal culture growth has recently been described as dependent on CSP and comE (Steinmoen et al., 2002). We observed two sets of CSP-responsive pneumococcal genes that may also participate in production of bacteriocin-like proteins. Two genes (SP0124/5) that encode short bacteriocin-like GG peptides were induced as late genes. Because these genes are not associated with other elements of a typical bacteriocin elaboration locus, comAB can be considered as a candidate for the transporter responsible for their processing and export. In addition, blpA, blpY and blpZ were among the identified early genes. The blp locus produces at least one bacteriocin and is regulated by a peptide pheromone, through a two-component system, BlpH and BlpR, homologous to the competence regulatory TCS, ComD and ComE (Supplementary material, Fig. S3) (de Saizieu et al., 2000). Claverys and Håvarstein (2002) reported that the blp pheromone, BIP, does not induce competence and argued that the blp and com quorum-sensing systems operate independently. The present data suggest that although no ComE direct repeat is apparent at any of these three genes, it is possible that there may be cross-talk between CSP signalling and bacteriocin regulation. As blpZYA are thought to be regulated by the BlpC pheromone via BlpH and BlpR, such cross-talk might occur at the level of the receptor (ComD), the response regulator (ComE), or even through ComA, if it can transport BIP. This idea is supported by the observation that induction of these genes is strong in the comX-deficient strain, where early genes are overexpressed, but only marginal in the wild type. If any of these mechanisms is active, it is curious that other blp genes sharing the same BlpR binding sites (blpBHRST) were not induced during competence.

Experimental procedures

Strains and media

Streptococcus pneumoniae strains CP1250 (hex malM511 str-1 bgl-1; Pestova et al., 1996) and CPM8 (CP1250, but ΔcomX1::Em ΔcomX2::Tet; Lee and Morrison, 1999) used for mutagenesis and for mRNA preparation were derived from strain Rx, which is closely related to the R6 strain recently sequenced by Eli Lilly (Tiraby et al., 1975; Hoskins et al., 2001). The template for array element amplification was DNA from strain TIGR4 (Tettelin et al., 2001). Media for growth and competence induction in S. pneumoniae were as described (Lee and Morrison, 1999). Plasmids used were pEVP3 (Claverys et al., 1995), pR410 (Sung et al., 2001) and pR412 (Martin et al., 2000). Escherichia coli strains were grown in Luria–Bertani medium. DNA from strain CP1500 (hex nov-r1 bry-r str-1 ery-r1 ery-r2; Cato and Guild, 1968) was the donor DNA for transformation assays. Antibiotics were used at the following concentrations: chloramphenicol, 34 µg ml−1 for E. coli and 2.5 µg ml−1 for S. pneumoniae; novobiocin, 2.5 µg ml−1; kanamycin, 200 µg ml−1; spectinomycin, 100 µg ml−1.

Construction of replacement mutations and reporter fusions

Kanamycin resistance cassettes (KAN (AY334018) and KANT (AY334019)) were amplified from plasmid pR410 by PCR using primer pairs DAM301-302 (for KAN) and DAM303-304 (for KANT) (Supplementary material, Table S6). A spectinomycin resistance cassette [SPC (AY334020)] was obtained from plasmid pR412 using EcoRI and XbaI. Gene deletion-replacement mutations were constructed by the method described in Lau et al. (2002), in which assembly of two PCR fragments flanking a resistance cassette allows targeted gene replacement by double-crossover recombination in transformation of competent CP1250 cells. After single-colony purification or a backcross, each gene replacement was verified by showing the expected loss of an internal PCR fragment and the gain of both expected junction PCR fragments. Some junctions were also verified by sequencing. To construct reporter fusions, single 200–500 bp PCR targeting fragments were cloned in plasmid pEVP3. After transformation of CP1250, the clones resulting from a circular integration of the plasmid were verified by PCR to detect both expected new junction fragments, and by sequencing the target::lacZ junction itself. To make strains containing SP2017 or SP1089 fusions, targeting fragments were amplified by PCR using primers DAM633 and DAM634, DAM637 and DAM638, respectively, and template DNA from CPM16 (Lee and Morrison, 1999). To obtain vector insertions in both orientations, pEVP3 was digested with XbaI and either BamHI or BglII, and ligated with the PCR fragments digested with BamHI and XbaI. Chloramphenicol-resistant (CmR) transformants of E. coli DH5α carrying a chimeric plasmid were identified by restriction analysis. A single confirmed plasmid of each type was used for transformation of CP1250 and CmR colonies were selected, resulting in strains CP1779 (SP2017 fwd), CP1780 (SP2017 anti), CP1781 (SP1089 fwd) and CP1782 (SP1089 anti). 
The integrated structures were confirmed by PCR amplification of a junction fragment using primers complementary to pEVP3 and to a sequence upstream of the integrated vector.

Pheromone treatment and RNA extraction

For mRNA analysis, a 1.5-litre culture growing at 37°C in CAT, supplemented with 10 mM HCl to ensure the absence of endogenous induction of competence, was mixed at OD550 0.05 with 0.3 mg CSP-1, 75 ml of 4% BSA, and 7.5 ml of 0.1 M CaCl2 to initiate competence development. Sample ‘0’ was taken 10 s before addition of the CSP, whereas remaining samples were taken at the indicated times thereafter. Samples were harvested exactly as described in Peterson et al. (2000) by pouring 100 ml of culture into 100 ml phenol, 0.1 M citrate buffer pH 4.3, and 0.1% SDS at 80°C. After preliminary purification with phenol and chloroform and precipitation with isopropanol, the crude RNA was treated with DNase and purified further using RNeasy columns (Qiagen). The typical yield from 100 ml of cells was approximately 500 µg of purified RNA.

DNA microarray design

Two distinct microarrays were used to obtain quantitative kinetic expression data. SPMAv2 was a subgenomic microarray representing approximately 215 apparently CSP-inducible genes and a group of apparently invariant genes, totalling 340 unique genes, printed at a redundancy of 15. For SPMAv2, PCR products (200–500 bp) were designed to be maximally unique within the TIGR4 genome, using a modification of the Primer3 program (Rozen and Skaletsky, 1998), which uses the coordinates of each ORF and user-specified inputs including Tm range and permissible ranges for primer length and PCR product size. The candidate primers were analysed to minimize self-hybridization and primer-dimer-forming potential, and were ranked for selection of an optimized primer pair. In a second step, the program used BLAST alignment of the expected PCR product to the complete TIGR4 genome sequence to reject primer pair candidates that result in DNA sequence identities > 90% over any length of sequence greater than 25 nucleotides. The primer pairs used for PCR amplification are listed in Supplementary material, Table S4. SPMAv3 was a PCR-amplicon microarray designed on the basis of the complete sequence of the TIGR4 genome as described above for SPMAv2. Among 2236 annotated ORFs, only 2129 represent single-copy sequences. Of these, 2013 ORFs were represented on the array in quadruplicate (91% of the total number of ORFs, 95% of the unique ORFs). Nearly all of the unique ORFs not included in SPMAv3 are instances of ORFs smaller than 200 bp. The PCR amplification of ORFs using these primers (Supplementary material, Table S5, Illumina) and TIGR4 genomic DNA template (20 ng reaction−1) resulted in over 99% success, as assessed by the formation of a single PCR product and a minimum yield of 2000 ng. The PCR products were sequence validated across their entire length and were stored at −80°C.
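
The second-step rejection rule described above (identity > 90% over any stretch longer than 25 nucleotides elsewhere in the genome) can be sketched as a filter over alignment hits. The Python example below is an illustration only, not the modified Primer3 program used in this work: the `Hit` record, its field names and the `is_unique` function are hypothetical, hits are assumed to have been parsed already from an alignment report with start <= end, and any hit overlapping the amplicon's own coordinates is treated as the self-match and ignored.

```python
from dataclasses import dataclass

MIN_LENGTH_NT = 25        # stretches must be longer than this to count
MAX_IDENTITY_PCT = 90.0   # identities must exceed this to count


@dataclass(frozen=True)
class Hit:
    """One alignment of the expected PCR product against the genome."""
    subject_start: int        # genome coordinate, start <= end assumed
    subject_end: int
    align_length: int         # aligned length in nucleotides
    percent_identity: float   # 0-100


def is_unique(hits, amplicon_start, amplicon_end):
    """Return True if no off-target hit violates the uniqueness rule."""
    for hit in hits:
        overlaps_self = (hit.subject_start <= amplicon_end
                         and hit.subject_end >= amplicon_start)
        if overlaps_self:
            continue  # the amplicon always matches its own locus
        if (hit.align_length > MIN_LENGTH_NT
                and hit.percent_identity > MAX_IDENTITY_PCT):
            return False
    return True
```

A candidate primer pair would be kept only when `is_unique` returns True for its expected product; both thresholds are strict inequalities, matching the wording of the rule.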

Array construction

Polymerase chain reaction conditions were as described in Peterson et al. (2000). Polymerase chain reaction products were dissolved in 50% DMSO and deposited onto 25 mm × 75 mm CMT-UltraGAPS amino silane-coated glass microscope slides (Corning, Acton, MA) using an Amersham Lucidea Printer. Humidity was maintained at ∼65% during printing. After printing, slides were air-dried for 30 min, and DNA was cross-linked to the surface by using a Stratalinker (Stratagene) to deliver 50 mJ cm−2 of short-wavelength UV energy. Slides were then stored in a desiccator at room temperature.

Probe preparation

Six µg of Random Hexamer Primers (Invitrogen, Carlsbad, CA) were annealed to 2 µg of RNA in a total volume of 18.5 µl by heating the reaction to 70°C for 10 min, followed by quick chilling on ice. To this reaction were added 6 µl of 5× Reverse Transcriptase cDNA reaction buffer (250 mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM MgCl2), 3 µl of 0.1 M DTT (Invitrogen), and 0.6 µl of a dNTP solution containing 25 mM each dATP, dGTP and dCTP, 8 mM dTTP (Invitrogen) and 17 mM aminoallyl-dUTP (Sigma). Reverse Transcriptase Superscript II (Invitrogen) (400 U in 2 µl) was added and reactions were then incubated at 42°C for 3 h to overnight. The RNA template was hydrolysed by adding 10 µl of 1 M NaOH and 10 µl of 0.5 M EDTA and heating to 65°C for 15 min. The solution was neutralized with 25 µl of 1 M Tris pH 7.0, and cDNA was purified over QIAquick PCR purification columns, replacing the Tris buffers with phosphate buffers: 5 mM KPO4 pH 8, 80% EtOH for column washing and 4 mM KPO4 pH 8 for cDNA elution. Reactions were dried to completion in a Speed-Vac centrifuge. Aminoallyl-labelled cDNA was resuspended in 4.5 µl of 0.1 M Na2HCO3, pH 9.3, plus 4.5 µl containing 63 µg of the appropriate ester dye (Cy3 or Cy5). The coupling reaction proceeded for 2 h at room temperature in the dark. Thirty-five µl of 100 mM NaOAc, pH 5.2, was added to the coupling reactions, which were then purified with QIAquick PCR purification columns, using the supplied buffers. The Cy3- and Cy5-labelled cDNA probes were combined and dried to completion.
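The composition of the reverse transcription reaction follows from the volumes listed above. The calculation below is a sketch that assumes additive volumes; the variable names are illustrative.

```python
# Final composition of the reverse transcription reaction.
# Volumes (µl) and stock concentrations come from the text; additive
# volumes are an assumption of this sketch.

volumes_ul = {
    "annealed primers + RNA": 18.5,
    "5x reaction buffer": 6.0,
    "0.1 M DTT": 3.0,
    "dNTP mix": 0.6,
    "Superscript II": 2.0,
}
total_ul = sum(volumes_ul.values())

# Stock concentrations in the dNTP mix (mM).
dntp_stock_mM = {"dATP": 25.0, "dGTP": 25.0, "dCTP": 25.0, "dTTP": 8.0, "aa-dUTP": 17.0}

dilution = volumes_ul["dNTP mix"] / total_ul
final_mM = {name: conc * dilution for name, conc in dntp_stock_mM.items()}

buffer_x = 5.0 * volumes_ul["5x reaction buffer"] / total_ul
dtt_mM = 100.0 * volumes_ul["0.1 M DTT"] / total_ul
aa_to_t_ratio = dntp_stock_mM["aa-dUTP"] / dntp_stock_mM["dTTP"]

print(f"total volume: {total_ul:.1f} µl")
print(f"buffer: {buffer_x:.2f}x, DTT: {dtt_mM:.1f} mM")
for name, conc in final_mM.items():
    print(f"{name}: {conc:.3f} mM")
print(f"aminoallyl-dUTP:dTTP ratio: {aa_to_t_ratio:.3f}")
```

In the stock mix, dTTP and aminoallyl-dUTP together total 25 mM, the same concentration as each of the other three nucleotides, at a ratio of roughly 2:1 in favour of the aminoallyl analogue.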

Pretreatment, hybridization and slide washing

Slides were pretreated for 2 h at 42°C in a 50-ml solution of 5× SSC, 0.1% SDS and 1% BSA, followed by four washes in MilliQ water and three washes in isopropanol. Residual isopropanol was removed by brief centrifugation at 2500 g. Dried probes were resuspended in 30 µl of hybridization buffer (50% formamide, 5× SSC, 0.1% SDS and 100 µg ml−1 salmon sperm DNA). This mixture was heated for 10 min at 95°C and then applied to a pretreated slide under a cover slip. Hybridizations were carried out at 42°C for 16 h in a sealed hybridization chamber (Corning #2551), which was humidified with 20 µl of 5× SSC. Microarrays were washed once in ∼200 ml of 2× SSC/0.1% SDS at 55°C for 10 min, once in 0.1× SSC/0.1% SDS for 10 min at room temperature, three times in 0.1× SSC for 2 min each, and finally in MilliQ water. They were scanned immediately or following storage in a desiccator in the dark.

Array validation using real-time PCR

After cDNA templates were created from 1 µg of sample and control RNAs using a TaqMan Reverse Transcription Kit (Applied Biosystems #N808-0234), qRT-PCR reactions were carried out in an iCycler RT-PCR system (Bio-Rad) using the QuantiTect SYBR Green PCR Kit (Qiagen #204143) and primers designed with Primer Express 2.0 (Applied Biosystems) for PCR products that were ∼100 bp in length. Genes selected for this real-time PCR quantification were SP0023, SP0042, SP0043, SP0954, SP1073, SP1096, SP1478, SP1714, SP1810, SP1937, SP1940, SP1945, SP2049, SP2050, SP2053, SP2236 and SP2239. SYBR Green signal measurements were collected in duplicate for experimental samples at each time point of the competence induction assessed using the SPMAv2 DNA microarray, and all experiments were performed at least twice. A standard curve was prepared using seven serial 10-fold dilutions of a cDNA template of known concentration. Relative RNA abundance measurements at individual time points throughout competence were made by comparing PCR formation profiles to each other.
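A standard curve of this kind is commonly used by fitting threshold cycle (Ct) against log10 template amount and reading relative amounts from the fitted line. The sketch below shows that generic calculation; it is not confirmed to be the exact procedure used here, and the Ct values, function names and sample values are hypothetical.

```python
# Generic qPCR standard-curve calculation (illustrative sketch).
# Seven serial 10-fold dilutions as in the text; the Ct values are invented.

log10_amount = [0.0, -1.0, -2.0, -3.0, -4.0, -5.0, -6.0]   # relative to the undiluted template
ct_values = [12.1, 15.4, 18.8, 22.1, 25.5, 28.8, 32.2]     # hypothetical measurements


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(log10_amount, ct_values)

# Amplification efficiency implied by the slope (1.0 means perfect doubling per cycle).
efficiency = 10 ** (-1.0 / slope) - 1.0


def relative_amount(ct):
    """Template amount relative to the undiluted standard, read from the fitted line."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical Ct values for one gene before and after CSP addition.
fold_change = relative_amount(20.0) / relative_amount(23.35)

print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
print(f"efficiency: {efficiency:.2f}")
print(f"fold change: {fold_change:.1f}")
```

With a slope near −3.3 cycles per 10-fold dilution, a difference of about 3.3 cycles between two samples corresponds to roughly a 10-fold difference in template.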

Experimental data and replicates

RNA preparations obtained from cultures of CP1250 at short intervals after a sudden dose of CSP were analysed using DNA microarrays. Signals obtained from replicate spots were averaged to obtain a mean signal strength and standard deviation for the measurement of mRNA level for each gene and each RNA extract. Independent cultures of CP1250 were analysed with the two microarrays. Four replicate hybridizations to SPMAv2 provided up to 60 hybridization measurements per gene, affording excellent statistical reliability for the quantitative expression profiles. RNA obtained in a similar induction experiment using the comX-deficient strain, CPM8, was also used to probe the SPMAv2 array (Supplementary material, Table S8). Results obtained for wild type with SPMAv2 (Fig. 1) were essentially reproduced using SPMAv3 (Supplementary material, Table S9), which extended coverage to nearly 90% of ORFs. Supplementary material, Table S1 lists all CSP response classifications from the two microarray versions used. Most induction patterns that we report for CSP-responsive genes are supported by both biological and physical replicates.
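The replicate structure described above (15 spots per gene on SPMAv2, four hybridizations, hence up to 60 measurements) can be illustrated with a small summary calculation. The ratios below are synthetic, the function name is illustrative, and pooling across hybridizations is shown only to make the count of 60 concrete.

```python
from statistics import mean, stdev

SPOTS_PER_GENE = 15   # printing redundancy of SPMAv2 (from the text)
HYBRIDIZATIONS = 4    # replicate hybridizations (from the text)


def summarize(values):
    """Return (mean, sample standard deviation, count) for one set of spot ratios."""
    return mean(values), stdev(values), len(values)


# Synthetic log2 expression ratios for one gene at one time point:
# one list of 15 replicate spots per hybridization.
hyb_data = [
    [2.0 + 0.01 * ((spot * 7 + hyb * 3) % 11 - 5) for spot in range(SPOTS_PER_GENE)]
    for hyb in range(HYBRIDIZATIONS)
]

# Per-hybridization summaries (replicate spots on one array).
per_hyb = [summarize(spots) for spots in hyb_data]

# Pooled summary across all hybridizations.
pooled = [ratio for spots in hyb_data for ratio in spots]
pooled_mean, pooled_sd, pooled_n = summarize(pooled)

for index, (m, sd, n) in enumerate(per_hyb, start=1):
    print(f"hybridization {index}: mean={m:.3f} sd={sd:.3f} n={n}")
print(f"pooled: mean={pooled_mean:.3f} sd={pooled_sd:.3f} n={pooled_n}")
```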

The program SAM (Significance Analysis of Microarrays; TIGR Multi-Experiment Viewer) was used to determine the significance of the data, using ratios at the peak of expression and ratios from the two time-points on either side of the peak compared with a similar number of data points representing basal expression ratio measurements. A delta setting for significance of 2.0 was used as the criterion for inclusion in Tables 1 and 2 and for all genes in Table 3 except SP0338, SP0798 and SP0799. The magnitude of induction for several genes at the blp locus (SP0530, SP0545–47) was marginal in experiments using wild-type cells, but was scored as significant by SAM in the comX expression analysis. We included three other marginal cases (SP0979, SP1092 and SP1811) as late genes despite the lack of support from SAM, as these genes are members of multigene clusters (apparent operons) containing 4, 12 and 4 genes, respectively, which were scored as significant by SAM. Several genes were not classified as CSP responsive despite at least one indication of a response to CSP. At three sites, the genes SP1022 and SP1948, and the gene cluster SP1930–1936, twofold or fourfold late gene induction was observed in one experiment but not in a biological replicate. As two of these sites are immediately downstream of strongly induced late genes, this may reflect variable read-through past a transcription terminator signal. Four genes with apparent early induction patterns (SP0429–430, SP0635 and SP1547) and two with apparent late induction patterns failed to meet the SAM criterion. Four genes reported previously to be repressed (Rimini et al., 2000) were invariant in this work: SP0418, SP1105, SP1128 and SP1354 (Supplementary material, Fig. S1). SP0921, SP1147, SP1660, SP2065 and SP2144 were isolated genes with twofold induction patterns in our experiments. SP0965 was reported as induced threefold by CSP (Rimini et al., 2000), but with SPMAv3, the signal for this gene was invariant. Other similar cases are SP1154, SP1409, SP1468, SP1482, SP1583 and SP1976. Finally, genes at one site, SP0202–0208, exhibited a pattern different from all others, with RNA levels like those in a fourfold induced late gene, but with a high initial value.

Competence assays

CP1250 and mutant derivatives were grown at 37°C in CAT supplemented with 8 mM HCl. At OD550 0.05, 1 ml of each culture was added to 9 ml of warm CAT containing 8 mM HCl (to avoid premature endogenous induction of competence), 0.5 mM CaCl2 and 0.2% BSA, and 1.6 ml of the dilution was transferred to a warm Eppendorf tube. After 20 min at 37°C, CSP was added to 200 ng ml−1. Immediately, and at successive 5-min intervals for 50 min, 0.1-ml culture samples were mixed with 10 ng of NovR donor DNA, incubated 5 min at 37°C, then diluted 1:150 into 1.5 ml of CAT containing 7.5 µg DNase. After 60 min at 37°C, the entire culture was mixed with 1.5 ml of CAT agar (49°C) and poured onto a 3 ml layer of agar in a 75-mm Petri dish. The plate was then covered with 3 ml of melted CAT agar, and finally with 3 ml of CAT agar containing 10 µg Nov ml−1. Transformants were counted after 20 h at 37°C.

Beta-galactosidase reporter assays

Beta-galactosidase activity was measured in non-competent cultures, and in a parallel sample of the same culture treated with CSP at OD 0.05 for 20 min or for times between 10 and 70 min. After lysis with 0.1% Triton X-100 for 10 min at 37°C, lysates were used for beta-galactosidase assay as described in Pestova and Morrison (1998).

Acknowledgements

This work was supported in part by the U.S. National Science Foundation (MCB-0110311, to D.A.M.) and through support to S.N.P. and R.D.F. from the NIAID contract N01-AI-15447. The generosity of Hervé Tettelin in making available genome sequences prior to publication is gratefully acknowledged. We are grateful to Jean-Pierre Claverys for plasmids pR410 and pR412 and to Susan Hollingshead for generously providing certain primer pairs used in microarray construction. We thank Chris Mader for bioinformatics support and J. Quackenbush and members of his group for analysis tools and support. We thank Maria Giovanni and Michael Gottlieb for their support and dedication to the pathogen and parasite research communities.

Supplementary material

Fig. S1. Expression patterns of CSP-inducible loci and of certain invariant genes described elsewhere as CSP responsive.

Fig. S2. Silent combox genes.

Fig. S3. Cross-regulation between the com and blp two-component systems.

Table S1. Expression patterns in WT and comX mutant.

Table S2. Mutational analysis of competence induced genes.

Table S3. Terminator site candidates in TIGR4 near CSP-inducible loci.

Table S4. Primers used for constructing SPMAv2.

Table S5. Primers used for constructing SPMAv3.

Table S6. Primers used for constructing mutations.

Table S7. Normalized microarray data (SPMAv2) for kinetics of response to CSP by strain CP1250.

Table S8. Normalized microarray data (SPMAv2) for kinetics of response to CSP by comX deficient strain CPM8.

Table S9. Normalized microarray data (SPMAv3) for kinetics of response to CSP by strain CP1250.

Table S10. Comparison of gene expression in WT and comX deficient strain before CSP treatment.

Table S11. Coefficients of variation for the data in Table S7.

Table S12. Coefficients of variation for the data in Table S8.

Table S13. Coefficients of variation for the data in Table S9.

Supporting Information

Filename                    Format  Size  Description
MMI_3907_sm_TableS1.pdf     PDF     99K   Supporting info item
MMI_3907_sm_TableS2.pdf     PDF     10K   Supporting info item
MMI_3907_sm_TableS3.pdf     PDF     59K   Supporting info item
MMI_3907_sm_TableS4.pdf     PDF     75K   Supporting info item
MMI_3907_sm_TableS5.pdf     PDF     183K  Supporting info item
MMI_3907_sm_TableS6.pdf     PDF     16K   Supporting info item
MMI_3907_sm_TableS7.pdf     PDF     179K  Supporting info item
MMI_3907_sm_TableS8.pdf     PDF     143K  Supporting info item
MMI_3907_sm_TableS9.pdf     PDF     394K  Supporting info item
MMI_3907_sm_TableS10.pdf    PDF     647K  Supporting info item
MMI_3907_sm_TableS11.pdf    PDF     77K   Supporting info item
MMI_3907_sm_TableS12.pdf    PDF     78K   Supporting info item
MMI_3907_sm_TableS13.pdf    PDF     153K  Supporting info item
MMI_3907_sm_FigS1.pdf       PDF     383K  Supporting info item
MMI_3907_sm_FigS2.ppt       PPT     70K   Supporting info item
MMI_3907_sm_FigS3.ppt       PPT     68K   Supporting info item

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.