In many streptococci, competence for natural DNA transformation is regulated by the Rgg-type regulator ComR and the pheromone ComS, which is sensed intracellularly. We compared the ComRS systems of four model streptococcal species using in vitro and in silico approaches, to determine the mechanism of the ComRS-dependent regulation of competence. In all systems investigated, ComR was shown to be the proximal transcriptional activator of the expression of key competence genes. Efficient binding of ComR to DNA is strictly dependent on the presence of the pheromone (C-terminal ComS octapeptide), in contrast with other streptococcal Rgg-type regulators. The 20 bp palindromic ComR-box is the minimal genetic requirement for binding of ComR, and its sequence directly determines the expression level of genes under its control. Despite the apparent species-specific specialization of the ComR–ComS interaction, mutagenesis of ComS residues from Streptococcus thermophilus highlighted an unexpected permissiveness with respect to its biological activity. In agreement, heterologous ComS, and even primary sequence-unrelated, casein-derived octapeptides, were able to induce competence development in S. thermophilus. The lack of stringency of ComS sequence suggests that competence of a specific Streptococcus species may be modulated by other streptococci or by non-specific nutritive oligopeptides present in its environment.
Lateral gene transfer (LGT) plays a dominant role in driving the evolution and survival of microbial populations. Among LGT mechanisms, natural transformation of extracellular, naked double-stranded DNA contributes to genome plasticity through the stable acquisition/loss of genetic material (Lorenz and Wackernagel, 1994). Only a fraction of the population is usually able to transform DNA, by entering a transient physiological state termed competence during which proteins necessary for DNA uptake and processing are produced (Johnsborg et al., 2007). Competence for DNA transformation is usually an inducible trait. Notably, Gram-positive species respond to secreted signalling oligopeptides referred to as competence pheromones/alarmones. They belong to a class of cell–cell communication molecules which enable bacterial (sub)populations to coordinate a physiological function. Their production is initiated in response to specific environmental stresses or conditions (Claverys et al., 2006). At a threshold concentration, competence pheromones activate a master regulatory system, which ultimately leads to the induction of genes required for transformation (for a review see Johnsborg et al., 2007).
In streptococci, the competence state is initiated by the specific production of the alternative sigma factor ComX, which induces a transcriptional reprogramming of cells (Lee and Morrison, 1999). ComX transiently associates with the core RNA polymerase to specifically target the ComX-box DNA-binding motif located at position −10 in the promoter of competence-related genes (Johnsborg et al., 2007). Based on genome sequence analysis, the ComX sigma factor and gene products required for natural transformation sensu stricto are present in all streptococci, indicating that probably all species are competent (Claverys and Martin, 2003; Claverys et al., 2006; Johnsborg et al., 2007). However, control of comX expression differs between streptococci: at least two unrelated master signalling systems – with different activation mechanisms – have evolved to directly govern the expression of comX (ComCDE and ComRS) (Havarstein, 2010; Mashburn-Warren et al., 2010; Fontaine et al., 2010a).
The ComCDE system is prevalent in species of the mitis and anginosus groups and has been extensively studied in S. pneumoniae (for a review see Johnsborg et al., 2007). The pheromone precursor ComC is secreted and matured through a specific ABC-transporter (ComAB) into a 17-mer peptide, CSP (competence stimulating peptide). CSP is sensed extracellularly by a specific membrane histidine kinase, ComD, which transduces the signal through phosphorelay to the cytoplasmic transcriptional regulator ComE. Phosphorylated ComE directly binds as a dimer to a DNA motif (ComE-box) present in the sequence of ComCDE-regulated promoters, and stimulates their expression (for a recent study of the activation mechanism, see Martin et al., 2012). Besides comX, comAB and comCDE are also part of the core ComCDE regulon, resulting in a positive feed-back loop.
The ComRS system was recently discovered in species of the salivarius (Fontaine et al., 2010a), mutans, pyogenic and bovis groups (Mashburn-Warren et al., 2010). Its role as the master activator of comX expression was demonstrated experimentally in one representative species of each group (Fontaine et al., 2010a; Mashburn-Warren et al., 2010; 2012), except for the bovis group. ComR is a transcriptional regulator of the Rgg family of pleiotropic transcriptional regulators, which are characterized by an N-terminal helix–turn–helix (HTH) DNA-interacting domain and a C-terminal alpha-helical domain. ComS is the precursor of the secreted competence pheromone XIP (ComX inducing peptide). The determinants of ComS secretion and maturation remain unknown, but the C-terminal part of ComS (minimal active size: 7 to 8 residues) was shown to recapitulate the pheromone activity in S. thermophilus (Fontaine et al., 2010a,b), S. mutans (Mashburn-Warren et al., 2010; Khan et al., 2012) and S. pyogenes (Mashburn-Warren et al., 2012). Based on the primary sequence of XIP peptides, the ComRS systems of salivarius and pathogenic streptococci have been classified into type I and type II respectively (Mashburn-Warren et al., 2010). The C-terminal pheromone domain of type I XIP is characterized by the presence of a (V/L)P(F/Y)F motif, and contains no charged residues. Type II XIP peptides contain a C-terminal WW motif and in some cases, a basic or acidic residue, or both (Mashburn-Warren et al., 2010). While the ComCDE system has been extensively studied, identification of the ComRS system is very recent and the proposed model of competence regulation still lacks important experimental evidence. In this model, mature extracellular pheromone XIP is produced and then re-imported via the Opp (Ami) oligopeptide transporter. In the cytoplasm, XIP would then directly interact with the competence regulator ComR, thereby stimulating binding of ComR to a 20 bp motif (ECom-box) identified in the promoter sequences of comS and comX, and activating their transcription (Mashburn-Warren et al., 2010; 2012; Fontaine et al., 2010a).
The hypothesis of a direct interaction between XIP and ComR has been inferred from predicted structural similarities between ComR and members of the RNPP superfamily (for Rap/NprR/PlcR/PrgX) (Mashburn-Warren et al., 2010; Fontaine et al., 2010a), which includes pheromone-responsive regulators involved in cell–cell signalling pathways (Declerck et al., 2007). These regulators are characterized by a C-terminal alpha-helical domain containing tetratricopeptide repeats (TPR), to which short pheromone oligopeptides bind following re-importation through Opp. Interaction of the TPR domain with the cognate pheromone modulates the DNA binding activity of the N-terminal HTH domain (for a review see Rocha-Estrada et al., 2010). Different mechanisms of transcriptional activation have been reported among RNPP members. In the well-characterized PlcR–PapR system of Bacillus cereus, interaction of PlcR with the pheromone PapR results in a rearrangement of the HTH domains of PlcR dimers, which stimulates binding of PlcR to a specific DNA sequence (the PlcR-box) in target promoters. Interactions between PlcR and the RNA polymerase would then activate transcription (Declerck et al., 2007). In the cCF10-PrgX de-repression system of Enterococcus faecalis (Shi et al., 2005), interaction of PrgX with the cCF10 pheromone destabilizes a head-to-head interaction between two DNA-bound PrgX dimers, resulting in the disruption of a DNA-looping structure in promoters, which would then allow for the binding of the RNA polymerase.
Recently, a novel class of pheromone-interacting regulators was described in streptococci: the small hydrophobic peptides (SHP)-associated Rgg (Fleuchot et al., 2011). Although sharing predicted structure similarities with RNPP members, in vitro examination of individual systems further expanded the range of possible mechanisms of transcriptional regulation. Indeed, for all SHP-Rgg systems described so far, binding of the Rgg regulators to their DNA operator occurs in the absence of the cognate SHP pheromone. In the case of the transcriptional activators Rgg2 from S. pyogenes (LaSarre et al., 2013) and Rgg1358 from S. thermophilus (Fleuchot et al., 2011), subsequent interaction with the cognate SHP is believed to induce a conformational change in the DNA-bound regulator, which is required for interaction with the RNA polymerase. In S. pyogenes, the repressor activity of Rgg3 is relieved by SHP3, which stimulates the release of the regulator from its DNA binding site (Chang et al., 2011; LaSarre et al., 2013).
In view of the diverse activation mechanisms already reported for regulatory pathways that involve pheromone-interacting regulators, several scenarios of competence activation by ComRS may be considered. The main objective of the present work was to study the impact of ComS on the DNA binding activity of ComR on the promoters of comS and comX. First, a detailed analysis of the core regulon of the type I ComR of S. thermophilus, a salivarius model species, was conducted to unravel the role of ComS and the ECom-box motif in both ComR binding and the activity of controlled promoters. Second, the activation mechanism and specificity of type I and type II systems were compared by analysing the type II ComRS system of one representative species of the mutans, pyogenic and bovis groups. Finally, the criticality of ComS residues with respect to its biological activity, and the possibility of cross-talk between ComR and non-cognate ComS or peptides derived from the ecological niche, were investigated.
ComS-stimulated DNA-binding activity of ComR
Previous studies identified ComR as a key player in the induction of competence (Fontaine et al., 2010a; Mashburn-Warren et al., 2010). The transcriptional regulator ComR, together with the DNA motif ECom-box present upstream of comS (encoding the precursor of the competence pheromone) and comX (encoding the competence-specific sigma factor ComX), were shown to be required for the induction of their expression in S. thermophilus (Fontaine et al., 2010a). In agreement, heterologous expression of ComR from S. thermophilus in S. pneumoniae – a species lacking both ComR and ComS – was recently reported to result in successful expression of genes inserted downstream of an ECom-box motif, in the presence of extracellular XIP (Berg et al., 2011). The proposed activation mechanism involves direct binding of ComR to the ECom-box (Fontaine et al., 2010a; Mashburn-Warren et al., 2010; 2012) but this interaction has not been demonstrated so far.
In order to provide direct evidence for binding of ComR to the PcomS promoter, electrophoretic mobility shift assays (EMSA) were performed using a purified preparation of ComR from S. thermophilus LMD-9 fused to a C-terminal StrepII-tag (ComRLMD-9-Strep), and a 5′-fluorescently-labelled DNA probe encompassing the promoter region of comS (Cy3-PcomS). Mixing ComRLMD-9-Strep and Cy3-PcomS alone in the reaction resulted in extremely weak binding (Fig. 1A). In contrast, when the mature form of ComS (ComS17–24, previously shown to be sufficient for competence induction) (Fontaine et al., 2010b) was added, binding of ComR to the DNA probe was strongly enhanced. In the presence of ComS17–24 alone (no ComR in the reaction), no gel retardation of the probe was observed (data not shown), confirming that ComS17–24 does not bind to the DNA by itself. The specificity of the ComR-PcomS complex was confirmed by competition experiments with either unlabelled PcomS or PcomR probes (PcomR does not contain the ECom-box motif). As expected, the labelled ComR-PcomS complex gradually disappeared in the presence of increasing concentrations of unlabelled PcomS, but was not affected by increasing concentrations of the non-specific competitor PcomR probe (Fig. 1A). EMSA experiments suggested the presence of two different shifted ComR–DNA complexes (C1 and C2) (Fig. 1A). The nature of these complexes was not investigated, but they might reflect different stoichiometries of the interacting partners. In any case, both complexes are specific, as shown in Fig. 1A.
The specific interaction of ComR with the previously identified ECom-box in the presence of ComS17–24 was further investigated by repeating EMSA experiments with a Cy3-labelled double-stranded DNA probe comprising the ECom-box from PcomS (20 bp), flanked by 5 bp on each side (probe Cy3-boxPcomS) (Fig. 1B). As observed with the full-length PcomS promoter, binding of ComR was strongly enhanced in the presence of ComS17–24. These results indicate that the ECom-box is sufficient for interaction with ComR.
Further EMSA experiments were performed with digested double-stranded PcomS probes labelled on each strand (Cy3 and Cy5 for the top and bottom strands respectively; probe Cy3/5-PcomS; Fig. 1C). As expected, digestion outside the ECom-box (SfcI) did not affect binding of ComR to PcomS. In these conditions, only the fragment containing the ECom-box (Cy3-labelled) was shifted. In contrast, digestion with DrdI or Tsp45I, which cut between or within the inverted repeats of the ECom-box, respectively, completely abolished binding of ComR to the probe.
Altogether, these results formally demonstrate that ComR directly binds to the PcomS DNA region. This binding activity of ComR is strictly dependent on the presence of the activating ComS pheromone, and exclusively involves the 20 bp ECom-box. The ECom-box was therefore renamed ComR-binding motif or ComR-box.
Stoichiometry of the ComR–DNA complex
The specificity and stoichiometry of the ComS-activated DNA-binding activity of ComR were further investigated using surface plasmon resonance (SPR). The SPR sensor chip was functionalized with a 125 bp biotinylated probe comprising the ComR-box of PcomS.
A first set of experiments was performed in order to determine the ratio of ComS to ComR required to reach saturation of the DNA binding activity. As shown in Fig. S1A, no specific interaction of ComRLMD-9-Strep was observed in the absence of ComS17–24, confirming the strict dependence of ComR DNA-binding activity on ComS. Under the conditions of the assay, the SPR signal increased with increasing ComS17–24 concentrations, and reached a plateau at a molar ratio of 2 ComS17–24 per ComRLMD-9-Strep (Fig. S1A). These conditions, i.e. a 2:1 molar ratio between ComS and ComR, were used for all subsequent SPR experiments.
In a second set of SPR experiments, increasing concentrations of a mixture of ComS17–24 and ComRLMD-9-Strep were injected on the biochip, until saturation of the DNA ligand was reached (Fig. S1B). The deduced stoichiometry of the ComR–DNA complex involves 2 mol ComRS per mol DNA (Fig. 1D), suggesting a possible dimerization of the ComRS complex on the DNA. This observation is in agreement with studies of other HTH domain-containing proteins, which are dimeric in their simplest oligomeric state (Aravind et al., 2005). Based on the architecture of the ComR-box, it can be inferred that one monomer of ComR is bound to each of the ComR-box inverted repeats.
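How a stoichiometry can be deduced from an SPR saturation level is sketched below, using the standard relation between the saturation response, the amount of immobilized ligand and the molecular masses of the two partners. This is only an illustration of that general relation: the exact calculation used for Fig. 1D is not described here, and every number in the example (response units, molecular masses) is a hypothetical placeholder rather than a measurement from this study.

```python
def binding_stoichiometry(rmax_ru, ligand_ru, mw_analyte, mw_ligand):
    """Analyte molecules bound per immobilized ligand molecule at saturation.

    Uses the usual SPR mass-ratio relation:
        Rmax = ligand_ru * (mw_analyte / mw_ligand) * n
    solved for n.
    """
    if min(rmax_ru, ligand_ru, mw_analyte, mw_ligand) <= 0:
        raise ValueError("all inputs must be positive")
    return (rmax_ru / ligand_ru) * (mw_ligand / mw_analyte)


# Hypothetical illustration values (assumptions, not data from the paper).
MW_DNA_PROBE = 125 * 650.0   # ~650 Da per base pair for a 125 bp probe
MW_COMRS = 36_000.0          # assumed mass of one ComR plus one octapeptide

n = binding_stoichiometry(
    rmax_ru=88.6,            # assumed response at saturation
    ligand_ru=100.0,         # assumed amount of immobilized DNA
    mw_analyte=MW_COMRS,
    mw_ligand=MW_DNA_PROBE,
)
print(f"ComRS bound per DNA probe: {n:.2f}")
```

With these placeholder inputs the ratio comes out close to 2, which is the kind of result that would be read as two ComRS units per DNA molecule.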
Definition and function of the ComRS core regulon of S. thermophilus
Previous transcriptomic analyses identified putative ComR-boxes in the promoter regions of comS (shp0316), comX, ster_1655, ster_1643 (blp_orf4) and ster_1720 (Fontaine et al., 2010a). Based on the sequence of these five ComR-boxes, a genome-wide search was performed for ComR-boxes in S. thermophilus LMD-9. This search retrieved three additional boxes, located in the promoter regions of ster_1817, ster_1731, and of the non-annotated gene cds0064 (Fig. S2). All identified ComR-boxes are immediately followed by a T-tract and are respectively located 22 bp and 34 bp upstream of the −10 box and the transcriptional start site, which was determined experimentally (PcomS, PcomX, Pster_1655 and Pster_1643) or deduced from the alignment between promoter sequences (Pster_1720, Pcds0064, Pster_1817 and Pster_1731) (Fig. S2). As demonstrated by EMSA experiments, ComR was able to bind to all eight ComR-box containing promoters, albeit with different apparent affinities (Fig. 2A). In all cases, binding was dependent on the presence of ComS17–24. In contrast, no binding to the PcomR probe could be observed.
Overall, the eight ComR-boxes potentially control the expression of 17 genes (Fig. 3), which we propose to define as the ComRS core regulon, i.e. genes under direct transcriptional control of ComRS. Three genes are non-annotated in public databases, but were identified in a previous study (cds0189, cds0064 and comS, which are respectively located downstream of comX, ster_0064 and comR) (Ibrahim et al., 2007), and three are pseudogenes encoding transporters (comA, ster_1718 and ster_1719). In order to assess the involvement of the ComRS core regulon in the development of competence in S. thermophilus LMD-9, systematic deletion of each of the eight clusters was performed in a PcomX–luxAB reporter strain. Transformation efficiency and luciferase activity of the reporter strains were compared to the wild-type strain. Based on their predicted functions, genes of the ComRS core regulon can be classified in competence- (clusters comX-cds0189 and comS-ster_0319) and bacteriocin-related genes (all other clusters) (Fig. 3). In agreement, only comS and comX were found to be required for the induction of comX expression (PcomX-driven luciferase activity) and for subsequent DNA transformation sensu stricto (transformation frequency), respectively, while deletion of all other clusters of the ComRS core regulon had no effect on these parameters (Table 1).
Table 1. Competence development in S. thermophilus LMD-9 derivative strains deleted for clusters of the ComRS core regulon
a Calculated as the ratio of transformants (erythromycin-resistant cfu) to the total cfu count per 1 μg of pGIUD0855ery plasmid DNA. Transformation frequencies are expressed as the arithmetic mean of three independent experiments. Geometric means ± standard deviations (expressed in log10) are provided between brackets.
b Student's t-test performed on log10-transformed data. The asterisk ‘*’ indicates a significant difference (P < 0.05) compared with LMD-9.
c Arithmetic mean of three independent experiments. Geometric means ± standard deviations (expressed in log10) are provided between brackets.
d Student's t-test performed on log10-transformed data. The asterisk ‘*’ indicates a significant difference (P < 0.05) compared to CB001.
1.5E-04 (−3.9 ± 0.2)
2.6E-04 (−3.6 ± 0.1)
2.2E-04 (−3.7 ± 0.2)
1.6E+07 (7.2 ± 0.1)
5.0E+03 (3.7 ± 0.1)
2.6E-04 (−3.6 ± 0.2)
1.6E+07 (7.2 ± 0.0)
2.1E-04 (−3.7 ± 0.1)
1.7E+07 (7.2 ± 0.0)
2.1E-04 (−3.5 ± 0.2)
1.6E+07 (7.2 ± 0.2)
2.9E-04 (−3.8 ± 0.4)
2.0E+07 (7.3 ± 0.2)
2.6E-04 (−3.6 ± 0.1)
1.4E+07 (7.1 ± 0.1)
1.9E-04 (−3.8 ± 0.3)
2.3E+07 (7.4 ± 0.1)
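The summary statistics named in the footnotes of Table 1 (arithmetic mean, mean and standard deviation of log10-transformed values, and a Student's t-test on the log10 data) can be sketched as follows. The replicate values are hypothetical and the two-sample, equal-variance form of the test is an assumption. The critical value 2.776 is the two-sided 5% threshold of Student's t distribution for 4 degrees of freedom, i.e. three replicates per strain.

```python
import math
import statistics


def log10_summary(values):
    """Arithmetic mean, plus mean and SD of the log10-transformed values."""
    logs = [math.log10(v) for v in values]
    return statistics.mean(values), statistics.mean(logs), statistics.stdev(logs)


def student_t_log10(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) on log10 data."""
    la = [math.log10(v) for v in sample_a]
    lb = [math.log10(v) for v in sample_b]
    na, nb = len(la), len(lb)
    pooled_var = (
        (na - 1) * statistics.variance(la) + (nb - 1) * statistics.variance(lb)
    ) / (na + nb - 2)
    t = (statistics.mean(la) - statistics.mean(lb)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Hypothetical transformation frequencies, three experiments per strain.
reference_strain = [1.0e-4, 1.5e-4, 2.0e-4]
deletion_strain = [1.0e-6, 2.0e-6, 1.5e-6]

arith, log_mean, log_sd = log10_summary(reference_strain)
t_value, df = student_t_log10(reference_strain, deletion_strain)
T_CRITICAL_DF4 = 2.776  # two-sided, P = 0.05, 4 degrees of freedom
print(f"{arith:.1E} ({log_mean:.1f} ± {log_sd:.1f})")
print("significant" if abs(t_value) > T_CRITICAL_DF4 else "not significant")
```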
Differential in vivo expression from ComR-box containing promoters
ComR-driven transcriptional regulation of the eight ComR-box containing promoters was further studied in vivo. Reporter strains carrying transcriptional fusions between each of the identified promoter regions and the reporter genes luxAB were constructed as previously described (Fontaine et al., 2007). The promoter region of comR was used as a negative control, since it does not contain a ComR-box and was shown not to bind ComR in EMSA experiments (Fig. 2A). First, the time-course of specific light emission (RLU OD600−1) was studied during growth of the reporter strains in CDML medium, i.e. conditions previously shown to spontaneously induce the competence phenotype in strain LMD-9 (Gardan et al., 2009). As shown in Fig. 2B, all eight reporter strains with ComR-box containing promoters displayed a similar profile of light emission, with induction starting about 1 h and 40 min after inoculation of the culture (early exponential phase), and a transitory maximum specific light emission being reached between 2 and 3 h after inoculation. In contrast, light production in the PcomR reporter strain was detected immediately, while the maximum specific light emission was reached about 1 h and 30 min after inoculation. These results strongly suggest that the eight promoters containing a ComR-box share a common activation mechanism. The involvement of ComR and ComS in this mechanism was confirmed by deletion of comR or comS in reporter strains for PcomS and PcomR (Fig. S3). Deletion of either gene completely abolished light production in a PcomS reporter strain, confirming results previously obtained for PcomX (Fontaine et al., 2010a), while it did not dramatically affect the light emission profile in a PcomR reporter strain. These observations confirm that promoters containing a ComR-box are specifically induced by ComR in the presence of ComS.
Although sharing similar induction profiles, the different ComR-regulated promoters displayed differences in their maximum specific light production levels (Fig. 2B). Notably, the three promoters displaying the lowest maximum light emission (Pcds0064, Pster_1817 and Pster_1731) were the same ones that displayed the weakest binding in EMSA (Fig. 2A). The relative ‘strength’ of the promoters was measured by comparing the maximum light production of each reporter strain cultivated in the presence versus in the absence of 1 μM of the competence-inducing pheromone ComS17–24, in THBL-based medium, i.e. conditions previously shown to be non-permissive for competence development (data not shown and Blomqvist et al., 2006). Light emission in THBL medium in the absence of supplemented ComS17–24 therefore reflects the basal activity of the promoters. For each reporter strain, an induction factor (IF) was calculated as the ratio of maximum specific light emission in the presence and absence of ComS17–24. As summarized in Fig. 2C, all reporter strains with ComR-box containing promoters displayed increased light production (IF above 10) in the presence of ComS17–24, whereas transcription from PcomR was not affected (IF = 1.16). As already observed under conditions of spontaneous induction (CDML medium; Fig. 2B), significant differences were observed between the eight ComR-regulated promoters, in terms of IF: three promoters display an IF above 1000 (Pster_1655, PcomS and Pster_1643), three have an IF between 100 and 1000 (PcomX, Pster_1720 and Pcds0064), and two have an IF between 10 and 100 (Pster_1817 and Pster_1731).
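The induction factor defined above is a simple ratio, and the three ranges used to group the promoters can be written as explicit thresholds. The sketch below illustrates this; the function names and the light-emission readings are hypothetical, and only the thresholds (10, 100, 1000) come from the text.

```python
def induction_factor(max_rlu_with_coms, max_rlu_without_coms):
    """Ratio of maximum specific light emission with vs. without ComS17-24."""
    if max_rlu_without_coms <= 0:
        raise ValueError("basal light emission must be positive")
    return max_rlu_with_coms / max_rlu_without_coms


def if_class(if_value):
    """Assign an induction factor to one of the ranges used in the text."""
    if if_value > 1000:
        return "IF > 1000"
    if if_value > 100:
        return "100 < IF <= 1000"
    if if_value > 10:
        return "10 < IF <= 100"
    return "IF <= 10"


# Hypothetical maximum specific light emissions (RLU per OD600 unit).
readings = {
    "promoter_A": (2.0e6, 1.0e3),
    "promoter_B": (4.5e4, 1.5e3),
    "control": (1.16e3, 1.0e3),
}
for name, (induced, basal) in readings.items():
    value = induction_factor(induced, basal)
    print(name, round(value, 2), if_class(value))
```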
Definition of a ComR-box consensus sequence and prediction of ComR binding affinities
A position weight matrix (PWM) was created from the sequence of the 8 ComR-boxes of S. thermophilus LMD-9. Perfect symmetry was assumed between the two 9 bp inverted repeats – based on the hypothesis that each inverted repeat (arrows in Fig. 2E) represents the binding sequence for one ComR monomer – but not between the two central nucleotides, which were assumed not to be part of the ComR binding sites. The deduced ComR-box PWM for S. thermophilus LMD-9 is provided in Dataset S1 and shown as a logo in Fig. 2E.
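One way to build a matrix under the symmetry assumption described above is to treat each 20 bp box as two observations of the same 9 bp half-site: the left repeat as written, and the reverse complement of the right repeat. The sketch below does this while ignoring the two central nucleotides. The example sequences and the pseudocount are assumptions made for illustration; the actual matrix is the one given in Dataset S1.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def symmetric_half_site_pwm(boxes, half=9, pseudocount=0.5):
    """Per-position base frequencies of one half-site.

    Each box contributes two half-sites: its first `half` bases and the
    reverse complement of its last `half` bases. The two central bases
    are left out, as they are assumed not to be part of the binding site.
    """
    counts = [Counter() for _ in range(half)]
    for box in boxes:
        if len(box) != 2 * half + 2:
            raise ValueError(f"expected a {2 * half + 2} bp box: {box!r}")
        for site in (box[:half], reverse_complement(box[-half:])):
            for pos, base in enumerate(site):
                counts[pos][base] += 1
    total = 2 * len(boxes) + 4 * pseudocount
    return [
        {base: (c[base] + pseudocount) / total for base in "ACGT"}
        for c in counts
    ]


# Hypothetical 20 bp boxes (not the S. thermophilus sequences).
example_boxes = ["TAGTGACATATATGTCACTA", "AAGTGACATTTATGTCACTA"]
for position, freqs in enumerate(symmetric_half_site_pwm(example_boxes), start=1):
    print(position, {b: round(f, 2) for b, f in freqs.items()})
```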
The relative binding affinity of ComR for the different ComR-boxes was predicted from the ComR-box PWM, except that only positions corresponding to the two inverted repeats were taken into account. The approach was based on a previous publication (Hardiman et al., 2010), with modifications described in Experimental procedures. All predicted binding affinities were calculated relative to the PcomS ComR-box. A linear relationship was observed between the predicted binding affinity and the measured IF in THBL medium (Fig. 2D). The ranking of promoters based on the prediction of binding affinities also correlated with their ranking based on maximum light emission in CDML conditions (Fig. 2B). Similarly, the ComR-boxes from the three promoters displaying a weaker ComR binding affinity in EMSA experiments (cds0064, ster_1817 and ster_1731) also had the lowest predicted affinity. These observations indicate that affinity of ComR for the ComR-box is the main determinant for expression level from these promoters. Yet, it is possible that other factors are involved – at least for some promoters – since in two cases, the measured relative IF was significantly different from the predicted binding affinity based on ComR-box inverted repeats alone (ster_1720 and ster_1643) (Dataset S1).
Interestingly, promoters of S. thermophilus that display the weakest measured IF and predicted ComR binding affinity (cds0064, ster_1817 and ster_1731) specifically harbour one substitution in the central GACA/TGTC motif, which is not observed in other ComR-boxes. To assess the role of this motif in the binding of ComR to DNA, EMSA experiments were performed with DNA probes containing substitutions at each of the four positions of the GACA motif. The ComR-box of Pster_1655 was selected as a probe template since it displays the highest predicted relative affinity for ComR and shows a perfect symmetry between the two inverted repeats. To respect the palindromic nature of the two ComR monomer-binding sites, each base substitution was mirrored in the two inverted repeats of the mutagenized ComR-boxes. In agreement with in silico predictions based on the S. thermophilus ComR-box PWM (data not shown), results show that any substitution in the GACA motif dramatically reduces or completely abolishes the ComS-dependent binding of ComR to DNA (probes boxPster_1655 G5T/C16A; A6C/T15G; C7A/G14T; A8C/T13G in Fig. 4). Mutation at position 2 of the ComR-box – which is located outside the GACA motif and is predicted to have the weakest impact on binding affinity (lowest information content; Fig. 2E) – results in a much smaller decrease in ComR binding (probe boxPster_1655 A2C/T19G in Fig. 4). These observations confirm that the highly conserved GACA motif is a critical determinant of the ComR-box.
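A generic way to turn a matrix of base frequencies into a relative affinity is to sum log-odds scores over the half-site positions and express the result relative to a reference box. The sketch below follows that generic scheme and should not be read as the modified procedure of Hardiman et al. (2010) used in this work, whose details are given in Experimental procedures. The toy matrix (a 4-position half-site), the uniform background of 0.25 and the sequences are all assumptions.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def half_site_score(site, pwm, background=0.25):
    """Sum of log-odds scores of one half-site against a uniform background."""
    return sum(math.log(pwm[i][base] / background) for i, base in enumerate(site))


def box_score(box, pwm):
    """Score both inverted repeats; any central bases are ignored."""
    half = len(pwm)
    left = box[:half]
    right = reverse_complement(box[-half:])
    return half_site_score(left, pwm) + half_site_score(right, pwm)


def relative_affinity(box, reference_box, pwm):
    """Predicted affinity of `box` relative to `reference_box` (reference = 1)."""
    return math.exp(box_score(box, pwm) - box_score(reference_box, pwm))


# Toy half-site matrix favouring G-A-C-A (hypothetical frequencies).
TOY_PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]
reference = "GACA" + "AT" + "TGTC"   # both repeats match the preferred bases
variant = "GACT" + "AT" + "TGTC"     # one substitution in the left repeat
print(round(relative_affinity(variant, reference, TOY_PWM), 3))
```

In this scheme a single substitution lowers the predicted affinity by the ratio of the two base frequencies at that position, which is why positions with high information content dominate the prediction.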
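The mirrored substitutions follow a simple rule: in a 20 bp box, a change at position i is accompanied by the complementary change at position 21 − i. The sketch below applies that rule and rebuilds the probe labels used above. The template is a hypothetical perfectly symmetric box chosen only so that positions 2, 5 to 8, 13 to 16 and 19 carry the bases named in the text; it is not the actual Pster_1655 sequence.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mirrored_substitution(box, position, new_base):
    """Substitute `new_base` at a 1-based position and mirror it.

    The mirrored position is len(box) + 1 - position and receives the
    complement of `new_base`, which keeps the inverted repeats symmetric.
    Returns the mutated sequence and a label such as 'G5T/C16A'.
    """
    n = len(box)
    if not 1 <= position <= n:
        raise ValueError("position out of range")
    mirror = n + 1 - position
    new_mirror_base = COMPLEMENT[new_base]
    label = (
        f"{box[position - 1]}{position}{new_base}/"
        f"{box[mirror - 1]}{mirror}{new_mirror_base}"
    )
    seq = list(box)
    seq[position - 1] = new_base
    seq[mirror - 1] = new_mirror_base
    return "".join(seq), label


# Hypothetical symmetric template (assumption, not the real probe sequence).
TEMPLATE = "TAGTGACATATATGTCACTA"
for pos, base in [(5, "T"), (6, "C"), (7, "A"), (8, "C"), (2, "C")]:
    mutated, label = mirrored_substitution(TEMPLATE, pos, base)
    print(label, mutated)
```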
Mechanism of activation of type II ComRS systems
EMSA experiments were performed to determine whether the mechanism of ComRS activation identified for S. thermophilus LMD-9 can be extended to type II ComRS systems of pathogenic Streptococcus species from the bovis, pyogenic, and mutans groups. Purified recombinant StrepII-tagged ComR proteins from one representative strain of each group (S. gallolyticus UCN34, S. pyogenes M1GAS and S. mutans UA159) were incubated with Cy3-labelled double-stranded DNA probes corresponding to the PcomX and PcomS regions from the same strain (Fig. 5). Binding was assayed in the presence or absence of the cognate ComS octapeptides [ComSUCN34(8–15), ComSM1GAS(24–31) and ComSUA159(10–17) respectively]. In all cases, binding was observed only when both cognate ComR and ComS were present, indicating that the mechanism of activation of the binding activity of ComR to promoter regions containing a ComR-box is conserved among type I and type II ComRS systems. More specifically, it can be concluded that the induction of comX and comS expression is also under direct control of the ComRS system in pathogenic streptococci from pyogenic, bovis and mutans groups.
Cross-species specificity of DNA binding activity of ComRS systems
Sequence comparison between ComR-boxes from comX and comS of the four Streptococcus species investigated, indicates absolute conservation of the 4 bp inverted repeat GACA/TGTC immediately surrounding the central two nucleotides of the ComR-box (highlighted nucleotides in Fig. S4A) (Mashburn-Warren et al., 2010), indicating that this motif is important for both type I and type II ComR binding (see above). However, ComR-box sequence conservation outside these repeats is significantly lower (Mashburn-Warren et al., 2010), resulting in overall ComR-box sequence identities ranging from 50% to 90% between the different species (Fig. S4B).
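For two aligned boxes of equal length, the sequence identity quoted here is simply the fraction of matching positions; with 20 bp boxes it therefore moves in steps of 5%. A minimal sketch, using hypothetical sequences rather than those of Fig. S4:

```python
def percent_identity(box_a, box_b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(box_a) != len(box_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(box_a, box_b))
    return 100.0 * matches / len(box_a)


# Hypothetical 20 bp boxes differing at two positions outside GACA/TGTC.
box_1 = "TAGTGACATATATGTCACTA"
box_2 = "AAGTGACATATATGTCACTT"
print(percent_identity(box_1, box_2))
```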
Exploration of cross-species specificity of the DNA binding activity of the different ComRS systems was performed by EMSA experiments (Table 2 and Fig. S4C). Recognition of non-cognate promoter regions varied among the different ComR variants, reflecting the intra-species divergence of ComR-box sequences. The ComR/ComS pair of S. pyogenes M1GAS displayed the broadest specificity, and was able to bind PcomS and PcomX from all other species investigated, including those from S. thermophilus LMD-9, with an apparent similar efficiency. This is in agreement with the observation that PcomS and PcomX ComR-boxes from S. pyogenes M1GAS display the lowest sequence identity among the other tested streptococci (only 80% identity; Fig. S4B). The ComRS system of S. thermophilus LMD-9 was also able to recognize all probes containing a ComR-box, despite the fact that sequence identity between ComR-boxes from S. thermophilus and other species was low (between 50% and 75%; Fig. S4B). However, binding to non-cognate probes was apparently less efficient (Fig. S4C and Table 2). In the case of S. gallolyticus UCN34, binding of ComRS was observed on all probes except those from the more distantly related S. thermophilus LMD-9. Finally, ComRS from S. mutans UA159 displayed a narrow binding specificity, restricted to cognate PcomS and PcomX, as well as PcomX from S. pyogenes M1GAS, which shows the highest sequence similarity with S. mutans ComR-boxes (85% and 90% identity versus S. mutans UA159 PcomX and PcomS respectively; Fig. S4B). These results indicate that the N-terminal DNA-binding domains of ComR proteins have co-evolved with their target DNA sequence (ComR-box), resulting in a species-specific level of specialization in extreme cases (S. mutans), while other variants retained broader specificity (such as S. pyogenes).
Table 2. Specificity of type I and type II ComRS systems from streptococci (EMSA experiments)
a Deduced from EMSA experiments (qualitative results). +: complete shift of the Cy3-probe; −: no shift of the Cy3-probe; (+): partial shift of the Cy3-probe. The PcomR region from S. thermophilus LMD-9 was included as a negative control. 150 ng Cy3-labelled DNA probes were incubated with ComR/ComS pairs from each species. Separation was performed on a TBE 5% gel under non-denaturing conditions.
S. thermophilus LMD-9
S. mutans UA159
S. pyogenes M1GAS
S. gallolyticus UCN34
Criticality of ComS residues for ComR activation
The importance of ComS amino acid sequence for activation of the ComR DNA-binding activity was investigated by measuring the ability of S. thermophilus ComS17–24 variants to induce light emission in a S. thermophilus LMD-9 ΔcomS PcomS–luxAB reporter strain. In this strain, light emission driven by PcomS expression is extremely low due to absence of comS (Fig. S3). Because PcomS is under direct control of ComR, light production in this strain indirectly reflects the activity of ComS17–24 variants. Peptides were added to CDML-grown cells of the ComS- reporter strain at a final concentration of 1 μM (saturating concentration for ComS17–24; data not shown).
In a first assay, alanine-scanning mutagenesis was performed by individually replacing each residue with alanine (except for A21, which was substituted by glycine). As shown in Fig. 6A and Table S7, the substitution of residues at positions 21 to 23 (AGC) did not affect, or only slightly affected, the activity of ComS. This indicates that the presence of a thiol group at position 23 is not mandatory, which is confirmed by the observation that replacing C23 with serine (similar structure to cysteine but with a hydroxyl group instead of a thiol group) did not dramatically impact ComS activity (Fig. 6B and Table S7). Substitution of residues 17 and 18 (LP) resulted in a slightly larger decrease in ComS activity, but these peptides still retained approximately 50% of the wild-type ComS activity and, although less efficiently, were still able to induce binding of ComR on PcomS in EMSA experiments (data not shown). In contrast, residues at positions 19, 20 and 24 appear more critical for ComR activation, since their substitution by alanine severely affected the ability of ComS to activate ComR. The importance of L24 was further reinforced by the observation that deletion of this residue completely abolishes ComS activity (Fig. 6B and Table S7). EMSA experiments also confirmed the lack of ComR binding activity in the presence of these 4 peptide variants (data not shown).
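The set of peptides used in such a scan can be enumerated mechanically. The sketch below generates one variant per position, replacing each residue with alanine and an existing alanine with glycine, as described above. The octapeptide sequence LPYFAGCL is assembled from the residues named in the text (L17, P18, Y19, F20, A21, G22, C23, L24); the function name is hypothetical.

```python
def alanine_scan(peptide, first_residue_number):
    """Map variant names (e.g. 'Y19A') to single-substitution sequences.

    Every residue is replaced with alanine, except alanine itself,
    which is replaced with glycine.
    """
    variants = {}
    for offset, residue in enumerate(peptide):
        replacement = "G" if residue == "A" else "A"
        name = f"{residue}{first_residue_number + offset}{replacement}"
        variants[name] = peptide[:offset] + replacement + peptide[offset + 1:]
    return variants


COMS_17_24 = "LPYFAGCL"  # residues 17 to 24, as named in the text
for name, sequence in alanine_scan(COMS_17_24, 17).items():
    print(name, sequence)
```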
The importance of residues Y19 and F20 (aromatic amino acids), and L24 (branched-chain amino acid) was further investigated. The replacement of aromatic residues Y19 and F20 with another aromatic amino acid did not significantly affect ComS activity, in contrast to the sharp decrease observed when alanine was introduced at these positions (approximately 10% residual activity; Fig. 6B). Similarly, residue L24 could be replaced with either isoleucine or valine without significant reduction in ComS17–24 activity, while the substitution by alanine or threonine negatively impacted ComS17–24 activity (Fig. 6B).
Altogether, these results indicate a relatively high permissiveness in terms of ComS primary sequence, but highlight two major structural requirements for ComS activity: the presence of two contiguous aromatic amino acids at positions 19 and 20 (positions 3 and 4 of the mature octapeptide), and the presence of a branched-chain amino acid at position 24 (last position of the mature octapeptide).
Activation of ComR from S. thermophilus LMD-9 by ComS from heterologous streptococci
In view of the relatively high permissiveness of the sequence of ComS, cross-species activation of ComR was evaluated. The ability of heterologous type I (S. vestibularis F0396) or type II (S. mutans UA159, S. pyogenes M1GAS, and S. gallolyticus UCN34) ComS octapeptides to activate ComR from S. thermophilus LMD-9 was investigated in vivo, using the mutant reporter strain PcomS–luxAB ΔcomS.
In the presence of 1 μM ComS peptide from S. mutans UA159, a 268-fold increase in light production was detected (Table 3), reflecting the cross-activation of ComR. ComR activity was induced 481-fold when ComSUA159(10–17) was added at a concentration of 5 μM. A similar response was observed with ComS from S. vestibularis F0396, although the induction factors were approximately fourfold lower than for ComSUA159(10–17). Addition of ComS from S. gallolyticus UCN34 resulted in only very low levels of light production. Nevertheless, the observed increase in light emission was dose-dependent, indicating the specific activation of ComR in the presence of this peptide (Table 3). Among all tested heterologous ComS peptides, only ComS from S. pyogenes M1GAS failed to induce significant levels of light production (Table 3). Although the measured induction factors were in all cases at least 10-fold lower than with the endogenous ComS17–24 peptide from S. thermophilus LMD-9, these observations clearly indicate that cross-species activation of the ComRS system is possible.
Table 3. Activation of ComR of S. thermophilus LMD-9 by heterologous peptides
Columns: peptide origin; induction factor (a, b) with 1 μM peptide; induction factor (a, b) with 5 μM peptide
Heterologous streptococcal ComS octapeptides
S. mutans UA159; 267.9 ± 6.7*; 481.4 ± 50.1*
S. vestibularis F0396; 66.1 ± 9.0*; 127.0 ± 20.3*
S. gallolyticus UCN34; 3.5 ± 0.1*; 10.7 ± 0.4*
S. pyogenes M1GAS; 1.3 ± 0.2; 1.4 ± 0.6
α1-casein Bos taurus; 0.9 ± 0.1; 0.9 ± 0.1
α1-casein Bos taurus; 10.7 ± 1.7*; 107.7 ± 8.3*
a Induction factor calculated as the ratio of the maximum specific light production (RLU OD600−1) in the presence versus absence of the corresponding peptide. Mean ± standard deviation of three independent replicates. Fresh CDML medium was inoculated with the ComS-negative reporter strain S. thermophilus LF134 at an initial OD600 of 0.05. Peptides were added after 1 h and 30 min incubation at 37°C.
b Student's t-test performed on log10-transformed data (max. RLU OD600−1). The asterisk ‘*’ indicates a significant difference (P < 0.05) compared to absence of supplemented ComS in strain LF134.
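The induction factor defined in footnote a of Table 3 can be illustrated with a minimal sketch. The readings below are hypothetical placeholder values, not data from this study, and the function names are illustrative only.

```python
def max_specific_light(rlu, od600):
    """Maximum specific light production (RLU per OD600 unit) over a time course."""
    return max(r / o for r, o in zip(rlu, od600))


def induction_factor(with_peptide, without_peptide):
    """Ratio of maximum specific light production with versus without peptide."""
    return max_specific_light(*with_peptide) / max_specific_light(*without_peptide)


# Hypothetical time courses, each given as (RLU readings, OD600 readings).
treated = ([200.0, 4000.0, 9000.0], [0.10, 0.20, 0.30])
control = ([10.0, 20.0, 30.0], [0.10, 0.20, 0.30])

factor = induction_factor(treated, control)
print(f"induction factor: {factor:.1f}")
```

Taking the maximum of the RLU/OD600 ratio, rather than the ratio of the two maxima, is one reading of "maximum specific light production" and is an assumption of this sketch.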
Activation of ComR from S. thermophilus LMD-9 by non-ComS-related peptides
It was recently shown that strain LMD-9 is naturally transformable in the ecological niche of S. thermophilus (milk) (R. Gardan, pers. comm.). The primary sequence flexibility of ComR-activating peptides prompted us to investigate whether peptides derived from milk-based medium could activate the ComRS system in this species. As a proof of concept, the primary sequence of mature α1-casein from Bos taurus was scanned for potential ComS-like octapeptides solely based on the presence of two aromatic residues at positions 3 and 4 and a branched-chain amino acid at position 8 (consensus XX[F/W/Y][F/W/Y]XXX[I/L/V]). Two such peptides were identified – named α1CasL(42–49) (seq: LAYFYPEL) and α1CasL(62–69) (seq: GAWYYVPL), according to their position within the mature α1-casein sequence. One of these peptides [α1CasL(62–69)] was found to elicit a dose-dependent light production in a luciferase-based in vivo assay using the competence-defective ΔcomS reporter strain (Table 3). In light of these unexpected results, we further tested whether addition of α1CasL(62–69) (5 μM) to the growth medium could restore natural transformation of the ΔcomS reporter strain. No transformants were observed in supplemented M17L, THBL or CDML growth conditions. However, in milk-based medium, supplementation with 5 μM α1CasL(62–69) was found to induce natural transformation at an average frequency of 10−4 in the ΔcomS reporter strain. Moreover, the addition of 5 μM α1CasL(62–69) induced a ∼ 100-fold increase in the competence level of the WT LMD-9 strain in milk growth conditions (average frequency of 3.81 × 10−3 in the presence of peptide versus 1.9 × 10−5 in its absence).
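The consensus search described above can be sketched as a regular-expression scan. The toy sequence below is hypothetical: it merely embeds the two octapeptides named in the text and is not the real α1-casein sequence. The function name is likewise illustrative.

```python
import re

# Consensus XX[F/W/Y][F/W/Y]XXX[I/L/V]; the lookahead lets overlapping windows match.
COMS_LIKE = re.compile(r"(?=(..[FWY][FWY]...[ILV]))")


def find_coms_like(protein):
    """Return (1-based start, octapeptide) for every window matching the consensus."""
    return [(m.start() + 1, m.group(1)) for m in COMS_LIKE.finditer(protein)]


# Hypothetical toy sequence, NOT the real casein sequence.
toy = "MKKLAYFYPELQQGAWYYVPLDD"
for start, peptide in find_coms_like(toy):
    print(start, peptide)
```

Applied to the XIP sequences quoted in this article, the same pattern accepts LPYFAGCL (S. thermophilus) and rejects VPFFMIYY (S. vestibularis), whose last residue is a tyrosine.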
Altogether, these results demonstrate that non-ComS-related peptides are able to trigger activation of the ComRS system in S. thermophilus. The concentration of α1-casein in cow milk is typically between 10 and 13 g l−1 (Belloque and Ramos, 2002). This would correspond to an α1CasL(62–69) peptide concentration of about 500 μM. The observation that a 100-fold lower concentration is sufficient for ComR activation suggests that environment-derived peptides could participate in competence development in S. thermophilus.
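The order of magnitude of this estimate can be checked with a short calculation. The molar mass used below (about 23.6 kDa for the mature protein) is an assumption that is not stated in the text, as is the release of one peptide copy per casein molecule.

```python
# Assumed molar mass of mature alpha-s1-casein (g/mol); not given in the text.
CASEIN_MOLAR_MASS_G_PER_MOL = 23_600.0


def peptide_molarity_um(casein_g_per_l, molar_mass=CASEIN_MOLAR_MASS_G_PER_MOL):
    """Micromolar peptide concentration, assuming one peptide copy per casein molecule."""
    return casein_g_per_l / molar_mass * 1e6


low = peptide_molarity_um(10.0)   # lower bound of the quoted casein range
high = peptide_molarity_um(13.0)  # upper bound of the quoted casein range
print(f"{low:.0f} to {high:.0f} uM, i.e. {low / 5:.0f}- to {high / 5:.0f}-fold above 5 uM")
```

Under these assumptions the potential peptide pool falls between roughly 420 and 550 μM, consistent with the figure of about 500 μM and with an excess of about 100-fold over the 5 μM used in the assays.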
In streptococcal species, natural transformation is a tightly regulated process, which relies on secreted pheromones and signal transduction systems to coordinate comX expression. The recent identification of ComRS in the salivarius group (Fontaine et al., 2010a) challenged the paradigm of the ComCDE phosphorelay system as the master signalling system for comX regulation, and prompted researchers to reconsider the model of competence regulation in the genus Streptococcus (Mashburn-Warren et al., 2010; 2012). In this work, we provide experimental evidence for important steps of the model of the ComRS-dependent regulation of competence. We formally demonstrate that ComR is the proximal activator of comS and comX expression in salivarius, pyogenic, mutans and bovis streptococci. Importantly, we show that in all systems tested, the DNA-binding activity of ComR is strictly dependent on the presence of its cognate XIP, which corresponds to a C-terminal form of ComS. The palindromic ComR-box with the conserved inner GACA/TGTC motif, which characterizes promoters of the type I and type II core ComRS regulon, is the minimal genetic requirement to promote binding of ComR as a dimer. According to the results of the present study, the mechanism leading to the activation of comX expression by XIP is similar in type I and type II ComRS systems.
ComR proteins are novel Rgg-type members of pheromone-interacting regulators
The Rgg family of transcriptional regulators found in low GC Gram-positive bacteria was previously thought to include stand-alone regulators, i.e. regulators acting without an identified regulatory partner. To date, two Rgg family subclusters have been shown to directly bind a signalling peptide pheromone imported through the oligopeptide transporter Opp: the SHP-associated Rgg (Fleuchot et al., 2011) and XIP-associated ComR, as demonstrated in the present study. From a mechanistic point of view, the transcriptional activation by XIP-ComR systems shares more similarities with the RNPP PlcR–PapR (Declerck et al., 2007) and NprR–NprX systems (Perchat et al., 2011) from Bacillus cereus than with SHP-associated Rgg systems. First, PlcR, NprR and ComR all act as transcriptional activators. In addition, unlike all characterized SHP-associated Rgg regulators, which bind DNA in the absence of SHP (Chang et al., 2011; Fleuchot et al., 2011; LaSarre et al., 2013), the DNA-binding activity of ComR strictly depends on a direct interaction with its cognate pheromone, as also observed for PlcR and NprR. We thus propose that XIP induces a conformational rearrangement of the N-terminal HTH domains of ComR dimers, which is necessary for ComR-box binding, akin to the PlcR–PapR system (Declerck et al., 2007). Transcription initiation could then be stimulated through interactions between the DNA-bound ComR dimers and the RNA polymerase (Browning and Busby, 2004). A structural comparison between free and DNA-bound XIP-ComR will be needed to confirm this model and to study the mechanistic specificities of the XIP-ComR system compared to other pheromone-interacting regulators.
Determinants of the specificity of the ComR–ComS interaction
Cell–cell communication systems are generally species-specific or, in some instances, strain-specific (Havarstein et al., 1997; Slamti and Lereclus, 2005; Allan et al., 2007), allowing the emergence of distinct pherotypes, a driving mechanism in microorganism speciation (Carrolo et al., 2009). The low level of in vivo cross-activation of ComR from S. thermophilus by non-cognate XIP peptides (Table 3) suggests that both type I and type II ComR display at least a species-specific specialization towards their cognate ComS. In agreement, within each streptococcal group, the C-terminal domain (residues 80–280) of ComR orthologues, which is believed to interact with cognate XIP, is less conserved than the N-terminal DNA-interacting domain (residues 8–71). This suggests that, similarly to other pheromone-interacting regulators (Bouillaut et al., 2008; Perchat et al., 2011), the DNA-binding and XIP-binding domains of ComR have not been subjected to the same selective pressure, and that XIP and the C-terminal domain of ComR have co-evolved. A striking example of the co-evolution and specialization of XIP together with the C-terminal domain of ComR is found within the salivarius group: while the N-terminal DNA-binding domains of ComR from S. thermophilus LMD-9 and S. vestibularis F0396 share 92% identity and recognize identical ComR-boxes (at least in PcomX and PcomS), the conservation of the XIP peptides (LPYFAGCL and VPFFMIYY respectively) and of the C-terminal domains of ComR (80% identity) is significantly lower. This is in agreement with the low level of cross-induction observed between the two salivarius streptococci (Table 3).
For small signalling peptides, which are believed to lack an ordered structure, the target specificity is generally amino acid sequence-dependent, where mutation of a single residue generally results in a dramatic loss of pheromone activity (Perego, 1997; MDowell et al., 2001). Unexpectedly, although ComR–ComS interactions appear to be species-specific (see above), single residue mutagenesis of ComS17–24 from S. thermophilus indicates a relatively high permissiveness in its sequence with respect to biological activity. Importantly, the structural features of aromatic residues 3, 4, and of the C-terminal branched-chain residue, rather than their identity, are determinant for ComS activity. It is probable that the essential double aromatic patch, which characterizes all XIP peptides, represents the sites of common non-specific interactions between the aromatic rings and the C-terminal domain of ComR. In contrast, the structural requirement for a branched-chain residue at the last position of ComS17–24 might reflect species-specific interactions with ComR, since a C-terminal tyrosine residue is found in XIP from S. vestibularis. The importance of ComS structure versus primary sequence is further supported by the observation that XIP from S. mutans, and to a lesser extent, XIP from S. gallolyticus, partially supplement deletion of endogenous comS in S. thermophilus. Indeed, those peptides share very low levels of primary sequence similarity with ComS17–24, but all contain a double aromatic patch and a C-terminal branched-chain residue.
It is possible that the low specificity of ComRS from S. thermophilus is an evolutionary consequence of its domestication for industrial fermentation processes, where there is little chance of XIP-mediated inter-species communication. Alternatively, the lack of stringency of XIP primary sequence may have been selected precisely in order to allow cross-talk between streptococci sharing the same ecological niche. Opportunities of cross-talk between S. thermophilus and other streptococci may occur for instance in artisanal fermented milk products or in mammalian breast milk (Perez et al., 2007), particularly in case of streptococcal mastitis. Moreover, the ecological niche of S. thermophilus is not restricted to milk or milk-derived products: it has been recently identified on plants (Michaylova et al., 2007) and in the human gastrointestinal tract (Qin et al., 2010). In line with the latter hypothesis, the genome of S. thermophilus contains several hallmarks of previous interactions with multiple species, including streptococci (Delorme et al., 2010): about 15% of its genes are predicted to have been acquired by lateral gene transfer events (Eng et al., 2011).
In the present study, sequence permissiveness of XIP was demonstrated for S. thermophilus, a representative species of salivarius streptococci. The cross-activation of ComR of S. thermophilus by heterologous XIP – as observed with XIP of S. mutans for instance – may probably be extended to the closely related oral species S. salivarius. Indeed, the ComR proteins of S. thermophilus and S. salivarius share 94% identity and their XIP peptides only differ by one non-critical residue (Fontaine et al., 2010a). Whether this feature – and the consequent possibility of inter-species cross-talk – can be extended to pathogenic streptococci, remains to be investigated. A recent study indicates that ComR from S. mutans cannot be cross-activated by XIP from 4 pyogenic species (Desai et al., 2012), which might reflect a different level of species-specific specialization of the C-terminal pheromone-interacting domain of ComR proteins from different streptococcal species, akin to the N-terminal DNA-binding domain (Fig. S4). Nevertheless, structurally important features of XIP – such as the double aromatic patch – appear to be conserved among all species, prompting a more in-depth investigation of the extent of XIP-mediated cross-talk in streptococci.
In terms of ecological consequences on niche colonization, the ability of a species to respond to heterologous XIP may result in a competitive advantage through the induction of natural transformation and possible connected functions such as (i) the production of bacteriocins, in the case of S. thermophilus for instance (see Fig. 3 and our unpublished data), or (ii) the induction of stress resistance (Seaton et al., 2011) and/or the formation of biofilms (Senadheera and Cvitkovitch, 2008), in the case of S. mutans. However, such a sensitivity to XIP-mediated cross-talk may also interfere with normal XIP-signalling, with potential inhibitory effects, thereby providing a selective advantage to the XIP producers.
Do casein fragments play a role in the induction of competence in S. thermophilus?
Connected to its natural niche – milk – S. thermophilus has a highly developed nitrogen metabolism: it has a complex proteolytic system specialized for the use of casein fragments as the main nitrogen source (Rul and Monnet, 1997; Hols et al., 2005). Previous reports have shown that the proteolytic system of lactic acid bacteria, including S. thermophilus (Miclo et al., 2012), can contribute to the liberation from casein of bioactive peptides displaying antimicrobial or regulatory properties (for instance, hormonal, immunomodulatory) in mammalian organisms (for reviews, see Meisel, 1997; Meisel and Bockelmann, 1999). To our knowledge, we provide the first report indicating that milk-encrypted bioactive peptides may also have regulatory functions in microorganisms. Although the presence of α1CasL(62–69) was not formally demonstrated, it is tempting to postulate that casein fragments present in milk could be involved in the initial steps of competence induction in S. thermophilus. It could lead to the activation of the auto-amplification loop, which is necessary to transiently maintain ComS production and the competence state. In agreement, deletion of comS results in a total loss of natural transformation in the competence permissive CDML medium (Fontaine et al., 2010a), but the same mutant displays a detectable transformation frequency (average transformation frequency of 4 × 10−8) when cultivated in milk, even in the absence of the mature peptide ComS17–24, supporting the hypothesis that casein-encrypted peptides may contribute to competence development in S. thermophilus. The possibility of inducing competence with casein fragments may represent additional evidence of the adaptation of S. thermophilus to the milk niche, and suggests a co-evolution between nitrogen nutrition and competence development in this species. Most S. thermophilus strains lack the PrtS cell wall protease (Delorme et al., 2010) and take advantage of the proteolytic activity of other milk-growing, protease-positive bacteria (Courtin et al., 2002), which provide S. thermophilus with casein-derived peptides as a nitrogen source. As a consequence, the sensing of casein fragments, including fragments with competence inducing properties, indirectly reflects the level of competitors present in milk, and may be used to evaluate the benefits versus cost of inducing competence. In this context, the co-regulation of competence and bacteriocin production in salivarius streptococci, as deduced from the ComRS core regulon (Fig. 3 and our unpublished results), further reflects the microbial competition encountered in their ecological niches. This suggests the presence of a tightly coordinated gene acquisition process, involving active DNA release as described in mitis (Steinmoen et al., 2002; Guiral et al., 2005) and mutans streptococci (Kreth et al., 2005).
The present study demonstrates that the mechanism of transcriptional activation of comX expression by the ComRS system is conserved among salivarius, bovis, mutans and pyogenic streptococci. Further challenges include the identification of mechanisms regulating the initial activation of the positive feedback loop. More specifically, what are the roles played by both biotic and abiotic factors of the ecological niche in this process? Since comR expression is not auto-regulated by ComRS, ComR may for instance serve as a focal point for integration of information from other regulatory pathways. In addition, the results obtained in this work lead to the fascinating hypothesis that the relative flexibility of the ComR–ComS interaction is an evolutionary trait that has been selected to allow oligopeptides from the environment to participate in competence induction. In this context, the shared involvement of the Opp transporter in both nutritional and signalling functions makes logical sense.
Bacterial strains, plasmids and growth conditions
The bacterial strains and plasmids used in the present study are listed in Tables S1 and S2 respectively. Plasmids derived from pGICB004 and from pBADHisA were respectively constructed in Escherichia coli EC1000 and DH10B. E. coli was grown in LB medium with shaking at 37°C (Sambrook et al., 1989). S. thermophilus was grown at 37°C in M17 broth, Todd Hewitt broth (THB) (Difco Laboratories, Detroit, MI), milk or in CDM, as described by Letort and Juillard (2001). All synthetic media contained 1% (w/v) lactose (M17L, THBL and CDML broth respectively). The skimmed milk used to assess transformation of S. thermophilus LMD-9 derivative strains was a reconstituted 5% (w/v) milk (BBL Litmus milk, Becton Dickinson, Franklin Lakes, NJ). When required, ampicillin (250 μg ml−1 for E. coli), erythromycin (250 μg ml−1 for E. coli, 2.5 μg ml−1 for S. thermophilus) or chloramphenicol (20 μg ml−1 for E. coli, 5 μg ml−1 for S. thermophilus) was added to the media. Solid agar plates were prepared by adding 2% (w/v) agar to the medium. Solid plates inoculated with S. thermophilus cells were incubated anaerobically (BBL GasPak systems, Becton Dickinson, Franklin Lakes, NJ) at 37°C.
Detection of absorbance and luminescence
Small volumes (300 μl) of culture samples (OD600 of 0.05) were incubated in the wells of a sterile covered white microplate with a transparent bottom (Greiner, Alphen a/d Rijn, The Netherlands). Growth (OD600) and luciferase (Lux) activity (expressed in relative light units; RLU) were monitored at 10 min intervals for 5 h in a Varioskan Flash multi-mode reader (ThermoFisher Scientific, Zellik, Belgium) as previously described (Boutry et al., 2012). In the supplementation experiments, different concentrations of synthetic forms of ComS (NH2-COOH) or casein α1-derived peptides (purity > 95%) supplied by Peptide 2.0 (Chantilly, VA) were added to the 300 μl culture samples after 1 h and 30 min of growth at 37°C.
Natural transformation experiments
Experiments were performed as previously described (Fontaine et al., 2010a,b). The DNA, either pGIUD0855ery (1 μg), pGICB004 derivatives (1 μg), or purified overlap PCR products (25 ng), was added to 300 μl culture samples. The transformation frequency was calculated as the number of antibiotic-resistant colony-forming units (cfu) (erythromycin or chloramphenicol-resistant cfu in case of pGIUD0855ery, pGICB004 derivatives, or overlap PCR product respectively) per ml divided by the total number of viable cfu per ml. After the transformation experiments, the integration of the antibiotic resistance cassette at the right location was checked by PCR (primer pairs used are listed in Table S3).
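The frequency calculation can be sketched as follows. The plate counts, dilution factors and plated volumes are hypothetical and chosen only to illustrate the arithmetic; the helper names are not taken from the original protocol.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a plate count into cfu per ml of undiluted culture."""
    return colonies * dilution_factor / plated_volume_ml


def transformation_frequency(resistant_cfu_per_ml, total_cfu_per_ml):
    """Antibiotic-resistant cfu per ml divided by total viable cfu per ml."""
    if total_cfu_per_ml <= 0:
        raise ValueError("total viable count must be positive")
    return resistant_cfu_per_ml / total_cfu_per_ml


# Hypothetical counts: selective plate (100-fold dilution) and non-selective
# plate (10^6-fold dilution), 0.1 ml plated in both cases.
resistant = cfu_per_ml(colonies=38, dilution_factor=1e2, plated_volume_ml=0.1)
total = cfu_per_ml(colonies=100, dilution_factor=1e6, plated_volume_ml=0.1)

frequency = transformation_frequency(resistant, total)
print(f"transformation frequency: {frequency:.2e}")
```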
DNA techniques and electrotransformation
General molecular biology techniques were performed according to the instructions given by Sambrook et al. (1989). Electrotransformation of E. coli was performed as described by Dower et al. (1988). Chromosomal DNA of S. thermophilus LMD-9, Streptococcus gallolyticus UCN34 and S. pyogenes M1GAS was prepared as described by Ferain et al. (1996). S. mutans UA159 chromosomal DNA was purchased from the ATCC. PCRs were performed with Phusion high-fidelity DNA polymerase (Finnzymes, Espoo, Finland) in a GeneAmp PCR system 2400 (Applied Biosystems, Foster City, CA). The 5′ tag-RACE method described by Fouquier d'Hérouel et al. (2011) was used to determine the transcriptional start site of ComR-regulated genes. The primers used in this study were purchased from Eurogentec (Seraing, Belgium) and are listed in Tables S3–S6.
Construction of the luxAB reporter strains
The reporter strains (LF121, LF123, LF128, LF129, LF130, LF131, LF132, LF133) were constructed by replacing part of the blp locus of strain LMD-9 by a transcriptional fusion between a target promoter (respectively PcomS, PcomR, Pcds0064, Pster_1731, Pster_1817, Pster_1720, Pster_1655 and Pster_1643) and luxAB genes. The fusions are carried on pGICB004-derivative plasmids (respectively pGILF::PcomS, pGILF::PcomR, pGILF::Pcds0064, pGILF::P1731, pGILF::P1817, pGILF::P1720, pGILF::P1655, pGILF::P1643) and integrated in the chromosome by double homologous recombination events as previously described (Maguin et al., 1996; Fontaine et al., 2007). Those plasmids were constructed by cloning the target promoters, which were amplified by PCR, between the EcoRI and SpeI sites of plasmid pGICB004. Primers used are listed in Table S3.
Construction of the ComR–Strep expression vectors
Plasmids pBADcomRLMD9strep, pBADcomRUA159strep, pBADcomRM1strep and pBADcomRUCN34strep were constructed in two steps. First, the PCR products comRLMD-9-strep, comRUA159-strep, comRM1-strep and comRUCN34-strep were amplified from the chromosomes of S. thermophilus LMD-9, S. mutans UA159, S. pyogenes M1GAS and S. gallolyticus UCN34 respectively. For cloning purposes, the 5′ end of the forward primers contains a BspHI site (in the case of comRLMD-9-strep) or an NcoI site, and the 5′ end of the reverse primers contains an EcoRI site or a HindIII site (in the case of comRUCN34-strep). In addition, the reverse primers were designed to create a translational fusion between the 3′ end of the various comR open reading frames (ORFs) and a sequence encoding a glycine/alanine flexible tail and the strep-tagII affinity tag. Second, the obtained comR–strep PCR products were digested and cloned between the NcoI and EcoRI sites (or HindIII in the case of comRUCN34-strep) of plasmid pBADHisA. Primers used are listed in Table S3.
Construction of deletion strains by natural transformation
Mutant derivatives of strains LMD-9, CB001, LF121 and LF123 were constructed by exchanging the ORF of a target gene (sequence between the start and stop codons) for the chloramphenicol resistance cassette lox66-P32-cat-lox71, as previously described (Fontaine et al., 2010b). In the case of multiple-ORF deletions, the region between the start codon of the first ORF and the stop codon of the last ORF was deleted. The genetic replacement of comX with a P32-cat cassette (strain LF145) was obtained by double homologous recombination of plasmid pGIBD001, which was transferred by natural transformation, in strain CB001 (Boutry et al., 2012). Primers used are listed in Table S3.
Purification of ComR–Strep proteins
Precultures (20 ml) of strains DH10B [pBADcomRLMD9strep], DH10B [pBADcomRUA159strep], DH10B [pBADcomRM1strep] and DH10B [pBADcomRUCN34strep] were diluted to an OD600 of 0.05 in 1 l pre-warmed LB (42°C) containing ampicillin, and incubated at 42°C with continuous shaking. At an OD600 of ∼ 0.5, the culture was chilled to room temperature and protein expression was then induced by adding 0.02% of l-arabinose, as previously described (Lambin et al., 2012). After 4 h of induction at 28°C with continuous shaking, bacteria were centrifuged (5000 g for 15 min) and the pellet was washed in 100 ml cold buffer W (100 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA) and resuspended in 10 ml cold buffer W supplemented with 0.5 mg ml−1 of lysozyme. After 30 min on ice, cells were sonicated at 4°C (Bioruptor, Diagenode, Liège, Belgium) and the soluble fraction was collected after centrifugation (13 000 g for 20 min at 4°C) and filtered through a 0.45 μm membrane (Millipore). The recombinant StrepII-tagged ComR proteins were then purified on a 1 ml Strep-Tactin Superflow column (IBA BioTAGnology, Göttingen, Germany) according to the manufacturer's instructions. Glycerol was added to the different eluate fractions at a final concentration of 10% (v/v). The concentration and purity of the fractions were estimated by SDS-PAGE and on a nanodrop (protein A280) (Agilent technologies, Santa Clara, CA). The purest fraction was used in the EMSA and SPR experiments.
Identification of ComR-boxes and establishment of a ComR-box position weight matrix
The 20 bp ComR-boxes (formerly ECom-box) from PcomX, PcomS, Pster_1655, Pster_1643 and Pster_1720 (Fontaine et al., 2010a) were used to generate a position weight matrix (PWM) using the Consensus tool from the RSA-tools package (http://rsat.ulb.ac.be), with default parameters. Using this matrix, a genome-wide search for additional ComR-boxes in the genome sequence of S. thermophilus LMD-9 (up to 200 bp upstream and downstream of annotated CDS) was performed using the Genome-Scale Patser tool from the RSA-tools package with default parameters [Ln(cut-off P-value) of −16.562, automatically calculated based on sample size and an arbitrary alignment of random sequences].
The sequences of the 8 ComR-boxes from S. thermophilus LMD-9 (5 previously identified, 3 identified in this work) were used to define a new PWM. First, a position frequency matrix was defined for the inverted repeats (9 bp), based on the assumption that each inverted repeat represents the binding site for one ComR monomer. Therefore, the approach assumed perfect symmetry between the two inverted repeats, and the 8 ComR-boxes represent 16 ComR binding sites. Second, corrected probabilities were calculated using the following formula (Wasserman and Sandelin, 2004): , where fb,i represents the count of base b at position i, N represents the number of sequences (sample size), p(b,i) is the corrected probability of base b at position i, and ψ is the pseudocount. A pseudocount value of 0.8 was used in all cases, independently of the sample size (Nishida et al., 2009). Third, corrected probabilities were converted into weights, using the following formula (Wasserman and Sandelin, 2004): $W_{b,i} = \log_2\left(p(b,i)/p(b)\right)$, where Wb,i represents the weight of base b at position i, p(b,i) is the corrected probability of base b at position i, and p(b) represents the background probability of base b. The background probabilities for the different nucleotides were calculated from the genome sequence of S. thermophilus LMD-9 [p(A) = p(T) = 0.3046 and p(C) = p(G) = 0.1954].
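As a point of reference, the pseudocount correction of Wasserman and Sandelin (2004) is commonly written in the general form below, where s(b) is a pseudocount function. Taking s(b) = ψ/4 for every base, so that the four pseudocounts sum to ψ, is an assumption made here for illustration only; how ψ was distributed over the bases is not specified in this section.

```latex
p(b,i) = \frac{f_{b,i} + s(b)}{N + \sum_{b' \in \{A,C,G,T\}} s(b')}
\qquad\xrightarrow{\; s(b) = \psi/4 \;}\qquad
p(b,i) = \frac{f_{b,i} + \psi/4}{N + \psi}
```

Under that assumption, with N = 16 half-sites and ψ = 0.8, a base observed at all 16 sites would have p(b,i) = 16.2/16.8 ≈ 0.964, and a base never observed would have p(b,i) = 0.2/16.8 ≈ 0.012.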
A similar approach was used for the two nucleotides located between the inverted repeats (positions 10 and 11 of the ComR-box). However, these were assumed not to be part of the ComR binding site. Therefore, no symmetry constraint was imposed, and the corresponding PWM was only calculated from a set of 8 sequences.
The different PWMs were then combined into a 20 bp ComR-box PWM, as follows: positions 1–9 were based on the IR PWM, positions 10 and 11 were based on the PWM for the two central nucleotides, and positions 12–20 were based on the reverse complement of the IR PWM. This position weight matrix was used to draw the consensus ComR-box logo, using the WebLogo tool (http://demo.tinyray.com/weblogo). All matrices (position weight matrices, position frequency matrices, corrected probability matrices) are provided as Dataset S1.
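The assembly of the 20 bp matrix can be sketched as follows. The half-site and centre sequences are hypothetical, and the equal split of the pseudocount over the four bases is an assumption; only the background probabilities and the pseudocount value of 0.8 are taken from the text.

```python
import math

BACKGROUND = {"A": 0.3046, "C": 0.1954, "G": 0.1954, "T": 0.3046}  # from the text
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PSEUDOCOUNT = 0.8  # value from the text; the equal split over bases is assumed


def weight_matrix(sites):
    """Return one {base: log2 weight} dict per position for equal-length sites."""
    n = len(sites)
    matrix = []
    for column in zip(*sites):
        weights = {}
        for base in "ACGT":
            p = (column.count(base) + PSEUDOCOUNT / 4) / (n + PSEUDOCOUNT)
            weights[base] = math.log2(p / BACKGROUND[base])
        matrix.append(weights)
    return matrix


def reverse_complement(matrix):
    """Reverse the position order and swap each base with its complement."""
    return [{COMPLEMENT[b]: w for b, w in col.items()} for col in reversed(matrix)]


def combine(half_site_pwm, central_pwm):
    """Positions 1-9: IR PWM; 10-11: central PWM; 12-20: reverse complement."""
    return half_site_pwm + central_pwm + reverse_complement(half_site_pwm)


# Hypothetical 9 bp half-sites and 2 bp centres (not the real ComR-boxes).
half_sites = ["AAAGACATT", "TAAGACATT", "AAAGACAAT", "AAGGACATT"]
centres = ["AT", "TA"]

pwm = combine(weight_matrix(half_sites), weight_matrix(centres))
print(len(pwm), round(pwm[3]["G"], 3))
```

By construction, the weight of a base at position i of the left half equals the weight of its complement at position 21 − i, which is the perfect-symmetry assumption stated above.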
Prediction of ComR-box binding affinities
The binding affinity of ComR for the eight different ComR-boxes was calculated based on the approach used by Hardiman et al. (2010), which infers binding constants for a regulator on a DNA sequence based on nucleotide sequence conservation. This approach is based on the assumptions that (i) binding of the regulator (ComR) to the DNA (ComR-box) can be represented by independent binding reactions of the regulator to the individual nucleotides of the DNA sequence (additivity rule; McClure, 1985), and (ii) binding constants are proportional to the nucleotide frequencies in the set of DNA-binding sites (Stormo, 1990).
A specificity matrix (SpM) was derived from the ComR-box PWM, using the following formula: , where ab,i represents the specificity coefficient for base b at position i. Based on the assumption that ComR only binds to the inverted repeats, a coefficient of 1 was attributed to all bases at positions 10 and 11 (corresponding to the two central nucleotides of the ComR-box), so that these positions are not taken into account in the calculation of binding affinities. Note that the calculation of specificity coefficients (ab,i) is slightly different from the method proposed by Hardiman et al. (2010), in which nucleotide frequencies (fb,i) are used directly for the calculation of ab,i, whereas the proposed method is based on corrected probabilities (p(b,i)). While in the present case both methods gave essentially the same results (data not shown), the method of Hardiman can be misleading, especially in the case of small sample sizes (i.e. few binding sites).
For each ComR-box sequence, a specificity score was then calculated as the product of the corresponding specificity coefficients at each position of the sequence (Hardiman et al., 2010): $S_k = \prod_{i=1}^{20} a_{b_k(i),i}$, where k represents a ComR-box sequence, $b_k(i)$ is the base found at position i of that sequence and $S_k$ is its specificity score. All scores were then calculated relative to the reference ComR-box sequence of PcomS. The ComR-box sequences, specificity matrix and calculated specificity scores are provided as Dataset S1.
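The scoring step can be sketched as follows, taking the specificity matrix as a given input. The matrix values and both 20 bp sequences are hypothetical; only the product rule, the neutral coefficient at positions 10 and 11, and the normalization to a reference box come from the text.

```python
import math

NEUTRAL_POSITIONS = {10, 11}  # central nucleotides, coefficient fixed to 1


def specificity_score(box, matrix):
    """Product of the specificity coefficients of `box` over all positions."""
    coefficients = [
        1.0 if position in NEUTRAL_POSITIONS else matrix[position - 1][base]
        for position, base in enumerate(box, start=1)
    ]
    return math.prod(coefficients)


def relative_score(box, reference_box, matrix):
    """Score of `box` expressed relative to the reference ComR-box."""
    return specificity_score(box, matrix) / specificity_score(reference_box, matrix)


# Hypothetical inputs: coefficient 1.0 for the reference base, 0.5 otherwise.
reference = "AAAGACATTATAATGTCTTT"
matrix = [{b: (1.0 if b == ref else 0.5) for b in "ACGT"} for ref in reference]
variant = "CAAGACATTGCAATGTCTTA"  # differs at positions 1, 10, 11 and 20

print(relative_score(variant, reference, matrix))
```

In this toy case the two central mismatches are ignored, so only the mismatches at positions 1 and 20 lower the relative score.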
All double-stranded DNA fragments (labelled or not) used in the EMSA experiments (approximately 200 bp) were amplified by PCR, except probes boxPcomS, Cy3-boxPcomS, WT Cy3-boxPster_1655 and its derivatives, which were obtained by annealing of single-stranded oligonucleotides. In the case of labelled probes, the 5′ end of the forward primers used (reverse primer in case of probe Cy3-PcomX from S. thermophilus LMD-9) was coupled to the Alexa 555 fluorophore. In case of probe Cy3/5-PcomS, the 5′ end of forward and reverse primers are respectively coupled to Alexa 555 and Alexa 637 fluorophores. Primers used are listed in Table S5. Typically, a gel shift reaction (20 μl) was performed in a binding buffer (20 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA, 1 mM DTT, 10% glycerol, 1 mg ml−1 BSA) and contained 150 ng labelled probe and 4 μM ComR–Strep proteins. When necessary, 8 μM of ComS peptides (unless otherwise stated) are added. The reaction is incubated at 37°C for 10 min prior to loading of the samples on a native TBE 5% gel. The gel is next subjected to 80 V for approximately 1 h in TBE buffer. DNA complexes were detected by fluorescence on the Ettan DIGE Imager with bandpass excitation filters (nm): 540/25 (Cy3) or 635/30 (Cy5) and bandpass emission filters: 595/25 (Cy3) or 680/30 (Cy5) (GE Healthcare, Waukesha, WI).
Surface plasmon resonance experiments
Real-time interactions between ComRLMD-9-Strep and the comS promoter region were analysed using research-grade CM5 sensor chips on a BIAcore 2000 instrument (BIAcore AB, Uppsala, Sweden). Streptavidin was injected onto the CM5 sensor chips as previously described (Engohang-Ndong et al., 2004). The 125 bp biotinylated DNA fragment overlapping the comR–comS intergenic region (Biotine-PcomS) was amplified by PCR using the LMD-9 chromosome as template, purified and immobilized onto the CM5 sensor chip. The biotinylated DNA fragment (4 ng μl−1) was injected in one channel of the chip to obtain a stable fixation of 210 resonance units (RU) to immobilized streptavidin. Another channel of the chip was loaded with an equivalent RU amount of an unrelated biotinylated double-stranded 113-bp-long DNA fragment (Biotine-bla) corresponding to a bp14 to bp127 fragment of the pUC18 bla gene (primers used are listed in Table S6). ComRLMD-9-Strep and ComS17–24 were separately diluted to a concentration of 10 μM in W buffer containing 10% glycerol (v/v). ComRLMD-9-Strep and ComS17–24 were injected in the BIAcore apparatus, either separately or together. In the latter case, ComRLMD-9-Strep was incubated in the presence of ComS17–24 for 5 min at room temperature prior to injection. The proteins were injected over the immobilized DNA for 3 min at 25°C in W buffer at a flow rate of 20 μl min−1. Between injections, the chip surface was regenerated by injecting a solution of 10 mM Tris-HCl, 500 mM NaCl, 0.005% SDS, 1 mM EDTA. The specific SPR signal or specific interaction (SI) signal was calculated as the difference between the signal from the channel containing the PcomS probe and the signal from the channel containing the unrelated bla probe. All values were then normalized to the signal before injection. To establish dose-response curves, increasing concentrations of a ComRLMD-9-Strep::ComS17–24 mixture (molar ratio of 2.0) were injected.
SI values were measured 3 min after the end of the injection period and used to calculate the dose-response curves. The stoichiometry between ComRLMD-9-Strep::ComS17–24 and immobilized DNA was determined using the equation (Speck et al., 1999; Engohang-Ndong et al., 2004): in which n is the stoichiometry of the complex, RU is the measured response (units) obtained at binding saturation and Mr is the molecular weight; α = 1 pg of protein bound RU−1 mm−2 and β = 0.73 pg of protein bound RU−1 mm−2 (Speck et al., 1999).
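The SPR calculations described above can be sketched in Python under explicit assumptions. The stoichiometry equation itself is not reproduced in the text, so the sketch uses a generic molar-ratio reading: each response is converted to a surface mass with its conversion factor, divided by the molecular weight, and the protein value is divided by the DNA value. It further assumes that α applies to the bound protein complex and β to the immobilized DNA, and that normalization to the signal before injection means subtracting a pre-injection baseline from each channel. Function and parameter names are hypothetical, and this may differ from the exact form used by Speck et al. (1999).

```python
def specific_interaction(target_ru, reference_ru,
                         target_baseline=0.0, reference_baseline=0.0):
    """Specific interaction (SI) signal in resonance units.

    SI is the response of the PcomS channel minus the response of the
    unrelated bla channel. ASSUMPTION: each channel is first corrected
    by subtracting its own pre-injection baseline.
    """
    return (target_ru - target_baseline) - (reference_ru - reference_baseline)


def stoichiometry(ru_protein, ru_dna, mr_protein, mr_dna,
                  alpha=1.0, beta=0.73):
    """Molar ratio n of bound protein complex to immobilized DNA.

    ASSUMPTION: alpha (pg RU^-1 mm^-2) converts the protein response to
    mass and beta converts the DNA response to mass; dividing each mass
    by its molecular weight gives relative molar amounts.
    """
    if ru_dna <= 0 or mr_protein <= 0 or mr_dna <= 0:
        raise ValueError("ru_dna, mr_protein and mr_dna must be positive")
    protein_amount = ru_protein * alpha / mr_protein  # relative moles of protein
    dna_amount = ru_dna * beta / mr_dna               # relative moles of DNA
    return protein_amount / dna_amount
```

In use, `ru_protein` would be the SI value at binding saturation and `ru_dna` the response obtained on immobilizing the biotinylated fragment. The molecular weights have to be supplied by the user, since they are not stated in this section.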
We warmly thank D. Dandoy and E. Nicolas for fruitful discussions and advice regarding EMSA experiments. We thank M. Deghorain and P. Glaser for providing S. pyogenes M1GAS and S. bovis UCN34 chromosomal DNA respectively. We gratefully thank V. Monnet for discussions and for critically reading the manuscript. This research has been funded by the Interuniversity Attraction Poles Programme initiated by the Belgian Science Policy Office. L.F. is a postdoctoral researcher at FNRS. P.H. is a research associate at FNRS. The authors have no conflict of interest to declare.