Microarray analysis of the Bacillus subtilis K-state: genome-wide expression changes dependent on ComK



In Bacillus subtilis, the competence transcription factor ComK activates its own transcription as well as the transcription of genes that encode DNA transport proteins. ComK is expressed in about 10% of the cells in a culture grown to competence. Using DNA microarrays representing 95% of the protein-coding open reading frames in B. subtilis, we compared the expression profiles of wild-type and comK strains, as well as of a mecA mutant (which produces active ComK in all the cells of the population) and a comK mecA double mutant. In these comparisons, we identified at least 165 genes that are upregulated by ComK and relatively few that are downregulated. The use of reporter fusions has confirmed these results for several genes. Many of the ComK-regulated genes are organized in clusters or operons, and 23 of these clusters are preceded by apparent ComK-box promoter motifs. In addition to those required for DNA uptake, other genes that are upregulated in the presence of ComK are probably involved in DNA repair and in the uptake and utilization of nutritional sources. From this and previous work, we conclude that the ComK regulon defines a growth-arrested state, distinct from sporulation, of which competence for genetic transformation is but one notable feature. We suggest that this is a unique adaptation to stress and that it be termed the ‘K-state’.


Competence is defined as a physiological state in which exogenous DNA can be internalized, leading to a genetic transformation event. In Bacillus subtilis, only 5–10% of the cells in a given population differentiate to this state, and ComK and MecA were identified as key regulators of the competence response (Dubnau and Roggiani, 1990; Kong et al., 1993; van Sinderen et al., 1995). As cells grow exponentially, MecA binds to ComK, the competence transcription factor, targeting it for degradation by the ClpC/ClpP proteasome (Turgay et al., 1998). When the competence quorum-sensing mechanism is activated late in exponential growth, the small protein, ComS, is synthesized (D’Souza et al., 1994; Hamoen et al., 1995; Solomon et al., 1995). ComS binds to MecA, releasing ComK and thereby preventing its degradation. ComK activates a number of promoters, including its own, thereby establishing a positive autoregulatory circuit (van Sinderen and Venema, 1994). This results in a switch-like phenotype, in which the comK gene is expressed explosively in a few cells. Among the known ComK targets are several loci encoding proteins needed for the uptake of transforming DNA, as well as recA (Hamoen et al., 1998). As expected, the mutational inactivation of mecA results in overproduction of ComK, as the latter is no longer targeted for degradation and is free to act positively on its own promoter. In mecA mutants, comK is transcribed in all the cells (Hahn et al., 1995).

The C-terminal domain of MecA binds to ClpC, whereas the N-terminal domain binds to ComK and ComS (Persuh et al., 1999). In this way, MecA targets the regulatory protein ComK for degradation by a proteolytic machine that is also an important part of the heat shock response of B. subtilis, presumably degrading misfolded and aggregated proteins (Krüger et al., 2000). MecA therefore acts as an adapter, harnessing a general proteolytic mechanism for a specific regulatory response. ComK synthesis occurs in stationary phase cells, partly because the quorum-sensing system activates this synthesis in response to high population density. ComK-expressing cells are arrested with respect to growth and cell division and, upon dilution into fresh medium, retain the appearance of stationary phase cells for several hours after the majority population of non-competent cells has resumed growth (Haijema et al., 2001).

Genome-wide expression profiling with DNA microarrays has rapidly emerged as a powerful method for studying differential gene expression. The value of this method is greatly enhanced when applied to organisms for which the complete genomic DNA sequence is available. Among the many bacterial species whose genomes have been published, DNA microarrays have been deployed for genome-wide expression studies in several, including B. subtilis (Antelmann et al., 2000; Fawcett et al., 2000; Ye et al., 2000; Helmann et al., 2001; Kobayashi et al., 2001; Lee et al., 2001; Ogura et al., 2001), Escherichia coli (Richmond et al., 1999; Barbosa and Levy, 2000; Oh and Liao, 2000a,b; Wei et al., 2001), Helicobacter pylori (Sillen et al., 1999; Salama et al., 2000), Streptococcus pneumoniae (Peterson et al., 2000; de Saizieu et al., 2000) and Mycobacterium tuberculosis (Wilson et al., 1999; Talaat et al., 2000).

We have used a DNA microarray to determine the set of genes that is up- or downregulated in a ComK-dependent manner. It is arguable that more is known concerning the regulatory networks of B. subtilis than of any other bacterial species, particularly those networks that affect post-exponential gene expression. This system is therefore an attractive subject for exploration using modern expression profiling methods, as internal controls abound, and much information is available to facilitate the biological interpretation of array-derived results.

Unexpectedly, we have found that the expression of at least 165 genes is upregulated in the presence of ComK. These include open reading frames (ORFs) that were previously shown to be ComK dependent as well as many for which there was no prior evidence of ComK control. In several cases, validation of the microarray data was achieved through the use of promoter fusions. This profound alteration in the expression programme involves many genes that appear to have no role in transformation. We propose that competence as usually defined is but one feature of a differentiated, growth-arrested state, which we propose to call the K-state.


Bacillus subtilis genome microarrays and experimental design

To explore the global expression changes orchestrated by ComK, we first amplified individual protein-encoding ORFs by polymerase chain reaction (PCR), using a commercially available primer set. We were successful with ≈ 95% of the B. subtilis ORFs. The PCR products were spotted onto poly L-lysine-coated glass slides. These microarrays were subsequently queried with fluorescent probes, labelled with Cy3 or Cy5, comprising first-strand cDNA made from randomly primed total cellular RNA (Eisen and Brown, 1999). Using such probes, derived from stationary phase cultures grown in rich media, we typically detected hybridization signals (greater than twice background in at least one channel) for about 40–60% of the genes that were spotted onto the microarrays.

The purpose of our experiments was to describe the set of genes that is up- or downregulated in the presence of ComK. To this end, one comparison was between RNA from wild-type and comK mutant strains grown to competence (type 1 experiment). As ComK is synthesized in only about 10% of the cells in a competent culture, comparison of wild-type and comK mutant cultures would fail to detect repressed genes. In addition, if the transcription of a gene were merely augmented in the presence of ComK, but were not completely dependent on this factor, the comparison would be intrinsically insensitive, as a large transcriptional background would be present in the non-competent cells. When mecA is inactivated, ComK is produced in essentially all the cells (Hahn et al., 1995). In this study, we have therefore also compared mecA with mecA comK cultures (type 2 experiment), thus obviating difficulties resulting from inhomogeneity in the competent cultures. However, it should be kept in mind that the type 2 comparison might identify genes that are regulated by ComK only in a mecA background.
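
The insensitivity of the type 1 comparison can be illustrated with a simple mixing calculation. The sketch below is an illustration only and is not part of the analysis reported here. It assumes that every cell contributes equally to the RNA pool, that a fraction of the wild-type cells expresses ComK, and that a gene with a ComK-independent basal level of 1 is expressed at `true_fold` times that level in the ComK-expressing cells. The function names are hypothetical.

```python
def observed_ratio(true_fold: float, fraction: float = 0.10) -> float:
    """Population-level wild type / comK ratio under the mixing assumption.

    true_fold: expression in ComK-expressing cells relative to the basal level.
    fraction:  fraction of cells that express ComK (about 0.10 in wild type,
               close to 1.0 in a mecA background).
    """
    return fraction * true_fold + (1.0 - fraction)


def min_true_fold(cutoff: float = 3.0, fraction: float = 0.10) -> float:
    """Smallest per-cell fold change that reaches the population-level cutoff."""
    return (cutoff - (1.0 - fraction)) / fraction


if __name__ == "__main__":
    for fold in (0.0, 3.0, 10.0, 30.0):
        print(f"per-cell fold {fold:5.1f} -> population ratio {observed_ratio(fold):.2f}")
    print(f"per-cell fold needed for a ratio of 3: {min_true_fold():.1f}")
```

Under these assumptions, a gene with a ComK-independent basal level would need to be induced about 21-fold in the competent cells to reach a population ratio of 3, and even a completely repressed gene could not fall below a ratio of 0.9. Genes with no ComK-independent expression are not described by this model.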

Gene expression profiles were compared in two-colour microarray experiments (Eisen and Brown, 1999) as described above. The relative representation of a transcript between the two RNA pools was assayed by measuring the fluorescence intensities of the two dyes at a given target on the array. For each type of comparison, two independent sets of RNA samples were used (sample pairs A and B and sample pairs C and D for type 1 and 2 comparisons respectively). A total of 9–18 replications were carried out with each sample for a given comparison.

Figure 1 shows typical scatter plots for the type 1 and type 2 comparisons (Fig. 1B and C), as well as for a control experiment in which two different mecA RNA samples were compared (Fig. 1A). Although the Cy3 and Cy5 signal intensities are grouped relatively closely around the line of best fit in the control graph, more scatter is evident in the type 1 and type 2 graphs. To analyse the microarray data further, we have used two approaches, as described in Experimental procedures. The first approach defined genes as regulated by ComK if they were associated with normalized Cy5 to Cy3 fluorescence intensity ratios of ≥3 or ≤0.33. In the second approach, we used significance analysis of microarrays (SAM) (Tusher et al., 2001). A summary of the results with these two methods is presented in Table 1.
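
The ratio criterion can be written as a small classification rule. The sketch below is a minimal illustration. It assumes that the ratio has already been normalized and oriented so that values above 1 mean higher expression in the presence of ComK, as in the tables that follow. The function and constant names are hypothetical.

```python
UP_CUTOFF = 3.0     # ratio at or above this value: called upregulated
DOWN_CUTOFF = 0.33  # ratio at or below this value: called downregulated


def classify(ratio: float) -> str:
    """Classify a normalized expression ratio (with ComK / without ComK)."""
    if ratio >= UP_CUTOFF:
        return "up"
    if ratio <= DOWN_CUTOFF:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Ratios taken from the text and tables: comER (type 1 average),
    # ywzA (type 2 average) and ywfI with RNA pair D (type 2).
    for gene, ratio in (("comER", 22.5), ("ywzA", 0.2), ("ywfI", 2.7)):
        print(gene, ratio, classify(ratio))
```

A gene such as ywfI, with a ratio of 2.7 in one RNA pair, falls just inside the cut-off and is not called, which illustrates why lists based on a fixed threshold are conservative.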

Figure 1.

Scatter diagrams for representative data from mecA versus mecA (A), wild type versus comK (B) and mecA versus mecA comK (C) comparisons.

B and C. The wild-type and mecA cDNAs, respectively, were labelled with Cy3. The lines of best fit are shown, together with lines that denote the three-fold limits.

Table 1. Summary of genes identified as ComK regulated by comparison of ratios and by SAM.
Type 1 comparisons (wild type versus comK): RNA A | RNA B | Common in RNA A and B
Type 2 comparisons (mecA versus mecA comK): RNA C | RNA D | Common in RNA C and D

a. The + and – symbols refer to genes that are up- and downregulated, respectively, in the presence of ComK.

(ratio and SAM)

Expression profile changes affected by ComK (type 1 comparison)

Using the ratio method with the type 1 comparison, 92 and 140 genes were called as upregulated in the presence of ComK with RNA pairs A and B respectively. Of these genes, 66 were called in common using both RNA pairs (Tables 1 and 2). Using SAM with the type 1 comparison, 97 and 161 upregulated genes were called with RNA pairs A and B. Of these, 55 were called in common using the two RNA pairs, and 49 of these genes were called in common by both the ratio and SAM approaches and with both RNA sample pairs (Table 1). These are regarded as the most likely candidates for positive regulation (direct or indirect) by ComK detected by the type 1 comparisons, and are listed in bold in Table 2.
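
The way the calls are combined amounts to a series of set intersections. The sketch below illustrates this with small toy gene sets. The membership of each set is invented for the example and does not reproduce the actual calls, and the function name is hypothetical.

```python
from typing import Set, Tuple


def consensus_calls(
    ratio_a: Set[str], ratio_b: Set[str], sam_a: Set[str], sam_b: Set[str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return genes common to both RNA pairs by ratio, by SAM, and by both methods."""
    ratio_common = ratio_a & ratio_b      # called by the ratio method with both pairs
    sam_common = sam_a & sam_b            # called by SAM with both pairs
    both = ratio_common & sam_common      # most stringent list
    return ratio_common, sam_common, both


if __name__ == "__main__":
    # Toy input: set membership is hypothetical, for illustration only.
    ratio_a = {"comGA", "nucA", "pta", "atpB"}
    ratio_b = {"comGA", "nucA", "pta", "yhbB"}
    sam_a = {"comGA", "nucA", "ssb"}
    sam_b = {"comGA", "nucA", "pta"}
    ratio_common, sam_common, both = consensus_calls(ratio_a, ratio_b, sam_a, sam_b)
    print(sorted(ratio_common), sorted(sam_common), sorted(both))
```

Because each intersection can only shrink a list, the genes that survive all three steps form the smallest and most stringent set.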

Table 2. Putative ComK-upregulated genes identified in type 1 comparisons.
a. This column lists the genes identified as upregulated in the type 1 comparisons, with ratios ≥ 3.0, using both A and B RNA pairs. Among these genes, those also identified by SAM as upregulated are indicated in bold.

b. This column presents average ratios for the genes listed in the first column. The averages include the 18 repeats carried out with RNA pair A and the 12 repeats with RNA pair B.

comER 22.5 comEC 11.7 rpsF 7.7 gidB 4.7
nucA 22.4 comC 11.6 hisZ 7.5 yqeN 4.6
pta 19.7 ywfL 11.3 yviA 7.3 rpsR 4.6
comGA 19.2 maf 11 ywpH 7.3 mreB 4.3
comGB 18.4 smf 10.9 spoIIIJ 6.7 jag 4.3
comGD 17.4 yckB 10.6 comEB 6.7 yndK 4.2
yyaF 16.2 radC 10.3 yndJ 6.4 mreC 4.2
comEA 15.8 hxlA 10 mtlR 6.2 ywpJ 4.1
nin 15.3 rnpA 9.6 yxaG 6.2 mreD 4.1
ywfK 15.1 comFB 9.6 ssb 6 yvrO 3.9
comGC 14.2 hisG 9.4 ycbO 5.9 bdbD 3.9
comGE 14.1 hxlB 9 oxdC 5.9 hxlR 3.7
yckC 13.8 comFC 9 ywhA 5.3 yvrN 3.6
comK 13.1 ywfM 8.5 tuaG 5.2 atpB 3.5
comGF 13.1 comFA 8.3 cwlJ 5.1 yhbB 3.3
comGG 13 yckA 7.9 ywfI 4.9
yckD 12.9 ycbM 7.7 yurH 4.8

Only five and two genes were called as downregulated from RNA sample pairs A and B using the ratio method, and only nine genes were called by SAM with each RNA pair. None of the seven genes called by the ratio method or the 18 genes called by SAM as negatively regulated were called in common by both RNAs or by both the ratio criterion and SAM. As we do not expect to detect downregulated genes from a type 1 comparison, and as no genes were called in common in these various comparisons, we regard these calls as false positives.

Expression profile changes affected by ComK (type 2 comparison)

With the type 2 comparisons, the ratio method called 234 and 176 positively regulated genes with RNA pairs C and D. Of these, 165 were called in common with the two RNA pairs (Table 1). These genes are listed in Table 3. SAM called similar numbers of positively regulated genes with both RNA pairs and, of these, 126 were called in common with RNA pairs C and D (Table 1). Table 3 lists the genes called in common as positively regulated by ComK with both RNA pairs using the ratio method. The genes also called by SAM are indicated in bold. Those genes called by both type 1 and type 2 comparisons and by both SAM and the ratio method are underlined in Table 3. Of the genes identified by the type 1 comparison, all but three (atpB, yurH and ywfI) were also identified by the type 2 comparison. In fact, ywfI was called by both ratio and SAM using RNA pair C. With RNA pair D, the ratio was 2.7, just below the cut-off. In the type 2 comparison, it is possible to detect genes that are downregulated in the presence of ComK. Nine genes are called using both RNA pairs by the ratio method (Table 4) and 30 using SAM. Of these, eight were called in common by both criteria.

Table 3. Putative ComK-upregulated genes identified in type 2 comparisons.
a. This column lists the genes identified as upregulated (ratios ≥ 3.0) in the type 2 comparisons using both the C and D RNA pairs. Among these, the genes identified by SAM as upregulated are indicated in bold. The genes also identified as upregulated in the type 1 comparisons (Table 2) by both the ratio method and SAM are underlined.

b. This column presents average ratios for the genes listed in the first column. The averages include the 16 repeats carried out with RNA pair C and the nine repeats with RNA pair D.

comGA 37.4 mtlR 13.6 tuaG 8.4 mreC 5.4 argJ 4.1
comGD 29.7 yckD 13.2 ywpE 8.4 cwlH 5.4 ywfG 4
comGE 28.9 yckC 13 yfhB 8.4 ywiB 5.4 usd 4
pta 28.6 rnpA 12.8 rapH 8.3 ycbL 5.4 ykuK 4
nucA 28.1 comZ 12.7 ycbO 8.2 yhcI 5.2 yqeF 4
nin 26.9 hisZ 12.6 ywfM 7.8 mreD 5.2 exoA 3.9
comGB 26.1 yvyG 11.9 yddR 7.5 gidB 5.2 yocI 3.9
comGF 25.7 yviE 11.9 ywfF 7.5 yvrL 5.2 trpB 3.9
yvaY 24.2 xpt 11.8 aroC 7.4 dinB 5.1 ybdL 3.9
comER 23.8 yqeN 11.6 sacX 7.4 ycbP 5.1 ybxI 3.9
comFB 23.7 hisG 11.4 yhcE 7.3 jag 5.1 yoxD 3.8
smf 22.5 pbuX 11.3 yxaG 7.1 cwlJ 5 ywfB 3.8
comGC 21.9 yckA 11.3 yqzG 7 ywoH 5 ywhH 3.7
comEA 21.9 rpsR 11.3 ywtF 6.9 recA 5 yvrP 3.7
maf 21.2 yckB 11.2 yvrO 6.8 ywoG 4.9 dnaG 3.7
yyaF 20.8 comEB 11.1 degS 6.8 yyaC 4.9 yyaH 3.6
radC 20.4 yhzC 11.1 yjzB 6.6 mreB 4.8 sboA 3.6
flgM 18.2 ssb 11 yviF 6.5 ywtE 4.8 yndG 3.6
comGG 17.7 rpsF 10.9 yqiX 6.5 yvtA 4.8 cypC 3.6
comFC 17.6 med 10.7 spoIIIJ 6.5 sucD 4.8 yfhC 3.5
comC 17.3 yndK 10.6 yvrN 6.4 ydeC 4.7 groES 3.5
ywfK 17 yndH 10.4 ywpD 6.3 bdbC 4.6 minC 3.4
argC 16.6 yjbF 10.4 sigA 6.3 topA 4.6 ykuJ 3.4
comEC 16.3 ycbM 10.2 ywfE 6.1 yyaA 4.4 yqgP 3.4
ywpJ 15.4 flgL 10.1 flgK 6 csrA 4.4 ydeE 3.4
yndJ 15.3 yvyE 10 bdbD 6 yvaX 4.3 phrF 3.4
yviA 15.3 mbl 10 degU 6 ycgS 4.3 yjaX 3.4
yqzE 14.9 ywpH 10 malL 5.9 tuaF 4.3 yoml 3.3
comFA 14.6 yvhJ 9.4 oxdC 5.7 argD 4.3 yqfL 3.3
hxlA 14.6 ycbN 9.3 ywhA 5.7 yddQ 4.3 hipO 3.3
comK 14.3 hxlB 9.2 ywfC 5.6 yvyF 4.3 minD 3.3
spoIIID 13.6 hxlR 9.1 yvrM 5.5 argB 4.1 yqxD 3.2
ywfL 13.6 ybdK 9 yhbB 5.5 yxaF 4.1 yknW 3.1

Table 4. Putative ComK-downregulated genes identified in type 2 comparisons.
a. The genes listed were identified as downregulated (ratio ≤ 0.33) in the type 2 comparisons with both the C and D RNA sample pairs. All but ywzA are in bold because they were also identified using SAM.

b. These are grand average ratios computed from the 16 values from RNA pair C plus the nine values from RNA pair D.

ywzA 0.2
nrgA 0.24
spo0A 0.25
csbA 0.26
grpE 0.26
mcpA 0.29
ureB 0.29
yjfB 0.3
yqfU 0.32

For reasons to be discussed below, we regard the gene lists in Tables 2–4 as conservative. Nevertheless, we will largely restrict our discussion from this point to the genes listed in these tables. More complete data may be viewed at the website listed in Experimental procedures.

Genomics of ComK-regulated genes

Of the 66 genes listed in Table 2 that fall within the 3.0-fold cut-off with both the A and B RNA pairs, 48 are located in 14 clusters on the B. subtilis chromosome. Using this cut-off therefore identifies at least 32 ComK-dependent transcription units (66 − 48 + 14 = 32). The type 2 comparison identified 128 clustered genes in 39 clusters, corresponding to a total of 76 transcription units. Many of these ComK-regulated clusters undoubtedly represent operons, and several are shown in Fig. 2. In nearly every instance, the microarray data suggested differential gene expression for all the ORFs within a given cluster (e.g. comFA–yviE and comGA–yqzE; Fig. 2A and D). The ComK-dependent transcription of genes downstream from the comF operon (Fig. 2A) is consistent with the finding that readthrough transcription into the antisigma factor gene flgM plays a role in the downregulation of motility in competent cells (Liu and Zuber, 1998). The upregulation of yqzE downstream from the comG operon (Fig. 2D) and of yqeN downstream from the comE operon (Tables 2 and 3) may also represent readthrough from known competence promoters. Neither of these downstream genes is required for competence (Albano et al., 1989; Hahn et al., 1993).
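
The transcription unit counts follow from treating each cluster as one unit and each unclustered gene as its own unit. The short sketch below reproduces that arithmetic for both comparisons. The function name is hypothetical, and the count is a lower bound if any cluster contains more than one operon.

```python
def transcription_units(total_genes: int, clustered_genes: int, clusters: int) -> int:
    """Unclustered genes count once each; every cluster counts as a single unit."""
    unclustered = total_genes - clustered_genes
    return unclustered + clusters


if __name__ == "__main__":
    print("type 1:", transcription_units(66, 48, 14))    # 18 single genes + 14 clusters
    print("type 2:", transcription_units(165, 128, 39))  # 37 single genes + 39 clusters
```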

Figure 2.

Examples of putative ComK-regulated operons and gene clusters. The boxes denote genes whose expression patterns from microarray analyses suggested co-expression driven by ComK. spo0J and soj are not listed in the tables but are apparently upregulated by ComK, as discussed in the text. Arrows denote direction of transcription; the solid rectangles indicate transcription terminators as annotated on the SubtiList website (http://genolist.pasteur.fr/SubtiList/), and ComK boxes are indicated by triangles.

Not unexpectedly, putative ComK boxes precede several of the ComK-regulated gene clusters. The search for the ComK-binding motifs was based on the work of Hamoen et al. (1998), which identified AAAAN5TTTT as a dyad motif recognized by ComK. In the characterized ComK-driven promoters, a second such dyad element was always present, separated from the first by 8, 18 or 31 bp. Our search criteria permitted no more than two departures from consensus in a single dyad, and a total of no more than three mismatches in a single binding site, which are the limits observed with the promoters that have been studied in detail (Hamoen et al., 1998). We have also imposed the conservative requirement that any putative ComK binding site be located within 500 bp of the first codon of a ComK-regulated gene cluster. Searching the entire genome with these criteria, using the SEARCHPATTERN utility (http://genolist.pasteur.fr/SubtiList/), generated hits for many of the ComK-dependent genes. Of the 76 upregulated transcription units identified in the type 2 comparison, 20 possess putative ComK-binding motifs (Table 5). Of the nine downregulated genes listed in Table 4, three (yjfB, nrgA and ywzA) are also preceded by putative ComK boxes (Table 5). Of the 23 ComK boxes presented in Table 5, 20 are newly identified. Several hundred putative ComK boxes were detected that are apparently not associated with ComK-regulated genes. The significance of this finding is discussed below.
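
The search criteria described above can be sketched as a mismatch-tolerant scan of one DNA strand. The code below is an independent illustration and is not the SEARCHPATTERN utility. It scans only the strand supplied, does not apply the 500 bp distance filter, and uses hypothetical names (`find_comk_boxes`, `Hit`). The demonstration sequence is synthetic.

```python
from typing import Iterator, NamedTuple

HALF_SITE = "AAAANNNNNTTTT"   # dyad motif; N matches any base
SPACERS = (8, 18, 31)         # allowed distances (bp) between the two dyads


class Hit(NamedTuple):
    start: int       # 0-based position of the first dyad
    spacer: int      # spacer length between the dyads
    mismatches: int  # total mismatches over both dyads


def dyad_mismatches(window: str) -> int:
    """Count departures from consensus at the eight fixed positions of one dyad."""
    return sum(
        1 for base, ref in zip(window.upper(), HALF_SITE) if ref != "N" and base != ref
    )


def find_comk_boxes(seq: str, max_per_dyad: int = 2, max_total: int = 3) -> Iterator[Hit]:
    """Yield candidate ComK boxes: two dyads separated by an allowed spacer."""
    n = len(HALF_SITE)
    for i in range(len(seq) - n + 1):
        first = dyad_mismatches(seq[i:i + n])
        if first > max_per_dyad:
            continue
        for spacer in SPACERS:
            j = i + n + spacer
            if j + n > len(seq):
                continue
            second = dyad_mismatches(seq[j:j + n])
            if second <= max_per_dyad and first + second <= max_total:
                yield Hit(i, spacer, first + second)


if __name__ == "__main__":
    # Synthetic sequence: two perfect dyads separated by an 8 bp spacer.
    demo = "GC" + "AAAACCCCCTTTT" + "GCGCGCGC" + "AAAAGGGGGTTTT" + "GC"
    for hit in find_comk_boxes(demo):
        print(hit)
```

A scan of this kind is permissive by design, which is consistent with the large number of predicted boxes that are not associated with ComK-regulated genes.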

Table 5. Putative ComK-binding motifsª preceding selected ComK-regulated genes.
Bp from start codon | Gene | Pattern sequence
a. No more than a total of three mismatches were allowed for the entire motif (AAAANNNNNTTTT < N8, N18 or N31 > AAAANNNNNTTTT) and no more than two mismatches per half. The search was restricted to regions within 500 bp upstream of start codons. The ComK box motifs are underlined. Genes with previously identified ComK-binding motifs are in bold, and one such characterized gene is included in each spacing class for purposes of illustration. All the genes listed are upregulated in the presence of ComK except for yjfB, nrgA and ywzA, which are downregulated.

–240 nucA tgaaaacaaatatttcgatctgctaaaattacgatttctgc
–487 bdbD tagaagctaagcttttgcctagtgaaaaactcatttttgtc
–165 med gcaaaaacaaaaattatcgagtggaaaagacaattttcatc
–462 topA cccaaaaatggcatttccctatgagaaaccgtattatcagc
–106 ycbL aaaaaaagaataatttttataaaacacacacgtttttcaat
–170 yjbF ggaaatcggttttttttatgcacgaaaatgccattttagag
–175 yjfB ttgaaaacagcttttcgtattgctcaaaggtcttttccatt
–90 yndG ggataaagatactttttttggagagatatacacttttcaaa
–167 yqzG cgaaatcgctcattttgttcgtttaaaagaatgttttcatc
–105 comGA gaaagaattggtttttcagcatataacatctcacaaaatcacgttttccct
–114 pta aaaaataagatttttttatgaaagcgctataatgaaagttggctgtttgaa
–84 rapH tgaaaaaacaatttcttccttctatattgttttcaaaatttgggattgata
–199 ywfI tcaaacagccaactttcattatagcgctttcataaaaaaatcttatttttc
–163 ywpH ttaaaaagtacatatttcttcaaaggaaaaaagcaaaagatgtttttagct
–324 ywzA tcaaaaatttcattttgttaaactttctagataggaattattttttcctcg
–138 yyaF ttgaaaaaagcataatcaaagggattggcaagctgaaaatgcttttttgtc
–147 comK agtaaaatcggtttattactagtcatttagtaccattaaatatcattaaaagatgattttat
–429 hipO agaagaaagaaggtttgattctttggagtctttaacaaagcaaattaaaaaggatatttcgt
–84 nrgA catgaaaatgttttatcattcttttttctctataatgaagaaatataataattgctttttat
–230 ybdK cggaaaattttcttttatttatatgaaaatagagacagtatcctgacaaaggaacatttctt
–132 yfhB tataaaacaagaaattcttaatcgttacacccattttcttaaaaaagaaaatgggttttttt
–202 yhzC gataaaatcatcttttaatgatatttaatggtactaaatgactagtaataaaccgattttac
–169 yvrP tgaaaaaatcccggtttgaagccgggattttttttattcggctttgaaaatacaatttttct

Verification of ComK-dependent genes

Several candidate ComK-regulated genes (soj–spo0J, ywpH, yvrL, pta, smf and bdbDC) were studied further to verify the microarray-derived results. In one type of experiment, we used fluorescence microscopy to determine whether candidate ComK-dependent genes were expressed preferentially in the 10% of cells in a given culture that also expressed comK. This approach uses the majority non-comK-expressing cells as internal controls and does not employ a mecA background, permitting the study of gene expression in a wild-type context. For these experiments, we first selected ywpH and yvrL. The ywpH gene was of interest because its predicted product resembles single-strand DNA-binding proteins, and it might therefore play a role in transformation. yvrL and the gene immediately upstream (oxdC) were both identified as upregulated (Fig. 2C). Fusions of ywpH and yvrL to yfp (the gene for the yellow fluorescent protein) were constructed. These fusion constructs were each inserted at their homologous chromosomal loci by single reciprocal recombination. This was done in a genetic background that also included a fusion of comK to cfp, the gene for the cyan fluorescent protein, as well as an intact functional copy of comK, so that these strains exhibited normal expression of competence. The YFP and CFP fluorescence signals can be readily distinguished in the microscope using appropriate filter sets. The comK–cfp fusion permitted the microscopic identification of the ComK-expressing cells that were therefore competent (Haijema et al., 2001). If ywpH and yvrL were indeed ComK dependent, we would expect their YFP fluorescence signals to be expressed preferentially in the cells that also expressed CFP. Figure 3 documents that this was indeed the case; YFP and CFP are co-expressed in 5–10% of the cells, whereas in the majority of cells, neither fluorescent signal was detected above background. 
We have reported previously an all-or-nothing pattern for comK expression (Haijema et al., 2001), and the present results suggest that the transcription of ywpH and yvrL is greatly enhanced when ComK is active.

Figure 3.

yvrL, ywpH and spo0J are preferentially expressed in competent cells. Strains carrying a comK–cfp fusion and yvrL–yfp (A and B), ywpH–yfp (C and D) or spo0J–yfp (E and F) were examined for CFP (A, C and E) or YFP (B, D and F) fluorescence. The CFP autofluorescence is obvious and may be used to identify the locations of cells that do not synthesize ComK–CFP.

During the course of unrelated experiments, we observed enhanced fluorescence from a spo0J–yfp fusion in competence-expressing cells (Fig. 3). As expected, the Spo0J–YFP signal is visible in all the cells, as this protein is present even in the absence of ComK. It should also be noted that, in these stationary phase cells, the Spo0J–YFP signals were usually not localized at the cell poles as they are during exponential growth (Lin et al., 1997). Although the soj spo0J operon is not listed in Tables 2 or 3, these genes fell just below our arbitrary criteria. For instance, in the type 1 comparison, soj and spo0J both exhibited ratios of 1.5 with RNA pair A and 1.7 with RNA pair B. In the type 2 comparison, the ratios for soj and spo0J were 4.2 and 3.8 with RNA pair C and 2.8 and 2.4 with RNA pair D respectively. The results obtained by fluorescence and with the microarrays have also been confirmed in Western blots using antisera raised against Soj and Spo0J proteins; markedly stronger signals were obtained with both proteins in the wild-type compared with a comK mutant strain (not shown). We conclude that the soj spo0J operon is positively regulated by ComK, as indicated in Fig. 2G, possibly by readthrough from the upstream gidB and yyaA genes.

We performed a similar experiment with pta, which encodes phosphotransacetylase and was predicted to be ComK dependent (Tables 2 and 3). This gene was chosen for study as it is involved in intermediary metabolism, is presumably not required for transformation and is most likely expressed even in non-competent cells. As expected, a strong YFP signal was detected in all the cells in the population (not shown), and no difference could be detected between the competent and non-competent cells. A transcriptional fusion of pta to E. coli lacZ was then obtained from T. Henkin, and β-galactosidase activity was determined in wild-type, mecA, mecA comK and comK backgrounds. Figure 4A demonstrates that the pta–lacZ fusion in the wild-type, comK and mecA backgrounds was growth stage regulated in the three strains, and that the fusion was expressed slightly more in the mecA background and slightly less in the comK background. These small effects would be reasonable for a gene that is ComK regulated but can also be expressed independently of ComK. In the mecA comK double mutant, however, the expression of the fusion was markedly reduced. Taken together, these data support the microarray results and suggest that pta transcription is ComK dependent, at least in a mecA background. We do not understand why the type 1 comparison suggests a robust effect of ComK on pta transcription (Table 2), whereas only a small effect was apparent using the pta–lacZ fusion (Fig. 4A). It is possible that the amount of lacZ mRNA produced from the pta fusion promoter does not always faithfully reflect that of the pta transcript.

Figure 4.

smf and pta are regulated by ComK. Strains carrying pta (A) and smf (B) fusions to lacZ were grown in competence medium and sampled at the indicated times for assay of β-galactosidase.

A. pta–lacZ (closed squares), pta–lacZ mecA (open circles), pta–lacZ comK (closed circles), pta–lacZ mecA comK (open triangles).

B. smf–lacZ (closed squares), smf–lacZ mecA (open circles), smf–lacZ comK (closed circles).

smf is apparently ComK dependent (Tables 2 and 3) and is highly similar to genes from Haemophilus influenzae, H. pylori and S. pneumoniae that are required for full transformability in those organisms (Karudapuram et al., 1995; Campbell et al., 1998; Smeets et al., 2000). Figure 4B confirms that the transcription of a smf–lacZ fusion is ComK dependent and is strongly induced when mecA is inactivated.

A further confirmation of the microarray data was performed as part of another study (Meima et al., 2001). The bdbDC operon encodes a pair of thiol-disulphide oxidoreductases that are required for transformation, apparently because they are needed for the correct folding of ComGC, an essential competence protein. The microarray data suggest that this operon is upregulated in the presence of ComK. bdbD satisfies our criteria in both type 1 and type 2 experiments. bdbC is called in the type 2 experiment but exhibits ratios of 2.3 with both RNA pairs in the type 1 comparison. The comK dependence of the bdbDC operon has been confirmed using a transcriptional fusion to lacZ (Meima et al., 2001).

Transformation phenotypes of new ComK-dependent genes

We have tested mutants of several of the newly identified ComK-dependent genes for transformability. Inactivation of smf resulted in a 100-fold decrease in transformation frequency when selection was for Leu prototrophy (not shown). Insertion of an intact copy of smf in the ectopic amyE locus, under pSPAC control, fully complemented the transformation phenotype of the smf knock-out. Also, inactivation of the ssb paralogue ywpH resulted in a 10-fold decrease in transformation (not shown). To demonstrate that the ywpH phenotype was not caused by polar effects on downstream genes (glcR and ywpJ), we constructed a double knock-out of these with no discernible effect on transformability. It is interesting that ssb expression is also elevated when ComK is present (Tables 2 and 3). ssb could not be inactivated (not shown), and this gene is most probably essential for viability. Inactivation of oxdC (predicted to be polar on yvrL) or of pta had no effects on transformation. Finally, it should be noted that yjbF is an orthologue of coiA, which has been characterized as a competence gene in S. pneumoniae (Pestova and Morrison, 1998).


Expression profiling documents a profound alteration of the gene expression programme in response to the synthesis of ComK. This can be visualized in Fig. 1. Although we have emphasized the arbitrarily adopted expression ratio cut-offs of 3.0 to identify ComK-regulated genes, inspection of Fig. 1B and C conveys the impression that many additional genes that fall within these limits are affected by ComK. In fact, for several reasons, we regard the gene lists in Tables 2–4 as conservative. First, we have deliberately adjusted our criteria to exclude a small number of the known ComK-regulated genes (see Experimental procedures). Secondly, in several cases, a few genes in an operon are identified as ComK regulated, whereas others are not. The genes not identified often narrowly fail to satisfy our criteria. An example of this is provided by the gene cluster composed of maf, radC, mreB, mreC, mreD, minC and minD (Fig. 2F). In the type 1 comparison, all these genes were identified except for minC and minD (Table 2). minC exhibited average ratios of 2.7 and 3.5 using RNA pairs A and B, and minD exhibited average ratios of 2.0 and 2.5. In the type 2 comparison, both genes met the ratio criterion with both RNA pairs, but minD was called by SAM only with RNA pair C. These clustered genes therefore did not always meet our criteria, although it appears nearly certain that they are both actually upregulated in the presence of ComK. Additional examples of this kind were presented above in the validated cases of bdbD, bdbC, soj and spo0J. These considerations lead us to conclude that the actual list of genes regulated by ComK is larger than indicated by Tables 2–4. It is noteworthy that we detected more genes upregulated by ComK in the type 2 than in the type 1 comparisons. This was predicted, as only a minority of the cells in a mecA+ culture express comK.
In this connection, it is noteworthy that many of the genes called in the type 2 but not in the type 1 comparison are also expressed independently of ComK. Among these are argB, argD, argJ, csrA, degS, degU, dnaG, minC, minD, sigA and topA. As these genes will be expressed in the non-comK-expressing cells, the fold change in their expression in the presence and absence of ComK will be decreased.

One plausible explanation for the large number of genes apparently regulated by ComK is that the transcription of a distinct subset of affected genes is directly dependent on the binding of ComK to their promoters, and that the products of these genes have secondary effects on many others. It is notable in this connection that 28 genes, identified in this study as dependent on ComK, have been annotated as regulators of transcription or translation (comZ, degS, degU, hisZ, hxlR, med, mtlR, phrF, rapH, rnpA, rpsF, rpsL, sacX, sigA, spoIIID, spoIIIJ, usd, ybdK, ycbL, ycbM, ydeC, ydeE, yvhJ, ywfK, ywoH, ywpD, ywtF and yxaF). Of these, three (med, rapH and ycbL, in addition to comK itself) are associated with predicted ComK boxes, and any of them may be responsible for the ComK-activated transcription of additional genes. An additional 19 ComK-regulated genes appear to be involved in intermediary metabolism (argB, argC, argD, argJ, aroC, fabHA, hxlB, hipO, hisG, malL, pta, sucD, trpB, ycgS, yddQ, ywpJ, ywfG, yoxD and xpt), and their activities may easily affect the expression of downstream genes. In general, it is likely that homeostatic mechanisms will tend to dampen such ripple effects. However, these dampening mechanisms may be overpowered by regulatory proteins that directly activate or repress many genes. In the present case, it is clear that the set of genes directly dependent on ComK is large, as at least 24 such genes, organized in nine operons, have been described previously (Ogura et al., 1997; Hamoen et al., 1998; Ogura and Tanaka, 2000), and many of the genes identified in this study appear to be associated with ComK boxes (Table 5).

It is satisfying that many of the ComK-activated genes are organized in clusters, or putative operons, because if one gene in an operon is ComK dependent, one would expect the same to be true, by and large, of the others. It is also satisfying that many of the ComK-dependent genes are preceded by apparent ComK boxes. This strongly suggests that the transcription of these genes may be directly enhanced (or repressed) by ComK binding. However, a cautionary note is sounded by the observation that several hundred genes that are apparently not activated by ComK are also preceded by predicted ComK boxes (not shown). It is likely that our understanding of the rules defining functional ComK boxes is deficient, and that many of these sequences have been erroneously identified while others may have remained unpredicted. It will be important to confirm the direct activity of ComK at a sample of these predicted ComK boxes and to refine our understanding of what constitutes a ComK regulation site. However, it is also possible that many of these ComK boxes are in fact not overpredicted, but represent binding sites that do not function directly to modulate the transcription of adjacent genes. One such site was apparently identified in a previous study (Hamoen et al., 1998). ComK-expressing cells contain about 90 000 monomers of this transcription factor, or about 23 000 active tetramers (Turgay et al., 1998). This is highly unusual for a regulatory protein, suggesting that ComK may play a structural role, adjusting the folding of DNA in the nucleoid, in addition to its activity as a directly acting transcription factor.

An interesting pattern can be detected in five instances in which divergently transcribed genes are ComK dependent. These pairs of genes, identified among those listed in Tables 2 and 3, are pta/ywfI, comK/yhzC, degS/yvyE, yckB/yckC and hxlA/hxlR. At least in the cases of comK/yhzC and pta/ywfI, there are predicted (palindromic) ComK boxes between the divergent transcription units, within 500 bp of the respective start codons. This raises the possibility that ComK can activate transcription in both directions after binding to a single palindromic site. This arrangement may permit modulation of gene expression by global changes in supercoiling (Opel et al., 2001).

Induction by ComK of several newly identified ComK-driven genes was verified with yfp or lacZ fusions. In the cases of bdbDC, ywpH, yvrL, spo0J–soj and smf, ComK-dependent transcription was confirmed (Figs 3 and 4) (Meima et al., 2001). As noted above, it is possible that some of the ComK-activated genes that we have identified comparing mecA comK and mecA RNA may not be significantly dependent on ComK for their transcription in a mecA+ background. However, the substantial overlap of the ComK-dependent genes identified in mecA comK versus mecA and in the comK versus wild-type comparisons (Tables 2 and 3) strongly suggests that most of the genes identified are truly ComK dependent even in the presence of a functional mecA gene.

The 174 up- or downregulated genes that meet the three-fold cut-off criterion in the mecA versus mecA comK experiment fall into several categories. Although many are genes of unknown function, in many cases, they have been annotated in the database as exhibiting similarities to genes for which a function has been established. Based on these database annotations (http://genolist.pasteur.fr/SubtiList/) and additional similarity searches, we have classified many of the activated genes as involved in transformation (25), sporulation (six), cell shape determination (including peptidoglycan synthesis) and cell division (11), transcriptional regulation (23), transport (11), intermediary metabolism (22), protein synthesis (five), DNA metabolism (10) and detoxification and the stress response (four). It is clear that the ComK regulon affects disparate cellular functions.

The competence-associated transcription of DNA repair genes has been reported previously (Love et al., 1985). Among the transformation-related genes are several newly identified late competence gene candidates: ywpH (an ssb paralogue), yjbF (which resembles coiA of S. pneumoniae; Pestova and Morrison, 1998) and smf. It was not expected that sporulation genes would be transcribed in response to ComK, as it is generally believed that sporulation and competence are mutually exclusive. It is interesting in this connection that spo0A was found to be downregulated by ComK (Table 4). However, all six ComK upregulated sporulation genes act at or after stage III, and their induction in the absence of functional early spore gene expression would not presage entry into the sporulation pathway.

It is certain that many (probably most) of the newly identified ComK-dependent genes are not required for competence, originally defined as receptivity to transformation (Lerman and Tolmach, 1957), nor for the recombination and recovery steps that follow DNA uptake. We have demonstrated that this is the case for pta and oxdC/yvrL, and it is certainly true for many of the 22 intermediary metabolism genes, the six sporulation genes and many of the newly identified ComK-dependent transcriptional regulators. It is therefore no longer appropriate to refer to the ComK-determined physiological state as ‘competence’, as more is involved than transformability. We propose to refer to this instead as the ‘K-state’, a neutral term with no functional connotation.

The cell shape and cell division genes are of particular interest, as the K-state is associated with inhibition of cell elongation and division (Hahn et al., 1995; Haijema et al., 2001). The competence gene comGA plays a role in the inhibition of these two processes. One ComK-dependent gene cluster (Fig. 2) includes the genes for Maf (an inhibitor of cell division that has also been implicated in DNA repair; Butler et al., 1993; Minasov et al., 2000), MreB, MreC, MreD (shape determining factors; Jones et al., 2001), MinC and D (inhibitors of cell division; Levin et al., 1992) and RadC, a probable DNA repair protein. In addition to this cluster, mbl, tuaF, tuaG, cwlH and cwlJ are activated. Mbl plays a role in cell shape determination (Henriques et al., 1998; Jones et al., 2001), TuaG and F are required for the synthesis of teichuronic acid (Soldo et al., 1999), and CwlH and J are cell wall hydrolases. It appears that the K-state is accompanied by a reprogramming of cell shape, cell division and cell wall synthesis genes.

A minority of the cells in a given population reach the K-state, and these cells are arrested in cell division and growth (Haijema et al., 2001). The reversal of this growth inhibition requires at least the degradation of ComK (Turgay et al., 1998). In this sense, the K-state appears to be a resting state and is associated with the induction of a number of genes (exoA, radC, recA, ssb, topA, maf and dinB) that are likely to be involved in DNA repair. The arrest of cell division and growth may be an advantage if the K-state has evolved in part to deal with DNA damage, or if DNA repair is required after transformation, as growth in the presence of DNA lesions may be detrimental. In E. coli, the SulA protein is induced as part of the SOS regulon and inhibits cell division (Bi and Lutkenhaus, 1993), presumably until DNA damage has been repaired. Several genes that are activated in the K-state might facilitate the assimilation of novel nutritional sources. These include malL, sucD, yoxD and ycgS and the putative transport genes pbuX, yckA, yckB, ycbN, ywfF, yvrO, yvrN, yvrM, yqiX, ywoG and yvrP. Several of these transport proteins, in particular ywfF and ywoG, might function instead as detoxifying efflux pumps. In this connection, it is worth mentioning some additional ComK-dependent genes. oxdC encodes an acid-induced oxalate decarboxylase (Tanner and Bornemann, 2000), which has been suggested to play a role in pH homeostasis in response to acid stress. hxlA and hxlB encode enzymes of the ribulose monophosphate pathway (Yasueda et al., 1999), and hxlR encodes an activator of their expression. It was suggested that this pathway is involved in the detoxification of formaldehyde. ComK induces all three of these genes. Additional stress response genes that are apparently upregulated in response to ComK include groES and possibly yqxD. Finally, two genes are likely to be involved in the synthesis of antibiotics (sboA and cypC; Hosono and Suzuki, 1983; Matsunaga et al., 1999), which may serve to eliminate competitors.

In conclusion, we propose that the K-state is a global adaptation to stress, distinct from sporulation, which enables the cell to repair DNA damage, to acquire new fitness-enhancing genes by transformation, to use novel substrates (possibly including DNA; Redfield, 1993; Finkel and Kolter, 2001) and to detoxify environmental poisons. This view of the K-state suggests a reason for its expression in only a fraction of the cells in a given population. The K-state represents a specialized strategy for dealing with danger, but also carries with it inherent risks. Transient arrest of growth and cell division confers vulnerability to overgrowth by competing populations, and transformability opens the cell to invasion by foreign DNA. The genome may therefore activate alternative systems in subpopulations to deal with adversity, and the K-state may be one such system. This strategy maximizes the probability that the genome will survive when faced with changing environments, a valuable capability for a soil-dwelling organism.

Experimental procedures

Bacillus subtilis strains and strain construction

All strains used were derivatives of B. subtilis strain 168, and all comparisons were made between isogenic pairs of strains. The following strains were used as sources of RNA for the microarray comparisons: BD630 (his leu met); BD2121 (his leu met comK::kan); BD2103 (his leu met mecA::spc); and BD2125 (his leu met mecA::spc comK::kan). The pta knock-out and pta–lacZ strains were gifts from T. Henkin.

The oxdC knock-out was constructed as follows. An internal 630 bp fragment of oxdC was amplified from the chromosome using the primers YVRK1 (5′-GGCGGATATGCCCGGGAAGTG) and YVRK2 (5′-GCTCTAGATCGGGTGCCAGTGCAG). Underlined letters represent the sites for SmaI and XbaI respectively. The polymerase chain reaction (PCR) product was digested with these enzymes and cloned into the similarly digested vector pUC18CM. Plasmid DNA from the resultant clone was transformed into BD630 where it integrated by single reciprocal recombination, inserting the entire plasmid in oxdC and creating BD3226.

To disrupt the ywpH gene, a 1428 bp region bearing the ywpH gene was obtained by PCR amplification of BD630 chromosomal DNA using the primers YWPH-FP(#19)-EcoRI (5′-CGGAATTCCTGCAATGTTGGGGC) and YWPH-RP(#19)-SphI (5′-ACTGCGGCATGCGGTCTCCATCATGAATAAG). This fragment was cloned into pCR2.1-TOPO (Invitrogen). The resulting plasmid, pED395, was digested with MfeI, and a spectinomycin resistance cassette obtained by EcoRI digestion of pIC156 was inserted in ywpH. We used this plasmid, pED396, to transform B. subtilis, generating BD3056. The disruption of ywpH by replacement recombination was confirmed by PCR. To disrupt glcR and ywpJ, a 1912 bp fragment was amplified by PCR from BD630 chromosomal DNA using the primers GLCR-FP(#50)-BamHI (5′-CGGGATCCAAAAAGGTTCTCTCGTC) and GLCR-RP(#50)-EcoRI (5′-CGGAATTCCCGCAGCCTCCAGC). The product was digested with EcoRI and BamHI and cloned into pUS19, a pUC19 derivative with a spectinomycin resistance cassette inserted between the NdeI and NarI sites. The resulting plasmid, pED480, was digested with NarI, and a kanamycin resistance cassette, obtained by ClaI digestion of pDG782 (Guerout-Fleury et al., 1995), was inserted. This plasmid, pED485, was transformed into BD630, giving rise to BD3194. In this mutant strain, the kanamycin resistance cassette replaced the last 612 bp of glcR and the first 196 bp of ywpJ.

A fusion of comK to the cfp gene was constructed as follows. A 383 bp comK DNA fragment, containing the first nine codons and the regulatory region of comK, was amplified from the chromosome using the primers YFP-EcoRI (5′-GACATCGAATTCTTTTGTT) and YFP-XhoI (5′-CCGCTCGAGTAAAGGTGCGTCTGTTTTCTG). The fragment was digested with EcoRI and XhoI and used to replace the spo0J fragment carried by the spo0J–yfp fusion vector pKL184. This plasmid was a gift from K. Lemon and A. Grossman. The comK–yfp vector was digested with EcoRI and NcoI, and a fragment containing the comK sequences, fused to the first 191 bp of the yfp coding sequence, was isolated and cloned into pFG10, generating a comK–cfp fusion plasmid. pFG10 was a gift from F. Gueiros and R. Losick. The resulting plasmid was then used to transform BD630 by single reciprocal recombination, integrating into the chromosomal comK gene to create BD2961.

From the comK–yfp fusion, ywpH–yfp and yvrL–yfp fusions were prepared by excising the comK sequences by cutting with EcoRI and XhoI and replacing this fragment with PCR products containing the ywpH or yvrL promoter regions. For ywpH, the primers were YWPH1 (5′-CGGAATTCATTTACAATACATACAACCGC) and YWPH2 (5′-CCGCTCGAGATCAGCAGCTTTTTCCCGGGG). For yvrL, the primers were YVRL1 (5′-CGGAATTCGCACTCGTTCTCTGCAAGCCT) and YVRL2 (5′-CCGCTCGAGGCTGGGAAGAGCCGAGTTGCA). The EcoRI and XhoI sites are underlined. The ywpH–yfp and yvrL–yfp fusion constructs, marked by chloramphenicol resistance, were combined with a comK::kan mutation to create BD2990 and BD3010 respectively.
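Because the underlining of the restriction sites does not survive in plain text, the sites can be located in the primers by a simple string search. The following is a minimal sketch: the primer sequences are those given above, the recognition sequences are the standard ones for EcoRI (GAATTC) and XhoI (CTCGAG), and the helper name `find_sites` is hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

# Primer sequences as given in the text (5'->3').
PRIMERS = {
    "YWPH1": "CGGAATTCATTTACAATACATACAACCGC",
    "YWPH2": "CCGCTCGAGATCAGCAGCTTTTTCCCGGGG",
    "YVRL1": "CGGAATTCGCACTCGTTCTCTGCAAGCCT",
    "YVRL2": "CCGCTCGAGGCTGGGAAGAGCCGAGTTGCA",
}


def find_sites(primer):
    """Return {enzyme: 0-based start position} for each site found in the primer."""
    seq = "".join(primer.split()).upper()  # tolerate stray whitespace
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


site_map = {name: find_sites(seq) for name, seq in PRIMERS.items()}
for name, hits in site_map.items():
    print(name, hits)
```

In each primer the site sits a few bases from the 5′ end, as expected for a site added to allow digestion of the PCR product.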

To construct an smf–lacZ fusion, the C-terminus of smf was amplified from BD630 chromosomal DNA with PCR using the primers SMF-FP(#31)-EcoRI (CGGAATTCGTGATCAAATAAAAGCAGC) and SMF-RP(#31)-BamHI (CGGGATCCTTAAAGGGTTCCGTATATTGAAC). The product was cloned into pMutin2 (Vagner et al., 1998) using the underlined EcoRI and BamHI sites. The resulting plasmid, pED455, was integrated into the BD630 chromosome by single cross-over recombination, leading to smf–lacZ transcriptional fusion without the disruption of smf (BD3148). The chromosomal DNA of comK::spc (BD2259) was transformed into BD3148 to give BD3150. BD3158 was obtained by transforming BD3148 with chromosomal DNA from mecA::kan (BD2090).

To disrupt the smf gene, an 1800 bp fragment, bearing the smf gene, was obtained by PCR amplification of chromosomal DNA with SMF-FP(#20)-EcoRI (CGGAATTCAACATACGAAGCGGTGC) and SMF-RP(#20)-SphI (ACTGCGGCATGCGTTTTCAGCTCTTTTAATACC). This fragment was digested with SphI and cloned into pUK19, a pUC19 derivative with a kanamycin resistance cassette inserted into the NarI site. The resulting plasmid, pED398, was digested with MfeI, and the spectinomycin resistance cassette obtained by EcoRI digestion of pIC156 (Steinmetz and Richter, 1994) was inserted. pED398 was used to transform BD630 (his leu met) for spectinomycin resistance to produce BD3024. A spectinomycin-resistant and kanamycin-sensitive transformant obtained from this cross had a disrupted smf produced by replacement recombination, as verified by PCR.

DNA microarrays

A complete set of B. subtilis ORF-specific PCR primers was purchased from Eurogentec and used to amplify the protein-coding ORFs from B. subtilis 168 genomic DNA. Chromosomal DNA was prepared using the method described by Pitcher et al. (1989). The following PCR components were combined in 96-well plates: 25 pmol of each primer, 1× Taq polymerase buffer (PE Applied Biosystems), 0.25 mM dATP, 0.25 mM dCTP, 0.25 mM dGTP, 0.25 mM TTP, 2 mM MgCl2, 0.1 μg of B. subtilis genomic DNA and 1.75 U of Taq polymerase (PE Applied Biosystems). An MJ Research PTC-225 thermocycler was programmed to incubate the reactions for 36 cycles of 95°C (30 s), 56°C (45 s) and 72°C (3 min 30 s). An aliquot of each completed reaction was analysed by agarose gel electrophoresis to ensure that a PCR product of the correct size and adequate yield was obtained. The success rate for a single-pass amplification of the 4107 ORFs was ≈ 92%. Failed reactions, most often occurring for genes with long ORFs (≈ 3–15 kb), were repeated using Expand polymerase mixture (Roche Molecular Biochemicals), and the reaction products were verified by agarose gel electrophoresis. Collectively, the PCR product pools generated in two rounds of amplifications represented ≈ 95% of the protein-coding ORFs of B. subtilis. The amplified B. subtilis ORFs were precipitated with isopropanol, resuspended in 15 μl of 3× SSC, and 5 μl aliquots were stored at –20°C in 384-well microplates (Eisen and Brown, 1999). From these plates, the ORFs were spotted onto poly L-lysine-coated glass microscope slides using the equipment and methods that are described on the website of P. O. Brown of Stanford University (http://cmgm.stanford.edu/pbrown/protocols).

Preparation of probes and hybridization

Fluorescent probes were prepared by reverse transcription of 25 μg of total RNA from B. subtilis to incorporate aminoallyl-dUTP into first-strand cDNA. The amino-cDNA products were subsequently labelled by direct coupling to either Cy3 or Cy5 monofunctional reactive dyes (Amersham Pharmacia Biotech). The details of this protocol can be found at http://cmgm.stanford.edu/pbrown/protocols. Cy3- and Cy5-labelled probes were combined and purified using Qiaquick PCR spin columns (Qiagen). The purified probes were dried under vacuum in a SpeedVac (Savant Instruments), resuspended in 15.5 μl of water and combined with the following: 3.6 μl of 20× SSC, 2.5 μl of 250 mM HEPES (pH 7.0), 1.8 μl of poly dA (500 μg ml–1; Amersham Pharmacia Biotech) and 0.54 μl of 10% SDS. Before hybridization, the solution was filtered with a 0.22 μm Ultrafree-MC microcentrifuge filter (Millipore), boiled for 2 min and cooled to room temperature. The probe was then applied to the microarray under a coverglass, placed in a humidified chamber and incubated at 63°C overnight. Before scanning, the arrays were washed consecutively in 1× SSC with 0.03% SDS, 0.2× SSC and 0.05× SSC and centrifuged for 2 min at 500 r.p.m. to remove excess liquid. Lastly, the slides were imaged using a custom-built confocal laser microscope (Eisen and Brown, 1999). For each type of comparison, two independent sets of RNA samples were used. For the type 1 comparisons (the wild-type BD630 compared with the comK mutant BD2121), we used RNA sample pairs A and B. For the type 2 comparisons (mecA strain BD2103 compared with the mecA comK strain BD2125), we used RNA sample pairs C and D. Each comparison with a given RNA pair was repeated a number of times. The type 1 experiments were repeated 18 and 12 times with RNAs A and B respectively. The type 2 experiments were repeated 16 and nine times with RNAs C and D respectively.
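The final composition of the hybridization mixture follows from the volumes and stock concentrations listed above. The following is a minimal sketch of that arithmetic, assuming that the volumes are additive and that the dried probe contributes no volume of its own; the variable names are illustrative.

```python
# (volume in microlitres, stock concentration, unit) as listed in the text.
components = {
    "probe in water": (15.5, None, ""),
    "SSC": (3.6, 20.0, "x"),
    "HEPES": (2.5, 250.0, "mM"),
    "poly dA": (1.8, 500.0, "ug/ml"),
    "SDS": (0.54, 10.0, "%"),
}

# Assumption: volumes simply add up.
total_volume = sum(volume for volume, _, _ in components.values())

# Final concentration = stock concentration x (component volume / total volume).
final = {
    name: stock * volume / total_volume
    for name, (volume, stock, _) in components.items()
    if stock is not None
}

print(f"total volume: {total_volume:.2f} ul")
for name, value in final.items():
    print(f"{name}: {value:.2f} {components[name][2]}")
```

Under these assumptions the mixture comes to just under 24 μl, with SSC at about 3× and HEPES at about 26 mM.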

Treatment of data

Relative fluorescence values for microarray spots were quantified using QUANTARRAY software from GSI Lumonics. The raw Cy3 and Cy5 intensity values were normalized in order to equalize the median values of the two data sets.
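The median normalization can be sketched as follows. This is only an illustration, assuming that one channel is rescaled by a single global factor so that the two medians coincide; the exact procedure used is not specified beyond equalizing the medians, and the intensities below are made up.

```python
from statistics import median


def normalize_to_equal_medians(cy3, cy5):
    """Scale the Cy5 channel so that its median equals the Cy3 median.

    Assumed approach (single global scaling factor), not necessarily the
    procedure used in the study.
    """
    factor = median(cy3) / median(cy5)
    return list(cy3), [value * factor for value in cy5]


# Made-up raw intensities for five spots (not data from the study).
cy3_raw = [120.0, 300.0, 450.0, 800.0, 1500.0]
cy5_raw = [200.0, 640.0, 900.0, 1500.0, 9000.0]

cy3_norm, cy5_norm = normalize_to_equal_medians(cy3_raw, cy5_raw)
ratios = [red / green for green, red in zip(cy3_norm, cy5_norm)]
print([round(r, 2) for r in ratios])
```

After scaling, most spots have ratios near 1, and a spot with genuinely different expression (the last one here) stands out.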

Two approaches to the analysis of data were used. In the first approach, we analysed ratios derived from normalized fluorescence intensity data. We calculated the average ratio of wild type to comK (type 1 comparison) or mecA to mecA comK (type 2 comparison) using normalized fluorescence intensities. We then selected a threshold ratio, above which we regarded a gene as ComK dependent for its transcription. To arrive at this threshold, we took advantage of the considerable published knowledge base concerning the ComK regulon. It has been shown, using reporter gene analysis, that the transcription of the following 24 genes is dependent on ComK: addA, addB, comC, comEA, comEB, comEC, comER, comFA, comFB, comFC, comGA, comGB, comGC, comGD, comGE, comGF, comGG, comK, flgM, med, nin, nucA, recA and comZ (Ogura et al., 1997; Hamoen et al., 1998; Liu and Zuber, 1998; Ogura and Tanaka, 2000). We then selected a threshold ratio that called nearly all these genes as ComK dependent. We deliberately set this ratio so that a small number of the test genes were not called, in order to limit the numbers of false positives. Using a threshold of three, 18 and 21 test genes were called with RNA samples A and B in the type 1 comparisons, and 22 genes were called with each of the two RNA samples in the type 2 comparisons.
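The logic of choosing a threshold against a set of known ComK-dependent genes can be sketched as follows. This is a minimal illustration with made-up average ratios; only the gene names come from the text, the function names are hypothetical, and the use of "greater than or equal to" for the cut-off is an assumption.

```python
def call_genes(avg_ratios, threshold=3.0):
    """Return the genes whose average ratio meets or exceeds the threshold."""
    return {gene for gene, ratio in avg_ratios.items() if ratio >= threshold}


def count_recovered(avg_ratios, known_genes, threshold):
    """Count how many known ComK-dependent genes are called at this threshold."""
    return len(call_genes(avg_ratios, threshold) & set(known_genes))


# Made-up average ratios (not data from the study).
avg_ratios = {"comGA": 25.0, "comK": 12.0, "recA": 3.4, "nucA": 2.8,
              "minC": 2.7, "minD": 2.0, "sigA": 1.2}
known_genes = ["comGA", "comK", "recA", "nucA"]

for threshold in (2.0, 3.0, 5.0):
    recovered = count_recovered(avg_ratios, known_genes, threshold)
    total = len(call_genes(avg_ratios, threshold))
    print(f"threshold {threshold}: {recovered}/{len(known_genes)} known genes, {total} genes called")
```

Lowering the threshold recovers more of the known genes but also calls more genes overall, which is the trade-off between missed test genes and false positives described above.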

The normalized fluorescence intensity data were also analysed using the significance analysis of microarrays (SAM) method (Tusher et al., 2001). SAM calculates a t-like statistic (dI) and then uses a bootstrapping procedure to determine the significance of this statistic for each gene. SAM permits the adjustment of a significance parameter Δ. Genes with dI values exceeding Δ or below –Δ are called as significant. Although in both type 1 and 2 comparisons, our test genes were called as significant by a wide range of Δ-values, we have selected Δ for each comparison, using our test set of 24 genes as described above. Again, we adopted a conservative approach by selecting Δs that excluded a small number of the test genes. With the chosen values of Δ, the type 1 experiments called 17 and 16 of the test genes with RNA samples A and B (Δ = 1.75 and 10 respectively). In the type 2 comparisons, 21 and 18 test genes were correctly called as ComK dependent with RNA samples C and D (Δ = 9 and 6 respectively).
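The kind of t-like score on which SAM is based can be illustrated with a simplified sketch. This is not the SAM software and it omits the resampling step that estimates significance: it only computes, for each gene, the mean log2 ratio divided by its standard error plus a small constant, and then applies a cut-off to the score as described above. The constant `s0`, the cut-off `delta` and the replicate ratios are all assumed values chosen for illustration.

```python
from math import log2, sqrt
from statistics import mean, stdev


def d_statistic(ratios, s0=0.1):
    """t-like score for one gene from replicate expression ratios.

    d = mean(log2 ratio) / (standard error + s0). The constant s0 keeps
    genes with very small variance from producing huge scores; SAM derives
    it from the data, whereas here it is a fixed, assumed value.
    """
    logs = [log2(r) for r in ratios]
    standard_error = stdev(logs) / sqrt(len(logs))
    return mean(logs) / (standard_error + s0)


# Made-up replicate ratios for hypothetical genes (not data from the study).
replicates = {
    "geneA": [8.0, 8.0, 4.0, 16.0],     # consistently induced
    "geneB": [1.0, 2.0, 0.5, 1.0],      # no net change
    "geneC": [0.25, 0.25, 0.5, 0.125],  # consistently repressed
}
delta = 2.0  # assumed cut-off for this toy example

scores = {gene: d_statistic(r) for gene, r in replicates.items()}
called = {gene: ("up" if d > delta else "down" if d < -delta else "not called")
          for gene, d in scores.items()}
print(called)
```

Because the score divides by the replicate variability, a gene with a modest but reproducible change can be called while a gene with a large but erratic ratio is not, which is why the SAM calls and the ratio calls do not always coincide.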

As a control, we compared two independent samples of mecA (BD2103) RNA (from sample pairs C and D). This comparison was repeated 12 times. No genes exhibited average ratios greater than or equal to 3 or less than or equal to 0.33 (Fig. 1A). When analysed using SAM, no genes were called as either up- or downregulated.

The normalized data for the complete gene sets, together with the calculated ratios, can be viewed or downloaded from http://www.phri.nyu.edu/dubnaud/dubnaudbio.htm. The raw data are also available (contact R. Berka at rambo@novozymesbiotech.com).

Culture conditions and isolation of total RNA

Cultures (250 ml) were grown in competence medium (Albano et al., 1987) and harvested 2 h after the transition to stationary phase, at which time the wild type (BD630) reaches maximum competence. The cultures were chilled rapidly by the addition of ice, and the cells were collected by centrifugation at 4°C. The pellets were resuspended in 2.5 ml of ice-cold sterile water. RNA was prepared using the FastRNA blue kit (BIO101), according to the manufacturer’s instructions, with minor modifications. Two hundred microlitres of cell suspension was added to each FastPrep blue tube containing the lysing matrix as well as the other required reagents. These mixtures were shaken in the FastPrep instrument at a speed rating of 6, twice for 45 s and then for an additional period of 30 s. The rest of the procedure was as described (BIO 101). The final RNA pellets were dissolved in 50 μl of SAFE buffer (provided in the FastRNA kit) and pooled. Typically, 3–4 mg of RNA was obtained from 12 tubes. Each RNA sample was analysed on agarose gels to assess quality. In all cases, 23S rRNA signals were detected with greater intensity than the 16S rRNA bands.

Fluorescence microscopy

Cultures were grown in competence medium (Albano et al., 1987) to 2 h past the transition from the exponential to the stationary phase of growth. Cells were processed for microscopy as described previously (Haijema et al., 2001). Images were collected with the OPENLAB software package (Improvision) and then exported to ADOBE PHOTOSHOP, where they were prepared for publication. We used a Zeiss Axiovert 135M microscope equipped with an Orca digital camera (Hamamatsu) and a Zeiss 1.3 NA Plan Neo-Fluar 100× oil immersion objective. For detecting fluorescent images, the following optical filter sets were used: CFP, D480/20m (emitter), D436/20 (exciter), 455dclp (dichroic), all from Chroma Technology; YFP, XF3082 (emitter), XF1068 (exciter), XF2030 (dichroic), all from Omega Optical.


This study was funded by grant GM57720 from the National Institutes of Health. We gratefully acknowledge the assistance of Michael W. Rey of Novozymes Biotech, Inc. in developing a database for storage of microarray data. We also thank Joe DeRisi, Vishy Iyer, Michael Eisen and Arkady Khodursky for assistance in constructing our arrayer and scanner and for helpful discussions. Finally, we thank T. Henkin, F. Gueiros, R. Losick, K. Lemon and A. Grossman for donating strains and antisera, and all the members of our respective laboratories for helpful discussions.