Genome-wide analysis of the general stress response in Bacillus subtilis


  • The first two authors contributed equally to this work.


Bacteria respond to diverse growth-limiting stresses by producing a large set of general stress proteins. In Bacillus subtilis and related Gram-positive pathogens, this response is governed by the σB transcription factor. To establish the range of cellular functions associated with the general stress response, we compared the transcriptional profiles of wild and mutant strains under conditions that induce σB activity. Macroarrays representing more than 3900 annotated reading frames of the B. subtilis genome were hybridized to 33P-labelled cDNA populations derived from (i) wild-type and sigB mutant strains that had been subjected to ethanol stress; and (ii) a strain in which σB expression was controlled by an inducible promoter. On the basis of their significant σB-dependent expression in three independent experiments, we identified 127 genes as prime candidates for members of the σB regulon. Of these genes, 30 were known previously or inferred to be σB dependent by other means. To assist in the analysis of the 97 new genes, we constructed hidden Markov models (HMM) that identified possible σB recognition sequences preceding 21 of them. To test the HMM and to provide an independent validation of the hybridization experiments, we mapped the σB-dependent messages for seven representative genes. For all seven, the 5′ end of the message lay near typical σB recognition sequences, and these had been predicted correctly by the HMM for five of the seven examples. Lastly, all 127 gene products were assigned to functional groups by considering their similarity to known proteins. Notably, products with a direct protective function were in the minority. Instead, the general stress response increased relative message levels for known or predicted regulatory proteins, for transporters controlling solute influx and efflux, including potential drug efflux pumps, and for products implicated in carbon metabolism, envelope function and macromolecular turnover.


The general stress response contributes strongly to bacterial survival in the natural environment, in fresh and processed foods and in pathogenic interactions. Among the low-GC Gram-positive bacteria, this response is controlled by the alternative transcription factor σB (reviewed by Price, 2000). σB is activated by diverse environmental and energy stresses to direct the expression of a large set of general stress genes, thereby conferring broad resistance to subsequent challenges. With its well-developed genetic system, Bacillus subtilis has served as a model for investigating the general stress response within this group, which includes significant human pathogens. For example, σB function is important for stress resistance in Listeria monocytogenes and Staphylococcus aureus (Becker et al., 1998; Kullik et al., 1998; Wiedmann et al., 1998). Moreover, σB function contributes directly to virulence in Bacillus anthracis (Fouet et al., 2000) and to the ability of Staphylococcus epidermidis and S. aureus to form adherent biofilms (Rachid et al., 2000; Knobloch et al., 2001), which is a crucial component of their proclivity to cause nosocomial infections (Costerton et al., 1999). Among the high-GC Gram-positive bacteria, a σB-like factor is important for virulence in Mycobacterium tuberculosis (Chen et al., 2000). In view of these findings, understanding the range of cellular functions under σB control has assumed considerable significance.

Earlier efforts to identify σB-dependent genes in B. subtilis relied on (i) a genetic approach using transposon mutagenesis and plate transformations (Boylan et al., 1991; 1993a); (ii) two-dimensional gel analysis of proteins synthesized by wild-type and mutant strains (Völker et al., 1994; Petersohn et al., 1999a); and (iii) a consensus-directed screen of the B. subtilis genome for promoter elements potentially recognized by σB (Petersohn et al., 1999b). These screens usually included or were followed by a more detailed transcriptional analysis of the candidate genes. The sum of these approaches led to the estimate that more than 100 general stress genes were controlled by σB and to the assignment of some of them to functional categories (Hecker and Völker, 1998). The largest such category included genes whose products directly counter the oxidative stress damage caused by unbalanced metabolism, such as the ClpP protease and the associated ClpC ATPase, which are thought to sort, repair or degrade damaged proteins (Krüger et al., 1996; Gerth et al., 1998), Dps, which is thought to bind and protect DNA (Antelmann et al., 1997a), thioredoxin, which helps maintain thiol–disulphide balance (Scharf et al., 1998), and the KatB and KatX catalases (Engelmann et al., 1995; Petersohn et al., 1999c). Another category included five proteins implicated in controlling solute influx and efflux (von Blohn et al., 1997; Gomez and Cutting, 1997; Akbar et al., 1999; Petersohn et al., 1999a,b), and yet another included two regulators, the CtsR repressor of certain heat shock genes (Krüger and Hecker, 1998; Derre et al., 1999) and the putative transcriptional regulator YvyD (Drzewiecki et al., 1998).

With the completion of the B. subtilis genome project (Kunst et al., 1997) and the advent of genome-wide transcriptional analysis (e.g. Chu et al., 1998; Iyer et al., 1999; Richmond et al., 1999; Fawcett et al., 2000), we endeavoured to apply transcriptional profiling to identify additional genes in the general stress regulon. Using nylon substrate DNA arrays representing nearly all the annotated reading frames from the B. subtilis genome, we compared the transcripts present in wild-type and σB mutant strains that had been subjected to ethanol stress in two independent experiments. We then correlated these ethanol stress profiles with profiles of cells in which expression of active σB was under the control of an inducible promoter. To help identify genes directly controlled by σB, we constructed hidden Markov models (Durbin et al., 1998) trained on well-characterized σB promoter sequences, then tested these models with a transcriptional analysis of seven representative σB-dependent genes that emerged from the array experiments. Based on the transcriptional profiling experiments, we identified 97 new genes that were likely members of the general stress regulon, including 13 known or potential regulatory proteins that could secondarily modify cellular physiology.


Transcriptional profiling to identify σB-dependent genes

Two different approaches were used to induce the general stress response in cultures growing logarithmically in rich medium. In the first, the response was induced by ethanol stress, and we compared expression profiles of a wild-type strain with those of a mutant strain devoid of σB activity (Boylan et al., 1991). This mutant (PB153) was constructed by deleting a 453 bp region from within the coding region of the σB structural gene (sigB) and substituting a cat gene for the deleted region. In the second approach, we used a strain in which the σB-dependent promoter of the sigB operon was replaced by the LacI-repressible, IPTG-inducible Pspac promoter to allow ectopic expression of σB (Boylan et al., 1992). This strain (PB213) also carries a mutation that abolishes the function of the RsbW anti-σ factor, the primary regulator of σB activity (Benson and Haldenwang, 1993). The addition of IPTG to the growth medium therefore produces such high levels of σB activity that growth ceases about 2 h after induction (Boylan et al., 1992). For the experiments reported here, samples were taken at 0, 5, 10 and 15 min after induction of the general stress response. RNA was then extracted from each sample as described in Experimental procedures.

Radiolabelled cDNAs synthesized from the RNA samples were hybridized to replicate nylon filters spotted with polymerase chain reaction (PCR) products that represented nearly all the annotated genes in the B. subtilis genome (Kunst et al., 1997). The results from individual cDNAs hybridized to replicate filters showed good reproducibility, with an average Pearson correlation coefficient of 0.971 (Fig. 1A). As shown in Fig. 1B, the results from independent RNA isolations and cDNA syntheses also showed good reproducibility. The results in Fig. 1B also indicate that wild-type and sigB mutant cells have similar expression patterns in the absence of stress. In contrast, the scatterplot in Fig. 1C indicates that, relative to the sigB null mutant, the wild type showed significant changes in its expression profile during a typical ethanol stress experiment. An example of the primary data used to generate Fig. 1C is shown in Fig. 2, which is a close-up of the autoradiograph for a small region of the arrays. In marked contrast to the similar signal intensities seen for most of the genes in this part of the array, the intensities for the known σB-dependent gene dps (Antelmann et al., 1997a) were substantially different for wild type and mutant. The intensities for two other genes highlighted in Fig. 2 also differed substantially: yflT, which was previously inferred to be σB dependent based on two-dimensional gel analysis of its product (Völker et al., 1994); and yfhF, which was not previously known to be a member of the σB regulon. One additional feature of Fig. 2 bears comment: the intensities for the known σB-dependent gene ctsR (Krüger and Hecker, 1998) are not obviously different in the wild type and the sigB mutant. ctsR belongs to a special class of heat shock genes under the dual control of σB and another σ factor (class III according to Hecker et al., 1996). We will return to the issue of the class III heat shock genes in the next section.

Figure 1.

Logarithmic plots of normalized spot intensities.

A. Typical replicate filters are shown with the intensity (in arbitrary units) of each spot in one filter plotted against the equivalent spot in the other.

B. Independent RNA preparations are shown, with the averaged spot intensities from replicate filters hybridized with a probe from PB2 wild-type cells sampled at T0 (immediately before the application of stress) plotted against the averaged spot intensities from replicate filters hybridized with a probe from PB153 (sigBΔ2::cat) mutant cells, also sampled at T0.

C. Averaged spot intensities from replicate filters hybridized with a probe from PB2 wild-type cells sampled at T10 (10 min after application of a 5% ethanol stress) plotted against the averaged spot intensities from replicate filters hybridized with a probe from PB153 mutant cells, also stressed and sampled at T10. Each graph represents 4100 individual array spots. r is the Pearson correlation coefficient.
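
The replicate comparison summarized by r can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study: the function names and the five example intensities are hypothetical, and the choice to correlate log10-transformed intensities (matching the logarithmic axes of the plots) is an assumption, as the text does not state whether r was computed on raw or log-transformed values.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def replicate_correlation(filter_a, filter_b):
    """Correlate log10 spot intensities of two replicate filters.

    Assumption: intensities are positive, normalized values in arbitrary units.
    """
    log_a = [math.log10(v) for v in filter_a]
    log_b = [math.log10(v) for v in filter_b]
    return pearson_r(log_a, log_b)

# Hypothetical normalized intensities for the same five spots on two filters.
filter_a = [120.0, 950.0, 40.0, 3100.0, 15.0]
filter_b = [135.0, 900.0, 38.0, 2900.0, 18.0]
print(round(replicate_correlation(filter_a, filter_b), 3))
```

In practice, the same calculation would be applied to all spots on a pair of replicate filters, and the values for each pair then averaged.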

Figure 2.

Autoradiograph of filter arrays. Arrays were probed with cDNA derived from PB2 wild-type (A) or PB153 (sigBΔ2::cat) mutant cells (B), both stressed with 5% ethanol for 10 min. Arrows point to the locations of four representative σB-dependent genes. These include dps, a well-characterized σB-dependent gene (Antelmann et al., 1997a); yflT, a gene inferred to be σB dependent from the behaviour of its product in wild and mutant strains (Völker et al., 1994); the class III heat shock gene ctsR, in which the loss of expression from its σB promoter is masked by increased expression from a second, σB-independent promoter (Krüger et al., 1996; Krüger and Hecker, 1998); and the newly discovered yfhF gene (this study).

Expression profiles of known σB-dependent genes

We examined first the expression profiles of several well-characterized σB-dependent genes in two independent ethanol stress experiments and one ectopic expression experiment. The complete data set is available (see Supplementary material), and examples of these profiles are shown in Fig. 3.

Figure 3.

Expression of the known σB-dependent genes dps and clpP in two independent ethanol stress experiments and one ectopic expression experiment. Expression levels were calculated using the intensity averages of replicate filters, with (A) and (B) showing the normalized spot intensities for dps and (C) and (D) showing the normalized spot intensities for clpP. For the ethanol stress experiments shown in (A) and (C), RNA was extracted from strain PB2 (wild type; black square and circle) and strain PB153 (sigBΔ2::cat; white square and circle). For the ectopic expression experiments shown in (B) and (D), RNA was extracted from strain PB213 (PspacrsbV rsbWΔ1 sigB rsbX) grown in the presence (black triangle) or absence (white triangle) of the IPTG inducer.

The expression profile of dps shown in Fig. 3A and B is similar to other genes in our data set that are known to be controlled directly by σB, particularly with regard to the observation that the two ethanol stress experiments gave a substantially greater induction (> 70-fold for dps) than the ectopic expression experiment (sixfold). This reflects the fact that the Pspac promoter used to control σB expression in the latter experiment is somewhat leaky even in the absence of its IPTG inducer. Owing to the absence of the RsbW anti-σ factor in this strain, the basal activity of σB in uninduced Pspac cells is significantly higher than in unstressed wild-type cells (compare Fig. 3A with 3B).

In contrast to dps, which represents the majority of σB-dependent genes, the expression profile of clpP represents the class III heat shock genes, which include clpC, ctsR, nadE and trxA (Krüger et al., 1996; Antelmann et al., 1997b; Gerth et al., 1998; Krüger and Hecker, 1998; Scharf et al., 1998). Although many genes in the σB regulon are under dual control, at least some class III genes are atypical, in that loss of transcription from their σB-dependent promoter is masked by a second, σB-independent promoter, which is itself stress responsive and often more active in the absence of σB (Krüger et al., 1996; Gerth et al., 1998). Consequently, for these class III genes, the gain in σB function in the ectopic expression experiment would be expected to provide a greater change in transcript ratio than would the loss of σB function in the two ethanol stress experiments (compare Fig. 3C and D).

Cluster analysis of genes whose expression was altered in a σB-dependent manner

Before focusing on the best candidates for new members of the σB regulon, we conducted a broad analysis of the data in order to identify major themes. We used the clustering method of Eisen et al. (1998) to represent the expression of all 709 genes whose transcript ratios were ≥ fourfold different between the wild type and sigB mutant during at least one of the time points in either ethanol stress experiment, or whose transcript ratios were ≥ threefold different during at least one of the time points in the ectopic expression experiment. This lower cut-off for the ectopic expression data takes into account the lower induction ratios in these experiments compared with the ethanol stress experiments (see Fig. 3A and B). As shown in Fig. 4A, for about 60% of these genes, the transcript levels were higher in the presence of σB (indicated by shades of red). In contrast, for about 40% of these genes, the transcript levels were lower in the presence of σB (indicated by shades of green). Although a variety of expression patterns was evident in Fig. 4A, here we consider only three broad groups.
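
The selection rule for this broad gene set can be sketched as follows. The function names and example ratios are hypothetical; the sketch assumes that each gene has a list of positive transcript ratios per experiment (wild type versus mutant, or IPTG-treated versus untreated) and that a difference counts in either direction, so a ratio of 0.25 is treated as a fourfold difference just as 4.0 is.

```python
def max_fold_change(ratios):
    """Largest fold difference, in either direction, across the time points."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    return max(max(r, 1.0 / r) for r in ratios)

def passes_broad_filter(ethanol_1, ethanol_2, ectopic,
                        ethanol_cutoff=4.0, ectopic_cutoff=3.0):
    """True if the gene is >= fourfold different in either ethanol experiment,
    or >= threefold different in the ectopic expression experiment."""
    in_ethanol = (max_fold_change(ethanol_1) >= ethanol_cutoff
                  or max_fold_change(ethanol_2) >= ethanol_cutoff)
    in_ectopic = max_fold_change(ectopic) >= ectopic_cutoff
    return in_ethanol or in_ectopic

# Hypothetical gene: weak response to ethanol, clear response to IPTG induction.
print(passes_broad_filter([1.0, 1.5, 1.8], [1.0, 1.2, 1.4, 1.3], [1.0, 2.1, 3.5]))
```

Because any single time point in any one experiment is enough to pass, this rule is deliberately permissive and is suited to surveying broad themes rather than to nominating individual regulon members.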

Figure 4.

Eisen plots and HMM predictions.

A. Eisen plot showing the profile of 709 genes whose level of expression depended on σB; genes with similar expression patterns are grouped (Eisen et al., 1998). Blue time course, ethanol stress experiment 1 (0, 5 and 10 min); yellow time course, ethanol stress experiment 2 (0, 5, 10 and 15 min); grey time course, ectopic expression experiment (0, 5, 10, 15 and 20 min). Times refer to minutes after the addition of ethanol or IPTG to logarithmically growing cells. The blue and yellow time courses display the ratio of normalized spot intensities for the hybridization signals for probe derived from PB2 wild-type cells versus probe derived from the PB153 (sigBΔ2::cat) mutant. Ratios are displayed colorimetrically, with stronger hybridization for the wild-type probe compared with mutant shown as shades of red (ratios > 1) and stronger hybridization for mutant probe shown as shades of green (ratios < 1). As the ratios approach 1, these shades darken to black. The grey time course displays the expression profile after the addition of IPTG to PB213 cells bearing the PspacrsbV-rsbWΔ1-sigB-rsbX construction. The plot compares the ratio of probe derived from cells treated with IPTG with probe from untreated cells, with stronger hybridization for treated cells shown as shades of red (> 1). Zoomboxes on the right show representative genes from each of the three groups described in the text. Here and elsewhere in Fig. 4, (*) indicates genes previously known to be σB dependent based on both product and promoter analysis, whereas (**) refers to genes previously inferred to be σB dependent based on product analysis alone (see text).

B. Eisen plot for the 128 genes from (A) whose expression ratios were ≥ fourfold different in wild type versus mutant in both ethanol stress experiments, and whose expression ratios were also ≥ threefold different after IPTG addition in the ectopic expression experiment. An annotation for these genes is provided in Table 1, including a description of their known or projected functions together with an HMM prediction for the presence of σB promoters.

C. Annotated Eisen plot for the 123 genes from (A) that also met one of the following three criteria: (i) predicted by HMM to have a promoter recognized by σB; (ii) previously known (*) or inferred (**) to be σB dependent; or (iii) thought to lie in an operon with a gene from (i) or (ii), as indicated by [o]. Genes shown in red text had also met the more stringent expression criteria for inclusion in the plot shown in (B) and are therefore found in both the (B) and (C) clusters.

Group I includes genes whose transcript levels were lower when σB was present during ethanol stress or in the ectopic expression experiment. From the known functions of genes in this group, we infer that induction of the general stress response downregulates many cellular systems needed for active growth. For example, the zoombox highlights 15 genes, some of whose products are involved in translation (rplD, rpsC, rplW and rpsJ, all of which lie in the promoter-proximal half of the S10 r-protein operon), transcription (rpoE, encoding the δ-subunit of RNA polymerase) and cell wall synthesis (rodA, gcaD, tagF and tagB). The product of another gene (yqiI) is also implicated in cell wall metabolism as a result of its similarity to N-acetylmuramoyl alanine amidases (Tatusov et al., 2001). On the other hand, the decreased expression of some group I genes may promote physiological changes that forge a more stress-resistant state. For example, the transcript levels for five genes encoding known or suspected iron-binding proteins (feuA, fhuD, yfiY, yfmC and yxeB; not shown in the zoombox) were at least threefold lower when σB activity was induced in the ectopic expression experiment. Interestingly, it has been proposed that an OxyR-dependent reduction in ferric ion uptake in Escherichia coli would prevent the hydroxyl radical damage caused by reaction of intracellular iron with peroxide (Storz and Zheng, 2000). These observed changes in the transcript ratios of the group I genes could reflect either active regulation (i.e. negative regulation by one or more products of the σB regulon) or a more passive mechanism (increased competition for RNA polymerase core enzyme).

Group III includes genes whose transcript levels increased primarily when σB activity was induced by IPTG in the ectopic expression experiment. It is noteworthy that this group includes a number of known class III heat shock genes, and the zoombox details the profiles of four that lie in the same transcription unit, ctsR–yacH–yacI–clpC (Krüger et al., 1997; Krüger and Hecker, 1998). Expression of this operon is known to be under the dual control of a σB-dependent promoter and a σA-like promoter that can largely compensate for the loss of σB function when stress is encountered (Krüger et al., 1996). As shown in Fig. 3C and D, class III heat shock genes have a characteristic expression pattern in which ethanol stress substantially increases transcript levels in the wild type and does so only slightly less effectively in the sigB mutant. Therefore, for class III heat shock genes, a significant effect of σB function would be expected primarily in the ectopic expression experiment, as is the case for many genes in this group. Some of the genes included in group III are therefore potential candidates for new class III heat shock genes.

Group II includes genes whose transcript levels were higher when σB was present during ethanol stress or in the ectopic expression experiment. The zoombox highlights 25 of these genes, including five whose σB dependence has been firmly established by other means: ydaG (Petersohn et al., 1999a), katB (Engelmann et al., 1995), ytxH (Varón et al., 1996), yvyD (Drzewiecki et al., 1998) and opuE (von Blohn et al., 1997). We therefore considered that the genes in group II were valid candidates for new members of the σB regulon.

Identification of 127 genes as the best candidates for inclusion in the σB regulon

In order to identify only the strongest candidates for new σB-dependent genes, we performed a second analysis of the data, but now asked that the absolute transcript ratio (in the presence or absence of σB) was ≥ fourfold different at one or more of the time points in each of the two ethanol stress experiments, and also ≥ threefold different at one or more of the time points in the ectopic expression experiment. When expression data are filtered in this manner, it is not possible to calculate directly the number of false positives present in the resulting set. However, we can assume independence for each of the three experiments and analyse the probability that a gene would have passed our criteria by chance alone. A binomial analysis of independent variables indicates that, on average, fewer than two genes would have been expected to pass our criteria by chance. A more conservative approach is to add 1.96 standard deviations of this binomial random variable to the mean value. This yielded an upper limit of fewer than five genes that would have been expected to pass our criteria by chance. The actual observation of 128 genes in this category clearly indicates the strongly non-random nature of the genes selected. Some types of false positives have a non-random distribution and may additionally be contained within the final set. Conversely, our analysis may miss true positives that exhibit expression changes smaller than the cut-off values. However, our choice of conservative ratios in the three independent experiments, combined with the elimination of genes with ratios deemed to be suspect based on observed inconsistency upon duplicate hybridization with the same cDNA, has resulted in the selection of a group of candidates that we believe provides a robust foundation for future experiments.
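
The form of this chance estimate can be sketched as follows. The per-experiment probabilities of passing a cut-off by chance are not reported here, so the value of 0.07 used for each experiment is purely illustrative, as is the function name; the gene count of 4100 is taken from the number of array spots given in the legend to Fig. 1 and is likewise an assumption about what was used as the number of trials.

```python
import math

def chance_pass_estimate(n_genes, p_exp1, p_exp2, p_exp3, z=1.96):
    """Expected number of genes passing all three criteria by chance.

    Treats each gene as a Bernoulli trial whose success probability is the
    product of the three per-experiment probabilities (independence assumed).
    Returns the binomial mean and the mean plus z standard deviations.
    """
    p_all = p_exp1 * p_exp2 * p_exp3
    mean = n_genes * p_all
    sd = math.sqrt(n_genes * p_all * (1.0 - p_all))
    return mean, mean + z * sd

# Illustrative inputs only: 4100 spots, 7% chance of passing each experiment.
mean, upper = chance_pass_estimate(4100, 0.07, 0.07, 0.07)
print(f"expected by chance: {mean:.2f}; mean + 1.96 SD: {upper:.2f}")
```

With these illustrative inputs, the mean falls below two genes and the conservative limit below five, which is the pattern described above; different per-experiment probabilities would shift both values.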

A cluster analysis of the 128 genes that met our criteria is shown in Fig. 4B. Within this cluster, the expression of 127 genes appeared to be positively governed by σB in each of the three experiments, whereas only the expression of rpsJ (encoding ribosomal protein S10) appeared to be negatively influenced. Among the 127 positively controlled genes, the wild-to-mutant transcript ratios of 89 increased more than 10-fold in both ethanol stress experiments, 53 increased more than 20-fold, and 17 increased more than 100-fold.

To evaluate further the criteria used to derive this cluster, we asked how many of the 127 genes had been identified previously as σB dependent and how many of the previously identified genes were excluded. Earlier studies found possible σB-dependent genes by analysing their transcripts or protein products, either directly or using reporter fusions (reviewed by Price, 2000). These previously identified genes can be grouped into two categories: known and inferred. The known category includes 42 well-defined σB-dependent genes. For these genes, consensus σB recognition sequences have been experimentally associated with expression by locating the 5′ end of the message or by altering key residues of the proposed recognition sequences (Price, 2000). Genes in this category are therefore likely to be transcribed directly by σB-containing holoenzyme in vivo. In contrast, the inferred category includes genes for which no experimental analysis of the promoter region has been reported.

Here, we focus our evaluation on the 42 genes in the known category, which should be a more reliable subset. Of the 127 genes shown in Fig. 4B, 19 were known to be σB dependent in previous studies. Why were the remaining 23 excluded? One gene (gtaB) was not represented in the arrays, another (rsbV) had a defective PCR amplification, and the expression of 21 others did not meet the criteria. Of these 21, nine are known class III heat shock genes that would not be expected to meet the criteria (clpC, clpP, comY/yacK, ctsR, nadE, sms, trxA, yacH and yacI); five genes only narrowly missed the criteria (bmrU, ctc, csbX, rsbW and ytxG); and seven genes underperformed the criteria more substantially (bmr, bmrR, bofC, csbA, csbB, ycnH and yjbC). For these last seven genes, the σB-dependent component of their expression appears to be less significant than the independent component. Based on the calculated inclusion rate of 45% in the known category, we estimate that the 127 best candidates shown in Fig. 4B represent a total of more than 250 σB-dependent genes encoded by the B. subtilis genome.
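
The arithmetic behind this extrapolation can be written out as a short sketch. The function name is hypothetical; the counts are those given above, and the calculation assumes that new genes are recovered at the same rate as the known genes.

```python
def estimate_regulon_size(candidates, known_recovered, known_total):
    """Scale the candidate count by the recovery rate of known genes."""
    inclusion_rate = known_recovered / known_total
    return candidates / inclusion_rate

# 19 of the 42 known genes were recovered among the 127 candidates.
rate = 19 / 42
total = estimate_regulon_size(127, 19, 42)
print(f"inclusion rate: {rate:.0%}; estimated regulon size: {total:.0f}")

# Variation (an assumption, not stated in the text): exclude the two known
# genes that could not be assayed (gtaB and rsbV) from the denominator.
print(f"alternative estimate: {estimate_regulon_size(127, 19, 40):.0f}")
```

Both calculations give a total above 250, consistent with the estimate stated above.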

Functional predictions

The 127 genes whose induction met the stringent criteria for inclusion in Fig. 4B were assigned to functional groups, as shown in Table 1. For genes whose roles have not been determined experimentally, functions were provisionally assigned by weighing the similarity of their products to known proteins, determined by a BLAST search (Altschul et al., 1997), and also by considering the cluster of orthologous groups (COG) of proteins to which each belonged. The COG database is a phylogenetic classification of proteins encoded by 34 complete microbial genomes, representing 26 lineages, and it allows the detection of both close and distant relationships (Tatusov et al., 2001). To be included within a COG, an orthologous protein must be represented in at least three of the 26 lineages. Consequently, some of the predicted products listed in Table 1 presently have no COG, because they either resemble products found chiefly in non-COG organisms or are unique to B. subtilis.

Table 1.   Functional categories of 127 σB-dependent genes identified in macroarray hybridization experiments.
Category and&sol; productaPredb COG no./namecDescription or commentd
  1. a. Gene known (*) or inferred (**) to be σ B dependent from previous studies (see text).

  2. b. HMM indicates hidden Markov models prediction of σ B promoter elements no more than 350 bp upstream from the coding region; ♦ indicates σB-dependent 5′ end located by RACE-PCR (Fig. 5); [o] indicates an operon and refers to a downstream gene in a known σB-dependent transcription unit.

  3. c. COG refers to the clusters of orthologous groups of proteins classification for 34 complete microbial genomes ( Tatusov et al., 2001).

  4. d. Similar with an E-value of < 1e-10 in a blast 2.0 comparison (Altschul et al., 1997); Signal Peptide indicates predicted signal peptide for YfkD and YwsB, whereas Lipoprotein Signal Peptide indicates predicted lipoprotein signal peptide for OpuBC, YjgB and YutC (Tjalsma et al., 2000).

Transcriptional regulators
 DinR (LexA) 1974/SOS response transcriptional repressor 
 LicRHMM1762/Phosphotransferase – mannitol/fructoseLichenase operon activator; similar to B. subtilis ManR (YjdC)
 RsfA (YwfN)  Regulator of σF; leucine zipper motif; similar to B. subtilis YlbO
 Xpf  σ-like positive control factor for phage PBSX
 YdeC 2207/AraC-like transcriptional regulatorAraC family
 YtzE  DeoR family; 60% identity with HTH of E coli GlpR
 YvaN 1396/Predicted transcriptional regulatorsSinR family; similar to B. subtilis YvaO
 YwhH 2606/Uncharacterized ancient conserved regionYbaK/EbsC family; similar to Desulfitobacterium dehalogenans RhpA
 YvyD*HMM1544/Ribosome-associated protein YSimilar to other hypothetical proteins including σ54 modulators
Post-transcriptional regulators
 ManP (YjdD) 1299/1445/1762/Phosphotrasferase – mannitol/fructoseContains all three domains of mannitol/fructose PTS Enz II
 NhaX (YheK) 0589/Universal stress protein UspA and related proteinsSimilar to B. subtilis YxiE among others
 YabT 0515/Serine-threonine protein kinases 
 YfkJHMM♦0394/Protein-tyrosine-phosphataseLow MW tyrosine phosphatase family; apparent operon yfkJ–yfkI–yfkH
 YxaC 1346/1380/Putative murein hydrolase effectorsSimilar to Staphylococcus aureus LrgB murein hydrolase effector
σB regulators
 SigB*[o]1191/Specialized σ subunitsσB transcription factor
 RsbX*[o] Feedback serine phosphatase, PP2C family
 Dps*HMM0783/Starvation-induced DNA-binding protein 
 KatB (KatE)*HMM0753/CatalaseStationary phase catalase
 KatX*HMM0753/CatalaseStationary phase catalase
 YdbDHMM♦ Similar to Lactobacillus plantarum manganese-containing catalase
 YdfOHMM♦0346/Lactoylglutathione lyaseSimilar to Sphingomonas species dioxygenases
 YisP 1562/Phytoene-squalene synthetaseSimilar to phytoene and carotenoid synthases
 YkzA*HMM1764/Stress-induced protein OsmCSimilar to Xanthamonas campestris hydroperoxide resistance protein
 YqgZHMM♦1393/Arsenate reductase ArsG and related proteinsApparent operon yqgZ–trnSLGln1–yqhAB regulator); similar to YjbC
 YxlJHMM2094/3-methyladenine glycosylaseSimilar to 3-methyladenine glycosylases, including mammalian enzymes
Influx and efflux
 CsbC (YxcC)*HMM0477/Permeases 
 GabP 1113/γ-aminobutyrate and related permeasesγ-Aminobutyrate permease
 OpuBB 1174/Proline-glycine betaine ABC transporterMembrane permease of choline ABC transporter
 OpuBC 1732/Periplasmic binding protein ABC transporterBinding protein of choline ABC transporter; Lipoprotein SigPep
 OpuE*HMM0591/Na+s; proline symporters and related proteinsProline transporter
 YdbE 1638/Dicarboxylate-binding periplasmic proteinSimilar to Rhodobacter C4-dicarboxylate binding-protein DctP
 YdfC 0697/Predicted permeases 
 YesP 1175/Permease of ABC type sugar transporter 
 YfkE 0387/Ca2+s;/H+s; antiporterSimilar to sodium/calcium exchangers; apparent operon yfkE–yfkD
 YflA*HMM1115/Sodium alanine symportersSimilar to Alteromonasd-alanine or glycine transport protein
 YqiY 0765/Amino acid ABC transporter permease 
 YusP 0477/PermeasesSimilar to B. subtilis multidrug efflux pump Bmr3
 YvqJ 0477/PermeasesSimilar to Streptococcus pneumoniae macrolide efflux pump MefE
 YwjA 1132/ABC-type lipid transport system, ATPaseSimilar to E. coli MsbA core lipid A transporter; ywiE–ywjA–ywjB
 YyaM 0697/Predicted permeases 
 YybO 0477/Permeases 
Carbon metabolism
 AbnA  Endo 1,5-α-l-arabinase
 AldY** 1012/NAD-dependent aldehyde dehydrogenasesAldehyde dehydrogenase
 BioA 0161/Adenosylmethionine aminotransferaseAdenosylmethionine oxononanoate aminotransferase
 GgaA 0463/Glycosyltransferases in cell wall biogenesisBiosynthesis of minor teichoic acid species
 GspA*HMM1442/Lipopolysaccharide biosynthesis proteinsSimilar to E. coli glycosyltransferases RfaI/RfaJ
 HisD 0141/Histidinol dehydrogenaseHistidinol dehydrogenase
 MmgD 0372/Citrate synthaseCitrate synthase III
 SdhC 2009/Succinate dehydrogenase cytochrome b subunitSuccinate dehydrogenase cytochrome B-558 subunit
 YcdF**HMM1028/Dehydrogenases with different specificitiesSimilar to glucose dehydrogenases; apparent operon ycdF–ycdG
 YcdG 0366/GlycosidasesSimilar to oligo-1,6-glucosidase from Bacillus species; ycdF–ycdG
 YcsD 0764/Hydroxymyristoyl-(acyl carrier) dehydratases 
 YdaD*HMM1028/Dehydrogenases with different specificitiesApparent operon ydaD–ydaE–ydaF–ydaG
 YdaP*HMM0028/Thiamine pyrophosphate-requiring enzymesSimilar to E. coli pyruvate dehydrogenase
 YdjL 1063/Threonine dehydrogenases and related Zn2+ dehydrogenasesSimilar to various dehydrogenases
 YerDHMM0069/Glutamate synthetase domain 2Similar to region of alpha subunit of glutamate synthase
 YfhF 1090/Nucleoside diphosphate epimerasesSimilar to hypothetical proteins, e.g. E. coli YfcH; yfhF–yfhE–yfhD
 YkgA**HMM1834/Dimethylarginine dimethylaminohydrolaseApparent operon ykgA–ykgB
 YurG 0075/Serine-pyruvate aminotransferaseSimilar to Methanobacterium aspartate aminotransferase
 YutB 0320/Lipoate synthaseSimilar to E. coli LipA lipoic acid synthase
 YvaAHMM0673/Predicted dehydrogenases and related proteinsSimilar to B. subtilis YhjJ
 YvfD 0110/Acetyltransferases of isoleucine patch familySimilar to Neisseria meningitidis PglB glycosylation protein
 YwfH 1028/Dehydrogenases with different specificities 
 YwiEHMM1502/Cardiolipin synthase and related enzymesApparent operon ywiE-ywjA-ywjB
 YxaB**  Similar to Streptococcus thermophilus EpsL exopolysaccharide synthase
 YxjG 0620/Methionine synthase II cobalamin independentSimilar to adjacent gene product YxjH; apparent operon yxjG–yxjH
 YxnA 1028/Dehydrogenases with different specificities 
Turnover
 Nap 0596/Predicted hydrolases or acyltransferasesCarboxylesterase NA
 YebA 1305/Transglutaminase, putative cysteine protease 
 YfkM**HMM0693/Putative intracellular protease or amidase 
 YvaK 1647/Esterase-lipaseSimilar to Bacillus stearothermophilus carboxylesterase
 YvaJ 0557/Exoribonucleases; 1098/Ribosomal protein S1 domainSimilar to E. coli ribonuclease R; apparent operon yvaK–yvaJ
Unknown function – similar to protein in other organisms
 GsiB*HMM Similar to Arabidopsis thaliana Em1 desiccation stress protein
 YdjJHMM Similar to Deinococcus radiodurans conserved protein
 YfkH 1295/tRNA-processing ribonuclease BNSimilar to Enterococcus faecalis Orfde2 membrane protein; yfkJ–yfkI–yfkH
 YknA 0590/Cytosine deaminase-related enzymesSimilar to Pseudomonas putida CumB multicopper oxidase protein
 YgxB  Similar to Pseudomonas aeruginosa CmpX putative membrane protein
 YqhBHMM1253/Uncharacterized cystathionine beta synthase domainSimilar to four B. subtilis paralogues and probable haemolysins
 YrkE 2210/Uncharacterized ancient conserved regionSimilar to S. aureus protein; CCD2 domain
 YrvC 0490/Putative regulatory, ligand-binding proteinSimilar to B. subtilis YhaT; related to C-terminal domain of K+ channels
 YshC 1796/DNA polymerase β; 1387/Histidinol phosphataseN-terminal COG1796; C-terminal 1387; similar to conserved proteins
 YtxH*[o] Similar to Listeria protein; apparent operon ytxG–ytxH–ytxJ
 YtxJ*[o] Similar to B. subtilis and D. radiodurans proteins
 YvrE**HMM3386/Uncharacterized proteinSimilar to Xenopus laevis conserved regulator protein
 YvgT 2860/Predicted membrane proteinSimilar to putative membrane proteins of other microbes
Unknown function – similar only to protein in B. subtilis
 XkdS  From phage PBSX; 77% identical to YqbS of skin element
 YcdB  Similar to adjacent gene product YcdC; apparent operon ycdB–ycdC
 YdaE*[o] Apparent operon ydaD–ydaE–ydaF–ydaG
 YdaT**  Apparent operon ydaT–ydaS (see ydaS in next category)
 YfkD  Apparent operon yfkE–yfkD; Signal Peptide
 YfkI  Apparent operon yfkJ–yfkI–yfkH
 YfkT  Apparent operon yfkR–yfkS–yfkT;yfkR and yfkS in Fig. 4C
 YflT**HMM Apparent operon yfmA–yflT (see yfmA in next category)
 YjgB**HMM Lipoprotein Signal Peptide
 YjgD 2427/Uncharacterized ancient conserved regionSimilar to B. subtilis YrhD; apparent operon yjgC–yjgD;yjgC in Fig. 4C
 YorAHMM From phage SPβC2; 42% identical to SPP1 phage protein 33
 YoxB  Apparent operon yoxC–yoxB–yoaA
 YoxC**HMM Apparent operon yoxC–yoxB–yoaA
 YqbA  In skin element; 67% identical to XkdE of phage PBSX
 YrhD 2427/Uncharacterized ancient conserved regionSimilar to B. subtilis YjgD
 YutC  Lipoprotein Signal Peptide
 YwsBHMM Similar to B. subtilis YfhK; Signal Peptide
Unknown function – 36 to 90 residues
 CsbD (YwmG)*HMM3237/Uncharacterized bacterial conserved region62 residues
 YbyBHMM 86 residues
 YdaS** 2261/Predicted membrane proteins85 residues; contains internal sequence similar to YwzA (see below)
 YfhE  36 residues; apparent operon yfhF–yfhE–yfhD
 YfmA  55 residues; apparent operon yfmA–yflT
 YisH  73 residues; similar to Bacillus cereus GerPA and B. subtilis YisC
 YpzE  54 residues
 YtiAHMM0254/Ribosomal protein L3182 residues
 YuzAHMM2155/Uncharacterized bacterial conserved region78 residues; 54% identity with Sinorhizobium meliloti Orf3
 YwjCHMM♦ 90 residues
 YwmEHMM♦ 53 residues
 YwzA 2261/Predicted membrane proteins49 residues; replica with 37 identical residues found within YdaS
 YxjJHMM 87 residues; apparent operon yxjJ–yxjI; yxjI in Fig. 4C

Evaluation of hidden Markov models used to predict promoters recognized directly by σB

Among the 127 genes shown in Fig. 4B, 30 were known or inferred previously to be controlled by σB. To assist the analysis of the 97 newly recognized genes, we developed hidden Markov models (Durbin et al., 1998) that identified possible σB recognition sequences preceding 21 of them. These models were constructed essentially as described by Fawcett et al. (2000), using the promoter sequences of 28 known σB-dependent transcription units as training data; these transcription units contain the 42 best-characterized genes in the σB regulon (Price, 2000).
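The idea of searching upstream regions for σB-like recognition elements can be illustrated with a much simpler stand-in for the HMM: a consensus scan that counts mismatches against a −35 and a −10 element separated by a variable spacer. This is a minimal sketch and not the models used in this study. The motif strings, the spacer range and the mismatch cut-off are illustrative placeholders, and the function names are hypothetical.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_promoter(upstream, minus35="GTTTAA", minus10="GGGTAT",
                  spacers=range(12, 16), max_mismatch=2):
    """Scan an upstream sequence for a -35/-10 pair.

    The default motifs, spacer lengths and mismatch limit are
    placeholder assumptions, not parameters taken from the paper.
    Returns (position of -35 element, spacer length, total mismatches)
    tuples, best matches first.
    """
    seq = upstream.upper()
    n35, n10 = len(minus35), len(minus10)
    hits = []
    for i in range(len(seq) - n35 + 1):
        m35 = hamming(seq[i:i + n35], minus35)
        for s in spacers:
            j = i + n35 + s              # start of the candidate -10 element
            if j + n10 > len(seq):
                continue
            total = m35 + hamming(seq[j:j + n10], minus10)
            if total <= max_mismatch:
                hits.append((i, s, total))
    return sorted(hits, key=lambda h: h[2])


if __name__ == "__main__":
    # Synthetic test sequence, not a real B. subtilis promoter.
    demo = "ACGT" * 3 + "GTTTAA" + "A" * 14 + "GGGTAT" + "CCCC"
    print(scan_promoter(demo)[:1])
```

Unlike a fixed mismatch count, an HMM trained on known promoters weights each position by how often each base occurs there, which is why it can rank candidate sites more finely than this sketch.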

We then evaluated the accuracy of the hidden Markov models (HMM) by mapping the 5′ end of the σB-dependent message in a representative subset of the 21 candidate genes, which we presumed to be under direct σB control. Rapid amplification of cDNA ends (RACE-PCR) was used to estimate the 5′ ends of the stress-induced messages for seven different genes (Frohman, 1994). As shown in Fig. 5, for all seven genes, we found a σB-dependent 5′ end downstream from sequences that closely resemble the −35 and −10 recognition elements of well-characterized σB-dependent promoters (Haldenwang, 1995). However, for two of the seven genes, ydbD and ydfO, the locations of the presumed −35 and −10 elements varied from the prediction. We conclude that, when coupled with stringent expression criteria, the HMM were very effective in identifying genes likely to be controlled directly by σB, and that they predicted the locations of the σB recognition elements with about 70% accuracy.

Figure 5.

5′ ends of σB-dependent messages determined by RACE-PCR. The first two triplets of the reading frame are shown (upper case) together with the putative ribosomal binding site (double underlined) and the upstream sequence (lower case). The solid overlining indicates the −35 and −10 regions predicted by the HMM, and the inverted black triangle indicates the 5′ end of the specific signal detected in ethanol-stressed PB2 wild-type cells but not in the ethanol-stressed PB153 (sigBΔ2::cat) mutant. For ydbD and ydfO, the dotted overlining indicates possible −35 and −10 regions that are more consistent with the experimentally determined 5′ end than are the HMM predictions. For yfkJ, a white triangle indicates the 5′ end of a second, σB-independent message detected in both wild-type and mutant cells, and the solid underlining shows σA-like recognition sequences preceding this 5′ end.

Use of the HMM to find additional candidate genes in the σB regulon

Once the HMM were found to be useful, we sought to identify additional candidates for σB-dependent genes by applying the models to all the genes included in Fig. 4A. The basic expression criteria for the new cluster (Fig. 4C) were the same as those used for the original cluster (Fig. 4A): each gene must manifest a ≥ fourfold transcript ratio in one or both of the ethanol stress experiments or a ≥ threefold transcript ratio in the ectopic expression experiment. However, to be included in Fig. 4C, a gene also had to fulfil one of three additional criteria: (i) a σB promoter was predicted by the HMM to lie within the 250 bp preceding the coding region; (ii) the gene was a known or inferred member of the σB regulon; or (iii) the gene lay in an apparent operon with genes in category (i) or (ii). Of the 123 genes included in Fig. 4C, the transcript levels of three (ykpA, ptb/yqiS and yodB) were significantly lower in the presence of σB. Considering the 120 genes whose transcript levels were significantly higher, 54 were previously included in the Fig. 4B cluster representing our strongest candidates, whereas 17 were known and 10 were inferred σB-dependent genes that had been excluded from Fig. 4B because of the stringent criteria. The remaining 39 genes in Fig. 4C therefore represent potential new members of the σB regulon that were identified by a combination of profiling data and the HMM. These 39 new genes are interesting candidates for further study, albeit with a greater potential for emerging as false positives than the genes listed in Fig. 4B.
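The inclusion rule for Fig. 4C combines an expression threshold with one of three supporting criteria, and it can be written as a short predicate. The sketch below is illustrative only: the record layout and field names are hypothetical, and it treats induction alone, so it does not cover the three genes whose transcript levels fell in the presence of σB.

```python
from dataclasses import dataclass


@dataclass
class GeneProfile:
    """Hypothetical per-gene record; field names are assumptions."""
    name: str
    ethanol_ratio_1: float        # wild type / sigB mutant, ethanol experiment 1
    ethanol_ratio_2: float        # wild type / sigB mutant, ethanol experiment 2
    ectopic_ratio: float          # induced / uninduced, ectopic expression
    hmm_promoter_within_250bp: bool = False       # criterion (i)
    known_or_inferred_member: bool = False        # criterion (ii)
    in_operon_with_qualifying_gene: bool = False  # criterion (iii)


def meets_expression_criteria(g: GeneProfile) -> bool:
    """>= fourfold in either ethanol experiment, or >= threefold ectopically."""
    return max(g.ethanol_ratio_1, g.ethanol_ratio_2) >= 4.0 or g.ectopic_ratio >= 3.0


def qualifies_for_fig4c(g: GeneProfile) -> bool:
    """Expression criteria plus at least one of the three additional criteria."""
    supported = (g.hmm_promoter_within_250bp
                 or g.known_or_inferred_member
                 or g.in_operon_with_qualifying_gene)
    return meets_expression_criteria(g) and supported


if __name__ == "__main__":
    example = GeneProfile("geneA", 5.0, 1.2, 1.0, hmm_promoter_within_250bp=True)
    print(example.name, qualifies_for_fig4c(example))
```

A gene with strong induction but none of the three supporting criteria is rejected by this predicate, which mirrors the way Fig. 4C narrows the larger Fig. 4A cluster.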

Among the new genes in the Fig. 4C cluster are two that lie in apparent operons, promoter distal to one or more of the stronger candidates from Fig. 4B. The first is ywjB, which encodes a potential dihydrofolate reductase (Tatusov et al., 2001) and lies in the ywiE–ywjA–ywjB operon. The second is ykgB, which encodes a putative carboxymuconate cyclase (Tatusov et al., 2001) and lies in the ykgA–ykgB operon. Each of these putative operons has a σB-dependent promoter predicted by the HMM (Table 1). Another gene that lies promoter distal to stronger candidates from Fig. 4B is yoaA, which lies in the yoxC–yoxB–yoaA operon and was previously suggested to be σB dependent (Petersohn et al., 1999b). The Eisen plots shown in Fig. 4 group genes according to their expression profiles and, in Fig. 4C, yoaA is positioned near ydaF. On the B. subtilis chromosome, ydaF lies within a cluster of known general stress genes (ydaD–ydaE–ydaF–ydaG), but its σB dependence remains uncertain (Petersohn et al., 1999a). Interestingly, yoaA and ydaF each encode potential acetyltransferases of the Riml family (Tatusov et al., 2001). Although the yoaA and ydaF products resemble each other in amino acid sequence, the genes themselves share little nucleotide identity. The similarity of their expression profiles is therefore not the result of cross-hybridization (Richmond et al., 1999) and probably reflects the importance of their products in general stress resistance. Other genes whose expression patterns group them near yoaA and ydaF are known class III heat shock genes, including nadE, ctsR, yacH, clpC, yacI and trxA, suggesting that some of the less well-characterized genes in this region of Fig. 4C may also be members of this class. For example, both malS and citZ manifest typical class III expression patterns, and the HMM predicts a σB-dependent promoter preceding each.


How does the expression of the σB regulon confer a general stress resistance? On the basis of the classification shown in Table 1, the general stress response extensively alters cellular physiology, but gene products with a direct protective function appear to be in the minority. Instead, the general stress response has a significant impact on the expression of genes coding for known or potential transcriptional and post-transcriptional regulators, for transporters involved in controlling solute influx and efflux, for proteins involved in carbon metabolism, envelope function and macromolecular turnover, and for at least 50 proteins of unknown function. These changes provide intriguing clues as to how bacterial cells prepare themselves for survival in suboptimal environments.

Known and potential regulatory genes

Table 1 includes 16 known or potential regulators that would allow σB to influence a wide range of cellular processes indirectly. Five are predicted post-translational regulators. These include YfkJ, classified as a low-molecular-weight tyrosine phosphatase (Shi et al., 1998), YabT, which resembles a serine/threonine protein kinase, and ManP (YjdD), which possesses all three domains of the PTS enzyme II family and is thought to control the ManR (YjdC) transcriptional activator as well as mannose transport itself (Stülke et al., 1998; Reizer et al., 1999). A fourth predicted regulator is YxaC, which appears to be a two-domain protein combining the LrgA and LrgB negative effectors of murein hydrolase activity in S. aureus (Groicher et al., 2000). Loss of LrgA and LrgB function renders S. aureus more susceptible to penicillin-induced killing, whereas overexpression of LrgA and LrgB inhibits this killing. A fifth is NhaX (YheK), provisionally assigned a role regulating NhaC (YheL), an Na+/H+ antiporter thought to aid pH homeostasis under alkaline conditions (Wei et al., 2000). NhaX belongs to a COG containing universal stress protein UspA (Tatusov et al., 2001), which is known to modify at least six other E. coli proteins in response to stress and starvation (Nyström and Neidhardt, 1996). However, NhaX resembles other UspA family members in a proposed ATP-binding domain and not in the suggested tyrosine kinase signature of UspA (Freestone et al., 1997), and its proposed role in NhaC regulation remains to be tested.

Nine other genes encode known or suggested transcriptional regulators. These include LicR, an activator of the lichenan operon (Tobisch et al., 1999), RsfA (YwfN), a regulator of σF-controlled genes (Wu and Errington, 2000), YvyD, which resembles σ54 activators and whose loss modestly affects the activity of a promoter dependent on σL, a σ54 homologue (Drzewiecki et al., 1998), and DinR/LexA, the repressor of the SOS response (Miller et al., 1996; Winterling et al., 1997). Enhanced expression of DinR might dampen the SOS response under general stress conditions, with the possible result of balancing an appropriate level of DNA repair against an unwelcome induction of resident prophage. Other potential transcriptional regulators include YdeC, YtzE, YvaN and YwhH, which are members of the AraC, DeoR, SinR and YbaK/EbsC families, respectively, and Xpf, a positive control factor of the resident PBSX prophage (McDonnell et al., 1994).

The expression of five other proteins that specifically regulate σB activity is known or inferred to be under the autogenous control of σB. These include RsbV, RsbW, SigB and RsbX, whose structural genes comprise the promoter-distal half of the sigB operon (Kalman et al., 1990), and YqhA, an unlinked σB regulator (Akbar et al., 2001), whose expression is also thought to be σB dependent (Petersohn et al., 1999b). The sigB and rsbX genes are found in the Fig. 4B cluster of strong candidates, rsbV was not represented in the arrays, and rsbW narrowly missed the criteria for inclusion in Fig. 4B and is found in Fig. 4C. Interestingly, yqhA also narrowly missed these criteria but is found in Fig. 4C because it lies in an apparent operon downstream from yqgZ, a gene from Fig. 4B that encodes a potential arsenate reductase (described in the next section). Notably, the HMM found characteristic σB recognition sequences preceding yqgZ, and RACE-PCR associated these sequences with the 5′ end of a σB-dependent message (Fig. 5). Our data therefore support the inference that yqhA is under σB control (Petersohn et al., 1999b) and argue that it lies in an operon with the order yqgZ–trnSLGln1–yqhA.

Genes with a potential protective function

The products of three newly identified genes may fulfil a detoxification role that contributes to general stress protection. These are YdbD, which is similar to a manganese-containing catalase from Lactobacillus plantarum (Igarashi et al., 1996), YdfO, which is similar to hydroquinone dioxygenases from Sphingomonas species (Miyauchi et al., 1999; Xun et al., 1999), and YqgZ, which is related to the arsenate reductases and may therefore confer resistance to toxic oxyanions of arsenic and antimony. The product of a fourth new gene, yxlJ, is similar to a variety of 3-methyladenine glycosylases that repair alkylated DNA. The HMM identified potential σB promoter elements preceding each of these four new genes (Table 1), and ydbD, ydfO and yqgZ gave a positive outcome with RACE-PCR (yxlJ was not tested; see Fig. 5). From these results, together with the results of previous studies, we infer that at least eight of the nine genes in the protective category are controlled directly by σB. The possible exception is another newly discovered gene, yisP, which lacks HMM-recognized σB promoter elements in the 250 bp immediately preceding its coding region. Notably, YisP is similar to carotenoid synthases from bacteria and plants. Carotenoids serve as photoprotectant agents by quenching singlet oxygen and other harmful radicals formed when aerobically growing bacteria are exposed to light (Dahl et al., 1989).

Influx and efflux

In addition to transporting essential nutrients, permeases contribute to stress resistance by removing toxic compounds and maintaining ion balance. The solute has not been identified experimentally for most of the recognized B. subtilis transporters. However, in a comprehensive classification of microbial permeases, Saier and colleagues found that transport specificity often correlates with phylogeny (Paulsen et al., 1998), allowing a preliminary functional prediction.

It was suggested previously that σB augments the expression of Bmr, a multidrug efflux transporter (Petersohn et al., 1999a). The notion that the general stress response contributes to multidrug resistance is strengthened by finding two new multidrug efflux homologues within the σB regulon (Table 1). One is YusP, whose sequence is intriguingly similar to Bmr3, a known multidrug transporter in B. subtilis (Ohki and Murata, 1997), and the other is YvqJ, which bears significant similarity to the MefE macrolide resistance protein of Streptococcus pneumoniae (Tait-Kamradt et al., 1997). Multidrug efflux is thought to reflect an ancient, broad-ranged resistance to toxic compounds generated internally by cellular metabolism or externally by plants in the environment (Miller and Sulavik, 1996; Klyachko and Neyfakh, 1998). Considering the distribution of σB and its regulators among Gram-positive pathogens (Price, 2000), this potential contribution to multidrug resistance may be of particular relevance in clinical settings.

The expression of three other permeases listed in Table 1 may also further environmental stress resistance. One is GabP, a member of the amino acid–polyamine–organocation family that transports γ-aminobutyrate (Brechtel and King, 1998). In E. coli and Lactococcus lactis, γ-aminobutyrate transport is part of a cycle that contributes to acid stress tolerance (Sanders et al., 1998; Castanie-Cornet et al., 1999). Another is YfkE, a member of the Ca2+ cation antiporter family (Paulsen et al., 1998). These antiporters contribute to maintaining sodium or calcium balance and can also confer resistance to high cation levels in the environment (Ivey et al., 1993). A third permease is the multicomponent ABC transporter encoded by the opuB locus, which is narrowly specific for choline, a precursor of the osmoprotectant glycine betaine (Kappes et al., 1999). The transcript levels of opuBB and opuBC met the criteria for inclusion within the cluster shown in Fig. 4B, and those of opuBD narrowly missed the criteria. These genes encode the membrane components of the permease and the extracellular choline-binding protein. In contrast, the transcript levels of opuBA, encoding the membrane-associated ATPase component, were not influenced by σB. This potential influence of the general stress response on choline transport is presumably independent of the osmotic control of opuB expression demonstrated previously by Kappes et al. (1999). If this proves to be the case, it would parallel the dual osmotic and σB control already established for the proline transporter OpuE (von Blohn et al., 1997), also included in the Fig. 4B cluster.

Energy stress is one of the strong triggers of the general stress response. Consistent with this, many of the other permeases discovered in this study appear to transport sugars, amino acids and other nutrients that would be advantageous to starving cells. These include members of three different ABC transporters: YdbE, a possible C-4 dicarboxylate binding protein; YesP, which appears to be the permease component of a sugar transporter; and YqiY, a possible permease component of an amino acid transporter (Paulsen et al., 1998). Moreover, YwtG and YybO are members of the major facilitator superfamily. These (and the superfamily members CsbC and CsbX identified in previous studies) probably transport sugars and related compounds (Paulsen et al., 1998).

A substantial number of influx and efflux functions appear to be altered during the general stress response. This parallels the emerging picture that other levels of cellular metabolism are also modified, as discussed in the next section.

Other metabolic changes

At least three different categories of metabolism appear to be influenced by expression of the σB regulon: (i) carbon metabolism; (ii) envelope function; and (iii) turnover.

Carbon metabolism

Many genes in this category encode potential oxidoreductases or dehydrogenases. These include YdjL, a putative Zn-dependent oxidoreductase, and three predicted dehydrogenases, YvaA, YwfH and YxnA. These proteins join YcdF and YdaD, two previously discovered dehydrogenases whose suggested role is maintaining cellular redox balance during stress (Petersohn et al., 1999a,b). However, some of these dehydrogenases also have important roles in carbon metabolism, including (for certain homologues of YcdF, YdaD, YwfH and YxnA) roles in fatty acid metabolism (Tatusov et al., 2001).

Envelope function

In addition to influencing transport and possibly fatty acid metabolism, as described above, other σB-dependent gene products also appear to modify cell envelope function. Their products are included in the carbon metabolism category of Table 1. For example, GgaA participates in the synthesis of a minor teichoic acid (Estrela et al., 1991), and we suggest that GspA, YvfD and YxaB participate in exopolysaccharide biosynthesis. GspA is an abundant product of a known σB-dependent gene (Antelmann et al., 1995), but its physiological role has remained elusive. We note here that GspA significantly resembles the E. coli RfaI and RfaJ glycosyltransferases involved in lipopolysaccharide biosynthesis (Schnaitman and Klena, 1993). Similarly, YxaB was inferred previously to be a member of the σB regulon (Antelmann et al., 1997c), but its function is unknown. Our profiling experiments provide additional evidence that yxaB transcription is under σB control, and we also find that YxaB resembles Streptococcus thermophilus EpsL (Stingele et al., 1996). YvfD emerged from our study and is similar to sugar transferases such as Streptococcus agalactiae NeuD (Chaffin et al., 2000) and Neisseria meningitidis PglB (Power et al., 2000). Other examples of genes implicated in envelope function are provided by ywiE and ywjA, which seem to comprise an operon. YwiE belongs to a cluster of cardiolipin and phosphatidylserine synthases (Tatusov et al., 2001), and YwjA may form part of a special, conjoined-peptide ABC transporter of the MsbA family (Paulsen et al., 1998). MsbA transports lipid molecules across the inner membrane in E. coli (Zhou et al., 1998).

Turnover


The five genes listed in the turnover category of Table 1 encode known or potential ribonucleases and proteases that could recycle damaged or unnecessary macromolecules to satisfy changing cellular needs. Two of these are encoded by yvaK and yvaJ, which lie in an apparent operon. YvaK closely resembles a carboxylesterase from Bacillus stearothermophilus, whereas YvaJ has an activity similar to ribonuclease R from E. coli (Oussenko and Bechhofer, 2000).

Genes of unknown function

Based largely on their phylogenetic distribution, there are three different categories for gene products of unknown function specified in Table 1.

The 13 proteins in the first category are significantly similar (E-value < 1e-10) to proteins found in other organisms, and their inferred involvement in the general stress response provides a clue to the physiological roles of their orthologues in Deinococcus radiodurans, Enterococcus faecalis, L. monocytogenes and S. aureus.

The 28 members in the second category are not significantly similar to proteins in other organisms, but some do have B. subtilis paralogues. For example, YjgD and YrhD resemble each other and share an ancient conserved region of unknown function (Tatusov et al., 2001). Additionally, YfkD and YwsB are predicted to possess signal peptides, and YjgB and YutC are predicted to have lipoprotein signal peptides (Tjalsma et al., 2000), but their functions remain unknown.

The 13 members in the third category include small proteins ranging between 36 and 90 residues. A BLAST comparison found that most of these small proteins closely resemble nothing outside of B. subtilis. The exceptions are YisH, 56% identical in a 72-residue overlap with Bacillus cereus GerPA (Behravan et al., 2000), and YuzA, 54% identical in a 61-residue overlap with Sinorhizobium meliloti Orf3 (Davey and de Bruijn, 2000). The functions of GerPA and Orf3 are not yet known. gerPA lies in an operon with other genes involved in spore germination, whereas orf3 lies immediately upstream from the gene for a presumed sensor that positively regulates the ndiAB locus in response to nutrient limitation, osmotic stress and entry into stationary phase. Likely homologues to Orf3 are also found in Rickettsia, Chlamydophila and Chlamydia species. Of the remaining products in this category, the COG database suggests that YuzA and CsbD/YwmG carry bacterial conserved regions of unknown function, and that YdaS and YwzA are membrane proteins. Curiously, the 85-residue YdaS sequence contains a replica of the 49-residue YwzA sequence, with 37 identical residues and six conserved substitutions. Seven of the genes in this category are preceded by possible σB recognition sequences detected by the HMM, and the σB-dependent messages of two, ywjC and ywmE, were confirmed by RACE-PCR (Fig. 5).


Our transcriptional profiling experiments support the view that the general stress response brings about widespread changes in cellular metabolism to achieve a more stress-resistant state. Of the 97 new σB-dependent genes we identified, few appear to counter environmental or energy stresses directly. Instead, we observed patterns of transcription consistent with changes in carbon metabolism, macromolecular turnover and envelope function, including particularly striking changes in transport function. Significantly, most of the 97 newly identified genes are of unknown function but can now be associated with the general stress response in B. subtilis and other microbes. And lastly, we discovered substantial shifts in the transcriptional patterns of known or suspected regulatory proteins. In some cases, these regulatory proteins may be the agents of the metabolic shift that our study suggests is a prominent feature of the general stress response. In other cases, the altered expression of these regulatory proteins may be a consequence of metabolic realignment. The question of how such a realignment promotes general stress resistance, and also biofilm formation and virulence in related pathogenic organisms, may now be addressed by analysing the role and regulation of individual general stress genes identified in B. subtilis.

Experimental procedures

Growth conditions

Bacillus subtilis strains were derivatives of the 168 Marburg strain. These included PB2 wild type (trpC2) and PB153 (sigBΔ2::cat trpC2) used for the two ethanol stress experiments (Boylan et al., 1991) and PB213 (PspacrsbV rsbWΔ1 sigB rsbX) used for the ectopic expression experiment (Boylan et al., 1992). Cells were grown overnight on Luria–Bertani (LB) agar lacking salt, then inoculated to an OD600 of ≈ 0.05 in 50 ml of buffered LB medium lacking salt (Boylan et al., 1993b), contained in a 125 ml flask. For the ethanol stress experiments, parallel cultures of PB2 and PB153 were grown at 37°C with good aeration until they reached an OD600 of ≈ 0.40, at which time they were diluted into 50 ml of fresh, prewarmed LB lacking salt to achieve an OD600 of ≈ 0.015. The cultures were then grown to an OD600 of ≈ 0.20, at which point ethanol was added to a final concentration of 5% (v/v). In previous experiments, maximal reporter fusion activity was seen between 10 and 15 min after the imposition of an ethanol stress (Boylan et al., 1993b). Consequently, for the first experiment reported here, 10 ml culture samples were withdrawn 0, 5 and 10 min after ethanol addition, whereas for the second experiment, the samples were withdrawn at 0, 5, 10 and 15 min. The specific growth rates (µ) of these cultures were 1.98–2.08 h−1 before ethanol addition, and these decreased to 0.60–0.69 h−1 during the period of sampling. For the ectopic expression experiment, parallel cultures of PB213 were grown in LB lacking salt to an OD600 of ≈ 0.20, as described above; then, IPTG was added to one of the flasks at a final concentration of 1 mM. Culture samples were taken 0, 5, 10, 15 and 20 min after IPTG addition. Although included in the database and shown in Fig. 4C, data from the 20 min time point were not used for analysis because of the large number of genes (618) whose expression was significantly altered by treatment with inducer, many of which presumably reflected secondary consequences of sigB induction. The specific growth rate of these cultures was 1.98 h−1 before IPTG addition and did not change appreciably during the period of sampling. All culture samples were rapidly pelleted and frozen in liquid N2 to await RNA extraction.
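For readers more used to doubling times than specific growth rates, the two are related by t_d = ln 2 / µ, assuming balanced exponential growth. The short sketch below applies this standard conversion to the growth rates reported above; the function name is hypothetical.

```python
import math


def doubling_time_min(mu_per_hour):
    """Convert a specific growth rate (h^-1) to a doubling time in minutes.

    Assumes exponential growth, so that t_d = ln(2) / mu.
    """
    if mu_per_hour <= 0:
        raise ValueError("specific growth rate must be positive")
    return 60.0 * math.log(2) / mu_per_hour


if __name__ == "__main__":
    # Growth rates quoted in the text, before and during ethanol stress.
    for mu in (1.98, 2.08, 0.60, 0.69):
        print(f"mu = {mu:.2f} per h -> doubling time {doubling_time_min(mu):.1f} min")
```

On this basis, the cultures doubled roughly every 20 to 21 min before ethanol addition and every 60 to 69 min during the sampling period.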

Transcriptional profiling and treatment of the data

Custom PCR-based macroarrays were made on nylon substrate filters. PCR amplification was carried out in 96-well plates using a set of primers for 4100 B. subtilis open reading frames (Eurogentec). PCR products were analysed by agarose gel electrophoresis, and > 3900 of the 4100 reading frames were found to be represented.

RNA was prepared from frozen culture samples using a variant of the hot acid–phenol method (see Supplementary material). To make the cDNA probes, reverse transcription reactions were primed using random hexamers on 5 µg of total RNA template, then conducted in the presence of [33P]-dCTP. Radiolabelled probes were spin purified and assayed by liquid scintillation, then divided and hybridized to replicate array membranes for 18 h at 68°C. These replicate membranes were washed and baked according to standard protocols, then exposed overnight against phosphorimager screens. Membranes were used only once, then discarded.

A Fuji phosphorimager and ArrayVision software (Imaging Research) were used to image and quantify hybridization intensities and for background subtraction. Spots were associated with genes using Expression Explorer software (Millennium Pharmaceuticals), and intensity distributions were normalized to a median value of 10. A minimum threshold value of 1 was set for all data points to avoid spurious ratios towards the lower end of the intensity range.
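The normalization and thresholding steps can be sketched in a few lines. This is a minimal illustration rather than the software actually used: the function name is hypothetical, and applying the floor after scaling (rather than before) is an assumption, because the order is not stated in the text.

```python
from statistics import median


def normalize_intensities(raw, target_median=10.0, floor=1.0):
    """Scale background-subtracted intensities to a chosen median, then floor.

    target_median=10 and floor=1 follow the values given in the text;
    flooring after scaling is an assumption made for this sketch.
    """
    m = median(raw)
    if m <= 0:
        raise ValueError("median intensity must be positive")
    scale = target_median / m
    return [max(v * scale, floor) for v in raw]


if __name__ == "__main__":
    # Invented intensities for one membrane.
    print(normalize_intensities([2, 4, 6, 8, 0.1]))
```

The floor matters mainly for weak spots: without it, a denominator close to zero would produce a very large and unreliable expression ratio.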

Data sets were imported into a FileMaker Pro 4.1 database (see Supplementary material). Expression ratios were calculated using average values obtained from replicate filters, considering only those genes showing a ≥ fourfold difference in the case of the ethanol stress experiments or a ≥ threefold difference in the case of the ectopic expression experiment. We superimposed on these basic criteria a statistical heuristic to eliminate those genes that gave inconsistent hybridization by calculating a 90% confidence interval based on the replicate data and asking that ratios calculated at the extremes of the confidence intervals remained greater than the basic criteria.
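One way to read this heuristic is that the most pessimistic ratio, taken as the lower confidence bound of the numerator over the upper confidence bound of the denominator, must still meet the fold-change criterion. The sketch below implements that reading. It is an interpretation, not the procedure as coded in the database: the function names are hypothetical, and the normal quantile 1.645 is used for the 90% interval as a simplifying assumption (a t quantile would give a wider interval for small numbers of replicates).

```python
from math import sqrt
from statistics import mean, stdev

Z90 = 1.645  # two-sided 90% normal quantile; a simplifying assumption


def ci90(values):
    """Approximate 90% confidence interval for the mean of replicate values."""
    m = mean(values)
    if len(values) < 2:
        return (m, m)
    half = Z90 * stdev(values) / sqrt(len(values))
    return (m - half, m + half)


def passes_filter(numerator_reps, denominator_reps, min_ratio, floor=1.0):
    """True if both the mean ratio and the most pessimistic ratio meet min_ratio.

    floor mirrors the minimum intensity threshold of 1 described in the text.
    """
    num_lo, _ = ci90(numerator_reps)
    _, den_hi = ci90(denominator_reps)
    mean_ratio = max(mean(numerator_reps), floor) / max(mean(denominator_reps), floor)
    pessimistic = max(num_lo, floor) / max(den_hi, floor)
    return mean_ratio >= min_ratio and pessimistic >= min_ratio


if __name__ == "__main__":
    # Invented replicate intensities: consistent versus inconsistent hybridization.
    print(passes_filter([40, 44], [5, 5.2], 4.0))
    print(passes_filter([40, 10], [5, 5], 4.0))
```

In the second call the mean ratio is fivefold, but the replicates disagree so strongly that the pessimistic ratio falls below the criterion and the gene is rejected.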

RACE-PCR experiments

We used RACE-PCR to locate the 5′ ends of selected messages, essentially as described by Frohman (1994). RNA was extracted from both the PB2 wild type and the PB153 mutant (sigBΔ2::cat) 10 min after exposure to ethanol stress. For the experiments reported here, 1.3 µg of RNA template was mixed in a final volume of 10 µl with 20 pmol of a gene-specific primer (oligo-1, located about 300 bp downstream from the predicted 5′ end). The sample was denatured at 70°C for 5 min and allowed to anneal while cooling to room temperature. For cDNA synthesis, we added to this 10 µl annealing mixture an additional 40 µl of reverse transcriptase buffer containing 0.5 mM dNTP and 0.5 mM M-MuLV reverse transcriptase (New England Biolabs), then incubated it at 37°C for 2 h. A homopolymeric A-tail was added to the 3′ end of the cDNA with 25 U of terminal transferase (New England Biolabs), and the tailed cDNA was amplified further using 0.5 mM poly(T) primer (Roche Molecular Biochemicals) together with a second gene-specific primer (oligo-2, located about 200 bp downstream from the predicted 5′ end). The resulting PCR products were sequenced directly and compared with the relevant intercistronic regions. Owing to its multistep nature, RACE-PCR has the potential to amplify truncated cDNA ends, thereby designating a 5′ end downstream from its actual location (Schaefer, 1995). Within the error of the method, all seven 5′ ends shown in Fig. 5 were found an appropriate distance downstream from elements resembling the −35 and −10 regions of known σB-dependent promoters. In contrast, none of these same 5′ ends was detectable using RNA extracted from the σB null mutant.


Acknowledgements

We thank Tatiana Gaidenko, Arkady Khodursky and Richard Losick for their helpful discussions and critical comments on the manuscript. This research was supported in part by Public Health Service grant GM42077 from the National Institute of General Medical Sciences (C.W.P.).

Supplementary material

The following material is available online:


A. Preparation of RNA

B. Preparation of radioactive cDNA probes

C. Array hybridizations

D. Hidden Markov models

Complete data set

A. Array data

B. Annotation for Fig. 4A