Global impact of mature biofilm lifestyle on Escherichia coli K-12 gene expression



The formation of biofilm results in a major lifestyle switch that is thought to affect the expression of multiple genes and operons. We used DNA arrays to study the global effect of biofilm formation on gene expression in mature Escherichia coli K-12 biofilm. We show that, when biofilm is compared with the exponential growth phase, 1.9% of the genes showed a consistent up- or downregulation by a factor greater than two, and that 10% of the E. coli genome is significantly differentially expressed. The functions of the genes induced in these conditions correspond to stress response as well as energy production, envelope biogenesis and unknown functions. We provide evidence that the expression of stress envelope response genes, such as the psp operon or elements of the cpx and rpoE pathways, is a general feature of E. coli mature biofilms. We also compared biofilm with the stationary growth phase and showed that the biofilm lifestyle, although sharing similarities with the stationary growth phase, triggers the expression of specific sets of genes. Using gene disruption of 54 of the most biofilm-induced genes followed by a detailed phenotypic study, we validated the biological relevance of our analysis and showed that 20 of these genes are required for the formation of mature biofilm. This group includes 11 genes of previously unknown function. These results constitute a comprehensive analysis of the global transcriptional response triggered in mature E. coli biofilms and provide insights into its physiological signature.


Surface-attached, matrix-enclosed bacterial communities, called biofilms, cause serious economic and health problems because of biofilm-associated phenotypes such as antibiotic resistance or biofouling (Costerton et al., 1999). The drastic phenotypical changes seen in biofilms led to the assumption that the physiological modifications necessary for planktonic bacteria to adopt the biofilm lifestyle must involve specific responses. Global expression analysis comparing either protein or transcription profiles in Pseudomonas planktonic and biofilm bacterial cultures suggested that a large number of genes could be differentially regulated during biofilm development (Sauer and Camper, 2001; Whiteley et al., 2001; Sauer et al., 2002). In Escherichia coli, the use of DNA array technology to compare the expression profiles between planktonic cells and young biofilms showed recently that, beyond genes involved in adhesion events, several novel gene clusters are specifically induced during the early stages of biofilm formation (Schembri et al., 2003). These pioneering studies opened the way to the genetic characterization of the biofilm phenotype. However, although the early events of biofilm formation are well documented, extracting functional information from genomic approaches remains a challenge (Sauer, 2003). In particular, the nature of the physiological changes and critical regulatory processes occurring inside the aged, mature biofilm is still poorly understood (Ghigo, 2003).

Whereas E. coli K-12 does not spontaneously form extensive biofilms, it was shown previously that the expression of pili from conjugative plasmids, which are widespread in natural bacterial populations, promotes the development of thick, mature biofilms (Ghigo, 2001; Reisner et al., 2003). This raised the possibility of studying the genetic basis of the mature biofilm phenotype in a widely used bacterial model in which expression profiling can be combined with the phenotypic analysis of a large set of deletion mutants.

In order to identify the physiological responses taking place inside mature biofilm, we investigated the global transcriptional differences between E. coli K-12 mature biofilm and planktonic cultures. We validated our expression profiling approach by the systematic inactivation of the most biofilm-induced genes followed by a detailed analysis of their biofilm phenotypes. This functional profiling led us to assign a biofilm-related function to 20 genes. We showed that the expression of stress response genes is a general feature of E. coli biofilms, and we investigated the role of these genes in biofilm formation and architecture. This global transcription analysis of mature E. coli biofilms provides cues to the genetic events underlying the phenotypic changes that characterize the biofilm mode of growth.


Production of mature E. coli biofilms

The capacity of different E. coli K-12 strains to form mature biofilms was tested in M63B1 glucose minimal medium in a microfermenter-based continuous flow culture system (Ghigo, 2001). Most of the strains tested formed only thin biofilms after 2–5 days. However, high biomass and thick biofilm production (>200 µm) was reproducibly achieved using E. coli TG1, a strain carrying the F conjugative plasmid previously shown to promote biofilm formation (Ghigo, 2001; Reisner et al., 2003). To identify E. coli genes that are differentially expressed in mature biofilms, we compared 8-day-old TG1 biofilms with late exponential TG1 planktonic (OD = 0.6) or stationary phase cultures (OD = 3). Whereas no surface adhesion was observed in agitated flask planktonic culture conditions, a significant amount of contaminating biofilm formation occurred in planktonic TG1 continuous cultures grown in fermenters. This led us, in the main experiment described in this study, to compare planktonic cultures grown in agitated flasks with TG1 biofilms grown in fermenters. However, differential gene expression between planktonic and biofilm bacteria both grown in fermenters was also investigated (see Discussion).

Biofilm formation has a global impact on gene expression when compared with exponential growth phase

Total RNAs were isolated from independent biofilm and exponential growth phase cultures and subjected to a stringent expression profiling procedure using E. coli membrane DNA macroarrays. Data were subjected to a Wilcoxon rank test. The expression pattern and predicted function of differentially expressed genes are summarized in Fig. 1 and in the Supplementary material (Fig. S1). In biofilms, 250 genes (5.8%) were overexpressed (P < 0.05, 82% of them with P < 0.005), whereas 188 genes (4.4%) were underexpressed (P < 0.05, 85% of them with P < 0.005). This indicates that 10.2% of the E. coli genome is differentially expressed in TG1 biofilm at a statistically significant level (Supplementary material, Fig. S1 and Tables S1 and S2). Among these identified genes, 1.9% were up- or downregulated by a factor of two or more.
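
The percentages quoted above can be rechecked with a few lines of arithmetic. The sketch below assumes a genome size of 4290 genes, which is the denominator used in the Discussion of this paper; the function and variable names are illustrative only.

```python
# Genome size assumed to be 4290 genes (the denominator used in the Discussion).
GENOME_SIZE = 4290


def genome_fraction(n_genes, genome_size=GENOME_SIZE):
    """Return the percentage of the genome represented by n_genes."""
    return 100.0 * n_genes / genome_size


overexpressed = 250   # genes overexpressed in biofilm (P < 0.05)
underexpressed = 188  # genes underexpressed in biofilm (P < 0.05)

pct_over = genome_fraction(overexpressed)
pct_under = genome_fraction(underexpressed)
pct_total = genome_fraction(overexpressed + underexpressed)

print(f"over: {pct_over:.1f}%  under: {pct_under:.1f}%  total: {pct_total:.1f}%")
```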

Figure 1.

Function of genes overexpressed in TG1 biofilm versus exponential growth phase. This figure summarizes the data presented in Table S1 (Supplementary material). The genes have been classified according to the COG functional categories annotation system. Large and medium size numbers indicate the total number of E. coli biofilm-induced genes in each class or subclass of indicated functions. Genes are indicated only when their expression level in biofilm differed by at least a factor of two (≥ 2). Numbers within brackets indicate the rank as overexpressed genes; 1 = most expressed gene in TG1 E. coli biofilm.

Table 1. Overexpressed genes (≥ 2) in E. coli TG1 and TG biofilms versus exponential growth phase.

a. Gene names according to E. coli Colibri database (

b. Gene names according to Blattner nomenclature (♯gen).

c. Ratio of gene expression in E. coli biofilm versus gene expression in planktonic exponential cultures.

d. Rank position; 1 = the most overexpressed gene in E. coli biofilm.

e. Biofilm phenotype of the mutants: ND, not determined; NA, not applicable because of a growth defect in M63B1 glucose medium; wt, similar to wild type; –, biofilm reduced compared with wt; Struct, biofilm structure impaired compared with wt.

f. ✓, genes also found to be significantly overexpressed in F minus E. coli strain TG.

g. Function description according to E. coli Colibri database.

h. pspF was expressed by only a factor of 1.22 in TG1 biofilm but has been included for comparison with other members of the psp operon.

Arrow, mutants affected for biofilm formation.

*Genes that were not induced in TG1 biofilm versus stationary phase.

Genes that were also induced in TG1 biofilm versus stationary phase by at least a factor of two. These genes are also summarized in Table S3.

The genes have been classified according to the COG functional categories annotation system used by the NCBI (

Table 2. Strains and plasmids used in this study.

Strain/plasmid | Relevant characteristics | Reference/source

E. coli strains
PAP6181 | K1519 pspF::miniTn10 (TetR) | Jovanovic et al. (1996)
PHL904 | cpxA::Ωcat (CmR) | Dorel et al. (1999)
RG075 | MG1655 ΔmsrA::ΩSpec (SpecR) | A gift from F. Barras
STC27 | fimA1::cat (CmR) | Pratt and Kolter (1998)
TG1 | F′[traD36 proAB+ lacIq lacZΔM15] supE hsdΔ5 thi Δ(lac-proAB) | Laboratory collection
TG | F minus derivative of TG1 | Laboratory collection
TG1ΔcpxA | TG1 cpxA::Ωcat (CmR) | This work
TG1ΔcpxP | TG1 ΔcpxP::Δfrt | This work
TG1ΔcpxR | TG1 ΔcpxR::Δfrt | This work
TG1ΔfimA | TG1 ΔfimA::cat (CmR) | This work
TG1ΔmsrA | TG1 ΔmsrA::ΩSpec (SpecR) | This work
TG1ΔpspF | TG1 pspF::miniTn10 (TetR) | This work
TG1recA | TG1 recA56 SrlC300::Tn10 (TetR) | Laboratory collection
TG1ΔrseA | TG1 ΔrseA::Δfrt | This work
TG1gfp | TG1 λatt::gfp-bla (AmpR) | A gift from A. Roux
TG1gfpΔcpxP | TG1 ΔcpxP λatt::gfp-bla (AmpR) | This work
TG1gfpΔcpxR | TG1 ΔcpxR λatt::gfp-bla (AmpR) | This work
TG1gfpΔyccA | TG1 ΔyccA λatt::gfp-bla (KmR, AmpR) | This work
TG1gfpΔycfJ (a) | TG1 ΔycfJ λatt::gfp-bla (KmR, AmpR) | This work

Plasmids
pKOBEG | pSC101 ts (replicates at 30°C), araC, arabinose-inducible λredγβα operon (CmR) | Chaveroche et al. (2000)
pCP20 | ts (replicates at 30°C) plasmid bearing the flp recombinase gene (CmR and AmpR) | Cherepanov and Wackernagel (1995)

a. Additional individual mutants in the following genes: cutC, cyoC, dinI, eco, fadB, fdhF, gadA, lctR, malM–G, mdh, nifS, nifU, nlpE, pspA–E, rbsB, rpoE, rseB, sixA, sodC, spy, sucA, sulA, tatE, ybeD, ybjF, yccA, yceP, ycfJ, ycfL, ycfR, ydcI, yebE, yfcX, yggN, yghO, ygiB, yhhY, yiaH, yjbO, yneA, yoaB, yqcC, yqeC, were named TG1Δ[gene name]::aphA (KmR).

The most significant classes of biofilm-induced genes when compared with the planktonic exponential growth phase, either by level of overexpression or by number, are (i) genes involved in cellular processes such as envelope stress responses (pspABCDE, cpxP, spy, rpoE, rseA, rseB) and stress (recA, dinI) as well as cell envelope biogenesis and transport (fimA, tatE); (ii) genes involved in energy (cyoD, sucA, sixA, nifU) and carbohydrate metabolic functions (rbsB, lamB); and (iii) genes of unknown function (48%) (Fig. 1).

The main classes of repressed genes include genes involved in amino acid, carbohydrate transport and inorganic ion transport and genes of unknown function (Fig. S1 and Table S2). In the rest of this study, we focus on genes that were found to be the most overexpressed in E. coli biofilms. The role and significance of the repressed functions will be reported elsewhere.

Both stationary phase and biofilm-specific genes are expressed in mature biofilms

Mature biofilms constitute heterogeneous environments in which bacteria grow at different rates. This heterogeneity is proposed to be mostly dependent on nutrient availability and depth-related conditions created within the biofilm. We wished to determine to what extent the genes identified above were truly biofilm specific or, instead, a consequence of the stationary phase-like conditions prevailing in the mature biofilm. Total RNAs were isolated from independent stationary phase planktonic cultures, subjected to the expression profiling procedure and compared with biofilm profiling (complete comparison available at Among the 64 genes found to be the most induced in biofilm versus exponential phase (≥twofold ratio, see Fig. 1), 61% (39/64) of them were not induced in biofilm when compared with stationary phase (Table 1). This suggests that these 39 genes are not biofilm specific but may, instead, reflect the stationary phase-like growth conditions within the mature E. coli biofilm.

In contrast, the remaining genes (39%, 25/64) were also overexpressed in biofilm versus stationary growth phase, 24 of which with a ratio ≥ 2, thus defining a set of biofilm-specific genes (Table 1 and Supplementary material, Table S3).
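
The two-way comparison described above amounts to a simple decision rule, sketched below. The function name, the significance flag and the example ratios are hypothetical and do not come from the array data; only the twofold threshold and the 39/25 split of the 64 genes are taken from the text.

```python
def classify_gene(ratio_vs_exponential, induced_vs_stationary, threshold=2.0):
    """Classify a gene from its biofilm/exponential ratio and a flag saying
    whether it is also significantly induced in biofilm versus stationary phase."""
    if ratio_vs_exponential < threshold:
        return "below threshold"
    if induced_vs_stationary:
        return "biofilm specific"
    return "stationary phase-like"


# Hypothetical example values, for illustration only.
examples = {
    "geneA": (3.1, True),
    "geneB": (2.4, False),
    "geneC": (1.3, True),
}
labels = {name: classify_gene(*values) for name, values in examples.items()}

# Proportions reported in the text for the 64 most induced genes.
stationary_like_pct = round(100 * 39 / 64)
biofilm_specific_pct = round(100 * 25 / 64)

print(labels)
print(stationary_like_pct, biofilm_specific_pct)
```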

Validation of the macroarray data

Several approaches were used to validate the data issued from transcriptional profiling experiments. We checked the correlation between expression data and operon structure in E. coli. An analysis restricted to the genes with known function found to be induced by at least a twofold factor in biofilm compared with exponentially grown cells showed that 51% of them (21/41) were predicted to be included in 14 different operons, using the EcoCyc database ( For 10 of these 14 operons, we identified at least two members of the operon with expression that was induced in biofilms compared with exponentially grown cells. Furthermore, in order to verify the expression level changes, we then performed a quantitative reverse transcription polymerase chain reaction analysis (Q-RT-PCR) on a selection of the biofilm growth-regulated genes. Q-RT-PCR was performed for seven of the most biofilm-induced genes compared with exponentially grown cells (cpxP, ycfJ, ycfR, yebE, cyoD, sucA and fimA; see Fig. 1 and Table 1). Figure 2 shows a good correlation between the data obtained by the two different techniques (r = 1.12).

Figure 2.

Correlation of macroarray and quantitative real-time PCR results. The calculated macroarray and Q-RT-PCR ratios of the expression of seven genes in TG1 biofilm relative to exponential growth phase were log2 transformed, and values were plotted against each other to evaluate their correlation. The correlation coefficient was deduced from a linear regression of the plotted values.
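
As a rough illustration of the comparison summarized in this figure, the sketch below log2-transforms two sets of expression ratios and computes a least-squares slope and a Pearson correlation coefficient. The ratios are invented placeholders, not the measured values, and the helper name is an assumption. By construction, a Pearson coefficient lies between −1 and 1, whereas the slope of the fitted line is not bounded in this way.

```python
import math


def log2_correlation(array_ratios, pcr_ratios):
    """Return (slope, intercept, r) for log2(pcr) regressed on log2(array)."""
    x = [math.log2(v) for v in array_ratios]
    y = [math.log2(v) for v in pcr_ratios]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder ratios for seven genes (illustrative only, not the study's data).
macroarray = [2.1, 2.5, 3.0, 3.8, 4.6, 6.0, 8.2]
qrtpcr = [2.6, 2.9, 4.1, 5.5, 6.0, 9.3, 14.0]

slope, intercept, r = log2_correlation(macroarray, qrtpcr)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.2f}")
```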

These results indicate both good internal consistency of our macroarray data as well as good correlation between our analysis and actual mRNA levels, as determined experimentally by Q-RT-PCR. To extract further functional information from our DNA array data, we then wished to analyse the biofilm-related phenotypes of isogenic mutants of the identified biofilm-induced genes.

Functional profiling of E. coli biofilms: 20 biofilm-induced genes are involved in mature biofilm development

Among genes significantly induced in TG1 biofilms (compared with planktonic exponential growth phase cells), 64 genes were found to be overexpressed by at least a factor of two (Table 1). To test directly the contribution of these genes to biofilm development, we deleted 23 of the 25 genes that were overexpressed in biofilms compared with both planktonic phases (biofilm-specific genes) as well as 31 of the 39 genes that were only induced in biofilms versus exponential growth phase. Mutations in sixA, sucA, yfhN (nifU), yfhO (nifS), ybeD, yhhY, rpoE and rseA impaired growth in M63B1 glucose minimal medium (data not shown). Mutants in these genes, along with ftsL, an essential cell division gene, could not be meaningfully tested for biofilm formation and were therefore excluded from further biofilm analysis. rpoE is an essential gene in which mutations can be suppressed by extragenic mutations (De Las Penas et al., 1997). Although our rpoE mutant did not exhibit full wild-type growth, we cannot exclude the appearance of such suppressor mutations in this mutant.

The ability to form a mature biofilm within 24 h was assessed for each mutant and compared with TG1. Both macroscopic biofilm development in microfermenters and biofilm cell density after dispersion of the biofilm grown on the removable glass slide of the fermenter were examined. Twenty mutants displayed a reduced biofilm phenotype (see Table 1, Fig. 3 and Supplementary material, Fig. S2). Eight of the mutants with reduced biofilm biomass correspond to genes of known function: fimA, msrA, rbsB, mdh, lctR, tatE, recA and cpxP.

Figure 3.

Biofilm phenotype of selected deletion mutants. Mature biofilm development of E. coli TG1 (wt) compared with a selection of deletion mutants of genes overexpressed in TG1 biofilm.
A. For each mutant phenotype analysis, the extent of biofilm formation is shown in the bottom part of the microfermenter and on the removable glass slide. A typical experiment is shown.
B. Graphical comparison of biofilm formation relative to wild type from the mutants presented in (A). Data represent the average of three independent experiments for each mutant. The level of biofilm formed by wt TG1 biofilm was set to 100%.
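
The normalization used in (B) can be expressed as follows. This is a minimal sketch: the measurement values are invented, and dividing each mutant value by the wild-type value from the same experiment is an assumption rather than a description of the exact procedure used here.

```python
from statistics import mean


def relative_biofilm(mutant_values, wildtype_values):
    """Mean biofilm formation of a mutant as a percentage of wild type.

    Each mutant measurement is divided by the wild-type measurement from the
    same experiment (an assumption), and the percentages are then averaged.
    """
    if len(mutant_values) != len(wildtype_values):
        raise ValueError("one wild-type value is needed per experiment")
    percentages = [100.0 * m / w for m, w in zip(mutant_values, wildtype_values)]
    return mean(percentages)


# Invented biofilm biomass readings from three independent experiments.
wildtype = [4.0, 5.0, 4.0]
mutant = [2.0, 2.5, 1.0]

print(f"{relative_biofilm(mutant, wildtype):.1f}% of wild type")
```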

fimA, msrA, rbsB and mdh are genes encoding proteins that have already been linked to biofilm formation or adhesion properties (see above). As expected, adhesion appeared to be a key factor in TG1 biofilm formation. Indeed, fimA encodes the major subunit of type I fimbriae, a known initial adhesion factor (Klemm and Christiansen, 1987), the role of which in biofilm formation has been demonstrated previously (Cormio et al., 1996; Austin et al., 1998; Pratt and Kolter, 1998; Watnick et al., 1999; Cookson et al., 2002). In contrast to our results, Reisner et al. (2003) showed recently that a fimA mutation had no effect on the development of biofilms formed in flow chambers by an F plasmid-bearing E. coli strain. Differences in strain, medium and biofilm growing system used might account for this discrepancy. msrA encodes a peptide methionine sulphoxide reductase (MsrA), a repair enzyme that contributes to the maintenance of adhesins in Streptococcus pneumoniae, Neisseria gonorrhoeae, E. coli (Wizemann et al., 1996) and Mycoplasma genitalium (Dhandayuthapani et al., 2001), which could explain the alteration in biofilm formation in the msrA mutant.

The biofilm lifestyle leads to a profound modification of energy metabolism as judged by the identification of mdh, rbsB and lctR as biofilm-induced genes. The rbsB and mdh genes have already been identified as being overexpressed in biofilms formed by pathogenic E. coli (Tremoulet et al., 2002). rbsB is part of the rbsDACBK operon that encodes high-affinity transport of and chemotaxis towards D-ribose (rbsC and rbsD are also induced in biofilms; see Supplementary material, Table S1). mdh encodes malate dehydrogenase, an enzyme of the TCA cycle. The lctR gene encodes a regulator of L-lactate dehydrogenase. Furthermore, several sugar metabolism/transport systems are activated in biofilm (maltose transport, glycerol metabolism and uptake, galactose-binding proteins, see Supplementary material, Table S1).

Our results also suggest that mature E. coli biofilm formation might require Tat-dependent secretion of a specific set of proteins. Indeed, tatE is proposed to be involved in the twin-arginine cell envelope protein transport system (Chanal et al., 1998). In Pseudomonas aeruginosa, tatA and tatB, encoding components of this secretion system, have been shown to be induced in biofilms (Whiteley et al., 2001), whereas tatC has been shown to be required for biofilm formation (Ochsner et al., 2002).

We also observed a defect in mature biofilm formation in a recA mutant (Fig. 3). This underlines the importance of stress responses in E. coli TG1 biofilm. Consistent with this result, several stress response genes are overexpressed in TG1 biofilm (SOS response: dinI, dinP, dinG, sbmC, recN, sulA; general stress: rpoS; chaperones: dnaJ and dnaK; heat shock proteins: htpX, htpG and ddg; DNA repair: exo, xthA; and envelope stress: see below and Supplementary material, Table S1). cpxP and spy are both linked to envelope stress response (Connolly et al., 1997; Danese and Silhavy, 1997; Raivio and Silhavy, 2001) and will be investigated below.

We could also assign a biofilm-related function to 11 genes of previously unknown function (ycfJ, ycfR, yoaB, yqcC, yggN, yneA, yccA, yfcX, yghO, yceP and ygiB). YfcX may be required for fatty acid utilization as a carbon source in anaerobic conditions (Campbell et al., 2003). Among these 11 genes, five encode putative extracytoplasmic proteins (ycfJ, ycfR, yqcC, yneA, yccA). YcfJ is homologous to UmoD of Proteus mirabilis, a protein that negatively regulates the flhDC flagellar and swarming master operon (Dufour et al., 1998). yccA is a putative cpx regulon member (De Wulf et al., 2002) encoding a protein of unknown function, but it has been shown to be a substrate for the membrane protease FtsH (Kihara et al., 1998). Among the mutants lacking any one of these five putative extracytoplasmic proteins, ΔycfJ and ΔyccA were the most affected for mature biofilm formation, with a reduction of about 50% compared with wild-type strain TG1 (Fig. S2).

To investigate the biofilm-related role of these two putative membrane proteins further and to confirm their importance in mature biofilm formation, we genetically introduced the green fluorescent protein (GFP) gene into the wild-type strain TG1 and into the mutant strains TG1ΔycfJ and TG1ΔyccA. This allowed us to compare biofilm formation between TG1gfp, TG1gfpΔycfJ and TG1gfpΔyccA in continuous flow chamber cultures, another well-established experimental model that allows non-invasive observation while preserving the spatial arrangement of the cells. This experimental system allows the quantitative, real-time monitoring of biofilm architecture development using confocal laser scanning microscopy and COMSTAT analysis (Heydorn et al., 2000) (Fig. 4). Initial adhesion of the ycfJ and yccA mutants was not affected, as measured by substrate coverage and biomass analysis. However, the maturation of the biofilm formed by these two mutants was greatly delayed, especially for the yccA mutant. Indeed, in the yccA mutant, the accumulated biomass remained very low over time, and typical biofilm mushroom structures appeared only sporadically and much later compared with wild-type strain TG1 (see Fig. 4). This suggests a role for YcfJ and YccA proteins in biofilm maturation.

Figure 4.

Functional profiling of E. coli biofilm: flow chamber analysis.
A. Spatial distribution of biofilm formation for E. coli TG1 and selected TG1 deletion mutants expressing Gfp. Biofilms were grown in flow chambers. Biofilm development was monitored by SCLM at the indicated times after inoculation (20 h, 45 h, 70 h, 95 h). Micrographs represent simulated three-dimensional images. Images inset into 70 h and 95 h of ycfJ correspond to a rare area in which the biofilm was more developed.
B. COMSTAT analysis of biofilm structures. Diagrams and standard deviations (numbers indicated in the individual columns) of biomass and substrate coverage from biofilms of E. coli TG1 and TG1 deletion mutants were determined by the COMSTAT program at four different time points (20 h, 45 h, 70 h, 95 h). Values are means of data from 12 image stacks (six image stacks from two independent channels). The biomass is given in µm³ µm⁻². The substratum coverage values are relative (1 represents total coverage).
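
For readers unfamiliar with these two parameters, the sketch below computes a biomass (biovolume per unit substratum area) and a substratum coverage from a small thresholded image stack. It is a simplified stand-in written for illustration, not the COMSTAT program itself; the function name, the voxel dimensions and the toy stack are assumptions.

```python
def biomass_and_coverage(stack, voxel_xy_um, voxel_z_um):
    """Compute biomass (µm^3 per µm^2) and substratum coverage (0 to 1).

    `stack` is a list of z-layers, each a list of rows of 0/1 values, where 1
    marks a voxel containing biomass and layer 0 is the substratum layer.
    """
    rows = len(stack[0])
    cols = len(stack[0][0])
    pixel_area = voxel_xy_um ** 2
    substratum_area = rows * cols * pixel_area
    # Total biovolume: number of occupied voxels times the volume of one voxel.
    occupied_voxels = sum(sum(sum(row) for row in layer) for layer in stack)
    biovolume = occupied_voxels * pixel_area * voxel_z_um
    # Coverage: fraction of pixels occupied in the substratum layer only.
    coverage = sum(sum(row) for row in stack[0]) / (rows * cols)
    return biovolume / substratum_area, coverage


# Toy 2 x 2 pixel stack with three layers (illustrative values only).
toy_stack = [
    [[1, 1], [1, 0]],  # substratum layer: 3 of 4 pixels covered
    [[1, 0], [0, 0]],
    [[1, 0], [0, 0]],
]
biomass, coverage = biomass_and_coverage(toy_stack, voxel_xy_um=0.5, voxel_z_um=1.0)
print(f"biomass = {biomass:.2f} µm^3/µm^2, coverage = {coverage:.2f}")
```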

These results demonstrate the involvement in mature biofilm formation of 30% (20 genes) of the most highly expressed genes identified in our study. Fifty per cent of these genes (10/20) were induced in biofilm versus both exponential and stationary growth phase (cpxP, spy, tatE, lctR, mdh, rbsB, ygiB, yqcC, yceP and yfcX), whereas the other 50% (10/20) were only induced in biofilm versus exponential growth phase (fimA, msrA, recA, yoaB, ycfJ, ycfR, yneA, yccA, yggN and yghO) (see Table 1 and Supplementary material, Table S3).

Biofilm-induced genes are not involved in the early stage of biofilm formation

Failure to form a wild-type mature biofilm could result from an initial adhesion defect. Therefore, we investigated whether the genes identified as overexpressed in mature TG1 biofilms and that impaired mature biofilm formation when mutated were also involved in the early adhesion steps. For this, we tested these mutants in a static microtitre plate-based assay that has been used widely to study the first steps in biofilm formation (Genevaux et al., 1996; O’Toole et al., 1999). With the exception of fimA, the early adhesion capacity of the mutants could not be distinguished from the parental strain (Supplementary material, Fig. S3). This result indicates that most genes overexpressed in mature biofilms are not involved in the early steps of this process and confirms that they participate in mature biofilm functions.

Comparison of E. coli F+/F− biofilm global response: general relevance to E. coli biofilm

In this study, we used an E. coli strain carrying a conjugative plasmid, a widespread situation that promotes biofilm formation (Ghigo, 2001; Reisner et al., 2003). To distinguish general features of E. coli biofilms from those specific to our model, we analysed the transcription profile of the E. coli strain TG, an F-free isogenic derivative of TG1. This control is of particular relevance because some of the genes found to be the most overexpressed (pspA, cpxP) have been shown to be related to either the conjugation process (cpx stands for conjugation plasmid expression; McEwen and Silverman, 1980) or stress responses that could correlate with the expression of membrane appendages such as conjugative pili. TG forms a thin and fragile biofilm after 5 days of culture in microfermenters (data not shown). Total RNA was isolated from E. coli TG biofilm and flask planktonic exponential cultures and was subjected to the same macroarray analysis as described for TG1. TG1 and TG biofilms were not strictly comparable in terms of depth and structure (and therefore, possibly, for biofilm-induced responses). As expected, some functions induced in TG1, for instance RecA and part of the SOS stress pathway, were not induced in TG (Table 1), suggesting that F-specific, possibly transfer-related, responses are induced in TG1 biofilm. Despite this fact, 33% of the genes induced in TG1 biofilm by a factor of more than twofold were also found to be statistically significantly overexpressed in TG biofilm (including cpxP, rseA, rseB, spy, psp operon members, tatE and fimA; see Table 1). This demonstrates that many of the biofilm-induced genes identified in this study are F independent and part of a general E. coli K-12 biofilm response.

Envelope stress pathways in E. coli mature biofilm

cpxP is one of the most overexpressed genes in E. coli TG1 biofilms versus planktonic growth phase (Fig. 1, Table 1 and Supplementary material, Table S3). cpxP is a target of the cpx two-component system, which is known to respond to a variety of extracytoplasmic stresses (envelope stress) (Raivio and Silhavy, 2001).

We therefore investigated the effect of deletion mutations in key components of the cpx pathway on biofilm formation. As shown in Fig. 5, inactivation of the sensor–regulator components of the cpx system (cpxA, cpxR), but also of cpxP and nlpE, affected biofilm formation in microfermenters. A mutation in spy (a biofilm-induced cpxP homologue) had no effect on biofilm biomass. rpoE and rseA mutants displayed a growth rate defect and, consequently, could not be studied in microfermenters. A mutation in rseB, the second antisigma E factor of the RpoE envelope stress pathway, did not affect growth, and an rseB mutant formed a wild-type biofilm. Whereas it is difficult to conclude that the rpoE pathway has a role in biofilm formation, the cpx pathway appears to contribute to biofilm development, based on the morphological effects caused by mutations in several of its key components. Indeed, the biofilms produced by both TG1ΔcpxR and TG1ΔcpxP in microfermenters were very fragile compared with wild-type TG1 biofilms. TG1ΔcpxP biofilm was made of large plaques, in strong contrast to the homogeneous TG1 biofilm (Fig. 6A–C). Consistent with this observation, a detailed electron microscopy analysis revealed that a cpxP mutation greatly altered biofilm macromorphology (Fig. 6D and E). Despite its fragility, no clear structural defect could be detected in the TG1ΔcpxR biofilm (data not shown). Even though slight structural differences could also be seen in the TG1Δspy mutant biofilms, structural alterations were not found in nlpE, cpxA or rseB mutant biofilms grown in microfermenters (data not shown).

Figure 5.

A comparison of biofilm formation capacity of mutants in the E. coli cpx and rpoE envelope stress pathways. Biofilm development comparison of TG1 and TG1 deletion mutants in microfermenters. The average of at least four experiments was plotted in the histogram. The level of biofilm formed by wt TG1 biofilm was set to 100%.

Figure 6.

Phenotypic analysis of the structure of TG1 and TG1 ΔcpxP biofilms grown in a microfermenter.
A. General view of the bottom part of the fermenter.
B. Macroscopic biofilm grown on the internal glass slide, removed from the fermenter shown in (A).
C. Close-up of the biofilm shown in (B).
D. Transverse section of TG1 and TG1 ΔcpxP biofilm.
E and F. Detailed 50× and 10 000× electron micrographs of TG1 and TG1 ΔcpxP biofilm structure.

To investigate further the role of cpxP and cpxR, we introduced a gfp allele into TG1ΔcpxP and TG1ΔcpxR and compared their biofilm formation with that of the parental TG1gfp strain in continuous flow chamber cultures. Single cells and very small colonies were observed on the surface for these two mutants during the initial steps of biofilm development in contrast to the wild type that forms normal three-dimensional colonies (Fig. 4, 20 and 45 h). Furthermore, both cpxP and cpxR mutants were also strongly affected in maturation of the biofilm (Fig. 4). These experiments suggest that stress envelope pathways are involved in the establishment of a structured mature biofilm in E. coli.

Phage shock protein operon (psp) is expressed in response to a variety of environmental and intracellular stresses including processes related to protein insertion in the outer membrane (Weiner and Model, 1994). Although the precise functions of the psp genes are not understood, they help to ensure survival of E. coli in adverse conditions, suggesting that psp genes are part of a stress response operon (Model et al., 1997). In our analysis, pspA and other members of the operon (pspBCDE) were consistently overexpressed in biofilm (Fig. 1, Supplementary material, Tables S1 and S3). Nevertheless, the disruption of the pspABCDE operon did not have a major impact on early (Supplementary material, Fig. S3) or late biofilm formation nor on biofilm structure (data not shown).


In this study, we investigated the differences in gene expression between E. coli K-12 mature biofilm and planktonic laboratory cultures. Using DNA macroarrays, we showed that the biofilm lifestyle, while sharing similarities with the stationary growth phase, triggers the expression of specific sets of genes.

Modifications in E. coli K-12 gene expression induced by the biofilm lifestyle

The use of large-scale fusion technology had already suggested that a significant fraction of the bacterial genome could be involved in biofilm physiology (Prigent-Combaret et al., 1999). Accordingly, Pseudomonas putida and P. aeruginosa biofilm proteome analyses showed that a large number of genes are differentially regulated during biofilm development (Sauer and Camper, 2001; Sauer et al., 2002). In contrast, a transcription profiling of the P. aeruginosa planktonic and biofilm phases led to the conclusion that only 1% of P. aeruginosa genes display more than a twofold difference in gene expression (Whiteley et al., 2001).

In E. coli, Schembri et al. (2003) showed recently that ≈ 5–10% of the E. coli genes exhibited altered microarray expression profiles when planktonic growth phases were compared with young biofilm cultures. They hypothesized that this proportion could result from the rather early stages of biofilm development analysed in their study, where the still ongoing switch from planktonic to sessile growth could result in a high level of transient gene expression.

Here, we compared mature biofilms with the planktonic exponential growth phase and showed that, as in the case of mature P. aeruginosa biofilms, only a small fraction (1.9%) of the E. coli genes are differentially expressed by more than a factor of two. However, below that threshold, biofilm formation still leads to the statistically significant differential expression of > 10% of the E. coli genome. These results therefore support the proposal that biofilm formation results in and from significant differences in the overall make-up of bacterial cells (Stoodley et al., 2002; Sauer, 2003).

Mature biofilm cells have been proposed to have stationary growth phase traits such as reduced growth and metabolic activity. To investigate the stationary phase character of bacterial life within biofilm, we also compared the expression pattern of stationary phase cultures with those determined for the exponential growth phase and the mature biofilm. Biofilm-specific genes, i.e. genes differentially regulated in biofilm versus both forms of planktonic phases, correspond to 4% of the genome (118 over- and 53 underexpressed/4290), and this proportion decreases to < 1% (0.67%; 23 over- and six underexpressed/4290) for genes varying by a factor of more than two. When one only considers the genes induced in response to the stationary growth phase character of the biofilm lifestyle, these genes represent 3% of the genome. The biofilm lifestyle, while sharing similarities with the stationary growth phase, thus triggers the expression of specific sets of genes.
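The proportions quoted in this paragraph follow directly from the gene counts and the 4290 ORFs present on the array. The short Python check below reproduces them; the function and variable names are illustrative only and are not part of the original analysis.

```python
# Gene counts are those quoted in the text; 4290 is the number of predicted
# E. coli K-12 ORFs represented on the array.
TOTAL_ORFS = 4290


def percent_of_genome(over, under, total=TOTAL_ORFS):
    """Share of the genome (in %) covered by over- plus underexpressed genes."""
    return 100.0 * (over + under) / total


# Biofilm-specific genes: all statistically significant genes.
biofilm_specific = percent_of_genome(118, 53)
# Biofilm-specific genes varying by a factor of more than two.
biofilm_specific_2x = percent_of_genome(23, 6)

print(f"biofilm-specific genes: {biofilm_specific:.2f}% of the genome")
print(f"biofilm-specific genes, more than twofold: {biofilm_specific_2x:.2f}%")
```

The first value rounds to the 4% given in the text, and the second lies between 0.67% and 0.68%.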

Functional profiling of the biofilm-induced genes

The biological importance of the differential gene expression exhibited upon biofilm versus planktonic growth was tested by the disruption of the majority of the highly induced genes in biofilms, including all biofilm-specific induced genes. We show that, although the mutants were not impaired in initial steps of adhesion to surfaces (with the exception of fimA), a third of them (20 genes) were affected in biofilm maturation (Table 1, Figs 3 and S2). This high proportion of genes involved in biofilm maturation strongly supports the pertinence of our analysis. Among these 20 genes, half correspond to biofilm-specific genes, whereas the other half was only induced in biofilms versus exponential growth phase (see Tables 1 and S3). This indicates that the development of a full mature biofilm requires not only biofilm-specific genes but also genes related to the stationary phase character of the biofilm. The individual role of some of these newly identified genes is currently being investigated.

Biofilm-related physiological functions

We show that genes found to be the most overexpressed in TG1 biofilm versus exponential growth phase were also part of the E. coli F-free biofilm response, indicating that the genes identified in this study are involved in the general response developed in mature E. coli K-12 biofilms. Those genes are not distributed randomly into all potential functional classes. Instead, they display a strong bias towards specific functional categories, and we propose that they are part of the biofilm genetic signature. Genes whose expression is required for full maturation of TG1 biofilm belong to functions linked to adhesion (fimA, msrA), energy metabolism (rbsB, mdh, lctR), transport (tatE), general stress (recA) and envelope stress response (cpxP and spy). However, it is likely that many genes identified in our study are not involved in biofilm-specific functions but, rather, correspond to adaptive responses to the biofilm environment. Indeed, mutations in many biofilm-induced genes that correspond to information storage and processing, metabolism, cellular processes and unknown functions have no effect on TG1 biofilm formation (Table 1).

Moreover, 48% of the genes significantly overexpressed in biofilms versus exponential growth phase were of uncharacterized function. Compared with 19.6% of such genes found in the E. coli genome (Serres et al., 2001), this high proportion of genes of unknown functions expressed in mature biofilm suggests that new aspects of E. coli biology are adopted during biofilm formation. We show that 11 of these uncharacterized genes are necessary for full mature biofilm formation, thus experimentally assigning them a biofilm-related function (Table 1, Figs 3 and S2). Among them, five encode putative membrane proteins that could be of particular relevance when considering the importance of envelope-related physiology within a biofilm.

Consistent with the drastic phenotypical changes occurring inside biofilms, we found that 15% of the genes identified as over- or underexpressed in biofilms versus exponential growth phase are involved in either energy processes or carbohydrate metabolism (Fig. 1, Tables 1 and S1). Despite the presence of polysaccharides in the TG1 biofilm (data not shown), we could not clearly associate the expression of any of those genes with the production of the biofilm matrix (i.e. cellulose, colanic acid). This could reflect, among other explanations, a lack of sensitivity of our approach as a result of the averaging occurring while extracting transcription information from the heterogeneous bacterial biofilm population.

A partial comparison of the most overexpressed genes in our analysis (more than twofold) and in the study by Schembri et al. (2003) (more than eightfold) revealed only a few genes identified as overexpressed in E. coli biofilm in both studies (rbsB, b0836, yfjO, yceP, glgS, ydeW, yneA, yqeC, ylcC, rplV, rplD, rpsS, b1550, rplP, rpsR, flu, rplM, ppc, oppA, gatD, cydA, atpB, rpsN, malK, atpG). Three of these genes (rbsB, yceP, yneA) were nevertheless also found here to be required for mature biofilm formation. This relatively low overlap between the two studies may result from technical differences, as the studies differ in strain background, media and experimental setup. It could also reflect the difference in the gene expression pattern between two biofilms at very different stages of maturation [i.e. young and thin biofilms in Schembri et al. (2003) versus mature and thick biofilms]. Further studies comparing the expression profile of E. coli biofilms at different maturation stages within the same experimental setup will provide a more dynamic view of biofilm gene expression.

Heterogeneity of oxygen conditions in E. coli K-12 biofilms

Biofilms are heterogeneous environments and, with respect to aerobiosis, our analysis supports this view. In the main experiment described in this study, we compared exponentially grown agitated flask cultures with TG1 biofilm in aerated conditions. Under these conditions, numerous genes known to be induced by aerobiosis were also induced in biofilms, including some genes for TCA cycle enzymes (e.g. aceB, cyo operon members, fadB, mdh, glpD, sucAB). In addition, some genes known to be repressed by aerobiosis were repressed in biofilms (e.g. adhE, cydAB, dcuC, focA, fumB). This tends to indicate that our biofilms were mainly grown under aerobic conditions. Consequently, we also compared differential gene expression between TG1 biofilms and TG1 planktonic cultures, both grown in aerated fermenters (data not shown). In this configuration, we clearly observed that some typical aerobic genes were induced in biofilms, whereas others were repressed. This was also the case for typical anaerobic genes. This could reflect the heterogeneity of the aerobic conditions in biofilms, in which external bacteria are in contact with oxygen while internal bacteria are in conditions close to anaerobiosis.

Stress responses in biofilms

Our study revealed that a major physiological response to biofilm formation is the induction of stress responses. Interestingly, such a stress response induction may also take place in P. aeruginosa biofilms. Indeed, the most highly activated genes identified in a P. aeruginosa biofilm transcriptome analysis were those of temperate bacteriophages (Whiteley et al., 2001). As stresses are known to induce prophages and other mobile genetic elements, our results suggest that Pseudomonas prophage induction may be a consequence of stresses created by the drastic conditions that prevail inside the biofilm. As such, stress may well be a key factor in the mechanisms that lead to the observed antibiotic resistance inside biofilm communities.

Owing to the possible role of cell–cell and cell–surface interactions in biofilm, it may be of significance that envelope stress genes such as cpxP, spy and the psp genes are consistently induced in this environment. CpxP may inhibit the cpx-mediated induction through a direct interaction with the two-component system sensor CpxA, whereas Spy may play a similar role in the rpoE pathway (Raivio et al., 2000). The cpx system is known to respond to envelope stresses such as overproduction and misfolding of membrane proteins or elevated pH (Raivio and Silhavy, 2001). However, relatively little is known about the physiological role of envelope stress responses. Recently, adhesion of E. coli cells to hydrophobic but not hydrophilic surfaces was shown to activate the cpx system, including cpxP, through a process called surface sensing, which requires both cpxR and nlpE (Otto and Silhavy, 2002). Consistently, we find that cpxP and spy are highly induced in mature biofilms where bacteria are de facto in contact with the hydrophobic surfaces of other cells.

Our results thus provide additional experimental evidence that stress response pathways are key factors in biofilm formation. The structure of biofilms grown in microfermenters is altered in a cpxP mutant (Fig. 6) and, to a lesser extent, in a spy mutant. Observation of spy mutant biofilms by transmission electron microscopy also revealed a high proportion of spheroplasts compared with wild-type TG1 (data not shown), suggesting a possible cause for the affected structure of the biofilm in this mutant. In addition, a cpxP and a cpxR mutant are both impaired in forming wild-type microcolonies (Fig. 4). This strongly corroborates the idea that cpxP and cpxR mutants have reduced cell-to-cell adherence, as any upward growth into the water column will be counteracted by the shearing forces of the flow. It appears, then, that the inappropriate expression of the cpx-regulated genes in biofilm, i.e. a derepression of the cpx regulon in the cpxP mutant or an absence of induction of the cpx regulon in the cpxR mutant, leads to an alteration in the process of biofilm formation. Considering the importance of environmental conditions in biofilm formation, two-component systems, which sense perturbations or changes in the bacterial environment, might play a regulatory role in bacterial biofilm formation, a proposal that requires further investigation.

Our analysis identified the biofilm mode of growth as an environment that induces the expression of the pspABCDE stress operon. However, no biofilm-related phenotype could be observed in a strain deleted for the pspABCDE operon. Nevertheless, the deletion of pspF, a constitutively expressed positive regulator of the pspABCDE operon, affects biofilm formation (Fig. S2). As pspABCDE is not required for biofilm formation, pspF might also regulate a biofilm-related locus that is not part of the pspABCDE operon. Evidence for such an additional PspF-regulated target has been provided in the case of the Yersinia enterocolitica psp regulon (Darwin and Miller, 2001).

Changes in gene expression and biofilm development

The changes in gene expression demonstrated here and in other studies could be considered as either part of the E. coli biofilm development (needed for maturation) or caused by the conditions progressively created within the biofilm during its maturation (consequence of the maturation). The first hypothesis implies that biofilm formation is a developmental process in which genetic checkpoints could control the maturation of the biofilm by inducing a succession of biofilm-specific genes. Whereas eight of the 54 mutants created in this study display a 50% decrease in biofilm biomass and maturation, none of the mutations leads to a total loss of biofilm formation. Considering the existence of multiple and partially overlapping or complementing pathways that can lead to biofilm formation, this result, without formally excluding the existence of a biofilm developmental programme, rather speaks in favour of the second working hypothesis. In this case, most changes observed in biofilm gene induction could be a consequence of, rather than a prerequisite for, biofilm maturation.

The results presented here provide new insights into the global effect triggered by biofilm formation in E. coli. By monitoring the changes in gene expression occurring in mature biofilms, we have identified biofilm-related physiological pathways and previously uncharacterized biofilm-induced genes. This may lead to new biofilm control strategies that will probably hinge upon a better understanding of biofilm-induced physiological responses.

Experimental procedures

Bacterial strains and culture conditions

Bacterial strains used in this work are described in Table 2. All experiments were performed in 0.4% glucose M63B1 minimal medium at 37°C except flow chamber experiments, which were performed at 30°C in 0.02% glucose FAB minimal medium. Proline was added at 400 µg ml−1 for TG growth.

Early adhesion and biofilm formation assay

Microtitre plate assays were performed as described by O’Toole and Kolter (1998). Biofilm development comparisons in aerated microfermenters were conducted as described by Ghigo (2001). The biofilms formed on the removable glass slide were photographed and then resuspended in 10 ml of M63B1 minimal medium. The optical density at 600 nm (OD600) of the resuspension was then measured. After 24 h, the average resuspended E. coli TG1 biofilm biomass reached OD600 = 5. Each mutant was tested in at least three independent experiments alongside the control strain TG1.
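As an illustration of how the OD600 readings of resuspended biofilm can be turned into a biomass comparison with the control strain, the sketch below averages independent experiments and expresses the mutant as a fraction of TG1. The readings are hypothetical values chosen for the example, and the simple ratio of means is an assumption of this sketch rather than a calculation described in the text.

```python
from statistics import mean


def relative_biomass(mutant_od600, control_od600):
    """Mean mutant OD600 expressed as a fraction of the mean control OD600."""
    # The text states that each mutant was tested in at least three
    # independent experiments alongside TG1.
    if len(mutant_od600) < 3 or len(control_od600) < 3:
        raise ValueError("at least three independent experiments are expected")
    return mean(mutant_od600) / mean(control_od600)


# Hypothetical readings, for illustration only (not data from this study).
tg1_readings = [5.1, 4.8, 5.1]
mutant_readings = [2.4, 2.6, 2.5]

ratio = relative_biomass(mutant_readings, tg1_readings)
print(f"mutant biomass relative to TG1: {ratio:.0%}")
```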

Macroarray analysis

Genomic expression profiles were determined for E. coli TG1 and TG strains grown in 0.4% glucose M63B1 at 37°C as either planktonic cultures or mature biofilms. Planktonic cultures were grown in agitated Erlenmeyer flasks (main experiment) or aerated microfermenters, both in exponential phase (OD600≈ 0.6) or stationary phase (OD600≈ 3). Mature biofilms were grown in aerated microfermenters (8- and 5-day-old biofilms for TG1 and TG respectively). For all conditions, the equivalent of 15 OD600 of bacterial cells was collected. The cells were then broken in a Fast Prep apparatus (Bio 101). Total RNA was extracted by Trizol (Gibco BRL) treatment. Genomic DNA was degraded using the DNA-free™ kit (Ambion). Radioactively labelled cDNAs, generated using E. coli K-12 CDS-specific primers (Sigma-GenoSys), were hybridized to E. coli K-12 Panorama gene arrays containing duplicated spots for each of the 4290 predicted E. coli K-12 open reading frames (ORFs; Sigma-GenoSys). The intensity of each dot was quantified with the xdotsreader software (Cose) as described by Hommais et al. (2001). Experiments were carried out using three independent RNA preparations of TG1 planktonic flask cultures versus TG1 biofilm. For the F-free TG experiment and the TG1 planktonic fermenter versus TG1 biofilm experiments, two independent RNA preparations were used. Each hybridization with each independent sample was carried out with 1 µg and 10 µg of total RNA. Comparison of the signal intensity of arrays from duplicates or from independent hybridizations showed that the results were highly reproducible (data not shown).
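The sketch below illustrates, under simplifying assumptions, how duplicated spot intensities could be combined into a biofilm versus planktonic expression ratio and compared with the twofold criterion used in this study. It does not reproduce the quantification and normalization procedure of Hommais et al. (2001); the function names and the intensities are hypothetical.

```python
def spot_mean(duplicates):
    """Average the duplicated spots recorded for one ORF on one array."""
    return sum(duplicates) / len(duplicates)


def expression_ratio(biofilm_spots, planktonic_spots):
    """Biofilm / planktonic ratio of the duplicate-averaged intensities."""
    return spot_mean(biofilm_spots) / spot_mean(planktonic_spots)


def classify(ratio, threshold=2.0):
    """Label a gene according to a 'more than twofold' criterion."""
    if ratio > threshold:
        return "overexpressed"
    if ratio < 1.0 / threshold:
        return "underexpressed"
    return "below threshold"


# Hypothetical normalized intensities for a single ORF (illustration only).
biofilm = [8.0, 10.0]
planktonic = [2.0, 4.0]

ratio = expression_ratio(biofilm, planktonic)
print(ratio, classify(ratio))
```

Genes labelled "below threshold" by such a criterion may still be statistically significantly different, which is why the fold-change cut-off and the significance test are applied separately.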

Statistical analysis of the macroarray data

Genes that were statistically significantly over- and underexpressed were identified using the non-parametric Wilcoxon rank sum test. For each gene, the expression levels in E. coli TG1 flask exponential and stationary planktonic cultures (n = 10 and n = 12 respectively), TG flask planktonic cultures (n = 4), TG1 fermenter planktonic culture (n = 4) and TG1 biofilm (n = 10) or TG biofilm (n = 4) were compared. Analyses were performed with one-tailed tests. Genes were considered to be statistically significantly over- or underexpressed when P < 0.05. Low (<0.01) or negative levels of expression were removed from the analysis.
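For readers who wish to see the test in concrete form, the following is a minimal sketch of a one-tailed Wilcoxon rank sum test computed exactly by enumerating all possible group assignments, which is feasible for sample sizes such as those used here. The software used for the original analysis is not stated, so whether an exact or an approximate P-value was computed is an assumption of this sketch, as is the per-value reading of the expression filter. All names and data are hypothetical.

```python
from itertools import combinations


def usable(values, floor=0.01):
    """Drop low (< floor) or negative expression values before testing."""
    return [v for v in values if v >= floor]


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_p_greater(sample_a, sample_b):
    """Exact one-tailed P-value for sample_a tending to exceed sample_b."""
    ranks = midranks(list(sample_a) + list(sample_b))
    n_a = len(sample_a)
    observed = sum(ranks[:n_a])
    total = 0
    as_extreme = 0
    # Enumerate every way of assigning n_a of the pooled ranks to group A.
    for combo in combinations(range(len(ranks)), n_a):
        total += 1
        if sum(ranks[i] for i in combo) >= observed:
            as_extreme += 1
    return as_extreme / total


# Hypothetical expression values for one gene (illustration only).
biofilm = usable([5.1, 6.0, 5.5, 6.2])
planktonic = usable([1.0, 1.2, 0.9, 1.1])

p_value = rank_sum_p_greater(biofilm, planktonic)
print("overexpressed in biofilm" if p_value < 0.05 else "not significant")
```

With four values per group there are 70 possible assignments, so the smallest attainable one-tailed P-value is 1/70; a test for underexpression is obtained by swapping the two arguments.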

Disruption of genes identified through macroarray analysis

fimA, msrA, recA, cpxA and pspF mutants were transferred to TG1 by P1 transduction. For the other genes, a non-polar mutation that deletes the entire target gene from the initiation to the stop codon was created by allelic exchange with the non-polar aphA gene cassette from Tn903. We used a three-step PCR procedure as described by Chaveroche et al. (2000) and Derbise et al. (2003) and detailed at

The primers used to inactivate the 54 genes presented in this study, as well as nlpE and cpxR genes, are described in the Supplementary material (Table S4).

Quantitative RT-PCR

Quantitative RT-PCR was used to confirm the DNA macroarray data. The total RNA preparations used for the macroarray analysis were also used for real-time PCR and RT-PCR. PCR and RT-PCR were performed using a LightCycler (Roche Diagnostics). The RNA preparation was subjected twice to DNase I (Roche Diagnostics) treatment for 30 min at room temperature to remove any contaminating genomic DNA. The enzyme was then inactivated for 15 min at 65°C in the presence of 2.5 mM EDTA. Samples were checked for residual genomic DNA by real-time PCR using the cpxP-RT-5 and cpxP-RT-3 primers (see Supplementary material, Table S5). Reactions were performed in a 20 µl reaction volume using LightCycler FastStart DNA master SYBR Green I (Roche Diagnostics) according to the manufacturer's instructions. RNA samples were considered to be free of genomic DNA if no amplification was detected after at least 35 cycles of amplification. Quantitative RT-PCRs were performed twice with two independent RNA preparations and using primers specific for several biofilm upregulated genes (see Supplementary material, Table S5) or control 16S rDNA primers (TM1, 5′-ATGACCAGCCACACTGGAAC-3′; and TM2, 5′-CTTCCTCCCCGCTGAAAGTA-3′) with 50 ng of total RNA. Control 16S rDNA primers were always used to ensure the same quantity of total RNA in each reaction sample. Quantification of mRNA or 16S rRNA (as control) was done using RNA master SYBR Green I (Roche Diagnostics) according to the manufacturer's instructions. Amplification of a single PCR product was confirmed by melting curve analysis and electrophoresis on 2% agarose gels.

Construction of GFP-tagged strains

The strain TG1gfp was constructed by integration at the λ-att site of a bla-gfpmut3 cassette amplified from plasmid pZER1-GfpSal using a three-step PCR procedure (Table S4) as described by Chaveroche et al. (2000) and Derbise et al. (2003). Plasmid pZER1-GfpSal was a gift from C. C. Guet, in which the gfpmut3 gene (Cormack et al., 1996) is controlled by the lambda right promoter. Strains TG1gfpΔycfJ, TG1gfpΔyccA, TG1gfpΔcpxP and TG1gfpΔcpxR were constructed by P1vir transduction into TG1gfp.

Flow chamber experiments

Biofilms were cultivated at 30°C in three-channel flow cells with individual channel dimensions of 1 × 4 × 40 mm. The flow system was assembled and prepared as described previously (Christensen et al., 1999). A microscope glass coverslip (Knittel 24 × 50 mm st1; Knittel Gläser) was used as substratum for biofilm growth.

Inocula were prepared as follows: 16–20 h overnight cultures in LB supplemented with the appropriate antibiotics were harvested and resuspended in 0.9% NaCl; 250 µl of OD600-normalized dilutions in 0.9% NaCl (OD600 = 0.05) were injected into each flow channel after medium flow was arrested. Flow was started 1 h after inoculation at a constant rate of 3 ml h−1 using a Watson Marlow 205S peristaltic pump.
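The flow chamber parameters given above allow a few simple derived quantities to be worked out. The sketch below computes the volume of one channel, the nominal time needed for the medium flow to replace one channel volume, and the dilution needed to bring a resuspended culture to OD600 = 0.05. The culture density used in the dilution example is hypothetical, and the calculation assumes that OD600 scales linearly with dilution.

```python
# Channel dimensions and flow rate are taken from the text.
CHANNEL_MM = (1.0, 4.0, 40.0)  # three channel dimensions, in mm
FLOW_ML_PER_H = 3.0


def channel_volume_ul(dims_mm=CHANNEL_MM):
    """Channel volume in microlitres (1 cubic mm equals 1 microlitre)."""
    a, b, c = dims_mm
    return a * b * c


def residence_time_min(volume_ul, flow_ml_per_h=FLOW_ML_PER_H):
    """Nominal time for the flow to replace one channel volume, in minutes."""
    flow_ul_per_min = flow_ml_per_h * 1000.0 / 60.0
    return volume_ul / flow_ul_per_min


def culture_volume_ul(target_od, final_volume_ul, culture_od):
    """Volume of culture needed for the target OD600 (C1 * V1 = C2 * V2)."""
    return target_od * final_volume_ul / culture_od


volume = channel_volume_ul()
residence = residence_time_min(volume)
# Hypothetical resuspended culture at OD600 = 2.5, diluted to 250 µl at 0.05.
inoculum_culture = culture_volume_ul(0.05, 250.0, 2.5)

print(f"channel volume: {volume:.0f} µl")
print(f"nominal residence time: {residence:.1f} min")
print(f"culture per 250 µl inoculum: {inoculum_culture:.1f} µl")
```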

Microscopy and image analysis

Biofilm development in microfermenters was recorded with a Nikon Coolpix 950 digital camera. Transmission and scanning electron microscopy were performed on biofilm grown in microfermenters on Thermanox slides (Nalgene) attached to the internal removable glass slide and treated as described by Prigent-Combaret et al. (2000).

For flow chamber experiments, microscopic observations and image acquisitions were performed on a Zeiss LSM510 scanning confocal laser microscope (Carl Zeiss). Images were obtained using a 40×/1.3 Plan-Neofluar oil objective. Simulated three-dimensional images were generated using the Imaris software package (Bitplane). Images were processed further for display using Adobe Photoshop. For COMSTAT analysis (Heydorn et al., 2000) and quantification of the E. coli biofilm development with the wild type and the different mutants, each strain was grown in two separate channels, and six image stacks were acquired randomly down through each channel at different time points (20 h, 45 h, 70 h and 95 h after inoculation).


We thank Claude Lebos for preparation of the SEM micrographs. We are grateful to E. Krin for assistance in macroarray procedure. We thank T. Pugsley and C. Dorel for the kind gift of some strains used in this study, and A. Idja, J. Bellalou and R. Longin (PT5, Institut Pasteur) for technical assistance. We thank S. Da Re, B. Lakowski, C. Buchrieser, U. Dobrindt, T. Msadek, I. Lasa, M. Swanson and P. Delepelaire for critical reading of the manuscript. We also thank the referees for their helpful suggestions. This work was supported by grants from the ‘PRFMMIP – Réseau Infections Nosocomiales’ and the Institut Pasteur.

Supplementary material

The following material is available from

Fig. S1. COG functional classes for genes underexpressed in TG1 biofilm versus exponential growth phase.

Fig. S2. Functional profiling of mature E. coli biofilm: biofilm formation in microfermenters.

Fig. S3. Functional profiling of early steps in E. coli biofilm formation.

Table S1. Genes overexpressed in E. coli TG1 biofilm versus exponential growth phase.

Table S2. Genes underexpressed in E. coli TG1 biofilm versus exponential growth phase.

Table S3. Genes overexpressed (≥2) in E. coli TG1 biofilm versus both exponential and stationary growth phase.

Table S4. Inactivation of the genes described in this study and TG1gfp strain construction: primers used in the linear DNA, three-step PCR inactivation protocol.

Table S5. Primers used for the Q-RT-PCR experiments.