Correspondence: Jeff Cole, School of Biosciences, University of Birmingham, Birmingham B15 2TT, UK. Tel.: +44 121 414 5440; fax: +44 121 414 5495; e-mail: firstname.lastname@example.org
A C-terminal green fluorescent protein (GFP) fusion to a model target protein, Escherichia coli CheY, was exploited both as a reporter of the accumulation of soluble recombinant protein, and to develop a generic approach to optimize protein yields. The rapid accumulation of CheY∷GFP expressed from a pET20 vector under the control of an isopropyl-β-d-thiogalactoside (IPTG)-inducible T7 RNA polymerase resulted not only in the well-documented growth arrest but also in loss of culturability and overgrowth of the productive population by plasmid-deficient bacteria. The highest yields of soluble CheY∷GFP as judged from the fluorescence levels were achieved using very low concentrations of IPTG, which avoid growth arrest and loss of culturability postinduction. Optimal product yields were obtained with 8 μM IPTG, a concentration so low that insufficient T7 RNA polymerase accumulated to be detectable by Western blot analysis. The improved protocol was shown to be suitable for process scale-up and intensification. It is also applicable to the accumulation of an untagged heterologous protein, cytochrome c2 from Neisseria gonorrhoeae, which requires both secretion and extensive post-translational modification.
Many biopharmaceutical projects require the production of recombinant proteins in heterologous bacterial hosts. When the protein itself is the end product, the requirement for rapid, bulk production in high yield drives the design of a successful process. At the other extreme, high-quality protein is often required as the starting point for NMR or X-ray crystallographic structure determination, or for understanding the biology underlying a process. Quality rather than quantity or production intensity now becomes the overriding requirement.
The ability to express almost any gene at a controllable level makes bacterial hosts and plasmids attractive vehicles for generating the desired product. Despite the availability of a plethora of expression systems, detailed knowledge of the genome sequences, molecular biology, physiology and biochemistry of a range of production hosts, many proteins remain difficult to produce at the scale or quality required. Frequently encountered problems include the deposition of the target protein in insoluble inclusion bodies (Villaverde & Carrio, 2003), lysis of the production host due to the physiological stress induced by high-level synthesis of mRNA and the heterologous protein (Gill et al., 2000), and accumulation of multiple fragments of the target protein due to proteolysis (Dürrschmid et al., 2008).
The primary cause of many of the problems is the accumulation of incorrectly folded intermediate forms of the target protein. In bacteria such as Escherichia coli, this is a signal that induces not only the general stress response, but also other overlapping stress responses (Hoffmann & Rinas, 2004; Gasser et al., 2008). If the correct folding of the target protein is the only problem to be solved, overexpression of chaperones might be sufficient to achieve success, for example, by preinducing the RNA polymerase RpoH regulon with a heat shock (Hoffmann & Rinas, 2004), or the coexpression of groEL, dnaK or other chaperone genes (Nishihara et al., 1998; Chen et al., 2003; Schrodel et al., 2005; Mitsuda & Iwasaki, 2006; de Marco, 2007; Hu et al., 2007). More often, however, failure is due directly to the consequences of the induction of the RpoH-dependent stress response (Rabhi-Essafi et al., 2007; Vera et al., 2007; Lin et al., 2008) especially the disaggregation complex in which DnaK, ClpB and IbpAB remove aggregated recombinant proteins for proteolysis (reviewed by Gasser et al., 2008). We now report results of experiments designed to analyse the physiological cause of failure to accumulate a soluble, cytoplasmic recombinant protein, and the design of a generic strategy to minimize the problem. Although our main model system is based upon the production of the E. coli chemotaxis protein CheY, fused with a C-terminal green fluorescent protein (GFP) tag, we show that a similar approach can be used to accumulate an untagged recombinant protein that requires both secretion to the periplasm and extensive post-translational modification.
Materials and methods
Escherichia coli strain and plasmids
The E. coli strain BL21*(DE3) (Invitrogen) and its derivatives, C41 and C43 (Miroux & Walker, 1996), were used for recombinant protein expression work. Escherichia coli strain JM109 (Promega) was used to clone the cytochrome c2 gene from Neisseria gonorrhoeae. Overexpression of the CheY∷GFP fusion gene or the N. gonorrhoeae gene encoding cytochrome c2 was induced from the isopropyl-β-d-thiogalactoside (IPTG)-inducible T7 promoter of the expression vectors pET20bhc-CheY∷GFP and pET20bhc-c2, respectively, both of which are derived from a slightly modified version of pET20b (Novagen; Waldo et al., 1999; Jones et al., 2004).
The cccA gene (accession number NGO0292) encoding cytochrome c2 was amplified from N. gonorrhoeae strain F62 genomic DNA using primers CTACGTCATATGAACACAACCCG and CATAGGGATCCTTAGAAAGGTTTGATTTG (incorporating NdeI, CATATG, and BamHI, GGATCC, restriction sites, respectively) and PCR SuperMix High Fidelity (Invitrogen) according to the manufacturer's instructions. The thermal cycling profile included one cycle at 94 °C for 3 min, 10 × (94 °C for 30 s, 40 °C for 30 s, 68 °C for 1 min), 27 × (94 °C for 30 s, 55 °C for 30 s, 68 °C for 1 min) and 1 × 68 °C for 10 min. The 465 bp PCR product was cloned into pGEM-T Easy (Promega), sequenced and the cccA fragment was transferred as an NdeI–BamHI fragment into pET20bhc vector digested with NdeI and BamHI. In experiments to accumulate mature cytochrome c2 from N. gonorrhoeae, bacteria were cotransformed with the second plasmid, pST2, that encodes the E. coli cytochrome c maturation proteins, CcmA-H (Turner et al., 2003).
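As a quick sanity check on the primer design described above, the following minimal Python sketch locates the NdeI and BamHI recognition sequences within the two primers. The primer sequences are copied from the text; the function and constant names are illustrative only and are not part of the original methods.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}

# Primer sequences copied from the text (written 5' to 3').
FORWARD = "CTACGTCATATGAACACAACCCG"
REVERSE = "CATAGGGATCCTTAGAAAGGTTTGATTTG"


def find_site(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


if __name__ == "__main__":
    for name, primer, enzyme in (("forward", FORWARD, "NdeI"),
                                 ("reverse", REVERSE, "BamHI")):
        print(f"{name} primer: {enzyme} site at index {find_site(primer, enzyme)}")
```

Each primer carries a short 5' overhang ahead of its site, and neither primer contains the other enzyme's site.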
Growth conditions for the standard protocol
Bacteria were grown aerobically in 100-mL shake flasks with 20 mL working volume or in a 3.6-L bioreactor (Infors) with a 2.8 L working volume of Luria–Bertani (LB) medium (10 g tryptone, 5 g yeast extract, 5 g NaCl) supplemented with 2% (w/v) glucose and 100 μg mL−1 carbenicillin for plasmid maintenance. The shake flasks were set up in duplicate or triplicate for each culture condition. In fermentations, aeration was maintained at 1 vvm and a stirring speed of 700 r.p.m. The pH was controlled at 6.3 by the automated addition of 5% (v/v) HCl and 10% (v/v) NH3, and 0.1% (v/v) silicone antifoam was added to the medium to prevent foaming of the culture during the late stages of the fermentation. The medium was inoculated with 2% (v/v) of seed culture grown aerobically at 30 °C for approximately 14 h. Bacteria were grown at 37 °C to an OD650 nm of approximately 0.5, at which point recombinant protein expression was induced with 0.5 mM IPTG. Bacteria were grown at 25 °C thereafter to facilitate correct folding of the recombinant protein and, for the bioreactor, threonine, serine and asparagine were added at a final concentration of 1 mM 5 h postinduction. Culture samples were taken before induction and at various intervals up to 25 h postinduction. The OD650 nm and fluorescence of the culture were measured using serial dilutions with phosphate-buffered saline (Sambrook et al., 1989) and were used to determine the bacterial biomass and the yield of soluble CheY∷GFP, respectively.
Growth conditions for the improved protocol
Bacteria were inoculated and grown in the same medium as that used for the standard protocol. However, bacteria were grown aerobically at 25 °C to an OD650 nm of approximately 0.5 at which point recombinant protein production was induced with 8 μM IPTG. Bacterial growth was continued at 25 °C for up to 70 h postinduction.
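The inducer additions for the two protocols follow from the usual dilution relation C1V1 = C2V2. The sketch below illustrates the arithmetic only; the 10 mM stock concentration is a hypothetical value chosen for the example and is not stated in the text, and the small volume contributed by the stock itself is neglected.

```python
ASSUMED_STOCK_MM = 10.0  # hypothetical IPTG stock concentration (mM); not from the text


def iptg_stock_volume_ul(final_um: float, culture_ml: float,
                         stock_mm: float = ASSUMED_STOCK_MM) -> float:
    """Volume of IPTG stock (uL) needed to reach `final_um` (uM) in `culture_ml` (mL).

    Uses C1*V1 = C2*V2 and neglects the volume added by the stock.
    Units: uM * mL / mM = uL.
    """
    if final_um < 0 or culture_ml <= 0 or stock_mm <= 0:
        raise ValueError("concentrations and volumes must be positive")
    return final_um * culture_ml / stock_mm


if __name__ == "__main__":
    # Improved protocol: 8 uM IPTG in a 20 mL shake-flask culture.
    print(iptg_stock_volume_ul(8.0, 20.0))
    # Improved protocol in the 2.8 L bioreactor working volume.
    print(iptg_stock_volume_ul(8.0, 2800.0))
    # Fold difference in inducer between the standard (0.5 mM) and improved (8 uM) protocols.
    print(500.0 / 8.0)
```

Under these assumptions the improved protocol uses 62.5-fold less inducer than the standard protocol.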
A seed culture (2% v/v) of BL21*(DE3)/CheY∷GFP was used to inoculate a 3.6-L bioreactor containing 1.5 L LB supplemented with 0.5% glucose and carbenicillin (100 μg mL−1). Threonine, serine and asparagine were added at a final concentration of 1 mM 5 h postinduction. Bacteria were grown and induced according to the improved protocol, with aeration of 6 vvm and a stirring speed of 900 r.p.m. Feeding at a constant rate of 20 mL h−1 was started 6–8 h postinduction. The 0.6 L of feed contained 10 × NaCl-free LB, 20% glucose, 100 μg mL−1 carbenicillin, 8 μM IPTG and 10 mM serine, threonine and asparagine. The pH was maintained at 6.3 by the automatic addition of 10% NH3.
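The feeding schedule implied by these figures can be worked through with a short calculation. The sketch below is illustrative only: it assumes the whole 0.6 L of feed is delivered at the stated constant rate and ignores sampling, titrant addition and evaporation; the function names are hypothetical.

```python
INITIAL_VOLUME_L = 1.5      # starting LB volume in the bioreactor (from the text)
FEED_VOLUME_L = 0.6         # total feed volume (from the text)
FEED_RATE_ML_PER_H = 20.0   # constant feed rate (from the text)
FEED_START_H = (6.0, 8.0)   # feed start window, hours postinduction (from the text)


def feed_duration_h(feed_volume_l: float = FEED_VOLUME_L,
                    rate_ml_per_h: float = FEED_RATE_ML_PER_H) -> float:
    """Hours needed to deliver the whole feed at a constant rate."""
    return feed_volume_l * 1000.0 / rate_ml_per_h


def culture_volume_l(hours_since_feed_start: float) -> float:
    """Culture volume (L), ignoring sampling, titrant addition and evaporation."""
    hours = max(hours_since_feed_start, 0.0)
    delivered_l = min(hours * FEED_RATE_ML_PER_H / 1000.0, FEED_VOLUME_L)
    return INITIAL_VOLUME_L + delivered_l


if __name__ == "__main__":
    duration = feed_duration_h()
    print(f"feed lasts {duration:.0f} h")
    print("feed ends (h postinduction):",
          [start + duration for start in FEED_START_H])
    print(f"final volume {culture_volume_l(duration):.1f} L")
```

On these assumptions the feed lasts 30 h, ends 36–38 h postinduction and brings the culture to about 2.1 L, which remains within the 3.6-L vessel.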
Accumulation of recombinant cytochrome c2 from N. gonorrhoeae in E. coli
Escherichia coli BL21*(DE3) (Invitrogen) containing pET20bhc-c2 and pST2 was used to produce mature cytochrome c2 in an anaerobic bioreactor. The fermentation medium contained 50% LB and 40% minimal salts, which contained, per litre of distilled water: 4.5 g KH2PO4, 10.5 g K2HPO4, 1 g (NH4)2SO4, 0.5 g sodium citrate, 0.05 g MgSO4·7H2O, 1 μM ammonium molybdate, 1 μM sodium selenate and 0.1% E. coli sulphur-free salts (per 1 L of distilled water: 82 g MgCl2·7H2O, 10 g MnCl2·4H2O, 4 g FeCl2·6H2O, 1 g CaCl2·6H2O, and 20 mL concentrated HCl). This medium was supplemented with 10 mM trimethylamine N-oxide dihydrate, 10 mM sodium fumarate, 2% glucose, 0.1% silicone antifoam, 100 μg mL−1 carbenicillin and 30 μg mL−1 chloramphenicol. The bioreactor was inoculated with 4% of a seed culture that had been grown for 16 h at 30 °C with aeration. The culture in the bioreactor was grown at 30 °C at 100 r.p.m. stirring speed without aeration; pH was controlled at 6.3 with 5% HCl and 10% ammonia. Protein expression was induced by adding 10 μM IPTG at OD650 nm of approximately 0.5. The bacterial culture samples were taken before induction and at various time points for up to 24 h postinduction.
Plating and replica plating for plasmid retention
To test the effect of recombinant protein production on the culturability of the bacterial host, serial dilutions of the bacterial culture were plated onto nonselective nutrient agar (Oxoid) and incubated at 30 °C. The proportion of plasmid-bearing bacteria was estimated by replica plating the resultant colonies on nutrient agar supplemented with carbenicillin (100 μg mL−1).
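The two quantities obtained from this plating procedure, culturability and plasmid retention, reduce to simple ratios. The following sketch shows one way the arithmetic might be organized; the function names and the example colony counts are hypothetical and are not data from this study.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Colony-forming units per mL of undiluted culture.

    `dilution_factor` is the fold dilution of the plated sample (e.g. 1e6 for 10^-6).
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be non-negative; dilution and volume positive")
    return colonies * dilution_factor / plated_volume_ml


def plasmid_retention(resistant_colonies: int, replica_plated_colonies: int) -> float:
    """Fraction of replica-plated colonies that also grow on carbenicillin."""
    if replica_plated_colonies <= 0:
        raise ValueError("at least one colony must be replica plated")
    if not 0 <= resistant_colonies <= replica_plated_colonies:
        raise ValueError("resistant colonies cannot exceed colonies tested")
    return resistant_colonies / replica_plated_colonies


if __name__ == "__main__":
    # Illustrative numbers only: 50 colonies from 0.1 mL of a 10^-6 dilution,
    # of which 45 grow after replica plating onto carbenicillin.
    print(f"{cfu_per_ml(50, 1e6, 0.1):.2e} CFU per mL")
    print(f"{plasmid_retention(45, 50):.0%} plasmid-bearing")
```

Comparing the CFU count with the count expected from the OD of the same sample gives an estimate of plating efficiency, which is the sense in which culturability is used here.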
Analysis of recombinant protein accumulation by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE)
Proteins were resolved by Tris/Tricine SDS-PAGE using a 15% (w/v) polyacrylamide gel (Sambrook et al., 1989) and stained with 0.2% (w/v) Coomassie Blue. Total protein was analysed from whole cell samples resuspended in 67 μL of sample buffer per OD650 nm unit so that the biomass per volume for all samples was the same. The samples were heated to 100 °C for 10 min before loading at the same volume. Recombinant protein yield was estimated by densitometry using the Quantity One software (Bio-Rad).
The yield of soluble and insoluble recombinant protein accumulated was determined from fractionated bacterial samples. Bacterial cell pellets were resuspended in BugBuster lysis reagent (67 μL of BugBuster per OD650 nm unit) (Novagen) and incubated at room temperature for 10 min with gentle shaking. The soluble and insoluble cell fractions were separated by centrifugation at 8500 g at 4 °C for 20 min. The separated fractions were resuspended in the same volume of sample buffer as the volume of BugBuster used for lysis and boiled at 100 °C for 10 min. To ensure that the samples contain equal biomass, two volumes of the soluble and one volume of the insoluble cell fractions were loaded and analysed by SDS-PAGE.
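The normalization of samples to equal biomass can be illustrated with a short calculation. The sketch below assumes the common convention that one OD unit corresponds to 1 mL of culture at an OD650 nm of 1.0; this convention is an assumption, as the text does not define the unit explicitly, and the function name is hypothetical.

```python
UL_PER_OD_UNIT = 67.0  # sample buffer or BugBuster per OD650 unit (from the text)


def resuspension_volume_ul(od650: float, sample_ml: float) -> float:
    """Volume (uL) in which to resuspend the pellet from `sample_ml` of culture.

    Assumes one OD unit = 1 mL of culture at OD650 = 1.0 (a convention,
    not stated in the text).
    """
    if od650 <= 0 or sample_ml <= 0:
        raise ValueError("OD and sample volume must be positive")
    return UL_PER_OD_UNIT * od650 * sample_ml


if __name__ == "__main__":
    # Pellets from 1 mL of culture harvested at different densities.
    for od in (0.5, 1.0, 3.0):
        print(f"OD650 {od}: resuspend in {resuspension_volume_ul(od, 1.0):.1f} uL")
```

Because the resuspension volume scales with the OD of the sample, equal loading volumes carry equal amounts of biomass irrespective of when the sample was taken.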
Proteins containing covalently attached haem were detected using haem-dependent peroxidase activity (Thomas et al., 1976).
Fluorescence of culture samples taken at various intervals throughout the experiment was measured using a Perkin-Elmer fluorescence spectrophotometer model 203 at settings that allowed accurate readings from a calibration curve. The accumulation of the soluble CheY∷GFP was detected using an excitation wavelength of 485 nm and an emission wavelength of 509 nm. The same culture diluted in phosphate-buffered saline (PBS; Sambrook et al., 1989) was used for OD measurement and for fluorescence measurement.
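Because OD and fluorescence are read from the same diluted sample, the specific fluorescence (fluorescence units per OD650 nm unit) does not depend on the dilution factor, provided that both readings are blank-corrected and lie within their linear ranges. The following sketch illustrates this point; the function names and the example readings are hypothetical.

```python
def specific_fluorescence(fluorescence: float, od650: float) -> float:
    """Fluorescence units per OD650 unit, from readings on the same sample."""
    if od650 <= 0:
        raise ValueError("OD650 must be positive")
    return fluorescence / od650


def undiluted_values(fluorescence: float, od650: float, dilution_factor: float):
    """Scale readings on a diluted sample back to the undiluted culture.

    Assumes both signals are blank-corrected and linear with dilution.
    """
    return fluorescence * dilution_factor, od650 * dilution_factor


if __name__ == "__main__":
    # Illustrative readings on a 10-fold diluted sample.
    f_dil, od_dil = 120.0, 0.3
    f_cult, od_cult = undiluted_values(f_dil, od_dil, 10.0)
    print(specific_fluorescence(f_dil, od_dil))    # from the diluted readings
    print(specific_fluorescence(f_cult, od_cult))  # from the back-calculated values
```

The total fluorescence of the culture, in contrast, does require the dilution factor and the culture volume.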
The proportion of green fluorescent bacteria overproducing the GFP-tagged recombinant protein in the culture and their physiological state were analysed by flow cytometry (BD FACSAria II: Becton, Dickinson & Co.). Bacteria were diluted in PBS at a final concentration of 10^5–10^6 mL−1 and analysed at a data rate of 1000–2000 events s−1. An 85 μm nozzle was used for the analysis. The red fluorescent dye propidium iodide (PI, Sigma) was added to the samples to stain dead bacteria. The PI stock solution was made up at 1 mg mL−1 in distilled water and used for staining at the working concentration of 5 μg mL−1 (Hewitt et al., 1999). All solutions were passed through a 0.2-μm filter immediately before use to remove particles. Backflush cleaning was applied between samples to prevent cross-contamination. The sample was excited with a 488-nm solid-state laser (13 mW). The software discriminator was set on the forward scatter to reduce electronic and small particle noise. Forward and side scatter data were collected along with GFP fluorescence (502LP, 530/30BP) and PI fluorescence (610LP, 616/23BP). For each experiment, 100 000 data points were collected and analysed using BD FACSDiva software (BD Biosciences).
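The population percentages reported from such data correspond to a two-parameter quadrant analysis of GFP and PI fluorescence. The sketch below is a simplified illustration of that bookkeeping and is not the FACSDiva analysis used in this study; the threshold values and function names are hypothetical, and in practice gates are set from unstained and single-stained controls.

```python
GFP_THRESHOLD = 1000.0  # hypothetical gate separating GFP- from GFP+ events
PI_THRESHOLD = 500.0    # hypothetical gate separating PI- from PI+ events

QUADRANTS = ("GFP+/PI-", "GFP+/PI+", "GFP-/PI-", "GFP-/PI+")


def classify(gfp: float, pi: float) -> str:
    """Assign one event to a quadrant from its GFP and PI fluorescence."""
    g = "GFP+" if gfp >= GFP_THRESHOLD else "GFP-"
    p = "PI+" if pi >= PI_THRESHOLD else "PI-"
    return f"{g}/{p}"


def quadrant_percentages(events):
    """Percentage of events in each quadrant; `events` yields (gfp, pi) pairs."""
    counts = {q: 0 for q in QUADRANTS}
    total = 0
    for gfp, pi in events:
        counts[classify(gfp, pi)] += 1
        total += 1
    if total == 0:
        raise ValueError("no events supplied")
    return {q: 100.0 * n / total for q, n in counts.items()}


if __name__ == "__main__":
    # Four made-up events: two productive live cells, one dead non-fluorescent
    # cell and one live non-fluorescent cell.
    demo = [(5000.0, 10.0), (4000.0, 20.0), (50.0, 900.0), (20.0, 5.0)]
    for quadrant, percent in quadrant_percentages(demo).items():
        print(f"{quadrant}: {percent:.1f}%")
```

In this scheme the GFP+/PI− quadrant corresponds to productive bacteria with intact membranes, and the PI+ quadrants to bacteria that have become permeable to the dye.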
For Western analysis, culture samples were resuspended in sample buffer as described above (SDS-PAGE section) and loaded onto NuPAGE 4–12% Bis–Tris gel (Invitrogen). The proteins were transferred onto Hybond-ECL nitrocellulose membrane (Amersham) in the Xcell II blot module (Invitrogen). The blots were incubated with T7 RNA polymerase antibodies (Novagen) and then with peroxidase-conjugated anti-mouse immunoglobulin G (Amersham) according to the manufacturers' instructions. The blots were developed using EZ-ECL Chemiluminescence detection kit (Biological Industries) according to the provided protocol.
Production of CheY∷GFP using a standard protocol
Many laboratories use the commercially available pET plasmids to express a cloned gene in an E. coli host under the control of T7 RNA polymerase that is both chromosomally encoded and regulated by an IPTG-inducible promoter. This system was used in initial experiments to produce a 42-kDa CheY∷GFP fusion protein (Jones, 2007). It had been observed that, in contrast to N-terminal GFP fusion proteins, optimization of fermentation conditions for the accumulation of a fluorescent recombinant protein with a carboxy-terminal GFP fusion provides a good prediction of how to generate the correctly folded N-terminal target protein without a fusion tag. Expression of the gene encoding CheY∷GFP cloned into plasmid pET20bhc-CheY∷GFP was induced with 0.5 mM IPTG at a low biomass density of 0.2 g dry mass L−1, and the temperature was decreased from 37 to 25 °C. Samples of the culture taken before induction and at intervals for 24 h postinduction were analysed for growth, fluorescence, plasmid retention, colony-forming ability and the accumulation of recombinant protein in both soluble and insoluble cell fractions.
Immediately after IPTG addition the OD of the culture increased only slowly; growth resumed after a lag of 10–14 h and continued for up to 24 h postinduction. Plating of serial dilutions of samples taken 2–4 h postinduction revealed that only about 1% of the bacteria were able to form colonies on nonselective agar, but high plating efficiency was restored after 24 h (Fig. 1a). In contrast to colonies from samples taken before induction that were pale green due to leaky expression of the recombinant protein, colonies from samples taken 24 h postinduction were white. This was readily shown to be due to overgrowth of the population by plasmid-free bacteria (Fig. 1a). Consistent with these results, SDS-PAGE analysis revealed a rapid burst of CheY∷GFP synthesis immediately postinduction (Fig. 1b), but little increase after a further 2–4 h. Furthermore, about 80% of the CheY∷GFP fusion protein accumulated in inclusion bodies in the insoluble fraction (Fig. 1b), which was almost nonfluorescent, indicating that GFP was incorrectly folded and therefore inactive.
Analysis of samples by flow cytometry revealed that the population preinduction was relatively homogenous but moderately fluorescent, reflecting the leakiness of the pET promoter (Fig. 2a). Fluorescence had increased substantially within 3 h of IPTG addition. However, the small population of nonfluorescent bacteria already present in the culture increased from 4% to 12% postinduction. After 25 h, only a minority of the bacteria in the culture were fluorescent due to overgrowth by unproductive, plasmid-free bacteria. Around 20% of the population were permeable to PI, indicating loss of viability, though these were split equally between fluorescent and nonfluorescent bacteria.
Optimization of soluble CheY∷GFP production by minimizing postinduction growth arrest
Although the consequences of rapid overexpression of cloned genes are well documented, less reported is the loss of CFUs during recombinant protein production (Striedner et al., 2003; Sundström et al., 2004). We therefore investigated whether it was possible to optimize the IPTG concentration not on the basis of quantity or speed of recombinant protein production, but on the maximum level of fluorescence under conditions that greatly decreased the general stress response. This involved growing the culture at the same temperature both before and after induction to avoid any stress caused by a change in temperature; and the determination of the concentration of the inducer, IPTG, that would allow maximum GFP fluorescence 24 h postinduction. Based upon these criteria, optimal results were obtained with cultures grown at 25 °C and induced with 8 μM IPTG, which had only a slight effect on exponential growth and avoided selection of plasmid-free bacteria (Fig. 1c and d). Flow cytometry analysis showed that fluorescence continued to increase for at least 25 h (Fig. 2b), and much greater homogeneity in the culture: around 98% of bacteria were in the GFP+ population both before induction and after 3 and 25 h postinduction, the number of nonviable PI+ bacteria at 25 h was greatly reduced compared with the original protocol, and the GFP− population actually decreased in size from 2% preinduction to 0.7% after 25 h (Fig. 2b). Analysis of samples from the culture by SDS-PAGE revealed that about 90% of the CheY∷GFP had accumulated throughout the induction phase in the soluble protein fraction, in contrast to <20% soluble product from the standard protocol (Fig. 1e). Furthermore, the yield of recombinant protein was fourfold higher from the improved protocol than from the standard protocol due to the production of a higher yield of biomass (Table 1). 
Control experiments established that the level of fluorescence attained by the uninduced BL21*(DE3)/CheY∷GFP cultures was typically at least twofold lower than that in the low IPTG induction protocol (not shown). Finally it was demonstrated that much higher yields of product could be generated following prolonged expression in fed-batch cultures (Table 1).
Table 1. Yields of recombinant CheY∷GFP from different types of culture
Column headings: Specific fluorescence (fluorescence units/OD650 nm); Total fluorescence (U)
Footnotes:
Biomass was calculated on the assumption that a culture with an OD650 nm of 1.0 contains 0.4 g dry mass L−1.
† The percentage of recombinant protein was estimated from SDS-PAGE gels by densitometry.
Recombinant protein yield was estimated based on the assumption that 70% of the bacterial culture dry mass is protein.
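The assumptions stated in the footnotes to Table 1 define a simple chain of conversions from OD to recombinant protein yield. The sketch below works through that chain; the two conversion factors are taken from the footnotes, whereas the function names and the example OD and densitometry values are illustrative and are not data from Table 1.

```python
DRY_MASS_G_PER_L_PER_OD = 0.4        # g dry mass per litre at OD650 = 1.0 (Table 1 footnote)
PROTEIN_FRACTION_OF_DRY_MASS = 0.70  # fraction of dry mass that is protein (Table 1 footnote)


def biomass_g_per_l(od650: float) -> float:
    """Dry biomass concentration (g per litre) estimated from OD650."""
    return od650 * DRY_MASS_G_PER_L_PER_OD


def recombinant_protein_g_per_l(od650: float, recombinant_fraction: float) -> float:
    """Recombinant protein (g per litre).

    `recombinant_fraction` is the fraction of total protein that is the
    recombinant product, as estimated by densitometry of SDS-PAGE gels.
    """
    if not 0.0 <= recombinant_fraction <= 1.0:
        raise ValueError("recombinant_fraction must lie between 0 and 1")
    total_protein = biomass_g_per_l(od650) * PROTEIN_FRACTION_OF_DRY_MASS
    return total_protein * recombinant_fraction


if __name__ == "__main__":
    # Induction at OD650 of about 0.5 corresponds to about 0.2 g dry mass per litre.
    print(f"{biomass_g_per_l(0.5):.2f} g dry mass per litre at induction")
    # Illustrative harvest values only: OD650 of 10 with 30% recombinant protein.
    print(f"{recombinant_protein_g_per_l(10.0, 0.30):.2f} g recombinant protein per litre")
```

The first calculation reproduces the induction biomass of 0.2 g dry mass L−1 quoted for an OD650 nm of approximately 0.5.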
Molecular basis for the increased accumulation of CheY∷GFP
Aware of the stress on the bacterial host associated with the use of the BL21/pET system, Miroux & Walker (1996) isolated mutants that were resistant to stress and therefore continued to accumulate recombinant protein far longer than their parent strain. Two of these strains were called C41 and C43, and it was subsequently shown that the basis for the improved performance was a down-mutation of the promoter of the T7 polymerase gene (Wagner et al., 2008). To determine whether the low concentration of IPTG coupled with the low expression temperature might simply limit production of the T7 RNA polymerase and hence explain why the improved protocol was successful, Western blots of samples from both the standard and improved protocols were probed with anti-T7 RNA polymerase antibody (Fig. 3). In contrast to the strong bands of cross-reacting antigen from the standard protocol, so little T7 polymerase was produced using the improved protocol that it was not visible.
We then compared yields of CheY∷GFP from strain BL21*(DE3) using the improved protocol with those from strains C41 and C43 generated using the standard protocol (Fig. 4). The level of fluorescence (per unit volume) from the improved protocol was the same as that from C41 and considerably higher than from C43 (not shown). The specific fluorescence (per unit biomass) at the point of harvest was highest for BL21*(DE3) using the improved protocol (Fig. 4a) and its cell density was only slightly lower than that for strain C41 (Fig. 4b).
High-level production of a secreted recombinant c-type cytochrome using the improved protocol
The model protein, CheY∷GFP, is a soluble, cytoplasmic protein. It was therefore of interest to determine whether the improved protocol for CheY∷GFP production was sufficiently generic to be exploited in the production of a secreted protein that requires extensive post-translational modification and assembly in the bacterial periplasm. A N. gonorrhoeae gene of unknown function predicted to encode a c-type cytochrome, which we have designated cytochrome c2, was cloned into the expression plasmid, pET20bhc, and expressed either using the standard protocol (induction with a high concentration of IPTG followed by a decrease in growth temperature from 37 to 25 °C) or the same optimized conditions that were used for CheY∷GFP. Under both sets of conditions, two bands of recombinant protein were detected by SDS-PAGE stained for total protein: the upper band was preapocytochrome c located in the cytoplasm; the lower band stained positively for covalently attached haem, confirming that it was mature cytochrome located in the periplasm. Using the standard protocol, cytochrome rapidly accumulated for a short time, but then production stopped as the culture was overgrown by plasmid-deficient bacteria (not shown). Even with the standard protocol, some mature cytochrome was produced, but >95% of the product was preapoprotein located in cytoplasmic inclusion bodies. In contrast, yields of mature cytochrome c2 from the improved protocol were so high that the resulting E. coli culture was slightly orange, and the cytochrome with covalently bound haem accumulated in the soluble, periplasmic fraction of the bacteria (Fig. 5). This demonstrated that the improved protocol had enabled post-translational secretion, periplasmic haem attachment and folding to keep pace with the synthesis of the preapoprotein. Furthermore, analysis by SDS-PAGE revealed that only a small percentage of the preapoprotein had accumulated in the cytoplasmic fraction, or been deposited into inclusion bodies (Fig. 5a).
Accumulation of other recombinant proteins using the improved protocol
Two other recombinant proteins, namely the cytochrome c peroxidase (CCP; 47 kDa) from N. gonorrhoeae (Turner et al., 2003) and a non-E. coli protein D-GFP (45 kDa) (Intellectual property; GSK), were produced using both approaches (data not shown). There was at least an eightfold increase in the yields of mature CCP with covalently attached haem and of soluble protein D-GFP when the modified approach was used compared with the normal protocol, reflecting the robustness of this approach for improving soluble recombinant protein yields regardless of size, properties or bacterial host origin.
The primary cause of failure to produce a correctly folded recombinant protein in high yield is well understood, namely, the accumulation of incorrectly folded intermediates due to rates of protein synthesis overwhelming post-translational modifications such as folding, secretion, folding into membrane-spanning helices or the incorporation of prosthetic groups (reviewed by Gasser et al., 2008). Different, and sometimes opposite, strategies depending on the properties of the target protein are required to solve these problems (Miroux & Walker, 1996; Soriano et al., 2002; Hoffmann & Rinas, 2004; Gasser et al., 2008; Wagner et al., 2008). Stress postinduction is especially severe when the IPTG-inducible T7 RNA polymerase system in the E. coli BL21 host is used to accumulate high concentrations of recombinant protein (Striedner et al., 2003). Many genetic strategies have been described to decrease this stress response, for example, selection of mutations that reduce expression rates of a recombinant protein (Miroux & Walker, 1996; Soriano et al., 2002; Wagner et al., 2008) or modulation of recombinant gene expression levels by limiting the amount of inducer (Striedner et al., 2003; Schultz et al., 2006). A suite of commercially available derivatives of BL21 and pET plasmids have also been designed to overcome these problems. Consequently the Holy Grail of recombinant protein production, the availability of generic protocols and hosts for the production of even the most difficult target product, has yet to be achieved: recombinant protein production remains as much an art as a science.
Although the consequences of rapid overexpression of cloned genes are well documented, less reported is the loss of CFUs during recombinant protein production (Striedner et al., 2003; Sundström et al., 2004). We therefore adopted a physiological approach to investigate whether it was possible to optimize the IPTG concentration not on the basis of quantity or speed of recombinant protein production, but on the yield of GFP fluorescence under conditions that greatly decreased the stress on the host. This involved growing the culture at the same temperature both before and after induction to avoid any stress caused by a change in temperature, and determination of the concentration of the inducer, IPTG, that would allow optimal yields of GFP fluorescence 24 h postinduction. This approach defined conditions that were suitable for the accumulation of two vastly different types of recombinant protein to levels approaching 30% of the total protein content of the bacteria. Experiments currently in progress are designed to determine whether this approach is applicable to other hosts and expression systems, or limited to expression systems based upon the bacteriophage T7 polymerase that, due to the very high rates of transcription postinduction, impose an excessive stress on the host bacterium (Soriano et al., 2002; Sørensen & Mortensen, 2005). It will be particularly interesting to know whether it can also be beneficial for other, less stressful expression systems.
This work was funded by the UK Biotechnology and Biological Sciences Research Council grant number BB/E005934/1 and a BBSRC CASE Research Studentship to S.A. The FACSAria II cell sorter was funded by BBSRC Research Equipment Initiative grant BBF0112371. We are grateful to Lesley Griffiths for excellent technical support.