
A synthetic gene increases TGFβ3 accumulation by 75-fold in tobacco chloroplasts enabling rapid purification and folding into a biologically active molecule


(fax +44 161 275 5082; email anil.day@manchester.ac.uk)


Human transforming growth factor-β3 (TGFβ3) is a new therapeutic protein used to reduce scarring during wound healing. The active molecule is a nonglycosylated homodimer composed of 13-kDa polypeptide chains linked by disulphide bonds. Expression of recombinant human TGFβ3 in chloroplasts and its subsequent purification would provide a sustainable source of TGFβ3 free of animal pathogens. A synthetic sequence (33% GC) containing frequent chloroplast codons raised accumulation of the 13-kDa TGFβ3 polypeptide by 75-fold compared to the native coding region (56% GC) when expressed in tobacco chloroplasts. The 13-kDa TGFβ3 monomer band was more intense than the RuBisCO 15-kDa small subunit on Coomassie blue–stained SDS-PAGE gels. TGFβ3 accumulated in insoluble aggregates and was stable in leaves of different ages but was not detected in seeds. TGFβ3 represented 12% of leaf protein and appeared as monomer, dimer and trimer bands on Western blots of SDS-PAGE gels. High yield and insolubility facilitated initial purification and refolding of the 13-kDa polypeptide into the TGFβ3 homodimer recognized by a conformation-dependent monoclonal antibody. The TGFβ3 homodimer and trace amounts of monomer were the only bands visible on silver-stained gels following purification by hydrophobic interaction chromatography and cation exchange chromatography. N-terminal sequencing and electrospray ionization mass spectrometry showed the removal of the initiator methionine and physical equivalence of the chloroplast-produced homodimer to standard TGFβ3. Functional equivalence was demonstrated by near-identical dose–response curves showing the inhibition of mink lung epithelial cell proliferation. We conclude that chloroplasts are an attractive production platform for synthesizing recombinant human TGFβ3.


Human transforming growth factor beta 3 or TGFβ3 (TenDijke et al., 1988) is an important therapeutic protein involved in wound healing (Okane and Ferguson, 1997). One hundred million patients are estimated to be scarred annually in the developed world, with more than half of these cases arising from elective surgery (Bayat et al., 2003; Young and Hutchison, 2009). Exogenous application of recombinant TGFβ3 reduces dermal scarring in humans and a range of different animal models (Shah et al., 1995; Ghosh et al., 2006; Occleston et al., 2008a,b, 2009; Ferguson et al., 2009). TGFβ3 exemplifies an emerging class of biologic therapeutics that pose a range of challenges in their manufacture. Unlike the majority of new small chemical entities, biologics often require substantial expenditure to provide sufficient material of high quality for use in clinical studies. This is particularly problematic for first-in-human studies where relatively small amounts of material are required for clinical trials. Therefore, there is an urgent requirement for novel and scalable production platforms and methodologies that can satisfy the demand for highly pure biologics for therapeutic use. Within the human cell, TGFβ3 is synthesized as a 38-kDa propeptide (Figure 1a), which is processed into the active homodimer composed of two 13-kDa polypeptide chains held together by disulphide bonds (Okane and Ferguson, 1997). Mammalian cells do not naturally produce enough TGFβ3 for the native protein to be isolated, so producing sufficient material for use in the clinic has required the development of large-scale expression and manufacturing processes. Considerable time and money have to be invested in removing impurities and ensuring the recombinant protein can be delivered safely (Shire, 2009). Any advances that reduce the time or cost of manufacture would provide considerable advantages to companies wishing to develop biologics as novel therapeutics.

Figure 1.

 The active region coding sequence (CDS) of TGFβ3 was cloned in plastid transformation vectors. (a) The 13-kDa monomeric unit of TGFβ3 (active region) is processed from a 38-kDa propeptide. (b) Sequence of the Brassica napus (Bn) plastid rrn promoter showing -35 and -10 boxes, 5′ UTR from bacteriophage T7 gene 10 with ribosome-binding site (RBS) flanked by Xma I and Nco I sites. (c) Map of plastid transformation vectors p201 and p202 showing native plastid atpBE, rbcL and accD genes and aadA marker and TGFβ3 trait gene. Bn rrn promoter, RBS and 3′ UTRs are indicated. p201 contains TGFβN and an aadA marker with Bn psbC 3′ UTR. p202 contains TGFβO and an aadA marker with a Chlamydomonas reinhardtii rbcL 3′ UTR. (d) Map of Nicotiana tabacum wild-type (WT) plastid DNA (Nt ptDNAwt). (e–f) Maps of transgenic plastid genomes transformed with p201 and p202. Shown are Hind III sites (H) and fragment sizes and locations of rbcL, aadA and TGFβ3 hybridization probes used in DNA blot analysis.

Plants provide a highly scalable and sustainable production platform for biologics with minimal risk of contamination with animal pathogens (Staub et al., 2000; Fernandez-San Millan et al., 2003; Leelavathi and Reddy, 2003; Arlen et al., 2007; Ruhlman et al., 2007a; Daniell et al., 2009a,b). Chloroplast-based expression enables the high yields and subcellular compartmentalization that would facilitate the processing of recombinant TGFβ3 into a highly purified and biologically active product. The 13-kDa TGFβ3 polypeptide is free of glycosylation or other modifications (Okane and Ferguson, 1997), which is compatible with a chloroplast-based production platform. Ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) is the most abundant native protein in leaves, representing 30%–65% of total leaf protein (Ellis, 1979; Bally et al., 2009). When total leaf protein samples from plastid transformants were fractionated by sodium dodecyl sulphate polyacrylamide gel electrophoresis, highly expressed and stable recombinant proteins gave rise to clear bands of intensities higher than or comparable to the large (∼55-kDa) and small (∼15-kDa) subunit bands of RuBisCO (De Cosa et al., 2001; Kuroda and Maliga, 2001a,b; Tregoning et al., 2003; Molina et al., 2004; Herz et al., 2005; Dufourmantel et al., 2007; Zhou et al., 2008; Bally et al., 2009; Daniell et al., 2009b; Oey et al., 2009a,b; Ruhlman et al., 2010; Boyhan and Daniell, 2011). In some cases, the recombinant proteins associated as functional oligomers (Daniell et al., 2001) or assembled into virus-like particles (Fernández-San Millán et al., 2008). The levels of recombinant proteins accumulated in chloroplasts are influenced by the transcriptional and translational regulatory elements used (Herz et al., 2005; Verma and Daniell, 2007). Protein stability (Fernandez-San Millan et al., 2003; Birch-Machin et al., 2004; McCabe et al., 2008; Apel et al., 2010) and the sequences influencing translation appear to be particularly important (Kuroda and Maliga, 2001b; Ye et al., 2001). Translational fusions to the N-termini of plastid-encoded proteins (Kuroda and Maliga, 2001b) or to larger protein tags (Leelavathi and Reddy, 2003; Molina et al., 2004; Ruhlman et al., 2007a), and substitution of the amino acid following the initiator methionine (Apel et al., 2010), provide three approaches to enhance protein accumulation in chloroplasts. These additional N-terminal amino acids are not present in the primary sequences of the corresponding therapeutic proteins. Lack of equivalence to an existing therapeutic protein is undesirable, and these extra amino acids would need to be removed before use or subject to the additional testing needed for regulatory approval as a therapeutic medicine.

Expression of recombinant proteins can be raised by altering the codon composition of genes to reflect frequent codons used by the expression host (Gustafsson et al., 2004). The presence of codons rarely used by the expression host appears to be particularly deleterious to the yields of recombinant proteins (Gustafsson et al., 2004). Raising recombinant protein yields by altering the codon composition of genes has the advantage of producing protein products that are identical to the native human proteins. However, previous reports of changes in the codon composition of foreign genes introduced into tobacco chloroplasts have not enhanced expression (Daniell et al., 2009a) or only given rise to modest two- to fourfold increases in recombinant protein accumulation (Lutz et al., 2001; Ye et al., 2001; Madesis et al., 2010). In vitro studies using chloroplast lysates have indicated that codon usage does not always correlate with translation efficiency for chloroplast genes (Nakamura and Sugiura, 2007). These observations would appear to indicate a minor role for codon optimization in enhancing recombinant protein expression in chloroplasts. However, the modest impact of codon usage on protein accumulation might also reflect other limiting factors including the regulatory elements used or protein stability. Alternatively, the relatively high rates of accumulation of the unmodified coding sequences chosen (Lutz et al., 2001; Tregoning et al., 2003; Daniell et al., 2009a) might have masked the potential of codon usage to enhance the expression because of yield ceilings caused, for example, by the deleterious impact of a recombinant protein on the translational capacity of chloroplasts (Tregoning et al., 2003).
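The idea that rarely used codons depress yield can be made concrete with a small calculation. The following is a minimal sketch, not the procedure used in this study: it counts the fraction of codons in a coding sequence whose host usage falls below a chosen threshold. The function names, the threshold and the usage values are all hypothetical placeholders; real tobacco chloroplast codon frequencies would have to be substituted.

```python
def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("CDS length must be a multiple of three")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_fraction(cds: str, usage_per_thousand: dict[str, float],
                        threshold: float = 10.0) -> float:
    """Fraction of codons whose host usage (per 1000 codons) is below threshold.

    Codons missing from the usage table are treated as rare.
    """
    codons = split_codons(cds)
    rare = [c for c in codons if usage_per_thousand.get(c, 0.0) < threshold]
    return len(rare) / len(codons)


# Invented usage values for illustration only (NOT measured chloroplast data).
toy_usage = {"GCT": 25.0, "GCC": 8.0, "TTA": 30.0, "CTG": 6.0}

# Toy CDS: GCC and CTG fall below the threshold, GCT and TTA do not.
print(rare_codon_fraction("GCCCTGGCTTTA", toy_usage))  # 0.5
```

A score of this kind only flags candidate codons for replacement; it says nothing about RNA stability or translation initiation, which the text notes can also limit accumulation.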

Here, we compared the expression of the native human coding sequence for the 13-kDa TGFβ3 polypeptide with a synthetic sequence, which was codon-optimized for expression in chloroplasts. Our results showed that codon optimization of the TGFβ3 coding region raised expression levels by about 75-fold such that the TGFβ3 protein accumulated to levels comparable to the small subunit of RuBisCO. The N-terminal methionine was removed from the recombinant protein that accumulated as the 13-kDa monomer and higher-order protein oligomers in the insoluble fraction of lysed chloroplasts. Purification and refolding of chloroplast-expressed TGFβ3 gave rise to a biologically active homodimer that appeared equivalent in size and potency to the native TGFβ3 protein.


Construction of plant transformation vectors

Two TGFβ3 coding sequences (CDS) were expressed in plastids. The first sequence encompassed the native 336-bp human TGFβ3 active region CDS (56% GC) with an additional methionine codon for translation initiation and is referred to as TGFβ3N. The second TGFβ3 active region CDS (33% GC), named TGFβ3O, was codon-optimized for expression in Nicotiana tabacum chloroplasts. The two coding regions share 70% base identity but encode identical polypeptides (Figure S1). Both TGFβ3N and TGFβ3O CDSs were inserted into the same plastid expression cassette driven by a Brassica napus rrn promoter fused through an Xma I site to the T7 bacteriophage gene 10 leader sequence (T7g10) that contains a ribosome-binding site (RBS) (Figure 1b). The CDSs were flanked by the 3′ UTR of the B. napus psbC gene. Therefore, the regulatory sequences driving the expression of the two TGFβ3 coding regions were identical. The aminoglycoside adenylyltransferase (aadA) marker gene was located upstream of each TGFβ3 CDS and contained the B. napus rrn promoter, N. tabacum rbcL RBS and either the B. napus psbC 3′ UTR (Bn psbC) in p201 or Chlamydomonas reinhardtii rbcL 3′ UTR (Cr rbcL) in p202. The Cr rbcL 3′ UTR was used in p202 to avoid the duplication of the 289-bp Bn psbC 3′ UTR regulatory element present in p201 and thereby exclude the possibility of loop-out recombination (Iamtham and Day, 2000). Use of heterologous regulatory elements reduces undesirable recombination events with the native sequences present in the N. tabacum plastid genome. These advantages of using heterologous regulatory elements need to be balanced with a report indicating enhanced efficacy of homologous 5′ and 3′ UTRs (Ruhlman et al., 2010). Both p201 and p202 vectors target the integration of foreign genes to an identical insertion point in the intergenic region between the plastid rbcL and accD genes (Figure 1d–f).
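The GC percentages and base identity quoted for the two coding regions are simple sequence statistics. The sketch below shows one way such values could be computed for two codon-aligned sequences of equal length. The function names are illustrative, and the two 12-base fragments are invented placeholders that encode the same short peptide; they are not the TGFβ3 sequences shown in Figure S1.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Placeholder fragments, both encoding Ala-Leu-Asp-Thr:
# one GC-rich, one AT-rich at synonymous positions.
native_like = "GCCCTGGACACC"
optimized_like = "GCTTTAGATACT"

print(round(gc_percent(native_like), 1))                        # 75.0
print(round(gc_percent(optimized_like), 1))                     # 33.3
print(round(percent_identity(native_like, optimized_like), 1))  # 58.3
```

Because synonymous changes fall mostly at third codon positions, two sequences can differ substantially in GC content and base identity while still encoding the same polypeptide.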

Integration of foreign genes into the Nicotiana tabacum plastome

Transplastomic shoots were isolated following spectinomycin and streptomycin selection of N. tabacum cv W38 leaves transformed by particle bombardment with the p201 and p202 vectors. Antibiotic-resistant shoots formed roots and flowered following transfer to soil. Seeds collected from self-fertilized plants gave rise to seedlings that were uniformly resistant to spectinomycin (not shown), consistent with maternal inheritance and homoplasmy. Transplastomic TGFβO seedlings and young leaves exhibited a transient pale-green coloration compared with wild-type (WT) seedlings (Figure S2) but were otherwise normal in growth and development. DNA blot analysis was used to verify targeted integration and homoplasmy. Hind III restriction maps based on targeted integration predict the replacement of the WT 11.5-kbp fragment (Figure 1d) with a 7.0-kbp fragment (Figure 1e–f) in transplastomic plants. This was confirmed by using an rbcL hybridization probe against Hind III digests, which showed replacement of the 11.5-kbp WT band with a 7.0-kbp band in p201 and p202 lanes (Figure 2a), corresponding to transplastomic lines isolated from independent transformation events. DNA blots hybridized with the TGFβ3 probes showed the predicted 5.3-kbp TGFβ3 bands in the digests of p201(TGFβN) and p202(TGFβO) transplastomic DNA (Figure 2b). An aadA hybridization probe hybridized to the predicted 1.3-kbp p201(TGFβN) and 1.0-kbp p202(TGFβO) bands in Hind III digests of total DNA (Figure 2c). Loop-out recombination between the duplicated Bn psbC 3′ UTRs in p201 transgenic plastid genomes would give rise to a 5.8-kbp band hybridizing to aadA, which was not detected in the 5C, 7A and 8A p201(TGFβN) transplastomic lines (not shown). Unintegrated vector or targeting to other sites in the plastid genome would not give rise to the bands detected on DNA blots (Figure 2).
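The predicted band sizes follow from the positions of Hind III sites in the wild-type and transgenic maps: inserted sequences that carry new sites split one large fragment into smaller ones. A minimal sketch of that prediction for a linear sequence is given below. It assumes the standard Hind III recognition sequence AAGCTT with cleavage after the first A; the function name and the short test sequence are made up for illustration and are unrelated to the plastid genome.

```python
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence; cut falls after the first A


def hindiii_fragments(seq: str) -> list[int]:
    """Return fragment lengths from a complete Hind III digest of a linear sequence."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(HINDIII_SITE)
    while pos != -1:
        cuts.append(pos + 1)  # cut position lies one base into the site
        pos = seq.find(HINDIII_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy 26-base sequence containing two Hind III sites.
toy = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTT" + "TT"
print(hindiii_fragments(toy))  # [5, 14, 7]
```

Comparing the fragment list for a wild-type map with that for a map carrying the insert indicates which bands each hybridization probe should detect.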

Figure 2.

 DNA blot analysis of total DNA from p201 and p202 transplastomic plants isolated from independent transformation events. Hind III digests were probed with (a) rbcL, (b) TGFβ3N or TGFβ3O or (c) aadA hybridization probes. MW standards and hybridizing band sizes are indicated.

Accumulation of TGFβ3 RNA in transplastomic plants

Total leaf RNA preparations from transplastomic lines expressing the TGFβN and TGFβO genes were fractionated on denaturing agarose gels for RNA blot analysis (Figure 3). Blots stained with methylene blue showed sharp ribosomal RNA bands consistent with intact RNA preparations. Two dilutions of TGFβO RNA facilitated quantitative phosphorimage analysis. An rbcL probe hybridized to a single WT band (Figure 3b, lane 1) but to three bands in transplastomic lanes (Figure 3b, lanes 2–4). The larger bands result from transcriptional read-through into downstream genes (Figure 3d) and have been observed before for genes placed downstream of the rbcL gene (Staub and Maliga, 1994; Madesis et al., 2010). The TGFβ3 probe binds to a common region in the 5′ UTRs of TGFβ3N and TGFβO transcripts, allowing hybridizing band intensities to be compared. Using the monocistronic rbcL band intensities to normalize loadings, phosphorimage analysis indicated the monocistronic TGFβO RNA (Figure 3c, lanes 2–3) accumulated to approximately threefold higher levels than the monocistronic TGFβN RNA species (Figure 3c, lane 1). Because both native and codon-optimized TGFβ coding sequences were expressed using identical promoters, 5′ UTRs and 3′ UTRs, this indicates an influence of TGFβ coding sequences on RNA accumulation. An influence of coding regions on RNA accumulation has been previously described in Chlamydomonas (Kasai et al., 2003) and tobacco chloroplasts (Madesis et al., 2010; Elghabi et al., 2011). The aadA-TGFβ3 dicistronic transcripts in TGFβN plants (Figure 3c, lane 1) run ahead of the corresponding transcripts in TGFβO plants (Figure 3c, lanes 2–3). This is because of differences in the sizes of the 3′ regulatory elements associated with the aadA marker. The aadA marker is flanked by a 239-bp Bn psbC 3′ regulatory region in TGFβN plants and a 501-bp Cr rbcL 3′ regulatory region in TGFβO plants.
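The normalization described above amounts to dividing each TGFβ3 band intensity by the rbcL band intensity from the same lane before taking the ratio between lines. A minimal sketch follows; the function name is illustrative and the intensity values are invented solely to reproduce a threefold result, since the raw phosphorimage counts are not given in the text.

```python
def normalized_ratio(target_a: float, loading_a: float,
                     target_b: float, loading_b: float) -> float:
    """Ratio of sample A to sample B after dividing each by its loading control."""
    if min(loading_a, loading_b, target_b) <= 0:
        raise ValueError("intensities must be positive")
    return (target_a / loading_a) / (target_b / loading_b)


# Invented band intensities (arbitrary units), for illustration only.
tgfb_o_band, rbcl_in_o_lane = 4500.0, 1500.0
tgfb_n_band, rbcl_in_n_lane = 1200.0, 1200.0

print(normalized_ratio(tgfb_o_band, rbcl_in_o_lane,
                       tgfb_n_band, rbcl_in_n_lane))  # 3.0
```

Normalizing to the monocistronic rbcL band corrects for unequal loading between lanes, provided that rbcL transcript levels are themselves similar in the lines being compared.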

Figure 3.

 RNA blot analysis of TGFβ3N and TGFβ3O transplastomic leaves. (a) Total RNA stained with methylene blue showing abundant ribosomal RNA bands. Blots probed with (b) rbcL and (c) an oligonucleotide that hybridizes to 34 bases in the shared 5′ UTRs of TGFβ3N and TGFβ3O transcripts. Bands hybridizing to multiple probes are indicated. (d) Map showing monocistronic and polycistronic transcripts.

Accumulation of TGFβ3 in transplastomic plants

Total leaf protein samples from TGFβ3 and WT plants were compared following fractionation by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) and Coomassie blue staining. The major bands visible in total leaf protein from WT and TGFβ3N plants were the large (LSU) and small subunits (SSU) of RuBisCO (Figure 4a, lanes 2–3), which accumulate in stoichiometric amounts (Ellis, 1979; Parry et al., 2003; Bally et al., 2009). TGFβ3O leaf samples accumulated an additional prominent band of 13 kDa, which comigrated with a TGFβ3 standard on SDS-PAGE gels (Figure 4a, lanes 4–5). A polyclonal antibody specific for TGFβ3 bound to the 13-kDa band on protein blots (Figure 4b, lane 4). Bands corresponding in size to the TGFβ3 dimer (2n) and diffuse higher-order oligomers were also detected by the antibody. The 13-kDa TGFβ3 monomer band was only visible in TGFβ3N lanes when the alkaline phosphatase reaction to visualize bound antibody was left to develop for longer times (Figure S3a).

Figure 4.

 SDS-PAGE and protein blot analysis of TGFβ recombinant protein in TGFβ3N and TGFβ3O transplastomic lines. Fractionated total leaf protein (10–20 μg) from indicated lines visualized with (a) Coomassie blue or (b) polyclonal TGFβ3-specific antibody. TGFβ3 standards (μg) are indicated. Fractionated total protein from 2nd (young), 4th (mature) and 8th yellowing leaf (old) of plant with eight leaves visualized with (c) Coomassie blue and (d) a polyclonal TGFβ3-specific antibody. Total fractionated proteins from seeds stained with (e) Coomassie blue or (f) incubated with polyclonal TGFβ3-specific antibody. TGFβ3 standards (μg) and MW standards are indicated. TGFβ3 monomer (1n) and dimer (2n) bands are indicated.

Digital analysis using Quantity One 1-D Analysis Software (Bio-Rad, Hemel Hempstead, UK) of TGFβ3O gel lanes indicated the TGFβ3 peak represented 10 ± 1% (n = 5) of the sum of all the protein peaks within a lane stained with Coomassie blue; the value for RuBisCO SSU (14.6 kDa) was 7% (Figure 4a). Estimates of protein amounts based on Coomassie blue can be skewed by uneven staining of proteins. Band intensity is related to the number of positive charges in a protein (Tal et al., 1985). TGFβ3 (H:2, K:5, R:5) contained five fewer positively charged amino acids than RuBisCO SSU (H:1, K:9, R:7), yet the 13-kDa TGFβ3 band stained 1.4-fold more intensely than the RuBisCO SSU band. The RuBisCO LSU band was estimated from band intensity to represent 50% of total leaf protein in WT leaves in agreement with other studies (Bally et al., 2009). A twofold decrease of RuBisCO LSU to 25% of total leaf protein was observed in TGFβ3O leaves (Figure 4a, lane 4) compared with WT (Figure 4a, lane 2). Very high levels of recombinant protein accumulation in chloroplasts have been associated with a reduction in RuBisCO levels (Bally et al., 2009; Oey et al., 2009a,b). However, this does not always result in a measurable decrease in growth or photosynthetic rates (Bally et al., 2009).
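The comparison of basic residue counts and staining intensities in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the variable and function names are illustrative.

```python
# Counts of positively charged residues quoted in the text (His, Lys, Arg).
tgfb3_basic = {"H": 2, "K": 5, "R": 5}
ssu_basic = {"H": 1, "K": 9, "R": 7}


def total_basic(counts: dict[str, int]) -> int:
    """Total number of His, Lys and Arg residues."""
    return sum(counts.values())


# RuBisCO SSU has more basic residues than TGFβ3.
charge_gap = total_basic(ssu_basic) - total_basic(tgfb3_basic)

# Lane fractions from densitometry reported in the text (% of summed peaks).
tgfb3_pct, ssu_pct = 10.0, 7.0
stain_ratio = tgfb3_pct / ssu_pct

print(charge_gap)             # 5
print(round(stain_ratio, 1))  # 1.4
```

Since TGFβ3 has fewer basic residues yet gives the stronger band, dye binding bias alone would be expected to lead to an underestimate, not an overestimate, of TGFβ3 relative to the SSU.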

TGFβ3 accumulation was quantified by Western blot analysis. The intensities of TGFβ3 bands in different leaves from TGFβ3O and TGFβ3N plants were compared with dilutions containing known amounts of a TGFβ3 standard to estimate the average amounts of recombinant protein present (Figure S3a,b). The 13-kDa TGFβ3 band was estimated to accumulate to 8.2 ± 2% (n = 3) of total leaf protein in TGFβ3O leaves and to 0.11 ± 0.04% (n = 2) of total leaf protein in TGFβ3N plants. Estimates of the TGFβ3 13-kDa monomer in the leaves of TGFβ3O plants based on Western blot analysis (8% of leaf protein) and Coomassie blue staining (10% of leaf protein) were in reasonable agreement. Summing monomer and oligomer bands indicated TGFβ3 accumulated to 12% of total leaf protein in TGFβ3O leaves. This corresponds to 2.1 ± 0.3 mg of TGFβ3 per gram of fresh leaves. The Western blot analysis indicated that the codon-optimized TGFβ3O sequence was associated with a 75-fold increase in the accumulation of the TGFβ3 monomer compared to the product of the native TGFβ3N gene. Independently isolated TGFβ3O lines accumulated recombinant proteins to similar levels in leaves (Figure S4). Plastid expression cassettes are also expressed in Escherichia coli. Recombinant TGFβ3 accumulation in E. coli (Figure S5) was higher for p201 (TGFβ3N) than for p202 (TGFβ3O), presumably because the codons in the native TGFβ3 sequence are more compatible with bacterial expression.
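The 75-fold figure is the ratio of the two monomer estimates given above. The sketch below reproduces that ratio and, as an added assumption not made in the text, also propagates the quoted ± values as if they were independent standard uncertainties. The function name is illustrative, and the resulting error term should be read only as a rough indication of how the small sample sizes limit precision.

```python
import math


def fold_change(num: float, num_err: float,
                den: float, den_err: float) -> tuple[float, float]:
    """Ratio num/den with first-order propagated uncertainty.

    Assumes the two uncertainties are independent (an assumption of this sketch).
    """
    ratio = num / den
    rel_err = math.sqrt((num_err / num) ** 2 + (den_err / den) ** 2)
    return ratio, ratio * rel_err


# Monomer estimates from the text, in % of total leaf protein.
fold, fold_err = fold_change(8.2, 2.0, 0.11, 0.04)
print(round(fold, 1))  # 74.5
```

The second returned value, `fold_err`, is the propagated uncertainty on the fold change under the independence assumption stated above.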

Stable accumulation of the TGFβ3 monomer was observed in leaves of different ages from a TGFβ3O plant (Figure 4c,d). The protein was detected on Western blots and accumulated to high levels in all leaves tested including an old yellowing leaf (Figure 4d, lane 4) in which the absence of high MW proteins (Figure 4c, lane 4) indicates extensive protein degradation. N. tabacum provides a leaf-based production platform, and the absence of TGFβ3 in seed stocks for long-term storage would be advantageous for containment. We were unable to detect TGFβ3 in seeds from TGFβ3O plants using protein blot analysis (Figure 4e–f). Low TGFβ3 expression in seeds reduces the risk of hazards associated with unintended exposure of organisms to the recombinant protein in transplastomic seeds.

Analysis of TGFβ3 oligomers

TGFβ3 accumulated as monomer, dimer and higher-order oligomers in TGFβ3O plants. The dimer and higher-order oligomers were detectable on reducing SDS-PAGE gels, which would be expected to disaggregate TGFβ3 protein complexes. This was investigated further by using a monoclonal antibody that detects the correctly folded TGFβ3 dimer. The correctly folded TGFβ3 dimer standard (TGFβ3 STD) migrates as a 25.4-kDa band on a Ponceau red–stained blot from a nonreducing gel (Figure 5a, lane 5). The TGFβ3 dimer standard was detected by a polyclonal antibody that is insensitive to conformation (Figure 5b, lane 5) as well as a conformation-sensitive monoclonal antibody (Figure 5c, lane 5). In contrast, the oligomers detected in leaf protein from TGFβ3O plants with the polyclonal antibody (Figure 5b, lane 3) were not detected with the conformation-sensitive monoclonal antibody (Figure 5c, lane 3). These results suggest the high MW TGFβ3 molecules in transplastomic plants represent protein aggregates rather than correctly folded oligomers. A small amount of nonspecific binding gives rise to faint bands in all lanes including the WT lane (Figure 5c, lane 2).

Figure 5.

 Blot analysis of total leaf protein from TGFβ3N and TGFβ3O transplastomic leaves using a conformation-dependent TGFβ3 antibody. Ponceau red–stained blots from SDS-PAGE gels loaded with (a) nonreduced samples, or (d) reduced samples, and incubated with (b and e), a polyclonal antibody recognizing TGFβ3 that is not sensitive to correct folding, and (c and f), a monoclonal antibody recognizing the correctly folded TGFβ3 homodimer. MW size standards are indicated. TGFβ3 monomer (1n) and dimer (2n) bands are indicated.

On reducing gels, the TGFβ3 25.4-kDa dimer was largely resolved into its 12.7-kDa monomer (1n) chains (Figure 5d, lane 4). The monomer was detected with the polyclonal antibody (Figure 5e, lane 4) but not with the conformation-sensitive monoclonal antibody (Figure 5f, lane 4). This is expected because the conformation-sensitive monoclonal antibody only detects correctly folded TGFβ3. The majority of the TGFβ3 standard was reduced to the monomer on denaturing gels (Figure 5d, lane 4, Figure 5e, lane 4). In contrast, under the reducing conditions used, the TGFβ3 dimers and higher-order oligomers from TGFβ3O plants are relatively stable (Figure 5e, lane 2). This indicates that adding β-mercaptoethanol and placing in a boiling water bath for 5 min was sufficient to reduce and disassemble the TGFβ3 chains in the standard but not the TGFβ3 oligomers present in transplastomic plants. Only nonspecific binding was detected in the WT lane with the polyclonal antibody (Figure 5e, lane 1) and in the blot from a reducing gel incubated with the conformation-specific monoclonal antibody (Figure 5f, lanes 1–3).

Purification, refolding and characterization of plant TGFβ3

Differential centrifugation was used to isolate chloroplasts from the leaves of TGFβ3O plants. TGFβ3 was present in the insoluble fraction allowing its rapid purification. The 13-kDa monomer is clearly visible in the chloroplast lysate (Figure 6a, lane 3). Following sedimentation of the lysate, the 13-kDa protein is not visible in the supernatant (Figure 6a, lane 4) or the buffer used to wash the pellet (Figure 6a, lane 5). Solubilization of the pellet in buffer containing 6 M urea showed a predominant 13-kDa TGFβ3 band (Figure 6a, lane 6). The majority of chloroplast proteins including the RuBisCO SSU are removed in the supernatant (Figure 6a, lane 4). Approximately 3 mg of TGFβ3 was obtained per batch of 80 g of leaves at about 80% purity. The solubilized TGFβ3 pellet was concentrated and refolded using a glutathione buffer. Refolded chloroplast-expressed TGFβ3 was detected as a band on protein blots using the conformation-specific monoclonal antibody (Figure 6b, lane 5), which comigrated with TGFβ3 standard (Figure 6b, lane 2). The conformation-sensitive monoclonal antibody also recognizes correctly folded monomer (Figure 6b, lane 5). The refolded TGFβ3 was then purified on a Butyl-Sepharose 4 Fast Flow Column (GE Healthcare, Chalfont St Giles, UK) using hydrophobic interaction chromatography and further purified on a UNO S1 cation exchange column (Bio-Rad). Eluted fractions from the cation exchange column are shown in Figure 6c. Fractions 12 and 13 contained the refolded chloroplast-expressed TGFβ3 homodimer (Figure 6c, lanes 1–2), with trace amounts of TGFβ3 monomer and free of any visible contamination with other chloroplast proteins. Residual RuBisCO SSU not removed by earlier purification steps was isolated in fractions 2–5 (Figure 6c, lanes 3–6). The purified and correctly folded chloroplast-expressed TGFβ3 was analysed by electrospray ionization time-of-flight (ESI-TOF) mass spectrometry (LCT; Waters UK, Manchester, UK). The resulting mass spectrum showed a major peak at 25426 Da (Figure 6d), which was identical to the MW of the TGFβ3 standard. Removal of the initiator methionine of the recombinant TGFβ3 expressed in the chloroplasts would provide a mass of 25426 Da. This was confirmed by N-terminal sequencing (Figure 6e).
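The mass argument can be illustrated by asking where the dimer peak would fall if the initiator methionine had been retained on one or both chains. The sketch below is a simple calculation under stated assumptions: it takes the reported 25426-Da peak as the methionine-free dimer and adds an approximate average residue mass for methionine (about 131.19 Da) per retained residue. The constant and function names are illustrative.

```python
MET_RESIDUE_AVG_DA = 131.19  # approximate average mass of one Met residue (assumed value)

observed_dimer_da = 25426.0  # major ESI-TOF peak reported in the text


def dimer_mass_with_met(dimer_without_met: float, chains_retaining_met: int) -> float:
    """Expected dimer mass if 0, 1 or 2 chains kept the initiator methionine."""
    if chains_retaining_met not in (0, 1, 2):
        raise ValueError("a dimer has two chains")
    return dimer_without_met + chains_retaining_met * MET_RESIDUE_AVG_DA


for n in (0, 1, 2):
    print(n, round(dimer_mass_with_met(observed_dimer_da, n), 1))
```

Retention of methionine on either chain would shift the peak by more than 100 Da, a difference that is readily resolved at this mass range, so a single major peak matching the standard is consistent with complete removal.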

Figure 6.

 Purification, folding and analysis of TGFβ3 expressed in chloroplasts. (a) Coomassie blue–stained reducing SDS-PAGE (12% w/v resolving gel) showing chloroplast lysate, supernatant and washed pellet fractions following sedimentation at 8000 g for 30 min. The small subunit of RuBisCO (SSU) and TGFβ3 monomer (1n) are indicated. (b) Protein blot of nonreducing SDS-PAGE gel with solubilized and refolded chloroplast TGFβ3 incubated with a monoclonal antibody specific for correctly folded TGFβ3. (c) Eluted fractions from UNO-S1 cation exchange column (Bio-Rad) using linear gradients of 0%–60% and 60%–100% elution buffer (10% v/v glacial acetic acid, 30% v/v ethanol, 1 M sodium chloride, pH 4) fractionated by SDS-PAGE (12% resolving gel) and visualized by silver staining. (d) Mass spectrum of purified refolded chloroplast TGFβ3 using electrospray ionization time-of-flight mass spectrometry. The major peak visible has an identical mass to the TGFβ3 standard (25426 Da). (e) N-terminal sequence of purified refolded chloroplast TGFβ3 determined by Edman degradation.

Biological activity of refolded chloroplast-expressed TGFβ3

The method for evaluating biological activity was based on the ability of TGFβ3 to inhibit the proliferation of mink lung epithelial cells (Parker et al., 2002). Epithelial cell proliferation was determined by the AlamarBlue dye assay where the fluorescence intensity of sampled media was directly related to cell proliferation. Dose–response curves obtained with chloroplast-expressed TGFβ3 and the standard TGFβ3 overlap apart from divergence at the highest dose used (Figure 7). This demonstrates that chloroplast-expressed TGFβ3 is functionally equivalent to the standard TGFβ3.
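One way to compare two inhibition curves numerically is to estimate the dose giving a half-maximal response for each and check that the values agree. The following is a minimal sketch of such an estimate by log-linear interpolation between measured points; it is not the analysis used in this study. The function name, doses and fluorescence readings are invented for illustration, and the units are arbitrary.

```python
import math


def half_max_dose(doses: list[float], responses: list[float]) -> float:
    """Dose giving a response midway between the maximum and minimum readings.

    Expects doses in increasing order and a response that falls with dose
    (inhibition); interpolates linearly on a log10 dose scale.
    """
    half = (max(responses) + min(responses)) / 2
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 >= half >= r1:
            if r0 == r1:
                return d0
            frac = (r0 - half) / (r0 - r1)
            log_d = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_d
    raise ValueError("half-maximal response is not bracketed by the data")


# Invented readings (arbitrary units), for illustration only.
toy_doses = [0.1, 1.0, 10.0, 100.0, 1000.0]
toy_fluorescence = [1000.0, 900.0, 500.0, 200.0, 100.0]

print(round(half_max_dose(toy_doses, toy_fluorescence), 2))
```

Applying the same function to the chloroplast-derived and standard curves would give two half-maximal doses that can be compared directly; fitting a full sigmoidal model would be a more rigorous alternative when enough dose points are available.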

Figure 7.

 Comparison of dose–response curves of chloroplast-expressed (cp) TGFβ3 and standard (STD) TGFβ3. TGFβ3 bioactivity was assessed by the inhibition of proliferation of mink lung epithelial cells (CCL-64 cells, ATCC, Manassas, VA). Fluorescence at 590 nm is directly related to cell proliferation using the AlamarBlue dye assay. Four biological replicates were used for each dose, and two readings were taken for each replicate.


Recombinant human TGFβ3 was expressed in N. tabacum chloroplasts using the native human coding region and a synthetic gene containing codons that are frequently used in chloroplast genes. The codon-optimized TGFβ3 gene raised recombinant TGFβ3 accumulation by 75-fold relative to the native human coding sequence and gave rise to a 13-kDa band that stained 1.4 times more intensely than the RuBisCO SSU band when SDS-PAGE-fractionated total leaf protein was stained with Coomassie blue. Recombinant TGFβ3 accumulated in insoluble aggregates, which on denaturing SDS-PAGE gels were resolved into monomer, dimer and higher-order oligomers. TGFβ3 accumulated to 12% of leaf protein in TGFβ3O plants. One kilogram of TGFβ3O leaves (fresh weight) contained approximately 2 g of recombinant TGFβ3. On a per weight basis (1 kg = 1 L), this represents a 250-fold increase in recombinant TGFβ3 expression relative to Chinese hamster ovary cells where a yield of 8 mg/L of TGFβ3 was obtained (Zou and Sun, 2006).
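The 250-fold comparison rests on treating one kilogram of fresh leaves as equivalent to one litre of culture. A minimal sketch of the arithmetic, using only the figures quoted above, is given below; the function name and the `kg_per_l` parameter are illustrative and simply make the equivalence assumption explicit.

```python
def fold_yield(plant_mg_per_kg: float, culture_mg_per_l: float,
               kg_per_l: float = 1.0) -> float:
    """Fold difference in yield, converting leaf mass to a volume equivalent.

    kg_per_l encodes the 1 kg = 1 L equivalence stated in the text.
    """
    plant_mg_per_l = plant_mg_per_kg * kg_per_l
    return plant_mg_per_l / culture_mg_per_l


# About 2 g (2000 mg) per kg of leaves versus 8 mg/L in CHO cells.
print(fold_yield(2000.0, 8.0))  # 250.0
```

The result scales linearly with the assumed mass-to-volume equivalence, so a different conversion factor would change the fold difference in proportion.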

Transplastomic p202 (TGFβO) plants expressing TGFβ3 to high levels were pale green at early stages of growth. The transient pale-green phenotype can be explained if chloroplast protein synthesis is limiting in young plants. Diverting protein production to TGFβ3 will reduce the synthesis of native chloroplast proteins, which would in turn delay chloroplast development. Once past this early period of growth, TGFβ3O transplastomic plants grew well, which is essential for a plant-based production platform. The twofold reduction in RuBisCO content observed in TGFβ3O transplastomic plants was presumably because of the competing synthesis of TGFβ3 in chloroplasts. Reductions in RuBisCO content resulting from high rates of accumulation of recombinant proteins in chloroplasts have a limited effect on plant growth (Bally et al., 2009). The impact of a recombinant protein on plant growth will reflect its abundance and toxicity. Transplastomic p202(TGFβ3O) plants grew well and did not exhibit the extreme slow growth (Oey et al., 2009a) or chlorotic phenotype (Tregoning et al., 2003) observed for other abundant recombinant proteins expressed in chloroplasts.

Sedimentation of insoluble TGFβ3 aggregates from soluble chloroplast proteins following lysis of chloroplasts provided a rapid initial purification step. Solubilization of TGFβ3 aggregates in urea buffer was followed by refolding the 13-kDa monomers into the active dimer in glutathione buffer. In a recent report, insoluble aggregates of human proinsulin fused to cholera toxin B subunit were sedimented from leaf homogenates and refolded into an active molecule using guanidine hydrochloride (Boyhan and Daniell, 2011). Hydrophobic interaction chromatography and cation exchange chromatography were used to further purify the refolded chloroplast-expressed TGFβ3 homodimer to a single band on silver-stained SDS-PAGE gels. Complete removal of the initiator methionine was detected by ESI-TOF mass spectrometry and N-terminal sequencing of the purified and refolded chloroplast-expressed TGFβ3. Incorrect N-termini have been observed for the major protein products of a number of foreign genes expressed in tobacco chloroplasts (Staub et al., 2000; Tregoning et al., 2003; McCabe et al., 2008) as well as correct N-termini (Dufourmantel et al., 2007). The molecular mass of refolded chloroplast TGFβ3 and standard TGFβ3 was identical. Functionality of the protein was demonstrated by inhibition of proliferation of mink lung epithelial cells. The dose–response curves of the refolded chloroplast TGFβ3 and standard TGFβ3 were indistinguishable apart from a small degree of divergence at the highest dose tested. Physical and biological equivalence of refolded chloroplast TGFβ3 to the standard TGFβ3 is crucial for developing a viable plant-based production system for this new therapeutic.

Chloroplasts contain protein maturation pathways that enable the folding and formation of correct intrachain disulphide bonds in a number of monomeric proteins (Staub et al., 2000; Arlen et al., 2007; Ruhlman et al., 2007b; Daniell et al., 2009a; Lee et al., 2010; Ruhlman et al., 2010). Unlike these examples, TGFβ3 is a dimer held together by interchain disulphide bonds, which provides a more complex situation. In mammals, the mature TGFβ3 homodimer containing interchain disulphide bonds is processed from a longer dimeric precursor (Figure 1a). In this work, we were able to detect dimers and higher-order oligomers of TGFβ3 in chloroplasts. Dimers and higher-order oligomers have also been reported for a number of recombinant proteins containing intrachain disulphide bonds when expressed in chloroplasts (Arlen et al., 2007; Ruhlman et al., 2007b; Lee et al., 2010; Boyhan and Daniell, 2011). Use of a conformation-specific monoclonal antibody showed that the TGFβ3 oligomers were not folded correctly. The high yields and insolubility of chloroplast-expressed TGFβ3 allowed rapid initial purification by sedimentation and refolding of the solubilized TGFβ3 into the homodimer. Very high levels of recombinant protein expression can result in the accumulation of misfolded proteins. Reduced expression did not give rise to correctly folded TGFβ3 in our p201 (TGFβ3N) transplastomic plants, in which TGFβ3 accumulated to 0.11% of total leaf protein. Chloroplasts would appear to lack pathways to assemble the correctly folded TGFβ3 homodimer.

Codon optimization of the TGFβ3 coding sequence led to a 75-fold increase in recombinant protein accumulation relative to the native human coding sequence. A small part of this increase may result from threefold higher levels of mRNA accumulated from the synthetic coding sequence. The greater part of this 75-fold increase in TGFβ3 accumulation can be attributed to enhanced translation if raised RNA levels account for a corresponding threefold increase in recombinant protein amounts. Protein stability is a major determinant of recombinant protein accumulation in chloroplasts (McCabe et al., 2008; Apel et al., 2010). Both synthetic and native TGFβ3 genes encode identical proteins with alanine as the second amino acid, which rules out protein stability as an explanation for the large differences in their accumulated products. The second amino acid after the initiator methionine is often alanine in chloroplast proteins (Apel et al., 2010). Alanine at the second position confers moderate stability on recombinant green fluorescent protein (GFP) expressed in chloroplasts (Apel et al., 2010). Our results conflict with previous work suggesting that codon optimization leads to modest two- to fourfold increases in recombinant protein expression in tobacco chloroplasts (Lutz et al., 2001; Ye et al., 2001; Daniell et al., 2009a; Madesis et al., 2010), and they suggest that codon optimization is a viable strategy to dramatically raise recombinant protein accumulation in angiosperm chloroplasts. In Chlamydomonas, codon optimization did raise the accumulation of GFP by 80-fold in chloroplasts (Franklin et al., 2002). Codon optimization might be particularly applicable to foreign coding regions with a large number of codons rarely used in chloroplast genes.
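The arithmetic behind this argument can be made explicit with a minimal sketch. It assumes a simple multiplicative model in which protein accumulation is the product of mRNA level and protein made per mRNA; that model and the function name `partition_fold_change` are illustrative assumptions, not part of the reported analysis. Only the 75-fold and threefold figures come from the text.

```python
import math


def partition_fold_change(total_fold, mrna_fold):
    """Split a fold increase in protein into mRNA and per-mRNA components.

    Assumption (illustrative only): protein level = mRNA level x protein
    made per mRNA, so the fold changes multiply.
    """
    # Fold increase left over once the mRNA increase is accounted for.
    translation_fold = total_fold / mrna_fold
    # Share of the total increase explained by mRNA, on a log scale.
    mrna_share_log = math.log(mrna_fold) / math.log(total_fold)
    return translation_fold, mrna_share_log


# Figures from the text: 75-fold more protein, threefold more mRNA.
translation_fold, mrna_share = partition_fold_change(75.0, 3.0)
print(f"per-mRNA component: {translation_fold:.0f}-fold")
print(f"mRNA share of the log fold change: {mrna_share:.0%}")
```

Under this assumption, the threefold rise in mRNA leaves a 25-fold component to be explained by events downstream of transcript accumulation, which is the sense in which the greater part of the increase is attributed to translation.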

Including the 10–14 N-terminal amino acids from either native plastid proteins (Kuroda and Maliga, 2001b) or GFP (Ye et al., 2001) raised the accumulation of recombinant proteins by two- and 30-fold, respectively. The molecular bases for these results are unclear. Synonymous codon substitutions in the region encoding the 14 N-terminal amino acids of rbcL, but not of atpB, reduced recombinant protein accumulation by 35-fold (Kuroda and Maliga, 2001b). Reduced translation resulting from codon substitutions was ascribed to changes in mRNA structure that affect ribosome recognition, but an influence of codon usage cannot be ruled out. It is possible that the N-terminal codons in the synthetic TGFβ3 sequence are largely responsible for raised expression levels. Short N-terminal fusions that alter the second amino acid, which is exposed following excision of the initiating methionine, would alter the stabilities of recombinant proteins (Apel et al., 2010) and might account for some of the observed increases in recombinant protein accumulation reported in the literature.

Coexpression of a potential chaperonin increased Cry2A protein accumulation from 0.4% to 45% of total leaf protein, resulting in the accumulation of insoluble Cry2A protein (De Cosa et al., 2001). Insolubility, which might shield proteins from proteolytic attack, is often associated with very high levels of recombinant protein accumulation in plastids (Oey et al., 2009a; Ruhlman et al., 2010; Boyhan and Daniell, 2011). Because identical TGFβ3 proteins were expressed from the native and synthetic coding sequences, differences in protein stability would not have been expected. However, it remains possible that raised expression of the codon-optimized sequence increased TGFβ3 levels beyond a threshold that promoted accumulation in insoluble aggregates. Whilst the precise molecular mechanisms responsible for raised expression will require further analysis, our results using TGFβ3 clearly demonstrate that changing the coding sequence alone can have a large positive impact on recombinant protein accumulation in tobacco chloroplasts.

Experimental procedures

Isolation of transplastomic plants

Leaves from aseptic wild-type N. tabacum cv Wisconsin 38 were transformed by particle bombardment as previously described (Madesis et al., 2010). Briefly, green shoots and cell lines were selected on RMOP media (Lutz et al., 2001) with spectinomycin dihydrochloride pentahydrate (250 mg/L) plus streptomycin sulphate (500 mg/L). After three cycles of regeneration on RMOP media containing both antibiotics, shoots were transferred to Magenta jars (Sigma–Aldrich, Poole, UK) containing Murashige and Skoog medium (3% W/V sucrose) supplemented with 200 mg/L spectinomycin to allow the formation of roots. Plants were propagated in a growth cabinet at 25 °C in a 12-h day/12-h night cycle with light intensities of 40–100 μE m−2 s−1. Plants with roots were transferred to soil. Plants in soil were grown in a walk-in growth room at 25 °C in a 16-h day/8-h night cycle with light intensities of 80–200 μE m−2 s−1.

Construction of plastid transformation vectors

The synthetic codon-optimized TGFβ3 sequence was obtained by reverse translating the TGFβ3 protein sequence using frequently used chloroplast codons. It was made by overlap PCR using 14 oligonucleotides in multiple reactions to make dimers, tetramers, octamers and the final product as described (Madesis et al., 2010). The sequence (Acc. No. FM211593) cloned in pGEM®-T Easy vector (Promega, Southampton, UK) is shown in Figure S1 together with the human TGFβ3 coding region. The pUM34 plasmid contains the aadA coding region (Goldschmidt-Clermont, 1991) flanked by the B. napus rrn promoter fused to the N. tabacum rbcL ribosome-binding site (Acc. No. AJ276677) and the C. reinhardtii rbcL 3′ UTR (Goldschmidt-Clermont, 1991). pUM35 is derived from pUM34 by replacing the 3′ UTR with a B. napus psbC 3′ UTR (Acc. No. AJ578474). TGFβ3 coding regions were expressed using the B. napus rrn promoter and phage T7 g10 ribosome-binding site (Figure 1b) and B. napus psbC 3′ UTR (Acc. No. AJ578474). The expression cassettes containing the aadA and TGFβ3 genes were cloned between the Apa I and Not I sites in pATB27 (Madesis et al., 2010), which contains a 7.2-kb region of N. tabacum chloroplast DNA (bases 53613-60864, Acc. No. Z00044), to construct p201 and p202 (Figure 1c).
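The reverse-translation step can be illustrated with a minimal sketch. The codon table below is a hypothetical one-codon-per-amino-acid choice favouring AT-rich codons; it is not the table used to design the synthetic gene (Acc. No. FM211593), which uses more than one codon for some amino acids. The peptide `MACKS` is an arbitrary test sequence, and the function names are illustrative.

```python
# Hypothetical preferred codons (one AT-rich codon per amino acid).
# Assumption for illustration only; not the table used for the synthetic gene.
PREFERRED_CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "TTA",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}
STOP_CODON = "TAA"


def reverse_translate(protein, add_stop=True):
    """Return a coding sequence built from one preferred codon per residue."""
    codons = [PREFERRED_CODON[residue] for residue in protein.upper()]
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


def gc_content(dna):
    """Fraction of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


test_peptide = "MACKS"  # arbitrary example, not a TGFβ3 fragment
coding_sequence = reverse_translate(test_peptide)
print(coding_sequence, f"GC = {gc_content(coding_sequence):.1%}")
```

Choosing a single codon per amino acid keeps the sketch short; a design that samples codons in proportion to their usage in chloroplast genes would need a frequency table, which is not reproduced here.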

Nucleic acid manipulations

Methods for DNA and RNA extraction from leaves, using the RNeasy and DNeasy plant mini kits (Qiagen, Crawley, UK) and blot analyses, have been described (Madesis et al., 2010). For RNA analysis, the third leaves (from top) from four plants were combined before RNA extraction. [α-32P]dCTP hybridization probes prepared with High Prime (Roche Applied Science, Lewes, UK) comprised a 0.8-kbp Nco I-Pst I aadA fragment from pUCatpXaadA (Goldschmidt-Clermont, 1991), a 1.4-kbp rbcL PCR product prepared using primers TOB-rbcL-F (5′-ATGTCACCACAAACAGAGACTA) and TOB-rbcL-R (5′-TTACTTATCCAAAACGTCCACT) on pATB27, a 0.306-kbp TGFβ3N probe prepared using primers TGF-AR-F (5′-CCAATTACTGCTTCCGCAACTTGGAG) and TGF-AR-R (5′-CCACCATGTTGGAGAGCTGCTC) on human TGFβ3 cDNA and a 0.271-kbp TGFβ3O probe using primers TGF-β3-2A-F (5′-TCAAGATCTTGGTTGGAAATGGGTACATGAACCTA) and TGF-β3-4B-R (5′-CTGCAGTTAAGAACATTTACAACTTTTAACTACC) on the cloned synthetic sequence. A TGFβ3 hybridization probe that hybridizes to 34 bases including the 5′ UTR and first two codons that are common to the TGFβ3O and TGFβ3N transcripts was made by annealing oligonucleotides pUM35T7-RTPCR-F (5′-GTTTAACTTTAAGAAGGAGATATACC) and T7-TGFbeta-probe-R (5′-GGGAAAGCCATGGTATATCT), which share a 9-base overlap, and filling in single-stranded regions with Klenow enzyme and [α-32P]dCTP, dATP, dGTP and dTTP. Blots were washed in 0.1 × standard saline citrate (SSC) (1 × SSC is 0.15 M NaCl, 0.015 M trisodium citrate) and 0.1% W/V sodium dodecyl sulphate at 55 °C. Aida software was used to measure phosphorimage band intensities.
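The 9-base overlap between the two annealed oligonucleotides can be checked directly from the sequences given above. The following is a minimal sketch; the function names are illustrative, and it only tests for exact complementarity at the 3′ ends, ignoring annealing thermodynamics.

```python
# Map each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(forward, reverse):
    """Length of the longest 3' end of `forward` that pairs with `reverse`.

    Equivalent to the longest suffix of `forward` that equals a prefix of
    the reverse complement of `reverse`.
    """
    target = reverse_complement(reverse)
    for k in range(min(len(forward), len(target)), 0, -1):
        if forward[-k:] == target[:k]:
            return k
    return 0


# Oligonucleotide sequences as given in the text (5' to 3').
PUM35T7_RTPCR_F = "GTTTAACTTTAAGAAGGAGATATACC"
T7_TGFBETA_PROBE_R = "GGGAAAGCCATGGTATATCT"

overlap = three_prime_overlap(PUM35T7_RTPCR_F, T7_TGFBETA_PROBE_R)
# Length of the duplex once Klenow has filled in both single-stranded ends.
filled_length = len(PUM35T7_RTPCR_F) + len(T7_TGFBETA_PROBE_R) - overlap
print(overlap, filled_length)
```

By this calculation the two oligonucleotides pair over 9 bases and the filled-in duplex is 37 bp long, slightly longer than the 34 bases stated to be common to the two transcripts.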

Protein blot analysis

Leaves were ground to a powder in liquid nitrogen. Twenty micrograms of leaf powder was mixed thoroughly with 100 μL of reducing sample buffer (10% glycerol (v/v), 5% β-mercaptoethanol (v/v), 3% SDS (w/v) and 0.0625 M Tris, pH 6.8) and placed in a boiling water bath for 5 min, and the supernatant was kept for analysis after centrifugation at 16 000 g (Eppendorf 5415C). Nonreduced plant samples were prepared in the same way with nonreducing sample buffer (10% glycerol (v/v), 3% SDS (w/v) and 0.0625 M Tris, pH 6.8). Twenty micrograms of seeds extracted in 50 μL of reducing sample buffer was processed as described for leaves and the supernatant further diluted 1 : 10 in reducing sample buffer. Methods for SDS polyacrylamide electrophoresis and protein blotting have been described (Madesis et al., 2010). Total leaf protein was estimated using the BioRad DC assay. Band intensities were determined using ImageJ software (Rasband WS, U. S. National Institutes of Health, Bethesda, USA, http://rsb.info.nih.gov/ij/). Protein samples were fractionated using Tris-HCl 10–20% (V/V) polyacrylamide gradient gels (Invitrogen, Paisley, UK) in running buffer (0.025 M Trizma base, 0.192 M glycine and 1% SDS W/V, pH 8.6). Protein blots were carried out using an affinity-purified polyclonal TGF-β3 biotinylated antibody (Ref BAF243; R&D Systems Europe, Abingdon, UK) and a conformation-dependent monoclonal TGF-β3 antibody (Ref MAB643; R&D Systems). Primary antibodies were diluted 1 : 500. Biotinylated antibodies were detected using ExtrAvidin linked to alkaline phosphatase (Sigma–Aldrich), and the monoclonal antibody was detected using an anti-mouse secondary antibody linked to alkaline phosphatase. Bound antibodies were visualized with 5-bromo-4-chloro-3-indolyl phosphate/nitroblue tetrazolium liquid substrate (Sigma–Aldrich).

TGFβ3 isolation, folding and purification from chloroplasts

N. tabacum plants were kept in darkness for 48 h prior to harvesting to reduce the starch content in leaves. All the following procedures used chilled buffers and rotors (4 °C) unless indicated. One hundred grams of leaves was placed in 400 mL of chloroplast isolation buffer (0.3 M D-mannitol, 0.05 M Tris, 0.003 M EDTA and 0.001 M β-mercaptoethanol, pH 8.0) in a blender. Chloroplasts were released by homogenizing leaves using short (3–5 s) bursts. The homogenate was filtered through 100-μm and 20-μm pore-sized mesh into 250-mL centrifuge tubes. Samples were sedimented at 1020 g for 15 min, the supernatant was removed, and the pellet was resuspended with a paint brush in chloroplast isolation buffer. The chloroplasts were then sedimented again, the supernatant was removed, and the pellet was resuspended in an equal volume of lysis buffer (10 mM HEPES, 5 mM EDTA, 2% W/W Triton X-100 and 0.1 M DTT at pH 8.0). TGFβ3 aggregates were easily sedimented from chloroplast lysates (8000 g, 30 min) and washed twice with 0.05 M Tris base and 0.01 M EDTA (pH 8.0). The washed pellet was solubilized in 0.05 M Tris base, 0.1 M DTT and 6 M urea at pH 8.0. The pH was raised to 9.5 and the buffer exchanged for 0.05 M Tris base, 0.01 M DTT and 3 M urea (pH 9.5) using tangential flow filtration. The protein solution was diluted in refolding buffer (0.7 M CHES, 1 M NaCl, 0.002 M reduced glutathione, 0.0004 M oxidized glutathione and 0.25 mg/mL TGFβ3 monomer at pH 9.5) and left at 10 °C for 3 days. The refolded protein was further purified on a Butyl-Sepharose 4 Fast Flow column (GE-Healthcare, Chalfont St Giles, UK) using washes of 0.02 M sodium acetate, 1 M ammonium sulphate, 10% V/V acetic acid (pH 3.3), and TGFβ3 was eluted with 0.02 M sodium acetate, 10% V/V acetic acid, 30% V/V ethanol (pH 3.3). The eluate was further purified on a UNO-S1 cation exchange column (Bio-Rad) washed with buffer (20 mM sodium acetate, 10% V/V glacial acetic acid and 30% V/V ethanol, pH 3.9–4.1). TGFβ3 species were eluted using a two-step elution process. A linear gradient of 0% to 60% elution buffer (10% V/V glacial acetic acid, 30% V/V ethanol and 1 M sodium chloride, pH 3.9–4.1) preceded a second linear gradient of 60%–100% elution buffer. Eluted TGFβ3 fractions were pooled and concentrated using a preconditioned ultrafiltration/diafiltration system (5-kDa exclusion), buffer exchanged for formulation buffer (0.12% V/V acetic acid, 20% V/V ethanol, pH 4) and stored at 10 mg/mL (by A278 nm).

Mass spectrometry and N-terminal sequencing

The masses of purified refolded chloroplast TGFβ3 and the TGFβ3 standard were determined by electrospray ionization mass spectrometry (LCT ESI-TOF; Waters) in the biomolecular analysis core facility (University of Manchester). Protein mass was calculated using the maximum entropy iteration process within MassLynx software (Waters). The N-terminal sequence of the purified refolded chloroplast TGFβ3 was determined by Edman degradation using a ProSorb cartridge (Applied Biosystems, Warrington, UK) and Applied Biosystems Procise sequencer at the Proteomics Facility (University of Leeds).


We thank Drs Paul Fulwood (DNA Sequencing Facility), David Knight (Biomolecular Analysis Facility), Jeff Keen (Proteomics Facility, Leeds), Emma Mellors and Adam Bentley for support with TGFβ3 purification, Dr Lis Mudd for critically reading the manuscript and the Biotechnology and Biological Sciences Research Council for financial support.