Simultaneous expression of multiple proteins in plants finds ample applications. Here, we examined the biotechnological application of native kex2p-like protease activity in plants for coordinate expression of multiple secretory proteins from a single transgene encoding a cleavable polyprotein precursor. We expressed a secretory red fluorescent protein (DsRed) or human cytokine (GMCSF), fused to a downstream green fluorescent protein (GFP) by a linker containing putative recognition sites of the kex2p-like protease in tobacco cells and referred to them as RKG and GKG cells, respectively. Our analyses showed that GFP is cleaved off the fusion proteins and secreted into the media by both RKG and GKG cells. The cleaved GFP product displayed the expected fluorescence characteristics. Using GFP immunoprecipitation and fluorescence analysis, the cleaved DsRed product in the RKG cells was found to be functional as well. However, DsRed was not detected in the RKG culture medium, possibly due to its tetramer formation. Cleaved and biologically active GMCSF could also be detected in GKG cell extracts, but secreted GMCSF was found to be only at a low level, likely because of instability of GMCSF protein in the medium. Processing of polyprotein precursors was observed to be similarly effective in tobacco leaf, stem and root tissues. Importantly, we also demonstrated that, via agroinfiltration, polyprotein precursors can be efficiently processed in plant species other than tobacco. Collectively, our results demonstrate the utility of native kex2p-like protease activity for the expression of multiple secretory proteins in plant cells using cleavable polyprotein precursors containing kex2p linker(s).
Crops can be genetically manipulated to improve nutritional value, to resist diseases and stresses and to produce high-value products such as secondary metabolites, therapeutic agents, vaccines and industrial enzymes. First-generation genetically modified crops involve simple traits (e.g. herbicide resistance) that call for introduction of a single gene (in addition to the selection marker gene). Introducing multiple traits or more complex traits, however, often requires coordinated regulation of multiple endogenous genes or introduction of multiple transgenes. Two notable examples are the production of the biodegradable plastic polyhydroxybutyrate in Arabidopsis (Nawrath et al., 1994) and the synthesis of β-carotene in ‘golden rice’ (Potrykus, 2001). In producing multimeric proteins, it is also necessary to achieve similar levels of expression for the individual monomeric protein constituents, as in the development of plants producing recombinant antibodies, where co-expression of antibody heavy and light chains is required (Ma and Hein, 1996). As value-adding traits become ever more important to increase the profitability of agriculture and energy crops, the need to produce multiple co-products from the same crop will continue to grow, and meeting it will most certainly benefit from a technology that enables efficient co-expression of multiple proteins (Walker and Vierstra, 2007).
A handful of strategies have been reported for co-expression of multiple proteins in plants (Halpin et al., 2001; Hunt and Maiti, 2001; El Amrani et al., 2004; de Felipe et al., 2006). One promising strategy is to encode multiple proteins, connected in tandem, on a single open reading frame in the form of polyproteins. Specific proteolytic processing of the polyprotein precursor in vivo leads to the release of the individual proteins. This approach enables coordinate expression through the use of a single promoter. Beachy’s group demonstrated the feasibility of this approach by co-expressing two proteins separated by the tobacco etch virus (TEV) NIa protease recognition sequence (heptapeptide cleavage recognition sequence ENLYFQS) together with the NIa proteinase (Marcos and Beachy, 1994, 1997). Dasgupta et al. (1998) reported a similar polyprotein vector using an NIa proteinase from the tobacco vein mottling virus and demonstrated that subcellular localization signals can be effectively recognized in the context of a polyprotein. The TEV NIa proteinase domain is about 48 kDa (while a truncated version is about 27 kDa), which adds to the overall size of the polyprotein. The large size of the polyprotein could cause protein misfolding and result in low protein production yield (Ceriani et al., 1998; Dasgupta et al., 1998). More recently, Walker and Vierstra described a modified ubiquitin (Ub)-based vector for co-expression of two reporter proteins, LUC and GUS, in tobacco (Walker and Vierstra, 2007). While proper polyprotein processing was achieved using the Ub vector, the protein expression level was lower than that obtained when the same protein was expressed alone. Walker and Vierstra (2007) suggested that the large size of the polyprotein (152 kDa) may have caused unfavourable folding and led to low protein yield.
In lieu of co-expressing a heterologous protease as a constituent of the polyprotein, which is energetically wasteful and could complicate proper protein folding, it is plausible to utilize endogenous plant proteases to process the polyprotein precursor in vivo to release the individual protein elements. François et al. (2002a,b) utilized linker peptide sequences originating from a natural polyprotein occurring in seeds of Impatiens balsamina in constructing polyprotein constructs for expressing two antifungal proteins in Arabidopsis thaliana and demonstrated post-translational polyprotein precursor cleavage. Urwin et al. (1998) linked two protease inhibitors via a propeptide derived from a pea metallothionein-like protein and showed that the chimeric polyprotein precursor could be partially cleaved in A. thaliana. The actual protease(s) responsible for processing the polyprotein precursors reported in these studies have yet to be identified. Another plausible strategy in this type of approach is to use linker sequences that are putative substrates of known endogenous plant proteases. One example is to exploit the bacterial subtilisin-like or yeast kexin-like serine protease activities that have also been noted in plant systems. Yeast or mammalian kexins are type I integral membrane endopeptidases that reside in the trans-Golgi network (Wilcox and Fuller, 1991). The presence of kex2p-protease-like activity in tobacco was reported by Bruenn’s group, who demonstrated the processing of a virally encoded antifungal preprotoxin in tobacco (Kinal et al., 1995). Subsequently, Rogers’ group showed that the substrate specificity of plant kex2p-like protease is similar to that of the fungal and yeast kex2p proteases (Jiang and Rogers, 1999). 
Although no gene encoding plant kex2p proteases has been cloned from tobacco, expression of subtilisin-like serine proteases has been reported in a number of plant species including Arabidopsis (Liu et al., 2009) and tomato (Janzik et al., 2000). The findings from these fundamental research studies prompted us to examine the biotechnological applications of kex2p-like activities in plants for the expression of multiple secretory proteins from a single transgene encoding a cleavable polyprotein precursor.
In this study, we created polyprotein constructs that encode a red fluorescent protein variant (DsRed) or a human cytokine (granulocyte macrophage colony stimulating factor; GMCSF), linked to a green fluorescent protein variant (GFP) by three copies of a putative recognition site of the kex2p-like protease, and characterized the expression and processing of the polyprotein precursors to determine the general applicability of the kex2p substrate linker in cleavable polyprotein precursors for co-expression of multiple proteins in plants. Here, GMCSF serves as a model pharmaceutical protein, and we have exploited the unique benefit of stoichiometric expression using the cleavable polyprotein constructs to estimate the production level of the processed GMCSF based on the GFP fluorescence without requiring direct covalent fusion of GMCSF to GFP as in conventional GFP fusion protein approaches. By encoding GMCSF and GFP connected via a kex2p substrate linker in the polyprotein construct, it provides a novel alternative for monitoring protein expression using fluorescent protein fusion, and it is especially useful if permanent fusion hampers the protein function (Su, 2005).
Design of polyprotein expression constructs
In order to exploit the endogenous kex2p-like activity in plants as a general strategy for processing heterologous polyprotein precursors, we developed two binary expression vectors (termed RKG and GKG herein), each of which encodes a secretory signal peptide at the N-terminus, an upstream protein, a linker that contains putative cleavage sites for the kex2p-like protease (the kex2p linker) and a downstream protein (Figure 1). The upstream protein is DsRed or GMCSF, for RKG and GKG, respectively, whereas the downstream protein is GFP in both constructs. The rice α-amylase signal sequence and the Arabidopsis basic chitinase signal sequence were used in directing protein secretion for RKG and GKG, respectively. In both constructs, the kex2p linker sequence is LEAGG(IGKRGK)3EF (amino acids Leu/Glu and Glu/Phe were incorporated into the linker as a result of introducing the XhoI and EcoRI cloning sites, respectively), and it contains three copies of a putative recognition site of the kex2p-like protease (i.e. IGKR), which was modelled after the naturally occurring kex2p substrate sequence (IGKRGKRPR) in the Ustilago maydis KP6 preprotoxin. It was reported that when a gene encoding the KP6 preprotoxin was expressed in transgenic tobacco, the preprotoxin was correctly processed into α and β subunits at the kex2p cleavage site (Kinal et al., 1995). In a later study, a linker containing only a single kex2p cleavage site (either IMRKY or IGKRG) was used to link two reporter proteins (a mutated proaleurain and a truncated vacuolar sorting receptor BP-80), yet no cleavage of the linker was observed when the reporter construct was expressed in transgenic tobacco (Jiang and Rogers, 1998). 
Subsequently, the reporter construct was modified to incorporate a linker having three tandem copies of the IGKRG motifs to improve access of the peptide substrate by the kex2p-like proteases, and cleavage of the new linker was confirmed when the construct was expressed transiently in tobacco protoplasts (Jiang and Rogers, 1999). The kex2p linker was examined in this study for its cleavability in stably transformed Nicotiana tabacum calli, suspension cells and plant tissues, as well as in Agrobacterium tumefaciens-infiltrated leaves of Lactuca sativa L. var. longifolia (Romaine lettuce) and Nicotiana benthamiana. In addition, a construct similar to GKG but lacking the linker (i.e. GMCSF directly fused to the N-terminus of GFP, termed GG herein, Figure 1) (Peckham et al., 2006) was assembled to examine cellular processing of the fusion protein in the absence of the kex2p cleavage sites.
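To make the linker layout concrete, the following minimal Python sketch assembles the LEAGG(IGKRGK)3EF sequence described above and lists where the three IGKR motifs end. It assumes, purely for illustration, that cleavage falls on the carboxyl side of each IGKR motif; the function name find_motif_sites is hypothetical and not part of any published tool.

```python
import re

# Linker as described in the text: LEAGG + three IGKRGK repeats + EF
LINKER = "LEAGG" + "IGKRGK" * 3 + "EF"

def find_motif_sites(seq, motif="IGKR"):
    """Return the 0-based index of the residue immediately after each
    motif occurrence, i.e. where a cut C-terminal to the motif would fall
    (an assumption of this sketch)."""
    return [m.end() for m in re.finditer(motif, seq)]

sites = find_motif_sites(LINKER)
print(LINKER)
for cut in sites:
    # Show the sequence on either side of each putative cut position
    print(LINKER[:cut], "|", LINKER[cut:])
```

Each printed line shows one putative cut position; the residues to the right of the last cut are the ones that would remain attached to the downstream protein.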
Over twenty independent RKG, GKG or GG transgenic tobacco lines were developed and screened using PCR and GFP Western blot. One line each was selected for initiating callus and cell suspension cultures. The RKG callus cells displayed bright GFP and DsRed fluorescence when examined using an epifluorescence microscope fitted with GFP and DsRed filter sets, respectively (Figure 2). Strong fluorescence was seen in the periphery of the cell and around the nucleus, indicating cell wall and perinuclear endoplasmic reticulum (ER) localization, which is similar to that observed in Arabidopsis and tobacco cells expressing a secretory GFP (Haseloff et al., 1997; Su et al., 2004). GFP fluorescence was also visible in GKG and GG cells. These results suggest that in planta expression of the fusion protein constructs yielded properly folded proteins. More detailed characterizations were conducted subsequently to examine the cellular processing of the fusion proteins.
Processing of polyprotein precursors in planta
Proteolytic cleavage of the RKG and GKG polyprotein precursors in N. tabacum was investigated using Western blot analysis of protein extracts from calli, suspension cells, leaf, stem and root tissues of transgenic plants, agroinfiltrated leaf discs, as well as intercellular fluid from RKG and GKG calli or spent media from RKG and GKG cell suspension cultures. Antibodies against GFP, DsRed and GMCSF were used in Western blots to probe the protein processing. As shown in lane 1 of Figure 3a,c, a strong immunoreactive doublet corresponding to GFP (approximately 27 kDa) cleaved from the polyprotein is clearly visible in both RKG and GKG callus extracts. The doublet likely indicates GFP with and without the kex2p linker peptide. Some uncleaved polyprotein precursors were still present in the RKG extracts (approximately 55 kDa), but GKG processing appeared to be almost complete. It is known that kex2p activity is inhibited by PMSF and EDTA (Bader et al., 2008), which are included in our protein extraction buffer. Therefore, any post-extraction cleavage by kex2p should be inhibited, and what is observed on Western blots (e.g. Figures 3 and 4) should reflect the kex2p-mediated processing in vivo.
In concentrated spent media of either RKG or GKG suspension cultures, a strong anti-GFP antibody reacting band matching the GFP molecular weight (approximately 27 kDa) was also detected (Figure 3a,c, lane 2). Concentrates of spent media from RKG and GKG cultures exhibited fluorescence and absorption spectra that are characteristic of GFP. Similar to the processed GFP, an immunoreactive band that matches the expected molecular size of the processed DsRed protein (approximately 26 kDa) could be seen on the Western blot of RKG callus extract probed with an anti-DsRed antibody, while a second, less intense band matching the molecular weight of the RKG fusion protein (approximately 55 kDa) was somewhat visible (Figure 3b, lane 1). The processed DsRed produced in the RKG cells (Figure 3b, lane 1) has a slightly lower molecular weight than the bacteria-expressed DsRed standard, which is known to migrate on SDS-PAGE at a higher than expected molecular weight (Gross et al., 2000). The immunoreactive bands recognized by the DsRed antibody were weaker than the bands seen on the GFP Western blots. This might be caused by the lower affinity of the DsRed antibody, because we noted that the DsRed standard at 100 ng generated a less intense band than the GFP standard at just 50 ng (cf. Figure 3a,b, lane 3). This also explains why almost no RKG fusion protein is visible in the anti-DsRed Western blot (Figure 3b, lane 1). Unlike GFP, however, no DsRed protein was detected in the spent media of RKG suspension cultures (Figure 3b, lane 2). It is known that native DsRed is a tetramer (Baird et al., 2000) and that the tetrameric structure is required for its fluorescent function (Sacchetti et al., 2002). The lack of detectable DsRed in the spent media therefore likely results from formation of tetramers that are confined to the cell wall/apoplastic space. 
In subsequent experiments, we confirmed that the processed DsRed as well as the RKG fusion proteins indeed formed tetramers, as discussed further in the next section.
For the GKG culture, both processed GFP and GMCSF could be detected in callus extract (Figures 3c,d, lane 1). The processed GFP product exhibited expected molecular size, while for GMCSF, two immunoreactive bands (approximately 18 and 28 kDa) were noted on the Western blot (Figure 3d, lane 1). A similar dual band pattern was also reported for GMCSF expressed in rice seeds (Sardana et al., 2007). The higher molecular weight GMCSF band might be caused by glycosylation or dimerization as reported in transgenic tobacco cell culture (James et al., 2000), transgenic rice seeds (Sardana et al., 2007) and rice cell suspension culture (Shin et al., 2003) expressing human GMCSF. Whereas essentially no secreted GMCSF can be detected in the spent media of GKG cell culture, a putative secreted GMCSF is observed on the anti-GMCSF blot in the concentrated GKG callus apoplastic fluid sample, though at a level disproportionately lower than the amount of secreted GFP (cf. Figure 3c,d, lane 2). The putative extracellular GMCSF product also displayed a size smaller than its intracellular counterpart. This, along with the very low level of extracellular accumulation, could be attributable to the high susceptibility of GMCSF to proteolytic degradation in tobacco cell suspension culture as previously reported (Shin et al., 2003).
Because kexin-like serine protease activities are likely to be responsible for the observed processing of the RKG and GKG polyprotein precursors in tobacco, and because yeast and mammalian kexins reside in the trans-Golgi network (Wilcox and Fuller, 1991), we exploited the properties of the H/KDEL receptor and ER retention signal to gain further insight into the cellular location of the plant kex2p-like activity. We infected N. benthamiana leaves with an RNA-viral vector containing the coding sequence for the RKG polyprotein tagged with an ER retention signal (HDEL) at the C-terminus (i.e. the TTOSA1 RKG-HDEL viral vector, Figure 1). The mesophyll tissues of the virus-infected leaves exhibited both green and red fluorescence with high intensity, indicating that the fusion protein was expressed in these tissues at high levels. However, Western blot analyses with anti-GFP and anti-DsRed antibodies showed only the RKG fusion protein and no cleaved GFP or DsRed (Figure 3e,f). As the HDEL receptor resides in the cis-Golgi, the fact that the RKG polyprotein cleavage did not occur supports the notion that the tobacco kex2p-like activity is localized in the secretory pathway past the cis-Golgi stage. This finding is consistent with that of Jiang and Rogers (1999), who reported that kex2p-like activity was attenuated by treatment with brefeldin-A, which is known to induce destruction of Golgi stacks.
Tissue specificity of the kex2p-like activity on polyprotein processing was then analysed in transgenic tobacco RKG and GKG plants. Leaf, stem and root tissues from the transgenic RKG and GKG plants were collected and analysed by Western blot probed with GFP antibody. As shown in Figure 4, cleaved GFP along with some uncleaved polyprotein precursor can be detected in all tested tissues. While the extent of processing does not appear to differ much between the three types of tissues tested, the highest expression was seen in the root tissues for both RKG and GKG (note that equal total protein was loaded in each lane). This is likely due to the use of the (ocs)3/mas promoter, which was reported to give higher expression in roots compared with other parts of the plant (Ni et al., 1995). It is worth noting that, although expression was highest in the root tissues, the leaf tissues also showed substantial expression and processing (Figure 4, lane 2). We also noted that young developing leaves and mature (fully expanded) leaves showed similar processing efficiency. Considering that leaf agroinfiltration has gained favour as a preferred industrial molecular farming platform (D’Aoust et al., 2009), the endogenous kex2p-like activity in leaf tissues could be exploited for multi-protein expression at a large scale.
In summary, the majority of the RKG and GKG polyprotein precursors expressed in N. tabacum was indeed cleaved in vivo, yielding GFP that was secreted and either DsRed or GMCSF for RKG and GKG cells, respectively. We further verified that the observed cellular cleavage of RKG and GKG fusion proteins was indeed attributable to the presence of the kex2p linker as discussed in the following section.
Characterization of proteins processed from the polyprotein precursors
Using Western blots, we were able to demonstrate in vivo processing of the RKG and GKG proteins in tobacco cells. To determine whether the processed DsRed product was functional, we performed GFP immunoprecipitation (IP) to remove uncleaved RKG protein and processed GFP to allow analysis of the size and functionality of the processed DsRed product. After subjecting RKG callus extracts to IP with an anti-GFP antibody, almost all of the GFP fluorescence was lost (Figure 5a; indicating removal of both RKG fusion protein and processed GFP product) but about two-thirds of DsRed fluorescence remained in the solution (Figure 5b). These results indicate that the cleaved DsRed is functional (i.e. displays expected fluorescence characteristics). To investigate whether the RKG fusion protein and the released DsRed protein form tetramers, RKG callus extract was subjected to GFP IP, followed by DsRed Western blot analyses with partial denaturing (nonboiling) SDS-PAGE which has been shown to preserve the tetrameric structure of DsRed (Baird et al., 2000). As shown in Figure 5c, processed DsRed migrated as a tetramer (lane 4; approximately 100 kDa), while the uncleaved RKG fusion protein also appeared to be a tetramer (lane 2 upper band; approximately 200 kDa) for unboiled samples. Upon sample boiling, the corresponding monomeric RKG and processed DsRed became visible (Figure 5c, lanes 1 & 3).
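The tetramer assignments above rest on simple arithmetic, which the short Python sketch below spells out. It uses the approximate monomer sizes quoted in the text (about 26 kDa for processed DsRed and about 55 kDa for the uncleaved RKG fusion); the helper name expected_oligomer_kda is hypothetical, and migration on nonboiling SDS-PAGE is only an approximate indicator of mass.

```python
def expected_oligomer_kda(monomer_kda, subunits=4):
    """Nominal mass of an oligomer made of identical subunits."""
    return monomer_kda * subunits

# Approximate monomer sizes taken from the Western blot estimates in the text
DSRED_KDA = 26
RKG_KDA = 55

dsred_tetramer = expected_oligomer_kda(DSRED_KDA)
rkg_tetramer = expected_oligomer_kda(RKG_KDA)
print(f"DsRed tetramer: ~{dsred_tetramer} kDa; RKG tetramer: ~{rkg_tetramer} kDa")
```

The nominal values are close to the bands of approximately 100 and 200 kDa reported for the unboiled samples.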
Because in the kex2p-linker-based polyprotein construct, the protein moieties are translated from a single open reading frame and then released post-translationally by kex2p-like protease activity, the individual protein moieties should in principle be accumulated at approximately stoichiometric levels inside the cell, provided that they have similar protein stability. Using RKG cell extracts, we determined the concentrations of the processed DsRed and GFP based on fluorescence calibration curves established using pure fluorescent proteins. Approximately equimolar quantities of DsRed (monomer) and GFP were noted, demonstrating the feasibility to achieve stoichiometric equivalence of individual protein moieties using the kex2p-linker-based polyprotein constructs.
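The comparison described above can be sketched as follows: fit a calibration slope for each pure fluorescent protein, convert the extract readings to mass concentrations, and then to molar concentrations. This is a minimal illustration only; all standards and readings below are hypothetical numbers, the molecular masses (27 kDa for GFP, 26 kDa for the DsRed monomer) are the approximate sizes quoted in the text, and the function names are invented for this sketch.

```python
def fit_through_origin(conc, signal):
    """Least-squares slope for signal = slope * conc (blank-subtracted data)."""
    return sum(c * s for c, s in zip(conc, signal)) / sum(c * c for c in conc)

def micromolar(signal, slope, mw_kda):
    """Convert a fluorescence reading to micromolar.
    signal / slope gives ug/mL; dividing ug/mL by the mass in kDa gives uM."""
    return (signal / slope) / mw_kda

# Hypothetical calibration standards (ug/mL) and blank-subtracted readings
std_conc = [0.5, 1.0, 2.0, 4.0]
gfp_slope = fit_through_origin(std_conc, [50.0, 100.0, 200.0, 400.0])
dsred_slope = fit_through_origin(std_conc, [20.0, 40.0, 80.0, 160.0])

# Hypothetical readings from one cell extract
gfp_um = micromolar(270.0, gfp_slope, mw_kda=27.0)
dsred_um = micromolar(104.0, dsred_slope, mw_kda=26.0)

molar_ratio = dsred_um / gfp_um
print(f"DsRed:GFP molar ratio = {molar_ratio:.2f}")
```

A ratio near 1 would correspond to the approximately equimolar accumulation reported here; a real analysis would also need to correct for spectral overlap and for any uncleaved fusion protein contributing to both signals.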
Next, the biological activity of processed GMCSF from the GKG cells was determined using a GMCSF-dependent human TF1 cell proliferation assay. Utilizing the concept of stoichiometric expression of the protein moieties encoded in the polyprotein construct, as demonstrated in the expression of DsRed and GFP in RKG, the GMCSF concentration in the GKG extract was estimated based on the GFP fluorescence of the GKG extract. As GKG cleavage in vivo was found to be highly efficient (Figure 3c,d), biological activity of the GKG extract should derive primarily from the processed GMCSF. Different amounts of GKG callus extract that gave equivalent GMCSF concentrations from 0 to 20 ng/mL were added to the growth medium of human TF1 cells, and the extents of cell proliferation were measured and compared. To cancel out any potential effect of the tobacco callus extract, we used RKG extracts spiked with 0, 0.5, 1.0 and 2.0 ng/mL of purified recombinant GMCSF for comparison. The amount of the RKG extract used was chosen to match the total protein concentration of the GKG extract sample that gave 20 ng/mL of equivalent GMCSF. We obtained similar results when the background RKG extract protein amount was cut in half. As shown in Figure 6, increased TF1 proliferation (indicated by the increasing absorbance at 490 nm owing to the bioreduction of a tetrazolium compound and formation of coloured formazan by live cells) with increasing GMCSF concentrations was noted for both the GKG extract and the RKG extract spiked with pure GMCSF. The growth-stimulating effect tapered off at GMCSF concentrations exceeding about 5 ng/mL. This result demonstrates that the processed GMCSF produced in the GKG cells is biologically active. 
Furthermore, provided that the specific GMCSF activity in the GKG extract is similar to that of the pure GMCSF spiked into the RKG extract, the GMCSF concentration estimate based on GFP fluorescence of the GKG extract turned out to be quite accurate, as evidenced by the closeness of the two curves in Figure 6. At a GMCSF concentration of 2 ng/mL, the absorbance differed by <10% between the GKG extract and the RKG extract spiked with GMCSF.
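The estimate of GMCSF from GFP fluorescence amounts to a mass conversion under a 1:1 molar assumption, which the following Python sketch illustrates. The GFP mass of 27 kDa is the approximate value quoted in the text; the GMCSF mass of 18 kDa is an illustrative assumption based on the lower band seen on the Western blot and should be replaced with the actual polypeptide mass. The function name and the example reading are hypothetical.

```python
def estimate_partner_ng_per_ml(gfp_ng_per_ml, partner_kda, gfp_kda=27.0):
    """Mass concentration of the co-released partner protein, assuming it is
    present at the same molar concentration as GFP (1:1 release from the
    precursor and similar stability of both products)."""
    return gfp_ng_per_ml * partner_kda / gfp_kda

GMCSF_KDA = 18.0      # illustrative assumption, not a measured value
gfp_reading = 30.0    # hypothetical GFP concentration in ng/mL

gmcsf_estimate = estimate_partner_ng_per_ml(gfp_reading, GMCSF_KDA)
print(f"Estimated GMCSF: {gmcsf_estimate:.1f} ng/mL")
```

The conversion holds only to the extent that both moieties are equally stable in the extract; differential degradation of either product would bias the estimate.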
To demonstrate that the observed cellular cleavage of polyprotein precursor was indeed attributable to the presence of the kex2p linker, we transformed tobacco plants with a binary expression construct coding for GMCSF and GFP in direct tandem fusion, lacking the kex2p linker, but with a signal peptide for secretion (i.e. pBIN-GG, Figure 1). As shown in Figure 7a, lane 2, no cleavage occurred in the direct fusion protein. In addition, no GFP or GMCSF-GFP (GG) fusion protein was detected in the spent culture media (Figure 7a, lane 3). Note that the Arabidopsis basic chitinase secretion signal used in pBIN-GG was also used in pE1226-GKG, which did lead to secretion of the GFP-processing product (Figure 3c, lane 2).
We have taken several approaches to examine the location of cleavage in the polyprotein precursor. First, we purified the secreted GFP from the spent media of RKG suspension culture (expressing pE1226-RKG) and analysed its N-terminal amino acid sequence using Edman degradation. We found that the N-terminus of the processed and secreted GFP was short of five amino acid residues (GKEFS) from the anticipated linker cleavage site, i.e. at the C-terminus of IGKR (Table 1). Of the five missing amino acids, serine is the first amino acid of native GFP (Table 1). We also purified and sequenced secreted GFP from cultured transgenic tobacco cells that express GFP alone with a secretory signal (i.e. pBIN-GFP, Figure 1) (Su et al., 2004). Interestingly, about half of the GFP population was also missing the N-terminal serine residue, exactly as the processed and secreted GFP product from the RKG cells, while the remaining population had its signal peptide removed at exactly the expected site, leaving two extra amino acids, glutamic acid and phenylalanine (resulting from introduction of the EcoRI cloning site), in front of the first GFP amino acid serine (Table 1). Cellular processing of secreted GFP proteins expressed from pBIN-GFP vs. pE1226-RKG shared certain common features. For the secretory GFP derived from pBIN-GFP, the signal sequence is removed by the signal peptidase in the ER lumen, and the processed GFP travels through the secretory pathway and is destined for secretion. For the secretory GFP derived from pE1226-RKG, the RKG polyprotein precursor travels through the secretory pathway and is cleaved post-cis-Golgi, releasing the processed GFP, which is then secreted out of the cells. The secreted GFP from both systems exhibited removal of the N-terminal Ser residue. 
This finding suggests initial proteolytic cleavage by signal peptidase or kex2p-like protease (for pBIN-GFP and pE1226-RKG, respectively) followed by trimming the exposed N-terminal peptide extension via additional proteases in the secretory pathway. Another piece of evidence that supports this notion is from the GFP Western blots of RKG and GKG calli in which two distinctive immunoreactive bands around 27 kDa are always visible, while the lower band matches the size of the GFP band secreted into the medium (Figure 3a,c, lane 1 vs. lane 2). In another experiment, we digested partially purified polyprotein precursors from extracts of RKG suspension cells (by removing the cleaved DsRed and GFP products using Q-Sepharose and size-exclusion chromatography) with purified recombinant Saccharomyces cerevisiae Sckex2p from transgenic Pichia pastoris and analysed the resulting extracts using Western blot probed with GFP and DsRed antibodies. Intriguingly, the results (Figure 7b,c) indicated that both in vitro (Sc-kex2p) processed GFP and DsRed were slightly larger than the corresponding in planta processed products in the RKG callus extracts (cf. lanes 2 and 3). This result not only confirmed that the (IGKRGK)3 motif is indeed a suitable substrate for kex2p but also supported the aforementioned polyprotein cleavage hypothesis.
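The relationship between the anticipated cleavage site and the N-terminus observed by Edman degradation can be traced with a few lines of Python. The sketch assumes that the initial cut falls C-terminal to the last IGKR copy in the linker and takes Ser as the first GFP residue, as stated in the text; the function name is hypothetical, and no GFP sequence beyond that first residue is assumed.

```python
LINKER = "LEAGG" + "IGKRGK" * 3 + "EF"
GFP_FIRST_RESIDUE = "S"  # per the text, Ser is the first residue of the GFP used

def residues_after_last_site(linker, motif="IGKR"):
    """Linker residues left on the downstream protein if the cut occurs
    C-terminal to the last copy of the motif (an assumption of this sketch)."""
    cut = linker.rfind(motif) + len(motif)
    return linker[cut:]

extension = residues_after_last_site(LINKER)
trimmed = extension + GFP_FIRST_RESIDUE

print("Linker-derived extension on GFP after the initial cut:", extension)
print("Residues absent from the sequenced N-terminus:", trimmed)
```

Under these assumptions, the residues absent from the secreted GFP are the linker-derived extension plus the first Ser of GFP, consistent with trimming after the initial cut.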
Table 1. Molecular analysis of DsRed and green fluorescent protein (GFP) products released from the RKG polyprotein precursor using MS analysis and N-terminal sequencing
In addition to analysing the downstream GFP, we purified the upstream protein DsRed from the RKG cells and analysed it using mass spectrometry (MS). Purification of the processed DsRed from the RKG callus extract was achieved using copper affinity chromatography (Rahimi et al., 2007), followed by DEAE ion-exchange chromatography and size-exclusion chromatography. The purity was over 80% based on SDS-PAGE analysis. This purified DsRed was then analysed using N-terminal sequencing, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and electrospray ionization time-of-flight mass spectrometry (ESI-TOF MS). According to N-terminal sequencing, the DsRed product released from the RKG polyprotein lacks the first three residues (Met, Pro and Ser; cf. Table 1). The observed molecular mass of the processed DsRed protein measured using MALDI-TOF MS and ESI-TOF MS is 25 700 and 25 694 Da, respectively. This mass matches that of a DsRed protein lacking the three native N-terminal residues and having two additional amino acids (Leu and Glu, originating from the kex2p linker) at its carboxyl terminus (calculated mass is 25 699 Da; Table 1). This finding, along with those presented earlier, suggests initial proteolytic processing within the kex2p substrate linker and subsequent digestion of the peptide linker extension by additional proteases in the plant cell secretory pathway. There has been an increasing number of reports on proteolytic removal of a stretch of linear amino acids (such as affinity tags) fused to the termini of proteins targeted to the secretory pathway in some plant species (van Esse et al., 2006; Pizzuti and Daroda, 2008). It is postulated that any stretch of nonstructural and exposed peptide sequence not protected by a compact tertiary protein structure may be particularly vulnerable to proteolytic degradation, especially in cells of Solanaceous species (van Esse et al., 2006).
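How closely the measured masses agree with the calculated mass can be expressed as an absolute and a relative error. The short Python sketch below does this for the three values quoted above (25 700 Da by MALDI-TOF, 25 694 Da by ESI-TOF, 25 699 Da calculated); the function name is hypothetical, and the sketch makes no assumption about instrument accuracy.

```python
def mass_error(observed_da, calculated_da):
    """Return (absolute error in Da, relative error in ppm)."""
    delta = observed_da - calculated_da
    ppm = delta / calculated_da * 1e6
    return delta, ppm

CALCULATED_DA = 25699.0  # calculated mass quoted in the text
observed = {"MALDI-TOF": 25700.0, "ESI-TOF": 25694.0}

errors = {name: mass_error(mass, CALCULATED_DA) for name, mass in observed.items()}
for name, (delta, ppm) in errors.items():
    print(f"{name}: {delta:+.0f} Da ({ppm:+.0f} ppm)")
```

Both measurements fall within a few daltons of the calculated value, far less than the mass of a single amino acid residue, which is why the proposed composition (three N-terminal residues removed, Leu and Glu retained at the C-terminus) is distinguishable from alternatives differing by one residue.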
Processing of polyprotein precursors in other plant species
To examine whether plant endogenous kex2p-like activity could be used in plant species other than N. tabacum to process polyproteins, we conducted transient expression of the RKG polyprotein in L. sativa L. var. longifolia (Romaine lettuce) and N. benthamiana via agroinfiltration and analysed protein processing patterns by Western blotting probed with anti-GFP antibody. As shown in Figure 8, processed GFP can be detected in all three plant species upon agroinfiltration. This result suggests potentially general utility of plant endogenous kex2p-like activity for expressing multiple proteins from a single transgene in planta.
The purpose of this study was to exploit the plant endogenous kex2p-like protease activity in developing a strategy to co-express multiple proteins via in vivo proteolytic processing of a polyprotein precursor. While such a strategy has been hinted at in the patent and review literature (Broekaert et al., 2000; François et al., 2002a; Jiang and Sun, 2002; Faye et al., 2005), no detailed study has yet been published that investigated this approach as a general strategy for multi-protein expression in plant systems. In two previous studies, endogenous kex2p-like pro-protein convertase activity was implicated in the processing of a preprotoxin and a proaleurain reporter in tobacco, and the protease activity was reported to exhibit substrate specificity characteristic of yeast kex2p (Kinal et al., 1995; Jiang and Rogers, 1999). Using a purified recombinant Saccharomyces kex2p protease, we confirmed that the (IGKRGK)3 peptide sequence used in the present study is indeed a suitable substrate for yeast kex2p. We also demonstrated that this peptide sequence can be effectively cleaved in planta, releasing the proteins flanking the sequence. RKG and GKG polyprotein precursors are efficiently processed in tobacco cells into DsRed and GFP, and GMCSF and GFP, respectively. Furthermore, the processed GFP product from both constructs was successfully secreted into the culture medium. The processed DsRed product was found to be functional, though not detectable in the culture medium. The DsRed released from the RKG polyprotein formed tetramers that are likely confined within the cell wall, as penetration through the cell wall is known to be the limiting step for secretion of many recombinant proteins in plant cells (James and Lee, 2001). The GMCSF processed from the GKG polyprotein precursor showed biological activity in supporting proliferation of the GMCSF-dependent TF1 cells in a dose-dependent manner. 
Only very low amounts of GMCSF were detected in the GKG callus apoplastic concentrate, which might result from the proteolytic degradation of secreted GMCSF, as very high protease activity in apoplasts and spent culture medium of transgenic tobacco cells has been reported and GMCSF is known to be prone to proteolytic degradation in this environment (Lee et al., 2002; Shin et al., 2003).
The processing of the RKG and GKG polyprotein precursors in planta may involve sequential actions of plant endogenous kex2p-like endoproteases, followed by additional exoproteases that trim off peptide extensions originating from the cleaved kex2p linker. This interpretation is supported by three observations. First, the in planta-processed protein products are slightly smaller than the in vitro yeast kex2p-processed protein products derived from the RKG polyprotein precursor (Figure 7b,c). Second, N-terminal amino acid sequencing indicated that no extra amino acid originating from the kex2p linker sequence is present at the amino terminus of the processed and secreted GFP produced by the RKG cells. Third, N-terminal sequencing and MS-based mass analysis revealed that the processed DsRed contains only two extra residues originating from the kex2p linker (Leu and Glu) at its C-terminus (Table 1). When GFP alone was expressed and directed for secretion using a signal peptide, about half of the secreted GFP was found to lose its N-terminal serine residue, exactly as observed for the processed and secreted GFP product from the RKG cells. We also noted that on the GFP Western blots of RKG and GKG callus extracts, two distinct immunoreactive bands around 27 kDa are always visible, and the lower band matches the size of the GFP band secreted into the medium (Figure 3). The upper band in this case is likely processed GFP that still contains a peptide extension originating from the kex2p linker. All of these findings indicate initial proteolytic cleavage of the polyprotein precursors by kex2p-like protease and subsequent proteolytic trimming of the exposed peptide extension by additional enzymes in the secretory pathway. In fact, a similar observation has been reported for polyprotein vectors that utilize an intervening linker sequence derived from the natural polyprotein present in I. balsamina seeds (François et al., 2002c). It was reported that initial cleavage of the I. balsamina linker occurred by the action of an endoproteinase, and unidentified exoproteinases were responsible for the subsequent trimming of the cleaved protein products (François et al., 2002c).
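The endoprotease-then-exoprotease model can be illustrated with a short Python sketch built on the linker sequence LEAGG(IGKRGK)3EF used in the constructs. It assumes, as is typical for yeast kex2p, that cleavage occurs on the C-terminal side of each Lys-Arg pair; which of the three sites is actually used in planta is not established here, and the residue masses are standard average values rather than measurements from this study.

```python
# Linker between the two proteins, as described for the RKG construct.
LINKER = "LEAGG" + "IGKRGK" * 3 + "EF"

# Average residue masses in Da (standard values, not measured in this study).
RESIDUE_MASS = {"L": 113.159, "E": 129.115}


def dibasic_sites(seq, motif="KR"):
    """Return 1-based positions of the last motif residue.

    Assumption: a kex2p-like endoprotease cuts immediately after this residue.
    """
    n = len(motif)
    return [i + n for i in range(len(seq) - n + 1) if seq[i:i + n] == motif]


def extension_mass(residues):
    """Mass added to a protein by the given linker-derived residues."""
    return sum(RESIDUE_MASS[r] for r in residues)


sites = dibasic_sites(LINKER)

# Linker residues left on the upstream protein if the first site is cut.
upstream_tail = LINKER[:sites[0]]
# Linker residues left on the downstream protein if the last site is cut.
downstream_tail = LINKER[sites[-1]:]

print("Putative cleavage after linker positions:", sites)
print("Upstream extension before trimming:", upstream_tail)
print("Downstream extension before trimming:", downstream_tail)
print("Mass of a Leu-Glu extension (Da):", round(extension_mass("LE"), 3))
```

Under these assumptions, cleavage at the first site would leave a nine-residue extension on DsRed, so recovering only Leu and Glu at its C-terminus is consistent with further trimming by carboxypeptidase-type activity.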
We demonstrated in this study that the kex2p-linker-based polyprotein construct can be used to achieve nearly stoichiometric expression of the individual proteins encoded in the construct. This useful feature was exploited to estimate the titre of GMCSF from a simple GFP fluorescence measurement in the GKG cell extracts. Using an HDEL ER retention signal, we showed that cleavage of polyprotein precursors occurs en route through the secretory pathway in a cellular compartment beyond the cis-Golgi. Processing of polyprotein precursors was observed to be similarly effective in tobacco leaf, stem and root tissues. We also demonstrated, via agroinfiltration, that polyprotein precursors can be efficiently processed in plant species other than tobacco. These findings point to the potential general application of kex2p substrate linkers in the production of multiple proteins from a single transgene in large-scale molecular farming via agroinfiltration. Taken together, the kex2p-linker-based polyprotein constructs provide an effective tool that simplifies coordinate expression of multiple proteins in plant cells and enables new applications such as recombinant protein monitoring without requiring permanent reporter protein fusion.
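The idea of inferring the titre of a co-expressed protein from GFP can be sketched as a simple molar conversion. The Python function below is illustrative only: it assumes the GFP concentration has already been obtained from a fluorescence calibration, that the two products accumulate at a 1 : 1 molar ratio, and that the molecular masses are roughly 27 kDa for GFP (the band size noted above) and 14.5 kDa for the unglycosylated GMCSF polypeptide (an assumed value, not reported in this section).

```python
def partner_titre_from_gfp(gfp_mg_per_l, gfp_kda=27.0, partner_kda=14.5,
                           molar_ratio=1.0):
    """Estimate the titre (mg/L) of a protein co-expressed with GFP.

    gfp_mg_per_l : GFP concentration from a fluorescence calibration curve
    gfp_kda      : molecular mass of GFP in kDa (approximate)
    partner_kda  : molecular mass of the partner protein in kDa (assumed)
    molar_ratio  : moles of partner per mole of GFP (1.0 = stoichiometric)
    """
    gfp_umol_per_l = gfp_mg_per_l / gfp_kda        # mg/L divided by kDa gives umol/L
    partner_umol_per_l = gfp_umol_per_l * molar_ratio
    return partner_umol_per_l * partner_kda        # umol/L times kDa gives mg/L


if __name__ == "__main__":
    # Hypothetical GFP reading of 10 mg/L, used only to show the arithmetic.
    print(round(partner_titre_from_gfp(10.0), 2), "mg/L partner protein")
```

Any deviation from stoichiometric accumulation, for example through degradation of one product, can be represented by changing `molar_ratio`.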
Construction of TTOSA1 RKG-HDEL viral vector and plant transfection
A hybrid tobamoviral vector construct TTOSA1 RKG-HDEL encoding a fusion protein of DsRed, followed by a linker sequence, LEAGG(IGKRGK)3EF, and a GFP with an HDEL (His-Asp-Glu-Leu) sequence at its C-terminus for ER retention was constructed as follows. The GFP coding region was amplified from pBIN-mgfp5ER (Haseloff et al., 1997) using a forward primer (5′-gccgaattcagtaaaggagaagaacttttc-3′) and a reverse primer (5′-gcggaattccctaggttaaagctcatcatgtttgtatag-3′). The amplified 750-bp GFP fragment was digested with EcoRI and cloned into pLJ607 that contains the kex2p linker (Jiang and Rogers, 1999). The linker-ligated GFP was reamplified using a forward primer with a XhoI site (5′-gcactcgaggcaggaggaataggcaaacgg-3′) and a reverse primer with an AvrII site and the codons for the C-terminal ER retention signal HDEL (5′-gcggaattccctaggttaaagctcatcatgtttgtatag-3′). The DsRed sequence was amplified from pGDR (Goodin et al., 2002) using a forward primer with a SphI site (5′-gcagcatgccctcctccgagaacgtcatc-3′) and a reverse primer with a XhoI site (5′-gctctcgagctcatcatgcaggaacaggtggtggcg-3′). The amplified fragments of DsRed and the linker-ligated GFP were digested with the respective restriction enzymes and ligated simultaneously into the viral vector TTOSA1-103SPEKCD43 that had been digested with SphI and AvrII, behind the rice α-amylase signal peptide, to yield the plasmid TTOSA1-RKG-HDEL (Figure 1). The viral vector was transformed into Escherichia coli C600. In vitro RNA transcripts derived from the TTOSA1-RKG-HDEL clone were mechanically inoculated onto N. benthamiana plants for systemic infection. Proteins extracted from the leaves 1–2 weeks after the infection were used for analysis.
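As a quick consistency check on the cloning design, the primer sequences quoted above can be scanned for the restriction sites they are stated to carry. The following Python sketch does this with plain string matching; the primer labels are descriptive names chosen here for illustration, and the recognition sequences are the standard ones for each enzyme.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG", "AvrII": "CCTAGG", "SphI": "GCATGC"}

# (label, primer 5'->3', sites the text says the primer carries)
PRIMERS = [
    ("GFP forward", "gccgaattcagtaaaggagaagaacttttc", ["EcoRI"]),
    ("linker-GFP forward", "gcactcgaggcaggaggaataggcaaacgg", ["XhoI"]),
    ("GFP-HDEL reverse", "gcggaattccctaggttaaagctcatcatgtttgtatag", ["AvrII"]),
    ("DsRed forward", "gcagcatgccctcctccgagaacgtcatc", ["SphI"]),
    ("DsRed reverse", "gctctcgagctcatcatgcaggaacaggtggtggcg", ["XhoI"]),
]

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence in upper case."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def missing_sites(primer, expected):
    """List the expected enzymes whose site is absent from the primer."""
    return [name for name in expected if SITES[name] not in primer.upper()]


for label, seq, expected in PRIMERS:
    absent = missing_sites(seq, expected)
    print(f"{label}: {'ok' if not absent else 'missing ' + ', '.join(absent)}")

# Coding-strand view of the HDEL reverse primer: the codons CAT GAT GAG CTT
# (His-Asp-Glu-Leu) should be followed directly by the stop codon TAA.
hdel_coding = reverse_complement(PRIMERS[2][1])
print("HDEL-stop present:", "CATGATGAGCTTTAA" in hdel_coding)
```

A check of this kind only confirms that the sites are present in the oligonucleotides; it says nothing about reading frame in the final construct, which still requires sequence confirmation.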
Construction of plant binary expression vectors
A binary construct pE1226-RKG was created from the TTOSA1-RKG-HDEL construct by eliminating the ER retention signal HDEL at the C-terminus of the fusion protein. The upstream DsRed along with the secretory signal peptide coding sequence was amplified from the TTOSA1-RKG-HDEL construct using a forward primer containing a SalI site (5′-gcagtcgactgtgtctgcaccatgcaggtg-3′) and a reverse primer containing a XhoI site (5′-gctctcgagcaggaacaggtggtggcggcc-3′). Similarly, the downstream GFP along with the kex2p linker sequence was amplified using a forward primer containing a XhoI site (5′-gcactcgaggcaggaggaataggcaaacgg-3′) and a reverse primer containing an XbaI site (5′-gcgtctagattatttgtatagttcatccat-3′). The two fragments were then successively ligated into pBluescript SK after digestion with the respective restriction enzymes to make pBluescript-RKG. Upon confirming the DNA sequence, the entire polyprotein coding sequence was excised from pBluescript-RKG by digesting with SalI and SpeI and ligated into pE1226 digested with the same enzymes to yield pE1226-RKG. Expression of the RKG coding sequence is under the control of the chimeric mannopine/octopine synthase (ocs)3/mas promoter (Ni et al., 1995), which gives equivalent or stronger expression than the commonly used CaMV 35S promoter (Ni et al., 1995; Becerra-Arteaga et al., 2006).
A direct fusion of GMCSF in tandem with GFP (without the linker) was expressed under the control of a double 35S promoter with an alfalfa mosaic virus (AMV) enhancer (Datla et al., 1993) using a binary vector pBIN-GG (Figure 1). An Arabidopsis basic chitinase signal sequence was used to direct secretion of the fusion protein. Construction of the pBIN-GG vector is described elsewhere (Peckham et al., 2006). To construct the GKG expression vector, GMCSF along with the Arabidopsis signal peptide was amplified from pBIN-GG by PCR using a forward primer containing a SalI site followed by the Kozak sequence (5′-cgcgtcgacgccaccatggcgactaatctt-3′) and a reverse primer containing a XhoI site (5′-ccgctcgagctcctggactggctcccagcag-3′). The fragment was then digested with SalI and XhoI and moved into the pBluescript-RKG that was digested with the same restriction enzymes to make pBluescript-GKG. Finally, the entire fusion sequence was moved into pE1226 to yield pE1226-GKG. The construction of the secretory GFP vector (pBIN-GFP, Figure 1) is described in Su et al. (2004).
The binary constructs were transformed into A. tumefaciens LBA4404 containing a disarmed Ti plasmid pAL4404 (Invitrogen, Carlsbad, CA, USA) by electroporation and selected on YM medium (0.04% yeast extract, 1.0% mannitol, 1.7 mM NaCl, 0.8 mM MgSO4, 2.2 mM K2HPO4) containing 100 mg/L kanamycin and 100 mg/L streptomycin. After confirmation by PCR, the transformed bacteria were used to infect tobacco (N. tabacum cv Xanthi) leaves. Leaf discs from 3-week-old plants were inoculated with the transformed Agrobacterium in the presence of 0.2 mM acetosyringone and placed on MS medium (Murashige and Skoog, 1962) containing 1 mg/L 6-benzylaminopurine (BAP) for shoot induction, 300 mg/L kanamycin for selection and 300 mg/L cefotaxime to inhibit bacterial growth. The resultant shoots were then transferred onto medium without BAP for root growth and then screened for the expression level of the recombinant protein by Western blot analyses. Calli were initiated from stem sections of selected high-expressing transgenic lines on MS medium containing 3% sucrose, 1 mg/L 2,4-dichlorophenoxyacetic acid (2,4-D) and 0.1 mg/L kinetin. Cell suspension cultures were initiated from the callus tissues in the same medium.
Protein extraction was carried out essentially as reported earlier (Peckham et al., 2006) with minor changes. Briefly, intracellular soluble proteins were extracted from leaves or calli by grinding in liquid nitrogen with an equal volume of an aqueous buffer containing 0.25 M sodium borate (pH 8.0), 0.25 M NaCl, 1% caffeine, 1% ascorbic acid, 1 mM DTT, 5 mM EDTA and 1 mM PMSF. Extracellular fluids were concentrated from the spent medium (of suspension culture) or from the intercellular space of calli, using a 10-kDa molecular weight cut-off centrifugal membrane filter (Millipore, Billerica, MA, USA). All extract samples were prepared fresh and used immediately for subsequent Western blot, chromatography or bioassay experiments.
SDS-PAGE and Western blot analysis
Extracted proteins were separated by SDS-PAGE and blotted onto a PVDF membrane (Immobilon-P; Millipore). Protein-blot membranes were blocked and probed with rabbit anti-GFP (Invitrogen), rabbit anti-DsRed (Clontech) or mouse anti-GMCSF antibody (R&D Systems, Minneapolis, MN, USA) followed by goat anti-rabbit or goat anti-mouse alkaline phosphatase-conjugated secondary antibodies (Southern Biotech, Birmingham, AL, USA) and detected by the BCIP-NBT coupling reaction. Partially denaturing SDS-PAGE, used for determining the oligomeric structure of the RKG polyprotein precursor and processed DsRed, was based on the observation that DsRed remains a tetramer in SDS-PAGE without sample boiling (Baird et al., 2000). The method of Baird et al. (2000) was slightly modified: samples were mixed 4 : 1 with 5× SDS sample buffer (containing 10% β-mercaptoethanol) and immediately loaded onto a 10% SDS-PAGE gel without boiling. After electrophoresis, the samples were further analysed by immunoblotting. In all Western blots probed for GFP reported here, except in Figures 3e and 7a in which a recombinant GFP protein from Clontech was used as standard, an E. coli-expressed His6-GFP-E3 fusion protein of about 30 kDa was used as the GFP standard. The 21-amino acid E3 is a helical coil peptide motif (Litowski and Hodges, 2002).
GFP purification for N-terminal amino acid sequencing
Secreted GFP derived from liquid-cultured tobacco cells expressing RKG or GFP alone (Su et al., 2004) was purified for N-terminal amino acid sequencing. Spent media of liquid suspension cultures were separated from the cells by filtration through a nylon gauze (mesh size 10 μm) and clarified by adding ammonium sulphate to 30% saturation. The clarified medium was then concentrated with a 10-kDa molecular weight cut-off centrifugal filter device (Millipore), separated by SDS-PAGE and blotted onto a PVDF membrane. The candidate bands were excised from the membrane after staining with a Coomassie blue dye. The N-terminal amino acid sequencing was conducted using the Edman degradation technique on a Perkin Elmer Applied Biosystems Procise 494 (Applied Biosystems, Carlsbad, CA, USA) protein/peptide sequencer with an on-line Perkin Elmer Applied Biosystems Model 140C PTH Amino Acid Analyzer, performed by the Protein Facility at Iowa State University.
DsRed purification for N-terminal sequencing, MALDI-TOF MS and ESI-TOF MS analysis
DsRed was purified from RKG calli using a three-step column chromatography procedure. RKG callus extract was first subjected to copper affinity chromatography as described by Deo and coworkers (Rahimi et al., 2007) with some modifications. Briefly, 0.1 M CuSO4 was loaded onto a HiTrap Chelating HP column (GE Healthcare, Piscataway, NJ, USA) connected to a BioLogic DuoFlow chromatography system (Bio-Rad, Hercules, CA, USA). The column was washed with deionized water, followed by 20 mM sodium acetate (pH 4.0) buffer containing 1 M NaCl and equilibrated with a binding buffer (50 mM sodium phosphate containing 300 mM NaCl and 5 mM imidazole at pH 8). After loading the sample, the column was washed with the binding buffer and eluted with a linear gradient of imidazole from 5 to 100 mM in the binding buffer over 20 column volumes. Fluorescent fractions were concentrated and further separated on a Shodex IEC DEAE-825 column connected to a Shimadzu HPLC system equipped with a fluorescence detector. Prior to sample injection, the IEC column was pre-equilibrated with 20 mM Tris–Cl buffer (pH 8.0) (buffer A). After sample loading, the column was isocratically washed with a mixture of buffer A and the same buffer containing 500 mM NaCl (buffer B) at a volume ratio of 95 : 5 for 10 min, followed by elution with a linear gradient of buffers A and B from 95 : 5 to 50 : 50 over 45 min. A flow rate of 1 mL/min was used during the entire purification procedure.
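The ion-exchange programme described above (isocratic wash at 95 : 5 for 10 min, then a linear gradient to 50 : 50 over 45 min) translates directly into a NaCl concentration at the column inlet. The Python sketch below computes it, assuming that time zero is the start of the wash, that buffer A contains no NaCl, and that the final composition is simply held after the gradient ends; system dead volume is ignored.

```python
B_NACL_MM = 500.0     # NaCl concentration in buffer B (mM)
WASH_MIN = 10.0       # isocratic wash duration (min)
GRADIENT_MIN = 45.0   # linear gradient duration (min)
START_PCT_B = 5.0     # % buffer B during the wash
END_PCT_B = 50.0      # % buffer B at the end of the gradient


def percent_b(t_min):
    """Percentage of buffer B delivered at time t_min after the wash starts."""
    if t_min <= WASH_MIN:
        return START_PCT_B
    elapsed = min(t_min - WASH_MIN, GRADIENT_MIN)  # hold final value afterwards
    return START_PCT_B + (END_PCT_B - START_PCT_B) * elapsed / GRADIENT_MIN


def nacl_mm(t_min):
    """NaCl concentration (mM) of the mixed mobile phase at time t_min."""
    return percent_b(t_min) * B_NACL_MM / 100.0


if __name__ == "__main__":
    for t in (0, 10, 25, 40, 55):
        print(f"t = {t:>2} min: {percent_b(t):5.1f}% B, {nacl_mm(t):6.1f} mM NaCl")
```

With these settings the gradient rises by 1% buffer B, or 5 mM NaCl, per minute, which makes it straightforward to relate the retention time of a fluorescent fraction to the approximate salt concentration at which it elutes.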
After the ion-exchange separation, the DsRed fluorescent fractions were concentrated and separated on a BIOSEC S-2000 size-exclusion column (Phenomenex, Torrance, CA, USA) connected to a Shimadzu HPLC system equipped with a fluorescence detector, using a mobile phase of 20 mM sodium phosphate buffer (pH 7.0) containing 200 mM NaCl at a flow rate of 1 mL/min. The DsRed fluorescent fractions were concentrated and desalted with Zeba spin desalting columns (Thermo Scientific, Waltham, MA, USA). N-terminal sequencing of the processed DsRed was performed using the same procedure as for GFP described earlier. MALDI-TOF mass spectrometry was performed by the Protein Facility at Iowa State University using a Voyager DE-Pro MALDI-TOF mass spectrometer (Applied Biosystems) operated in linear mode. Electrospray ionization (ESI) mass spectrometry was performed on an Agilent 6210 LC/MS-TOF system fitted with an ESI source operated in positive ion mode. MassHunter software (Agilent Technologies, Inc., Santa Clara, CA, USA) was used for data acquisition. ESI scanning was performed over an m/z range from 150 to 3200. Samples were acidified in 1% formic acid for 10 min to dissociate DsRed tetramers into monomers before being applied to the mass spectrometer with a solvent consisting of 60% acetonitrile in water plus 0.1% formic acid.
Expression and purification of yeast kex2p for in vitro proteolytic digestion
Pichia pastoris expressing the S. cerevisiae kex2p gene (Sckex2p) was a kind gift from Professor Bernhard Hube of the Friedrich-Schiller-University, Jena, Germany. Expression of Sckex2p in P. pastoris and subsequent purification were performed essentially as described by Hube and coworkers (Bader et al., 2008). Briefly, P. pastoris cells were grown in buffered minimal glycerol medium at 30 °C overnight and then in buffered minimal methanol medium for 16 h. The culture supernatant was then collected and concentrated with a 30-kDa molecular weight cut-off centrifugal filter and desalted with a PD-10 column. The desalted crude enzyme sample was loaded onto a HiTrap ANX FF anion-exchange column (GE Healthcare), rinsed with 50 mM Bis-Tris buffer (pH 4.5) containing 10 mM NaCl and then eluted with the same buffer containing 100 mM NaCl. The purity of the recovered yeast kex2p protein was verified by SDS-PAGE. For in vitro proteolytic digestion, purified yeast kex2p was incubated with RKG and GKG protein extracts at room temperature for 1 h. The enzymatic digestion was stopped by rapid heating to 100 °C for 4 min in the SDS-PAGE loading buffer.
Separation of cleaved DsRed and RKG fusion protein
RKG calli were extracted and the extract was loaded onto a Q-Sepharose HP anion-exchange column (GE Healthcare) connected to a BioLogic DuoFlow chromatography system (Bio-Rad), washed with 20 mM Tris–Cl (pH 8.0) and eluted with a linear gradient from 0 to 500 mM NaCl over 20 column volumes. Fractions with DsRed fluorescence were collected and further separated on a BIOSEC S-2000 size-exclusion column (Phenomenex) connected to a Shimadzu HPLC system equipped with a fluorescence detector. The mobile phase was 50 mM sodium phosphate buffer (pH 6.8), and the flow rate was set at 1 mL/min. The molecular weights corresponding to peaks displaying DsRed fluorescence were calculated based on a calibration curve established with gel filtration standards (Bio-Rad). By combining ion-exchange and size-exclusion chromatography, the uncleaved RKG protein could be separated from the cleaved DsRed and GFP products. In the experiments to confirm the functionality (fluorescence) of DsRed processed from the RKG fusion protein, uncleaved RKG fusion protein and processed GFP in the RKG protein extracts were removed by GFP immunoprecipitation using Dynabeads® protein G (Invitrogen) conjugated with anti-GFP antibody (Cell Sciences, Canton, MA, USA) following the manufacturer’s instructions.
Transient expression was carried out in leaf discs by vacuum-infiltrating the recombinant Agrobacteria (Joh et al., 2005). Briefly, fresh streaks were made from the glycerol stocks of A. tumefaciens C58C1 transformed with the respective binary vector on LB-agar plates containing 100 μg/mL of kanamycin and 15 μg/mL of tetracycline and incubated for about 48 h at 28 °C. A single colony was then inoculated into a starter culture of 5 mL of YEP medium (10 g/L yeast extract, 10 g/L peptone and 5 g/L sodium chloride) containing 100 μg/mL of kanamycin and 15 μg/mL of tetracycline and incubated for about 48 h at 28 °C with shaking at 200 rpm. The starter culture was in turn inoculated into 250 mL of the same medium containing the antibiotics and incubated under the same conditions until the optical density of the culture at 600 nm reached about 0.5. The Agrobacterium cells were then collected by centrifuging the cultures at 3000 g for 5 min at room temperature and resuspended in 125 mL of sterile distilled water containing 200 μM acetosyringone and 0.01% Tween-20. Leaves from about 3-week-old tobacco or N. benthamiana plants grown in the laboratory, or from L. sativa L. var. longifolia (Romaine lettuce) purchased from a local market, were used for the infiltration. Leaf discs of about 1.3 cm diameter were punched out using a cork borer. Approximately 2 g of leaf discs were used per 100 mL of Agrobacterium suspension. Baffled Erlenmeyer flasks (250 mL) were used with 125 mL of Agrobacterium suspension per transformation. The flasks were placed in a desiccator connected to a vacuum source. A vacuum of about 250 mbar was applied for 20 min with gentle shaking (150 rpm) of the desiccator. The vacuum was then quickly released to facilitate efficient infusion of the bacteria into the tissue. The leaf discs were drained of Agrobacterium suspension and placed on sterile moistened filter paper in 10-cm Petri plates. The plates were sealed with paraffin film and incubated in the dark at 26 °C for 3–6 days.
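For planning purposes, the quantities needed per infiltration flask follow from the concentrations given above. The Python sketch below performs this arithmetic; it assumes a molar mass of 196.2 g/mol for acetosyringone and treats the 0.01% Tween-20 as volume per volume, neither of which is specified in the protocol itself.

```python
ACETOSYRINGONE_MW = 196.2  # g/mol; assumed value, not stated in the protocol


def infiltration_mix(volume_ml=125.0, aceto_um=200.0, tween_pct=0.01,
                     discs_g_per_100ml=2.0):
    """Return reagent and tissue amounts for one flask of suspension.

    volume_ml         : volume of Agrobacterium suspension per flask (mL)
    aceto_um          : acetosyringone concentration (micromolar)
    tween_pct         : Tween-20 concentration (% v/v, an assumption)
    discs_g_per_100ml : leaf-disc loading (g per 100 mL suspension)
    """
    litres = volume_ml / 1000.0
    aceto_mg = aceto_um * 1e-6 * litres * ACETOSYRINGONE_MW * 1000.0
    tween_ul = volume_ml * tween_pct / 100.0 * 1000.0
    discs_g = volume_ml * discs_g_per_100ml / 100.0
    return {"acetosyringone_mg": aceto_mg, "tween20_ul": tween_ul,
            "leaf_discs_g": discs_g}


if __name__ == "__main__":
    for name, value in infiltration_mix().items():
        print(f"{name}: {value:.3f}")
```

For the 125-mL suspension used per transformation, this corresponds to roughly 4.9 mg of acetosyringone, 12.5 μL of Tween-20 and 2.5 g of leaf discs.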
Cell proliferation assay of GMCSF
The biological activity of GMCSF expressed in GKG calli was assayed using TF-1 cells (Kitamura et al., 1989). A cryo-preserved stock culture of TF-1 cells (ATCC CRL-2003) was inoculated into RPMI-1640 medium supplemented with 10% foetal bovine serum, 5 ng/mL GMCSF and 1× penicillin–streptomycin and incubated at 37 °C under 5% CO2. The cells were subcultured into fresh medium for at least three passages before being used for the assay. The assay procedure was essentially as described previously (Peckham et al., 2006). Briefly, TF-1 cells of at least 90% viability were collected by brief centrifugation and suspended in the above medium without GMCSF to a density of 2–3 × 10⁵ cells/mL and supplemented with different amounts of GKG callus extracts or RKG callus extract spiked with different amounts of pure GMCSF protein. The callus extraction buffer consisted of 20 mM sodium phosphate buffer (pH 7.4), 150 mM NaCl and 1 mM PMSF. Ninety-six-well culture plates were used with 100 μL of culture per well. After 44 h of incubation, AQueous One Solution (Promega, Madison, WI, USA) was added and the colour change was monitored at 490 nm for 4 h. All treatments were in triplicate.
This work was supported in part by NSF (BES01-26191), USDA Hatch and USDA TSTAR (2008-34135-19407). The authors thank Professors Bernhard Hube, John Rogers and Stanton Gelvin for the Saccharomyces kex2p-expressing Pichia strain, the pLJ607 plasmid and the pE1226 vector, respectively, Mr. Rosanto Paramban and Dr. Bob Bugos for constructing the pBIN-GG construct, and Dr. Monto Kumagai for his assistance in constructing the TMV viral vector. The authors are also indebted to Professors Philip Williams and Jon-Paul Bingham, and Mr. Zhibin Liang for their assistance in the ESI-TOF MS analysis.