Inhibition of 2A-mediated ‘cleavage’ of certain artificial polyproteins bearing N-terminal signal sequences

Where 2A oligopeptide sequences occur within ORFs, the formation of the glycyl-prolyl peptide bond at the C-terminus of (each) 2A does not occur. This property can be used to concatenate sequences encoding several proteins into a single ORF: each component of such an artificial polyprotein is generated as a discrete translation product. 2A and ‘2A-like’ sequences have become widely utilised in biotechnology and biomedicine. Individual proteins may also be co- and post-translationally targeted to a variety of sub-cellular sites. In the case of polyproteins bearing N-terminal signal sequences we observed, however, that the protein downstream of 2A (no signal) was translocated into the endoplasmic reticulum (ER). We interpreted these data as a form of ‘slipstream’ translocation: downstream proteins, without signals, were translocated through a translocon pore already formed by the signal sequence at the N-terminus of the polyprotein. Here we show this effect is, in fact, due to inhibition of the 2A reaction (formation of fusion protein) by the C-terminal region (immediately upstream of 2A) of some proteins when translocated into the ER. Solutions to this problem include the use of longer 2As (with a favourable upstream context) or modifying the order of proteins comprising polyproteins.

2As are neither proteolytic elements nor substrates for cellular proteinases, but mediate a newly discovered form of translational recoding event referred to variously as ribosome 'skipping', 'stop-go' and 'stop-carry on' translation [5,8,9]. When a ribosome encounters 2A within an ORF it 'skips' the synthesis of the glycyl-prolyl peptide bond at the C-terminus of 2A. The nascent protein is released from the ribosome by eukaryotic release (termination) factors 1 and 3 (eRF1, eRF3), 2A forming its C-terminus [9]. The ribosome then resumes translation of the downstream sequences. As one test of our intra-ribosomal, co-translational, model for the mechanism of the 2A reaction, we assembled complementary DNA (cDNA) encoding a synthetic polyprotein comprising a protein targeted to the exocytic pathway (prepro-α factor; αF), 2A, and green fluorescent protein (GFP). Expression in yeast showed that the translation product upstream of 2A was, indeed, targeted to the endoplasmic reticulum (ER) whilst the downstream product localised to the cytosol. The 2A reaction occurred even though the nascent protein was 'shielded' from cytosolic proteinases by the establishment of a ribosome:translocon complex [7].
To extend the utility of 2A in the co-expression of proteins targeted to different sub-cellular compartments, we constructed plasmids encoding a panel of polyproteins similar to those used in our yeast analyses. Our 'basic' polyprotein construct comprised enhanced yellow fluorescent protein (EYFP), 2A, enhanced cyan fluorescent protein (ECFP), a second 2A and puromycin resistance (PAC; plasmid pPDF20). All proteins were co-expressed and located to the cytoplasm [28]. This construct was modified by the insertion of a GalT type II signal-anchor sequence (β-1,4 galactosyltransferase, GT) onto the N-terminus, to encode a [GT-EYFP-2A-ECFP-2A-PAC] polyprotein (plasmid pPDF18). When analysed using translation systems in vitro the expected translation profile was observed: high-level 'cleavage' (>90%) at both of the 2As, producing the major translation products [GT-EYFP-2A], [ECFP-2A] and PAC. When this construct was used to transfect HeLa cells, however, whilst the fluorescence signal from EYFP (correctly) localised in the Golgi, the signal from ECFP localised to the ER [28]. The latter was unexpected since [ECFP-2A] did not bear a signal sequence. Furthermore, this result was at variance with our findings in yeast, where the protein downstream of 2A localised to the cytosol and was also at variance with reports in the literature where secreted heterodimers were co-expressed using 2A (discussed below).
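The expected translation profile described above can be illustrated with a minimal sketch. It assumes that the skipped bond lies between the glycine of a C-terminal -NPG tail of 2A and the following proline; that motif, the function name `skip_products` and the toy sequences are illustrative assumptions and are not the sequences encoded by pPDF18.

```python
import re

# Assumption: 2A ends in ...NPG and the next residue (P) becomes the first
# residue of the downstream product. The motif is illustrative only.
SKIP_SITE = re.compile(r"NPG(?=P)")


def skip_products(polyprotein: str) -> list[str]:
    """Split a polyprotein at every G|P junction that follows an NPG tail.

    Each upstream product keeps 2A as a C-terminal extension; each
    downstream product starts with proline.
    """
    products = []
    start = 0
    for match in SKIP_SITE.finditer(polyprotein):
        products.append(polyprotein[start:match.end()])
        start = match.end()
    products.append(polyprotein[start:])
    return products


# Toy stand-ins (hypothetical): protein 1 - 2A - protein 2 - 2A - protein 3
TOY_2A_TAIL = "DVESNPG"
toy_polyprotein = "MAAAA" + TOY_2A_TAIL + "PCCCC" + TOY_2A_TAIL + "PKKKK"

for product in skip_products(toy_polyprotein):
    print(product)
```

With two 2A sequences the sketch yields three products, mirroring [GT-EYFP-2A], [ECFP-2A] and PAC. A sequence in which the reaction is fully inhibited would correspond to the unsplit input string.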
Our conclusion from the data derived from the fluorescent proteins bearing N-terminal co-translational signal sequences (types I and II) was that (i) translation of the first protein (bearing a signal sequence) led to the formation of the ribosome:translocon (Sec61) complex, (ii) the N-terminal protein was translocated into the ER and (iii) the protein downstream of 2A (no signal sequence) simply 'slipstreamed' through the pore of the translocon complex already formed [28].
Here we present data that show this interpretation to be incorrect. We show that an interaction between the C-terminal region of certain nascent peptides and the translocon complex can affect the structure of the C-terminus of 2A within the peptidyl-transferase centre of the ribosome. This leads to inhibition of the 2A reaction, greatly increasing peptide bond formation and the production of 'uncleaved' fusion proteins. The fluorescence patterns we observed were primarily due, therefore, to the localisation and, to some extent, the fluorescent properties of these uncleaved forms.

In vitro translation
Coupled transcription/translation assays were performed using rabbit reticulocyte lysates (Promega) as described [7].

Immunofluorescence
Transfected cells were fixed and then incubated with primary antibodies, either (i) rabbit polyclonal anti-2A (kind gift of Dr D. Vignali) or (ii) mouse monoclonal anti-V5 (kind gift of Prof. R. Randall). Texas red goat anti-rabbit and Texas red goat anti-mouse (Molecular Probes) were used as secondary antibodies.

Western blotting
HeLa cells were transfected and lysates collected 48 h later. Samples were run in 10 or 12.5% SDS-PAGE gels and transferred to Immobilon-P membranes (Millipore). Membranes were probed with primary antibodies, either (i) mouse monoclonal anti-GFP (Roche), (ii) anti-V5 or (iii) anti-2A antibodies. Secondary antibodies used were ECL anti-mouse IgG-Peroxidase from sheep (GE Healthcare) and anti-rabbit IgG-Peroxidase from goat (Sigma). Membranes were developed using ECL Plus Western Blotting Detection System (GE Healthcare).

Sub-cellular localisation
To simplify certain plasmid constructions and to more closely mimic those constructs analysed in yeast [7], the [2A-PAC] portion of the pPDF18 polyprotein was deleted to create plasmid pPDF45ΔB, encoding [GT-EYFP-2A-ECFP] (Fig.  1A). Transfection of HeLa cells with pPDF45ΔB produced the same pattern of fluorescence as that obtained with pPDF18: yellow fluorescence in the Golgi and cyan fluorescence in the ER (Fig. 1B). When the GT type II signal-anchor in pPDF45ΔB was replaced by a type I signal sequence from the ER luminal protein calreticulin, both fluorescence signals were located in the ER (data not shown).
We suspected that the difference between the observations made in yeast and mammalian cells could lie either in (i) some peculiarity of the signal sequences used in these analyses or (ii) the nature of the sequences upstream of 2A. To address the first possibility we replaced the GT Golgi targeting sequence with either of the two signal sequences used in the yeast experiments (αF and Dap2p -D N αF), obtaining plasmids pPDF90 (encoding [D N αF-EYFP-2A-ECFP]) and pPDF91 (encoding [αF-EYFP-2A-ECFP]). The yeast αF signal sequence has been shown to function within mammalian cells [30,31]. Transfection of HeLa cells with these plasmids resulted in a similar pattern of fluorescence: both yellow and cyan signals in the ER. Again, in both cases ECFP appeared to 'slipstream' into the ER (Fig. 1B). This effect was not, therefore, due to the nature of the signal sequences.
To address the second possibility we analysed a panel of deletions within EYFP. In this, and in all subsequent cases below, the numerals given in plasmid designations indicate the residues present in the EYFP deletion forms. The 2A sequence used in these analyses was derived from plasmid pSTA1/34 [6]. To further alter the sequence context immediately upstream of 2A, a range of deletions were made in the C-terminal region of EYFP. Plasmids pGL1 and pPDF107 both encode the same small fragment of the N-terminus of EYFP between the GT signal and the 2A. When either construct was transfected into HeLa cells, the ECFP (or, in the case of pPDF107, [ECFP-N-PAC]) apparently 'slipstreamed' into the ER (as with pPDF45ΔB), while the [GT-ΔEYFP(1-30)-2A] localised to the Golgi (Figs. 2A, B and 3A, B). In both cases a substantial amount of the full-length translation product was observed: the 2A reaction was inhibited (see below). Interestingly, when this N-terminal region of EYFP was also deleted (pPDF87, Fig. 2A; pGL3, Fig. 3A), the 2A reaction was no longer inhibited (see below) and the vast majority of the protein downstream of 2A was not translocated into the ER (Figs. 2B, 3B). Plasmid pAN1.9 encodes EYFP with a much smaller deletion at the C-terminus ([GT-ΔEYFP(1-140)-2A-ECFP]; Fig. 2A). Here, the result obtained was similar to that using pPDF87: while the [GT-
If the context immediately upstream of 2A is crucial in producing this apparent slipstream translocation, then small deletions in the C-terminal region of EYFP might be sufficient to reduce or eliminate this effect. A series of constructs were made with progressive deletions within EYFP (plasmids pPDF118, pPDF117, pPDF116 and pPDF115; Fig. 4A). In all cases, the protein upstream of 2A was translocated into the ER and the downstream [ECFP-N-PAC] localised to the nucleus (with a weak signal in the ER).
The data from all of these plasmids contrast with the ER localisation of [ECFP-N-PAC] in cases when the C-terminal region of EYFP was present immediately upstream of 2A (plasmid pPDF67; Figs. 3 and 4B). Deletion of just the C-terminal 20aa of EYFP (pPDF118) relieved the inhibition of the 2A reaction.

Effect of the upstream context on the 2A reaction in vitro and in vivo
We routinely monitor the cleavage activity of 2A-containing artificial polyproteins using coupled transcription/translation rabbit reticulocyte cell-
Biotechnol. J. 2010, 5, 213-223 www.biotechnology-journal.com
When we investigated the cleavage in vivo of the polyproteins we had created that gave rise to the apparent slipstream sub-cellular localisation (those bearing N-terminal signal sequences), we found a very different picture. For example, in the case of pPDF45ΔB and pPDF67, translation in vitro produced only a small amount of the full-length translation product (Figs. 2-4, panel D), whilst expression in vivo showed this form to be the major product, with the 'cleaved' forms being weak bands, sometimes visible only with prolonged exposure (Figs. 2-4, panel C). The in vivo analyses of cleavage showed a major difference in efficiency between those constructs which gave rise to the apparent 'slipstream' effect and those which did not. The slipstream effect showed complete correlation with low efficiency of the 2A reaction, evidenced by the levels of the full-length translation products (Figs. 1-4, panel D). Taken together these results strongly suggest that in the case of proteins targeted to the exocytic pathway the immediate upstream context of 2A may strongly influence the efficiency of the reaction.
In the case of pGL1 and pPDF107 the same number of residues lie between the end of the transmembrane domain of the GT signal sequence and the N-terminal residue of 2A (Figs. 2A and 3A, respectively). We designed pGL11 ([GT-ΔEYFP(150-180)-2A-ECFP-N-PAC]) to encode the same number of residues between the GT signal and 2A as in pGL1/pPDF107, but in pGL11 the upstream context for 2A is identical to that in pPDF116, a construct that showed a highly efficient 2A reaction (Fig. 4C).

Discussion
2A and 2A-like sequences are now widely used as a tool for co-expression in biomedicine and biotechnology due to (i) their shortness (20-30aa), (ii) highly efficient cleavage, (iii) the uniform stoichiometry of the cleavage products and (iv) their activity in all eukaryotic cell-types tested to date. Many laboratories have successfully used 2As to co-express proteins targeted to the exocytic pathway. For example, the p35 and p40 chains of interleukin-12 (IL-12) were cleaved highly efficiently. Both chains bore their native signal sequences and active IL-12 was secreted into the media [32]. Similarly, the heavy and light chains of antibodies cleaved highly efficiently, were assembled and secreted [33]. The α- and β-subunits of the T-cell receptor (in both orientations) cleaved highly efficiently, assembled and were localised to the plasma membrane [13]. Human iduronidase alpha-L (IDUA) was targeted to the secretory pathway whilst a marker protein downstream of 2A (Discosoma sp. red fluorescent protein, DsRed2) localised to the cytosol [34].
In our previous study with fluorescent polyproteins where the first protein was targeted to the secretory pathway and the protein downstream of 2A was cytosolic, we attributed the unexpected ER sub-cellular distribution of the downstream protein to slipstream translocation [28]. Here we show a complete correlation between those polyproteins displaying the apparent slipstream effect and those with low levels of the 2A reaction in vivo, as monitored by Western blotting. A large proportion of the translation products are uncleaved, leading to translocation of the fusion protein into the exocytic pathway.
Analyses of protein targeting using a control construct (pPDF93), encoding [GT-EYFP-ECFP], showed that (i) the fusion protein partitions between the Golgi and ER (mainly ER) and (ii) both proteins fluoresce in both compartments, producing complete co-localisation upon merging the images [28]. However, when 2A is present between the fluorescent proteins (pPDF45ΔB), the fusion protein partitions between the Golgi and ER, the [GT-EYFP-2A] product formed by the 2A reaction (strongly fluorescent) localises to the Golgi, and we assume the [ECFP] product (much more weakly fluorescent than EYFP) is diffused throughout the cytoplasm and nucleus and is not detectable.
In the case of constructs with N-terminal signal sequences, the source of the inhibitory effect of the EYFP C-terminal sequences (immediately upstream of 2A) must lie in the interaction with the translocon complex, since they cleave highly efficiently using in vitro translation systems. We have proposed that nascent 2A forms an α-helix with a tight-turn at its C-terminus. The helical portion is proposed to interact with the ribosome exit tunnel such that the C-terminal portion is sterically constrained within the peptidyl-transferase centre of the ribosome [4,5]. The ester linkage between 2A and transfer RNA (tRNA(Gly)) is precluded from nucleophilic attack by the prolyl-tRNA in the A site, effectively 'jamming' translation. We have recently shown that this block is relieved by the action of translation release factors 1 and 3 [9].
Residues that influence the 2A-mediated cleavage of cytosolic proteins map to within 30aa of the cleavage site. This is consistent with many proteinase protection studies which have shown the length of nascent peptide within the ribosome exit tunnel to be 30-40aa (Fig. 5; reviewed in [35]). The eukaryotic ribosome exit tunnel is ∼100 Å long and has an average diameter of ∼20 Å [36].
The C-terminal 20aa of EYFP upstream of 2A that inhibited the 2A reaction are 28-48aa distal from the peptidyl-transferase centre, since we know that the C-terminus of 2A (23aa long + 4aa linker) is in the P site when cleavage occurs (Fig. 5). Disulphide bond formation, photochemical crosslinking, proteinase protection, glycosylation acceptor site and protein folding studies performed upon nascent proteins transiting into the lumen of the ER show that 65-70aa are present within the ribosome:translocon complex [37-42] (Fig. 5). The consensus arising from structural studies on Sec61 or Sec61/ribosome complexes is that the protein conducting channel is formed at the interface of an oligomeric assembly (probably a tetramer) of the Sec61α,γ,β complex. The channel is co-axial with the ribosome exit tunnel and is thought to be some 48-60 Å long [43-48].
Theoretical work suggests that nascent proteins adopt an α-helical conformation [49] stabilised by the exit tunnel [50]: indeed, dynamic simulations suggested 2A adopted such a conformation [4]. Taken together, these data suggest that at the stage in elongation when cleavage occurs (the C-terminal glycyl-tRNA(Gly) of 2A in the P site of the ribosome peptidyl-transferase centre), the C-terminal 20aa of the protein upstream (EYFP) most probably lie within a region defined by the interface between the ribosome and Sec61 complex and the translocon tunnel itself (Fig. 5). Interactions between the nascent protein and the translocon tunnel may affect the conformation of 2A within the ribosome exit tunnel and, in consequence, the C-terminal tight-turn of 2A in the peptidyl-transferase centre itself, thereby inhibiting the 2A reaction.
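The positional argument above can be checked with a short worked calculation. This is only a sketch under stated assumptions: residues are counted from the peptidyl-transferase centre with the C-terminal residue of 2A as position 1, the quoted "28-48aa" is read as a half-open interval, and the tunnel and complex capacities are treated as sharp cut-offs. The function names are hypothetical.

```python
# Lengths quoted in the text (amino acids).
LEN_2A = 23                   # 2A oligopeptide
LEN_LINKER = 4                # linker between EYFP and 2A
LEN_REGION = 20               # inhibitory C-terminal region of EYFP
TUNNEL_CAPACITY = (30, 40)    # residues held within the ribosome exit tunnel
COMPLEX_CAPACITY = (65, 70)   # residues held within ribosome:translocon complex


def region_positions(len_2a=LEN_2A, len_linker=LEN_LINKER,
                     len_region=LEN_REGION):
    """Positions of the upstream region, counted from the P site.

    Assumption: the C-terminal residue of 2A (in the P site) is position 1,
    so the region starts one residue beyond 2A plus the linker.
    """
    start = len_2a + len_linker + 1
    return range(start, start + len_region)


def residues_beyond(positions, capacity):
    """Count residues lying further from the P site than `capacity`."""
    return sum(1 for p in positions if p > capacity)


positions = region_positions()
print(f"region spans positions {positions.start} to {positions.stop - 1}")
for capacity in TUNNEL_CAPACITY:
    n = residues_beyond(positions, capacity)
    print(f"tunnel holds {capacity}aa: {n} of {LEN_REGION} residues lie beyond it")
for capacity in COMPLEX_CAPACITY:
    n = residues_beyond(positions, capacity)
    print(f"complex holds {capacity}aa: {n} of {LEN_REGION} residues lie beyond it")
```

Under these assumptions, 17 of the 20 residues lie beyond a 30aa exit tunnel and 7 beyond a 40aa tunnel, while none extend beyond the 65aa covered by the ribosome:translocon complex. That places the region across the ribosome/Sec61 interface and within the translocon tunnel, in line with the interpretation given above.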
The problem we have identified in the co-expression of certain proteins targeted to the exocytic pathway has significance for the biotechnological utilities of 2A. The LOCATE sub-cellular localisation database indicates that of all human proteins currently analysed, ∼40% are either secreted from the cell, located within the lumen/membranes of cytoplasmic vesicular structures (excluding mitochondria), or are plasma membrane proteins (http://locate.imb.uq.edu.au/). Given that such a high proportion of cellular proteins are initially translocated into the ER, the ability to co-express multiple proteins targeted to such sites is essential.
In this study we identified two regions of EYFP which, when placed immediately upstream of 2A, inhibited the 2A reaction: (i) the C-terminal region (residues 220-240: pPDF118) and (ii) the N-terminal region (residues 1-29: pGL1/pPDF107). However, in constructs we designed to co-express influenza haemagglutinin (HA) or neuraminidase (NA) linked via 2A to fluorescent proteins ([HA-2A-CherryFP] or [NA-2A-CherryFP]), we observed the same effects we describe here: very high-level cleavage using translation systems in vitro, whilst expression in vivo showed the fluorescent proteins localised to the ER and not to the cytoplasm (again, the apparent slipstream effect).
Figure 5. Position of 2A and sequences immediately upstream in the ribosome:translocon complex. In the case of polyproteins bearing an N-terminal signal sequence, the 2A oligopeptide (white ovals) and the sequences immediately upstream (grey ovals) are predicted to be located within the ribosome exit tunnel and translocon pore, respectively. The lengths of the exit tunnel and translocon pore, plus the lengths of peptides (amino acids) which may be accommodated [44-49], are also shown.
This inhibition of the 2A reaction may be overcome in two ways. Firstly, by the use of longer versions of 2A which incorporate a 39aa tract from the C-terminus of protein 1D, immediately upstream of 2A in the FMDV polyprotein [6]. This extension does not interact with the translocon pore to affect the activity of 2A in the ribosome, effectively 'insulating' the 2A sequence from the upstream protein. A number of studies suggest that cleavage efficiency may be improved by using a flexible Gly-Ser-Gly or Ser-Gly-Ser-Gly linker sequence separating the upstream protein from the 2A sequence [51-53]. Recent work by Yang et al. (2008) combined a furin cleavage site and a V5 oligopeptide spacer sequence immediately upstream of 2A [54]. Furin is a proteinase enriched in the Golgi, with a canonical recognition site of -R-X-(R/K)-R↓-. For proteins targeted to, or transiting through, the Golgi, incorporation of this site results in proteolysis. This strategy may provide an excellent solution since it removes the extended 2A linker [55]. We have used a linker comprising the furin proteinase cleavage site, 39aa of 1D, plus 2A (derived from plasmid pSTA1/31; [6]) to overcome this problem. In our influenza HA/NA co-expression studies the use of this linker resulted in the correct localisation of the cherry fluorescent protein to the cytoplasm, and not the exocytic pathway (S. Vater, pers. comm.).
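When designing such linkers it can be useful to check a candidate sequence for furin recognition sites, including unintended ones elsewhere in the polyprotein. The following is a minimal sketch that takes the canonical site as R-X-(R/K)-R with cleavage after the final arginine. The function name and example sequence are hypothetical, and a simple motif match of this kind says nothing about whether a site is accessible or efficiently processed in the Golgi.

```python
import re

# Canonical furin recognition site taken as R-X-(R/K)-R, cleaved after the
# final arginine. A lookahead is used so overlapping candidates are reported.
FURIN_SITE = re.compile(r"(?=(R.[RK]R))")


def furin_cleavage_positions(sequence: str) -> list[tuple[str, int]]:
    """Return (motif, cut_index) pairs for each candidate furin site.

    `cut_index` is the 0-based index of the first residue downstream of the
    scissile bond, so sequence[:cut_index] is the upstream fragment.
    """
    return [(m.group(1), m.start() + 4) for m in FURIN_SITE.finditer(sequence)]


# Hypothetical linker fragment, for illustration only.
example = "AAARAKRSS"
for motif, cut in furin_cleavage_positions(example):
    print(motif, example[:cut], example[cut:])
```

In a linker of the kind described above, the site would sit between the upstream protein and the 1D/2A extension, so that proteolysis in the Golgi removes the extension from the upstream product.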
The second solution lies in the basic design (the "gene" order) of the polyprotein. Previously described 2A-based multigene vectors have revealed that the 2A region functions properly within different contexts, but that the cleavage efficiency varies as the flanking context changes. When the order of proteins in several artificial polyproteins was swapped, the stoichiometry was affected by the gene upstream of 2A [56,57]. If a problematic secreted/membrane protein is identified, then this could be incorporated as the C-terminal component of the polyprotein system. It should be noted in this context that in mammalian and yeast cells an N-terminal proline (such as that produced by 2A-mediated cleavage) confers a long half-life (>20 h) on proteins (http://www.expasy.ch/tools/protparam-doc.html).