When placed downstream of the start codon, multimers of the dinucleotide CA stimulated translation from lacZ, gusA and neo mRNAs in the presence or absence of an untranslated leader sequence. Enhanced expression in the absence of a leader and Shine–Dalgarno sequence indicated that stimulation by CA multimers was independent of translation signals contained within the untranslated leader. Multimers of CA stimulated a significantly higher level of lacZ expression than multimers of individual C or A nucleotides. Translation levels increased as the number of CA repeats increased; fewer multimers were required for enhanced expression from leadered mRNA than from mRNA that was deleted for its leader sequence. Addition of downstream CA multimers increased the ribosome binding strength of mRNA in vitro and the amount of full-length mRNA in vivo, suggesting that the enhanced expression resulted from translation of a more abundant functional message containing a stronger ribosome binding site. The presence of downstream CA-rich sequences, occurring naturally in several Escherichia coli genes, might contribute to translation of other mRNAs. Addition of CA multimers might represent a general mechanism for increasing expression from genes of interest.
The Escherichia coli genome contains more than 4200 protein-encoding genes (Blattner et al., 1997), with as many as 1500 different proteins produced under defined culture conditions (Phillips et al., 1987) from mRNAs that can be translated with a 1000-fold range of translational efficiency (Ray and Pearson, 1974; 1975). Regulated transcription and translation are required to ensure the optimal number and composition of macromolecules for growth. Of the many factors involved in the initiation of translation, the most variable component is the mRNA. The sequence and structure of mRNA specify its interaction with the translational machinery and determine the efficiency and frequency of translation. The identification and analysis of mRNA features that determine the levels of protein translated from complex mixtures of in vivo mRNA are fundamentally important to our understanding of gene regulation and expression.
A statistical analysis of translation initiation regions revealed a non-random distribution of nucleotides (nts) before and after the start codon (Stormo et al., 1982; Gren, 1984), suggesting that specific upstream and downstream nts play a role in ribosome binding and translation initiation. The ribosome binding site (RBS) is the mRNA region bound by an initiating ribosome and extends approximately ± 15 nts relative to the start codon (Steitz, 1969; Steitz and Jakes, 1975; Steitz and Steege, 1977). Translation signals of the RBS, contained within specific sequence and structural features of the mRNA, are major determinants of protein levels produced from mRNA. A strong RBS results in frequent binding of mRNA to ribosomes, thereby protecting the mRNA from degradation and providing for a more abundant full-length message and increased translation (Yarchuk et al., 1991; 1992).
The Shine–Dalgarno (SD) sequence is an important translation signal contained within the mRNA's upstream untranslated leader region (Shine and Dalgarno, 1974); contribution of the SD sequence to the strength of the RBS is estimated from its degree of complementarity to the 16S rRNA anti-Shine–Dalgarno (ASD) sequence (Hui and deBoer, 1987; Jacob et al., 1987). Although increased gene expression is often associated with increased SD/ASD complementarity, the inability to reliably predict expression levels merely from the extent of SD/ASD complementarity indicates that other features of the initiation region contribute to the strength of the RBS.
Some mRNAs initiate at the translational start site and lack untranslated leader sequences. Examples of leaderless mRNAs are found in Bacteria, Archaea, Eucarya and eukaryotic organelles (for a review, see Janssen, 1993; Wu and Janssen, 1996). Although the signals necessary for leaderless mRNA translation are not well understood, the absence of a leader suggests that sequences, or secondary structures, within the coding region contain the signals that allow leaderless mRNA to compete with leadered mRNA for ribosomes in vivo. In addition to naturally leaderless mRNA, the untranslated leader can be removed from conventionally leadered messages without loss of translatability (Wu and Janssen, 1996; Van Etten and Janssen, 1998). After removal of the leader, translation would occur independent of a conventional SD/ASD interaction, and coding region signals would contribute to translational efficiency. The coding region provides only a portion of the translation signals present in leadered mRNA, but contains the only mRNA signals for messages lacking a leader. Analysis of leaderless mRNA will identify coding region translation signals and mRNA : ribosome interactions that will contribute to our general understanding of translation initiation and ribosome function.
In a recent study (Altman, 1996), an increase in β-galactosidase expression was observed when CA multimers were added 3′ of the initiation codon of a lacZ gene that had been deleted for its untranslated leader. In this report, we characterize the effect of CA multimers on lacZ expression by examining the following: the number of CA multimers needed for enhanced expression; the effects on expression of CA repeat position relative to the start codon; the contribution of CA multimers to expression in the presence or absence of an untranslated leader sequence; the mechanism by which CA repeats increase expression; and the ability of CA multimers to stimulate expression of other E. coli genes. Our results suggest that the presence of downstream CA multimers increases the ribosome binding strength of lacZ mRNA and might serve as a general mechanism for stimulating expression of other genes in E. coli.
Construction of leadered and unleadered lacZ
To investigate the contribution of a CA-rich sequence to gene expression, CA multimers were added to a lacZ reporter gene with (pSD), or without (pATG), an untranslated leader (Table 1). The unleadered lacZ reporter plasmid, pATG, was constructed by deletion of the lac 38 nt untranslated leader from pSD such that transcription was predicted to initiate at the translational start site. Analysis of the lacZ transcriptional start site in pATG revealed that the mRNA initiated primarily at the A of the lacZ translational start site (Fig. 1).
Table 1. Sequences at the transcriptional and translational start sites of leadered and unleadered lacZ, neo and gusA reporter genes. a. The sequence indicated is the DNA sequence located between the transcriptional start site of a modified lac promoter and codon 5 of lacZ (pATG and pSD), codon 5 of neo (pNeoATG and pNeoSD), or codon 7 of gusA (pGusATG and pGusSD). The upper case ATG identifies the start codon; an additional G nt is present in some constructs immediately after the SalI site (italicized) to ensure in-frame fusions of the ATG with downstream coding sequences. A 38 nt modified lac leader, with Shine–Dalgarno sequence underlined, is indicated for pSD, pNeoSD and pGusSD constructs.
CA multimers stimulate expression from an unleadered lacZ
CA multimers were added immediately 3′ to the start codon of the unleadered lacZ gene in pATG (Table 1). β-Galactosidase assays were performed on cells containing unleadered lacZ constructs with (CA)n multimers (n = 2, 3, 5, 8, 9, 11). The amount of β-galactosidase activity increased as the number of CA repeats increased (Fig. 2). Relative to pATG, LacZ levels expressed from pATG(CA)2 and pATG(CA)3, containing two and three CA repeats respectively, showed only a slight stimulation. Increasing the CA repeat length to five, pATG(CA)5, or eight, pATG(CA)8, increased expression 16- and 54-fold respectively. Further increase of CA multimers to nine, pATG(CA)9, or 11, pATG(CA)11, stimulated expression 137- and 143-fold respectively.
The contribution to lacZ expression by the CA dinucleotide within the TCA triplet preceding the SalI site (Table 1) was assessed by deleting TCA from pATG(CA)9. LacZ activity from the resulting construct, containing the start codon followed by (CA)8 and the SalI site, was similar to pATG(CA)8, containing (CA)7 followed by a TCA triplet (data not shown). These results indicate that the CA dinucleotide within the TCA triplet contributes to the stimulation of lacZ expression and that the CA multimers need not be contiguous for their effect on expression (i.e. the repeated CA sequence can be disrupted by a U and still stimulate lacZ expression). Because of this result, the CA dinucleotide within the TCA triplet has been included in our system for enumerating the CA multimers present in all of our described constructs.
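The enumeration convention described above can be made concrete with a short sketch. The helper names are hypothetical, and the strings represent only the nucleotides between the start codon and the SalI site as described in the text, not the full construct sequences of Table 1.

```python
def downstream_insert(n: int) -> str:
    """Nucleotides between ATG and the SalI site for a construct named (CA)n.

    Assumption taken from the text: a construct named (CA)n carries
    n - 1 contiguous CA repeats followed by the TCA triplet.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return "CA" * (n - 1) + "TCA"


def count_ca(seq: str) -> int:
    """Count CA dinucleotides, whether or not they are contiguous."""
    return seq.count("CA")


# pATG(CA)8 as named in the text: (CA)7 followed by TCA.
named_ca8 = downstream_insert(8)

# pATG(CA)9 with the TCA triplet deleted: eight contiguous CA repeats.
ca9_minus_tca = "CA" * 8

print(named_ca8, count_ca(named_ca8))
print(ca9_minus_tca, count_ca(ca9_minus_tca))
```

Under this convention both inserts carry the same number of CA dinucleotides, which is why the deletion construct is compared with pATG(CA)8 rather than with pATG(CA)9.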
The increased LacZ activity might result from an increased amount of enzyme; alternatively, the addition of histidine and threonine to the amino terminus of LacZ, resulting from the addition of CA multimers to lacZ coding sequence, could produce an enzyme with higher activity. Cell-free extracts were analysed by SDS polyacrylamide gel electrophoresis (SDS–PAGE) to determine whether the increased β-galactosidase activity correlated with increased amounts of β-galactosidase protein. Comparison of extracts (Fig. 3) revealed an increased amount of the presumed LacZ protein present in cells containing pATG(CA)11 relative to cells containing pATG, indicating that the increased LacZ activity resulting from the addition of CA multimers to lacZ mRNA is accompanied by an increased amount of LacZ protein per cell.
Increased expression of lacZ is dependent on the position of the CA repeat relative to the start codon
To investigate the possible importance of upstream or downstream positioning for CA stimulation of expression, (CA)10 was added immediately 5′ of the lacZ start codon present in pATG. Relative to pATG, cells containing p(CA)10ATG (Table 1), with (CA)10 upstream to the start codon, showed only a twofold stimulation of β-galactosidase expression, whereas cells containing pATG(CA)11, with (CA)11 downstream, showed a 143-fold increase (Fig. 2).
Cell-free extracts were prepared to visualize the β-galactosidase protein by SDS–PAGE. Relative to extracts from cells containing pATG, extracts from cells with p(CA)10ATG did not produce a visible increase in the amount of LacZ protein, whereas extracts from cells containing pATG(CA)11 showed a greatly increased abundance in the presumed LacZ protein band (Fig. 3). Therefore, based on both LacZ activity and protein levels, a downstream position is important for CA stimulation of unleadered lacZ translation.
Fewer CA multimers are required for stimulated expression of lacZ in the presence of an untranslated leader sequence
CA multimers were added immediately 3′ of the start codon for the leadered lacZ gene in pSD (Table 1). Based on β-galactosidase assays, the lacZ expression from pSD was fivefold higher than pATG (Table 2). β-Galactosidase assays were performed on cells containing leadered lacZ constructs with three [pSD(CA)3] or five [pSD(CA)5] multimers of CA. As observed with the unleadered lacZ, β-galactosidase activity increases as the number of CA repeats increases (Table 2). Relative to pSD, pSD(CA)3 increased expression 18-fold and pSD(CA)5 increased expression 74-fold.
Table 2. LacZ activity expressed from leadered lacZ with downstream CA multimers. a. The lacZ start codon and SalI restriction site were separated by the indicated CA multimers. The CA dinucleotide contained within the TCA triplet separating the start codon and SalI site in these plasmids is included in enumerating the CA multimers present. b. LacZ activity was measured according to Miller (1992). Activity measured in cells containing pSD was 200 Miller units (100%).
Addition of (CA)8 or (CA)11 multimers downstream to the lacZ start codon in pSD resulted in low β-galactosidase levels that were accompanied by rearrangements of the pSD(CA)8 and pSD(CA)11 plasmids (data not shown). Presumably, the toxic effects of a highly expressed lacZ gene, expected from the presence of a SD-containing leader and downstream (CA)8 or (CA)11 multimers, selected for variants in which plasmid rearrangements and deletions reduced lacZ expression. Although the presence of eight or 11 CA repeats might contribute somehow to DNA instability, the observed stability of pATG(CA)9 and pATG(CA)11 (and other genes described below) suggests that the physical presence of (CA)8 or (CA)11 multimers does not cause plasmid instability.
Cell-free extracts were made from cells containing pSD(CA)5 to visualize LacZ protein by SDS–PAGE. Comparison of extracts revealed an increased amount of LacZ in cells containing pSD(CA)5 relative to cells containing pSD or pATG(CA)11 (Fig. 3).
Multimers of CA stimulate higher lacZ expression than multimers of individual C or A nucleotides
The contribution of individual C and A nts to the CA stimulation of leadered lacZ expression was investigated by gene constructions in which the start codon was followed immediately downstream by the nucleotides C5, A5, C5A5, A5C5 or (CA)5 (Table 3). Although addition of five cytosines (C5) resulted in reduced expression, all constructs containing adenines showed increased LacZ activity, indicating that adenine was most responsible for the enhanced expression observed in the presence of CA multimers. The highest expression occurred in cells containing (CA)5 downstream to the start codon. The relative increase in lacZ expression ranged from 2.4-fold (A5C5) to 30.6-fold [(CA)5], suggesting that the enhanced expression was influenced significantly by the position, placement, and/or nucleotide context of adenines relative to the start codon.
Table 3. LacZ activity expressed from leadered lacZ with downstream C, A, or CA multimers. a. The lacZ start codon and SalI restriction site were separated by the indicated nts. The TCA triplet present in other described plasmids was not present in this plasmid series. When needed, one or two additional nts were present downstream of the SalI site to provide in-frame fusions with lacZ. b. LacZ activity was measured according to Miller (1992). Activity measured in cells containing the pSD-0 control plasmid was 666 Miller units (100%).
Downstream CA multimers increase the abundance of full-length lacZ mRNA
RNAs extracted from E. coli RFS859 with or without various lacZ constructs were size-fractionated in an agarose gel and analysed by Northern hybridization. RNA from cells containing pATG(CA)11 or pSD(CA)5 revealed a greater abundance of large molecular weight lacZ mRNA relative to the signals observed for RNA extracted from cells containing p(CA)10ATG, pATG or pSD (Fig. 4). The largest hybridizing signal, estimated at ≈ 4200 nts, is approximately the size of the lacZ mRNA expected from these vectors, and suggests that an increased amount of full-length lacZ mRNA is present in cells containing pATG(CA)11 or pSD(CA)5. The abundant, low-molecular-weight signals observed in RNA extracted from cells containing pATG, p(CA)10ATG or pSD are estimated at 400–700 nt and probably identify a stable 5′-terminal fragment of the lacZ mRNA.
The effect of CA repeats on lacZ mRNA levels was also assessed by Northern dot-blot hybridization (Fig. 5). Total RNA extracted from E. coli RFS859 with or without various lacZ constructs was probed for lacZ mRNA (Fig. 5A). After scanning for beta emission and autoradiography, the blot was stripped and reprobed for 16S rRNA as an internal control (Fig. 5B). After normalization to the rRNA levels, the lacZ mRNA levels were found to vary by less than twofold (data not shown). Although mRNA levels varied slightly, it is unlikely that the 143-fold difference in LacZ expression between cells containing pATG and pATG(CA)11 resulted from a less than twofold difference in lacZ mRNA levels. Results of the RNA analyses suggest that the CA-mediated stimulation results from an increased translational efficiency relating to the presence of a more abundant full-length message, rather than CA stimulation of transcription initiation at the lac promoter.
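The reasoning in this paragraph rests on simple arithmetic, sketched below. The function names and the example dot-blot signal values are hypothetical and serve only to illustrate the normalization; the 143-fold and twofold figures are the only numbers taken from the text.

```python
def normalized_signal(lacz_signal: float, rrna_signal: float) -> float:
    """lacZ hybridization signal expressed relative to the 16S rRNA control."""
    if rrna_signal <= 0:
        raise ValueError("rRNA control signal must be positive")
    return lacz_signal / rrna_signal


def min_expression_per_rna(protein_fold: float, max_rna_fold: float) -> float:
    """Lower bound on the fold increase in expression per unit of lacZ RNA.

    If protein rises protein_fold while total lacZ-hybridizing RNA rises by
    at most max_rna_fold, the remainder must come from steps after
    transcription initiation.
    """
    return protein_fold / max_rna_fold


# Hypothetical dot-blot readings in arbitrary units, for illustration only.
reference = normalized_signal(lacz_signal=1000.0, rrna_signal=500.0)
with_ca = normalized_signal(lacz_signal=1800.0, rrna_signal=600.0)
rna_fold = with_ca / reference

# Figures from the text: 143-fold LacZ increase, less than twofold RNA change.
print(rna_fold, min_expression_per_rna(143, 2))
```

With the reported figures, at least roughly 70-fold of the stimulation cannot be attributed to a change in the amount of lacZ-hybridizing RNA.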
Ribosomes bind CA-rich mRNA more efficiently
Primer extension inhibition analyses (toeprinting; Hartz et al., 1988) were carried out to investigate the ribosome binding efficiency of CA-containing mRNA. Using a ribosome/mRNA ratio of 4:1, a ternary (i.e. mRNA, initiator tRNA, and 30S subunit) complex-dependent toeprint signal was observed at position +16 relative to the first position (+ 1) of the start codon on a SD-leadered lacZ mRNA fragment containing (CA)11 (Fig. 6A, lanes 4–6; filled arrow), similar to the position (+ 15) of toeprint signals observed by ternary complexes with other mRNAs (Hartz et al., 1988). Under identical conditions, a SD-leadered lacZ mRNA fragment containing (CA)2 (Fig. 6A, lanes 1–3; open arrow) revealed a weak toeprint signal that was evident only with extended exposure times (data not shown). Equal mixture of lacZ mRNA fragments containing either (CA)2 or (CA)11 revealed a strong toeprint signal only for mRNA containing (CA)11 (Fig. 6A, lanes 7–9). Using similar reaction conditions, we have been able to demonstrate a toeprint signal at the expected position with pATG(CA)9 mRNA, but not with pATG mRNA (data not shown).
With a mixture of mRNAs and an increasing concentration of ribosomes, mRNA containing (CA)11 revealed a toeprint signal at a ribosome/mRNA ratio of 1:1 (Fig. 6B, lane 2; filled arrow), whereas mRNA containing (CA)2 required a more than fourfold ribosome excess (Fig. 6B, lanes 3 and 4; open arrow). The ribosome excess needed to generate toeprint signals from mRNA containing (CA)2 indicates that mRNAs containing (CA)11 are better able to stably bind ribosomes at lower ribosome concentrations. In the presence of a ribosome/mRNA ratio of 10:1 and a mixture of mRNAs containing either (CA)2 or (CA)11, toeprint signals varied directly with concentration of the corresponding mRNA (Fig. 6C, lanes 2–6); however, a reaction containing equimolar amounts of the two mRNAs (lane 4) revealed a significantly stronger toeprint signal for (CA)11 than for (CA)2 mRNA. The relative strength of toeprint signals obtained with a mixture of mRNAs containing (CA)2 or (CA)11 indicates that mRNAs containing (CA)11 are better able to compete for ribosomes.
The toeprinting results indicate that downstream CA multimers provide a stronger ribosome binding interaction and suggest that increased expression from CA-containing mRNA reflects an increased level of translation. The increased ribosome binding efficiency observed for CA-rich mRNA in vitro suggests that the increased abundance of full-length CA-containing lacZ mRNA observed in vivo (Fig. 4) results also from increased ribosome binding strength and ribosome protection of mRNA from degradation.
Increased gene expression in the presence of (CA)n sequences is independent of coding sequence
i. The Tn5 neomycin phosphotransferase gene (neo)
CA multimers were added to a leadered and unleadered Tn5 neo gene (Table 1) to investigate the stimulatory effect of CA repeats on expression of other coding sequences in E. coli. The neo gene, encoding neomycin phosphotransferase (NPTII), confers resistance to the aminocyclitol antibiotics neomycin and kanamycin (Beck et al., 1982). Cells containing neo plasmids were inoculated onto L agar plates with, or without, an increasing gradient of kanamycin (0→0, 0→25, 0→1000 μg kanamycin ml−1) (Fig. 7). E. coli DH5α and DH5α containing pΔNeo, a plasmid resulting from deletion of the promoter and 70% of the neo coding sequence from pNeoSD(CA)11, were used as negative controls. Cells containing pNeo(CA)10ATG, with (CA)10 upstream to the start codon, were slightly more resistant than pNeoATG; however kanamycin resistance of cells containing pNeoATG(CA)11, with (CA)11 located 3′ of the start codon, increased to over 1000 μg ml−1 kanamycin. Cells containing pNeoSD, with an untranslated leader upstream to the start codon, showed increased resistance levels relative to cells containing pNeoATG; however the resistance level conferred by pNeoSD(CA)11, with an upstream untranslated leader and (CA)11 following the start codon, is greatly increased. With CA multimers downstream to the start codon, in the presence or absence of an untranslated leader, there was a dramatic increase in kanamycin resistance.
Cell-free extracts were made from cells containing pNeo constructs in an effort to visualize the NPTII protein by SDS–PAGE. When extracts of cells containing pNeoATG(CA)11 and pNeoSD(CA)11 are compared with extracts from cells containing pNeoATG and pNeoSD, there is an increased abundance of a protein migrating at the approximate position expected for NPTII (29.6 kDa) (Fig. 8A). Along with increased kanamycin resistance, there is increased abundance of the probable NPTII protein in extracts from cells containing the neo gene with downstream CA multimers.
Variable amounts of protein were separated by SDS–PAGE and transferred to a nylon membrane for probing by Western blotting with an anti-NPTII antibody (Fig. 8B). After quantification of band intensity and normalization to the amount of protein electrophoresed, extracts from cells containing pNeo(CA)10ATG contained a 1.2-fold higher NPTII level than extracts of cells with pNeoATG, and extracts from cells containing pNeoATG(CA)11 showed an approximately 2000-fold increase in NPTII levels compared with extracts from cells with pNeoATG. Extracts from cells containing pNeoSD(CA)11 showed a 60-fold increase in NPTII over cells containing pNeoSD. As observed with lacZ, the presence of CA multimers downstream to the start codon, in the presence or absence of an untranslated leader, dramatically increased the amount of NPTII protein, as determined by the stained gel (Fig. 8A) and the NPTII antibody-probed Western blot (Fig. 8B).
ii. The β-glucuronidase gene (gusA)
CA multimers were added to a leadered and unleadered gusA gene (Table 1) to investigate further the general stimulatory effect of CA multimers on gene expression in E. coli. Cell-free extracts were made from cells containing various gusA plasmids to visualize the β-glucuronidase protein by SDS–PAGE. When extracts of pGusATG(CA)11 and pGusSD(CA)11, containing (CA)11 downstream of the start codon in the absence or presence, respectively, of an untranslated leader sequence, are compared with extracts from cells containing corresponding pGus plasmids without CA repeats (pGusATG and pGusSD respectively), there is an increased abundance of a protein band at the approximate position expected for GusA protein (68.2 kDa; Fig. 9A). Extracts from cells containing pGus(CA)10ATG, with (CA)10 upstream to the start codon, revealed no visible increase in the suspected GusA protein band relative to pGusATG.
Proteins separated by SDS–PAGE were transferred to a nylon membrane for probing with an anti-GusA antibody. The resulting Western blot (Fig. 9B) identified a protein in extracts from cells containing pGusSD(CA)11 that corresponded to the overexpressed protein suspected to be GusA (Fig. 9A); also, the protein identified by the anti-GusA antibody was missing from extracts of the E. coli gusA deletion strain PK0803. The overexpression of GusA protein, as evidenced by the stained gel and the antibody-probed Western blot, indicates that the presence of CA multimers downstream to the start codon stimulates gusA expression. As seen with lacZ and neo, downstream CA multimers stimulate expression. Attempts to quantify further the CA stimulation of gusA expression by β-glucuronidase assays were unsuccessful because the β-glucuronidase enzyme was inactivated by alterations introduced to the first six codons for introduction of CA multimers (J. Martin-Farmer, N. Bernal and G. Janssen, unpublished).
Downstream CA multimers enhance translation in the presence or absence of an untranslated leader sequence
Insertion of CA multimers downstream of the start codon enhanced translation from leadered and unleadered mRNA containing the lacZ, neo or gusA coding sequences. Addition of CA multimers immediately upstream to the start codon resulted in a negligible increase in expression, thereby emphasizing the downstream position requirement for stimulation with unleadered mRNA. Increased expression from mRNAs lacking an untranslated leader indicates that CA stimulation occurs independent of the SD sequence or other signals contained within the leader. The SD-containing leader increased expression approximately fivefold relative to unleadered lacZ, whereas addition of (CA)9 increased expression 137-fold, indicating that downstream sequences are capable of stimulating significantly higher expression than the SD-containing lac leader used here.
In an effort to determine whether the translation signals within the SD-containing leader might dominate the effect of downstream CAs, lacZ genes were constructed that contained an untranslated leader and a variable number of downstream CA multimers. The combination of a SD-containing leader with lacZ containing (CA)3 downstream resulted in a 45-fold increase in expression relative to unleadered lacZ containing (CA)3 and an 18-fold increase relative to leadered lacZ with a single downstream CA dinucleotide. These results indicate that the combination of upstream leader signals and downstream CA multimers provided for a higher level of expression than observed with either individual sequence or expected from a mere additive effect of the two sequences. The synergistic effect of providing the leader and CA sequences on the same mRNA reduced the number of CA multimers needed for enhanced lacZ expression.
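The claim of a more-than-additive effect can be checked with the fold values reported in the text. The sketch below is a back-of-the-envelope calculation, not part of the original analysis: it treats "approximately fivefold" as exactly 5, assumes the fold values are directly comparable across assays, and the function and key names are hypothetical. The stimulation by (CA)3 alone is not reported as a number in the text; it is implied by the other three ratios.

```python
def synergy_check(leader_fold: float = 5.0,
                  combined_vs_leadered: float = 18.0,
                  combined_vs_unleadered_ca3: float = 45.0) -> dict:
    """Compare observed leader + (CA)3 expression with simple expectations.

    leader_fold: pSD relative to pATG.
    combined_vs_leadered: pSD(CA)3 relative to pSD.
    combined_vs_unleadered_ca3: pSD(CA)3 relative to pATG(CA)3.
    """
    # Observed expression of pSD(CA)3 relative to pATG.
    observed = leader_fold * combined_vs_leadered
    # Implied stimulation by (CA)3 alone, i.e. pATG(CA)3 relative to pATG.
    ca3_alone = observed / combined_vs_unleadered_ca3
    # Expectation if the two increments simply add.
    additive = 1 + (leader_fold - 1) + (ca3_alone - 1)
    # Expectation if the two fold effects multiply independently.
    multiplicative = leader_fold * ca3_alone
    return {
        "observed": observed,
        "ca3_alone": ca3_alone,
        "additive": additive,
        "multiplicative": multiplicative,
    }


result = synergy_check()
print(result)
```

Under these assumptions the observed combined expression exceeds both the additive and the multiplicative expectation, and the implied stimulation by (CA)3 alone is consistent with the slight stimulation reported for pATG(CA)3.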
Although our lacZ gene contains the lac leader and SD sequence, we observe a relatively low level of expression compared with wild-type lacZ (data not shown). During construction of our leadered and unleadered lacZ genes (Van Etten and Janssen, 1998), codons two and three (i.e. ACC AUG) were deleted in order to eliminate the downstream secondary translational start site (underlined) (Munson et al., 1984). The deleted region contains a C- and A-rich sequence and, based on the results described here, might contribute importantly to lacZ expression. Removal of this sequence from our constructs might be responsible, in part, for our observed low-level lacZ expression; addition of downstream CA multimers might compensate for the loss of codons 2 and 3 and restore expression.
Possible mechanisms for CA enhancement of gene expression
CA repeats might stimulate expression at the level of transcription, translation, or the rate of mRNA turnover. Effects on transcription might occur if CA repeats affect initiation frequency, rate of promoter clearance, or the frequency of premature transcription termination. Translation effects might occur through CA influence on the strength of ribosome binding, the frequency of initiation, or the rate of ribosome clearance from the initiation site. Dot-blot analyses revealed that the abundance of a 5′ terminal lacZ mRNA fragment was relatively constant, suggesting that transcription initiation was not stimulated significantly by addition of CA multimers. Northern blotting revealed that lacZ mRNA with downstream CA multimers contained a larger proportion of molecules corresponding in size to full-length message. Toeprinting assays revealed that CA-rich lacZ mRNA was better able to compete for ribosomes than mRNA with fewer CA repeats. Together, these results suggest that downstream CA multimers stimulate expression at the level of translation by increasing the binding strength of the mRNA–ribosome interaction.
The strength of the ribosome binding site has been related to the abundance of full-length mRNA; a strong RBS results in frequent translation initiation and ribosome protection of lacZ mRNA from ribonuclease degradation (Yarchuk et al., 1991) and Rho-mediated transcription termination (Yarchuk et al., 1992). As evidenced by toeprinting assays and Northern analysis, the elevated production of LacZ from mRNA containing downstream CA multimers correlates with an increased ribosome binding strength and an increased abundance of full-length lacZ mRNA. Accordingly, the 137-fold difference in LacZ activity between cells containing pATG and pATG(CA)9 is likely to result from the presence of CA multimers providing a stronger ribosome binding site, thereby increasing the frequency of translation initiation, resulting in ribosome protection of mRNA from degradation and premature Rho-mediated termination and an increased abundance of full-length, functional lacZ mRNA.
How might the ribosome binding strength of mRNA increase by the addition of CA multimers? The structure of mRNA around the translational start site has been shown to exert strong effects on translation levels (de Smit and van Duin, 1990a; 1990b). The CA-rich sequence might provide for a specific lack of structure, adjacent to the start codon, that facilitates ribosome entry to the initiation site; the frequency of ribosome binding and translation initiation might increase as the number of CA repeats and the length of unfolded mRNA increases. As an alternative, the CA sequence might present a specific mRNA structure that facilitates ribosome entry and translation initiation. The CA sequence might also contribute to expression by speeding the transition from initiation to elongation, thereby allowing for more initiation events. The CA sequence could also act to disrupt, or displace downstream, mRNA structural features or negatively acting elements that inhibit translation initiation. However, this explanation would require that the leadered and unleadered forms of all three genes examined here (i.e. lacZ, neo and gusA) contain inhibitory structures downstream to their start codons. Also, insertion of C5A5 or A5C5 between the start codon and the putative inhibitory structure did not restore lacZ expression to the level observed with (CA)5, suggesting that the specific nt sequence rather than a mere spacing effect might be important for maximum expression.
CA multimers might increase expression through a sequence-specific inhibition of mRNA turnover. However, the presence of (CA)10 upstream to the lacZ start codon might also be expected to stabilize mRNA and increase expression, but this was not observed. Also, demonstration that CA-rich mRNA is more competitive in ribosome binding assays suggests that the reduced mRNA turnover observed in vivo results from a stronger ribosome binding interaction rather than a sequence-specific inhibition of mRNA degradation.
It is possible that a CA-rich sequence might bind a translation factor that recognizes A-rich regions downstream of the start codon. For example, the ribosomal protein S1 has been shown to bind selectively to RNA pseudoknots containing ACA in the single-stranded loop region (Ringquist et al., 1995). However, the location of proposed S1 binding sites on mRNA (Boni et al., 1990; Zhang and Deutscher, 1991; Tzareva et al., 1994) does not show the required downstream positioning that we observed for CA multimers.
Alternatively, the CA sequence might identify an alternate pathway for interaction of mRNA with a specific sequence or region of ribosomal RNA. A CA-mediated pathway could function independently of the SD/ASD pathway but also contribute to mRNA/ribosome interactions that include SD/ASD pairing. A report of crosslinking poly(A) RNA to 16S rRNA near the ribosomal P-site (Stiege et al., 1988) might suggest a site of potential interaction for the A-rich CA multimers.
Analysis of various combinations of C and A nts suggests that adenine contributes most to the CA enhancement of translation; in addition to the number present, expression appears to be influenced also by the position and/or density of adenines in the downstream region. Downstream multimers of UA, but not GA, also stimulate lacZ expression (data not presented), supporting further the importance of adenine and suggesting that an mRNA downstream sequence rich in ‘pyrimidine-A’ might contribute to strength of ribosome binding. Chen et al. (1994) have also shown that A-rich regions downstream of the start codon can increase expression. The possible importance of adenines to translation is supported further by the observation that A is the most frequently encountered nucleotide within a region from +4 to +18 immediately downstream of the start codon (Rudd and Schneider, 1992).
Implications of CA-rich regions for expression of other genes
Examination of E. coli translational start sites (Rudd and Schneider, 1992) reveals that several genes contain downstream A- and CA-rich regions. The translational efficiency of mRNAs containing downstream regions naturally rich in CA (or UA) might be influenced by the presence and position of the A-rich sequences. Addition of CA or UA sequences might be a relatively easy and general method for increasing gene expression. Although maximum expression was observed with multiple CA repeats, significant increases in expression were observed with leadered lacZ after addition of only two or three CA repeats. Addition of a short stimulatory CA/UA sequence might be accomplished with only minor alteration of the encoded amino acids. Alternatively, the genetic code degeneracy might be utilized to increase downstream CA or UA richness without changing the amino acid sequence. However, a more thorough characterization of CA-stimulated gene expression is needed before reliable predictions could be made for using CA/UA-rich regions in combination with other coding or leader regions.
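The idea of using code degeneracy to raise downstream A or CA content can be illustrated with a minimal sketch. For each residue it picks the synonymous codon with the most adenines, breaking ties by the number of CA dinucleotides. The scoring rule, the function names and the example peptides are assumptions made for illustration only; they are not a procedure described in this study, and the sketch ignores codon usage bias and mRNA secondary structure.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(aa):
    """All codons that encode the one-letter amino acid aa."""
    return [codon for codon, a in CODON_TABLE.items() if a == aa]


def a_rich_recode(peptide):
    """Recode a peptide, choosing per residue the codon with the most A.

    Ties are broken by the number of 'CA' dinucleotides within the codon,
    then alphabetically (hypothetical scoring rule, for illustration).
    """
    chosen = []
    for aa in peptide:
        candidates = sorted(synonymous_codons(aa))
        chosen.append(max(candidates, key=lambda c: (c.count("A"), c.count("CA"))))
    return "".join(chosen)


def translate(dna):
    """Translate an in-frame DNA string using the standard code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    for peptide in ("TH", "MSVD"):
        dna = a_rich_recode(peptide)
        print(peptide, dna, translate(dna))
```

Under this rule a Thr-His dipeptide is encoded as ACA CAC, which reads as an uninterrupted run of CA repeats while leaving the amino acid sequence unchanged.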
Translation signals in mRNA
A substantial body of work has demonstrated that the extent of SD/ASD complementarity is an important determinant of translational efficiency in bacteria. We have demonstrated that the addition of downstream CA multimers increases expression from leadered and unleadered mRNA, and that downstream CA multimers could stimulate a higher level of expression than our SD-containing lac leader. These results provide additional evidence that sequences affecting translational efficiency can occur upstream and/or downstream of the start codon and suggest that the combined strength of upstream and downstream signals determines the frequency of translation for a particular mRNA. The absence of coding constraints would suggest that the untranslated leader would most readily accommodate sequence-specific translation signals; however, the physical or genetic organization of specific genes might require translation signals to be located within the coding region. A more complete identification and analysis of upstream and downstream translation signals is needed to better understand the interactions of mRNA with ribosomes and the mechanism(s) of translation initiation.
Escherichia coli DH5α [F− φ80dlacZΔM15 Δ(lacZYA–argF)U169 deoR recA1 endA1 hsdR17(rk−, mk+) phoA supE44 λ− thi-1 gyrA96 relA1] (Bethesda Research Laboratory) was used as the host strain for all plasmid manipulations. E. coli RFS859 [F−, thr-1, araC859, leuB6, Δlac 74, tsx-274, lambda−, gyrA111, recA11, relA1, thi-1] (Schleif, 1972), a lac-deletion strain, was used as the host for expression and assay of lacZ. E. coli PK0803 [thyA, deoB, deoC, pro, his, lac, str-r, tsx-r, lambda+, Δ298 (manA–uidA), pyrF287, ΔtrpE5], a gusA-deletion strain, was kindly provided by Peter Kuempel (University of Colorado, Boulder, CO, USA) and used as a gus− control.
Reagents and recombinant DNA procedures
Radiolabelled nucleotides [γ-32P]-ATP (6000 Ci mmol−1, 150 mCi ml−1) and [α-32P]-dATP (3000 Ci mmol−1, 10 mCi ml−1) were purchased from New England Nuclear. Antibodies to NPTII and GusA were obtained from 5 Prime, 3 Prime. Oligonucleotides were synthesized using a Beckman 1000 M DNA Synthesizer. The lacZ-specific oligonucleotide 5′-GGGGGATGTGCTGCAAGGCG-3′ was used in DNA sequencing, primer extension, Northern and RNA dot-blot hybridizations, and anneals to positions +97 to +78 of the pATG lacZ coding sequence. The 16S rRNA-specific oligonucleotide 5′-GGTTACCTTGTTACGACTTC-3′ was used in RNA dot-blot hybridizations and anneals to positions +1510 to +1491 of the E. coli 16S rRNA. Restriction endonucleases, T4 DNA ligase, T4 polynucleotide kinase and T4 DNA polymerase were obtained from New England Biolabs. Sequenase and Taq DNA polymerase were obtained from United States Biochemical and Pfu DNA polymerase from Stratagene. E. coli MR600 tRNA, AMV reverse transcriptase and RNase-free DNaseI were purchased from Boehringer-Mannheim. All enzymes were used according to the manufacturer's recommendations. Polyacrylamide gradient gels were purchased from Bio-Rad and Coomassie blue was used for visual detection of protein bands. Plasmid DNA was isolated using the alkaline lysis method (Sambrook et al., 1989). Competent cell preparation and transformation were conducted by the CaCl2 method (Sambrook et al., 1989). All other DNA manipulations were carried out in the standard manner (Sambrook et al., 1989).
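Because both probes are described as annealing from a higher to a lower coordinate (+97 to +78, +1510 to +1491), each is the reverse complement of the sense strand over that span. The short sketch below derives the sense-strand target of each oligonucleotide and checks that the stated coordinate span matches the oligonucleotide length. The helper names are hypothetical; only the two sequences and the coordinates come from the text.

```python
# Complement table for DNA bases (upper case only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def span_length(first, last):
    """Number of positions covered by an inclusive coordinate span."""
    return abs(first - last) + 1


# Oligonucleotides and annealing coordinates as given in the text.
LACZ_PRIMER = "GGGGGATGTGCTGCAAGGCG"   # anneals to +97 to +78 of lacZ
RRNA_PRIMER = "GGTTACCTTGTTACGACTTC"   # anneals to +1510 to +1491 of 16S rRNA

if __name__ == "__main__":
    for name, primer, first, last in (
        ("lacZ", LACZ_PRIMER, 97, 78),
        ("16S rRNA", RRNA_PRIMER, 1510, 1491),
    ):
        target = reverse_complement(primer)
        print(name, "sense-strand target 5'->3':", target,
              "| span", span_length(first, last), "nt, primer", len(primer), "nt")
```

Both spans cover 20 positions, in agreement with the 20-nt length of each oligonucleotide.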
Construction of leadered and unleadered lacZ with CA multimers
The lacZ-containing plasmids were constructed from the pBR322 derivatives pSD–AUG and pUL–AUG (Van Etten and Janssen, 1998) and contain strong transcriptional terminators flanking the leadered and unleadered lacZ genes. The lac promoter has been modified from wild type as described previously (Van Etten and Janssen, 1998). Codons 1–4 of lacZ were replaced by the sequence 5′-ATG TCA GTC GAC-3′, which eliminates the secondary translational start site (Munson et al., 1984) and introduces a SalI restriction site (GTC GAC); the SalI site was followed by 0–2 additional ‘stuffer’ nts to facilitate in-frame fusions of the start codon and inserted CA multimers with lacZ codon 5.
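The need for 0–2 stuffer nts follows from simple frame arithmetic: a (CA)n insert is 2n nt long, which is a whole number of codons only when n is a multiple of three. The sketch below computes how many extra nucleotides would restore the reading frame for a given insert length. It assumes that the insert is the only segment between the start codon and lacZ codon 5 whose length is not a multiple of three; the actual arrangement of each construct is given in Table 1, and the function name is hypothetical.

```python
def stuffer_nts(insert_len):
    """Extra nucleotides (0, 1 or 2) needed to make insert_len a multiple of 3."""
    return (-insert_len) % 3


if __name__ == "__main__":
    for n_repeats in (2, 3, 10, 11):
        insert_len = 2 * n_repeats  # each CA repeat contributes 2 nt
        print(f"(CA){n_repeats}: {insert_len} nt insert, "
              f"{stuffer_nts(insert_len)} stuffer nt(s) to stay in frame")
```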
Plasmids containing the lacZ gene were used as templates for PCR-directed mutagenesis in which one oligonucleotide primer contained the desired number of CA repeats and the other primer annealed within the lacZ coding sequence. After amplification, the PCR product was trimmed with appropriate restriction enzymes to facilitate cloning, ultimately producing identical plasmids that varied from one another only by the sequences shown in Table 1. All DNA regions generated by PCR amplification were verified by dideoxy DNA sequencing. The sequenced regions were subcloned into full-length lacZ genes, then transformed into the lac deletion strain E. coli RFS859.
Construction of leadered and unleadered neo and gusA with CA repeats
The neomycin phosphotransferase (neo) coding sequence from Tn5 and the β-glucuronidase (gusA) coding sequence were supplied on plasmids constructed from the pUC-derivative pTZ18U (Kunkel et al., 1987) and contain strong transcriptional terminators flanking the neo and gusA genes. The pNeo plasmids were derived from pL320B (B. Van Etten, unpublished) and the pGus plasmids were derived from the pUL–AUGgus and pSD–AUGgus plasmids described by Van Etten and Janssen (1998). The neo and gusA genes have been modified to contain a SalI restriction site after codons 4 and 6 respectively. The neo and gusA genes were transcribed by the same lac promoter used to express the lacZ genes above.
EcoRI–SalI DNA fragments, containing the lac promoter and extending through the transcriptional and translational start sites, were subcloned from the lacZ constructs described above into plasmids containing the neo or gusA coding sequences. The AUG start codon within the EcoRI–SalI fragment (Table 1) was in-frame with the downstream neo or gusA coding sequence. This resulted in a series of neo- and gusA-containing plasmids that differed only by the sequences shown in Table 1; the SalI site used in subcloning is indicated in Table 1.
Plasmid-containing strains were grown to an OD600 of 0.3–0.4 at 37°C in 20 ml triplicate cultures of 2× YT (per litre, 16 g Difco Bacto tryptone, 10 g Difco Bacto yeast extract, 10 g NaCl, pH 7.4) supplemented with 200 μg ml−1 ampicillin and 0.2 mM IPTG and quick chilled on ice. Triplicate β-galactosidase assays (Miller, 1992) were performed on each of triplicate cultures for each strain.
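For readers unfamiliar with the assay, the sketch below shows the standard Miller-unit calculation and one plausible way of summarizing triplicate assays on triplicate cultures. The averaging scheme (mean of culture means, with the standard deviation taken across cultures) is an assumption, since the text does not state how replicates were combined, and the function names are hypothetical.

```python
from statistics import mean, stdev


def miller_units(a420, a550, minutes, volume_ml, od600):
    """Standard Miller formula for beta-galactosidase activity.

    a420: absorbance of o-nitrophenol; a550: light-scattering correction;
    minutes: reaction time; volume_ml: culture volume assayed;
    od600: cell density of the culture.
    """
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * od600)


def summarize(cultures):
    """Mean and SD across cultures; each culture is a list of replicate values.

    Assumed scheme: replicate assays are averaged within a culture first.
    """
    culture_means = [mean(replicates) for replicates in cultures]
    return mean(culture_means), stdev(culture_means)


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = [[100, 110, 120], [200, 210, 220], [300, 310, 320]]
    print(miller_units(0.5, 0.0, 10, 0.1, 0.5))
    print(summarize(example))
```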
Kanamycin gradient plates
Levels of Km resistance were assayed according to the method described previously (Ward et al., 1986) with the following changes. L agar was used as the growth medium and the gradients were allowed to form at room temperature for 1 h before inoculation. Three microlitre samples of E. coli cultures grown overnight in L broth were streaked evenly across the surface of the gradient plate. The plates were observed after incubation at 30°C for 20 h.
Cells from cultures grown overnight in L broth were pelleted, washed and resuspended in RS buffer (10 mM tris-HCl, pH 7.6, 10 mM MgCl2, 50 mM NH4Cl, 0.21 μl ml−1 β-mercaptoethanol). Cells were lysed by sonication and proteins were separated on 4–20% polyacrylamide gradient gels. Proteins were detected by Coomassie blue staining or transferred to S and S Nytran membranes (0.1 μm pore size) using a semidry blotter (Fisher Biotech). Antibody detection of proteins by Western blotting was carried out according to the ECL Western blotting protocol (Amersham). The resulting chemiluminescent bands were analysed with the image analysis program NIH Image 1.52.
Total RNA was isolated as described previously (Wu and Janssen, 1996). Northern dot-blots were performed as described by Schleicher and Schuell (1987) on S and S Nytran filter membrane (0.1 μm pore size) with the following modifications and additions. Total cellular RNA (0, 1, 2, 4, 6 μg) was supplemented with E. coli tRNA to a total mass of 6 μg RNA and loaded on the membrane. The lacZ-specific end-labelled oligonucleotide was used at 10⁶ cpm ml−1 hybridization mixture and was hybridized at 68°C. The 16S rRNA-specific end-labelled probe was diluted (3:160) with unlabelled 16S rRNA oligonucleotide and hybridized at 53°C. The amount of radiolabelled oligonucleotide hybridized to the membrane was quantified using an Ambis 2000 beta scanner.
Northern blotting for detection of size-fractionated mRNA was carried out with S and S Nytran membrane (0.1 μm pore size) as described previously (Kaiser et al., 1994), except the prehybridization and hybridization steps were as described above for the Northern dot-blots. A control lane was excised from the gel, equilibrated in 0.25 M NH4OAc for 30 min, and the 16S and 23S rRNA bands visualized with 2 μg ml−1 ethidium bromide and UV light. Size estimations of hybridization signals were extrapolated from a plot of rRNA size versus distance migrated.
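A size estimate of this kind can be sketched as a straight-line fit through the rRNA markers. The example assumes the common semi-logarithmic relation (log10 of size against distance migrated), which the text does not specify, and uses the E. coli 16S and 23S rRNA lengths of 1542 and 2904 nt. The migration distances are made-up values and the function names are hypothetical.

```python
import math


def fit_semilog(markers):
    """Least-squares line of log10(size) on distance.

    markers: list of (distance_mm, size_nt) pairs; returns (slope, intercept).
    """
    n = len(markers)
    xs = [distance for distance, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def estimate_size(distance_mm, slope, intercept):
    """Size in nt predicted for a band at distance_mm."""
    return 10 ** (slope * distance_mm + intercept)


if __name__ == "__main__":
    # Hypothetical migration distances for the 23S and 16S rRNA bands.
    markers = [(30.0, 2904), (45.0, 1542)]
    slope, intercept = fit_semilog(markers)
    for distance in (38.0, 50.0):
        print(f"{distance} mm -> ~{estimate_size(distance, slope, intercept):.0f} nt")
```

With only two markers the line passes through both points, so sizes outside the 16S to 23S range are extrapolations and correspondingly less reliable.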
Primer extension reactions containing 40 μg RNA and 2 pmol end-labelled lacZ-specific oligonucleotide were performed as described previously (Brown et al., 1988). The primer extension reaction was electrophoresed against appropriate dideoxy sequencing reactions and visualized by autoradiography.
Primer extension inhibition (toeprint) assay
Messenger RNAs for use in toeprint assays were generated in vitro using T7 RNA polymerase (W. Van Etten, manuscript submitted). In brief, templates for T7 transcription were prepared by PCR using the gel-purified EcoRI–ClaI restriction fragment, containing a lacZ region from pSD(CA)2 or pSD(CA)11 as template, a downstream primer 5′-TTCCCGCTAGCCACGCCCGG-3′ and an upstream primer 5′-GGAATTCTAATACGACTCACTATAGAATTGTGAGCGG-3′, resulting in DNA fragments encoding T7-promoted, SD-leadered lacZ fragments containing (CA)2 or (CA)11. PCR products were gel purified. Transcription reactions contained 15 mM DTT, 4 mM NTPs, 40 mM tris-HCl (pH 7.9), 20 mM MgCl2, 1 μg template DNA, 500 U T7 RNA polymerase and were carried out for 1 h at 37°C, followed by treatment with 10 units of RNase-free DNaseI for 30 min at 37°C. Transcription reactions were extracted with phenol/chloroform/isoamyl alcohol (25:24:1) and the RNA precipitated with sodium acetate (pH 6) and isopropanol. The final RNA concentration was determined by absorbance at 260 nm (assuming 1 OD260 = 40 μg ml−1 and 330 g mole−1 nucleotide−1).
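The conversion from absorbance to molar RNA concentration can be written out explicitly using the two stated constants (1 OD260 = 40 μg ml−1 and 330 g mol−1 per nucleotide). The transcript length, absorbance reading and dilution factor in the example are hypothetical, as the text does not report them, and the function names are illustrative.

```python
UG_PER_ML_PER_OD260 = 40.0   # stated conversion for RNA
G_PER_MOL_PER_NT = 330.0     # stated average nucleotide mass


def rna_ug_per_ml(a260, dilution_factor=1.0):
    """Mass concentration of the undiluted RNA stock in ug/ml."""
    return a260 * UG_PER_ML_PER_OD260 * dilution_factor


def rna_pmol_per_ul(a260, length_nt, dilution_factor=1.0):
    """Molar concentration of a transcript of length_nt in pmol/ul."""
    ug_per_ul = rna_ug_per_ml(a260, dilution_factor) / 1000.0
    mol_weight = length_nt * G_PER_MOL_PER_NT          # g/mol
    return ug_per_ul * 1e6 / mol_weight                # ug -> pmol via g/mol


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 on a 1:100 dilution of a 300-nt transcript.
    print(rna_ug_per_ml(0.5, 100))
    print(round(rna_pmol_per_ul(0.5, 300, 100), 2))
```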
Conditions for toeprint assays were according to Hartz et al. (1988), with modifications suggested by G. Spedding (personal communication). End-labelled oligonucleotide (6.4 pmol, in H2O) was annealed to mRNA (3.2 pmol, in H2O) in a 10 μl volume of storage buffer without magnesium (SB-Mg; 10 mM tris-acetate, pH 7.4, 60 mM NH4Cl, 6 mM β-mercaptoethanol) by denaturing for 3 min at 65°C, followed by annealing for 20 min at 50°C, and then transferred to ice with the addition of 2 μl storage buffer containing 60 mM magnesium (SB + 60 Mg) (10 mM tris-acetate, pH 7.4, 60 mM magnesium acetate, 60 mM NH4Cl, 6 mM β-mercaptoethanol). Ribosomal subunits (30S) were purified and generously provided by W. Van Etten (manuscript submitted) according to a method of G. Spedding (personal communication); subunits were ‘activated’ by a 15 min incubation at 37°C before use. Ternary complexes were formed by combining 2 μl of the above annealing reaction (containing 0.53 pmol of mRNA) with 10–20 pmol of uncharged tRNAfMet, 1 μl of dNTPs (3.75 mM each dATP, dGTP, dCTP, dTTP) and activated ribosomes in a 9 μl reaction volume of SB for 10 min at 37°C. Depending on the specific toeprinting reaction, ribosome concentrations ranged from zero- to 16-fold molar excess over mRNA. Primer extension was initiated with the addition of 1 unit AMV reverse transcriptase (diluted to 1 unit μl−1 in 50 mM tris-acetate, pH 7.4, 2 mM DTT, 50% glycerol). Extension was continued for 15 min at 37°C until the reaction was terminated by precipitation with 0.3 M sodium acetate (pH 6) and 2.5 volumes ethanol. Dried pellets were dissolved in 10 μl Loading Dye (80% formamide, 10 mM NaOH, 1 mM EDTA, 0.05% Bromophenol blue, 0.05% Xylene cyanol). Reactions were heat denatured for 5 min at 95°C before loading onto a denaturing 6% polyacrylamide sequencing gel.
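The quantities in the annealing and ternary-complex steps can be checked with a few lines of arithmetic. The sketch below reproduces the 0.53 pmol of mRNA per 2 μl aliquot, the magnesium concentration after addition of SB + 60 Mg, and the amount of 30S subunits implied by a given molar excess. Variable and function names are hypothetical; all input values are taken from the text.

```python
ANNEAL_VOLUME_UL = 10.0      # annealing volume in SB-Mg
MG_BUFFER_UL = 2.0           # SB + 60 Mg added after annealing
MG_BUFFER_MM = 60.0          # magnesium acetate in SB + 60 Mg
MRNA_PMOL = 3.2
PRIMER_PMOL = 6.4
ALIQUOT_UL = 2.0             # volume taken for each ternary-complex reaction


def final_volume_ul():
    """Volume of the annealing mix after magnesium addition."""
    return ANNEAL_VOLUME_UL + MG_BUFFER_UL


def mrna_per_aliquot_pmol():
    """mRNA carried into each reaction by one aliquot."""
    return MRNA_PMOL * ALIQUOT_UL / final_volume_ul()


def mg_in_annealing_mix_mm():
    """Magnesium concentration in the annealing mix after addition of SB + 60 Mg."""
    return MG_BUFFER_MM * MG_BUFFER_UL / final_volume_ul()


def ribosomes_pmol(fold_excess):
    """30S subunits needed for a given molar excess over mRNA."""
    return fold_excess * mrna_per_aliquot_pmol()


if __name__ == "__main__":
    print(f"primer:mRNA ratio = {PRIMER_PMOL / MRNA_PMOL:.1f}")
    print(f"mRNA per aliquot  = {mrna_per_aliquot_pmol():.2f} pmol")
    print(f"Mg2+ after mixing = {mg_in_annealing_mix_mm():.1f} mM")
    print(f"16-fold 30S       = {ribosomes_pmol(16):.2f} pmol")
```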
*Present address: Procter and Gamble Pharmaceuticals, Health Care Research Center, Mason, OH 45040-9462, USA
We thank J. R. Liston, Bill Van Etten and Angela Walker for stimulating discussions and helpful suggestions throughout this study. We thank Bill Van Etten for purified 30S subunits and other materials needed for the toeprint assays, and Bill Van Etten and Gary Spedding for advice on the toeprinting procedure. We thank Luis Actis for assistance with Western blots, Vasker Bhattacherjee for help with Northern blots, and Peter Kuempel for E. coli PK0803. This research was supported by the National Institutes of Health Grant GM45923.