To investigate the determining factors in the selection of the transcription start points (tsp) by RNA polymerase of Escherichia coli, we systematically deleted or substituted single base pairs (bps) at 25 putative critical positions in the two extended −10 promoters, P1 and P2, of the gal operon. These changes extend downstream from −24 to +1 of the P1 promoter. In vitro transcription assays using supercoiled DNA templates revealed a preference for a purine in the non-template strand for tsp in both promoters. The optimal tsp is the 11th bp counting downstream from the −10 position. A single bp deletion anywhere from −10 to +1 switched the tsp to the next available purine 2–3 bp downstream on the non-template strand, whereas deleting a single bp at any position from −24 to −11 did not affect the tsp. The nature of the 10 bp sequence of the −10 to −1 region, while affecting promoter strength, did not influence tsp. The cAMP–CRP complex, which stimulates P1 and represses P2, did not affect the tsp selection process. The rules of tsp selection by RNA polymerase containing σ70 in gal and pyr promoters discussed here may be applicable to others.
In vivo and in vitro analysis of the 5′ ends of mRNA and the compilation of promoter DNA sequences showed that bacterial RNA polymerases (RNAP) initiate transcription more or less at a fixed position in a promoter (Hawley and McClure, 1983; Harley and Reynolds, 1987). Transcription start points (tsp) are influenced by the nature of the RNAP. For Escherichia coli RNAP holoenzyme containing σ70, the tsp (marked as +1) is usually a purine, with a preference for C at −1 position and a preference for T at +2 position in the non-template strand of DNA (Maitra et al., 1966; Hawley and McClure, 1983). It is presumed that a specific sequence in one or more of the −10, extended −10 and −35 elements of the promoter guides the RNAP to select the cognate start point, which is located downstream of the critical A:T-rich −10 hexamer. Some of the factors that influence the selection of tsp are nucleotide pools, RNAP sigma factors, distance from the −10 and −35 regions, and the sequence context around the initiation sites (Sørensen et al., 1993; Jeong and Kang, 1994; Liu and Turnbough, 1994; Fredrick and Helmann, 1997; Tu and Turnbough, 1997; Walker and Osuna, 2002).
The cyclic adenosine monophosphate (cAMP) receptor protein (CRP) regulates both promoters in opposite directions. The cAMP–CRP complex (referred to as CCC in this article) activates P1 and represses P2 by binding to its site, centred at −41.5 (Musso et al., 1977; Adhya and Miller, 1979; Taniguchi et al., 1979; Busby et al., 1982). To investigate the role of DNA sequence in the selection of tsp in the gal promoters, we deleted individual bp at every position around the promoters by polymerase chain reaction (PCR)-mediated site-directed mutagenesis. In vitro transcription assays were performed on supercoiled DNA templates carrying the alterations in the absence and presence of CCC to investigate the rules of the site selection and the role of the regulator, if any, in the process. Our results delineate the rules of the selection of tsp by RNAP in the two gal promoters.
Results and discussion
Change of +1A of P1 (+5 of P2)
We investigated the effect of substitution or deletion of the A:T bp at position +1 on tsp by performing in vitro transcription and primer extension assays on the altered supercoiled DNA templates (Figs 2A and 3). The intrinsic strength of P1 was weaker than that of P2 (1:1.34) with the wild-type template, indicating that RNAP formed more productive complexes at P2 than at P1 (Fig. 2A, lanes 1 and 2, and Fig. 3A). CCC enhanced P1 activity by 15-fold and repressed P2 activity by 2.5-fold (Fig. 2). This enhancement level of P1 transcription by CCC was considerably higher than previously reported, the reason for which is not clear (Busby et al., 1982; Choy and Adhya, 1993). Replacement of the A residue with G at position +1 (C−1G+1T+2A+3) (lanes 3 and 4) had no effect on P1 strength, suggesting that A and G are equally efficient in initiation at +1 (Figs 2B and 3B). On this (A→G) template, CCC stimulated P1 and repressed P2 as in wild type. When the +1A was changed to +1C (C−1C+1T+2A+3) (Fig. 2A, lanes 5 and 6), the intrinsic strength of P1 with tsp distributed equally at +1C and +3A was drastically reduced (sixfold) (Figs 2B and 3C). CCC stimulated both tsps by sixfold. When +1A was changed to +1T (Fig. 2A, lanes 7 and 8), P1 initiated more weakly from +1T than from +3A (C−1T+1T+2A+3). The +1 and +3 starts observed in RNA gels were confirmed by primer extension assays (Fig. 3D). CCC enhanced the starts by ≈10-fold (Fig. 2B). The change to +1T did not affect the intrinsic strength of P2. The deletion of +1A (C−1Δ+1T+2A+3) switched the P1 tsp to +3A and as expected reduced the length of the P2 transcript by 1 nt (Fig. 2, lanes 9 and 10, and Fig. 3E). CCC stimulated P1 by 10-fold and reduced P2 by fivefold (Fig. 2B). Taken together, these results showed a flexibility of RNAP in selecting the tsp in P1 depending on the nature of the base at +1. A pyrimidine is not preferable as a tsp for P1. 
The efficiency at the tsp for P1 follows a hierarchy: A = G > C > T, an observation similar to that made in the E. coli lacUV5 promoter (Jeong and Kang, 1994). In the case of a poor starting base, tsp frequently moved to a downstream purine.
Change at −5A of P1 (+1 of P2)
As the sequence contexts surrounding the two tsp are different, we investigated whether substitutions or deletions at position −5, which corresponds to +1 of P2, affect P2 (Fig. 1B). An A to G change at −5 did not affect the intrinsic strength of P2 or its tsp (Fig. 4A, lanes 3 and 4, and Fig. 4B). This suggests that P2, like P1, can start efficiently with either purine. An A to C or T substitution reduced the intrinsic strength of P1 (lanes 5–8). P2 was significantly reduced with an A to C change and was inactivated with an A to T change (Fig. 4), indicating that P2 prefers a purine for efficient start. Thus, the efficiency in initiation by a base at P2 (+1) also follows the rule: A = G > C > T. Deleting −5A eliminated P2 transcription (lanes 9 and 10). The lack of any P2 transcription is noteworthy, given the fact that there was no purine available within 2 bp downstream of P2 tsp. Interestingly, the efficiency of P1 initiation with −5T and Δ−5A was partially reduced. By deleting −5A, P1 initiated at +3A instead of +1A. This suggests that besides the nature of the base at +1, the length of DNA between the tsp and the upstream promoter element(s) perhaps plays a critical role in the tsp selection process. This aspect was studied further as described below. In the above cases, the activation of P1 and the repression of P2 by CCC were more or less normal, confirming that tsp selection for the two promoters and their regulation by CCC are independent events.
When +1A was deleted, P1 initiated downstream at the next available purine (+3A). However, when −5A was deleted, there was no detectable P2 transcript, possibly due to the lack of purines within 3 bp downstream of position −5. We tested whether the P2 transcript can be restored when a purine is provided downstream of −5 (5′-T−7T−6Δ−5T−4A−3T−2C−1A+1T+2A+3) by constructing a template with double mutations (deletion of the −5A and a substitution of −3T by an A) (Fig. 5). The P2 transcript was indeed restored in the double mutant with initiation at the newly introduced purine (Fig. 5A, lanes 5 and 6, and Fig. 5C). The double mutation, however, did not switch back the tsp of P1 from +3A to +1A. This may result from the shorter length of P1 tsp from an upstream tsp determining element. Incidentally, the stimulation of P1 by CCC in the double mutant was approximately ninefold, close to the 13-fold stimulation in wild type (Fig. 5B, lanes 1 and 2 versus lanes 5 and 6).
Deletion analysis of base pairs from −12 to +1
We described above that the deletion of a single bp at +1A or −5A shifted the start site of P1 to the next available purine (+3A) although +1A was still available. This suggested that the distance between the −10 element of P1 and the start site might be a determining factor in the tsp selection process. We further deleted a single bp at every position from −24 to +1 of P1 (Fig. 6A) and investigated the effect of each deletion on the selection and strength of the tsp. The deletion of −1C or −2/3/4T shifted the tsp of P1 to +3A and reduced the length of the P2 RNA by 1 nt as observed in the case of +1A deletion (Fig. 6B, lanes 1–8). Although the amount of transcription was marginal, P1 RNA started at +3 for the Δ−5A, Δ−6/7T, Δ−8/9G and Δ−10T (lanes 9–16). CCC enhanced P1 RNA synthesis to wild-type levels in the latter four cases (Fig. 6B and E). Deletion of −10T reduced the intrinsic strength of P1 significantly because of the disruption of the −10 element of P1. The start site was at +3A both in the absence and in the presence of CCC (lanes 15 and 16).
P2 RNA synthesis was normal and initiated at −5 (+1 for P2) although its length was 1 nt shorter on templates with Δ+1A, Δ−1C and Δ−2/3/4T (Fig. 6B, lanes 3–8). CCC repressed P2 in all three cases (Fig. 6B and E). P2 transcription was inactivated by the deletion of −5A (as mentioned above), −6/7T and −8/9G (lanes 9–14). The P2 inactivation in Δ−6/7T or Δ−8/9G is probably because of the absence of a purine in the vicinity of the normal P2 tsp, as mentioned before in the case of Δ−5A. We cannot explain why a low level of P2 activity was observed with Δ−10T and not with Δ−6/7T or Δ−8/9G. CCC repressed the low level of P2 activity observed with the Δ−10T template.
In the absence of CCC, only marginal RNA synthesis occurred for P1 and P2 with the −11A deletion (lanes 17 and 18). It is known that −11A is the most critical bp of the −10 element for transcription initiation (Jones and Morgan, 1992; Li and McClure, 1998; Helmann and deHaseth, 1999; Fenton et al., 2000; Panaghie et al., 2000; Lim et al., 2001). However, in the presence of CCC, P1 was enhanced with the start site still at +3A (lane 18). An overexposure of an RNA electrophoretic gel with single deletions at positions from −13C to −10T (Fig. 6D) showed that beginning with the Δ−11A, P1 tsp shifted back to +1A in the absence of CCC. The reason for the retention of some tsp at +3A in the Δ−11A template in the presence of CCC is not known. Likewise, in Δ−12T, P1 RNA with a +1A start was barely detectable in the absence and presence of CCC (Fig. 6B, lanes 19 and 20, Fig. 6D, lanes 5 and 6, and Fig. 6E). The −12T is the first base of the −10 element of P1 (T−12A−11T−10G−9G−8T−7) and the last base of the −10 element of P2 (T−17A−16T−15G−14C−13T−12). The first (T) and the last (T) positions are highly conserved in the consensus −10 element (TATAAT). Therefore, the deletion of −12T is expected to destroy the −10 elements of both P1 and P2. With deletions of −11A and −12T, there was no initiation from P2 because of the shortening of the spacer length by 1 bp between the −10 element of P2 and its tsp (−5) (Fig. 6B and F). As there is no available purine immediately downstream of −5 for potential initiation, not surprisingly, there was no P2 transcription.
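The argument about the −10 elements can be illustrated with a small sketch. It assumes the non-template sequence from −24 to +3 assembled from the bases named in the text; the function names and the choice of which six-base window each promoter would read after the deletion are illustrative assumptions, not part of the reported analysis.

```python
# Non-template strand of the gal promoter region from -24 to +3 (P1
# numbering, no position 0), assembled from the bases named in the text.
GAL_WT = "TCTTTGTTATGCTATGGTTATTTCATA"
CONSENSUS = "TATAAT"


def index_of(pos, first_pos=-24):
    """Map a P1 coordinate (there is no position 0) to a 0-based index."""
    return pos - first_pos if pos < 0 else pos - first_pos - 1


def consensus_matches(hexamer):
    """Count the positions at which a hexamer agrees with TATAAT."""
    return sum(a == b for a, b in zip(hexamer, CONSENSUS))


# Wild-type -10 hexamers: P1 spans -12..-7, P2 spans -17..-12.
p1_hex = GAL_WT[index_of(-12):index_of(-7) + 1]
p2_hex = GAL_WT[index_of(-17):index_of(-12) + 1]

# Delete -12T, the base shared by the two hexamers.
i = index_of(-12)
del12 = GAL_WT[:i] + GAL_WT[i + 1:]

# Assumption: P2 still reads six bases starting at -17 (upstream anchor),
# and P1 reads the six bases ending at the original -7 (downstream anchor).
p2_after = del12[index_of(-17):index_of(-17) + 6]
p1_after = del12[index_of(-7) - 6:index_of(-7)]

for name, hexa in [("P1 wt", p1_hex), ("P2 wt", p2_hex),
                   ("P1 del-12T", p1_after), ("P2 del-12T", p2_after)]:
    print(name, hexa, consensus_matches(hexa))
```

Under these assumptions each wild-type hexamer matches the consensus at four of six positions, and the deletion costs P1 its conserved first T and P2 its conserved last T.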
Deletion analysis of base pairs from −24 to −13
The Δ−13C, Δ−14G, Δ−15T and Δ−16A templates either eliminated or considerably reduced P1 RNA synthesis with very slight stimulation by CCC (Fig. 6B, lanes 21–26, Fig. 6C, lanes 29 and 30, and Fig. 6E). These changes affect the −10 or extended −10 elements (T−17A−16T−15G−14) of the promoter. Interestingly, Δ−19G increased transcription of P1 by 2.7-fold over the wild type (Fig. 6C, lanes 33 and 34, and Fig. 6E). The presence of CCC further enhanced the P1 RNA synthesis. It was reported previously that a Δ−17/18T deletion caused transcription of P1 to be CCC independent, but at the same time repressed by CCC (Busby et al., 1982). Our result is inconsistent with the previously proposed model that the deletion of 1 bp allows CCC to act as a repressor as if P1 were behaving like P2 because the former is closer to the CCC binding site at position −41.5. It appears that CCC can stimulate P1 activity even though the distance between the CCC binding site and the P1 sequence was decreased by 1 bp. The entire deletion set from Δ−12T to Δ−19G (G−19T−18T−17A−16T−15G−14C−13T−12) was very defective in P2 (Fig. 6B, lanes 19–26, Fig. 6C, lanes 29–34, and Fig. 6F) because of a disruption of the −10 or extended −10 element and/or non-availability of a purine as a tsp for this promoter as discussed above. The almost undetectable level of P2 activity in this set made the repression effect of CCC on P2 insignificant.
The intrinsic level of P1 transcription was identical to that of wild type with Δ−20/21/22T, Δ−23C and Δ−24T templates with tsp at +1 and more or less normal stimulation by CCC, suggesting there is no essential information in this region for P1 transcription (Fig. 6C, lanes 35–40, and Fig. 6E). This was also more or less true for P2; the deletions of the corresponding bp (−25A, −26C, −27G and −28C) did not affect P2 (data not shown). In Δ−20/21/22T, Δ−23C and Δ−24T, a significant level of P2 transcription was observed with tsp at −5 (+1 for P2); transcription was repressed in the presence of CCC (Fig. 6C, lanes 35–40, and Fig. 6E). Thus, the tsp in the above templates with single bp deletions in the −24 to −20 segment were at the corresponding wild-type position (+1 for P1 and −5 for P2), consistent with the idea that tsp is determined by bps from −10 to +1 positions.
The distance from −10 to start point
We demonstrated that a single bp deletion at any position from −10 to +1 shifted the tsp from +1A to +3A in P1 or from −5 to −3 (+1 to +3 of P2) in a double mutant of P2; deletions upstream of −12 did not affect the tsp. We conclude that RNAP selects the tsp to be at the 11th bp counting downstream from position −10, provided it is a purine. If no purine were available at that point, RNAP would scan the next two to three positions for a purine to initiate. This rule of tsp determination even applies to inefficient promoter variants. To further test the rule, a T residue was inserted between positions −5 and −6 (∇−6T) in the mutant DNA template containing the −2T deletion to restore the normal distance from 10 bp in Δ−2T to 11 bp for P1 (Fig. 7A). The double mutant was then tested for tsp (Fig. 7B). The wild-type template started transcription at +1A (Fig. 7B, lanes 1 and 2), whereas the Δ−2T mutant template started at +3A (lanes 3 and 4) as shown before. The Δ−2T and ∇−6T containing template restored the start site of P1 to +1A, which was further stimulated by CCC to wild-type level (lanes 5 and 6), thus giving further support to the rules of tsp selection defined above. Transcription from P2 in the Δ−2T/∇−6T containing template was marginal, perhaps because the available purine is now at a non-optimal location (12th). In addition, there is evidence of stuttering of P2 RNA synthesis because of an A:T-rich sequence surrounding P2 tsp (lane 5) (Jin, 1994). Next, a T residue was inserted between positions −12 and −13 (∇−13T) in the mutant DNA template containing the Δ−2T (Δ−2T/∇−13T). In the absence of CCC, P1 was inactivated because of the disruption of its extended −10 element (lane 7). CCC repressed P2 and enhanced P1 slightly for both +1A and +3A starts (lane 8). The DNA template with ∇−6T alone started P1 transcription at the 12th position (+2 in the mutant template) counting from −10 (Fig. 7B, lanes 9 and 10). CCC enhanced P1 to wild-type level.
This work was designed to study the selection process of tsp by RNAP-σ70 holoenzyme in the two promoters of the gal system. The results generated several important experimental observations. (i) Transcription initiates at the 11th position counting from position −10 of the promoter. (ii) A purine is needed for efficient initiation at P1 and P2 as reported previously (Sørensen et al., 1993; Jeong and Kang, 1994; Liu and Turnbough, 1994; Fredrick and Helmann, 1997; Tu and Turnbough, 1997; Walker and Osuna, 2002). When a pyrimidine is substituted at the designated tsp, transcription initiation is much less efficient. Our results confirm the previous findings of Liu and Turnbough (1994), who demonstrated that in the promoter of the pyrC operon, which encodes enzymes for the synthesis of UMP, the tsp has a preference for ATP ≥ GTP > UTP > CTP. Incidentally, with +1C, the efficiency of initiation is high in the presence of excess CTP in the pyrC promoter. (iii) If the length between the −10 position and the tsp is decreased by 1 bp, the tsp moves to the next downstream purine if one is available within the next 3 bp. For P1, it moved to +3A. P2 did not initiate; the next four positions did not contain a purine. P2 did initiate if a purine was introduced at the +3 position (counting from +1 of P2). (iv) Increasing the distance between the −10 and +1 positions from 11 to 12 bp makes RNAP move the tsp to the 12th position, which is a purine. (v) In the absence of a purine at the 11th position, RNAP does not initiate at the 10th position even if it is a purine. (vi) The requirement of 11 bp or one DNA helical turn between the −10 and tsp positions suggests that RNAP is very likely in contact with two points on the same face of the DNA. (vii) Alteration of critical promoter features, which reduces transcription efficiency, still follows the same principle of tsp selection.
(viii) CCC did not regulate the tsp selection scheme; it stimulated P1 and repressed P2 independent of start points. CCC activated P1 on templates with deletions from −17/18T to −24T. This suggests that the distance between CRP binding site and the −10 element can be altered by 1 bp without affecting CRP action.
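Observations (i) to (v) can be restated as a small predictive sketch. It is only an illustration of the rule under stated assumptions: the sequence from −24 to +3 is assembled from the bases named in the text, the scan window of three positions follows the "within the next 3 bp" wording, and the function and variable names are hypothetical. It predicts position only and says nothing about promoter strength or CCC effects.

```python
PURINES = {"A", "G"}

# Non-template strand from -24 to +3 (P1 numbering, no position 0),
# assembled from the bases named in the text.
GAL_WT = "TCTTTGTTATGCTATGGTTATTTCATA"


def index_of(pos, first_pos=-24):
    """Map a P1 coordinate (there is no position 0) to a 0-based index."""
    return pos - first_pos if pos < 0 else pos - first_pos - 1


def delete(seq, pos):
    """Delete the base at wild-type coordinate `pos`."""
    i = index_of(pos)
    return seq[:i] + seq[i + 1:]


def insert_before(seq, pos, base):
    """Insert `base` immediately upstream of wild-type coordinate `pos`."""
    i = index_of(pos)
    return seq[:i] + base + seq[i:]


def predict_tsp(seq, minus10_index, optimal=11, max_scan=3):
    """Return the 0-based index of the predicted start, or None.

    The -10 base counts as position 1. The 11th position is used if it is
    a purine; otherwise up to `max_scan` further positions are scanned
    downstream (assumed window). Upstream positions are never considered.
    """
    start = minus10_index + optimal - 1
    for i in range(start, start + max_scan + 1):
        if i < len(seq) and seq[i] in PURINES:
            return i
    return None


P1_MINUS10 = index_of(-10)   # T of the P1 hexamer TATGGT
P2_MINUS10 = index_of(-15)   # T of the P2 hexamer TATGCT

wt_p1 = predict_tsp(GAL_WT, P1_MINUS10)                    # +1A
wt_p2 = predict_tsp(GAL_WT, P2_MINUS10)                    # -5A
del2_p1 = predict_tsp(delete(GAL_WT, -2), P1_MINUS10)      # original +3A
del5_p2 = predict_tsp(delete(GAL_WT, -5), P2_MINUS10)      # no purine found

# Double mutant: delete -5A and change the original -3T to A.
d5 = delete(GAL_WT, -5)
k = index_of(-3) - 1          # the original -3 sits one index upstream now
double_p2 = predict_tsp(d5[:k] + "A" + d5[k + 1:], P2_MINUS10)

# Insertion of T between -6 and -5, alone and combined with the -2T deletion.
ins6_p1 = predict_tsp(insert_before(GAL_WT, -5, "T"), P1_MINUS10)
rescue_p1 = predict_tsp(insert_before(delete(GAL_WT, -2), -5, "T"), P1_MINUS10)
```

The −10 index is passed in explicitly because a deletion or insertion upstream of −10 would shift it by one; the changes used here all lie downstream of the reference position of the promoter being tested.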
The plasmids used in this study are listed in Table 1. Plasmids were constructed by molecular cloning techniques (Sambrook et al., 1989). The parental plasmid (pSA850) contains the following features: the phage attachment site (attP′OP), the corresponding bacterial attachment site (attB′OB), the ρ-independent transcription terminator site of rpoC gene and the multiple cloning sites (mcs) (Squires et al., 1981; Choy and Adhya, 1992; Lewis, 2003). The ρ-independent transcription terminator was inserted downstream of OI to generate 125- and 130-nt-long transcripts from P1 and P2 respectively (Fig. 2A) (Lewis, 2003). Plasmid pSA850 contains the wild-type gal regulatory region from −75 to +91 (166 bp EcoRI–PstI fragment) and was used as a template for all PCR amplifications.
The mutant promoters were constructed by PCR amplifications in a Perkin Elmer PCR system 2400 using rTth DNA polymerase (Applied Biosystems). Briefly, the primer XbaI-2 (5′-atacgactcatagggaatttctagaccttcccgtttcgc-3′, covering the −180 to −139 region in pSA850) and the reverse primer (containing a deleted or substituted nucleotide covering −56 to +13) were used to amplify the left PCR product. The forward primer (containing a deleted or substituted nucleotide covering −24 to +45) and the reverse primer Hind3–6 (5′-gtgctgcaaggcgattaagttgggtaacgccaggg-3′, covering the +631 to +597 region in pSA850) were used to generate the right PCR products. Both PCR products were purified from the parental plasmid on a 1% agarose gel electrophoresed in 10 mM Tris acetate, pH 8.0, 1 mM EDTA (1× TAE) buffer. A vertically sliced segment of the gel was stained in 0.5 µg µl−1 ethidium bromide solution to localize the PCR products. The stained gel slice was aligned to the unstained gel and the unstained PCR products were excised from the gel. The DNA was eluted from the gel slice according to the protocol outlined in the QIAquick gel extraction kit from Qiagen. The two PCR products were mixed and amplified by the two external primers (XbaI-2 and Hind3–6). The resulting PCR product was separated on a gel, digested with EcoRI and HindIII and purified by QIAquick PCR purification kit. The digested PCR product was cloned into pSA850, which was digested with EcoRI and HindIII, and dephosphorylated with calf intestinal alkaline phosphatase. The recombinant plasmids were transformed into maximum efficiency E. coli DH5α competent cells (Invitrogen). Purification of the plasmid DNA was performed according to the protocol outlined in the Qiagen plasmid Midi kit. The concentration of the DNA was determined spectrophotometrically at 260 and 280 nm. The plasmid DNA sequences were confirmed by sequencing on an ABI Prism 310 Genetic Analyzer.
To avoid confusion in the description of deletions in the regions with repetitive bases, we represent the deletion of −2T, −3T or −4T as Δ−2/3/4T, −6T or −7T as Δ−6/7T, −8G or −9G as Δ−8/9G, −17T or −18T as Δ−17/18T and −20T, −21T or −22T as Δ−20/21/22T.
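The reason for the combined notation is that deleting any one base within a run of identical bases gives the same template sequence. A minimal sketch of this, assuming the −24 to +3 non-template sequence assembled from the bases named in the text (helper names are hypothetical):

```python
from collections import defaultdict

# Non-template strand from -24 to +3 (P1 numbering, no position 0),
# assembled from the bases named in the text.
GAL_WT = "TCTTTGTTATGCTATGGTTATTTCATA"


def index_of(pos, first_pos=-24):
    """Map a P1 coordinate (there is no position 0) to a 0-based index."""
    return pos - first_pos if pos < 0 else pos - first_pos - 1


def delete(seq, pos):
    """Delete the base at wild-type coordinate `pos`."""
    i = index_of(pos)
    return seq[:i] + seq[i + 1:]


def group_equivalent_deletions(seq, positions):
    """Group single-base deletions that yield identical sequences."""
    groups = defaultdict(list)
    for pos in positions:
        groups[delete(seq, pos)].append(pos)
    return list(groups.values())


# The 25 positions from -24 to +1 (skipping the non-existent position 0).
positions = [p for p in range(-24, 2) if p != 0]
groups = group_equivalent_deletions(GAL_WT, positions)
ambiguous = [g for g in groups if len(g) > 1]
print(ambiguous)
```

The groups with more than one member correspond to the five combined labels defined above, so the 25 positions give 18 distinguishable single-deletion templates in this sketch.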
In vitro transcription assays
In vitro transcription reactions were performed according to the method described by Lewis (2003). Briefly, supercoiled DNA templates (2 nM) were pre-incubated at 37°C for 5 min in transcription buffer (20 mM Tris acetate, 10 mM magnesium acetate, 200 mM potassium glutamate) supplemented with 1 mM DTT, 1 mM ATP, 100 µM cAMP, 0.8 U µl−1 rRNasin (Promega) and 20 nM RNAP (USB, specific activity: 2.4 × 10³ U mg−1, 1 U µl−1) in a total reaction volume of 50 µl. When required, 50 nM CRP was added. To start the transcription reactions, nucleotides were added to a final concentration of 0.1 mM GTP, 0.1 mM CTP, 0.01 mM UTP and 5 µCi [α-32P]-UTP (ICN, specific activity: > 3000 µCi mmol−1, 10 µCi µl−1). The reactions were incubated at 37°C for an additional 10 min before they were terminated by the addition of an equal volume (50 µl) of loading dye (90% formamide, 10 mM EDTA, 0.1% xylene cyanol and 0.1% bromophenol blue). After incubation at 90°C for 2–3 min, samples were chilled on ice. Aliquots of 6 µl sample were loaded on a 6% sequencing gel and electrophoresed at constant power of 60–65 W. The 106 and 108 nt RNAI transcripts were used as internal controls to quantify the relative amount of gal transcripts (Tomizawa et al., 1981).
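For orientation, the stated concentrations can be converted into absolute amounts per reaction. The sketch below is simple unit arithmetic, assuming that the quoted concentrations refer to the 50 µl reaction volume; the function and variable names are illustrative.

```python
def amount_pmol(conc_nM, volume_ul):
    """Amount in pmol: nM x ul gives fmol, so divide by 1000."""
    return conc_nM * volume_ul / 1000.0


REACTION_VOLUME_UL = 50  # total reaction volume stated in the text

template_pmol = amount_pmol(2, REACTION_VOLUME_UL)    # supercoiled template
rnap_pmol = amount_pmol(20, REACTION_VOLUME_UL)       # RNAP holoenzyme
crp_pmol = amount_pmol(50, REACTION_VOLUME_UL)        # CRP, when added

# Molar excess of RNAP and CRP over template.
rnap_per_template = rnap_pmol / template_pmol
crp_per_template = crp_pmol / template_pmol

# Volume of labelled UTP stock needed for 5 uCi at 10 uCi per ul.
label_volume_ul = 5 / 10

print(template_pmol, rnap_pmol, crp_pmol)
print(rnap_per_template, crp_per_template, label_volume_ul)
```

On these assumptions each reaction contains roughly 0.1 pmol of template with a 10-fold molar excess of RNAP and, when present, a 25-fold excess of CRP.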
Mapping of tsp
To map the tsp or 5′ end of the gal transcripts, a primer PEgus-3, 5′-ccaatgtaaccgctaccac-3′, which is complementary to the non-coding template in pSA850 from +64 to +45, was used. Primer extension assays were conducted as described in the protocol of AMV reverse transcriptase by Promega (cat. No. E3030). The in vitro transcription assays were performed as described above with a few exceptions to obtain total RNAs. First, cold NTPs (1.0 mM ATP, 0.02 mM each of GTP, CTP and UTP) were used to generate unlabelled RNA. Second, these reactions were incubated at 90°C for 2–3 min to inactivate RNAP. PEgus-3 was end-labelled with [γ-32P]-ATP (ICN, specific activity: > 7000 µCi mmol−1). One microlitre of [γ-32P]-PEgus-3 (1 pmole µl−1) and 5 µl AMV primer extension 2× buffer (100 mM Tris-HCl, pH 8.3, 100 mM KCl, 20 mM MgCl2, 1 mM Spermidine, 2 mM GTP, CTP, ATP and TTP) were added to 5 µl of total RNA transcript. The extension reactions were incubated at 58°C for 20 min to anneal the primer to RNA before cooling the reactions to room temperature for 10 min in a Perkin Elmer PCR system 2400. A mixture of 2 mM sodium pyrophosphate, AMV primer extension 1× buffer and AMV reverse transcriptase (0.05 U µl−1) was added to the annealed reactions and incubated at 42°C for 30 min to extend the primer to the tsp. An equal volume (20 µl) of loading dye (98% formamide, 10 mM EDTA, 0.1% xylene cyanol and 0.1% bromophenol blue) was used to terminate the reactions. Six microlitre and 2 µl aliquots of the reactions performed in the absence or presence of CCC, respectively, were loaded on a 6% sequencing gel. CCC activated P1 by ≈15-fold; therefore, only 2 µl aliquots of the reactions were loaded to obtain a discernible band.
DNA sequencing was carried out according to the fmol® DNA cycle sequencing system protocol from Promega (Cat. No. Q4100). Briefly, 18 µl of master mix containing 0.6 µg µl−1 plasmid DNA template, 2 pmole [γ-32P]-PEgus-3, 5 U Taq DNA polymerase and sequencing buffer (5×) was prepared. Aliquots of 4 µl of the master mix were added to four tubes containing 2 µl of ddGTP, ddATP, ddTTP or ddCTP. The labelled primer, [γ-32P]-PEgus-3, was used for both primer extension and DNA sequencing reactions. The sequencing reactions were performed on a Perkin Elmer PCR System 2400 with the following programme: 1 cycle: 95°C for 2 min; 30 cycles: 65°C for 30 s, 42°C for 30 s and 70°C for 1 min; 1 cycle hold at 4°C. An equal volume (6 µl) of loading dye was added to each reaction. The reactions were heated to 90°C for 2–3 min before 2 µl of each reaction was loaded on a 6% sequencing gel in the order of G, A, T and C reactions.
We thank Stephen Busby for valuable critical comments on the manuscript.