Correspondence: Nancy A. Da Silva, Department of Chemical Engineering and Materials Science, University of California, Irvine, CA 92697-2575, USA. Tel.: +1 949 824 8288; fax: +1 949 824 2541; e-mail: email@example.com
Metabolic pathway engineering in the yeast Saccharomyces cerevisiae leads to improved production of a wide range of compounds, from ethanol (from biomass) to natural products such as sesquiterpenes. The introduction of multienzyme pathways requires precise control over the level and timing of expression of the associated genes. Gene number and promoter strength/regulation are two critical control points, and multiple studies have focused on modulating these in yeast. This MiniReview focuses on methods for introducing genes and controlling their copy number and on the many promoters (both constitutive and inducible) that have been successfully employed. The advantages and disadvantages of the methods will be presented, and applications to pathway engineering will be highlighted.
The yeast Saccharomyces cerevisiae is a key laboratory and industrial microorganism and an excellent host for metabolic pathway engineering. As a eukaryote, S. cerevisiae can synthesize a variety of fungal and mammalian proteins that can prove problematic in bacteria. Ease of cultivation, success at the industrial level over many years, and generally recognized as safe (GRAS) status by the U.S. Food and Drug Administration contribute to the interest in this microorganism. The broad array of tools available for molecular-level manipulation of this species and knowledge of the S. cerevisiae metabolic, secretory, transport, signaling, and other pathways enable the successful engineering of this yeast for a diverse range of applications.
Pathway engineering requires the regulated expression of multiple foreign or native genes. In contrast to the operons that can be employed in prokaryotic cells, a series of independently transcribed genes must be introduced in yeast. Both the level and the timing of enzyme synthesis can be essential for the successful introduction of new pathways. In addition to gene number and transcription level/timing, translational and post-translational control can be important for modulating protein levels.
This MiniReview focuses on methods developed to introduce and control the expression of genes in S. cerevisiae, with emphasis on those most useful for metabolic engineering applications. The review will consider the control of gene number, an important method for regulating overall expression. This will include the advantages and limitations of plasmid vectors, and useful new vector series. Chromosomal gene integration is efficient in yeast and offers precise control over gene copy number and stability. We will consider the important issues and methods for chromosomal gene integration. Controlling transcription level via constitutive and inducible promoters offers another critical method of gene expression regulation. Useful promoters (including new series of promoters), the range of promoter strengths available, and regulation will be summarized. Recent metabolic engineering studies in S. cerevisiae will be used to demonstrate the application of these methods.
While the review will focus on the critical elements of introducing, maintaining, and expressing genes, we recognize the importance of additional methods to control the enzyme levels in yeast, including those modulating translation and post-translational processing.
Introduction of genes and regulation of gene number
Both plasmid vectors and chromosomal integration are widely used to introduce genes and control copy number in S. cerevisiae. Each has an important role, and the choice depends on the overall goal (e.g. overexpression, precise control of gene number). While the plasmids available for use in yeast are much more limited than those for Escherichia coli, they have been successfully employed for many metabolic engineering applications. They are extremely useful for gene expression; however, plasmids offer limited control of copy number, and segregational stability can be a significant issue even in selective medium. As homologous recombination is very efficient in S. cerevisiae, integration of genes into the genome offers an alternative, straightforward mechanism for gene introduction. Chromosomal integration also allows the insertion of precise numbers of the same or different genes. This is particularly important for the regulated expression of metabolic pathway genes.
The three classes of autonomously replicating plasmids in yeast are YRp, YEp, and YCp. All are S. cerevisiae/E. coli shuttle vectors that typically carry a multiple cloning site (MCS) for the insertion of expression cassettes. YRp vectors carry a S. cerevisiae origin of replication (e.g. ARS sequence) with no partitioning control. These are extremely unstable (Murray & Szostak, 1983a; Da Silva & Bailey, 1991) and not generally used for metabolic engineering applications. In contrast, the widely used YCp and YEp vectors have proven successful for many applications. YCp (CEN/ARS) vectors carry both an origin of replication and a centromere sequence, have high segregational stability in selective medium, and are maintained at 1–2 copies per cell (Clarke & Carbon, 1980). YEp vectors are based on the S. cerevisiae native 2μ episomal plasmid and contain either the full 2μ sequence or, more commonly, a 2μ sequence including both the origin and the REP3 (STB) stability locus (Futcher & Cox, 1983; Kikuchi, 1983). For the full sequence, use of a cir0 strain lacking the native plasmid is recommended to prevent recombination between the native and recombinant plasmids and to keep the copy number of the recombinant vector high. For the partial 2μ plasmids, a cir+ host carrying the native 2μ is required to provide the trans-acting factors (REP1 and REP2) needed for stability. These vectors are generally more structurally stable than the full 2μ plasmids, but may be maintained at lower copy number. Although the maintenance of YEp vectors at 10–40 copies (Romanos et al., 1992) is generally assumed, copy number is not controlled and can vary widely with the gene product and level of expression. Expression from a strong constitutive promoter or synthesis/secretion of complex products can reduce the average copy number and plasmid stability significantly, or overload a pathway in the cell (e.g. Moore et al., 1990; Ro et al., 2008; Fang et al., 2011). In extreme cases, use of a CEN/ARS vector may give higher product levels (Wittrup et al., 1994).
The general lack of yeast plasmids that are maintained at very high copy number has led to the development of 2μ-based vectors carrying selection markers such as LEU2-d and URA3-d (Beggs, 1978; Erhart & Hollenberg, 1983; Loison et al., 1989). The defective promoters on the markers result in an increase in copy number; hundreds of copies have been reported in selective medium, although these high copy numbers are not required for viability (Lopes et al., 1991). Such vectors are generally more useful for the overexpression of a product gene than for metabolic engineering applications. However, Ro et al. (2008) demonstrated the successful use of a plasmid carrying the LEU2-d marker (and three pathway genes) for the synthesis of artemisinic acid in nonselective, complex medium.
Several vector series (Table 1) have been developed carrying a series of selection markers on YEp and YCp plasmids (and YIp integrating vectors discussed below). Ma et al. (1987) constructed a series of YCp and YEp plasmids with LEU2, HIS3, LYS2, URA3, and TRP1 selection markers. The YEplac and YCplac plasmids (Gietz & Sugino, 1988) carry a MCS and URA3, TRP1, and LEU2 selection markers on 2μ- and CEN/ARS-based plasmids, respectively. The pRS series (Sikorski & Hieter, 1989; Christianson et al., 1992) are similarly useful cloning vectors with URA3, TRP1, HIS3, and LEU2 markers on both 2μ and CEN/ARS plasmids. Brachmann et al. (1998) and Taxis & Knop (2006) extended the pRS series to include the MET15, ADE2, kanMX, hphNT1, and natNT2 selectable markers. The various vector series have been widely used for gene expression in yeast.
Table 1. Vector series for gene expression in Saccharomyces cerevisiae (autonomous plasmids and integration vectors/templates)
Groups of vectors carrying constitutive and inducible promoters have also been developed (Table 1). These include the variants of the pRS series carrying ADH1, TEF1, GPD1, MET25, CYC1, GAL1, and GALL or GALS (GAL1 variant) promoters (Mumberg et al., 1994, 1995) and the CUP1 promoter (Labbe & Thiele, 1999). Cartwright et al. (1994) developed YEp and YCp expression vectors carrying a URA3 selection marker and the PGK, GAL1, GAL10, PHO5, and CUP1 promoters. The YEplac and YCplac plasmids were modified to carry the tetracycline-responsive tet-on/off promoters (Gari et al., 1997). The commercially available pYES and pYC series (Invitrogen) offer expression from the GAL1 promoter on 2μ or CEN/ARS vectors, respectively. These vectors are available with URA3, TRP1, and blasticidin resistance selection markers. The pGREG vectors (Jansen et al., 2005) are derivatives of the pRS series with five different selectable markers on CEN/ARS-based plasmids with a GAL1 promoter and can be useful tools for plasmid construction via in vivo recombination and for the expression of N- and C-terminal-tagged fusion proteins (nine tags available). Vector series such as those by Funk et al. (2002), Van Mullem et al. (2003), Geiser (2005), and Alberti et al. (2007) have been constructed utilizing the Gateway™ cloning technology (Invitrogen, review by Walhout et al., 2000), a versatile method that allows for the insertion of ORFs into vectors by in vitro recombination using the bacteriophage lambda att sites. The vector series contain various promoters and selection markers, 2μ or CEN/ARS sequences, and additional features such as epitope tags. Recently, Fang et al. (2011) constructed a series of 32 pXP shuttle vectors, utilizing three constitutive promoters, PGK1, TEF1, and HXT7-391, and six reusable selection markers on both 2μ and CEN/ARS vectors. These vectors can be used as templates for gene integration as described below. M.W.Y. Shen, F. Fang, S. Sandmeyer & N.A. Da Silva (unpublished data) have extended this series to include the GAL1, ADH2, and CUP1 promoters.
These various vector series have been widely used for introducing single genes and also multiple genes for metabolic engineering in S. cerevisiae. Jiang et al. (2005) introduced a vector built from pYES expressing three genes for the biosynthesis of naringenin. Yan et al. (2005) introduced four genes for this pathway onto a single plasmid built from YEplac. Despite the presence of four GAL1 promoters on the same vector, no structural instability was observed for the 65 h of culture. However, repetitive sequences on a plasmid can be an issue as seen in the work by Verwaal et al. (2007) on carotenoid synthesis in S. cerevisiae, where significant structural instability led to the decision to integrate the genes. Leonard et al. (2005) introduced a 2μ plasmid carrying four genes and a CEN/ARS vector carrying three genes (built from YEplac and YCplac, respectively) for flavone biosynthesis. Carlson & Srienc (2006) used modified versions of the pRS vectors to engineer S. cerevisiae for the production of the biopolymer poly[(R)-3-hydroxybutyrate] (PHB). Two plasmids (one containing the bidirectional GAL1-GAL10 promoter) were used to express three pathway genes. These are just a few examples of the widespread use of such vector series for metabolic pathway engineering.
Vectors carrying two or more promoters allow the expression of more than one gene on a single plasmid. The bidirectional promoter plasmid series pBEVY and pBEVY-G were constructed by Miller et al. (1998) and have the TDH3 and ADH1 promoters (constitutive), or the GAL1-GAL10 (galactose-inducible) promoters, on 2μ vectors with four different selectable markers. Li et al. (2008) extended this series to include eight new 2μ-based pY2x-GAL(1/10)-GPD plasmids; these carry an inducible GAL1 or GAL10 promoter and a constitutive TDH3 promoter with four different markers. The pESC vector series is commercially available (Agilent Technologies) and has the bidirectional GAL1-GAL10 promoter cassette on 2μ-based vectors carrying one of the four selectable markers. The recently described pSP-G1 and pSP-G2 vectors (Partow et al., 2010) carry the constitutive TEF1 and PGK1 promoters in two different orientations; these are based on the original pESC-URA3 plasmids and aimed at high-level expression in yeast. All of these vectors allow the expression of two genes from the same construct, avoiding the need to carry two different plasmids in the cell.
Vectors carrying bidirectional promoters are thus very useful for metabolic engineering in S. cerevisiae. In particular, pESC variants carrying the bidirectional GAL1-GAL10 promoters have been used to express two to four genes on a single 2μ plasmid. Kim et al. (2011) used a pESC vector to evaluate the effect of the expression of pathway genes on the biosynthesis of ceramides. Maury et al. (2008) introduced the bacterial isoprenoid pathway into S. cerevisiae with eight genes carried on two plasmids (four genes under the control of two bidirectional promoters per plasmid).
To introduce multiple genes simultaneously, yeast artificial chromosomes (YACs) may be useful. Originally described by Murray & Szostak (1983b), these vectors carry an origin of replication, a centromere sequence, and telomeres and can be constructed by both in vivo recombination and in vitro ligation of the DNA fragments (Burke et al., 1987). Because of their ability to carry large fragments of DNA, such vectors have been useful in studying genomes, but can also be used to carry large pathway constructs on a single new ‘chromosome’. In a recent study by Naesby et al. (2009), eYACs (expressible yeast artificial chromosomes) were used to randomly assemble a group of genes for a seven-step flavonoid pathway. Fifty percent of the clones produced naringenin when grown in the presence of coumaric acid. Stability was comparable to normal YACs, and flavonoid production was maintained for more than 50 generations.
Chromosomal integration of genes
Plasmid vectors are ideal for the overexpression of genes at ‘high’ (YEp) or ‘low’ (YCp) levels and are convenient and easy to use. In combination with integrated genes, plasmids allow a quick assessment of the degree of overexpression required in a pathway. However, two or more 2μ and/or CEN/ARS vectors can be difficult to stably maintain simultaneously in a single cell (Futcher & Carbon, 1986; Mead et al., 1986). For the introduction of multiple genes, long-term stability, and precise control of expression, integration of the genes into the chromosome holds several advantages.
The ease of homologous recombination in S. cerevisiae (Oldenburg et al., 1997; Raymond et al., 1999; Schaerer-Brodbeck & Barberis, 2004; Gibson et al., 2008) makes genomic integration an attractive method for the introduction of pathway genes. A variety of vector- and PCR-based methods have been developed for single- or multicopy integration. In combination with reusable selection markers and characterized integration sites, chromosomal integration provides precise control over gene copy number and can ensure segregational stability. Integration may not be ideal for the overexpression of genes but is a key method for metabolic engineering in yeast. Important issues include the ease of gene integration and selection, stability of the inserts, and the ability for multiple simultaneous or sequential integrations (e.g. by marker recycling).
Vector-based gene integration
Yeast integrating plasmids (YIp) are vectors that carry a MCS, selection marker, target site(s), and no replication origin. These cannot be maintained in the cell unless integrated into the chromosomes. Depending on the position of the target sequences, integration into the genome can occur via single-crossover (Fig. 1a) or double-crossover (Fig. 1b) homologous recombination. For single-crossover, the vector is linearized within the target sequence. Integration results in duplicated sequences flanking the insert; this can result in structural instability as excision can occur via homologous recombination at these sites. For double-crossover integration (gene conversion and omega integration), the cassette to be integrated is inserted within the target sequence and the vector is linearized outside of this sequence. The genomic insertions are structurally stable as no duplicate sequences occur upon integration. The efficiency of integration via double-crossover is lower relative to single-crossover events, and construction of the vectors can be more difficult.
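The structural difference between the two integration modes can be illustrated with a toy string model. The following Python sketch is an illustration only: the function names, the short toy sequences, and the treatment of recombination as exact string matching are all assumptions made here, not part of any published method. It shows that a single-crossover event leaves the target duplicated on both sides of the insert, whereas a double-crossover event leaves no duplicated target.

```python
def single_crossover(chromosome: str, target: str, vector_rest: str) -> str:
    """Toy model of single-crossover integration.

    The circular vector (target + vector_rest) is linearized within its
    copy of the target, so the whole vector is inserted and the target
    ends up duplicated on both sides of the inserted sequence.
    """
    site = chromosome.find(target)
    if site < 0:
        raise ValueError("target sequence not found in chromosome")
    end = site + len(target)
    return chromosome[:end] + vector_rest + target + chromosome[end:]


def double_crossover(chromosome: str, left_arm: str, right_arm: str,
                     cassette: str) -> str:
    """Toy model of double-crossover integration.

    Whatever lies between the two homology arms in the chromosome is
    replaced by the cassette; no duplicated target is created.
    """
    start = chromosome.find(left_arm)
    if start < 0:
        raise ValueError("left homology arm not found in chromosome")
    inner = start + len(left_arm)
    stop = chromosome.find(right_arm, inner)
    if stop < 0:
        raise ValueError("right homology arm not found in chromosome")
    return chromosome[:inner] + cassette + chromosome[stop:]


# Hypothetical toy sequences: lowercase flanks, uppercase target site.
target = "ACGTTGCA"
chromosome = "ttttt" + target + "ggggg"
insert = "-cassette-marker-"

single = single_crossover(chromosome, target, insert)
double = double_crossover(chromosome, "ACGT", "TGCA", insert)

print(single)  # target present on both sides of the insert
print(double)  # cassette sits inside the (now interrupted) target
```

In this model the duplicated target produced by `single_crossover` is exactly the substrate for the excision by homologous recombination described above, which is why the double-crossover product is the structurally stable one.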
YIp vector series (Gietz & Sugino, 1988; Sikorski & Hieter, 1989; Cartwright et al., 1994; Alberti et al., 2007; Sadowski et al., 2007) have been developed carrying a series of auxotrophic markers that also act as target sequences, and integration occurs by a single-crossover event. In addition to the standard YIp vectors, useful variants have been developed through the incorporation of resistance selection markers, reusable selection markers, and repeated target sites. Use of a resistance gene, such as the bacterial neoʳ for resistance to the aminoglycoside G418 in S. cerevisiae (Jimenez & Davies, 1980), allows the selection of multiple simultaneous integrations through G418 selection. Multiple insertions are obtained because the bacterial resistance gene is only weakly expressed in yeast, so that growth at increased G418 concentrations requires several copies of the gene. Selection via resistance to G418 and other compounds (e.g. hygromycin and zeocin) can be particularly effective when combined with repetitive target sequences including the Ty1 elements, the ribosomal DNA cluster, and delta elements (Kingsman et al., 1985; Lopes et al., 1989; Fujii et al., 1990; Sakai et al., 1990, 1991).
Delta (δ) elements are the long terminal repeats (LTRs) of the S. cerevisiae Ty1 and Ty2 retrotransposons (Boeke, 1989; Boeke & Sandmeyer, 1991). Based on the sequence of strain S288C (Goffeau et al., 1996), there are several hundred delta elements dispersed in the S. cerevisiae chromosomes as solo δ elements or associated with Ty elements (Dujon, 1996; Kim et al., 1998; Wyrick et al., 2001). The multiple possible integration locations in combination with G418 or ethionine selection have led to a high efficiency of multicopy integration in a single transformation. A range of integrations (up to 80) has been reported using this method (Shiomi et al., 1995; Parekh et al., 1996; Wang et al., 1996; Lee & Da Silva, 1997a). Despite the large number of available sites, the majority of inserts are in long tandem repeats at one location. While the method is easy to employ, there are two major drawbacks: (1) the method can generally be used only once or twice (repeated G418 selection tends to lead to other mutations allowing resistance, not higher integrated copy number) and (2) the tandem nature of the integrations can lead to high instability, particularly following induction or for products that place a burden on the cell (Wang et al., 1996; Lee & Da Silva, 1997a). However, this method has been shown to be quite effective for optimizing gene copy number when an inducible GAL promoter and late induction of gene expression are combined (Parekh & Wittrup, 1997; Shusta et al., 1998). An example of the use of the δ-neo method for pathway engineering is the integration of cellulase and β-glucosidase for the conversion of cellulose to ethanol (with no instability observed) (Cho et al., 1999). This method is not ideal for introducing precise numbers of a group of different genes.
δ elements have been chosen as the target site for several integration methods (discussed below) and successfully applied for pathway engineering. A recent novel approach, ‘cocktail δ-integration’, was used to create strains for the surface expression of β-glucosidase, endoglucanase, and cellobiohydrolase (Yamada et al., 2010). Three successive rounds of cotransformation of the three integrating vectors were performed (using a different selection marker for each round: URA3, HIS3, and TRP1), with different numbers of each gene integrated in each transformation. This allowed the creation of a group of strains and selection of an optimum strain for PASC (phosphoric acid–swollen cellulose) degradation.
The incorporation of reusable selection markers enables multiple sequential gene integrations owing to marker recycling. These markers include a selection gene flanked by direct repeats allowing excision by homologous recombination. Because spontaneous loss by recombination is a low-frequency event, counterselectable markers are generally used, enabling easy selection of cells that have lost the marker. In combination with a repetitive target site (e.g. a δ sequence), multiple integrations of the same (or different) gene are possible with the one general vector construct. Lee & Da Silva (1997b) developed an integrating vector carrying the reusable URA3 blaster cassette (developed by Alani et al. (1987) for multiple gene knockouts) and the Ty1 δ sequence. Genes are integrated sequentially, and URA3 marker excision is selected using 5-FOA (Boeke et al., 1984) after each vector (or partial vector) insertion. Expression is generally linearly correlated with the number of integrations (Lee & Da Silva, 1997b, 2006). A major advantage of this method is the dispersed nature of the genes, resulting in much higher stability than with tandem inserts. Disadvantages are the length of time required for multiple integrations and knockouts and the need for Southern blots or qPCR to determine the number of insertions (as the exact delta location is not specified). The integrations via this vector are also single-crossover events that lead to approximately 300 bp delta repeats on either side. While much more stable than tandem insertions, loss of the inserts can occur with time. A double-crossover version of the vector (Fig. 1b) was developed by placing the integration cassette within the δ element and linearizing outside of this sequence prior to transformation (Ching, 2005). While successful for the integration of a series of different genes, integrations of multiple copies of the same gene were not possible due to the large homologies; during subsequent integrations, replacements of genes/marker at the same location as the previous integration were selected.
The combination of delta target site and reusable selection cassette has been successfully applied for the engineering of strains for the synthesis of a variety of compounds. Examples include the engineering of S. cerevisiae for the production of 1,2-propanediol (Lee & Da Silva, 2006) and, in combination with plasmids, for the synthesis of artemisinic acid (Ro et al., 2006).
Several other reusable selection markers have been developed. Particularly powerful are those with an active recombination system for marker recycling (Prein et al., 2000; Johansson & Hahn-Hagerdal, 2002; Radhakrishnan & Srivastava, 2005). In these cases, the marker need not be counterselectable. The Cre/loxP and FLP/FRT systems (Sauer, 1987; Sauer, 1994; Guldener et al., 1996; Gueldener et al., 2002; Radhakrishnan & Srivastava, 2005) have been successfully applied in yeast. In the former, the selection marker is flanked by loxP sequences, and the expression of the Cre recombinase allows extremely efficient marker excision. Multiple markers can be excised simultaneously, significantly decreasing the time for strain construction. The small footprint of the loxP sequence also reduces the chance of off-target homologous recombination during subsequent integrations.
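Marker recycling with Cre/loxP can be pictured as deleting the DNA between two directly repeated loxP sites while leaving a single loxP site behind. The Python sketch below is a simplified string model written for this illustration; the function name `cre_excise`, the lowercase placeholder sequences, and the rule that sites are paired strictly in order are assumptions. In particular, it ignores recombination between loxP sites that belong to different integrated cassettes.

```python
# The 34-bp loxP site: two 13-bp inverted repeats around an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(sequence: str, site: str = LOXP) -> str:
    """Remove every segment flanked by a pair of direct-repeat sites.

    Sites are paired in the order they occur (1st with 2nd, 3rd with
    4th, ...). Each pair collapses to a single remaining site, which
    mimics the small footprint left after marker excision.
    """
    kept = []
    pos = 0
    while True:
        first = sequence.find(site, pos)
        if first < 0:
            break
        second = sequence.find(site, first + len(site))
        if second < 0:
            break  # an unpaired site is left untouched
        # Keep everything up to and including the first site of the pair.
        kept.append(sequence[pos:first + len(site)])
        # Skip the marker and the second site of the pair.
        pos = second + len(site)
    kept.append(sequence[pos:])
    return "".join(kept)


# Hypothetical construct: two integrated genes, each followed by a
# loxP-flanked marker (lowercase text stands in for real sequence).
construct = ("gene1" + LOXP + "marker1" + LOXP +
             "gene2" + LOXP + "marker2" + LOXP + "end")
recycled = cre_excise(construct)
print(recycled)
```

Both markers are removed in one call, which corresponds to the simultaneous excision of multiple markers mentioned above; one loxP site remains at each former marker position.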
PCR-based gene integration
PCR-generated fragments have been widely used for gene knockouts (and gene knockins) in yeast (Langle-Rouault & Jacobs, 1995; Lorenz et al., 1995; Manivasakam et al., 1995; Goldstein et al., 1999; Gueldener et al., 2002). In recent years, integration of PCR-generated fragments has been a key method for the insertion of pathway genes into the yeast genome and the generation of libraries (Schaerer-Brodbeck & Barberis, 2004; Shao et al., 2009). The ease of homologous recombination in S. cerevisiae and the reduction in time and effort make this a preferred approach in many instances. Use of one of the many high-fidelity polymerases allows errors during PCR to be minimized, and genomic inserts can be recovered to confirm the correct sequence. In addition, insertions typically occur by double-crossover integration leading to the construction of stable strains (Fig. 1c).
In S. cerevisiae, homologous recombination requires only limited flanking homology (Manivasakam et al., 1995). Targeting efficiency increases with the length of the homology, and an overlap of approximately 50 bp (25 bp on each side) is sufficient to easily screen and recover integrants in specific genomic locations. Standard-length primers can thus be used to amplify a desired gene cassette with flanking target regions, and the PCR product can be transformed into the yeast for insertion by double-crossover. The efficiency of recombination in yeast also allows the use of nested primers or the assembly of two or more fragments (Fig. 1d) (Erdeniz et al., 1997; Hawkins & Smolke, 2008; Flagfeldt et al., 2009; Shao et al., 2009).
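As a rough illustration of how such primers are laid out, the Python sketch below builds a forward and a reverse primer, each consisting of a genomic homology overhang followed by a cassette-annealing region. The function names, the 20-nt annealing length, and the toy sequences are assumptions made for this example; the default overhang of 25 nt per side simply follows the figure quoted in the text, and a real design would also check melting temperature and secondary structure.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def integration_primers(cassette: str, upstream: str, downstream: str,
                        homology: int = 25, anneal: int = 20):
    """Design primers that add genomic homology arms to a cassette.

    forward = last `homology` nt of the upstream flank
              + first `anneal` nt of the cassette
    reverse = reverse complement of
              (last `anneal` nt of the cassette
               + first `homology` nt of the downstream flank)
    """
    if len(upstream) < homology or len(downstream) < homology:
        raise ValueError("flanking sequence shorter than homology length")
    if len(cassette) < anneal:
        raise ValueError("cassette shorter than annealing length")
    forward = upstream[-homology:] + cassette[:anneal]
    reverse = reverse_complement(cassette[-anneal:] + downstream[:homology])
    return forward, reverse


# Hypothetical toy sequences standing in for the genomic flanks and the
# promoter-gene-terminator cassette.
upstream = "ACGT" * 10
downstream = "TTGACC" * 7
cassette = "ATG" + "GCA" * 20 + "TAA"

fwd, rev = integration_primers(cassette, upstream, downstream)
print(fwd)
print(rev)
```

The PCR product made with these primers carries the cassette between two short arms that match the chromosome on either side of the intended insertion point, which is the arrangement needed for double-crossover insertion.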
The pXP vector series mentioned earlier (Fang et al., 2011; M.W.Y. Shen, F. Fang, S. Sandmeyer & N.A. Da Silva, unpublished data) was designed to allow the seamless transition from plasmid-based (2μ or CEN/ARS) expression to PCR-based chromosomal gene integration. All selection markers on the vectors are flanked by loxP sequences to allow marker recycling following integration. The promoter-gene-terminator and selection cassettes can be amplified using standard priming sequences with overhangs to target specific chromosomal sites. Although less efficient than traditional vector-based integration where homologies are typically much longer, transformants can be rapidly screened via colony PCR to select the correct integration location. This system can be used for multiple sequential (and possibly simultaneous) gene integrations and multiple simultaneous marker excisions, and is thus a useful tool for pathway engineering. These vectors have now been utilized for the insertion of metabolic pathway genes in several studies currently underway in our laboratory, including the modification of central carbon metabolism, polyketide production, and arsenic uptake and sequestration (unpublished results).
Shao et al. (2009) made use of the efficiency of recombination in yeast to develop the DNA Assembler method for the insertion of entire pathways into the yeast genome (or onto plasmids). Expression cassettes are assembled and integrated in a single step, and large inserts of up to 19 kb have been successfully inserted, demonstrating the promise of this tool for metabolic engineering. A group of different promoters and terminators was utilized to avoid repetitive sequences and subsequent instability of the inserts. The yeast Ty1 delta sequence was chosen as the insertion site, and integration occurs by single-crossover integration into one of the hundreds of potential target sites in the yeast chromosomes, contributing to the high efficiency reported. The method has been applied for the introduction of a 3-gene xylose utilization pathway, a 5-gene zeaxanthin biosynthesis pathway, and the two pathways together (at one site) (Shao et al., 2009).
The insertion of metabolic pathways via independent gene integration requires multiple target sites that provide well-characterized expression levels for the inserted genes. Traditional target integration sites have been nutritional marker genes (e.g. URA3, TRP1, and LEU2), although a variety of other locations have also been utilized. Prior work on single-crossover integration into the delta sequences (Lee & Da Silva, 1997b, 2006) found similar expression levels for multiple different integrations; however, the specific delta sites used were not determined.
Two recent studies have focused on the expression of reporter genes following integration (and full target replacement) at a group of well-defined sites. Using the pXP vectors as templates and primers containing targeting regions, Fang et al. (2011) compared the expression of the luciferase gene under the control of the strong PGK1 promoter (PPGK1-Rluc-TCYC1 cassette) at 14 different locations. The expression cassette replaced URA3, MET15, LEU2, TRP1, seven full-length Ty elements (two with replacements in both orientations), and one full-length Ty3 element. Similar levels of expression were observed at the locations studied, with a maximum difference of approximately 50% in luciferase activity. The values were approximately 60% of those on a CEN/ARS vector, as expected based on a plasmid copy number of 1–2. In another recent study, Flagfeldt et al. (2009) compared the expression of lacZ under the control of the strong TEF1 and ACT1 promoters (PTEF1-lacZ-TCYC1 and PACT1-lacZ-TCYC1 cassettes) at 20 different locations: replacements of URA3, SPB1/PBN1, PDC6, and 17 different solo LTRs. Similar levels were observed for the three gene replacements, but up to eightfold differences were observed at the LTR locations. These results were consistent for both promoters. In combination, the two studies provide 31 characterized sites for gene integration and also demonstrate the importance of evaluating new integration sites prior to utilizing them for gene insertion and expression.
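The site counts quoted above can be reconciled with a short tally. The Python sketch below is one possible reading, stated as an assumption: the 14 locations of Fang et al. (2011) are taken to be 12 distinct loci plus two loci tested in a second orientation, and URA3 is taken to be the only locus shared by the two studies.

```python
# Fang et al. (2011): loci named in the text.
fang_marker_genes = {"URA3", "MET15", "LEU2", "TRP1"}
fang_ty_elements = 7      # full-length Ty elements
fang_ty3_elements = 1     # full-length Ty3 element
fang_both_orientations = 2  # Ty loci tested in both orientations

fang_loci = len(fang_marker_genes) + fang_ty_elements + fang_ty3_elements
fang_locations = fang_loci + fang_both_orientations

# Flagfeldt et al. (2009): three gene replacements plus solo LTRs.
flagfeldt_genes = {"URA3", "SPB1/PBN1", "PDC6"}
flagfeldt_solo_ltrs = 17
flagfeldt_loci = len(flagfeldt_genes) + flagfeldt_solo_ltrs

# Assumption: only the named genes can be compared between studies.
shared = fang_marker_genes & flagfeldt_genes

combined_sites = fang_loci + flagfeldt_loci - len(shared)

print(fang_locations, flagfeldt_loci, combined_sites)
```

Under these assumptions the tally reproduces the 14 and 20 locations of the individual studies and the combined total of 31 characterized sites.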
Regulation of expression level via promoter choice
A wide variety of promoters are available for the control of transcription level in S. cerevisiae, including constitutive and inducible promoters of various strengths. In this review, the term ‘constitutive’ will be used broadly as several such promoters have a dependence on glucose level (e.g. transcription drops as glucose is exhausted) and growth phase. The regulated promoters will include those for which a higher level of control is possible or a clear inducer/repressor exists. Choice of promoter in combination with gene number allows a very wide range of gene expression levels.
A useful analysis tool to facilitate a systematic overview of available promoters and regulatory elements in S. cerevisiae is available on SCPD (The Promoter Database of S. cerevisiae) by Zhu & Zhang (1999) at http://rulai.cshl.edu/SCPD/. The database offers information on approximately 6223 open reading frames and can be used to study the yeast transcription factor binding sites.
Constitutive promoters and promoter series
Constitutive promoters have been widely used for controlling gene expression in S. cerevisiae. These promoters offer simplicity (no inducers or repressors needed) and relatively constant levels of expression. This may be desired for the introduction of new pathways in yeast, particularly if the pathway must be active during cell growth. Constitutive promoters may not be ideal for the synthesis of deleterious gene products or when separation of growth and production is desired. Plasmid stability and copy number can also be an issue, particularly when a strong constitutive promoter is used. Mumberg et al. (1995) found that the ratio of β-galactosidase expression between 2μ and CEN/ARS vectors was 30 for the ADH1 promoter, but only three for the stronger GPD promoter. In a study by Fang et al. (2011), when Rluc was expressed using a PGK1 promoter on a URA3-marked 2μ-based vector, copy number was 5.6 (vs. 11.6 for the empty vector).
The most widely used constitutive promoters have often been from the yeast glycolytic pathway. These glucose-dependent promoters include those for phosphoglycerate kinase PPGK1 (Holland & Holland, 1978; Ogden et al., 1986), pyruvate decarboxylase PPDC1 (Kellermann et al., 1986), triosephosphate isomerase PTPI1 (Alber & Kawasaki, 1982), alcohol dehydrogenase I PADH1 (Hitzeman et al., 1981; Denis et al., 1983), glyceraldehyde-3-phosphate dehydrogenase PTDH3 (GAP491) or PGPD (Holland & Holland, 1980; Bitter & Egan, 1984; McAllister & Holland, 1985), and pyruvate kinase PPYK1 (Nishizawa et al., 1989). The constitutive promoter for translation elongation factor PTEF1 (Gatignol et al., 1990) has also been widely used and has recently been modified by error-prone PCR to provide a group of promoters offering a range of expression levels (Alper et al., 2006a; Nevoigt et al., 2006). Other commonly used native promoters include PCYC1 (Guarente et al., 1984), PACT1 (Gallwitz & Seidel, 1980), PMFα1 (Brake et al., 1984), and those for hexose transport, for example PHXT7 (Reifenberger et al., 1995; Diderich et al., 1999) – although this list is by no means exhaustive. Constitutive promoters have been successfully employed for metabolic pathway engineering in yeast; examples include ammonia assimilation (Roca et al., 2003), xylose metabolism (Ho et al., 1998; Shao et al., 2009), arabinose metabolism (Becker & Boles, 2003; Wisselink et al., 2007), and malic acid production (Zelle et al., 2008).
Several variants of these promoters have been developed in order to change response to glucose or to allow inducibility. The 1500-bp promoter PADH1 (Denis et al., 1983) is activated during growth on glucose and is downregulated following glucose depletion and during ethanol consumption. A short variant of the promoter (PADH1s) with a deletion of 1100 bp in the upstream sequence shifts expression to the early ethanol growth phase with activity increasing into the late ethanol consumption phase (Ruohonen et al., 1991). Restoring 300 bp of the upstream fragment resulted in a ‘middle’ ADH1 promoter (PADH1m) that is activated in early exponential growth and maintains activity into the late ethanol consumption phase (Ruohonen et al., 1995). A common approach to confer regulation is to fuse the UAS region from inducible promoters upstream of constitutive promoters and therefore render the hybrid promoter responsive to temperature (Walton & Yarranton, 1989) or inducer metabolites and ions (Bitter & Egan, 1988; Hinnen et al., 1989; Purvis et al., 1991).
Several methods for developing ‘synthetic promoter libraries’ for the modulation of gene expression are in place (Jensen & Hammer, 1998; Alper et al., 2006a; Hammer et al., 2006; De Mey et al., 2007). For example, mutagenesis of the constitutive PTEF1 promoter sequence (Alper et al., 2006a; Nevoigt et al., 2006) by error-prone PCR resulted in the selection of 11 mutant promoters with strengths ranging between 8% and 120% that of the native PTEF1. The same technique has been extended to optimizing the regulatory properties of the oxygen-responsive DAN1 promoter in S. cerevisiae (Nevoigt et al., 2007) where two mutant promoters with quicker induction and improved expression levels were isolated. Another successful method for synthetic promoter library (SPL) creation described by Jensen & Hammer (1998) is based on saturation mutagenesis of the nucleotide spacer regions flanking key promoter elements. Jeppsson et al. (2003) successfully employed this technique to generate a promoter library called YRP, using the RPG (Nieuwint et al., 1989) and CT (Baker, 1991) regulatory elements and the CUP1 promoter, to control the expression level of glucose-6-phosphate dehydrogenase between 0% and 179% of the wild type.
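One way to use such a library is to pick the variant whose relative strength lies closest to a desired expression level. The following is a minimal sketch of that selection step, not the procedure used in the cited studies. The variant names and the intermediate strengths are hypothetical; only the 8% and 120% endpoints come from the text.

```python
from typing import Dict

# Relative strengths as a fraction of native PTEF1 (native = 1.0).
# Only the 0.08 and 1.20 endpoints are from the text; the variant names
# and the intermediate values are hypothetical placeholders.
LIBRARY: Dict[str, float] = {
    "variant_A": 0.08,
    "variant_B": 0.35,
    "variant_C": 0.70,
    "variant_D": 1.20,
}


def closest_promoter(library: Dict[str, float], target: float) -> str:
    """Return the name of the variant whose strength is nearest to `target`."""
    if not library:
        raise ValueError("library is empty")
    return min(library, key=lambda name: abs(library[name] - target))


choice = closest_promoter(LIBRARY, target=0.5)
print(choice, LIBRARY[choice])
```

In practice the measured strengths would depend on the reporter gene and growth conditions, so a table like this would need to be rebuilt for each experimental setting.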
In contrast to the creation of promoter libraries, the multiple-gene-promoter-shuffling (MGPS) method developed by Lu & Jeffries (2007) can be used to achieve the optimal levels of overexpression of several genes at once, thereby allowing synergistic control over several rate-limiting steps and metabolic flux.
Regulated promoters enable control over the timing and level of gene expression. They are thus more suitable when expression of genes is desired at a specific stage of cell growth, or to prevent the build-up of toxic pathway intermediates. The use of inducible promoters is limited by the sensitivity of the promoter to the inducer (including the strength and time for response to the inducer or repressor), background levels of expression because of ‘leaky’ promoters, and the cost of metabolites for induction. In addition, strain response to the inducer may add to the complexity of expression control.
A large variety of regulated native or engineered promoters have been successfully used to control gene expression in S. cerevisiae. The most tightly regulated native promoters are from the galactose-inducible S. cerevisiae genes GAL1, GAL7, and GAL10 (Douglas & Hawthorne, 1964; Bassel & Mortimer, 1971). These promoters are induced approximately 1000-fold in the presence of galactose and strongly repressed in the presence of glucose (Adams, 1972). While several genes are known to be involved in the regulation of these promoters, the GAL4 gene encoding a transactivator and the GAL80 gene encoding the repressor for Gal4 control the central regulation mechanism along with the GAL upstream activation site (UAS) (Johnston et al., 1994). Modifications in the GAL1 and GAL10 promoters (Li et al., 2008) and engineering key enzymes involved in galactose catabolism and transport (Hawkins & Smolke, 2006, 2008) have provided tunable control to galactose-driven expression under these promoters. The GAL-regulated promoters have been widely used in yeast metabolic pathway engineering, including for artemisinic acid synthesis (Ro et al., 2006), increased acetyl-CoA synthesis for isoprenoid production (Shiba et al., 2007), expression of the bacterial isoprenoid pathway in S. cerevisiae (Maury et al., 2008), and n-butanol synthesis (Steen et al., 2008).
The S. cerevisiae CUP1 promoter contains four metal regulatory elements and controls the expression of copper metallothionein in yeast. PCUP1 can be induced around 20-fold in the presence of Cu2+ (Etcheverry, 1990). The activation of this promoter is independent of other culture parameters, with ionic concentration being the limiting factor based on the copper resistance of the host strain (Macreadie et al., 1991; Hottiger et al., 1995; Labbe & Thiele, 1999). Jeppsson et al. (2003) achieved a range of expression levels driven by the CUP1 promoter using the aforementioned SPL technique. The CUP1 promoter has also been used to express pathway genes in yeast, for example, for the synthesis of 1,2-propanediol (Lee & Da Silva, 2006).
The S. cerevisiae ADH2 promoter (573 bp) is tightly regulated by glucose repression, with over a 100-fold repression in the presence of glucose (Price et al., 1990). This promoter requires no inducer, as expression begins as glucose is depleted and ethanol consumption begins. Two UASs contained in a 260-bp region upstream of the initiation site have been identified to render a fully active and regulatable PADH2. Cultivation in complex medium is required for high-level expression from PADH2 (Price et al., 1990; Lee & Da Silva, 2005); however, because the promoter is off until glucose levels fall, plasmid stability is usually not a major issue. The promoter has been successfully employed for several metabolic engineering studies, including the synthesis of polyketides (Kealey et al., 1998; Mutka et al., 2006; Lee et al., 2009) and triacetic acid lactone (Xie et al., 2006) in S. cerevisiae.
Several other inducible S. cerevisiae promoters have been employed in yeast, including PPHO5, PMET25, and PMET3. The PHO5 promoter of the acid phosphatase gene is regulated by inorganic phosphate (Pi) in the medium, with approximately 200-fold repression in the presence of phosphate. The MET25 gene (Sangsoda et al., 1985; Kerjan et al., 1986) encodes O-acetyl homoserine sulfhydrylase, and the MET3 gene encodes ATP sulfurylase (Cherest et al., 1987). The MET25 and MET3 promoters are both repressed in the presence of methionine or S-adenosylmethionine.
One of the most useful heterologous promoter systems in yeast utilizes the bacterial tetracycline operator (tetO) and hybrid transactivator, based on the expression system developed for mammalian cells (Gossen & Bujard, 1992; Gossen et al., 1995) and shown to be active in yeast by Dingermann et al. (1992). Gari et al. (1997) constructed a set of expression vectors with a hybrid promoter system tetO-CYC1 consisting of seven or fewer tetO boxes, the S. cerevisiae CYC1 TATA region, and tTA activator derivatives (tetR-λcI-VP16), with a 1000-fold induction ratio in the absence of tetracycline. The strongest system thus developed has expression levels comparable with that of the GAL1 promoter. Another study by Murphy et al. (2007) demonstrates the applicability of the tetO2 operator in the design of combinatorial promoters. Belli et al. (1998) developed a tetR regulator–based expression system that allows for tight tetO-driven expression in a dual (tetracycline-repressible and inducible) manner.
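The approximate regulation ratios quoted for the regulated promoters in this section can be gathered into a small lookup for a first-pass comparison. This is only a sketch: the figures come from different studies, reporters, and conditions, they mix induction and repression ratios, and the data structure and function names are illustrative rather than part of any cited work.

```python
from typing import List, NamedTuple


class RegulatedPromoter(NamedTuple):
    name: str
    on_condition: str    # condition under which expression is switched on
    fold_change: float   # approximate on/off ratio quoted in the text


# Approximate figures as quoted in this section. The ADH2 value is a
# lower bound ("over a 100-fold repression").
PROMOTERS: List[RegulatedPromoter] = [
    RegulatedPromoter("GAL1/GAL7/GAL10", "galactose present", 1000),
    RegulatedPromoter("CUP1", "Cu2+ added", 20),
    RegulatedPromoter("ADH2", "glucose depleted", 100),
    RegulatedPromoter("PHO5", "low inorganic phosphate", 200),
    RegulatedPromoter("tetO-CYC1", "tetracycline absent", 1000),
]


def at_least(promoters: List[RegulatedPromoter], min_fold: float) -> List[str]:
    """Return names of promoters whose quoted ratio is at least `min_fold`."""
    return [p.name for p in promoters if p.fold_change >= min_fold]


print(at_least(PROMOTERS, 200))
```

A filter like this is useful only for narrowing candidates; leakiness, inducer cost, and strain response still have to be weighed separately.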
Comparisons of promoter strength
A number of studies have compared the strengths of the common constitutive and regulated promoters using gene cassettes either carried on plasmids or integrated into the genome. The latter comparisons are particularly useful as they control for both gene stability and copy number. While the comparisons will depend on the specific gene expressed from the promoters, useful trends have been observed and can be used as a guide for fine-tuning gene expression in metabolic engineering applications. These studies (including the rank order of promoter strengths determined in each study) are summarized in Table 2.
Table 2. Studies comparing the level of expression with various promoters in Saccharomyces cerevisiae
Reference: M.W.Y. Shen, F. Fang, S. Sandmeyer & N.A. Da Silva (unpublished data)
Rank order of promoter strength: PGAL1(b) >> PPGK1(a) ~ PCUP1(c) >> PADH2(a)
Stationary growth phase: PADH2 > PGAL1 ~ PCUP1 ~ PPGK1
Several promoter comparisons evaluated the expression levels from plasmid-based constructs. Mumberg et al. (1995) compared four constitutive promoters of varying strengths (PADH1, PCYC1, PTDH3, and PTEF1) on both CEN/ARS- and 2μ-based plasmids using E. coli lacZ as the reporter gene during growth in glucose medium. Only a 5-fold difference was observed in the expression level of the strongest promoter (PTDH3) for the 2μ plasmid as compared to the CEN/ARS plasmid. The truncated version of the PCYC1 promoter used in this study lacks the UAS2 sequence and therefore showed the weakest activity. Monfort et al. (1999) compared the strengths of the four constitutive promoters PPGK1, PADH1s, PADH1m, and PTDH1 (Hadfield et al., 1993), with Geotrichum candidum lipase2 as the reporter on 2μ-based plasmids and the constitutive PACT1 as a control. Their results considered distinct phases of cell growth: glucose consumption, early ethanol consumption, and late ethanol consumption (where only PADH1s remained active). Fang et al. (2011) performed a similar comparison of three relatively strong constitutive promoters (PPGK1, PTEF1, and PHXT7-391) on the pXP vector series, using Renilla luciferase as the reporter protein during exponential growth in glucose medium. The authors also identified an inverse relationship between promoter strength and plasmid copy number.
Additional studies evaluated the promoters for the expression of secretory proteins. Park et al. (1993) compared constitutive promoters PPGK1 and PSUC2 and galactose-inducible promoter PGAL7 for the production of secreted α-amylase in fed-batch cultures. Cartwright et al. (1994) compared the levels of secreted SPβ-lactamase (SPβla) during late exponential growth on various carbon sources when five different promoters (PGAL1, PGAL10, PPGK1, PPHO5, and PCUP1) were used. The difference in expression between YEp-SPB and YCp-SPB was found to be 5-fold for the GAL promoters, 13-fold for PPGK1, 15-fold for PPHO5, and 10-fold for PCUP1.
A key comparison of seven constitutive promoters (PPGK1, PTEF1, PTDH3, PTPI1, PPYK1, PADH1, and PHXT7) was made by Partow et al. (2010) using lacZ as the reporter gene. In this study, the expression cassettes were integrated at a single copy into the URA3 locus of the yeast genome, ensuring both stability and constant copy number. As in the plasmid-based study by Monfort et al. (1999), the strengths of the promoters were assessed at different stages of growth (glucose or ethanol consumption). In an earlier study by Hauf et al. (2000), five of these promoters as well as PENO2 and PPDC1 were compared during growth on ethanol using single-copy integrated cassettes. Partow et al. (2010) also compared the constitutive promoters with the GAL1 and GAL10 promoters in fed-batch or continuous culture. The result for the GAL1-GAL10 promoter comparison, PGAL10 > PGAL1 (Table 2), is in contrast to earlier studies comparing these promoters (Yocum et al., 1984; West et al., 1987; Da Silva & Bailey, 1991; Cartwright et al., 1994). The authors note that the difference in reporter gene cloning sites used in this study may affect the translational efficiency. This observation is supported by Crook et al. (2011), who show that the distance between the promoter and gene introduced by the MCS can result in significant mRNA secondary structure in the 5′ untranslated region, thereby affecting translational efficiency. Further differences in protein translation were observed with the length, codon optimization, or reporter gene (yECitrine, eGFP, and LacZ) used.
In another study evaluating the strength of inducible promoters, Lee & Da Silva (2005) compared the expression levels under PADH2, PGAL1, and PCUP1 for a single copy of the lacZ gene integrated into the delta sequences and also examined induction at various culture times. Early induction was better suited to PCUP1 and late induction to PGAL1. In recent studies in our laboratory, expression from PPGK1, PADH2, PGAL1, and PCUP1 was compared on both plasmids (CEN/ARS and 2μ-based) and from single integrated copies (each replacing the same Ty1 locus in the chromosomes) using the extended pXP vector series described earlier (M.W.Y. Shen, F. Fang, S. Sandmeyer & N.A. Da Silva, unpublished data). In these studies, identical sequences separated the native promoter from the start codon of lacZ, and β-galactosidase levels were compared at 12-h intervals up to 48 h.
A summary of all promoter comparison studies described is presented in Table 2. Much greater detail on the studies can be found in the listed references.
Metabolic engineering requires the introduction of multienzyme pathways and precise control over the expression of the associated genes. This MiniReview has focused on two key elements in regulating enzyme synthesis in S. cerevisiae, both at the pretranslation level: (1) gene introduction/control of gene number and (2) choice of promoter (both strength and regulation). A large number of approaches are available to modulate the expression at these points, and the combination of copy number control and promoter choice allows a wide range of expression levels for pathway genes.
We have also included several examples of the use of vectors, chromosomal integration, constitutive promoters, and inducible promoters for pathway engineering in S. cerevisiae. This is only a small sampling of the numerous studies described in the literature and does not include many studies that combine multiple different approaches. For example, constitutive and inducible promoters have been employed together to control gene expression at specific times during the culture. In addition, combinations of multiple promoters, gene knockouts, gene integration, and plasmids have been used to introduce complex pathways. One classic example is the engineering of yeast to produce hydrocortisone from glucose (Szczebara et al., 2003).
There are several other important factors influencing protein levels in S. cerevisiae that have not been addressed in this MiniReview, including additional factors at the transcription level and those modulating translation and post-translational processing. These range from transcription factor engineering (e.g. Alper et al., 2006b) and mRNA stability (e.g. the RNA control modules described by Babiskin & Smolke, 2011) to translation efficiency and protein stability. All provide additional levels of regulation that, with gene number and promoter choice, contribute to the fine tuning of enzyme levels needed for metabolic pathway engineering.