Keywords:

  • Genome engineering;
  • Lambda Red;
  • Multiple chromosomes;
  • Rec E/T;
  • Recombineering efficiency

Abstract

  1. Top of page
  2. Abstract
  3. 1 Introduction
  4. 2 Bacteriophage recombineering in E. coli
  5. 3 Multiple chromosomes
  6. 4 Conclusion
  7. Acknowledgements
  8. References
  9. Biographical Information

Recombineering has been an essential tool for genetic engineering in microbes for many years and has enabled faster, more efficient engineering than previous techniques. There have been numerous studies that focus on improving recombineering efficiency, which can be divided into three main areas: (i) optimizing the oligo used for recombineering to enhance replication fork annealing and limit proofreading; (ii) mechanisms to modify the replisome itself, enabling an increased rate of annealing; and (iii) multiplexing recombineering targets and automation. These efforts have increased the efficiency of recombineering several hundred-fold. One area that has received far less attention is the problem of multiple chromosomes, which effectively decrease efficiency on a chromosomal basis, resulting in more sectored colonies, which require longer outgrowth to obtain clonal populations. Herein, we describe the problem of multiple chromosomes, discuss calculations predicting how many generations are needed to obtain a pure colony, and how changes in experimental procedure or genetic background can minimize the effect of multiple chromosomes.

1 Introduction

Recombination has been an essential mechanism for genetic engineering, particularly in microorganisms. Unfortunately, many organisms, such as the prototypical bacterium Escherichia coli, do not efficiently perform recombination with externally introduced DNA. As a result, traditional genetic engineering in E. coli relied heavily on phage-based transduction and conjugation in addition to plasmid-based gene introduction. Recombineering, which combines the ease of electroporation with the increased recombination efficiencies of phage-based systems, has revolutionized the speed and ease of genetic engineering in E. coli. This technology has fundamentally changed the way we now think about editing genomes and has enabled large-scale genetic engineering approaches that were not possible previously. Specifically, this technique takes advantage of heterologous expression of bacteriophage proteins (Rec E/T from Rac prophage [1, 2] or the Red αβγ proteins from Lambda phage [3, 4]). When overexpressed in E. coli, these proteins enable efficient homologous recombination with DNA supplied by transformation. Any linear DNA fragment, either double (ds) or single stranded (ss), can be designed with an appropriate homology sequence (as short as 20 base pairs (bp)) to introduce a variety of changes, such as point mutations, deletions, insertions, replacements, and inversions, into any DNA in vivo, including chromosomes [1, 3, 5], plasmids [1, 6], and bacterial artificial chromosomes (BACs) [4, 7–9]. The first large-scale use of recombineering in E. coli was the creation of the Keio collection, which is a library of single-gene knockouts [10, 11]. Most recently, recombineering has been multiplexed and automated by using mixtures of ss- or ds-oligos to introduce a number of different changes simultaneously, thus greatly expanding our ability to edit genomes on a larger scale [12–14]. Since the first report of phage-based recombineering in 1998 by Stewart et al. [1] and Murphy [3], various efforts have improved our understanding of recombineering mechanisms and enabled further improvements in recombineering efficiencies. We present the current state of knowledge on recombineering and then discuss key considerations and challenges to the extension of multiplex recombineering to large-scale genome editing.

2 Bacteriophage recombineering in E. coli

Although homologous recombination has been reported in E. coli, the number of reported recombinants was two to three orders of magnitude lower (≈1 per 10⁶ cells) than the native homologous recombination efficiencies of other microbes, such as Saccharomyces cerevisiae. However, when bacteriophage-based recombination proteins, such as Rec E/T or Red αβγ, are expressed in E. coli, recombination efficiencies reach up to about 1 per 10³–10⁴ cells without any further optimization [1, 3, 15]. In the Rec E/T system, RecE is an exonuclease with 5' to 3' activity and RecT is a ssDNA binding protein that stabilizes the ssDNA intermediate involved in annealing to the newly introduced complementary DNA strand during replication [16, 17]. The Lambda Red system consists of three proteins: Exo (also known as α), Beta, and Gam. Exo is an exonuclease, which degrades DNA in the 5' to 3' direction (similar to RecE); Beta is a ssDNA binding protein, which binds to ssDNA (similar to RecT), stabilizing it and protecting it from further action by exonucleases; and finally Gam inhibits RecBCD and SbcCD activity in the host, thereby protecting the exogenous DNA from being degraded by natural mechanisms [18, 19]. Comparing the two phage systems, RecE and Exo serve the same function within the cell, as do RecT and Beta. However, RecE does not function with Beta, and Exo does not function with RecT, so they are not interchangeable [20]. Recombineering with dsDNA requires the presence of all three Lambda Red proteins, but ssDNA recombineering only requires the Beta protein [17, 21]. A recent study by Fu et al. [22] reported that the Rec E/T system could be used for direct DNA cloning; Rec E/T is highly efficient at linear–linear homologous recombination and is more efficient than the Lambda Red proteins. Despite this, the majority of current research efforts employ the Lambda Red proteins, which are the main focus of the rest of this review.

2.1 Current mechanism

Understanding of the mechanism of recombineering has evolved significantly over the last few years. Initially, it was thought that recombineering occurred by strand invasion [23, 24]; however, recombineering remains highly efficient in a recA-deficient background, and therefore a RecA-independent mechanism was proposed, wherein the newly introduced DNA is incorporated by annealing at replication forks [23]. Stahl et al. [25] performed a detailed study to determine which mechanism was more likely, and recombination products from crosses of Lambda phage showed characteristics more consistent with annealing. Later studies demonstrating enhanced targeting of dsDNA recombineering to the lagging strand provided further support for the replication-fork annealing model [26, 27]. The most recent mechanism (Fig. 1), proposed by Mosberg et al. [28], hypothesized that replication-fork annealing occurred through a fully ss intermediate. When dsDNA is used, the first step in recombination is the degradation of one complete strand by Lambda phage Exo. It is believed that Exo and Beta act synergistically, with Exo aiding in the binding of Beta to the ss intermediate [29], thus overcoming any potential issues with secondary structures of a fully ss intermediate. Beta binds to the ssDNA to protect it from further degradation and catalyzes its placement and annealing to the lagging strand of the replication fork, where it acts as an Okazaki fragment [28, 30, 31]. The homology regions of the oligo bind to complementary regions and, during the next round of replication, the insertion (or deletion, mutation, etc.) is incorporated into the newly synthesized DNA. This proposed mechanism is supported by experimental evidence, such as the higher efficiencies of oligos targeting the lagging strand [28, 32, 33] and the fact that only Beta is needed for ssDNA recombineering [17, 21]. Since the mechanism of annealing to the chromosome is suspected to be the same for both ss- and dsDNA, the main advantage of using dsDNA recombineering is that larger inserts can be created with the polymerase chain reaction (PCR) (possibly including selectable markers) without any further processing to obtain ssDNA. This picture of the mechanism of recombineering is not complete, because recombination does occur on the leading strand, but how that occurs is not yet known.

Figure 1. Current proposed mechanism of Lambda Red recombineering in E. coli. Upon induction of Lambda Red proteins, the Gam protein blocks activity of RecBCD (ExoV) and SbcCD, which degrade exogenous DNA. The Exo protein degrades one DNA strand in the 5' to 3' direction and recruits Beta to bind the exposed ssDNA to protect it from further degradation. Beta also promotes annealing to the lagging strand at the replication fork during DNA replication. The ssDNA displaces an Okazaki fragment at the replication fork and becomes integrated into the newly synthesized DNA.

2.2 Improvements in efficiency

With a better understanding of the recombineering mechanism, there have been numerous recent studies focused on improving the efficiency of Lambda Red based recombineering. In general, the most efficient methods now entail (i) optimizing the oligo used for recombineering to enhance replication-fork annealing and limit proofreading; (ii) mechanisms to modify the replisome itself; and (iii) multiplexing recombineering targets and automation.

Sawitzke et al. [34] investigated the effect of increasing oligo concentration and found that saturation occurred at 3000 oligos per cell. The same study also investigated oligo length: the optimum was a 60- or 70-mer, but oligos of 40–70 bases gave essentially the same recombineering efficiencies, with efficiency decreasing rapidly for oligos of less than 25 nucleotides in length [34]. Wang et al. [13] investigated the efficiency of recombineering with varying sizes of insertions, deletions, and mismatches. In all cases, efficiency dropped off with increasing length of nonhomologous sequence (mismatch, insertion, or deletion); this indicates that minor rewriting can be achieved at much higher efficiencies than large changes. They also reported 90 bp as the optimal oligo length. Phosphorothioating the four nucleotides at the 5' end of the oligo protects them from endogenous exonucleases and increases efficiency two-fold [13]. Finally, designing oligos to target the lagging strand increases efficiency about 30-fold [32, 33], which also supports the current mechanism of recombineering, in which the oligo acts as an Okazaki fragment on the lagging strand during DNA replication.

Several strategies for modifying the cellular machinery involved in DNA replication have been shown to be effective, improving efficiency up to several hundred-fold. The first major strategy was to inactivate the mismatch repair (MMR) system by knocking out MutS, which is responsible for correcting mistakes made during replication. Costantino and Court [32] reported that, by removing mutS, oligo-mediated recombination increased 400-fold for a G·G mismatch and 100-fold for most other mismatches. However, the basal rate of mutation increases 100-fold in a ΔmutS background, which can have undesired secondary effects, in particular when attempting to map engineered mutations onto selectable phenotypes. One approach to circumvent the DNA MMR system is to use chemically modified bases that are not recognized by the MMR proteins. Wang et al. [35] used 2'-fluorouridine, 5-methyldeoxycytidine, 2,6-diaminopurine, or isodeoxyguanosine instead of the natural bases and found that allelic replacement efficiencies increased 20-fold in a strain with 100-fold lower background mutation rate. Mutations were also introduced at higher efficiencies in cells with a functional MMR system by introducing four or more adjacent mismatches or introducing mismatches at four or more consecutive wobble positions near the mutation site [34]. A more recent study introduced a known mutation into the DNA primase to reduce the frequency of priming for Okazaki fragments [36], which resulted in longer Okazaki fragments and greater accessibility to ssDNA on the lagging strand. This mutation, dnaG Q576A, resulted in 63% greater allelic replacement per clone than the wild type. A follow-up study removed five endogenous exonucleases (RecJ, ExoI, ExoVII, ExoX, and Lambda Exo); these mutations alone increased recombineering efficiency by 46% and, when combined with the dnaG Q576A mutation, resulted in 111% more clones per cycle than the wild type. However, these mutations came at the expense of cellular growth rate [36, 37].

Multiplexing and automation were also used to increase the efficiency of recombineering. Wang et al. [13] built a device that automated all the steps in recombineering; thus enabling many more rounds of recombineering in one day. Multiplexing oligos also enabled the efficient creation of combinatorial libraries. This technique, multiplex automated genome engineering (MAGE), was used to create combinatorial libraries containing about 3 billion mutants in a few days, several of which were capable of producing five-fold more lycopene (mutations were directed at genes in the 1-deoxy-D-xylulose-5-phosphate (DXP) pathway) [13]. MAGE was also used to massively rewrite the E. coli chromosome, changing all 314 TAG stop codons to TAA stop codons. In this study, Isaacs et al. [12] speculated that there was a subpopulation of cells that were highly recombinogenic, and thus, contained many more mutations than other cells. This discovery led to the development of the coselection MAGE approach [38, 39]. The strategy here is that, since DNA is sufficiently unwound at the replication fork so that up to 500 000 bp are exposed, then if one oligo is integrated into the chromosome, it is likely that a second (or third, fourth, etc.) oligo designed to be integrated close by would also be integrated. Therefore, coselection markers, which are easily screened or selected for (e.g. antibiotic resistance or amino acid auxotrophy), are chosen around the chromosome and oligos designed to turn them on/off are mixed in with the oligo mixture used during recombineering. The use of coselection greatly reduced the number of colonies that had to be screened and resulted in the identification of strains with as many as 12 mutations [39]. Sawitzke et al. [34] also reported increased recombineering efficiencies (≈100-fold) in cells co-electroporated with a selectable plasmid. 
As such, these techniques, along with access to cheap and accurate oligo libraries (up to several hundred thousand), have set the stage for massive genome engineering at a scale of dozens to hundreds of modifications in parallel and on laboratory timescales.

3 Multiple chromosomes

It is widely known that E. coli can have multiple copies of its chromosome within the cell; in fact, some reports put the chromosome copy number as high as 16, depending on growth rate, growth stage, and nutrient state in the medium [40, 41]. In batch culture in rich medium, the number of chromosomes within the cell varies widely (2 to 16) between stages of growth (lag, exponential, and stationary), and even in what is assumed to be balanced, steady-state growth (exponential growth) the number of chromosomes per cell does not remain constant [42]. Variation in chromosome copy number is much smaller in minimal medium, with cells having only one to four copies [42, 43]; however, cells grown in minimal medium are less efficient at taking up exogenous DNA [28], which reduces any potential advantage. Low-efficiency recombineering in cells grown in minimal medium can also be due to lower availability of replication forks or other reasons yet to be elucidated. Another issue is that daughter cells also inherit multiple chromosomes upon cell division, thus carrying the mixed genotype through several generations.

3.1 Recombineering with multiple chromosomes

The presence of multiple genomes in the cell significantly reduces the efficiency on a chromosomal basis (Fig. 2). An 'ideal' cell with one chromosome per cell would result in a maximum chromosomal efficiency of 50%, whereas the presence of four chromosomes has the potential to bring that number down to 12.5%. If we define the efficiency in terms of strands of DNA instead of complete chromosomes, the efficiencies decrease to 25% and 6.25%, respectively. Any cell with at least one chromosome containing the mutation would present the expected phenotype; however, the degree of heterogeneity within the population will be determined by the number of chromosomes in the cell at the time of recombineering. During outgrowth, except in the ideal case, mutated cells would become diluted by wild type; an additional complication arises when the mutation confers a growth defect, which would further dilute the desired subpopulation. Longer periods of outgrowth increase the likelihood of finding a clonal colony; however, they also increase the total number of cells that need to be screened or selected to find the desired cells. This issue is briefly mentioned by Sawitzke et al. [34], who also point to a figure of a plate with several sectored colonies plated only 30 min after electroporation. Their recommendation is to allow the cells to grow for 3 h before plating to obtain pure colonies; a similar figure appears in Costantino and Court [32]. While it is possible to identify a clonal colony after 3 h of outgrowth, longer outgrowth is needed to increase the frequency of nonsectored colonies in the overall population. If care is not taken to ensure that mutant populations are homogeneous, mutants can be lost over time if there is no selectable advantage.
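The maximum efficiencies quoted above follow from simple counting, which can be sketched in Python. This is a minimal sketch that assumes a single oligo anneals to one strand of one chromosome copy per cell, so that after one round of replication one of the 2n chromosomes (or one of the 4n strands) carries the mutation. The function name `max_efficiency` and its parameters are illustrative, not from the source.

```python
def max_efficiency(n_chromosomes: int, per_strand: bool = False) -> float:
    """Upper bound on recombineering efficiency for a cell with n chromosomes.

    Assumption (hypothetical simplification): exactly one oligo is
    incorporated per cell, into one strand of one chromosome copy.
    """
    if n_chromosomes < 1:
        raise ValueError("a cell needs at least one chromosome")
    # After replication there are 2n chromosomes, one of them mutant.
    chromosomal = 1.0 / (2 * n_chromosomes)
    # Counting strands instead of chromosomes halves the value again.
    return chromosomal / 2 if per_strand else chromosomal


for n in (1, 4):
    print(n, max_efficiency(n), max_efficiency(n, per_strand=True))
```

For one and four chromosomes this gives 50% and 12.5% on a chromosomal basis, and 25% and 6.25% on a strand basis, matching the values in the text.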

Figure 2. The effect of multiple chromosomes on recombineering efficiency. (A) The ‘ideal’ case (one chromosome/cell) with a maximum theoretical efficiency of 50%. (B) A more typical case, in which the cell has multiple (4–8) chromosomes, which results in lower efficiencies, 12.5 to 50%. (C) The effect of multiple chromosomes on homogeneity of the population. Assuming a recombineering efficiency of 10%, increasing the chromosome copy number from 1 to 8 means that 55 generations are required to obtain a clonal population (≈2 overnight plates).

To estimate the time required to obtain a homogenous population, we constructed a simple mathematical model. We have simplified the mechanism of recombineering and replication by assuming one set of replication forks per chromosome, chromosomes are split equally and randomly between daughter cells, and copy number is maintained throughout the growth period. In the ideal case (1 chromosome per cell), only one generation is required to have nonsectored colonies. As one would suspect, increasing the chromosomal copy number results in longer outgrowths to obtain chromosomal homogeneity. If we assume a chromosomal recombineering efficiency of 10%, it is estimated that 8 generations of outgrowth are required if 2 chromosomes are present per cell to obtain 90% of recombinant phenotype cells with homogenous chromosomes. If 4 chromosomes are present per cell, 23 generations of outgrowth are required for the same degree of recombinant homogeneity (approximately 1 overnight plate). If 8 chromosomes are present in each cell, then 55 generations are required. An overnight plate is approximately 25 generations; therefore, cells with 4 or 8 chromosomes require an extra 1 or 2 days, respectively, to obtain homogenous populations. These are crude estimates that ignore many of the more complicated details of recombineering and replication, which are discussed in more detail below, but serve as a starting place for estimating how long one needs to wait to ensure a clonal population.
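One way to make the segregation assumptions above concrete is the following Python sketch. It is not the authors' model, whose details are not given here. It assumes that a cell starts with exactly one fully mutant chromosome out of n, that all chromosomes are duplicated each generation, and that each daughter receives n of the 2n copies at random (a hypergeometric draw). It tracks the expected number of cells carrying each number of mutant chromosomes and reports the first generation at which at least 90% of mutant-carrying cells are homogeneous. The 10% recombineering efficiency mentioned in the text is not used in this sketch, and all function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(j: int, total: int, marked: int, draws: int) -> float:
    """P(exactly j marked items among `draws` items drawn without replacement)."""
    if j < 0 or j > marked or draws - j > total - marked:
        return 0.0
    return comb(marked, j) * comb(total - marked, draws - j) / comb(total, draws)


def generations_to_homogeneity(n: int, threshold: float = 0.9,
                               max_generations: int = 500) -> int:
    """Generations until `threshold` of mutant-carrying cells are pure mutant.

    Assumptions (illustrative only): constant copy number n, one mutant
    chromosome at generation 0, random and equal segregation.
    """
    # counts[k] = expected number of cells with k mutant chromosomes out of n
    counts = [0.0] * (n + 1)
    counts[1] = 1.0
    for generation in range(max_generations + 1):
        carriers = sum(counts[1:])  # cells with at least one mutant copy
        if counts[n] / carriers >= threshold:
            return generation
        nxt = [0.0] * (n + 1)
        for k, cells in enumerate(counts):
            if cells == 0.0:
                continue
            # After replication: 2k mutant copies among 2n; each of the two
            # daughters receives n copies, marginally hypergeometric.
            for j in range(n + 1):
                nxt[j] += 2.0 * cells * hypergeom_pmf(j, 2 * n, 2 * k, n)
        counts = nxt
    raise RuntimeError("threshold not reached within max_generations")


print(generations_to_homogeneity(2))  # → 8
for n in (4, 8):
    print(n, generations_to_homogeneity(n))
```

Under these assumptions the two-chromosome case reaches 90% homogeneity after 8 generations, in line with the estimate in the text, and the required outgrowth increases with copy number. Because the sketch starts from a fully mutant chromosome rather than a heteroduplex, the one-chromosome case returns 0 rather than the one generation quoted above.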

This leads us to the question of how chromosomal copy number can be controlled to increase the recombineering efficiency. As discussed previously, the type of medium used for growth can affect the chromosomal copy number, but the trade-off is lower DNA uptake efficiency. In general, slower growth (e.g. at lower temperatures) results in fewer chromosomes, but this increases the overall time required for recombineering. Alternatively, one might attempt to engineer the intracellular mechanism that controls copy number. It is known that DnaA controls the initiation mass of DNA replication, and therefore controls copy number [44]. Temperature-sensitive DnaA mutants (dnaA(Ts)) decrease replication initiation at higher temperatures and increase replication initiation at lower temperatures [45]. DnaA is autoregulated and the protein concentration is rate-limiting for initiation [46–49]; Løbner-Olesen et al. [50] showed that, by placing dnaA expression under control of the lac promoter and operator, the amount of DNA within the cell could be controlled by isopropyl β-D-1-thiogalactopyranoside (IPTG). Engineering replication machinery is a promising avenue to control copy number; further investigation into this area is likely to lead to the discovery of other methods to reduce DNA content in the cell and subsequently increase recombineering efficiency, and thus enable larger scale genome engineering.

3.3 Multiple replication forks

The simple example depicted in Fig. 2B does not give the whole picture of how multiple genomes can affect recombineering. Replication of the chromosome requires around 40 min to complete, but cells are able to grow at a much faster pace (td = 20 min). To maintain a high copy number of the chromosome, the cell must initiate another round of replication before the previous one is complete. Therefore, at any given time, multiple replication forks will be open simultaneously [51]. Since cells used for recombineering are typically grown in LB medium and collected and washed at the mid-exponential growth stage, they will almost assuredly have multiple replication forks open. If the oligo that introduces the desired mutation is integrated into the chromosome at the leading replication fork (and onto the parent chromosome), the chromosomal efficiency will actually be enhanced because all subsequent DNA replication will incorporate that mutation into the new chromosome. If the oligo is integrated at the last replication fork, the efficiency of recombination would be lower, because only the last copy of the chromosome will carry the desired mutation. It has also been reported that the probability of an oligo being integrated into the chromosome is highly dependent on the oligo, its location on the chromosome, and how close it is to another oligo that has been integrated [38, 39]. Therefore, when multiple oligos are used in a single electroporation event, it is difficult to predict the probability of each oligo being integrated into the chromosome because the integration events are not independent.

4 Conclusion

The use of bacteriophage proteins to enable high-efficiency homologous recombination in E. coli has laid the groundwork for the field of genome engineering, allowing fast and efficient editing of the genome at a much larger scale. Since the first report of recombineering in 1998 [1, 3], several groups have focused their attention on improving the system to achieve even higher efficiencies. In trying to optimize recombineering efficiencies, the mechanism of recombineering has been elucidated for the Lambda Red system [28]. The design of the DNA cassette itself was optimized: ssDNA oligos should be 60- to 90-mers, the 4 bases on the 5' end should be phosphorothioated, and the oligo should be designed to target the lagging strand. Multiple modifications to the replisome also increased efficiency: knocking out MMR by removing MutS, removing endogenous nucleases, and slowing down DNA primase. Finally, coselection MAGE allows researchers to screen far fewer colonies for the desired mutation based on the use of easily selectable/screenable coselection markers, which is even more advantageous for mutations that confer no obvious phenotype, such as promoter or ribosome-binding site changes.

One area that has yet to be optimized is the issue of multiple chromosomes. Depending on the medium, growth stage, and nutrient state of the cell, E. coli can have up to 16 chromosomes. Thinking about recombineering efficiencies on a chromosomal basis instead of a cellular basis can thus explain reduced overall efficiencies. Colonies plated after short recovery periods (30 min) will be sectored and require several more isolations or longer outgrowth to obtain a homogeneous population of mutants. If the desired mutation confers a defect in growth, longer recovery times will dilute the mutant population significantly. Growth in minimal medium can reduce the number of chromosomes per cell, but cells grown in minimal medium are much less efficient at taking up DNA. Synthetic biology approaches can also be used to more finely control chromosome copy number; the first obvious target is DnaA. Experimental approaches can also be used to reach homogeneity faster. For example, repeated recombineering cycles with the same oligo(s) would reduce the number of generations needed to find pure mutant colonies. Prior to efforts to reduce chromosomal copy number, the ultimate goal of any experiment needs to be considered. Depending on the goal of the recombineering experiment, homogeneity may not be advantageous. If a homogeneous population is desired, cells either need to be plated immediately and rescreened/selected until homogeneity is achieved, or several rounds of recombineering can be performed with the same oligo (as in MAGE) to decrease the number of screens/selections needed to find a homogeneous mutant. However, if the goal of recombineering is to obtain a library of cells with different mixtures of mutations, then the presence of multiple genomes in fact enables the heterogeneity of the population.

Recombineering has revolutionized the way we think about engineering cells, enabling larger genome-wide editing. Many future applications of recombineering will be focused on engineering entire pathways to optimize expression of all relevant proteins in a combinatorial way, so we can achieve higher product yields, as done for lycopene by Wang et al. [13]. Our current ability to engineer entire pathways and investigate combinatorial libraries of expression levels is limited by the sheer size of libraries required; increasing recombineering efficiencies (by reducing chromosomal copy number or any other method) would reduce the total number of cells that need to be screened to find a winning combination.

Acknowledgements

Funding for N.R. Boyle and T.S. Reynolds is provided by OPX Biotechnologies Inc.

The authors declare no conflict of interest.

References

Biographical Information

Professor Ryan Gill received his B.S. in chemical engineering from The Johns Hopkins University in 1993; his M.S. in chemical engineering from the University of Maryland College Park in 1997; his Ph.D. in chemical engineering from the University of Maryland College Park in 1999; and did postdoctoral work in chemical engineering at MIT from 1999 to 2001. He joined the Chemical and Biological Engineering Department at the University of Colorado as an Assistant Professor in 2001, where he is currently an Associate Professor. He is the Managing Director of the Colorado Center for Biorefining and Biofuels and his research focuses on the development of tools that enable high-efficiency genome editing.