The current method for combining transgenes into a genome is through the assortment of independent loci, a classical operating system compatible with transgenic traits created by different developers, at different times and/or through different transformation techniques. However, as the number of transgenic loci increases over time, increasingly larger populations are needed to find the rare individual with the desired assortment of transgenic loci along with the non-transgenic elite traits. Introducing a transgene directly into a field cultivar would bypass the need to introgress the engineered trait. However, this necessitates separate transformations into numerous field cultivars, along with the characterization and regulatory approval of each independent transformation event. Reducing the number of segregating transgenic loci could be achieved if multiple traits are introduced at the same time, a preferred option if each of the many traits is new or requires re-engineering. If re-engineering of previously introduced traits is not needed, then appending a new trait to an existing locus would be a rational strategy. The insertion of new DNA at a known locus can be accomplished by site-specific integration, through a host-dependent homology-based process, or a heterologous site-specific recombination system. Here, we discuss gene stacking through the use of site-specific recombinases.
This second decade of the 21st century began with the cultivation of over 1 billion hectares of bioengineered crops (http://www.isaaa.org). Additionally, a trend not unnoticed is the rapid development of crops with stacked traits. Since 2008, the number of crop plants with the combination of herbicide tolerance and insect resistance has exceeded cultivars with insect resistance alone. With ever-emerging sophistication in genetic information technology, the next generation of bioengineered crops will most likely incorporate multiple transgenic traits.
Genetic information technology has many parallels to information technology. The classical plant breeding process by which we combine transgenic traits created by different developers, at different times and/or through different transformation techniques can be considered an operating system that permits compatibility among software, the DNA encoded transgenes, and hardware, the genetically modified products.
Gene stacking through the assortment of independent loci is efficient when few segregating loci are involved, but as increasing numbers of transgenes are added to the genome through random DNA integration, a greater burden is placed upon plant breeders to move the multiple loci into the large number of locally adapted cultivars. This is particularly true considering that the typical introgression of transgenic traits also involves the co-assortment of non-transgenic elite traits. For instance, the introgression of 3 unlinked transgenic loci into an elite cultivar may seem trivial, with a probability of (1/4)^3, or 1 in 64, for a homozygous individual, but if 6 other non-elite loci are also involved, the probability would fall to (1/4)^9, or one event in more than a quarter million individuals. Given that the introgression process often involves parallel efforts with numerous local cultivars and their requisite local field trials, it can be an expensive bottleneck in the development of genetically modified crops. It is true that not all crop plants go through a breeding step, as many flowers, vegetables, and trees (fruits, nuts and forest trees) are clonally propagated. However, genetic crosses are used in 6 of the 7 highest production value crops in the world: rice, wheat, corn, soy, cotton and tomato, the exception being potato.
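The arithmetic in this example can be checked with a short calculation. The sketch below assumes, as the text does, that all loci are unlinked and that each locus yields the desired homozygote with probability 1/4 in a given progeny. The function names and the 95% confidence threshold are illustrative assumptions and are not taken from any cited work.

```python
from fractions import Fraction
import math


def homozygote_probability(n_loci: int) -> Fraction:
    """Chance that one progeny is homozygous for the desired allele at every
    locus, assuming unlinked loci and a probability of 1/4 per locus."""
    return Fraction(1, 4) ** n_loci


def population_for_confidence(n_loci: int, confidence: float = 0.95) -> int:
    """Smallest population giving at least `confidence` chance of recovering
    one or more desired individuals (illustrative threshold)."""
    p = float(homozygote_probability(n_loci))
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


# 3 transgenic loci alone, then 3 transgenic plus 6 non-elite loci.
for n in (3, 9):
    p = homozygote_probability(n)
    print(f"{n} loci: 1 in {p.denominator}; "
          f"about {population_for_confidence(n)} plants for a 95% chance")
```

Because the expected frequency is only an average, a breeder who wants a high chance of recovering at least one such plant would need to screen a population several times larger than the reciprocal of the probability.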
Bundling the Software
One solution to reducing the number of segregating loci is to package multiple transgenes into a single transformation event, analogous to information technology's bundling of multiple applications within a single software package. This is achieved through the in vitro stacking of multiple transgenes followed by their integration as a large DNA segment at a single locus. If the goal is to introduce multiple traits not previously in commercial varieties, or to replace outdated traits with upgraded versions, this approach is necessary. In cases where new traits are introduced into genetic backgrounds already harboring previously introduced transgenes, the benefit of a more expedient breeding process might not outweigh the extra effort in finding an integration event that expresses all of the packaged traits at an acceptable level and has an integration pattern relatively free of DNA rearrangements. It is possible that previously deregulated traits would also have to undergo regulatory scrutiny just because of their new genomic location. Nonetheless, for the development of certain products, this can be the most logical approach, such as the co-introduction of herbicide tolerance and insect resistance. Since the introduction of insect resistance would require a co-transformed selection gene, the herbicide tolerance gene serves as both a selectable marker and an agronomic trait.
Many developers are trying to bypass the lengthy and costly introgression step by transforming elite cultivars directly, since the elite cultivar is already a commercial variety that does not require a line conversion step. Transformation protocols for some elite cultivars of various crops are being developed, although often with lower transformation frequencies that raise the cost of obtaining a sufficient number of independent transformation events for field evaluation. Depending on the trait, it is not unusual for developers to screen up to thousands of independent events before finding the best combination of field performance, integration pattern and stable inheritance. Another concern with relying entirely on this approach is that there are often numerous locally adapted cultivars, each requiring a reliable high-frequency transformation protocol. Moreover, elite cultivars continue to evolve, so the task of developing new transformation protocols has to be a continuous process. Finally, the most serious drawback may be the lack of control over the location into which the new DNA integrates. The same DNA would integrate into a different location in each of the many cultivars, likely necessitating independent deregulation of each transformation event, since each would carry the same gene but at a different location and with a different integration pattern. Nonetheless, transforming elite cultivars directly can be beneficial, as introgression from one elite cultivar to another often requires fewer backcrosses than from a laboratory line. However, this approach in itself does not address the need to minimize the number of segregating loci.
In planta Gene Stacking
In principle, site-specific integration can be used to deliver new DNA next to a transgene segment, true to the definition of the word stack–to arrange in an orderly pile. Site-specific DNA targeting in plants has been reported with both recombinase-mediated and homologous recombination-dependent gene targeting. The recombinase-mediated approach relies on introducing site-specific recombinases to recombine the recognition sites corresponding to the site-specific recombination system (Ow 2002). Thus far, the site-specific recombination systems with proven gene targeting in higher plants include Cre-lox (Albert et al. 1995; Vergunst et al. 1998), FLP-FRT (Li et al. 2009), R-RS (Nanto et al. 2005), phiC31-att (Lutz et al. 2004) and Bxb1-att (Yau et al. 2011) where each system is denoted by the name of the recombinase-recombination site. In homologous gene targeting, the process is host cell-dependent; and for crop plants, practical targeting frequencies have been reported in rice (Terada et al. 2002). The introduction of engineered proteins that recognize target DNA sequences and cause double stranded breaks can increase the efficiency of gene targeting by soliciting a DNA repair response. Zinc-finger nucleases (Shukla et al. 2009; Townsend et al. 2009), mega-nucleases (Yang et al. 2009; Gao et al. 2010), and TAL effector nucleases (Christian et al. 2010) have emerged as effective inducers of site-specific gene targeting.
Each of the two approaches has its merits and deficiencies. Homology-dependent gene targeting has one significant advantage over the recombinase-mediated approach, as it can generate mutant derivatives in place of native host genes (Yamauchi et al. 2009; Curtin et al. 2011), whereas gene replacement using site-specific recombinases, commonly referred to as cassette exchange, is limited to transgenes preconfigured with recombination sites (Nanto et al. 2009). On the other hand, recombinase-mediated reactions are precise, whereas the repair of double-stranded breaks often involves end-joining ligation that is not conservative. Nonetheless, for most practical purposes, especially for abolishing gene function, the end-joining process suffices to generate the desired targeting outcome.
In terms of using these systems to sequentially stack DNA next to existing transgenes, both approaches should in theory suffice. Homology-dependent gene targeting can choose the location of the first transgene integration, but this may not be an advantage. Current knowledge of the plant genome cannot predict how a given chromosome location affects newly introduced DNA. Therefore, with either approach, a favorable integration site must be found empirically through screening hundreds of random insertions. From the second transgene on, however, a recombinase-mediated approach would appear simpler, as there would be no need to continually design new sequence-specific nucleases. Additionally, a recombinase-mediated approach can incorporate features beyond transgene stacking, particularly features that address public concerns related to the biosafety of genetically engineered crops. Below, we describe the current development of recombinase-mediated transgene stacking.
Recombinase-mediated Gene Stacking
Recombinase-mediated site-specific integration requires a first recombination site to be introduced into the genome to serve as an integration target. For this process to continue, each new integrating molecule must bring with it a new recombination target for the next round of DNA integration. While gene targeting has been reported using recombination systems that catalyze freely reversible reactions, these systems can be cumbersome for use in gene stacking. With each round of integration, strategies must be devised to prevent pre-existing recombination sites from recombining with each other in unwanted events such as the deletion of previously placed DNA. A report on gene stacking by the FLP-FRT system introduces new FRT sites designed not to recombine with most other previously placed FRT sites, yet some cross-specificity was nonetheless observed (Li et al. 2010). Even if cross-specificity were abolished completely, this strategy would require the continuous development of new FRT sequences that are not recombinogenic with previous FRT sites, a task that becomes more difficult with each additional round of integration. More than a decade ago, we decided upon recombination systems that catalyze irreversible reactions for deployment in gene stacking. The recombination sites of these systems are not identical in sequence and are typically known as attB and attP. The attB x attP reaction generates attL and attR, which are dissimilar in sequence to attB and attP. The recombinase that promotes the attB x attP reaction, often referred to as the integrase, does not recombine attL x attR unless there is an accompanying excisionase enzyme. The phiC31-att and Bxb1-att irreversible recombination systems were developed specifically for this purpose (Thomason et al. 2001; Thomson and Ow 2006).
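The difference between reversible and irreversible systems can be illustrated with a toy model of site compatibility. The sketch below encodes only the rules stated above: the integrase converts attB x attP into attL and attR, and attL x attR recombines only when an excisionase is supplied. A pair of identical lox sites stands in for a reversible system whose product sites remain substrates. The function and table names are hypothetical, and the model ignores site orientation and sequence.

```python
# Site pairs and their products, as described in the text.
INTEGRATION_PRODUCTS = {frozenset(("attB", "attP")): ("attL", "attR")}
EXCISION_PRODUCTS = {frozenset(("attL", "attR")): ("attB", "attP")}


def recombine(site_a, site_b, excisionase=False):
    """Return the product sites, or None if the pair is not a substrate.

    Simplifying assumption: the excisionase flag only enables attL x attR.
    Its effect on attB x attP is not addressed in the text and is left
    unchanged here.
    """
    # Reversible system: identical lox sites yield lox sites again, so the
    # products can keep recombining with each other.
    if site_a == site_b == "lox":
        return ("lox", "lox")
    pair = frozenset((site_a, site_b))
    if pair in INTEGRATION_PRODUCTS:
        return INTEGRATION_PRODUCTS[pair]
    if excisionase and pair in EXCISION_PRODUCTS:
        return EXCISION_PRODUCTS[pair]
    return None


print(recombine("attB", "attP"))                    # integrase alone
print(recombine("attL", "attR"))                    # no reaction without excisionase
print(recombine("attL", "attR", excisionase=True))  # reversal needs excisionase
print(recombine("lox", "lox"))                      # products are still substrates
```

In this simplified view, an integrated segment flanked by attL and attR is stable in the presence of the integrase alone, which is the property that makes repeated rounds of stacking tractable.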
The general strategy has been described previously (Ow 2005). To reiterate, the first trait gene, G1, is introduced into the genome linked to selectable marker M1 and an att site, either attB (BB’) or attP (PP’), as shown in Figure 1A. Three recombination sites from a second recombination system of the reversible type, such as Cre-lox, FLP-FRT, or R-RS, are placed such that two of them are in the same orientation flanking M1, while a third site is in the opposite orientation downstream of G1-attP. Figure 1 uses the lox site as an example. The directly oriented sites permit excision of DNA no longer needed after new DNA integration (Dale and Ow 1991), while oppositely oriented sites permit the optional resolution of multicopy insertions into a single copy target (Srivastava et al. 1999; Srivastava and Ow 2001). More importantly, the oppositely oriented sites allow for cassette exchange of the flanked DNA with similarly configured cassettes. The “target plant line” harboring this construct is selected from a population of random integration events obtained from Agrobacterium or direct DNA transformation, as a clone with appropriate G1 expression and intact DNA structure. Delivery of a second trait gene, G2, occurs via integration of the circular molecule shown in Figure 1B, comprising G2 and M2, along with a set of attB sites flanking G2 and a lox site between G2 and M2. Recombination between attP and attB inserts the circular molecule into the genome to yield two possible configurations, depending on which attB site is used. The configuration shown in Figure 1C is selected for subsequent deletion of the lox-flanked DNA, which removes M1 and M2 to produce the configuration shown in Figure 1D. If G1 had already been developed into a commercial variety, M1 most probably would have been removed by a prior site-specific deletion event. If that were the case, a single selectable marker M1 would suffice for this stacking scheme.
Figure 1D shows the G1-G2 stack harboring an attB site for insertion of a third trait gene, G3. This process is analogous to that of G2 integration, using the circular vector shown in Figure 1E, which differs only in the use of attP instead of attB. Integration of the G3 vector should produce the configuration shown in Figure 1F. After activation of Cre-lox site-specific recombination, the structure would comprise G1-G2-G3 followed by a functional attP site for further stacking of new DNA. Subsequent stacking of new traits would proceed through the same procedures as used for the insertion of G2 and G3, with each round alternating between the insertion of an attB or an attP circular plasmid. Note that with each round of Cre-lox recombination, recombination between the set of oppositely oriented lox sites could invert the gene stack, and may lead to two different expression patterns due to different orientations within the chromosome.
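The alternation between attB and attP plasmids can be traced with a small bookkeeping model. The sketch below is a simplification under stated assumptions: it tracks only the order of trait genes, the markers awaiting excision, and the type of att site left available for the next round. It does not model lox site orientation, the two possible integration configurations, or inversion of the stack. All class and function names, and the marker name M3, are hypothetical.

```python
from dataclasses import dataclass, field

# An incoming plasmid must carry the site complementary to the genomic one.
COMPLEMENT = {"attP": "attB", "attB": "attP"}


@dataclass
class Locus:
    genes: list                      # stacked trait genes, in order
    free_site: str                   # att site available for the next round
    markers: list = field(default_factory=list)


def integrate(locus, gene, marker, plasmid_site):
    """Integrase step. The plasmid carries two copies of its att site flanking
    the gene; one is consumed by integration and the other becomes the new
    free site in the genome."""
    if plasmid_site != COMPLEMENT[locus.free_site]:
        raise ValueError(
            f"genomic {locus.free_site} needs a {COMPLEMENT[locus.free_site]} plasmid"
        )
    return Locus(locus.genes + [gene], plasmid_site, locus.markers + [marker])


def excise_markers(locus):
    """Cre-lox step: remove the lox-flanked marker DNA."""
    return Locus(list(locus.genes), locus.free_site, [])


# Start from the target line of Figure 1A: G1, marker M1 and a genomic attP.
locus = Locus(["G1"], "attP", ["M1"])
for gene, marker in [("G2", "M2"), ("G3", "M3")]:
    plasmid_site = COMPLEMENT[locus.free_site]
    locus = excise_markers(integrate(locus, gene, marker, plasmid_site))
    print("-".join(locus.genes), "with free", locus.free_site)
```

Run as written, the model ends each round with the opposite free site from the one it started with, matching the attB site after G2 and the attP site after G3 described above.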
Figure 1 also shows the optional inclusion of a set of recombination sites from a third site-specific recombination system that flanks the initial construct shown in Figure 1A. The example shown is from the CinH-RS2 system, although it could be from any recombination system that mediates site-specific excision. The purpose of the RS2 sites is to permit the optional removal of the entire transgenic locus, except for minimal transgenic sequences outside of these sites, such as T-DNA borders if they were used. This may be desirable should there be a need to remove certain types of transgenes from, for example, pollen, grain or fruit (Luo et al. 2007; Moon et al. 2011).
As mentioned earlier, the lox sites in opposite orientation flanking the transgene stack just within the RS2 sites can be used for translocating the intervening DNA to another location through recombination with a similarly configured transgenic fragment, that is, in a cassette exchange reaction. In principle, cassette exchange could be used to break the linkage drag of a transgenic locus. Figure 2A shows the Figure 1C structure, but situated between closely linked non-elite alleles Y and Z. Extensive backcrossing is needed to break the linkage drag to obtain the elite line (field cultivar) shown in Figure 2B, a commercial variety having G1 flanked by elite alleles Y* and Z*. A genetic cross between the laboratory line and the elite line would yield progeny with homologous chromosomes derived from each parent. Activation of Cre-lox site-specific recombination would delete the unneeded DNA in Figure 2A, as in Figure 1C. Due to the set of inverted lox sites flanking the stacked transgenes, Cre-lox directed recombination between the two homologous chromosomes could also occur to exchange the two transgene cassettes and permit recovery of the G1-G2 cassette flanked by elite alleles Y* and Z*. Hence, the conversion from laboratory to field varieties would be expedited not only through minimizing the number of segregating loci, but also through this use of site-specific recombination to break linkage drag. Should interchromosomal site-specific recombination reach maximum efficiency, Y, G1 and Z would behave as unlinked loci, and the probability of obtaining a Y*-G1-G2-Z* homozygote in the next generation could be as high as (1/4)^3, or 1 in 64.
The recombinase-mediated gene stacking scheme requires a minimum of two recombination systems. The first system has to catalyze an irreversible reaction. When we first began to consider this gene stacking system, such a recombination system was not available. It took a former postdoc, L. Thomason, several years to develop the phiC31-att system (Thomason et al. 2001). At the time we thought we had achieved the necessary breakthrough, unaware that a Stanford University laboratory was also testing this system (Groth et al. 2000). Although their interest was in human gene therapy, the filing of intellectual property by both institutions created extensive uncertainty at the time as to whether we ought to proceed with a gene stacking system based on phiC31-att. In the end, we decided to start over from scratch. Using the same screening strategy, a second postdoc, J. Thomson, found Bxb1-att. In the process, two other deletion-specific systems, ParA-MRS and CinH-RS2, were also found (Thomson and Ow 2006). Hence, developing gene stacking based on Bxb1-att became a priority over phiC31-att.
We have reported site-specific integration in tobacco using the Bxb1-att system, essentially as illustrated by Figure 1A–C. A Bxb1 integrase-expressing construct co-transformed along with the integrating DNA yielded precise, site-specific insertions at around 10% efficiency (Yau et al. 2011). This shows that the Bxb1-att system can catalyze the genomic attP x plasmid attB reaction. As expected, the subsequent Cre-lox deletions were also found among progeny from a cross with a cre-expressing plant, producing the configuration essentially as in Figure 1D. More recently, we achieved integration of the molecule essentially as shown in Figure 1E, into a genomic configuration essentially as shown in Figure 1C (Y-Y Yau, unpublished data). This shows that the Bxb1-att system can also catalyze the genomic attB x plasmid attP reaction. While we focused on developing Bxb1-att, others have reported success with the phiC31-att system for site-specific recombination in higher plants, including integration into the chloroplast genome (Marillonnet et al. 2004; Lutz et al. 2004; Kittiwongwattana et al. 2007; Kempe et al. 2010).
For the second system, Cre-lox, FLP-FRT, or R-RS should be suitable, as each can catalyze site-specific excision as well as cassette exchange reactions. However, more data are available on using the Cre-lox system in plants, especially for generating specific chromosomal rearrangements, including interchromosomal recombination (Qin et al. 1994; Medberry et al. 1995; Koshinsky et al. 2000; Vergunst et al. 2000). It is also the only site-specific recombination system so far that has led to a commercial variety grown in the United States. In 2006, the U.S. deregulated Renessen's (Monsanto/Cargill joint venture) LY038 line of high lysine maize, in which the nptII kanamycin resistance gene was removed by Cre-lox site-specific recombination (Ow 2007). In that regard, at least for use in generating deletions, this system has received regulatory approval from the U.S. and six other countries. Finally, to the best of my knowledge, the Cre-lox patent has recently expired. These considerations make Cre-lox our system of choice.
Should an optional third system be used to remove transgenic DNA, any of the recombination systems would suffice as long as it can perform efficient site-specific excision. Given that we have on hand the new deletion-only systems, ParA-MRS and CinH-RS2, we are testing these. So far, the CinH-RS2 system shows promise (Moon et al. 2011), although more work is needed for it to achieve near 100% excision efficiency before it can be used to address biosafety issues. Most likely, this is just a matter of engineering more effective expression of the recombinase gene.
According to the published literature on Cre-lox, R-RS, and FLP-FRT site-specific integration in plants, DNA targeting can produce, in a high percentage of integration events, a precise structure along with an expected level of transgene expression (Day et al. 2000; Srivastava et al. 2004; Nanto et al. 2009; Li et al. 2009, 2010). Unexpected gene silencing effects have also been reported, especially when protoplasts were used for transformation, but they do not represent a majority of the recovered plants. Moreover, in cases where gene silencing was observed, the hypermethylation of DNA was confined to the newly introduced DNA and did not extend to pre-existing transgenic DNA (Day et al. 2000).
Of particular relevance to this stacking scheme is the site-specific insertion of circular molecules delivered by particle bombardment (Srivastava and Ow 2002). Among the rice lines produced by biolistics, the recovered plants were not only mostly single copy, but also expressed the transgenes stably over a number of generations (Srivastava et al. 2004; Chawla et al. 2006). This contrasts with what most practitioners find using particle bombardment methods that integrate DNA randomly. Often, the DNA integrates in high-copy complex patterns, and is frequently accompanied by gene silencing. This ability to generate useful, site-specific insertions through Cre-lox recombination has prompted the development of this gene stacking scheme based on the insertion of circular molecules, rather than linear molecules such as those delivered through T-DNA integration. Another consideration was the expected expiration of the Cornell University particle bombardment patent, hence permitting freedom to operate.
Still largely untested is the effect on gene expression when new DNA segments are stacked next to each other, an uncertainty that is shared with the in vitro stacking approach. It would seem plausible that regulatory elements introduced with each coding region could affect the expression of neighboring transgenes. It may become necessary to include border DNA segments to insulate each transgene. Additionally or alternatively, rather than envisioning a single locus harboring all of the introduced transgenes, there may come a need to separate similarly expressed transgenes into distinct clusters, such as a root-specific transgenic locus and a seed-specific transgenic locus. This approach still reduces the number of segregating transgenic loci, though not to the lowest number of one.
The recombinase-mediated gene stacking system can be considered a new transformation operating system. It permits compatibility of software, the encoded transgenes, created by different developers, at different times and/or through different transformation techniques. It is also backward compatible in the sense that site-specific gene stacks can coexist with randomly inserted transgenes. The operating system is relatively open source, as to the best of my knowledge, the particle bombardment and Cre-lox patents have expired, and patents on selected recombination systems, namely Bxb1-att and CinH-RS2, have not been filed internationally. Given that the key steps of this strategy have tested positive in tobacco, translating this research to crop plants is a logical next step. In retrospect, the time it took to reach this stage of development was much longer than expected. Part of the delay was due to the freedom-to-operate issue, which we viewed as being as important as technical feasibility. Keeping researchers from being side-tracked in other directions was also not easy. Unlike investigative science, technology development does not tolerate unintended outcomes, since such data are often not publishable. Therefore, the challenge remains as to whether or not translating this research to crop plants can be accomplished in an academic setting.