The Tol2kit: A multisite gateway-based construction kit for Tol2 transposon transgenesis constructs



Transgenesis is an important tool for assessing gene function. In zebrafish, transgenesis has suffered from three problems: the labor of building complex expression constructs using conventional subcloning; low transgenesis efficiency, leading to mosaicism in transient transgenics and infrequent germline incorporation; and difficulty in identifying germline integrations unless using a fluorescent marker transgene. The Tol2kit system uses site-specific recombination-based cloning (multisite Gateway technology) to allow quick, modular assembly of [promoter]–[coding sequence]–[3′ tag] constructs in a Tol2 transposon backbone. It includes a destination vector with a cmlc2:EGFP (enhanced green fluorescent protein) transgenesis marker and a variety of widely useful entry clones, including hsp70 and beta-actin promoters; cytoplasmic, nuclear, and membrane-localized fluorescent proteins; and internal ribosome entry sequence–driven EGFP cassettes for bicistronic expression. The Tol2kit greatly facilitates zebrafish transgenesis, simplifies the sharing of clones, and enables large-scale projects testing the functions of libraries of regulatory or coding sequences. Developmental Dynamics 236:3088–3099, 2007. © 2007 Wiley-Liss, Inc.


Transgenesis is a fundamental technique in any genetic system. As the use of the zebrafish system expands, transgenic lines are increasingly important to many experimental strategies, including genetic screens and dissection of transcriptional regulatory networks. For instance, the cloning of tissue-specific promoters allows labeling of specific embryonic structures in live embryos using fluorescent reporter transgenes, facilitating many experiments including genetic screens (Wada et al., 2005; Xiao et al., 2005). Furthermore, if expression in transient transgenics is sufficiently nonmosaic, promoter analysis or gene misexpression experiments can be carried out immediately without the time-consuming generation of stable lines (Fisher et al., 2006a, b).

Generating transgenic zebrafish has historically been time-consuming, due to three technical limitations. First, generating the desired expression constructs by conventional subcloning can require laborious multistep cloning strategies, because the choice of restriction enzymes is often limited for long genomic or cDNA fragments. Long-range PCR methods can circumvent some of these problems, but require painstaking resequencing of coding sequences. Second, rates of germline transgenesis are low with plasmid-based transgenesis, requiring the injection, raising, and screening of scores to hundreds of potential founders to ensure recovery of a stable line. Injection of supercoiled or linear DNA yields 1–10% germline transgenic founders (Stuart et al., 1988, 1990), while linearization with I-SceI meganuclease yields 20–30% germline transgenic founders (Thermes et al., 2002). The recent advent of transposon-based systems has dramatically increased the transgenesis rate, to 30% with Sleeping Beauty (Davidson et al., 2003) or 50% with Tol2 (Kawakami et al., 2004b). Third, while screening for transgenic embryos is simple in the case of transgenes expressing fluorescent proteins, it is tedious when the gene product is nonfluorescent (e.g., Gal4 or a dominant-negative receptor), or expressed conditionally (e.g., a construct driven by a Gal4 upstream activating sequence).

We designed the Tol2kit to overcome all three limitations. First, recombination-based cloning using the multisite Gateway system (Hartley et al., 2000; Cheo et al., 2004) greatly simplifies the generation of expression constructs. Furthermore, by selecting from an extensive library of existing building blocks (“entry clones”), one can rapidly construct multiple constructs in parallel. For instance, a tissue-specific promoter tsp could be used to generate tsp:EGFP (enhanced green fluorescent protein), tsp:nls-EGFP, and tsp:mCherry constructs, all in a single set of parallel cloning reactions. Second, using a Tol2 transposon backbone takes advantage of this system's high transgenesis efficiency (Kawakami, 2004). Indeed, expression from Tol2 constructs seems sufficiently nonmosaic that it may be possible to carry out many experiments in injected embryos (i.e., transient transgenics) rather than using stable lines (Fig. 4; data not shown). Third, we provide two different methods to visualize transgenes expressing nonfluorescent proteins: bicistronic mRNAs using an internal ribosome entry sequence (IRES) to drive EGFP or EGFP fusions, or a cmlc2:EGFP transgenesis marker included in the vector backbone. Here, we describe the Tol2kit system, its individual components, and functional tests thereof.

Figure 4.

Test of pDestTol2CG destination vector. Embryos were injected at the one-cell stage with 25 pg of DNA for pDestTol2CG; hsp70:mCherryCAAX-polyA, either with or without 25 pg of transposase RNA. A: Schematic of expression construct and graph showing fraction of embryos with a green heart at 30 hours postfertilization (hpf). Arrows in schematic indicate direction of transcription. Coinjection of transposase RNA greatly increased the fraction of embryos with green hearts. B–E: Further analysis of embryos selected at 30 hpf for green hearts. B–B′″: Embryos at 48 hpf, after heat shock for 1 hr at 30 hpf. C–C′″: Embryos at 48 hpf, without heat shock. No red fluorescence is detectable at this magnification (C″). D–D′″: Close up of two heat-shocked embryos, shown at 48 hpf. Note bright red fluorescence (D″) and lack of ectopic green fluorescence (D′). E–E′″: Close up of two non–heat-shocked embryos, shown at 48 hpf. Note that there is no mCherry fluorescence in the heart. Dim, diffuse red fluorescence appears in one embryo (E″, arrowhead), indicating leaky expression from the hsp70 promoter.



Gateway cloning technology is based on the att site-specific recombination system from lambda phage (Hartley et al., 2000). An attB and attP site can be recombined by a combination of enzymes to yield an attL and an attR site, while the reverse reaction is driven by a different combination of enzymes. By using several engineered att sites that recombine specifically, the commercially available multisite Gateway system (Invitrogen) makes it possible to combine directionally up to five fragments of DNA. We have used the three-insert multisite Gateway system, not the four-insert Gateway Pro system, which is incompatible with the components of the Tol2kit. The three-insert multisite Gateway system combines three “entry” vectors into a “destination” vector (Fig. 1). We refer to the three varieties of entry vector as 5′ clones (p5E-XX), with attL4-attR1 sites flanking the insert, typically containing a promoter element; middle clones (pME-XX), with attL1-attL2 sites flanking the insert, typically a reporter or coding sequence of interest; and 3′ clones (p3E-XX), with attR2-attL3 sites flanking the insert, typically containing a polyA signal or a 3′ tag such as IRES-EGFP-polyA. To generate an expression construct, equimolar amounts of the destination vector and 5′, middle, and 3′ entry clones are mixed in vitro; recombined in an “LR reaction” (in which attL and attR sites recombine); and transformed into bacteria for antibiotic selection (Fig. 1A). To generate the entry clones themselves, we use polymerase chain reaction (PCR) to add specific attB sequences to the ends of the desired insert sequence, followed by an in vitro “BP reaction” (in which attB and attP sites recombine) and transformation into bacteria (Fig. 1C).
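The directional logic of the three-insert LR reaction can be sketched in a few lines of Python. This is only an illustrative model, not part of the Tol2kit: the `Clone` class and the `recombines` and `lr_reaction` helpers are hypothetical names, and the pairing rule (an attL site recombines only with the attR site bearing the same number) is a simplification of the site specificity described above. The att sites assigned to each clone are those given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str   # plasmid name, e.g. "p5E-hsp70"
    left: str   # att site on the 5' side of the insert (or gate)
    right: str  # att site on the 3' side of the insert (or gate)


def recombines(a: str, b: str) -> bool:
    """Simplified rule: an attL site pairs only with the attR site
    carrying the same number (attL4 with attR4, and so on)."""
    return {a[:4], b[:4]} == {"attL", "attR"} and a[4:] == b[4:]


def short(clone: Clone) -> str:
    """Drop the p5E-/pME-/p3E- prefix from a clone name."""
    return clone.name.split("-", 1)[1]


def lr_reaction(dest: Clone, five: Clone, middle: Clone, three: Clone) -> str:
    """Check all four junctions and return the expression construct name."""
    junctions = [
        (dest.left, five.left),      # attR4 x attL4
        (five.right, middle.left),   # attR1 x attL1
        (middle.right, three.left),  # attL2 x attR2
        (three.right, dest.right),   # attL3 x attR3
    ]
    for a, b in junctions:
        if not recombines(a, b):
            raise ValueError(f"{a} cannot recombine with {b}")
    return f"{dest.name}; {short(five)}:{short(middle)}-{short(three)}"


dest = Clone("pDestTol2CG", "attR4", "attR3")
p5e = Clone("p5E-hsp70", "attL4", "attR1")
pme = Clone("pME-mCherryCAAX", "attL1", "attL2")
p3e = Clone("p3E-polyA", "attR2", "attL3")
print(lr_reaction(dest, p5e, pme, p3e))
```

Swapping the middle and 3′ clones in the call raises an error in this model, which mirrors why the engineered att sites force a single insert order and orientation.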

Figure 1.

Gateway cloning strategy. A: Schematic of the 3-part LR recombination reaction used to generate expression constructs, using three entry clones and the pDestTol2pA/pA2 destination vector. B: Schematic of pDestTol2CG/CG2, destination vectors that include the cmlc2:EGFP-pA expression cassette. C: Strategy for generating a middle entry clone using pDONR221. Generation of 5′ and 3′ entry clones is similar, but uses different att sequences and donor vectors. D: Transformation of TOP10 cells with an LR recombination reaction yields two classes of colonies: clear (arrowheads) and opaque (arrow). Clear colonies yield the correct recombination product >99% of the time, whereas opaque colonies never do.

Recombination Reactions

In our hands, making entry clones by means of BP reactions was highly efficient and reliable, likely because these are bimolecular reactions followed by negative selection (loss of the ccdB of the donor vector). Even using only moderately competent bacteria, transformations generally yielded hundreds to thousands of colonies. Virtually all transformants contained an appropriate-length insert and were correct, although occasional clones bore point mutations attributable to PCR errors or primer synthesis errors. We have only had difficulty making entry clones when the PCR failed, for instance when trying to amplify a large genomic fragment. In these cases, we conventionally subclone the inserts into an entry clone containing a multiple cloning site (p5E-MCS or p5E-Fse-Asc). There are three varieties of donor vectors (pDONR plasmids) used to generate the three types of entry clones: pDONR P4-P1R (for generating 5′ entry clones) contains attP4/P1R sites; pDONR 221 (for generating middle entry clones) contains attP1/P2 sites; and pDONR P2R-P3 (for generating 3′ entry clones) contains attP2R/P3 sites. The att sites, specific to each donor vector, ensure directional recombination in the multisite recombination reaction.
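The relationship between each donor vector and the att sites of the resulting entry clone can be written down as a small lookup. The following Python sketch is illustrative only: the `DONORS` table repeats the donor vectors and attP sites listed above, while the naming rule in `entry_site` (attPn becomes attLn, attPnR becomes attRn) is an assumption inferred from the entry-clone sites quoted in this article, and the function names are hypothetical.

```python
import re

# Donor vectors and their attP sites, as listed in the text.
DONORS = {
    "5'": ("pDONR P4-P1R", ("attP4", "attP1R")),
    "middle": ("pDONR 221", ("attP1", "attP2")),
    "3'": ("pDONR P2R-P3", ("attP2R", "attP3")),
}


def entry_site(attp: str) -> str:
    """Site left in the entry clone after the BP reaction.
    Assumed naming rule: attPn -> attLn, attPnR -> attRn."""
    match = re.fullmatch(r"attP(\d)(R?)", attp)
    if match is None:
        raise ValueError(f"not an attP site: {attp}")
    number, reverse = match.groups()
    return f"att{'R' if reverse else 'L'}{number}"


def entry_clone_sites(position: str) -> tuple[str, str]:
    """Flanking att sites of the entry clone made at this position."""
    _, (left, right) = DONORS[position]
    return entry_site(left), entry_site(right)


for position, (donor, _) in DONORS.items():
    print(position, donor, entry_clone_sites(position))
```

Under this rule the three positions come out as attL4/attR1, attL1/attL2, and attR2/attL3, matching the sites given for 5′, middle, and 3′ entry clones.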

Making expression constructs by means of three-insert LR reactions was less efficient. Transformations required highly competent bacteria and still yielded only hundreds of colonies, likely because these reactions are tetramolecular, and because there is selection against large plasmid expression constructs (often 10–15 kb). Furthermore, only a fraction of transformants yielded correct restriction patterns, apparently because the plasmids can undergo a variety of undesired internal rearrangements. However, we made the fortuitous observation that two different types of colonies, clear and opaque, appear on bacterial plates after transformation of an LR reaction (Fig. 1D). After growing and restriction-analyzing hundreds of colonies from many LR reactions, we have found empirically that clear colonies nearly always (>99%) yield the desired expression construct, whereas opaque colonies never contain the correct construct, but instead yield a variety of different (incorrect) restriction patterns. Taking advantage of this “clear/opaque” selection, we can typically obtain desired expression constructs after picking at most six colonies from each LR reaction.

LR reactions incorporating very short or very long fragments do show reduced efficiency, yielding fewer transformants and a lower percentage of clear colonies. We have noticed reduced efficiency when one insert is shorter than ∼200 bp (e.g., p3E-polyA), but have always been able to obtain the correct expression construct, even with an insert as short as 20 bp (p5E-Fse-Asc). We regularly use p5E-bactin2, with a 5.3-kb insert, and have successfully generated expression constructs using entry clones with inserts as large as 6.5 kb (J. Bonkowsky, E. Fujimoto, and C.-B. Chien, unpublished data). LR reactions fail using a 5′ clone with a 17.7-kb insert (B. Mangum and C.-B. Chien, unpublished), but this finding is likely due to capacity constraints of the high-copy backbone used in the destination vectors and/or the repetitive sequences found in this fragment. The least efficient but still successful reactions yield 10% clear colonies, while the most efficient yield >75% clear colonies. Therefore, while different LR reactions have variable rates of success, we can almost always generate the desired expression construct as long as clear colonies can be identified.
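Because the LR reaction calls for equimolar amounts of four plasmids whose sizes can differ considerably, the mass of each plasmid has to be scaled by its length. A minimal Python sketch of that arithmetic follows. It assumes the commonly used average of about 660 g/mol per base pair of double-stranded DNA; the 20 fmol target and the plasmid sizes are hypothetical values chosen for illustration and are not taken from this article.

```python
# Approximate average mass of one base pair of double-stranded DNA (g/mol).
BP_MASS = 660.0


def ng_for_fmol(fmol: float, length_bp: int) -> float:
    """Mass in ng corresponding to `fmol` femtomoles of a dsDNA plasmid.
    ng = fmol * 1e-15 mol * (length_bp * BP_MASS g/mol) / 1e-9 g"""
    return fmol * length_bp * BP_MASS * 1e-6


# Hypothetical total plasmid sizes in bp, for illustration only.
plasmids = {"5' entry": 8000, "middle entry": 3500, "3' entry": 3000, "destination": 6000}

TARGET_FMOL = 20  # assumed amount per plasmid, not specified in the text

for name, size in plasmids.items():
    print(f"{name}: {ng_for_fmol(TARGET_FMOL, size):.0f} ng")
```

The point of the calculation is simply that a large 5′ entry clone requires proportionally more mass than a small 3′ entry clone to reach the same number of molecules.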

Overview of Tol2kit Components

The components included in the Tol2kit (listed in Table 1) were chosen to be generally useful in expression constructs for transient or stable transgenesis in zebrafish. They include a variety of regulatory elements (bactin2, hsp70, UAS) in 5′ entry vectors; reporter constructs (EGFP, mCherry, and fusions thereof) in middle entry vectors; and 3′ tags (EGFP, mCherry, and myc fusions, as well as IRES-EGFP tags). For many experiments, all that will be needed is the addition of a single entry clone. For example, a 5′ tissue-specific promoter clone can immediately be used to label the cells in that tissue with cytoplasmic, nuclear, or membrane-localized EGFP or mCherry. As another example, a middle entry clone encoding a gene of interest can immediately be used to overexpress this gene globally (using p5E-bactin2) or under heat-shock control (using p5E-hsp70), marking overexpressing cells with EGFP (using p3E-IRES-EGFPpA) if so desired.
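The modularity described here amounts to taking every combination of one 5′, one middle, and one 3′ entry clone. As a rough illustration in Python, the sketch below enumerates the constructs available from a single new promoter clone. `p5E-tsp` is a hypothetical tissue-specific promoter clone, and the helper names are not part of the Tol2kit.

```python
from itertools import product


def short(clone: str) -> str:
    """Drop the p5E-/pME-/p3E- prefix: 'pME-nlsEGFP' -> 'nlsEGFP'."""
    return clone.split("-", 1)[1]


def construct_names(five_clones, middle_clones, three_clones):
    """All [promoter]:[coding sequence]-[3' tag] combinations."""
    return [
        f"{short(f)}:{short(m)}-{short(t)}"
        for f, m, t in product(five_clones, middle_clones, three_clones)
    ]


names = construct_names(
    ["p5E-tsp"],  # hypothetical tissue-specific promoter clone
    ["pME-EGFP", "pME-nlsEGFP", "pME-EGFPCAAX",
     "pME-mCherry", "pME-nlsmCherry", "pME-mCherryCAAX"],
    ["p3E-polyA"],
)
for name in names:
    print(name)
```

One new 5′ clone therefore yields six reporter constructs (cytoplasmic, nuclear, or membrane-localized EGFP or mCherry) from parallel LR reactions, without any additional cloning.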

Table 1. Components of the Tol2kit (a)

| Clone | Description | Tested | Figure |
| --- | --- | --- | --- |
| 5′ entry clones | | | |
| p5E-bactin2 | 5.3-kb beta-actin promoter (ubiquitous) | F0, F1 | 2 |
| p5E-h2afx | 1-kb H2A-X promoter (quasi-ubiquitous) | F0 | 2b |
| p5E-CMV/SP6 | 1-kb CMV/SP6 cassette from pCS2+ | F0 | |
| p5E-hsp70 | 1.5-kb hsp70 promoter for heat-shock induction | F0, F1 | 4 |
| p5E-UAS | 10x UAS element and basal promoter for Gal4 response | F0 | |
| p5E-MCS | Multicloning site from pBluescript | F0 | |
| p5E-Fse-Asc | Restriction sites for 8-cutters FseI and AscI | F0 | |
| Middle entry clones | | | |
| pME-EGFPCAAX | Membrane-localized (prenylated) EGFP; fused to the last 21 amino acids of H-ras | F0, F1 | 3 |
| pME-nlsEGFP | Nuclear-localized EGFP | F0 | 2f |
| pME-mCherry | Monomeric red fluorophore mCherry | F0, F1 | 2c |
| pME-mCherryCAAX | Membrane-localized (prenylated) mCherry | F0 | 2e, 4 |
| pME-nlsmCherry | Nuclear-localized mCherry | F0 | 2abd |
| pME-H2AmCherry | mCherry fused to the zebrafish histone H2A.F/Z | F0, F1 | |
| pME-Gal4VP16 | Gal4 DNA binding domain fused to the VP16 transactivation domain | F0 | |
| 3′ entry clones | | | |
| p3E-polyA | SV40 late poly A signal sequence from pCS2+ | F0, F1 | 3, 4 |
| p3E-MTpA | 6x myc tag for protein fusions, plus SV40 late polyA | | |
| p3E-EGFPpA | EGFP for protein fusions, plus SV40 late polyA | F0, F1 | |
| p3E-mCherrypA | mCherry for protein fusions, plus SV40 late polyA | F0, F1 | |
| p3E-IRES-EGFPpA | EMCV IRES driving EGFP plus SV40 late polyA | F0 | 2d |
| p3E-IRES-EGFPCAAXpA | EMCV IRES driving EGFPCAAX (prenylated EGFP) plus SV40 late polyA | F0, F1 | 2abcf |
| p3E-IRES-nlsEGFPpA | EMCV IRES driving nlsEGFP (nuclear EGFP) plus SV40 late polyA | F0 | 2e |
| Destination vectors | | | |
| pDestTol2pA/pDestTol2pA2 | attR4-R3 gate with SV40 polyA flanked by Tol2 inverted repeats | F0, F1 | 3 |
| pDestTol2CG/pDestTol2CG2 | pDestTol2pA/pDestTol2pA2 with cmlc2:EGFP transgenesis marker | F0, F1 | 4 |
| pCS2FA-transposase | For in vitro transcription of capped Tol2 transposase RNA | | 3, 4 |

(a) EGFP, enhanced green fluorescent protein; F0, yields appropriate expression in transient transgenics; F1, yields appropriate expression in stable transgenics.

The following sections describe in turn the 5′, middle, and 3′ entry clones of the Tol2kit, showing sample results from expression constructs using these clones. Initial tests of the entry clones used the destination vector pISce-Dest (Marcel Souren and Jochen Wittbrodt, unpublished observations), in which the attR4-R3 multisite gate is flanked by I-SceI meganuclease sites. Later tests used Tol2-based destination vectors: pDestTol2pA/pDestTol2pA2 and pDestTol2CG/pDestTol2CG2, details of which are described below.

5′ Entry Clones

We typically use 5′ entry clones to supply promoter elements (or more strictly speaking, enhancer–promoter elements). There are seven 5′ entry clones in the Tol2kit (Table 1). Three are designed for driving broad expression throughout the embryo, two for conditional expression, and two for conventional subcloning of promoter elements. All are designated as p5E clones (to specify 5′ entry clone), and the DNA inserts are flanked by attL4/R1 sites.

Promoters for broad expression.

p5E-bactin2 contains a 5.3-kb promoter element from the β-actin gene (Higashijima et al., 1997), including a 5′-untranslated region (UTR) exon and an intron upstream of the start codon, which drives expression broadly throughout the embryo (Fig. 2A,C–G). p5E-h2afx contains a 1-kb promoter from the histone H2AX gene, which also drives broad expression (Fig. 2B; J. Parant and H.J. Yost, data not shown). This broad reporter expression may be due to perdurance of fluorescent proteins, because the endogenous gene appears to be expressed in proliferation zones (Thisse and Thisse, 2004). p5E-CMV/SP6 contains the simian CMV and SP6 promoters from pCS2+ (Rupp et al., 1994; Turner and Weintraub, 1994). When injected as DNA, the CMV promoter of this cassette drives broad expression (data not shown). In addition, its SP6 promoter can be used for in vitro transcription reactions to produce capped RNA.

Figure 2.

Validation of reporters and internal ribosome entry sequence (IRES) constructs. Embryos shown were injected at the one-cell stage with expression constructs made with the pISce-Dest destination vector, then mounted at 24 hours postfertilization (hpf) for confocal microscopy of the trunk. All plasmids tested generate functional bicistronic messages, as demonstrated by the presence of both mCherry and EGFP fluorescence. A–A″: bactin2:nlsmCherry-IRES-EGFPCAAXpA. B–B″: h2afx:nlsmCherry-IRES-EGFPCAAXpA. C–C″: bactin2:mCherry-IRES-EGFPCAAXpA. D–D″: bactin2:nlsmCherry-IRES-EGFPpA. E–E″: bactin2:mCherryCAAX-IRES-nlsEGFPpA. F: bactin2:nlsEGFP-IRES-EGFPCAAXpA. G: bactin2:EGFPCAAX-IRES-nlsEGFPpA. Scale bar = 50 μm.

Promoters for conditional expression.

p5E-hsp70 contains a 1.5-kb promoter element from the hsp70 gene (Halloran et al., 2000), whose expression is strongly upregulated after heat shock (Fig. 4B–E). We note that, when using the hsp70 promoter to drive bright fluorescent protein constructs, we do sometimes detect low-level, broad fluorescence even at normal rearing temperatures, likely reflecting leaky expression (Fig. 4E). p5E-UAS contains a multimerized (10×) Gal4 upstream activating sequence (UAS) element from yeast, followed by an adenovirus E1b TATA box and a carp beta-actin 5′-UTR fragment (Rorth, 1996; Köster and Fraser, 2001). This sequence drives no expression on its own, but is strongly activated in the presence of Gal4 (data not shown).

5′ entry clones for subcloning.

p5E-MCS contains the pBluescript multiple cloning site, whereas p5E-Fse-Asc contains two 8-cutter restriction sites, FseI and AscI. Although we have generally been successful in using PCR and a BP reaction to place promoter elements in 5′ entry clones, this does not always work, particularly for longer (>5 kb) promoter elements. In addition, many existing promoter clones are flanked by convenient restriction sites. These two entry clones can be used for conventional subcloning of promoter 5′ entry clones, which can then be used in subsequent multisite recombination reactions. For instance, we have used this method to construct a working 5′ entry clone containing 5 kb of the ath5 (atoh7) promoter (Masai et al., 2003; K.M. Kwan and C.-B. Chien, unpublished).

Middle Entry Clones

We typically use middle entry clones to supply reporters or genes of interest. The Tol2kit includes seven middle clones encoding fluorescent reporters, either cytoplasmic or with different subcellular localization tags, and one middle clone encoding the Gal4VP16 transcription factor. The two fluorescent proteins used were EGFP (Zhang et al., 1996) and the monomeric red fluorescent protein mCherry (Shaner et al., 2004; Gray et al., 2006). All of these constructs use strong Kozak consensus sequences for ribosome binding at the start codon (Kozak, 1984) and were generated in the standard Gateway reading frame. All of these constructs contain a stop codon, and so cannot be used for N-terminal fusions to a protein of interest. They are designated as pME clones (to specify middle entry clone), and the DNA inserts are flanked by attL1/L2 sites.

EGFP reporters.

Three versions of EGFP are included. pME-EGFP encodes a cytoplasmic EGFP. pME-EGFPCAAX encodes a prenylated EGFP, with a C-terminal fusion to the last 21 amino acids of human H-ras (CAAX box) for targeting to the plasma membrane (Moriyoshi et al., 1996; Fig. 2G). pME-nlsEGFP encodes EGFP with an N-terminal nuclear localization sequence (Fig. 2F). Reporters with different subcellular localization will be useful for particular experiments, and also allow unambiguous identification of double-labeled cells, for instance with green membranes and red nuclei (Fig. 2A″,B″).

mCherry reporters.

Four versions of mCherry are included. pME-mCherry encodes cytoplasmic mCherry (Fig. 2C,C″). pME-mCherryCAAX encodes a membrane-targeted mCherry (Fig. 2E,E″), analogous to EGFPCAAX, whereas pME-nlsmCherry encodes a nuclear-localized mCherry (Fig. 2A,A″,B,B″,D,D″), analogous to nlsEGFP. pME-H2AmCherry encodes mCherry fused to the C-terminus of zebrafish histone variant H2A.F/Z. Like H2A.F/Z-EGFP fusions (Pauls et al., 2001), this fusion specifically labels chromatin (K.M. Kwan and C.-B. Chien, data not shown).


The final middle entry clone encodes Gal4VP16, a fusion of the Gal4 DNA binding domain to the highly acidic region of the herpes simplex virus protein VP16, which strongly activates transcription downstream of the Gal4 UAS (Sadowski et al., 1988). Coinjection of a driver construct expressing Gal4VP16 and a reporter construct using the multimerized UAS from p5E-UAS generates extremely strong expression (K.M. Kwan and C.-B. Chien, data not shown). These constructs have been used successfully for transient expression in retinal ganglion cell axons (Campbell et al., 2007).

3′ Entry Clones

We typically use 3′ entry clones to supply a polyadenylation signal, and sometimes also a 3′ tag: either a fusion protein tag, or an EGFP marker encoded as a separate cistron driven by an IRES. There are seven 3′ entry clones included in the Tol2kit (Table 1). All are designated as p3E clones (to specify 3′ entry clone), and the DNA inserts are flanked by attR2/L3 sites.

polyA signal.

p3E-polyA comprises the SV40 late polyadenylation (polyA) signal sequence (Okayama and Berg, 1983), derived from pCS2+; the other six 3′ entry clones all use the same polyA signal. Whereas many mammalian expression vectors use the SV40 early polyadenylation signal sequence, in Xenopus laevis and other systems, the SV40 late polyA signal sequence has been shown to be much more effective in stabilizing mRNA transcripts and promoting translation (Carswell and Alwine, 1989; Matsumoto et al., 1998). In zebrafish, although we have not carried out extensive comparison of the SV40 early and late polyA signals, the latter is clearly effective: pCS2+ DNA constructs give strong expression, as does RNA transcribed in vitro from pCS2+ templates (data not shown).

C-terminal protein tags.

Using a middle entry clone encoding a gene of interest (with no stop codon), multisite recombination can generate in-frame C-terminal protein fusions using a tag encoded by a 3′ entry clone. Three of the 3′ entry clones encode such tags in the standard Gateway reading frame: p3E-MTpA encodes a 6× myc epitope tag (Munro and Pelham, 1986; Roth et al., 1991) derived from pCS2+MT; p3E-EGFPpA encodes an EGFP tag; and p3E-mCherrypA encodes an mCherry tag.

IRES-driven markers.

A C-terminal fusion is often undesirable, because such a fusion is either known or suspected to disrupt function of the protein of interest, yet it is still very useful to label the expressing cells. In mammalian systems, a common solution is to use bicistronic constructs, in which the protein of interest is encoded as the first cistron of an mRNA, and a marker such as GFP is encoded as a second cistron on the same mRNA, with expression of the second cistron driven by an IRES. The IRES most commonly used in mammalian systems is that from encephalomyocarditis virus (EMCV). In zebrafish, the EMCV IRES has been shown to be active (Fahrenkrug et al., 1999), but has not previously been widely used. The Tol2kit includes three different IRES 3′ entry clones: p3E-IRES-EGFPpA, p3E-IRES-EGFPCAAXpA, and p3E-IRES-nlsEGFPpA, which use the EMCV IRES to drive cytoplasmic, membrane-targeted, or nuclear-localized EGFP, respectively, followed by the SV40 late polyA signal. Note that the first cistron should be terminated by a stop codon (see the Discussion section).

To test the effectiveness of these IRES constructs, we used the bactin2 promoter to drive expression of bicistronic mCherry-EGFP constructs using the destination vector pISce-Dest. DNA constructs (75 pg) were injected into one-cell embryos, then mounted for confocal microscopy at 24 hours postfertilization (hpf). In embryos injected with bactin2:nlsmCherry–IRES-EGFPCAAXpA (Fig. 2A–A″), muscle and skin cells clearly display red nuclei (nlsmCherry) and green membranes (EGFPCAAX), indicating that both open reading frames were translated. In embryos injected with bactin2:nlsmCherry-IRES-EGFPpA (Fig. 2D–D″), cells display red nuclei (nlsmCherry) and diffuse cytoplasmic green fluorescence (EGFP), indicating that, again, both open reading frames were translated. In both cases, every cell with green fluorescence also displayed a red nucleus. In addition, green and red fluorescence levels were correlated: the brighter the red nucleus, the brighter the green fluorescence. bactin2:mCherryCAAX-IRES-nlsEGFPpA yielded cells with red membranes and green nuclei (Fig. 2E–E″), whereas bactin2:mCherry–IRES-EGFPCAAXpA yielded cells with cytoplasmic red fluorescence and green membranes (Fig. 2C–C″). Thus, we found consistent coexpression of the two cistrons with multiple constructs.

Finally, to test the relative translation efficiency of the first and second cistrons, we generated a pair of complementary constructs in which one cistron encoded nlsEGFP, and the other encoded EGFPCAAX. In embryos injected with bactin2:nlsEGFP–IRES-EGFPCAAXpA (Fig. 2F), many cells clearly display green nuclei, but the plasma membrane is labeled much less brightly and can only be seen in the cells with the brightest nuclei. Conversely, in embryos injected with bactin2:EGFPCAAX–IRES-nlsEGFPpA (Fig. 2G), green membranes are visible in many cells, but nuclear-localized fluorescence cannot be detected (perhaps obscured by the intensity of EGFPCAAX fluorescence). Therefore, although the EMCV IRES is clearly functional and can drive detectable expression of EGFP reporters, the second cistron is translated far less efficiently than the first cistron. Nonetheless, in our tests, every cell with detectable EGFP driven by the IRES also expresses the first cistron. Thus, these constructs should be generally useful for fluorescently marking cells expressing a gene of interest from the first cistron.

Destination Vectors

We use Tol2-based destination vectors to provide the Tol2 transposon ends needed for transposition. Tol2 was originally isolated from the medaka fish (Oryzias latipes) and belongs to the hAT family of transposons, which are flanked by inverted repeats and encode their own transposase. We constructed four such destination vectors: pDestTol2pA, pDestTol2CG, pDestTol2pA2, and pDestTol2CG2. The latter two are functionally equivalent to the first two (earlier) constructs, but lack ∼2 kb of superfluous sequence lying outside the transposon ends. Therefore, while the initial tests of the destination vectors were performed with the earlier constructs, only pDestTol2pA2 and pDestTol2CG2 are now included in the Tol2kit (Fig. 1A,B; Table 1). These vectors are designed to be used specifically in three-insert LR reactions, not two- or one-insert LR reactions. All four vectors share three basic features: a multisite Gateway cloning cassette, a downstream polyA signal, and flanking Tol2 ends. The multisite Gateway cloning cassette (attR4-ccdB-cmR-attR3) provides a recombination target and negative selection for nonrecombined vector after the LR reaction. An SV40 late polyA signal sequence lies downstream of the cloning cassette, meaning that 3′ entry clones do not strictly require a polyA signal sequence. Expression constructs generated using our existing 3′ clones will have two polyA signal sequences in tandem, potentially increasing the efficiency of polyadenylation. Flanking the Gateway cloning cassette and polyA signal are ∼500 bp from each end of the Tol2 transposon, including the terminal inverted repeats. In addition, pDestTol2CG and pDestTol2CG2 include an extra cmlc2:EGFP-pA expression cassette. This cassette comprises a ∼900-bp enhancer–promoter from the cardiac myosin light chain gene (cmlc2 or myl7, Zebrafish Information Network), which drives cytoplasmic EGFP specifically in the developing heart (Huang et al., 2003; Auman et al., 2007). 
This cassette is designed to be a transgenesis marker, particularly useful when the transgene of interest is nonfluorescent, difficult to visualize (e.g., a few neurons in the brain), or conditionally expressed (e.g., driven by a UAS sequence or a heat-shock promoter). For expression constructs generated with pDestTol2CG/pDestTol2CG2, a green fluorescent heart becomes a proxy for the presence of the transgene. We placed the cmlc2:EGFP-pA cassette in a reversed orientation relative to the multisite gate (Fig. 1B), to minimize the chance of interference between the two expression cassettes (see Fig. 4).

These vectors are designed for use with the Tol2 transposase, which we have recloned into a pCS2-based expression vector (pCS2FA-transposase), removing superfluous 5′ and 3′ untranslated sequences. To generate transgenics, plasmid DNA of a desired expression construct is coinjected with in vitro-transcribed transposase RNA (Kawakami and Shima, 1999; Kawakami, 2004). For optimal transgenesis, it is crucial to inject embryos as early as possible at the one-cell stage, specifically targeting the cell rather than the yolk, because large charged DNA and RNA molecules diffuse poorly. To test pDestTol2pA, we used multisite recombination with p5E-bactin2, pME-EGFPCAAX, and p3E-polyA to generate pDestTol2pA; bactin2:EGFPCAAX-polyA, which should drive ubiquitous expression of plasma membrane-localized EGFP. When we injected 30 pg of DNA with or without 25 pg of transposase RNA (Fig. 3), it was clear that coinjection of transposase RNA increases the number of fluorescent embryos, the number of fluorescent cells in each embryo, and the total fluorescence of each embryo, replicating previous results seen with Tol2 constructs generated by conventional subcloning (Kawakami et al., 2004b; Balciunas et al., 2006). Broad expression throughout the embryo suggests that the transposase is catalyzing transgene insertion early during development, and suggests that transient transgenics will be useful for many experiments, such as testing tissue-specificity of enhancer activity (Fisher et al., 2006b). These DNA/RNA doses are near maximal: approximately 50% of the injected embryos die by 24 hpf, with variations in survival between DNA constructs. pDestTol2pA2 gives similar results (K.M. Kwan and C.-B. Chien, data not shown).

Figure 3.

Test of pDestTol2pA destination vector. Transposase greatly increases frequency and level of enhanced green fluorescent protein (EGFP) expression. A,B: DNA (30 pg) for pDestTol2pA; bactin2:EGFPCAAX-polyA was injected at the one-cell stage either without (A) or with (B) 25 pg of transposase RNA. Images were taken at 24 hours postfertilization.

To test pDestTol2CG, we generated pDestTol2CG; hsp70:mCherryCAAX-polyA. To confirm that the Tol2 ends were functional, we injected 25 pg of DNA with or without 25 pg of transposase RNA, then screened at 30 hpf for embryos that showed any EGFP fluorescence in the heart (Fig. 4A). With DNA alone, only 12/44 embryos (27.3%) displayed green hearts; upon RNA coinjection, this fraction increased to 115/153 (75.2%). Furthermore, hearts of embryos coinjected with transposase RNA generally expressed EGFP more uniformly and more brightly. Next, we tested whether the hsp70 and cmlc2 enhancers would interfere (i.e., cross-activate transcription of the other coding sequence). From embryos coinjected with transposase RNA, we selected those displaying EGFP in the heart at 30 hpf, heat-shocked half at 37°C for 1 hr, then screened for red fluorescence at 48 hpf. One hundred percent of heat-shocked embryos displayed widespread, bright red fluorescence with no ectopic EGFP expression (Fig. 4B–B′″,D–D′″; 52/52 embryos). Without heat shock, no embryos (0/54) displayed the bright mCherry fluorescence seen in heat-shocked embryos (Fig. 4C–C′″,E–E′″); however, approximately half (28/54, 51.9%) did display weak, diffuse red fluorescence that we attribute to “leakiness” or constitutive expression from the hsp70 promoter element (Fig. 4E″, arrowhead). In all embryos, EGFP fluorescence was still restricted to the heart (Fig. 4B′,C′,D′,E′). Because heat-shocked embryos were not globally green, the hsp70 element did not act as an enhancer with the cmlc2 promoter to drive EGFP expression. Conversely, because without heat-shock we saw no red hearts (Fig. 4C″,E″), the cmlc2 element did not act as an enhancer with the hsp70 promoter to drive mCherry. The lack of detectable interference between the two enhancer–promoter elements suggests that this design for destination vectors will be useful for providing visible transgenesis markers.
In addition, the widespread, fairly uniform expression of mCherryCAAX after heat shock strongly suggests that early integration of the transgene will allow for experiments in transient transgenics. pDestTol2CG2 gives similar results (S. Stringham and C.-B. Chien, unpublished results).
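The embryo counts reported above can be turned into percentages with a few lines of arithmetic. The following is a minimal sketch that only recomputes the fractions already given in the text; the dictionary labels and the `percent` helper are names chosen here for illustration.

```python
# Counts as reported in the text: (embryos showing the phenotype, embryos scored).
counts = {
    "green heart, DNA alone": (12, 44),
    "green heart, DNA + transposase RNA": (115, 153),
    "bright mCherry, heat shock": (52, 52),
    "bright mCherry, no heat shock": (0, 54),
    "weak diffuse red, no heat shock": (28, 54),
}


def percent(positive, total):
    """Return the percentage of positive embryos, rounded to one decimal place."""
    return round(100 * positive / total, 1)


summary = {label: percent(p, n) for label, (p, n) in counts.items()}

# Ratio of the green-heart fractions with and without transposase RNA.
fold_increase = (115 / 153) / (12 / 44)

for label, pct in summary.items():
    print(f"{label}: {pct}%")
print(f"fold increase with transposase RNA: {fold_increase:.2f}")
```

The ratio of the two green-heart fractions is a derived number, not one stated in the text, and it says nothing about expression level or uniformity, which were scored by eye.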

Finally, using multiple constructs made in pDestTol2pA/pA2 or pDestTol2CG/CG2, we have raised injected fish to adulthood and tested for germline transmission of the transgene. Our results are comparable to published results with Tol2 (Balciunas et al., 2006; Urasaki et al., 2006): >30% of injected fish show germline transmission of a fluorescently expressing transgene, with 0.6–30% of F1 offspring expressing fluorescence.


The goal of the Tol2kit is to establish a system for easily generating expression constructs and stable transgenics. We have assembled a set of verified tools in a modular system that will be useful for many experimental strategies. The multisite Gateway system, using recombination-based cloning of multiple DNA fragments in a directional manner, works efficiently, especially with the clear/opaque selection we have established. The Tol2 transposon ends in our destination vectors efficiently generate transient and stable transgenics, as shown previously (Kawakami et al., 2004b; Balciunas et al., 2006; Fisher et al., 2006b). We are currently planning the next generation of Tol2kit-compatible vectors, including destination vectors with shorter Tol2 ends. The existing destination vectors have ∼500 bp from the 5′ and 3′ ends of Tol2, including fragments of the first and fourth exons encoding the transposase mRNA. These ends can be reduced to 200 bp and 150 bp, respectively, while still yielding efficient integration (Urasaki et al., 2006). We are also planning destination vectors using other transposons, such as Sleeping Beauty, which also integrates efficiently in zebrafish (Davidson et al., 2003). This will make it possible to generate a stable zebrafish transgenic line using one transposon, and then use another transposon to perform transient transgenic experiments without fear of mobilizing the initial transgenic construct. In addition, the Tol2 transposon integrates efficiently in other vertebrate systems, including Xenopus laevis and Xenopus (Silurana) tropicalis (Kawakami et al., 2004a; Hamlet et al., 2006), chicken (Sato et al., 2007), and mammalian (Koga et al., 2003; Kawakami and Noda, 2004) cell culture. That the Tol2 transposon works efficiently in in vitro cell culture makes the Tol2kit a viable option for generating stable cell lines. The Tol2 transposon may also be useful in other systems, including nonvertebrates. 
Finally, if the Tol2 ends are not required, all of the Tol2kit entry clones are compatible with the pDest R4-R3 destination vector from Invitrogen.

We have generated several 3′ entry clones using the EMCV IRES, and confirmed that they are functional. These constructs should be valuable for marking cells, and the different versions with varied subcellular localization should allow simultaneous detection of multiple constructs. In addition, the IRES-EGFPCAAXpA cassette is functional in Xenopus laevis (Michael Levin, personal communication), indicating that these expression constructs will be useful in organisms other than zebrafish. The EMCV IRES mRNA sequence contains 12 AUG codons and is thought to form a complex secondary structure that somehow interacts with the ribosomal machinery to reinitiate translation at the 11th AUG (Kaminski et al., 1990). The EMCV IRES we have used is nearly identical to the “preferred IRES” sequence described by Bochkov and Palmenberg, beginning before the poly-C tract and extending to the 12th AUG (Bochkov and Palmenberg, 2006), where we have placed the start codon of the EGFP or EGFP fusion. The only significant difference in our IRES sequence from the native EMCV sequence is that the A6 bifurcation loop (AAAAAA) has been replaced by AAAAAAA (A7), a change that attenuates translation in mammalian cell culture (Bochkov and Palmenberg, 2006). Reverting this A7 to A6 may therefore improve the IRES efficiency in zebrafish.

When designing new IRES constructs, it is important to keep three points in mind. First, the upstream cistron should be terminated by a stop codon to prevent readthrough into the IRES sequence (although see Houdebine and Attal, 1999). Second, when building new IRES constructs, the second cistron must be kept in frame with the 11th AUG of the IRES (as in all of our p3E-IRES constructs). Finally, the second cistron must tolerate a small N-terminal fusion, as the 11th AUG is used as the start codon, adding Met-Ala-Thr-Thr to the start of the protein.
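The first two design points can be checked directly on a planned sequence. The sketch below is illustrative only: the `LEADER` codons are one possible encoding of Met-Ala-Thr-Thr assumed for this example (not copied from a Tol2kit clone), and the function names and the `linker` argument are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Illustrative codons for Met-Ala-Thr-Thr, the extension added when translation
# starts at the 11th AUG. Assumed for this sketch, not taken from a Tol2kit clone.
LEADER = "ATGGCCACAACC"
LEADER_CODON_TABLE = {"ATG": "M", "GCC": "A", "ACA": "T", "ACC": "T"}


def codons(seq):
    """Split a DNA sequence into complete triplets, reading from position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def leader_peptide():
    """Translate the illustrative leader into a one-letter peptide string."""
    return "".join(LEADER_CODON_TABLE[c] for c in codons(LEADER))


def check_ires_design(upstream_cds, linker):
    """Return a list of problems for the sequence-checkable design points.

    upstream_cds: first cistron, from its ATG through its intended stop codon.
    linker: bases placed between the leader and the second cistron's first codon.
    """
    problems = []
    upstream = codons(upstream_cds)
    # Point 1: the upstream cistron should be terminated by a stop codon.
    if len(upstream_cds) % 3 != 0 or not upstream or upstream[-1] not in STOP_CODONS:
        problems.append("upstream cistron does not end in an in-frame stop codon")
    # Point 2: the second cistron must remain in frame with the 11th AUG.
    if len(linker) % 3 != 0:
        problems.append("second cistron is out of frame with the 11th AUG")
    elif any(c in STOP_CODONS for c in codons(linker)):
        problems.append("linker places a stop codon before the second cistron")
    return problems


print(leader_peptide())
print(check_ires_design("ATGGGATAA", ""))
print(check_ires_design("ATGGGA", "G"))
```

The third point, whether the second cistron tolerates the N-terminal extension, depends on the protein and cannot be decided from the sequence alone.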

We have also generated a set of equivalent IRES constructs driving mCherry, but have not been able to visualize the mCherry protein, likely for three reasons: the relative dimness of mCherry compared with EGFP, the low expression level driven by the IRES, and suboptimal imaging using 543-nm excitation, far from the peak of mCherry excitation at 587 nm (Shaner et al., 2004).

We have generated destination vectors using a cmlc2:EGFP-pA expression cassette, which drives expression of EGFP specifically in the heart. We are currently planning to generate destination vectors with the cmlc2 promoter driving other reporters: mCherry, and EGFP and mCherry variants with specific subcellular localization. This will make it easy to generate double transgenics simply by identifying embryos with hearts containing, for example, red nuclei and green membranes. One caveat is that, to date, we have only tested the cmlc2 promoter with hsp70 or with UAS-driven reporters. There may be interference with or repression of weaker, tissue-specific promoters (N. Lawson, personal communication). In these cases, or when the heart itself is of interest, other marker transgenes could be used, such as gamma-crystallin:EGFP-pA (Davidson et al., 2003). Having multiple transgene markers will yield flexibility in experimental strategies.

Finally, a major goal of developing the Tol2kit is to encourage sharing of reagents throughout the community. The ease of Gateway cloning will make it exceptionally easy to exchange useful clones. We are distributing the Tol2kit constructs freely and will deposit all the entry clones with the Zebrafish International Resource Center (ZIRC). The modular nature of the system makes it easily extendable, and we encourage others to also generate and exchange compatible constructs. In particular, we note that the Tol2kit is compatible with many of the Gateway-based clones described by Villefranc and Lawson (2007). Sequences, maps, MTA information, and other details are provided on a Web site, where we will also add information for community-generated constructs.

Gateway cloning has been previously used to facilitate large-scale, genome-wide projects in several organisms when combined with libraries of regulatory or coding sequences, including Caenorhabditis elegans open reading frame (ORF) and promoter collections (Walhout et al., 2000; Dupuy et al., 2004), and ORF collections for human (Lamesch et al., 2007), Schizosaccharomyces pombe (Matsuyama et al., 2006), and Arabidopsis thaliana (Gong et al., 2004). The Tol2kit will open the way for such projects in zebrafish.


Fish Handling

All fish were of the Tü or TL strains. Embryos were raised at 28.5°C and staged according to time postfertilization and morphology (Kimmel et al., 1995).

Plasmid Construction

Entry clones.

Entry vectors were generated as described in the Invitrogen Multisite Gateway manual, with minor modifications. PCR was performed using primers that add att sites onto the ends of DNA fragments, using the GeneAmp XL PCR Kit (Applied Biosystems), Phusion (NEB), or Platinum Pfx (Invitrogen). For 5′ entry clones (using pDONR P4-P1R), the forward primer contained an attB4 site (GGGGACAACTTTGTATAGAAAAGTTGNN, followed by template-specific sequence); the reverse primer contained a reverse attB1 site (GGGGACTGCTTTTTTGTACAAACTTGN, followed by template-specific sequence). For middle entry clones (using pDONR 221), the forward primer contained an attB1 site (GGGGACAAGTTTGTACAAAAAAGCAGGCTNN, followed by template-specific sequence); the reverse primer contained a reverse attB2 site (GGGGACCACTTTGTACAAGAAAGCTGGGTN, followed by template-specific sequence). For 3′ entry clones (using pDONR P2R-P3), the forward primer contained an attB2 site (GGGGACAGCTTTCTTGTACAAAGTGGNN, followed by template-specific sequence); the reverse primer contained a reverse attB3 site (GGGGACAACTTTGTATAATAAAGTTGN, followed by template-specific sequence). In all cases, additional bases (N) were included to maintain the standard Gateway reading frame.
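The primer layout described above (att site, frame-maintaining bases, then template-specific sequence) can be sketched as a small helper. The att-site sequences are those listed in the text. The function names, the example template sequences, and the spacer bases are hypothetical, and the actual spacer bases have to be chosen for each insert to give the desired reading frame.

```python
# (entry clone type, primer direction) -> (att-site sequence, number of N bases)
ATT_SITES = {
    ("5'", "forward"): ("GGGGACAACTTTGTATAGAAAAGTTG", 2),
    ("5'", "reverse"): ("GGGGACTGCTTTTTTGTACAAACTTG", 1),
    ("middle", "forward"): ("GGGGACAAGTTTGTACAAAAAAGCAGGCT", 2),
    ("middle", "reverse"): ("GGGGACCACTTTGTACAAGAAAGCTGGGT", 1),
    ("3'", "forward"): ("GGGGACAGCTTTCTTGTACAAAGTGG", 2),
    ("3'", "reverse"): ("GGGGACAACTTTGTATAATAAAGTTG", 1),
}
VALID_BASES = set("ACGT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_primer(clone_type, direction, spacer, template_specific):
    """Concatenate att site, frame-maintaining spacer, and template-specific bases."""
    att_site, spacer_length = ATT_SITES[(clone_type, direction)]
    spacer = spacer.upper()
    template_specific = template_specific.upper()
    if len(spacer) != spacer_length:
        raise ValueError(f"expected {spacer_length} spacer base(s), got {len(spacer)}")
    if not set(spacer + template_specific) <= VALID_BASES:
        raise ValueError("primer parts may contain only A, C, G, and T")
    return att_site + spacer + template_specific


# Hypothetical 18-base stretches from the two ends of a coding sequence.
first_bases = "ATGGTGAGCAAGGGCGAG"
last_bases = "GACGAGCTGTACAAGTAA"

# Spacer bases here are arbitrary placeholders for the N positions.
forward = build_primer("middle", "forward", "TC", first_bases)
reverse = build_primer("middle", "reverse", "C", reverse_complement(last_bases))
print(forward)
print(reverse)
```

The reverse primer takes the reverse complement of the template's 3′ end, so that both primers are written 5′ to 3′.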

PCR products were purified using the Qiaquick gel extraction kit (Qiagen). Note that purified PCR products were used immediately in BP reactions to generate entry vectors; storage even at −20°C significantly decreased the efficiency of recombination, possibly due to contaminating exonuclease activity. For the BP reaction, equimolar amounts (50–100 fmol) of purified PCR product and donor vector were combined with TE and BP clonase II enzyme mix (Invitrogen) in a final volume of 10 μl; half reactions (5 μl) also work. Reactions were incubated at room temperature for at least 2 hr and treated with proteinase K, and 2 μl were transformed using subcloning efficiency DH5α cells (Invitrogen).
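Setting up equimolar reactions requires converting femtomoles to a mass of DNA. A minimal sketch of that arithmetic follows, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA. The fragment length and stock concentration in the example are hypothetical.

```python
# Approximate average mass of one base pair of double-stranded DNA,
# expressed as femtograms per femtomole (numerically equal to g/mol). Assumed value.
FG_PER_FMOL_PER_BP = 660.0


def fmol_to_ng(fmol, length_bp):
    """Convert an amount of double-stranded DNA from fmol to ng."""
    return fmol * length_bp * FG_PER_FMOL_PER_BP / 1e6


def volume_ul(fmol, length_bp, concentration_ng_per_ul):
    """Volume of a DNA stock needed to deliver the requested fmol."""
    return fmol_to_ng(fmol, length_bp) / concentration_ng_per_ul


# Hypothetical example: 50 fmol of a 1000-bp PCR product from a 20 ng/ul stock.
mass_ng = fmol_to_ng(50, 1000)
stock_ul = volume_ul(50, 1000, 20.0)
print(f"{mass_ng:.1f} ng, {stock_ul:.2f} ul")
```

The same conversion applies to the donor vector, using its own length in base pairs.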

In construction of entry clones, the following plasmids were used as source constructs. For pCS2FA-transposase, the Tol2 transposase ORF was amplified from pCSTP (Kawakami et al., 2004b). For 5′ entry clones, bactin2 was amplified from a 5.3-kb genomic clone (gift of K. Poss); h2afx was amplified from genomic DNA based on genome assembly data; CMV/SP6 from pCS2+; hsp70 from pHSP70/4 (gift of J. Warren); and the Bluescript MCS from pBSII SK+. The Fse-Asc cassette was generated by annealing DNA oligos, and the UAS insert was conventionally cloned from pBUAS-E1b-RFP (gift of R. Köster). For middle entry clones, EGFP, EGFPCAAX, and nlsEGFP were amplified from the corresponding pCS2+ clones; mCherry from pKR5-mCherry (Gray et al., 2006); and Gal4VP16 from a clone from R. Köster. mCherryCAAX and nlsmCherry were generated by adding the relevant localization sequences by PCR, and H2A.F/Z was amplified from cDNA and PCR-fused to mCherry. For 3′ entry clones, the SV40 polyA was amplified from pCS2+; MTpA from pCS2+MT; EGFPpA and mCherrypA from pCS2+ constructs; and the IRES constructs were generated by means of PCR fusion of the EMCV IRES (from hspIG, generated by F. Maderspacher from pIRES2-EGFP, Clontech) to EGFPpA (and variants) in pCS2+. All entry clones were tested by diagnostic digest followed by sequencing; those clones found to harbor point mutations were always tested and verified to be functional.

Expression constructs.

Multisite recombination reactions were performed as described in the Invitrogen Multisite Gateway Manual, with minor modifications. Equimolar amounts (20 fmol) of destination vector and 5′, middle, and 3′ entry vectors were combined with TE and LR Clonase II Plus Enzyme Mix in a final volume of 10 μl; half reactions (5 μl) also work. Reactions were incubated at room temperature overnight and treated with proteinase K, and 3 μl were transformed using One Shot Top10 Chemically Competent E. coli.
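Because each expression construct combines one destination vector with one 5′, one middle, and one 3′ entry clone, the set of possible constructs can be represented as a simple product of component lists. The sketch below is illustrative: the class and function names are hypothetical, and only the naming pattern (as in "pDestTol2CG; hsp70: mCherryCAAX-polyA") and the component names are taken from the text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExpressionConstruct:
    destination: str  # destination vector
    promoter: str     # insert of the 5' entry clone
    coding: str       # insert of the middle entry clone
    tag: str          # insert of the 3' entry clone

    def name(self):
        """Format the construct name in the style used in the text."""
        return f"{self.destination}; {self.promoter}: {self.coding}-{self.tag}"


def enumerate_constructs(destinations, promoters, codings, tags):
    """List every combination of one component from each category."""
    return [
        ExpressionConstruct(d, p, c, t)
        for d, p, c, t in product(destinations, promoters, codings, tags)
    ]


constructs = enumerate_constructs(
    ["pDestTol2pA", "pDestTol2CG"],
    ["bactin2", "hsp70"],
    ["EGFPCAAX", "mCherryCAAX"],
    ["polyA"],
)
for construct in constructs:
    print(construct.name())
```

Two destination vectors, two promoters, two coding sequences, and one 3′ element give eight possible constructs, each requiring a single recombination reaction.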

Destination vectors.

To generate the Tol2 destination vector pDestTol2pA, the R4-R3 cassette (comprising the attR4 recombination site, ccdB gene, chloramphenicol resistance gene, and attR3 recombination site) was amplified by PCR from the original pDESTR4-R3 vector (Invitrogen) using the following primers: M13 REV/XHOI, 5′-ACCGGTCTCGAGCAGGAAACAGCTATGAC-3′; and M13 FWD/CLAI/KPNI, 5′-AGCCGTATCGATGGTACCGTAAAACGACGGCCAG-3′. The PCR product was subcloned into pCRII using the TOPO TA cloning kit (Invitrogen) and verified by sequencing. Subsequently, the insert was subcloned into the Tol2 vector pT2KXIGDin (K. Kawakami) using ClaI/XhoI, resulting in the Tol2 destination vector pDestTol2pA.
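The restriction sites named in the primer labels can be located in the primer sequences with a simple string search. This is a minimal sketch: the primer sequences are those given above, the recognition sequences are the standard ones for XhoI, ClaI, and KpnI, and the `find_sites` helper is a name introduced here for illustration.

```python
# Standard recognition sequences (all three are palindromic).
RESTRICTION_SITES = {"XhoI": "CTCGAG", "ClaI": "ATCGAT", "KpnI": "GGTACC"}

# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "M13 REV/XHOI": "ACCGGTCTCGAGCAGGAAACAGCTATGAC",
    "M13 FWD/CLAI/KPNI": "AGCCGTATCGATGGTACCGTAAAACGACGGCCAG",
}


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based start index."""
    primer = primer.upper()
    return {
        enzyme: primer.find(site)
        for enzyme, site in RESTRICTION_SITES.items()
        if site in primer
    }


for label, sequence in PRIMERS.items():
    print(label, find_sites(sequence))
```

Each primer carries its restriction site or sites between a short 5′ extension and the M13 priming sequence, consistent with the ClaI/XhoI subcloning step.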

pDestTol2CG was generated by PCR-amplifying the cmlc2:egfp-pA cassette from cmlc2-pIE-EGFP (gift of D. Yelon) and cloning into the unique BglII site of pDestTol2pA. pDestTol2pA was cut with AfeI/DraIII and religated to yield pDestTol2pA2. pDestTol2CG was cut with AfeI/NgoMIV and religated to yield pDestTol2CG2.

RNA Synthesis

Transposase RNA was generated using the pCS2FA-transposase plasmid as a template. DNA was linearized with NotI and purified using the Qiagen PCR Purification Kit. Capped RNA synthesis was performed using the mMessage mMachine SP6 kit (Ambion). RNA was purified using the Qiagen RNeasy Mini Kit and subsequently ethanol precipitated.


Expression constructs were tested by injection of plasmid DNA, with or without transposase RNA, into the cell at the one-cell stage. For verification of the IRES constructs (Fig. 2), approximately 75 pg of DNA was injected per embryo. For tests of the Tol2 transposase (Figs. 3, 4), either 25 or 30 pg of DNA was injected (see Figure Legends for details) ± 25 pg of transposase RNA.

Heat Shock

For the test of pDestTol2CG; hsp70: mCherryCAAX-polyA (Fig. 4), embryos were incubated at room temperature for 45–60 min, then transferred to a tube of E3 embryo medium prewarmed to 37°C and heat-shocked for 1 hr.


Confocal imaging was performed on an Olympus FV1000. Embryos were mounted in 1.5% low melting point agarose dissolved in E2 medium with gentamicin. Widefield images were captured on an Olympus SZX-12 stereoscope.


Thanks to Koichi Kawakami for Tol2 components. We also thank Rob Weimer, Debbie Yelon, Ken Poss, Jim Warren, Reinhard Köster, Henry Roehl, Dave Turner, and Robert Davis for generous gifts of plasmids and especially thank Nathan Lawson and Jochen Wittbrodt for help and advice. Thanks to Nathan Lawson and Jacques Villefranc for critical reading of the manuscript. Thanks to Josh Bonkowsky and Sydney Stringham for testing newer Tol2kit components, and the rest of the Chien lab for comments and support throughout this work. K.M.K. was supported by an NIH institutional training grant and a postdoctoral fellowship from the American Cancer Society. C.G., J.P., and J.J.Y. were supported by grants from the NIH. M.E.H. was supported by an NIH institutional training grant. C.-B.C. was supported by a grant from the Dana Foundation and the NIH.