A recombineering-based gene tagging system for Arabidopsis


(fax 01 919 515 3355; e-mail jmalonso@ncsu.edu).


One of the most information-rich aspects of gene functional studies is characterization of gene expression profiles at cellular resolution, and subcellular localization of the corresponding proteins. These studies require visualization of the endogenous gene products using specific antibodies, or, more commonly, generation of whole-gene translational fusions with a reporter gene such as a fluorescent protein. To facilitate the generation of such translational fusions and to ensure that all cis-regulatory sequences are included, we have used a bacterial homologous recombination system (recombineering) to insert fluorescent protein tags into genes of interest harbored by transformation-competent bacterial artificial chromosomes (TACs). This approach has several advantages compared to other classical strategies. First, the researcher does not have to guess what the regulatory sequences of a gene are, as tens of thousands of base pairs flanking the gene of interest can be included in the construct. Second, because the genes of interest are not amplified by PCR, there are practically no limits to the size of a gene that can be tagged. Third, there are no restrictions on the location in which the fluorescent protein can be inserted, as the position is determined by sequence homology with the recombination primers. Finally, all of the required strains and TAC clones are publicly available, and the experimental procedures described here are simple and robust. Thus, we suggest that recombineering-based gene tagging should be the gold standard for gene expression studies in Arabidopsis.


Determining a gene’s function in all its complexity is a challenging endeavor that requires integration of several types of information and the use of a variety of experimental approaches (Alonso and Ecker, 2006). Key information critical to understanding the role of a gene in a biological process is its detailed spatial and temporal gene expression profile, as well as the subcellular localization of the corresponding protein. In organisms in which homologous recombination is not practical, these types of studies typically involve generation of translational fusions between the genes of interest and a reporter gene, such as GFP. Although this is a commonly used approach, there is always the concern that expression of these chimeric gene constructs may not perfectly reflect expression of the corresponding endogenous genes (Taylor, 1997). Possible discrepancies between expression of the transgene and that of the native gene are usually the results of missing regulatory sequences in the former (Mizukami and Ma, 1997; Sieburth and Meyerowitz, 1997). Thus, it is paramount that all regulatory sequences of a gene are included in these chimeric constructs. In Arabidopsis, it is generally accepted that most regulatory sequences of a typical gene are present in the 2–3 kb immediately upstream of the ATG and the 0.5–1 kb downstream of the stop codon (Tian et al., 2004). However, in the vast majority of the cases, the actual regulatory sequences of a gene are unknown, and the general guidelines do not necessarily apply to the gene of interest. Thus, it is impossible to know a priori whether or not all regulatory sequences have been captured in a reporter gene construct. A generally accepted means of confirming that the expression pattern of a transgene does indeed mimic that of the native gene is to complement the knockout mutant with the chimeric construct. However, these types of experiments are not always possible or informative. 
For example, the mutant of a gene may not be available or may not have obvious phenotypes. Even when mutant complementation experiments are performed, they may not be conclusive. For example, using a whole-gene–GFP fusion construct (TAA1p:GFP-TAA1), we recently showed that WEI8/TAA1 is expressed in the quiescent center of Arabidopsis roots (Stepanova et al., 2008). However, Yamada et al. (2009) showed that a translational fusion of the TAA1 cDNA with GUS driven by the TAA1 promoter is not expressed in the quiescent center. Importantly, both constructs were reported to complement the phenotypic defects of the taa1 mutant (Stepanova et al., 2008; Yamada et al., 2009). Thus, genetic complementation cannot be taken as conclusive proof that the observed expression pattern corresponds to that of the endogenous gene. Obviously, the ideal way to overcome these uncertainties would be to tag the endogenous gene using homologous recombination strategies. Unfortunately, homologous recombination in Arabidopsis, as well as in many other model organisms, is not efficient enough to make this a practical option (Alonso and Ecker, 2006). In the absence of efficient homologous recombination, a desirable alternative would be to tag the gene of interest in the genomic context of large BACs or fosmid clones. Inducible homologous recombination in bacteria, or ‘recombineering’ (Copeland et al., 2001), has been shown to be a very powerful approach for precise manipulation of DNA sequences contained in large BAC clones (Warming et al., 2005; Ciotta et al., 2011), and therefore could be used to generate gene–reporter fusions in a pseudo-chromosomal context. Recent reports indicate that this methodology can be efficiently utilized to examine gene expression in model organisms such as Drosophila (Venken et al., 2008; Ejsmont et al., 2009), Caenorhabditis elegans (Sarov et al., 2006; Tursun et al., 2009), humans and mouse (Poser et al., 2008).

Recombineering makes use of the bacteriophage recombination machinery to transiently activate homologous recombination in bacteria. Although several recombineering systems have been developed (Court et al., 2002), we have adopted the λ-Red system, in which expression of the bacteriophage exo, bet and gam genes is under the tight control of a temperature-sensitive λ repressor allele cI857 (Yu et al., 2000). This repressor functions normally at 32°C, but becomes inactive at 42°C, providing a simple and very effective way to precisely control expression of this recombineering function in Escherichia coli.

In addition to the recombineering strains, a genomic clone containing the gene of interest with all its regulatory sequences is also required. For this purpose, we utilized the JAtY library (JAtY, http://orders2.genome-enterprise.com/libraries/arabidopsis/jaty.html). This transformation-competent artificial chromosome (TAC) library was generated using genomic DNA from the standard Arabidopsis accession Col, and has a mean insert size of approximately 68 kb. Importantly, more than 8000 clones have been end-sequenced, and therefore the gene content of each clone is known. The JAtY library covers >90% of the Arabidopsis genes. It is also worth pointing out that the vector utilized in this library (pYLTAC17) does not contain sequences (selection markers and regulatory sequences) that are found in the most popular T-DNA mutant collections. This not only facilitates complementation experiments with an engineered JAtY TAC clone using publicly available homozygous T-DNA mutants, but also avoids potential silencing problems that could arise when combining several T-DNAs with shared sequences in a single plant. Finally, pYLTAC17 is a binary vector that allows use of Agrobacterium-mediated transformation for transferring the genomic DNAs contained in this library back into the plant genome (Liu et al., 2002). A potential drawback of using the JAtY library is the reported variable plant transformation efficiency. A robust and simple transformation procedure that achieves high transformation efficiency with JAtY clones is described here.

Recombineering-based gene tagging using the λ-Red system and the JAtY TAC library appears to be a good choice for generating the whole-gene GFP fusions required for the in-depth spatial-temporal gene expression studies. However, before adopting this system as a general tagging strategy in Arabidopsis, it is necessary to systematically evaluate the qualities of the system experimentally. Herein, we examine the efficiency and fidelity of each step in the recombineering and transformation procedure. Our results indicate that combination of the λ-Red recombineering system with the end-sequenced JAtY library can be utilized not only for generation of whole-gene reporter fusion constructs, but also for introducing various sequence modifications (including point mutations, deletions and replacements) in the pseudo-genomic context of a large TAC clone.

Results and Discussion

Overview of the general recombineering procedure

To illustrate the recombineering procedure, the protocol used to insert a GFP tag at the C-terminus of a gene is described (Figure 1). The first step is identification of a TAC clone in which the gene of interest is situated approximately in the middle of the clone (step 1). This can be easily achieved using the ATIDB browser (http://atidb.org/cgi-perl/gbrowse/atibrowse/). We recommend using the smallest TAC clone (which is easier to transform into plants, see below) in which the gene of interest lies as far as possible from the nearest TAC end. Once the TAC clone has been identified, it is transferred to a recombineering E. coli strain such as SW102 (step 2). Next, the position at which the tag is going to be inserted in the gene needs to be determined, and this information is used to design a pair of long primers. Each primer consists of two parts. The 5′ region of the forward primer, for example, includes 50 nucleotides identical to the 50 nucleotides found in the gene of interest immediately upstream of the point at which GFP will be inserted. The 3′ region of the forward primer consists of approximately 21 nucleotides corresponding to the 5′ end of the recombineering cassette (i.e. the galK cassette or the GFP cassette). Similarly, the 5′ region of the reverse primer is complementary to the 50 nucleotides just downstream of the point at which GFP is to be inserted, and the 3′ region matches the final approximately 21 nucleotides of the recombineering cassette. Using these primers, the recombineering cassette (for example, the galK cassette) is PCR-amplified, resulting in a DNA product that harbors the galK cassette flanked by two 50-nucleotide stretches that are homologous to the gene of interest. Importantly, we have added 5′ (linker G) and 3′ (linker A) adapter sequences to both the galK and GFP cassettes (Figure 1 and Experimental procedures). 
These sequences serve two purposes: they allow use of the same pair of long primers to add gene-specific sequences to both the galK and GFP cassettes, and, at the same time, provide flexible poly-glycine and poly-alanine arms between the protein of interest and the GFP tag (Tian et al., 2004). This linear DNA is electroporated into recombineering-competent cells containing the TAC harboring the gene of interest, and the recombination events are selected in growth medium in which only bacteria that have incorporated the galK cassette can proliferate (step 3). Using a similar PCR strategy, the same 50 nucleotides of gene-specific sequences can be incorporated to each side of the GFP cassette. This GFP cassette can then be used to replace galK in the second recombination event, resulting in a final product in which the gene of interest (contained in a large TAC clone) has been fused to GFP (step 4). It is important to mention that galK can be used both as a positive selectable marker in the presence of galactose and as a counter-selectable marker in the presence of 2-deoxygalactose. The modified TAC clone is then transferred to a recA-deficient Agrobacterium tumefaciens strain (step 5) that is then used to mediate incorporation of this ‘recombineered’ TAC clone into the genome of the plant (step 6).
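The primer design rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 0-based `insert_pos` convention and the short cassette-end sequences used in the usage note are hypothetical placeholders, and the actual linker and cassette sequences are those given in Experimental procedures, not the ones shown here.

```python
from typing import Tuple

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(genomic: str, insert_pos: int,
                   cassette_5p: str, cassette_3p: str,
                   arm: int = 50) -> Tuple[str, str]:
    """Return (forward, reverse) recombineering primers, both written 5'->3'.

    genomic     : top-strand sequence spanning the insertion site
    insert_pos  : 0-based index; the tag is inserted immediately before it
    cassette_5p : first ~21 nt of the cassette (top strand); placeholder input
    cassette_3p : last ~21 nt of the cassette (top strand); placeholder input
    arm         : length of each gene-specific homology arm (50 nt in the text)
    """
    if insert_pos < arm or insert_pos + arm > len(genomic):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream = genomic[insert_pos - arm:insert_pos]
    downstream = genomic[insert_pos:insert_pos + arm]
    # Forward primer: gene-specific arm (5' part) + start of the cassette (3' part).
    forward = upstream + cassette_5p
    # Reverse primer: complementary to the downstream arm (5' part), followed by
    # the reverse complement of the cassette end (3' part).
    reverse = reverse_complement(downstream) + reverse_complement(cassette_3p)
    return forward, reverse
```

For example, `design_primers(seq, pos, "GGAGGT", "GCTGCA")` (with made-up six-nucleotide cassette ends) returns two primers whose first 50 nucleotides are the gene-specific arms. Because the cassette-specific 3′ parts anneal to the shared linker sequences, the same primer pair can be reused for both the galK and the GFP cassette, as described in the text.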

Figure 1.

 Schematic representation of the recombineering procedure.
The steps involved are: (1) TAC clone selection, (2) transfer of the JAtY clone to SW102, (3) recombination I and selection of Gal+ colonies, (4) recombination II, counter-selection of Gal+ and sequence verification, (5) transfer to Agrobacterium, and (6) plant transformation. The panel on the lower left shows the primer design. Red indicates gene-specific sequences; yellow and green indicate linker sequences.

Efficiency and fidelity of the process

Although the effectiveness of the λ-Red system for modification of large DNA clones has been thoroughly examined using mouse BAC clones (Warming et al., 2005), we wished to re-examine the efficiency and fidelity of the system using Arabidopsis JAtY TAC clones. Thus, to determine the suitability of recombineering as a general tool for gene function studies in Arabidopsis, we selected 40 genes, focusing primarily on the Aux/IAA family of auxin transcriptional regulators. In 39 of 40 cases, we were able to find at least one TAC clone containing the gene of interest among the available end-sequenced JAtY clones (Table 1). Next, we examined the efficiency and fidelity of the six steps described above using JAtY clones with insert sizes ranging from 27 to 104 kb (Table 1).
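The clone selection step (step 1) can also be expressed programmatically once clone end coordinates are known from the end sequences. The following Python sketch is one possible reading of the recommendation to use a small clone in which the gene is well separated from both TAC ends; the `Clone` record, the coordinate convention and the 10 kb minimum-flank threshold are assumptions made for illustration and are not part of the published procedure.

```python
from typing import List, NamedTuple, Optional


class Clone(NamedTuple):
    name: str
    start: int  # genomic coordinate of the first base of the insert
    end: int    # genomic coordinate of the last base of the insert


def min_flank(clone: Clone, gene_start: int, gene_end: int) -> int:
    """Distance (bp) from the gene to the nearest clone end.

    The value is negative when the gene is not fully contained in the clone.
    """
    return min(gene_start - clone.start, clone.end - gene_end)


def pick_clone(clones: List[Clone], gene_start: int, gene_end: int,
               min_flank_bp: int = 10_000) -> Optional[Clone]:
    """Pick the smallest clone that leaves at least min_flank_bp on both sides.

    min_flank_bp is a hypothetical threshold; ties in insert size are broken
    in favour of the clone with the larger minimum flank.
    """
    candidates = [c for c in clones
                  if min_flank(c, gene_start, gene_end) >= min_flank_bp]
    if not candidates:
        return None
    return min(candidates,
               key=lambda c: (c.end - c.start,
                              -min_flank(c, gene_start, gene_end)))
```

Returning `None` corresponds to the situation reported for IAA18, for which no suitable end-sequenced clone was available.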

Table 1.   Recombination constructs generated
| Name | Gene | JAtY | Size (kb) | Integrity 1 | Efficiency 2 (%) | Integrity 2 | Sequence |
|---|---|---|---|---|---|---|---|
| ATHB52 | At5g53980 | 49E13 | 74 | ND | ND | 13 of 13 | 1 |
| TAA1 | At1g70560 | 50P13 | 82 | ND | 75 | 2 of 3 | 1 |
| TAR1 | At1g23320 | 70I13 | 89 | ND | ND | 3 of 3 | 1 |
| TAR2 | At4g24670 | 73P23 | 104 | ND | ND | 3 of 3 | 1 |
| IAA4 | At5g43700 | 57M08 | 66 | 5 of 5 | 87 | 7 of 7 | 1 |
| IAA5 | At1g15580 | 61G08 | 27 | 5 of 5 | 96 | 4 of 4 | 1 |
| IAA6/SHY1 | At1g52830 | 74P09 | 38 | 5 of 5 | 87 | 4 of 4 | 1 |
| IAA18 | At1g51950 | No JAtY clone available | | | | | |
| IAA19/MSG2 | At3g15540 | 52P18 | 79 | 5 of 5 | 83 | 4 of 4 | 1 |
| IAA20 | At2g46990 | 60K12 | 75 | | 13 | ND | 1 |
| IAA26/PAP1 | At3g16500 | 66K03 | 47 | 5 of 5 | 91 | 4 of 4 | 1 |
| IAA32 | At2g01200 | 67N06 | 57 | 5 of 5 | 17 | 4 of 4 | 1 |

JAtY, name of the TAC clone; size, length of Arabidopsis genomic DNA present in a particular JAtY clone (an estimate based on the end sequences of the JAtY clone); integrity 1, integrity of the clones after transferring them from DH10B to SW102 E. coli strains; efficiency 2, efficiency of the second recombination (replacement of the galK by the GFP sequences); integrity 2, integrity of the clones after the second recombination; sequence, number of clones that needed to be sequenced to find one with no mutations.

ND, not determined.

The desired TAC clones were transferred from DH10B to SW102 E. coli strain. Using the procedure described in Experimental procedures, every clone was easily transferred from DH10B to SW102, and later into Agrobacterium. Although we did not precisely quantify the E. coli transformation efficiency of this step, no obvious correlation between clone size and bacterial transformation efficiency was observed.

Gene-specific recombineering and test primers were designed as indicated in Experimental procedures, and the recombineering primers were used to add gene-specific sequences to the galK and GFP cassettes by PCR.

Next, we examined the efficiency of inserting galK into the C-termini of the 39 genes listed in Table 1. Recombination efficiency was estimated by selecting 16–32 galactose-positive colonies (Gal+) per clone, and testing for the presence of the galK gene by PCR. In most experiments, we found no false positives, with 100% of the Gal+ colonies having galK inserted in the desired location.

A similar approach was used to test the efficiency of replacing the galK gene by GFP (Gal− selection). As shown in Table 1, the efficiency of this step was much more variable, ranging from 4 to 100%. Low efficiencies were usually associated with sub-optimal competent cells or poor quality of the recombination cassette DNA. As expected in a counter-selection approach, some of the false-positive (Gal−) colonies correspond to deletions of the galK gene rather than its replacement by GFP (data not shown).

In summary, by examining over 1200 recombination events using 41 JAtY clones (for 39 genes), we concluded that the experimental procedure described here provides an overall recombination efficiency of approximately 55%.
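As a minimal illustration of how such efficiencies can be computed from colony PCR screens, the Python sketch below derives a per-clone efficiency and an efficiency pooled over several clones. The function names and the colony counts in the usage note are hypothetical, and the text does not specify whether the overall figure was obtained by pooling events or by averaging per-clone values; pooling is shown here only as one option.

```python
from typing import Iterable, Tuple


def recombination_efficiency(confirmed: int, screened: int) -> float:
    """Percentage of screened colonies carrying the expected recombination."""
    if screened <= 0:
        raise ValueError("at least one colony must be screened")
    if not 0 <= confirmed <= screened:
        raise ValueError("confirmed must lie between 0 and screened")
    return 100.0 * confirmed / screened


def pooled_efficiency(counts: Iterable[Tuple[int, int]]) -> float:
    """Efficiency over all events, given (confirmed, screened) pairs per clone."""
    counts = list(counts)
    confirmed = sum(c for c, _ in counts)
    screened = sum(s for _, s in counts)
    return recombination_efficiency(confirmed, screened)
```

With made-up counts of 12 of 16 colonies for one clone and 4 of 32 for another, the per-clone efficiencies are 75% and 12.5%, whereas the pooled value is 16 of 48, or about 33%. The two summaries differ whenever clones are screened to different depths.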

In addition to examining the efficiency and robustness of the procedure, we also wished to test its fidelity: in other words, whether or not the multiple transformation and recombination events result in undesirable alterations in these large DNA clones and whether GFP can be precisely inserted in the desired location. To test the fidelity of the system, two approaches were used. First, we examined the presence of local alterations at the recombination site. To do this, we sequenced the junctions between GFP and the gene of interest, as well as GFP itself. This was done for all 41 constructs. Precise and flawless GFP insertions were observed in 37 of the 41 clones initially sequenced. Of the four clones with mutations, three harbored a single nucleotide change in the region corresponding to the recombination primers, and were presumably the result of mistakes incorporated during synthesis of these long oligonucleotides (approximately 70-mers). The fourth mutation was found in the GFP itself and was probably due to a mistake introduced by the proof-reading polymerase during GFP amplification. In all four cases in which mutations were found, sequencing of an additional independent clone was sufficient to obtain a mistake-free construct. This indicates that, even when dealing with large numbers of genes, it should be relatively easy to find error-free clones.
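The practical consequence of the observed error rate can be made explicit with a short calculation. The Python sketch below estimates the chance of recovering at least one error-free clone when several independent clones are sequenced, using the 37 of 41 figure reported above. It assumes that errors arise independently in each clone, which is a simplification introduced here rather than a result of this study.

```python
def p_at_least_one_clean(n_sequenced: int, clean: int = 37, total: int = 41) -> float:
    """Probability that at least one of n_sequenced clones is error-free.

    clean/total is the observed fraction of error-free clones (37 of 41 in the
    text); independence between clones is an assumption of this sketch.
    """
    if n_sequenced < 0:
        raise ValueError("n_sequenced must be non-negative")
    p_error = 1.0 - clean / total
    return 1.0 - p_error ** n_sequenced


for n in (1, 2, 3):
    print(n, round(p_at_least_one_clean(n), 4))
```

Under these assumptions, sequencing one clone succeeds about 90% of the time and sequencing two clones about 99% of the time, in line with the observation that one additional clone was sufficient in all four cases.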

To test for the presence of alterations in other parts of these large TAC clones, six JAtY clones with a wide distribution of insert sizes (Table 1) were selected. TAC DNA for 4–13 independent clones for each of the selected genes was fingerprinted after: (i) transfer from DH10B to SW102 (Figure S1), (ii) the second recombination event (replacement of galK by GFP) (Figure 2), and (iii) transformation into Agrobacterium (Figure S2). No detectable alterations were observed in any of over 70 independent clones examined (Figures 2, S1 and S2, and data not shown).

Figure 2.

 The recombineering procedure does not induce unwanted alterations in the large TAC clones.
Fingerprint of six TAC clones before (lanes C) and after (lanes 1–4) insertion of GFP into the gene of interest. DNA from four independent GFP-tagged clones (lanes 1–4) for each of the six selected genes was digested with HindIII, and the fragments were separated by electrophoresis on a 1% agarose gel. Black arrows indicate the bands containing GFP that are absent in the original control (C) clones. The names of the tagged gene and the sizes of the corresponding JAtY clones are indicated.

Stability of large JAtY clones in Agrobacterium

An important factor that could compromise the usefulness of the system is the stability of the JAtY clones in Agrobacterium. Several studies have investigated this point using libraries from various plant species, with various clone sizes, vectors and Agrobacterium strains (Liu et al., 1999; Song et al., 2003), reaching conflicting conclusions. For example, in a comprehensive study using potato binary bacterial artificial chromosomes (BIBACs) and TACs, Song et al. (2003) concluded that clones larger than 100 kb suffer frequent DNA losses after transfer to Agrobacterium. This ‘instability’ of large clones was seen even when recA-deficient Agrobacterium strains were used. In a different study, Liu et al. (1999) found that only one of 35 randomly picked Arabidopsis TACs suffered deletions when transferred from DH10B to Agrobacterium. The size of the clones was not reported in the study by Liu et al., and therefore it was not possible to determine whether or not a direct correlation between size and instability in Agrobacterium exists. Based on the apparent discrepancies between studies, we decided to examine possible alterations of the JAtY clones during their transfer from E. coli into Agrobacterium. Six clones ranging in size from 27 to 79 kb were initially examined. As shown in Figure S2, we did not observe DNA alterations in any of these clones. To further explore the long-term stability of large JAtY clones in Agrobacterium, we grew the two largest JAtY clones (89 and 104 kb insert sizes) used in this study for approximately 48 generations, and then fingerprinted four independent clones per construct. As shown in Figure S3, even a clone of over 100 kb could be propagated in Agrobacterium for many generations without any detectable alteration. However, we cannot conclude that all JAtY clones are stable in Agrobacterium. In fact, we have observed that some JAtY clones are difficult to transfer into Agrobacterium. 
Although we do not know the reason for that, it does not seem to be related to the size of the clone. Finally, it is important to mention that, even in those cases where transfer of the TAC clone from E. coli to Agrobacterium is problematic, once the TAC is transferred into Agrobacterium, it can be stably maintained (data not shown).

Arabidopsis transformation using JAtY clones

The last step in the process is transfer of the plant genomic DNA from the JAtY clones into the Arabidopsis genome. The transformation efficiency of 25 JAtY clones was examined using standard Agrobacterium-mediated floral-dip transformation (Clough and Bent, 1998). As shown in Table 2, the efficiency varied quite substantially from clone to clone. However, these low and variable transformation efficiencies could be expected, as other researchers have also noticed these problems when using JAtY clones. On the other hand, Liu et al. (1999) reported consistently high efficiencies of transformation using a very similar TAC library. Two possible reasons for this discrepancy were the vector used (pYLTAC17 versus pYLTAC7) and the transformation method (floral dip versus vacuum). As shown in Tables 2 and S1, neither use of vacuum infiltration nor use of pYLTAC7-derived clones, such as K2I5, resulted in any significant improvement. Based on the results from three independent experiments, transformation efficiencies were reproducible: in other words, clones with low transformation efficiencies always produced poor results (Table 2), whereas clones with higher efficiencies produced consistently better results. As the only difference between the JAtY clones used in these experiments is the genomic DNA that they carry, it is possible that some property of this DNA is responsible for the observed variability. The obvious candidate, the size of the clone, was unlikely to be the reason, as we did not observe a correlation between efficiency of transformation and clone size (Table 2). Assessing the vector used to generate the JAtY library, pYLTAC17, we realized that, in clones from this library, the Arabidopsis genomic DNA fragments are placed immediately adjacent to the SacB coding region present in the vector. Thus, it is possible that some of these Arabidopsis sequences are able to activate the expression of SacB in Agrobacterium. 
Activation of SacB would be detrimental for the survival of Agrobacterium cells in transformation medium that contains high sucrose concentrations (Ried and Collmer, 1987; Lee, 2006). To test this possibility, we examined the effects of 5% sucrose (the concentration typically used in the transformation medium; Clough and Bent, 1998) on growth of Agrobacterium strains harboring clones that have been successfully transferred to plants versus those that have not. As shown in Figure 3, there was a perfect correlation between the efficiency of transformation and the ability of Agrobacterium to grow in the presence of 5% sucrose. As sucrose in the infiltration medium can be replaced by glucose without any adverse effects on the transformation efficiency (Clough and Bent, 1998) and glucose is not a substrate for SacB (showing no toxicity for any of the tested JAtY clones), we decided to test the plant transformation efficiency when 5% glucose was used in the transformation medium. As shown in Table 2, all clones could be readily transformed into plants using glucose. It is also possible that a similar improvement can be obtained by reducing the amount of sucrose in the infiltration medium. In fact, Liu et al. (1999) used 1% sucrose instead of the typical 5%, providing a possible explanation for the discrepancy between their findings and our initial results. However, we recommend using 5% glucose, as 1% sucrose may have detrimental effects on Agrobacterium clones expressing high levels of SacB.

Table 2.   Transformation efficiencies (%) in the presence of sucrose versus glucose in the infiltration medium
| Gene | JAtY | Size (kb) | Sucrose, experiment 1 | Sucrose, experiment 2 | Sucrose, experiment 3 | Glucose, experiment 1 | Glucose, experiment 2 | Glucose, experiment 3 |
|---|---|---|---|---|---|---|---|---|

Efficiencies were calculated based on the number of Basta-resistant plants in approximately 8000 seeds of the T0 generation plants.

Figure 3.

 Sucrose in the transformation medium has a detrimental effect on Agrobacterium strains harboring JAtY clones.
(a) Effect of supplementing the growth medium (LB) with glucose or sucrose on growth of Agrobacterium (upper panels) or E. coli (lower panels). Six different Agrobacterium and E. coli clones were tested. Growth of Agrobacterium clones harboring TAC constructs containing IAA4, 6, 19, 26 and 32 was inhibited by sucrose, whereas that for clones harboring IAA5 was indistinguishable in sucrose-containing versus control medium. All E. coli clones grew well in all types of media. Serial dilutions for each clone are shown.
(b) Plant transformation efficiency for each of the six clones tested in (a) in the presence of 5% sucrose (light gray) or 5% glucose (black) in the transformation medium.
(c) Schematic representation of a typical clone from the JAtY library. LB and RB indicate the T-DNA left and right borders, respectively.

An important aspect that needed to be investigated when using large TAC clones is the ‘integrity’ of the T-DNA upon insertion into the plant genome. Previous studies have shown that large deletions of the T-DNA can occur during the plant transformation process (Hamilton et al., 1996; Liu et al., 1999). To characterize this phenomenon in more detail and to determine the frequency of deletions, over 500 independent T1 plants (or T2 pools) from transformations of six JAtY clones ranging in size from 27 to 79 kb were examined (Figure 4). The results shown in Figure 4 clearly indicate that deletions of T-DNA do occur during the process of integration into the plant genome. We also found that the frequency of deletions has a direct correlation with the size of the T-DNA (Figure 4). In one case, however, a relatively small clone (IAA6) showed a lot more ‘apparent’ deletions than expected based on its size (Figure 4). One possible explanation for the large number of ‘big deletions’ observed (resulting in loss of both SacB and the tagged gene) is the close proximity of the IAA6 gene to the end of the TAC clone (Figure S4). However, this cannot explain the high overall number of deletions observed for this small clone. Thus, although we could not identify any obvious sequence peculiarity in this clone (such as tandem repeats), it is plausible that specific sequences present in the IAA6 JAtY clone may favor deletions of the T-DNA during the transformation process. Nevertheless, from a practical point of view, the overall results from these experiments indicate that identification of 10–20 T1 plants resistant to the herbicide Basta (phosphinothricin, glufosinate ammonium) should be sufficient to obtain a few (at least five) transgenic lines containing an ‘intact’ T-DNA, even when dealing with large TAC clones.
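The practical recommendation above amounts to a simple binomial question: given the fraction of Basta-resistant lines that carry an intact T-DNA, how likely is it that a set of T1 plants contains at least five intact lines? The Python sketch below answers this for an arbitrary intact fraction. The value of `p_intact` must be read from Figure 4 for the clone in question; the 0.5 used in the usage note is a hypothetical placeholder, and independence between lines is assumed.

```python
from math import comb


def p_at_least_k_intact(n_plants: int, p_intact: float, k: int = 5) -> float:
    """Binomial probability of obtaining at least k intact lines among n_plants.

    p_intact is the per-line probability of an intact T-DNA; it is clone
    dependent and is an input to this sketch, not a value taken from the text.
    """
    if not 0.0 <= p_intact <= 1.0:
        raise ValueError("p_intact must lie between 0 and 1")
    return sum(comb(n_plants, i) * p_intact ** i * (1.0 - p_intact) ** (n_plants - i)
               for i in range(k, n_plants + 1))
```

For instance, with a hypothetical intact fraction of 0.5, `p_at_least_k_intact(10, 0.5)` is about 0.62 and `p_at_least_k_intact(20, 0.5)` is about 0.99, which illustrates why screening towards the upper end of the 10–20 plant range is safer for clones that are prone to deletions.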

Figure 4.

 Large T-DNAs suffer deletions during integration into the plant genome.
Deletions of the T-DNA inserted in the plant genome were examined by PCR using primers specific to SacB and the tagged gene (gene of interest). The presence of the BAR gene was tested by the resistance of the plants to Basta.
(a) Schematic representation of the three categories of deletions observed. ‘Intact’ refers to Basta-resistant transgenic lines in which both the SacB gene and the GFP fusion construct (gene of interest) were present. ‘Small deletions’ indicates that the GFP fusion was detected in these Basta-resistant lines, but the SacB gene was not. ‘Big deletions’ refers to those Basta-resistant lines in which neither SacB nor the GFP fusion were detected.
(b) Percentage of the Basta-resistant T1 lines that belong to one of the three categories described above. Transgenic lines for six representative constructs ranging in insert size from 27 to 90 kb were examined. Light gray, gray and black bars indicate the percentages of lines in each of the three categories: with intact T-DNAs or with small or big deletions, respectively. The cumulative results for over 500 lines examined are also shown (total). The Aux/IAA constructs shown are arranged by JAtY clone size (small to large).
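The three categories defined in this figure map directly onto the two PCR results obtained for each Basta-resistant line. The Python sketch below encodes that mapping and tallies category percentages; the function names are illustrative, and the extra "unclassified" label (SacB detected but tagged gene not detected) is an addition made here to cover the one combination that the three categories do not describe.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_line(sacb_detected: bool, gene_detected: bool) -> str:
    """Assign a Basta-resistant line to a deletion category from two PCR results."""
    if sacb_detected and gene_detected:
        return "intact"
    if gene_detected:
        return "small deletion"   # tagged gene present, SacB lost
    if not sacb_detected:
        return "big deletion"     # neither SacB nor the tagged gene detected
    return "unclassified"         # SacB present, tagged gene absent (label added here)


def summarize(lines: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """Percentage of lines per category, given (sacb_detected, gene_detected) pairs."""
    counts = Counter(classify_line(sacb, gene) for sacb, gene in lines)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}
```

Applied to the PCR results for all T1 lines of one construct, `summarize` yields the kind of per-construct percentages plotted in panel (b).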

Although a similar PCR-based strategy has been used previously to test the integrity of the T-DNA during the plant transformation process (Liu et al., 1999), it has some potential problems. For example, a T-DNA insertion could be classified as intact when in fact it corresponds to two complementary truncations of a T-DNA. To address this potential problem, we used real-time PCR to examine the copy number of three genes located on the left side, in the central region and on the right side of the T-DNA. Ten independent transgenic lines generated using construct CCH (Table 1) were tested. In addition to the tagged gene (At3g56240), we also determined the copy number of the first (At3g56140) and last (At3g56310) gene in this TAC clone. Consistent with previous reports (Bubner et al., 2004), the variability in the copy number between replicates was larger than typically observed in quantitative RT-PCR experiments. Thus, we were not able to discriminate between copy number differences smaller than twofold. Despite these limitations, our results, together with the Basta segregation analysis (see below), indicate that transformation with large TAC clones can result in multiple linked T-DNA insertions in one or several genomic locations, as illustrated by lines CCH-A-15 and CCH-A-7, respectively (Figure 5). However, these linked T-DNAs are not complete in all cases, and large deletions can be observed, as indicated by lines CCH-A-5 and CCH-A-10 (Figure 5). Finally, it is important to point out that the copy number of the genes located close to the right border (RB) was equal to or higher than that of the genes close to the left border (LB) in all cases (Figure 5). This strongly implies that the deletions predominantly occur at the LB of the T-DNA, as previously suggested (Forsbach et al., 2003). This has important practical implications. 
If deletions predominantly affect the LB, the presence of the SacB gene (located close to the LB) will be indicative of the presence of at least one whole copy of the T-DNA in the plant genome. In fact, we have tested more than 500 independent T1 plants (or pools of the corresponding T2 plants) and found that, in all but two cases, the presence of SacB was predictive of the presence of the GFP-tagged gene in the middle of the TAC clone.
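For readers who wish to reproduce this type of copy-number estimate, the Python sketch below shows one common way of converting real-time PCR threshold cycles (Ct) into a copy number, the comparative ΔΔCt calculation. The text does not state which calculation was used in this study, so the method, the function name, the choice of a wild-type calibrator carrying two copies of the gene per diploid genome, and the assumed amplification efficiency of 2 per cycle are all assumptions of this example.

```python
def relative_copy_number(ct_target_sample: float, ct_ref_sample: float,
                         ct_target_calibrator: float, ct_ref_calibrator: float,
                         calibrator_copies: float = 2.0,
                         efficiency: float = 2.0) -> float:
    """Estimate the copy number of a target gene by the comparative Ct method.

    ct_ref_*          : Ct of a reference gene whose copy number does not change
    calibrator        : a sample of known copy number (assumed: wild type, 2 copies)
    efficiency        : fold amplification per cycle (assumed: perfect doubling)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    ddct = delta_sample - delta_calibrator
    return calibrator_copies * efficiency ** (-ddct)


# A target that amplifies one cycle earlier than in the calibrator corresponds
# to twice as many copies (4 instead of 2 under the assumptions above).
print(relative_copy_number(23.0, 24.0, 24.0, 24.0))
```

Because a one-cycle shift corresponds to a twofold change, replicate variability of a fraction of a cycle is enough to blur differences smaller than twofold, which is consistent with the resolution limit noted above.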

Figure 5.

 Deletions and tandem insertions of T-DNAs are not uncommon when using JAtY clones.
(a) Schematic representation of the JAtY49O07 clone. The relative sizes, positions and orientations of the genes in the TAC clone are shown by the arrows. Black arrows indicate the tagged gene At3g56240, and the left-most gene (closest to the LB) and right-most gene (closest to the RB). The relative orientation of the Arabidopsis genomic DNA was deduced from line H4A-14 in which neither SacB nor the At3g56140 gene were detected.
(b) Name of the T1 line, deletion type according to the criteria in Figure 4 (genotype), the number of segregating insertions according to Basta-resistance (Table S2), and the copy number for each of the three genes examined. The results represent the mean of two independent experiments. Technical triplicates were used in each experiment.

In addition to determining the copy number of the transgenes, we wished to determine the mean number of unlinked integration events in the genome when using these large JAtY clones. To do this, we examined the segregation of Basta-resistance among the progeny of over 230 T1 plants generated using five JAtY clones ranging in size from 27 to 82 kb (Table S2). According to these data, 78% of the lines contain a single insertion, 13% have more than one insertion, and 8% of the lines show segregation ratios <1:3, consistent with reduced transmission of the T-DNA. The percentage of single-insertion lines is much higher than reported for the Salk T-DNA collection, in which only 50% of the lines show a single insertion (Alonso et al., 2003). These differences may be due to a lower frequency of insertions of the large TAC clones compared with the relatively small T-DNA of pROK2 utilized in generation of the Salk collection. This is also consistent with the reduced efficiency of transformation observed using JAtY clones compared with standard binary vectors (data not shown). The predominance of single-locus insertions may be an additional advantage of using JAtY clones.
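
The classification of lines by Basta segregation can be sketched as a chi-square goodness-of-fit test against the 3:1 (resistant:sensitive) ratio expected for a single dominant insertion. This is an illustrative sketch only: the statistical criteria actually applied to the data in Table S2 are not stated here, and the function names, the 5% significance threshold and the seedling counts are assumptions.

```python
CRITICAL_CHI2_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square(resistant, sensitive, expected_ratio=(3, 1)):
    """Chi-square statistic for observed counts against an expected segregation ratio."""
    total = resistant + sensitive
    r, s = expected_ratio
    exp_res = total * r / (r + s)
    exp_sen = total * s / (r + s)
    return (resistant - exp_res) ** 2 / exp_res + (sensitive - exp_sen) ** 2 / exp_sen


def classify_line(resistant, sensitive):
    """Assign a T2 family to one of the three classes described in the text."""
    if chi_square(resistant, sensitive) < CRITICAL_CHI2_DF1_P05:
        return "single insertion"
    fraction_resistant = resistant / (resistant + sensitive)
    # An excess of resistant seedlings suggests additional unlinked insertions;
    # a deficit suggests reduced transmission of the T-DNA.
    return "more than one insertion" if fraction_resistant > 0.75 else "reduced transmission"


# Hypothetical counts of resistant and sensitive T2 seedlings
for counts in [(150, 50), (190, 10), (100, 100)]:
    print(counts, classify_line(*counts))
```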

Reproducibility of expression patterns among independent transgenic lines

Another typical concern when generating translational fusions is the reproducibility of the observed expression patterns among independent transgenic lines: ‘the positional effect’ problem (Lorence and Verpoorte, 2004). An additional advantage of using large TAC clones in the recombineering approach is that these positional effects may be buffered by the tens of thousands of base pairs flanking the gene of interest. To experimentally test this prediction, we examined the GFP fluorescence pattern of over 1500 T2 seedlings from >200 independent T1 lines generated using five TAC clones ranging in size from 27 to 82 kb (Table 2). Each of the five tagged genes (one per TAC clone) had a distinctive expression pattern (Figure 6). Identical expression patterns for each gene were consistently observed in all fluorescent T1 lines obtained for that particular gene (data not shown). Nevertheless, fluorescence intensity varied from line to line. Although a more detailed analysis is needed, we observed a correlation between fluorescence intensity and copy number as estimated by real-time PCR. These results not only support the idea that the long sequences flanking the GFP fusions can buffer the positional effects, but also have obvious practical implications, as analysis of fewer lines would be needed to establish the expression pattern of the tagged gene.

Figure 6.

GFP–gene fusions in the pseudo-genomic context of large bacterial artificial chromosomes can be used to study gene expression in Arabidopsis.
Root expression patterns of five selected genes tagged with GFP using the recombineering procedure are shown. From top to bottom: GFP fluorescence of roots, confocal images of the GFP channel, the DIC channel, and a merged image of both channels.

Generation of replacements and point mutations by recombineering

Although we were able to use the Aux/IAA genes to demonstrate the qualities of the recombineering system, we were frustrated by the lack of GFP signals in the transgenic lines generated with the 28 Aux/IAA constructs. It is important to point out, however, that we were able to show not only that the transgenes were expressed in these lines, but also that their expression mimicked quite closely the expression of the endogenous genes in response to auxin (Figure S5). Thus, the lack of GFP fluorescence is most likely due to the very high protein turnover characteristic of Aux/IAAs (Callis and Vierstra, 2000; Dreher et al., 2006). This problem could, in theory, be resolved by introducing a specific point mutation in conserved domain II of the Aux/IAAs, which is known to stabilize these proteins (Liscum and Reed, 2002). However, this would result in hyper-accumulation of these key auxin signaling components and cause dramatic developmental alterations (Liscum and Reed, 2002). Thus, to generate an Aux/IAA protein that is stable and at the same time not toxic for the plant, the canonical stabilizing mutation in domain II can be combined with replacement by GFP of part of the essential domain IV (Rouse et al., 1998). This also provides the opportunity to test use of the recombineering procedure to generate point mutations and sequence replacements in TAC clones.

Three Aux/IAA genes were chosen for this test: IAA4, IAA5 and IAA19 (Table 1). To replace the second proline in the core GWPPV sequence of domain II by a serine, the codon CCN (proline) had to be replaced by the codon TCN. We first introduced these mutations by direct recombination using long primers containing the mutation, as described previously (Costantino and Court, 2003). However, we found this approach highly inefficient, as hundreds of colonies had to be screened by PCR in the absence of any selectable marker (data not shown). As an alternative, we generated galK cassettes in which the desired mutation (CCN → TCN) was introduced in the primer used to target the galK to the correct location in the Aux/IAA genes. In a second recombination step, the galK gene was removed using a genomic DNA fragment that extends >50 bp to either side of the galK insertion, and in which the CCN → TCN mutation was introduced by PCR. Overall, this approach turned out to be highly efficient, with 100% true positives in both the first and second recombination. In fact, we found that the second recombination in these cases was much more efficient than when the galK was replaced by GFP in previous experiments. Possible reasons are the length of the stretches of sequence homology in the replacement cassette (approximately 200 bp in the case of the point mutation cassettes versus 50 bp in the case of the GFP cassette) and/or the size of the replacement cassette (approximately 200 bp in the case of the point mutation cassettes versus approximately 800 bp in the case of the GFP cassette). Based on our experience using large replacement cassettes (approximately 2500 bp), we favor the hypothesis of cassette size, although clearly the length of the homology arms also has a positive impact.
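
The CCN → TCN substitution can be illustrated at the sequence level with a short sketch. The 15-nt sequence below is a toy example encoding GWPPV, not the actual sequence of any Aux/IAA gene, and the function name and codon index are hypothetical.

```python
def pro_to_ser(cds, codon_index):
    """Change a proline codon (CCN) to a serine codon (TCN) at a 0-based codon index."""
    start = 3 * codon_index
    codon = cds[start:start + 3]
    if len(codon) != 3 or not codon.upper().startswith("CC"):
        raise ValueError(f"codon {codon_index} is {codon!r}, not a CCN proline codon")
    # Only the first base of the codon changes; the wobble base (N) is preserved.
    return cds[:start] + "T" + cds[start + 1:]


toy_domain_ii = "GGATGGCCACCTGTT"            # toy sequence: Gly-Trp-Pro-Pro-Val
mutant = pro_to_ser(toy_domain_ii, codon_index=3)  # second proline of GWPPV
print(mutant)  # GGATGGCCATCTGTT (Gly-Trp-Pro-Ser-Val)
```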

Next, we replaced a portion of the Aux/IAA gene (corresponding to the 22 conserved amino acids of domain IV; Figure S6) with three copies of the improved version of the yellow fluorescent protein gene, Ypet (Nguyen and Daugherty, 2005), that has been codon-optimized for Arabidopsis (Table S3). For this purpose, we used exactly the same experimental approach as utilized to insert GFP in the selected genes listed in Table 1, except that the gene-specific homology arms of the galK and 3xYpet cassettes did not correspond to a continuous sequence in the TAC clone, but to sequences separated by 66 bp (22 amino acids), thus producing a deletion. We found that the efficiency of the first recombination was 100%, indicating that deletions or gene replacements can be effectively generated by recombineering.
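
How homology arms that are not contiguous in the target produce a deletion can be sketched in silico. The example below is a simplified model, not a description of the recombination machinery: the template is a random 400-nt sequence, the 18-nt insert is a placeholder rather than the 3xYpet sequence, and the function name and coordinates are hypothetical. Only the 50-nt arm length and the 66-bp gap follow the text.

```python
import random


def recombine(template, cassette, arm_len=50):
    """Replace the template sequence lying between the two homology arms of a cassette."""
    left, right = cassette[:arm_len], cassette[-arm_len:]
    if template.count(left) != 1 or template.count(right) != 1:
        raise ValueError("each homology arm must match the template exactly once")
    left_end = template.index(left) + arm_len
    right_start = template.index(right)
    if right_start < left_end:
        raise ValueError("homology arms are in the wrong order or overlap")
    payload = cassette[arm_len:-arm_len]
    return template[:left_end] + payload + template[right_start:]


rng = random.Random(0)                        # fixed seed so the toy template is reproducible
template = "".join(rng.choice("ACGT") for _ in range(400))

del_start, del_len = 150, 66                  # 66 bp = 22 codons to be removed
insert = "ATGTCTAAAGGTGAAGAA"                 # placeholder tag sequence (hypothetical)
left_arm = template[del_start - 50:del_start]
right_arm = template[del_start + del_len:del_start + del_len + 50]

product = recombine(template, left_arm + insert + right_arm)
print(len(template) - len(product))           # 66 bp removed, 18 nt added: net change of 48
```

Because the arms flank the 66-bp segment instead of abutting each other, the same recombination step that inserts the tag also removes the intervening sequence.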

These constructs were used to transform Arabidopsis plants using the glucose method. The ease of detecting the Ypet fluorescence signal (Figure 7) and the consistency of the observed expression patterns with those previously reported for Aux/IAA19 (Muto et al., 2007) suggest that the strategy followed is adequate, and similar approaches can be utilized to examine expression of the remaining Aux/IAA genes.

Figure 7.

 Stabilized loss-of-function aux/iaa–Ypet mutant proteins generated by recombineering can be visualized in Arabidopsis.
The root expression patterns of three aux/iaa–Ypet constructs, in which a stabilizing point mutation in domain II and partial deletion of domain IV were introduced by recombineering in the pseudo-genomic context of a TAC, were visualized using confocal microscopy. Upper panels show Ypet fluorescence; lower panels show the overlap of the fluorescence and DIC channels.

In summary, our findings demonstrate that recombineering-based gene tagging can be effectively used to study gene expression in Arabidopsis. Because of the simplicity and robustness of the experimental procedures, the public availability of the required recombineering strains and end-sequenced TAC clones, and the intrinsic advantages of generating the gene constructs in the pseudo-genomic context of large TAC clones, we suggest broad adoption of this methodology as a general tool to study gene expression in Arabidopsis. Nonetheless, it is important to emphasize that (like the traditional approaches) recombineering-based gene tagging also has its potential drawbacks. For instance, the large size of the TAC clones may exacerbate the typical problems of insertional mutants: incorporation of long DNA fragments in the genome may cause chromosomal heterogeneities that may affect chromosome dynamics or the local epigenetic landscape, thus limiting the value of the recombineering methodology for the study of genes involved in these processes. Furthermore, introduction of extra copies of multiple genes in a single TAC is likely to increase the frequency of phenotypic alterations, thus potentially hindering the expression studies. Finally, it is also formally possible that remote cis-elements that regulate the expression of the gene of interest in a TAC may still be missed in the recombineered construct despite the large size of flanking genomic sequences included. Although we have not carried out detailed analyses to address the aforementioned concerns, our observations of more than 40 TAC clones and hundreds of transgenic lines suggest that these problems are not frequent. Therefore, in the absence of efficient homologous recombination in Arabidopsis and other plant species, we propose that TAC-based gene translational fusions should be adopted as the gold standard in gene expression studies.

Experimental Procedures

Strains and materials

The JAtY library was purchased from the John Innes Center (Norwich, UK) and BAC end sequence information was obtained from the Arabidopsis thaliana Integrated Database (ATIDB) (http://atidb.org/cgi-perl/gbrowse/atibrowse/). The E. coli recombineering strain SW102 and the galK clone were provided by Dr N. Copeland (Mouse Cancer Genetics Program, National Cancer Institute-Frederick, Frederick, MD, USA) (Warming et al., 2005). The universal adaptors GGTGCTGCTGCGGCCGCTGGGGCC and GGTGCTGCTGCGGCCGCTGGGGCC (Tian et al., 2004) were introduced by PCR at the 5′ and 3′ ends, respectively, of galK (Warming et al., 2005) and GFP (Appendix S1). JAtY library clones were selected using 25 μg ml−1 kanamycin (Fisher Scientific, http://www.fishersci.com). Escherichia coli SW102 and A. tumefaciens GV3101 (pMP90) strains were grown in LB medium supplemented with 12.5 μg ml−1 tetracycline and 25 μg ml−1 gentamycin, respectively, with or without 25 μg ml−1 kanamycin to select for the JAtY clone.

Fluorescence microscopy and confocal laser scanning microscopy

Three-day-old etiolated T2 seedlings were mounted in water on a microscope slide (Fisher Scientific) and covered with a cover slip (Corning, http://www.corning.com/lifesciences). GFP fluorescence was observed using a Zeiss Axioplan microscope (http://www.zeiss.com/) under 10× or 20× magnification, and the images were acquired using a SPOT Insight™ (SPOT™ Imaging Solutions, http://www.diaginc.com/) camera with SPOT Advanced software. To enable comparisons between independent T2 lines of the same construct, images were acquired using identical settings. Lines with strong fluorescence were selected for confocal microscope imaging. An LSM 710 confocal workstation (Zeiss) was used. Fluorescence and differential interference contrast microscopy (DIC) images were acquired at both 20× and 40× magnification. The obtained images were analyzed using Zen 2009 software (Zeiss).


We thank Dr Miguel A. Perez-Amador (Instituto de Biologia Molecular y Celular de Plantas, Valencia, Spain) for helpful comments and stimulating discussions. This work was supported by National Science Foundation grants DBI0820755 and MCB0923727 and funds from North Carolina State University to J.M.A., and from the North Carolina State University Genomics graduate program to R.Z.