Plant transformation via Agrobacterium frequently results in the formation of multiple-copy T-DNA arrays at one target site of the chromosome. The T-DNA copies are arranged in repeats, direct or inverted around one of the T-DNA borders. A Ti plasmid-derived transformation vector has been constructed enabling direct selection of transformants carrying at least two linked copies of T-DNA in the same orientation. The selection is based on expression of a promoterless neomycin phosphotransferase gene on one T-DNA copy from a promoter located on the other T-DNA copy. After co-cultivation of tobacco protoplasts with Agrobacterium, as many as 30% of regenerated transformed plants carried directly repeated T-DNA copies. The junction regions between two T-DNAs were amplified and 13 amplified fragments were cloned and sequenced. The involvement of T-DNA left and right border sequences in direct repeat junctions was determined. In some junctions, additional filler DNA was detected. The length of filler DNA varied from a few up to almost 300 bp. The longer filler DNAs from two clones were found to be T-DNA fragments in direct or reverse orientation. We discuss the recently suggested models for T-DNA integration and propose that the formation of direct repeats in genomes does not necessarily result from ligation of intermediates (i.e. T-strands), but more likely from the co-integration of several intermediates into one target site.
The plant pathogen Agrobacterium tumefaciens provides a natural system for genetic transformation of plants. In this system, a specific segment of agrobacterial DNA is transferred into a plant cell and eventually integrated into plant genomic DNA (reviewed in Tinland & Hohn 1995; Zambryski 1992; Zupan & Zambryski 1995). The transferred DNA (T-DNA) is part of the Ti (tumour-inducing) plasmid of A. tumefaciens and is delimited by two 25 bp direct repeats called T-DNA borders. Site-specific endonucleolytic cleavages of both the left and the right border sequences generate a single-stranded molecule (T-strand) (Stachel et al. 1987), which is transported to the plant cell nucleus in the form of a nucleoprotein complex (T-complex) (Howard & Citovsky 1990). The proteins generating the T-strand, forming the transport complex and supporting the T-strand transfer and integration are products of Ti plasmid virulence (vir) loci (for review, see Sheng & Citovsky 1996). The expression of virulence genes is induced by the presence of phenolic compounds (such as acetosyringone) (Bolton et al. 1986) and acidic pH (Mantis & Winans 1992), both characteristic of a wounded plant cell. Efficient transformation is also dependent on constitutively expressed chromosomal virulence (chv) genes of A. tumefaciens (Douglas et al. 1985).
To our knowledge, there are no reports on detection of multimeric T-DNA forms in vir-induced agrobacteria and therefore they presumably are formed in the plant cell prior to or during integration into the nuclear genome. In order to find out which mechanism is responsible for the formation of T-DNA multimers, a co-transformation approach was used (De Block & Debrouwer 1991; De Neve et al. 1997; Komari et al. 1996) in which transgenic plants were screened for the presence of two different T-DNAs originating from different agrobacterial strains (De Block & Debrouwer 1991; De Neve et al. 1997) or from one vector carrying different T-DNAs (Komari et al. 1996). Co-integration occurred with quite high frequencies and with preferences for linked (De Block & Debrouwer 1991) or unlinked (Komari et al. 1996) loci, respectively, depending on the agrobacterial strain. The high frequency of T-DNA co-integration into one locus was suggested to be the result of ligation of separately introduced T-DNAs (De Neve et al. 1997).
We have transformed tobacco protoplasts with a vector carrying a promoterless neomycin phosphotransferase gene (close to the right border sequence) and a strong CaMV 35S promoter (near the left border sequence) within its T-DNA. Selection has been achieved by the expression of the promoterless neomycin phosphotransferase gene of one T-DNA copy from the promoter located on the other T-DNA copy integrated in the same orientation. By selecting for kanamycin resistance, we have obtained transgenic plants carrying at least one linked direct repeat of T-DNA. We have analysed the junction regions between T-DNA copies. The sequence analysis revealed variability in the presence of filler DNA and residual T-DNA border sequences in the direct repeat junctions.
In our experiments we intended to investigate the formation of T-DNA multimers integrated into the plant nuclear genome after Agrobacterium-mediated transformation. We have developed a system for exploring one of the three possible mutual orientations of two linked T-DNA copies, a direct repeat. The pPCV631-L-TX vector has been designed to allow a direct selection of clones with at least two copies of T-DNA integrated in the genome as repeats in the same orientation. The two T-DNA copies ( Fig. 1a) had to be integrated in a configuration allowing expression of the neomycin phosphotransferase gene carried by one T-DNA copy from the promoter located on another T-DNA copy. The neomycin phosphotransferase gene of pPCV631-L-TX has its own ATG codon so that in-frame integration was not required for the gene expression. However, the distance between the two T-DNA copies had to be optimal to lead to expression.
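The selection principle described above can be illustrated with a small sketch. It is not part of the original work and rests on an assumption that is consistent with the vector description but not stated in this form: the promoterless nptII gene reads inward from the right border (RB) and the CaMV 35S promoter reads outward across the left border (LB). The function names are hypothetical, and the requirement for an optimal distance between the two copies is ignored.

```python
# Hypothetical model of two adjacent T-DNA copies on a chromosome.
# Assumed layout of a copy in "+" orientation, read left to right:
#   RB - nptII (promoterless, reads rightwards) - ... - 35S promoter (reads rightwards) - LB
# A copy in "-" orientation is the mirror image of this layout.

def junction_type(left: str, right: str) -> str:
    """Classify the mutual orientation of two linked copies ('+' or '-')."""
    if left == right:
        return "direct repeat"
    # '+' followed by '-' places the two left borders next to each other;
    # '-' followed by '+' places the two right borders next to each other.
    return "inverted repeat around LB" if left == "+" else "inverted repeat around RB"

def confers_kanamycin_resistance(left: str, right: str) -> bool:
    """True if the promoter of one copy can read into nptII of the other copy.

    Under the assumed layout this happens only in a direct repeat: at an LB/LB
    junction the two promoters face each other, and at an RB/RB junction both
    nptII genes lack an upstream promoter.
    """
    return junction_type(left, right) == "direct repeat"

for pair in [("+", "+"), ("-", "-"), ("+", "-"), ("-", "+")]:
    print(pair, junction_type(*pair), confers_kanamycin_resistance(*pair))
```

Under these assumptions only the direct repeat, one of the three possible mutual orientations, passes kanamycin selection.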
We selected transformed plant cells resistant to hygromycin. Direct repeats of T-DNA (as tested by kanamycin resistance) were detected in as many as 30% of all transformants carrying at least one copy of T-DNA. Amplification of the junction regions between directly repeated copies of T-DNA led to the isolation of over 30 fragments differing in length (from 100 to 500 bp), out of which 13 were cloned and sequenced ( Fig. 2). In the following text they are numbered according to their actual length in bp; fragments of the same length representing different clones are distinguished by capital letters. Generally we found two junction types, with or without filler DNA, respectively. The same types of junctions have been previously described for junctions between T-DNA and plant DNA by Mayerhofer et al. (1991) and named precise and imprecise junctions, respectively. Here the junctions of two T-DNAs are either referred to as precise junctions, i.e. junctions of two sometimes slightly truncated T-DNAs (usually consisting of residual border sequences), or imprecise junctions analogous to precise junctions with additional filler DNA between two T-DNAs.
Among clones from which junction fragments have been amplified and sequenced, we found six clones representing precise junctions of two T-DNAs (clones 201, 220, 231A, 231B, 235A, 235B). The junctions in fragments 231A and 231B are identical. In the others, the length of residual border sequences varies between the left and right border sequences. In clone 201, the right border is missing as well as part of the T-DNA (about 40 bp representing the region upstream of the nptII coding region). The right border is also missing in clone 220 (together with 14 bp of T-DNA) and in clone 235B (together with the adjacent 2 bp) ( Fig. 3a).
Imprecise junctions and filler DNA
Seven clones (131, 238, 251, 254, 263, 268, 532) exhibit a pattern of imprecise junctions, with additional filler DNA integrated between the two copies of T-DNA. The length of filler DNA varies from 8 to almost 300 bp.
In clone 131, the left border sequence together with part of the T-DNA (126 bp representing tetR binding sites) are missing. However, the residual right border sequence is present. In clone 532, the right border sequence is missing but the T-DNA itself is not truncated. In the five other clones, both the left and right border residual sequences are present. The variability in length of residual T-DNA border sequences is shown in Fig. 3(b).
Short filler sequences in clones 251, 254, 263 and 268 (8, 9, 36 and 26 bp, respectively) could not be analysed as to their origin, because no homology with T-DNA or tobacco DNA was found. Clone 238 contains filler DNA of 10 bp, of which 9 bp are homologous to the 5′ end of the left border sequence, and the orientation of this filler DNA corresponds to the orientation of border sequences in T-DNA copies forming the direct repeat.
Two longer filler DNAs (clones 131 and 532) were found to have regions of homology with T-DNA. In clone 131, the filler DNA is 36 bp long, and the 20 bp on its 3′ end (upstream of the right border) are 100% homologous to the poly(A) region of the ocs (octopine synthase) gene, which is fused to the nptII gene in pPCV vectors. The orientation of this filler sequence corresponds to the orientation of the same sequences present in the integrated T-DNA copies ( Fig. 1b). The rest of the filler DNA was not identified.
The longest filler DNA (in clone 532) is 293 bp long and most of it is homologous to the CaMV 35S promoter. The region of 100% homology is 277 bp long and it is integrated between the T-DNA copies in an orientation opposite to their CaMV 35S promoters (as shown on the schematic drawing in Fig. 1b).
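The classification of the 13 sequenced junctions can be summarised in a short sketch. The filler lengths are those reported in the text, with 0 bp standing for a junction without filler; the dictionary and function names are hypothetical and serve only as illustration.

```python
# Filler lengths (bp) as reported in the text; 0 marks a junction without filler.
FILLER_BP = {
    "201": 0, "220": 0, "231A": 0, "231B": 0, "235A": 0, "235B": 0,
    "131": 36, "238": 10, "251": 8, "254": 9, "263": 36, "268": 26, "532": 293,
}

def classify(filler_bp: int) -> str:
    """A junction with any filler DNA is 'imprecise', otherwise 'precise'."""
    return "imprecise" if filler_bp > 0 else "precise"

def summarize(fillers: dict) -> dict:
    """Group clone names by junction type."""
    groups = {"precise": [], "imprecise": []}
    for clone, bp in fillers.items():
        groups[classify(bp)].append(clone)
    return groups

groups = summarize(FILLER_BP)
imprecise = [FILLER_BP[clone] for clone in groups["imprecise"]]
print(len(groups["precise"]), "precise and", len(groups["imprecise"]), "imprecise junctions")
print("filler length range:", min(imprecise), "to", max(imprecise), "bp")
```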
Variability of T-DNA ends in the direct repeat junctions
We wished to collect data concerning the involvement of border sequences in the junction regions of T-DNA direct repeats and compare them to the data obtained for the T-DNA/plant DNA junctions (published by others). T-DNA/plant DNA junctions contain in most cases residual border sequences (up to 22 bp for the left border, up to 3 bp for the right), and in some cases the residual border sequence – more often the left border sequence – is missing (data taken from Tinland & Hohn 1995). Among our 13 clones carrying a tandem repeat junction, the majority of border sequences were preserved in a manner which corresponds to those published data. Only five junctions did not contain one of the residual border sequences (five borders out of 26 were missing). In four junctions the right border sequence was missing, in one junction it was the left border sequence. This result is just the opposite to what was reported for T-DNA/plant DNA junctions ( Tinland & Hohn 1995). However, if we compare all these junctions with respect to the variability in length of preserved T-DNA ends, it is quite clear that the left end of a junction is more variable than the right end, which corresponds well with published data ( Gheysen et al. 1991 ) and with the generally expected fact that the right end of T-DNA integrates more precisely than the left end.
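The count of missing border sequences given above can be retraced from the clone descriptions. The sketch below is only an illustrative tally; the names `CLONES`, `MISSING` and `tally_missing` are hypothetical, and the per-clone assignments are taken from the descriptions of clones 201, 220, 235B, 532 (right border missing) and 131 (left border missing).

```python
# The 13 sequenced junctions; each junction involves one left and one right border.
CLONES = ["131", "201", "220", "231A", "231B", "235A", "235B",
          "238", "251", "254", "263", "268", "532"]

# Border reported as missing in the junction of each clone (from the text).
MISSING = {"201": "RB", "220": "RB", "235B": "RB", "532": "RB", "131": "LB"}

def tally_missing(clones, missing):
    """Return the total number of borders and the missing count per border type."""
    total_borders = 2 * len(clones)
    counts = {"LB": 0, "RB": 0}
    for clone in clones:
        side = missing.get(clone)
        if side is not None:
            counts[side] += 1
    return total_borders, counts

total, counts = tally_missing(CLONES, MISSING)
print(sum(counts.values()), "of", total, "borders missing:", counts)
```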
We have not detected a clone which would represent the ‘read-through’ event reported for various transformants as a consequence of skipping one of the border sequences during formation of the T-strand (Martineau et al. 1994). All the junctions correspond to precisely cut-off T-strands or their slightly truncated forms. However, we would not have detected larger truncations because both components of the selection system are close to border sequences and a larger truncation would render their function impossible. Also, a pre-selection against overly long ‘read-through’ events could be expected, because the efficiency of transcription and/or translation would be affected by the increasing distance between the promoter and the reporter gene. Nevertheless, we cannot confirm that the junction regions that we have analysed are responsible for the kanamycin resistance, because the T-DNA copy number was not estimated and the kanamycin resistance could be conferred by another T-DNA direct repeat present in the genome. As to the presence of more than two T-DNA copies in the genome, it is also theoretically possible that the marker gene is expressed from plant regulatory sequences present at the target site instead of being expressed from the promoter of the second T-DNA copy. However, the probability of such an event is quite low, because in our system employing the vector pPCV631-L-TX the frequency of transcriptional fusions with plant promoters was two orders of magnitude lower than the frequency of direct repeat formation (data not shown).
A model explaining the formation of T-DNA repeats has recently been suggested (De Neve et al. 1997). That model takes into account the data obtained from ‘co-transformation’ of plant cells with agrobacteria carrying two different T-DNAs. The different T-DNAs were frequently integrated into one locus in direct or inverted repeats. Formation of these repeats is explained as ligation of intermediates (double-stranded or partially double-stranded T-DNAs) prior to or during integration, with additional breaks and repair. The authors favour the idea of extra-chromosomal ligation occurring prior to integration. This view is supported by the fact that one of the transformants carried four copies of T-DNA (three from one agrobacterial strain and one from the other) integrated at one site, precisely linked to each other, which seemed unlikely to happen during integration. However, the data on the position of each T-DNA copy were obtained by Southern analysis and do not include the exact sequence analysis of the junction regions.
The model mentioned above refutes the replication and repair model suggested in the 1980s ( Jorgensen et al. 1987 ;Van Lijsebettens et al. 1986 ), according to which a single T-DNA would be replicated and then integrated into the genome as a repeat. That model was strongly supported by the fact that the same truncated T-DNAs were observed at different integration sites or at one site as a direct repeat in the plant genome after transformation ( Van Lijsebettens et al. 1986 ) and the same break points of both T-DNA copies were observed in inverted repeats ( Jorgensen et al. 1987 ). Unfortunately, these data were obtained at the resolution of restriction analysis and sequence data are not available. Moreover, they do not exclude the possibility of T-DNA truncation in agrobacteria prior to transformation.
We cannot really reject any of these models, but neither can we completely agree with any of them. The ligation model does not explain the origin of filler DNA found between the repeatedly integrated T-DNA copies. For Agrobacterium-mediated DNA transfer, the presence of filler DNAs is well documented in T-DNA/plant DNA junctions and was also reported to form the junctions between T-DNA border sequences of some circularized agroinfecting DNA molecules ( Bakkeren et al. 1989 ). The origin of some fillers in T-DNA/plant DNA junctions was determined: short sequences homologous to sequences inside the T-DNA may form part of the filler as well as plant sequences from regions adjacent to the site of integration and repeats of both ( Gheysen et al. 1991 ;Holsters et al. 1983 ;Mayerhofer et al. 1991 ). For example, in a direct repeat (tandem) junction, the filler DNA consisted of a 16 bp repeat of the T-DNA right end and six repeats of a plant DNA sequence 40 bp long ( Holsters et al. 1983 ). The replication of plant sequences cannot be explained by replication of T-DNA prior to integration, assumed in the replication model, and thus must have arisen during or after the T-DNA integration. We favour such an idea and it is supported by our data obtained for various direct repeat junctions.
Model involving a reactive spot with various enzyme activities
Taking into account our data together with some previously published data, we do not think that there is one simple mechanistic model to explain direct repeat formation. In our view, two logistic principles should apply. First, with regard to the localization of the event, there has to be a site susceptible to integration, a reactive spot to which the T-strands are attracted, either in the form of individual T-complexes or as a cluster of mutually interactive T-complexes. Second, with regard to the character of the event, the enzymes (subunits) present at the reactive site and taking part in DNA ‘processing’ very likely compete for substrate and the outcome of the enzyme activities defines the resulting form of junction in a direct repeat. These conclusions are based on the following facts which we would like to emphasize.
In assessing the possible mechanisms, we considered the events leading to the formation of a junction in clone 532. They are depicted in Fig. 4 and are consistent with the model of T-DNA integration into plant DNA ( Tinland & Hohn 1995;Tinland et al. 1995 ). According to this model, the T-strand attacks plant chromosomal DNA, temporarily single-stranded, and finds a microhomology to its 3′ end. The free 3′ end of T-strand is used as a primer for repair synthesis of plant DNA. The same strand of chromosomal DNA is attacked by the 5′ end of T-strand which is covalently bound to VirD2 protein (probably involved in the integration process) and anneals with chromosomal DNA either transiently or due to stabilization by proteins. Subsequently, the T-DNA 5′ end is joined to the plant DNA 3′ end generated by a single-strand nick. A single-strand break in the invaded strand generates a 3′ end for repair replication of T-DNA (for details see Fig. 5).
In our scheme we took into account three copies of single-stranded T-DNA. Possibly during the formation of junction in clone 532 the T-strand 1 interacts with the upper strand of plant DNA in a manner described by Tinland & Hohn (1995). The T-strands 2 and 3 on the other hand each invade both the T-strand 1 and the lower strand of plant DNA. Further steps are analogous to the model mentioned above and consist of single-strand breaks, DNA repair synthesis and ligation (for details see Fig. 4).
The main reason for preferring such a scheme is the presence of an inversely oriented part of T-DNA in filler found in the junction of clone 532. From our point of view it could only have been generated by replication of a T-strand, assuming that T-DNA enters the nucleus in a single-stranded form ( Tinland et al. 1994 ). It is thought that some portion of T-DNA in the nucleus has to exist in a double-stranded form because transient expression and extra-chromosomal homologous recombination occur with remarkable frequencies ( Janssen & Gardner 1990;Offringa et al. 1990 ). However, if replication was achieved before integration and T-DNA integrated in a (partially) double-stranded form, the replicated T-DNA would have to be shortened via single-strand or double-strand breaks. In that case there would be no VirD2 protein bound to the 5′ end to help integration and probably no single-stranded 3′ end searching for a microhomology. Moreover, the reason why the other two T-strands would then be integrated into the same site via a T-DNA-specific mechanism seems unclear. Therefore, we favour the idea that the partial replication of T-strand 1 starts after the initial steps of integration (or extra-chromosomally; not included in Fig. 4) and is interrupted by invading T-strands 2 and 3 which are integrated into the chromosome in a regular T-DNA-specific manner.
We exclude the possibility that the inverted orientation of T-strand 1 is a result of its integration into the lower strand while T-strands 2 and 3 are integrated into the upper strand, because it does not seem likely that the ends of T-strands 2 and 3 would be preserved during the recombination while only a small portion of T-strand 1 should be integrated.
More T-strands (of different agrobacterial origin) may be present at one target site of plant chromosome as reported for co-transformation events ( De Block & Debrouwer 1991;Depicker et al. 1985 ) and integration of TL-DNA and TR-DNA (two distinct T-DNAs of A. rhizogenes Ri plasmid) into an inverted repeat ( Jouanin et al. 1989 ). Therefore, a likely explanation for T-DNA repeat formation is interaction of nucleoprotein complexes among themselves and/or their attraction to the relaxed plant DNA. The question remains as to what the mediator of such an interaction could be. Possible candidates are host cellular proteins and/or the bacterial T-complex proteins.
However, there are more possibilities for interpretation of our results. In the case of clone 532, a two-step integration could have occurred, with integration of one T-DNA copy during one cell cycle and integration of the other two T-DNA copies (forming a direct repeat) into the same target site in another cell cycle. The preference for the same target site could be conferred by the chromatin structure and the target site position. Such a preference is not consistent with the results obtained in a study by Offringa et al. (1990) for gene targeting in Arabidopsis, in which the targeting of an Agrobacterium-delivered non-functional selection marker (a defective nptII gene previously introduced into the chromosome) by DNA carrying the restoring part of the selection marker occurred with very low frequency (3 × 10⁻⁴).
Our data do not support the possibility of rolling-circle-like replication of one T-strand copy ligated via its ends into a single- or double-stranded circular molecule, because of the presence of filler DNA in half of the junctions and, mainly, because of the T-DNA-derived fillers (clones 532, 131, 238). However, it cannot be excluded that a rolling-circle-like replication occurs in some cases of direct repeat junctions, namely the perfect junctions. For these junctions the extra-chromosomal ligation model could also apply. There may be more mechanisms responsible for direct repeat formation and the choice among them would then depend on the overall situation of the cell or just pure coincidence.
Bacterial strains and vector construction
The transformation vector pPCV631-L-TX, derived from pPCV631 (Koncz et al. 1989), was used for protoplast transformation. The T-DNA of the vector carries a plant selectable marker, hygromycin phosphotransferase (hph), a modified CaMV 35S promoter (CaMV 35S-TX with tetracycline repressor binding sites) (Gatz et al. 1991) and a promoterless neomycin phosphotransferase gene with its own ATG codon. The initial vector pPCV631 (Koncz et al. 1989) was modified by addition of the CaMV 35S-TX promoter (Gatz et al. 1991). The promoter was isolated as an EcoRI + Klenow/BamHI fragment from the pUCA7-TX plasmid (Gatz et al. 1991) and subcloned into pUCLB (obtained from C. Koncz, unpublished data) digested with HindIII + Klenow/BamHI. The KpnI/PvuII fragment carrying the left border sequence (BL) together with the promoter was then isolated and ligated to pPCV631 cut by KpnI/ClaI + Klenow. Bacterial strains described in Koncz & Schell (1986) were used.
Protoplasts from Nicotiana tabacum SR1 (Maliga et al. 1973) grown on modified solid MS medium (Murashige & Skoog 1962) were isolated (Nagy & Maliga 1976) and, after 3 or 4 days of cultivation when the first cell divisions were visible, were transformed by Agrobacterium tumefaciens. After 24 h of co-cultivation, the plant cells were washed twice with media containing 1 mg ml⁻¹ cefotaxime (Claforan, Roussel) and further regenerated (Depicker et al. 1985). The selection of transformants carrying direct repeats of T-DNA was achieved on media containing 100 μg ml⁻¹ kanamycin. The transformation frequency (5–15%) was estimated in control experiments on media containing 20 μg ml⁻¹ hygromycin as the number of transformed calli per total number of calli regenerated without selection, and the frequency of direct repeat formation was then determined as the number of transformed kanamycin-resistant calli per total number of transformed calli.
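The two frequencies defined above are simple ratios, sketched here for clarity. The function names are hypothetical and the callus counts are invented for illustration only; they are not data from these experiments.

```python
def transformation_frequency(transformed_calli: int, calli_without_selection: int) -> float:
    """Transformed (hygromycin-resistant) calli per calli regenerated without selection."""
    if calli_without_selection <= 0:
        raise ValueError("number of calli regenerated without selection must be positive")
    return transformed_calli / calli_without_selection

def direct_repeat_frequency(kanamycin_resistant_calli: int, transformed_calli: int) -> float:
    """Transformed kanamycin-resistant calli per total number of transformed calli."""
    if transformed_calli <= 0:
        raise ValueError("number of transformed calli must be positive")
    return kanamycin_resistant_calli / transformed_calli

# Hypothetical counts, chosen only to show the arithmetic.
tf = transformation_frequency(transformed_calli=100, calli_without_selection=1000)
drf = direct_repeat_frequency(kanamycin_resistant_calli=30, transformed_calli=100)
print(f"transformation frequency: {tf:.0%}")
print(f"direct repeat frequency: {drf:.0%}")
```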
Plant DNA isolation, PCR, sequencing and Southern hybridization
Total genomic DNA from transformed calli or regenerated plants was isolated using the modified method of Murray & Thompson (1980).
The PCR mixture (50 μl) contained 1 μg of plant DNA and 1 unit of Taq polymerase (Promega) in the supplied amplification buffer with 1 μM primers, 200 μM dNTPs and 2.5 mM MgCl₂. The primers (Genset) for amplification were 5′-GGATGACGCACAATCCCAC-3′ (complementary to the 3′ end of the CaMV 35S promoter region) and 5′-GTGCAATCCATCTTGTTCAATC-3′ (complementary to the 5′ end of the nptII gene) (Fig. 1a). The PCR was ‘hot-started’ at 94°C for 10 min, followed by 30 cycles of 94°C for 90 sec, 55°C for 90 sec and 72°C for 3 min with a 5 sec extension per cycle, and finished with 72°C for 10 min. The final elongation was prolonged to 30 min when the PCR fragments were to be cloned into a T-cloning vector (MBI Fermentas). Prior to cloning, the PCR fragments were purified with a PCR product purification kit (Boehringer).
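Basic properties of the two primers can be checked with a short sketch. The primer sequences are those given above; the labels `35S_primer` and `nptII_primer` are hypothetical, and the melting temperature uses the Wallace rule (2°C per A/T, 4°C per G/C), a rough rule of thumb for short oligonucleotides that is assumed here for illustration and is not described in the source.

```python
# Primer sequences from the text (5' to 3'); the dictionary keys are hypothetical labels.
PRIMERS = {
    "35S_primer": "GGATGACGCACAATCCCAC",
    "nptII_primer": "GTGCAATCCATCTTGTTCAATC",
}

def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return sum(1 for base in seq.upper() if base in "GC")

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return gc_count(seq) / len(seq)

def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc

for name, seq in PRIMERS.items():
    print(name, len(seq), "nt", f"GC {gc_fraction(seq):.1%}", f"Tm ~{wallace_tm(seq)} C")
```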
The dideoxy chain-termination method with Sequenase™ Version 2.0 (USB, Amersham) was used for sequencing. The reactions and gel runs were performed according to the manufacturer’s instructions with standard pUC19 primers as well as the PCR primers mentioned above. Other commonly used techniques were performed according to Ausubel et al. (1989).
Analyses of DNA sequences
Searches for homologies of analysed DNA sequences to DNA sequences in public databases were performed on an NCBI server using the BLASTN program ( Altschul et al. 1990 ).
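The idea of testing a filler sequence against a reference in both orientations can be illustrated with a toy sketch. This is not BLASTN and not the procedure used here: it finds only the longest exact shared run, and the sequences and function names are hypothetical examples rather than sequences from the vector or the clones.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def longest_shared_run(query: str, reference: str) -> int:
    """Length of the longest exact substring common to both sequences."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for q in query:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if q == r:
                # Extend the run that ended at the previous base of both sequences.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def best_orientation(filler: str, reference: str):
    """Return ('direct' or 'reverse', length of the longest exact shared run)."""
    fwd = longest_shared_run(filler, reference)
    rev = longest_shared_run(reverse_complement(filler), reference)
    return ("direct", fwd) if fwd >= rev else ("reverse", rev)

# Hypothetical toy sequences: the filler is a 10 nt stretch of the reference, inverted.
reference = "ATGGCGTACCTTGAGCA"
filler = reverse_complement(reference[3:13])
print(best_orientation(filler, reference))
```

A real analysis would tolerate mismatches and gaps, which is what alignment programs such as BLASTN provide.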
We would like to thank Dr Barbara Hohn for careful reading of the manuscript and critical comments on it. The work was supported by the Grant Agency of the Czech Republic: Project No. 206/96/K188.