T‐circle vector strategy increases NHEJ‐mediated site‐specific integration in soybean

Soybean transgenic plant #83 is a two-copy transgenic event: one copy is an SSI at the chromosome 6 DT5.1 site (see S4), and the other is integrated in the 5' leader sequence of the soybean phospholipase A1-I beta2 mRNA (LOC100819591, GenBank # XM_003549891). The sequence of this region is identical in soybean cultivars Williams 82 and A3555. The three junctions are detailed in S6.


Materials and Methods
Target sequence selection.
The DT5.1 region, located near the end of soybean (Glycine max) chromosome 6, was selected by identifying hypomethylated regions in DNA methylation datasets from our proprietary soybean germplasm A3555, and is disclosed in Supplementary 1. The LbCas12a DT5.1 target site was selected by a proprietary gRNA finder software and is unique in the soybean genome. The target site was chosen for its high cutting frequency in stable transformants, as measured by Illumina sequencing of the PCR products. We observed mutation frequencies of 95% at this target site across multiple experiments.
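The gRNA finder used here is proprietary and is not described, so the following is only a minimal sketch of the general idea behind such a search: enumerate candidate LbCas12a sites (a 5'-TTTV-3' PAM followed by a protospacer) on both strands and check whether a protospacer occurs exactly once in a set of sequences. The function names, the 23 nt protospacer length, and the exact-match uniqueness test are assumptions made for illustration; a real tool would also score near-matches.

```python
import re

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_cas12a_sites(seq, spacer_len=23):
    """List (strand, pam_start, pam, protospacer) for every TTTV PAM.

    pam_start is the 0-based offset on the scanned strand. spacer_len is an
    assumed protospacer length, not a value taken from the paper.
    """
    # The lookahead keeps overlapping candidates (e.g. TTTTA contains two).
    pattern = re.compile(r"(?=(TTT[ACG])([ACGT]{%d}))" % spacer_len)
    sites = []
    for strand, s in (("+", seq.upper()), ("-", revcomp(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites

def count_occurrences(haystack, needle):
    """Count possibly overlapping exact occurrences of needle."""
    count, i = 0, haystack.find(needle)
    while i != -1:
        count += 1
        i = haystack.find(needle, i + 1)
    return count

def is_unique(protospacer, sequences):
    """True if the protospacer occurs exactly once across both strands."""
    rc = revcomp(protospacer)
    total = 0
    for s in sequences:
        s = s.upper()
        total += count_occurrences(s, protospacer) + count_occurrences(s, rc)
    return total == 1
```

Exact matching is the simplest possible uniqueness criterion; it says nothing about off-target sites that differ by a few mismatches.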

Plasmid construction.
The sequences of the essential genetic elements, including aadA, Cas12a, the gRNA cassettes, the border variants, and the GOI terminator, are disclosed in Supplementary 1. The elements were either amplified by PCR with Q5 DNA polymerase (New England Biolabs) from existing plasmid templates or genomic DNA, or synthesized (Bio Basic Inc., Markham, Ontario, Canada), with 20 to 25 bp overlapping sequences between adjacent elements at the assembly junctions included in the primers. The expression cassettes and elements were assembled by the hot fusion method as described previously (Fu et al., 2014). All binary vectors contain the aadA expression cassette as the selectable marker, which confers spectinomycin resistance in transgenic shoots (Brian et al., 2013). To remove the selectable marker, LbCas12a, and the gRNAs in progeny plants, a Cre recombinase autoexcision cassette (Ye et al., 2023) was added to the T-DNA, and a pair of loxP sites was used to flank the marker gene and the gene-editing accessory genes. An ori pRi binary vector backbone was used for all soybean binary vectors; it is maintained as a single copy in Agrobacterium and produces approximately 50% single-copy transgenic events in regular transformation (Ye et al., 2011). All plasmids were confirmed by full sequencing.
The two plasmids designed with the T-circle vector strategy are depicted in Fig. S2. The aadA marker gene is split into two pieces placed at the two T-DNA ends. The RB inner sequence after the 25 bp core sequence (Supplementary 1) was removed by PCR, which leaves a 3 bp residue after T-strand processing. The two plasmids are identical except for the left border length. In plasmid 1, the inner LB sequence before the 25 bp core border sequence was removed, which leaves a 22 bp LB residue after proper T-strand termination (Fig. S3). In plasmid 2, the original LB was used, which leaves a 285 bp inner LB sequence after T-DNA processing (Fig. S4).

Agrobacterium preparation and soybean transformation.
Agrobacterium tumefaciens strain AB30, which is derived from the nopaline strain ABI (Ye et al., 2008) by knocking out the kanamycin resistance gene, was used for all experiments. The binary vectors were transformed into AB30 by electroporation, and single colonies were selected on LB medium with 30 mg/L gentamicin and 50 mg/L kanamycin. The vectors in Agrobacterium were verified by full plasmid sequencing. Soybean cultivar A3555 was used for transformation. Agrobacterium preparation and soybean transformation were performed as described previously (Ye et al., 2008; Martinell et al., 2013), with the minor modification that 150 mg/L spectinomycin was used instead of glyphosate for plant regeneration. Briefly, meristem embryos mechanically excised from dry seeds were imbibed in INO medium for 1 hour and inoculated with Agrobacterium using sonication. After a five-day co-culture, the meristem embryo explants were selected on WPM medium with 150 mg/L spectinomycin for 6 weeks. Green shoots with the original hypocotyls were transferred directly into soil plugs for rooting. Leaf samples were collected after 2 weeks for DNA extraction. Independent primary transgenic plants produced in tissue culture are also referred to as transgenic events.

DNA extraction and transgene copy number determination.
Leaf samples were used for DNA extraction following the method described previously (Kouranov et al., 2022). The transgene copy number was determined by TaqMan® assay following the manufacturer's instructions (ThermoFisher Scientific, Waltham, MA, USA). The aadA selectable marker gene copy number assay used the primers 5'-AGCTAAGCGCGAACTGCAAT-3' (forward) and 5'-GGCTCGAAGATACCTGCAAGA-3' (reverse), which amplify the aadA marker gene in the soybean binary vectors, with detection by the MGB (minor groove binding) TaqMan® probe 6FAM-TGGAGAATGGCAGCGCAATGACA. The ori pRi TaqMan® primers and detection probe described previously (Ye et al., 2011) were used for all vector backbone detection.
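The copy number calculation itself is not spelled out here beyond the reference to the manufacturer's instructions. As a hedged illustration, TaqMan-based copy number assays are commonly analyzed with the comparative Ct (ΔΔCt) method, in which the target signal is normalized to an endogenous reference assay and compared with a calibrator sample of known copy number. The sketch below assumes that approach; the function name, the endogenous reference, the calibrator, and the Ct values in the usage note are hypothetical.

```python
def copy_number(ct_target, ct_reference,
                cal_ct_target, cal_ct_reference,
                calibrator_copies=1.0):
    """Estimate transgene copy number by the comparative Ct method.

    ct_target / ct_reference: Ct of the transgene assay (e.g. aadA) and of an
    assumed endogenous reference assay in the test sample.
    cal_ct_target / cal_ct_reference: the same two values in a calibrator
    sample whose copy number (calibrator_copies) is known.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = cal_ct_target - cal_ct_reference
    delta_delta_ct = delta_sample - delta_calibrator
    # One Ct cycle earlier corresponds to roughly twice the template.
    return calibrator_copies * 2.0 ** (-delta_delta_ct)
```

For example, with hypothetical Ct values, `copy_number(24.0, 25.0, 25.0, 25.0)` gives an estimate of two copies relative to a one-copy calibrator. In practice the estimate is rounded to the nearest integer, and replicate wells are used to judge its reliability.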

DNA library construction for the genomic insertion fragments and analysis of target insertion by Illumina sequencing and PacBio Hi-Fi long-read sequencing.
Genomic DNA (gDNA) libraries were prepared starting with at least 20 ng of gDNA (Kouranov et al., 2022). Kapa HyperPlus kits (Roche) were used in a 96-well format to create individual gDNA libraries. Individual libraries were size-selected by two-step bead purification (St. John & Quinn, 2008) to facilitate sequencing on an Illumina NovaSeq 6000 instrument. The sequence reads were assembled against the entire vector sequence and the soybean genome using CLC Genomics Workbench 20.0.4 software (Qiagen). Reads mapping to the transformation vectors and to genomic junction fragments were used to identify the T-DNA insertion location by BLASTn analysis of the flanking sequences against the entire soybean genome.
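The junction analysis described above was carried out in CLC Genomics Workbench and BLASTn. Purely to illustrate the underlying idea, the sketch below shows how a chimeric read that spans a T-DNA/genome junction could be split into its vector part and its genomic flank, which would then be the query for a genome search. The function name, the exact-match anchor, and the 20 nt anchor length are assumptions, and only one read orientation is handled.

```python
def split_junction_read(read, tdna_end, anchor_len=20):
    """Split a junction-spanning read into (junction_offset, genomic_flank).

    tdna_end is the T-DNA terminal sequence, oriented so that its last base
    abuts the genome. The last anchor_len bases are used as an exact-match
    anchor. Returns None if the anchor is not found in the read.
    """
    read = read.upper()
    anchor = tdna_end.upper()[-anchor_len:]
    pos = read.find(anchor)
    if pos == -1:
        return None
    junction = pos + len(anchor)   # first base after the T-DNA end
    return junction, read[junction:]
```

A flank recovered this way is only informative if it is long enough to place uniquely in the genome. Truncated or rearranged T-DNA ends will not match an exact anchor and call for an alignment-based approach instead.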

Supplementary 2
Fig. S1 to S6, Table S1 to S3 and genetic element sequences

In all single-copy SSI events that we have analyzed so far, the chromosomal junctions with the RB and LB do not exist; instead, an RB-LB junction is found inside the Arabidopsis Act7 intron. It is therefore plausible that the single-copy SSI events are derived from end-joining of a single T-DNA (T-circle) followed by re-linearization.

Note: Plasmids 1 and 2 are identical except for the LB length (Fig. S2). TF: transformation frequency = transgenic plants/number of explants. 1 copy: frequency of single-copy transgenic plants regardless of backbone. Backbone positive: frequency of plants positive for the ori pRi backbone probe. 1 copy SSI rate: SC-BF-PGE events. The SSI rate is defined as SSI events/analyzed samples. 2 SSI copy events: two inserts, one in the target site with perfect genetic elements, the other at another chromosomal location. N/A: not available.

Note: Sequence ID: each represents a DNA sample from an independent transgenic soybean event.
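The frequency definitions in the notes above are simple ratios, and the following minimal sketch shows how they could be computed. The function names and the counts in the usage note are hypothetical placeholders and are not data from this study.

```python
def rate(numerator, denominator):
    """Return numerator/denominator as a percentage, or None (N/A) if undefined."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator

def summarize(explants, transgenic_plants, analyzed_samples, ssi_events):
    """Compute TF and SSI rate as defined in the table notes."""
    return {
        # TF = transgenic plants / number of explants
        "TF_percent": rate(transgenic_plants, explants),
        # SSI rate = SSI events / analyzed samples
        "SSI_rate_percent": rate(ssi_events, analyzed_samples),
    }
```

With placeholder counts, `summarize(1000, 50, 40, 4)` yields a TF of 5.0% and an SSI rate of 10.0%. The two rates use different denominators (explants versus analyzed samples), so they cannot be compared directly.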

Fig. S3: The expected junction of the shortened RB and the shortened LB after T-DNA circularization and re-linearization from plasmid 1 in transgenic soybean plants. A 27 bp border residue in the middle of the intron from the Arabidopsis Act7 promoter is highlighted in blue.

Fig. S4: The expected junction of the shortened RB and the full-length LB after T-DNA circularization and re-linearization from plasmid 2 in transgenic soybean plants. A 289 bp border residue in the middle of the intron from the Arabidopsis Act7 promoter is highlighted in blue.
Backbone: vector backbone presence detected by TaqMan®, shown as negative (neg) or positive (POS). aadA copy: aadA transgene copy number determined by TaqMan® using the aadA probe. T-DNA location: the chromosomal location of the T-DNA insert(s). On-target: T-DNA insert at the expected DT5.1 target site on chromosome 6. Color code: dark green: single-copy SC-BF-PGE SSI events; light green: 2-copy SSI events with one SC-BF-PGE insert; blue highlighted: SSI events with loxP junction deletion. Insert direction: two insertion orientations [forward (FWD) or reverse (REV)] as shown in Fig. S5. Other abbreviations: Chr01 = chromosome 01, and so on; RI = random insert; inv rep = inverted repeats; BKB = backbone.