Hybridization and range expansion in tamarisk beetles (Diorhabda spp.) introduced to North America for classical biological control

Abstract With the global rise of human‐mediated translocations and invasions, it is critical to understand the genomic consequences of hybridization and mechanisms of range expansion. Conventional wisdom is that high genetic drift and loss of genetic diversity due to repeated founder effects will constrain introduced species. However, reduced genetic variation can be countered by behavioral aspects and admixture with other distinct populations. As planned invasions, classical biological control (biocontrol) agents present important opportunities to understand the mechanisms of establishment and spread in a novel environment. The ability of biocontrol agents to spread and adapt, and their effects on local ecosystems, depends on genomic variation and the consequences of admixture in novel environments. Here, we use a biocontrol system to examine the genome‐wide outcomes of introduction, spread, and hybridization in four cryptic species of a biocontrol agent, the tamarisk beetle (Diorhabda carinata, D. carinulata, D. elongata, and D. sublineata), introduced from six localities across Eurasia to control the invasive shrub tamarisk (Tamarix spp.) in western North America. We assembled a de novo draft reference genome and applied RADseq to over 500 individuals across laboratory cultures, the native ranges, and the introduced range. Despite evidence of a substantial genetic bottleneck among D. carinulata in N. America, populations continue to establish and spread, possibly due to aggregation behavior. We found that D. carinata, D. elongata, and D. sublineata hybridize in the field to varying extents, with D. carinata × D. sublineata hybrids being the most abundant. Genetic diversity was greater at sites with hybrids, highlighting potential for increased ability to adapt and expand. Our results demonstrate the complex patterns of genomic variation that can result from introduction of multiple ecotypes or species for biocontrol, and the importance of understanding them to predict and manage the effects of biocontrol agents in novel ecosystems.


| INTRODUC TI ON
Human-mediated translocations (e.g., introductions) and climate change have reshaped range limits and previous barriers to gene flow on a global scale (Capinha et al., 2015). Conventional wisdom is that high genetic drift and loss of genetic diversity due to repeated founder effects will constrain adaptation, particularly at the expansion front of introduced species or species expanding their ranges due to environmental change (Estoup et al., 2016;Excoffier et al., 2009;Slatkin & Excoffier, 2012). However, reduced genetic variation can be countered by behavioral aspects of introduced and rangeexpanding species, such as aggregation, and by admixture with other distinct populations (Estoup et al., 2016). Introductions can also provide insights into the stability of cryptic species complexes with allopatric, parapatric, and sympatric native ranges. Introductions can bring such species into contact under novel conditions or secondary contact, representing test cases of ecological speciation (Smith et al., 2018). Furthermore, understanding the genomic consequences of translocations and admixture between introduced populations is key to both preventing the spread of invasive species and improving the conservation of threatened species (Fauvergue et al., 2012; McFarlane & Pemberton, 2019; Roderick & Navajas, 2003).
However, the spontaneous nature of accidental introductions makes it difficult to study the outcomes and consequences of ecoevolutionary processes occurring in these systems, since, among other factors, the introduction history and founding population sizes are typically unknown. Classical biological control programs (hereafter biocontrol) are essentially planned, intentional invasions.
In a typical biocontrol program, highly host-specific natural enemies (agents) are collected in their native range and introduced into a novel environment to control invasive pests (targets; McFadyen, 1998). Biocontrol systems thus provide an unmatched opportunity to study invasions from a genomic perspective because, compared with their respective target invasive species, biocontrol agents were introduced relatively recently and almost always intentionally, with known source locations, introduction localities, and sometimes known introduction population sizes (Marsico et al., 2010). Despite the opportunity biocontrol agents represent, genomic tools have rarely been used to identify and characterize the consequences of founder effects or evolutionary mechanisms contributing to establishment, persistence, range expansion, or rapid evolution in classical biocontrol agents of invasive species (Hopper et al., 2019;Leung et al., 2020;Muller-Scharer et al., 2020;Sethuraman et al., 2020;Szűcs et al., 2019).
Biocontrol scientists have, on several occasions, released several individuals from different locations in the native range, with either known, or inferred, differences in phenotypes. These different populations are referred to as "ecotypes." Different ecotypes are often released in the hope that some will better match the novel environment (DeBach & Rosen, 1991;Frick, 1970;Room et al., 1981;Smith et al., 2018). While this practice may increase the chance of ecological matching across a diverse range of target habitat, it also opens the door for novel phenotypes to arise upon hybridization of different ecotypes. Admixture among different populations, and, at an extreme, hybridization between different species, may present the genetic novelty and diversity necessary to overcome the bottleneck imposed by introduction and adapt, but at the risk of yielding undesirable traits or decreases in fitness (Fauvergue et al., 2012;Kolbe et al., 2004;Lommen et al., 2017;Rius & Darling, 2014).
The outcomes of multiple agent releases have been understudied, while the consequences of hybridization and admixture among divergent biocontrol agents are even less well understood (but see Szűcs et al., 2011;Szűcs et al., 2021). Biocontrol efforts could be enhanced if hybridization resulted in increased genetic diversity, providing the raw material for the regional evolution of more efficacious ecotypes, or increased fitness in populations with higher genetic diversity (Bean et al., 2013;Bitume et al., 2017;Szűcs et al., 2021;Szűcs, Eigenbrode, et al., 2012;Tracy & Robbins, 2009). If hybridization resulted in forms with less host specificity than seen in parental forms, for example, biocontrol safety could be compromised, though there have been no documented cases of evolution in fundamental host range. Despite evidence of a substantial genetic bottleneck among D. carinulata in N.
America, populations continue to establish and spread, possibly due to aggregation behavior. We found that D. carinata, D. elongata, and D. sublineata hybridize in the field to varying extents, with D. carinata × D. sublineata hybrids being the most abundant. Genetic diversity was greater at sites with hybrids, highlighting potential for increased ability to adapt and expand. Our results demonstrate the complex patterns of genomic variation that can result from introduction of multiple ecotypes or species for biocontrol, and the importance of understanding them to predict and manage the effects of biocontrol agents in novel ecosystems.

K E Y W O R D S
biological control, de novo genome assembly, hybridization, invasion genomics, RADseq, range expansion range (the range of host species on which the agent can complete development; Van Klinken & Edwards, 2002). Given the potential consequences, postrelease monitoring of biocontrol programs involving the release of different populations or species is necessary (Hufbauer, 2008;McFarlane & Pemberton, 2019). Recently developed molecular tools can now also be routinely employed to track biocontrol agents at the population genetic level.
Here, we use a biological control system to understand the consequences of introduction, intraspecific admixture, and interspecific hybridization in four introduced closely related species rapidly expanding their ranges. To this end, we produced a de novo draft genome assembly of one of the species and used this to (1) identify genetic variation indicating ancestry of the four species, including two ecotype pairs within species, to be able to track their distribution across the introduced range, (2) quantify prevalence and levels of hybridization in the introduced range, and (3) examine the consequences of population bottlenecks and hybridization on genomewide diversity across broad, landscape-wide expansion fronts and the hybrid zone. We predicted that admixture among populations and hybridizing species would lead to an increase in genetic variation, while isolation of disconnected patches or range expansion would lead to a decrease. We aim here to illustrate how building the molecular genetic foundation to monitor and predict evolution in introduced species can improve our understanding of mechanisms and consequences of range expansion and hybridization in novel environments.

| Study system
The case of Diorhabda spp. (Coleoptera: Chrysomelidae), a leaf beetle released to control invasive woody shrubs of the genus Tamarix (hereafter, tamarisk), provides a system in which questions relating to the genomic consequences of invasion can be addressed.
Tamarisk is native to North Africa and Eurasia and has become invasive in riparian areas across the western United States and northern Mexico. Stands of tamarisk can form dense monotypic thickets that cause substantial economic and environmental damage including increased fire intensity and frequency (Drus et al., 2013), increased evapotranspiration (Nagler et al., 2014), diminished soil mycorrhizae critical for native plant species (Meinhardt & Gehring, 2012), and a number of other negative impacts on native flora, wildlife habitat, and recreation (Di Tomaso, 1998;Gaskin & Schaal, 2002;Shafroth et al., 2005;Zavaleta, 2000). The large extent of the tamarisk invasion, which is estimated to cover at least 360,000 hectares (Nagler et al., 2011), coupled with the high value of ecologically sensitive riparian areas and the cost of conventional control, which runs in the millions of dollars (US) per project (Knutson et al., 2019), provided impetus for development and implementation of a biocontrol program.
The first agent released for tamarisk biocontrol was the northern tamarisk beetle, Diorhabda carinulata, originally introduced in 2001 from two ecotypes collected in Fukang, China (44. 17°N, 87.98°E), and Chilik, Kazakhstan (43.6°N, 78.25°E;DeLoach et al., 2003). Initial field releases from Fukang and Chilik were performed at eight locations in North America (Figure 1;DeLoach et al., 2003) and the species established at northern locations but failed to establish below the 38 th parallel in California and Texas  leaving many heavily invaded river systems without a biocontrol option. To address this problem and improve ecological matching across the diverse invaded range (Sands & Harley, 1980), four additional tamarisk-feeding Diorhabda populations were collected from ecologically distinct locations in North Africa and Eurasia and introduced primarily in tamarisk-infested areas of western Texas (Knutson et al., 2019;Michels et al., 2013;Tracy & Robbins, 2009).
Three ecotypes were elevated to species status in a taxonomic revision based in part on morphology of the genital sclerites (Tracy & Robbins, 2009 Tracy & Robbins, 2009).
The Diorhabda system is one of the few examples in which contemporary evolution has been demonstrated in a biocontrol agent of invasive plants (see also Szűcs, Eigenbrode, et al., 2012;Szűcs et al., 2019 for work on Longitarsus jacobaeae). Evolution of response to photoperiod signals enabled rapidly southward-expanding D. carinulata populations to enter diapause in closer synchrony with the seasonal timing of senescence of tamarisk stands growing in more southern and warmer climates, where the growing season is longer than in the north (Bean et al., 2012;Dalin et al., 2010;Hultine et al., 2015). Evolution in diapause induction enabled faster range expansion than initially expected (Nagler et al., 2014). Range expansion of D. carinulata may not lead to reduced genetic variation at the expansion edge (Slatkin & Excoffier, 2012) given negative density-dependent dispersal (Birzu et al., 2019) and migration en masse ("swarming") that is common among mobile insects (Sullivan, 1981), especially at range expansion fronts as host resources are depleted and aggregation pheromones draw individuals to mating sites (Cosse et al., 2005).
Additionally, secondary contact between Diorhabda species in North America has likely initiated hybridization in this cryptic species complex, uniquely providing a window to the stability of cryptic species upon secondary contact and potentially representing a test case of (non)ecological speciation (Smith et al., 2018). In this case, only D. carinulata and D. carinata are sympatric in the native range, D. elongata is in parapatry with D. sublineata to the west and D. carinata to the east, and D. sublineata and D. carinata are the only species pair completely allopatric (Tracy & Robbins, 2009). While no intermediate forms indicative of hybridization among the four Diorhabda species were found in beetles collected from the native range (Tracy & Robbins, 2009), laboratory experiments showed that D. carinata, D. elongata, and D. sublineata can readily cross and back-cross with viable eggs. In contrast, hybrids and back-crosses between D. carinulata and the other three species showed significantly reduced egg viability and male sterility (Bean et al., 2013).
Understanding hybridization, genetic diversity, and range expansion in introduced Diorhabda populations is a high priority in management of the tamarisk invasion. A long-standing goal is to evaluate and enhance Diorhabda as a tamarisk control option, as the tamarisk invasion and the expense of controlling the shrub at a regional scale have heightened interest in biocontrol among regional resource managers (Bean & Dudley, 2018). Recently, because some native species now utilize tamarisk, including an endangered bird subspecies, the southwest willow flycatcher (SWFL; Empidonax traillii extimus) known to nest in the shrub, a new challenge has been presented to coordinate riparian restoration efforts with declining density of tamarisk brought about by biocontrol with rapidly evolving tamarisk beetles (Hultine et al., 2010;Sogge et al., 2008).
Previous resources to monitor hybridization and range expansion include mitochondrial cytochrome c oxidase I (mt-CO1) haplotypes and morphological markers. These conventional and accessible methods have generally been in agreement for species identification of expanding populations (Bean et al., 2013;Knutson et al., 2019;Ozsoy et al., 2018Ozsoy et al., , 2019Ozsoy et al., , 2021, but both morphology and CO1 can lead to incorrect species assignments when hybridization is common, and neither can accurately quantify proportions of ancestry (Rieseberg et al., 1993;Wayne & Jenks, 1991) nor can be used to identify the genetic basis of ecologically relevant traits and inform predictions. Thus, molecular genetic analysis at the whole-genome level is critical for a more detailed understanding of field populations to inform management strategies.

| Whole-genome assembly of D. carinulata
We developed a de novo draft genome assembly using adults from an inbred line established from field-collected beetles in Lovelock, NV (40.02°N, 118.52°W), where D. carinulata, originally sourced from Fukang, China, were released in 2001. We sampled reproductively active males twice from this line, once at the fifth generation (G5) and once at the twelfth generation (G12). We used an inbred line to reduce heterozygosity and facilitate genome assembly (Kelley & Salzberg, 2010). We specifically sampled from this inbred line of D.
carinulata because it was readily sourced from an original release locality unlikely to have experienced admixture, has been well characterized and monitored (Bean et al., 2012), and would assemble more readily than a field-caught individual due to reduced heterozygosity (Vinson et al., 2005). We combined two sequencing approaches for reference genome assembly. First, we extracted genomic DNA from the head, thorax, and dissected testes of the G5 male and constructed a library for whole-genome shotgun sequencing (WGS) using the NEBNext Ultra II DNA Library. This WGS library was sequenced in one lane of a MiSeq platform (Illumina) using v3 reagents to produce paired 300bp reads, resulting in approximately 10.4 million read pairs. Second, we conducted 10× Chromium sequencing, which produces longdistance synthetically linked reads (Weisenfeld et al., 2017). We isolated high molecular weight gDNA from the dissected testes of the G12 male, using a MagAttract Kit (Qiagen). The UC Davis Genome Center prepared a Chromium 10× library with v1 chemistry. This We first assembled the WGS MiSeq 300-bp reads from the G5 male D. carinulata. We used a windowed adaptive trimmer, Sickle version 1.33, to remove adapters and low-quality reads from this library (Joshi & Fass, 2011), and retained approximately 10.3 million pairs trimmed to an average length of 249.8 bp. We built contigs and scaffolds from these reads with SPAdes version 3.7.1 with one iteration of BayesHammer error correction; k-mer values of 21, 33, 55, 77, 99, and 127; mismatch careful mode turned on; repeat resolution and mismatchCorrector enabled; and coverage cutoff turned off (Bankevich et al., 2012). Then, we incorporated the raw 10× Chromium synthetic long reads from the G12 male to scaffold these contigs using the ARCS + LINKS pipeline (Yeo et al., 2018). Briefly, we extracted barcoded reads and trimmed barcodes with the 10× software Long Ranger version 2.1.6, aligned reads to the SPAdes assembly with BWA-MEM version 0.7.17 (Li, 2013), and then supplied the SPAdes assembly and alignments to the ARCS (version 1.0.1) + LINKS (version 1.8.5) pipeline in default mode. ARCS uses the evidence of the synthetic linked reads to construct a graph of linkages for LINKS to then resolve phased scaffolds. We removed contigs of length <200 bp as required by NCBI. We assessed quality of this assembly in terms of overall contiguity using bbstats.sh from the bbmap suite (Bushnell, 2014), as well as the completeness of singlecopy conserved orthologous genes using BUSCO version 5.0.0 with the Insecta database (insecta_obd10) composed of 75 species and 1367 orthologs (Manni et al., 2021;Simão et al., 2015). We were not able to collect from the native ranges of D. sublineata and D. carinata, so we sampled from laboratory colonies derived from the same original collections used for the introductions, maintained at the Palisade Insectary, Colorado Department of Agriculture (see Bean et al., 2007, for details of laboratory culturing). To characterize their distribution in N. America (2) and examine the genomic consequences of range expansion and hybridization (3), we collected samples from the sites of the first releases of all four species, along the D. carinulata expansion front along the Virgin River, and across the suspected hybrid zone in New Mexico and Texas ( Figure 1, Table S1, Figures S1 and S2).

| Sampling and individual genotyping across the native range, introduced range, and laboratory cultures
In total, we sampled 566 beetles, from 37 locations and two laboratory cultures, for population genomic analysis. At each site, we sampled beetles from trees within a 1-km radius and limited the collection of beetles to no more than 5 individuals per tree where possible. We could not find adults at 19TX and instead collected third-instar larvae, which were reared to the adult stage under laboratory conditions. Individuals for 31NV, 32UT, 33NV, and 34UT were sampled from the second generation of laboratory cultures established from these sites. All samples were adult beetles transferred as live individuals to coolers with dry ice or immediately to a −80°C freezer. All samples were stored at −80°C until DNA extraction.
To prepare restriction site-associated DNA sequencing (RADseq) In total, we prepared 634 individually barcoded RADseq samples across eight single-digest RADseq libraries using the 8-bp restriction enzyme SbfI following the protocol described by (Ali et al., 2016 We used Stacks 2.5 (Rochette et al., 2019) to process raw fastq reads, call genotypes, and produce population genetic statistics.
First, raw sequencing reads of each library were filtered for PCR duplicates using clone_filter. Then, reads were demultiplexed by individual barcode and re-oriented using the --bestrad flag in pro-cess_radtags, allowing for 3 mismatches and discarding reads with low-quality scores (Catchen et al., 2013;Rochette et al., 2019;Stahlke et al., 2020). Each processed sample was then aligned to our D. carinulata draft genome using the --very_sensitive flag of bowtie2 version 2.2.9 (Langmead & Salzberg, 2012), and sorted with SAMtools version 1.9 (Li et al., 2009). We called genotypes for all sequenced individuals together in the Stacks 2 module gstacks with the default maruki_low model (Maruki & Lynch, 2017). Then, we required that retained sites were present in the majority of all samples (-R 50) and extracted a random SNP from each ordered locus.
Individuals were further filtered to retain those with >4× effective coverage and <75% missing genotypes with VCFtools version 0.1.16 (Danecek et al., 2011). We removed 82 individuals from the dataset with this filter. Finally, genotypes were called again, variant calling format (vcf) files were generated, and population genetic summary statistics were evaluated using Stacks populations. We refer to the final catalog of SNPs comprising all 552 individuals as the global dataset.

| Source population ancestry and hybridization
We used Structure version 2.3.4 (Pritchard et al., 2000) to identify genetic clusters. Given that founding populations were from distinct sources across Eurasia and the potential for rapid range expansion to lead to dramatic differences in allele frequency among sites, we used the uncorrelated allele frequency model and allowed the alpha parameter to be inferred for each population (Falush et al., 2003).
For each K from 1 to 10, we executed 10 independent runs, allowing a burn-in period of 10,000 steps and 10,000 Markov chain Monte Carlo replicates, and printed the estimation of 90% credible intervals. We used PopHelper 2.3.0 (Francis, 2017) to visualize results and characterize the posterior probability across values of K. After assessing global ancestry assignment with Structure, we checked for relationships between alignment rates and individual species ancestry assignment ( Figure S3).
To identify the genomic characteristics of the four species and track them (1), we compiled published data (Hudgeons et al., 2007;Knutson et al., 2012Knutson et al., , 2019Michels et al., 2013;Pratt et al., 2019) and gray literature describing original releases (directly from the native range) and redistribution efforts (translocations from original release localities) to guide our inference of ancestry assignment (Table   S2). Then, we used the Structure ancestry assignments of parental species from laboratory cultures, single-source population release sites (e.g., Fukang ecotype D. carinulata at 1WY and Chilik ecotype at 34UT; Table S2), and the native range (46CH and 37CR-43GR) to guide our ancestry inference at the remaining localities. Using those diagnostic samples to assign clusters to species identity, we examined the confidence intervals across independent runs to conservatively identify the threshold at which ancestry could be confidently inferred, q = 0.067 (i.e., the lower bound of the 90% credible interval), below which admixture identification could be unreliable and due to technical biases (Caniglia et al., 2020).
Because two of the species, D. carinulata and D. elongata, were each introduced from two source locations, we examined population substructure within each to identify the distribution of source populations in N. America. We constructed population maps for individuals that had D. carinulata or D. elongata ancestry (respectively) above the 0.067 threshold, then re-filtered SNPs as above in the Stacks populations module. We then reran Structure for each subset of individuals. This secondary analysis also provided a check for sensitivity of population genetic statistics and ancestry assignment to unbalanced sampling in the global Structure analysis (Meirmans, 2019).
Finally, with species identities and population substructure in hand, we identified hybrids within N. America (2). We classified sampled localities as "hybrid" if the majority of individuals had ancestry from more than one species. We visualized the distributions of q- Latitudinal variation serves as a proxy for photoperiod, temperature, and host-genotype variation across Diorhabda populations.
To test for a relationship between ancestry and latitude, we constructed linear models of relevant q-value from latitude for D. sublineata ancestry, as the dominant species within the suspected hybrid zone (see Section 3), and separately D. carinulata source population ancestry to test whether both source populations were represented in southward expansion.

| Genome-wide diversity
To examine the consequences of population bottlenecks and hybridization on genome-wide diversity (3), we quantified genomic differentiation among all individuals (the global dataset) within and across sites using population-based measures. We used the populations module of Stacks to calculate π (nucleotide diversity) and F IS (the inbreeding coefficient). We present π and F IS calculated from only variant sites. To estimate the proportion of private alleles within each population, we used ADZE v. 1.0 (Szpiech et al., 2008) to rarify alleles, allowing for appropriate comparison across sampling localities with different sample sizes. For this, we input the global Structure file, and rarified sample size to maximize the number of SNPs (n = 567 retained) without losing populations with too many missing loci (25% missing data threshold). We present the proportion of private alleles rarified to a standardized sample size of 4. Standard errors in these analyses were calculated across RAD loci.
To test whether genetic diversity results (π, F IS, and proportion of private alleles) were statistically different between hybrids versus pure species and native versus introduced, we performed a oneway ANOVA in R. We grouped samples within localities according to Structure results using the threshold described above. Native range populations included 46CH, 43GR, 44GR, 41 CR, 37CR, 39CR, and 38CR. Although differing sample sizes between localities could have impacted estimates of private alleles (Leberg, 2002), we employed rarefaction, sampled more introduced populations compared with the native range populations, and used replication through sampling several sites to ensure conservative bottleneck estimates. We excluded laboratory colony populations from this analysis.
To quantify isolation by distance (IBD), we calculated the distances between the latitude and longitude coordinates of each sampling locality using the haversine formula and constructed a pairwise distance matrix (Sinnott, 1984). Then, using the vegan package (Oksanen et al., 2019), we conducted a Mantel test with 999 permutations on pairwise F ST and distance matrices (Hutchison & Templeton, 1999) for the following groups: (a) introduced D. carinulata, (b) native range D. elongata, and (c) the suspected hybrid zone.
After characterizing IBD, we tested for a linear relationship between latitude of collection site and Chilik ancestry for the D. carinulata SNP dataset and, separately, D. sublineata ancestry using the global SNP dataset.
To examine the consequences of rapid range expansion in D. carinulata, we tested for the signal of asymmetrical range expansion with two origins (1WY and 34UT) across all of introduction sites of D. carinulata. Although 1WY is not the source population for D. carinulata released in Nevada, Colorado, and Utah, it is the best representative population for Fukang releases (Table S2). We filtered genotypes in the Stacks populations module for only those populations. We used the rangeExpansion R package version 0.0.0.9000 to estimate the strength of founder effects, the fit of the predicted range expansion dynamic, and the directionality index (ψ), which is a measure of directional clines in allele frequency created by successive colonization events (Peter & Slatkin, 2013).

| SNP genotyping
We constructed a total of eight RADseq libraries containing over 552 individuals across laboratory cultures, the native range, and the introduced range. We removed an average of 39.52% reads identified as PCR duplicates across libraries and retained or recovered a total of 86.28 million reads after initial filtering and demultiplexing. Individual reference alignment rates against the de novo draft assembly of D. carinulata were 72.77% on average, but we noted two distinct peaks near 75% and 100% that reflected species assignments ( Figure S3). Effective coverage depth averaged 24.1× after removing individuals with coverage <4×. In the global SNP dataset, 153,453 loci were merged across paired-end reads for an average locus length of 854.16 bp (std. error = 2.70) and a total of 2,247,632 nucleotide sites. We retained 1457 SNPs with an average of 8.1% missing data. In the subset of samples with D. carinulata ancestry, we retained 2629 SNPs across 182 samples. In the subset of samples with D. elongata ancestry, we retained 1766 SNPs across 99 samples.

| Source population ancestry and hybridization
Structure analysis suggested differential establishment, range expansion, and admixture among all six source populations from the four Diorhabda species. We present first the results from the global dataset consisting of all samples from all collections. The greatest change in likelihood, (i.e., the Evanno method; Evanno et al., 2005) occurred at K = 2, splitting D. carinulata from the other three introduced Diorhabda spp. and hybrids ( Figures S4, S5, S7). At K = 4, change in likelihood values plateaued (i.e., the Ln''(ΔK) method; Figure S4) and matched species identities for laboratory cultures, native range, and isolated release sites (Figure 2). The modal ancestry assignments for K = 4-10 were identical across replicates after aligning clusters, that is, with low variation among replicates and no additional clusters were recovered, except for hierarchical substructure (see below; Figures S9-S14). Therefore, we used cluster assignments from one of the representative runs (replicate 1) of K = 4 to infer parental species and hybridization. Using a threshold of q = 0.067 for inferring ancestry from a parental species, we found 179 D. carinulata, 176 D. sublineata, 69 D. carinata, and 93 D. elongata.
We did not find evidence of hybridization among laboratory cultures or in the source populations for D. carinulata (46CH) and D. elongata (44GR-41CR).
We further investigated hierarchical substructure among previously identified ecotypes within D. carinulata and D. elongata (Figure 3), indicated within the global structure results in a minority of runs per K ( Figures S6, S9-S14). In both cases, we found that K = 3, the best K according to multiple likelihood comparison methods (Figures S16, S17), corresponded to the two source populations introduced from each species and likely hybridization with other Diorhabda spp. (Figure 3, Table S2). We found that the Chilik ecotype, first introduced in 34UT, was the dominant ecotype represented along the Virgin River expansion front (Figures 1   and 3a). Individuals from sites 5CO and 11CO in eastern Colorado showed admixture with the Chilik ecotype (Figure 3a), likely due to anthropogenic movement of individuals from western Colorado, which was colonized by individuals from southeast Utah (Table S2).
We detected D. carinulata assigned to Fukang ancestry farther east than its suspected range in 6KS, 28NM, and 18TX (Figure 3a). In America (Figure 3b). In both cases of population substructure, the average distance between individuals within each of the ecotype clusters (H exp ) was between 0.07 and 0.12 and allele frequency divergence between the two clusters was about 0.006-0.008, supporting low differentiation between these ecotypes. Some individuals we identified as hybrids with other Diorhabda species (presented below) showed ancestry from a third cluster (Figure 3).
Although a third cluster may not always represent another distinct ancestry source (Lawson et al., 2018), in this case we have additional evidence that the ecotypes were also found as hybrids with one of the other three Diorhabda species (Figure 2 Table S3).
We found highly significant linear associations with large residuals between latitude and both D. sublineata in the global SNP dataset (adjusted R 2 = 0.433; p < 0.01; Figure S18a) and the Chilik ecotype in the D. carinulata (adjusted R 2 = 0.239; p < 0.01; Figure S18b), demonstrating that latitudinal variation may have influenced differential establishment and spread among genotypes.

| Genome-wide diversity
We found significant differences in genetic diversity metrics among hybrid, pure, and native range populations. Nucleotide diversity F I G U R E 2 Genetic clustering revealed using Structure (Pritchard et al., 2000) for K = 4 across all species. Each individual sample is represented by a bar. Individuals are grouped by collection site and ordered by localities of Table S1. Groups from left to right are (a) D. carinulata Fukang source collection (46CH, Figure S2), then north to south from original release sites along expansion front, which are followed by D. carinata and D. sublineata laboratory cultures and native range D. elongata collected in Greece ( Figure  S1). (b) Individuals collected within the hybrid zone of North America (Figure 1), from north to south. Locality IDs include the country or state sampled and symbols for relevant groups as follows: * = native range, Δ = source population, ∇ = release sites, • = hybrid zone, ■ = D. carinulata expansion front (π) was significantly greater at sites with hybrids (mean = 0.0863) than in either the sites with pure individuals (mean = 0.0335) or from the native range (mean = 0.0381), supporting the prediction that hybridization could increase genetic diversity (Figure 4). The proportion of private alleles was greatest among sites within the native range (mean = 0.0053) compared to those in introduced sites with pure individuals (mean = 0.0030) and hybrids (mean = 0.0016), supporting a bottleneck upon introduction for D. carinulata and D. elongata (Figure 4). Sites with hybrids also had a significantly higher average F IS value (mean = 0.157) than either the sites with pure individuals (mean = 0.0454) or within the native range (mean = 0.027; Figure 4), consistent with recent hybridization and/ or assortative mating among species. All results were significant at α = 0.05 (Table S4).
Taking genetic diversity results into a spatial context, we characterized the relationship between divergence and geographic distance among collection sites. Overall evidence for IBD was weak

| DISCUSS ION
To better understand the genomic consequences of multi-species introductions and range expansion, we used reference-based population genomics in an exemplar biocontrol system, Diorhabda (spp.). F I G U R E 3 Genetic clustering at K = 2-3 within populations of (a) D. carinulata and (b) D. elongata. Symbols as in Figure 2 This work represents a crucial foundation to advance evolutionary applications by describing the current distribution of species and ecotypes in the introduced range, interspecific hybridization, and impacts on genome-wide diversity. We discuss these results in the context of eco-evolutionary processes, highlight implications for the tamarisk biocontrol program in N. America, and discuss opportunities for further study of this system to improve our understanding of contemporary evolution and biocontrol of invasive plants.

| Distribution of parental taxa and hybridization
We found clear evidence for differential establishment of introduced Diorhabda populations and spread of these beetles from release sites across the introduced range. Among many original release sites and along suspected colonization routes, individual ancestry assignments were largely composed of the species or population that had been released in the location, suggesting that the extant populations at these release sites remained stable for several generations and naturally expanded along riparian corridors in a predictable pattern. The Fukang release of D. carinulata in Lovell, WY (1WY), appeared geographically stable, with uniform ancestry (Figure 3a) and a lack of inbreeding (Figure 4) despite being relatively isolated. The Chilik release of D. carinulata in Delta, UT (34UT), spread naturally to 12AZ along the Virgin River corridor, and also shows uniform ancestry (Figure 3a). However, the eco-evolutionary mechanisms driving differential establishment and spread among Diorhabda populations will require further study. For example, it is unclear why very little D. elongata F I G U R E 4 Population genetic statistics including nucleotide diversity (π; top panel), inbreeding coefficient F IS (middle), and the proportion of private alleles (bottom panel) for the global SNP dataset, consisting of all individuals collected at each locality and grouped (from left to right) by collections within the D. carinulata range, laboratory cultures, and the native D. elongata range, and those within the suspected hybrid zone (ordered as in Figure 2, by latitude within group). Shading of individual bars indicates whether the collection was an original release site, site within the native range, natural colonization site, or laboratory culture (from darkest to lightest). Bars represent ± the standard error for the respective statistic ancestry was detected in N. America despite many releases in Texas and New Mexico (Figure 1). The appearance of the D. carinulata Fukang ecotype in Kansas (6KS), New Mexico (28NM), and Texas (18TX; Figure 3a) was surprising given the lack of establishment reported previously (Bean et al., 2012), although releases of that population did occur near there (Figure 1). We have preliminary evidence of environmental filtering or adaptation among Diorhabda source populations, with genomic clines forming for the D. carinulata Chilik ecotype and D. sublineata along latitudinal gradients. However, the distributions are confounded by release history (Figure 1, Table S2), and the large residuals of this model due to the presence of the D. carinulata Fukang ecotype at southern latitudes of NM and TX ( Figure S9b) suggest that stochastic natural dispersal from release sites or specific genetic variation, rather than broad ecotype identity, could be driving cline formation.
Differential rates of hybridization among Diorhabda spp. suggest that there may be several outcomes when multiple closely related F I G U R E 5 Pairwise F ST compared with haversine distance between respective sites indicates patterns of isolation by distance associated with population structure for D. carinulata collected (a) in North America (Mantel's r = 0.4481, p = 0.1011 and (b) across the potential hybrid zone (Mantel's r = 0.5108, p < 0.005). In a, crossed-squares indicate pairwise comparisons within ecotype, solid squares, across ecotype. A linear model for each distribution projected behind points as a red line with standard error in gray taxa are released together, which are likely to depend at least in part upon reproductive barriers. Nonecological speciation theory predicts that when reproductive isolation between taxa is maintained by geography, it is more likely to break down upon secondary contact than when reproductive isolation is maintained by ecological or behavioral differences in sympatric species (Czekanski-Moir & Rundell, 2019). Our data on hybridization frequency supported these predictions. Reproductive isolation between D. sublineata and D. carinata was geographic in the native range (Tracy & Robbins, 2009) and has broken down in the introduced range. Although we were not surprised to find hybrids between allopatric and peripatric species pairs D. carinata, D. elongata, and D. sublineata where they were released in close proximity and thought to establish (Figure 1 (Bean et al., 2013). Sex-biased asymmetry of hybridization could be examined by employing existing mt-CO1 efforts with genome-wide ancestry analyses like these (Ozsoy et al., 2018(Ozsoy et al., , 2019(Ozsoy et al., , 2021). We did not recover mtDNA loci in this dataset. The elevated F IS observed in localities with multiple ancestry (Figure 4; e.g., 6KS and 28NM with both "pure" samples and hybrids) reflects that parental species were still detected as partially isolated, sympatric populations, not a panmictic population, in the introduced range (i.e., the Wahlund effect; Waples, 2015). Examining the mechanisms that contributed to speciation of Diorhabda in the native range (e.g., divergent environments, ecological interactions, sexual selection) and the role of those barriers in the introduced range present would improve our understanding of the stability of cryptic species broadly and in biocontrol (Fišer et al., 2018;Rundle & Nosil, 2005;Smith et al., 2018).

| Genomic diversity during range expansion in D. carinulata
The genomic basis of rapid range expansion in D. carinulata provides yet another example of the so-called "genetic paradox of invasions," wherein a reduction of genetic diversity during introduction does not preclude establishment and spread (Baker & Stebbins, 1965).
While we found a clear signature of population bottleneck by comparing the proportions of private alleles between samples from the native range (46CH) to localities with predominant Fukang ancestry (1WY, 2CO,4CO,5CO,11CO) in N. America (Figure 3a), neither the proportion of private alleles nor nucleotide diversity (π) declined along range expansion fronts (Figure 4). Comparing private alleles can provide a more sensitive detection of bottlenecks because they can capture low-frequency alleles from the native range that were not sampled in the introduced populations, while nucleotide diversity (π) also reflects allele frequencies more influenced by higher-MAF loci (Luikart et al., 1998). However, to determine whether this establishment and rapid range expansion reflects a "true paradox" (Estoup et al., 2016), changes in ecologically relevant quantitative variation and alternative mechanisms should be explored. For example, the weak signal of asymmetrical range expansion ( Figure   S11) and lack of significant IBD ( Figure 5) suggest that, at least at the time of sampling, population expansion was not unidirectional or smoothly distributed across the landscape, possibly reflecting anthropogenic-mediated dispersal and admixture between ecotypes ( Figure 3a). Along the expansion front, the potential effects of a genetic bottleneck could have been reduced by negative densitydependent dispersal preserving genetic diversity (Birzu et al., 2019).
If large census and effective population sizes are maintained by this mechanism, it could facilitate rapid local adaptation in novel environments and evolution along expansion fronts by assortative mating for aggregating co-dispersers (Burton et al., 2010).
Our range expansion results suggested a contact point for the two introduction sites in Nebraska, far outside the range of geographic possibilities, likely due to the Fukang D. carinulata observed in Kansas (6KS) and Texas (18TX; Figure 3b). Similarly, several of our sampled sites in the range expansion analysis were not connected by natural dispersal even though they were most genetically similar. To better understand range expansion in this biocontrol system and others like it, a different approach that incorporates anthropogenic movement, spatial distribution of habitat, and direct estimates of dispersal parameters would likely yield a more informative result regarding the genomic mechanisms, routes, and impacts of range expansion. We could build upon this work with biologically realistic simulations (Haller & Messer, 2019;Landguth et al., 2020) and demographically informed models using approximate Bayesian computing (Estoup & Guillemaud, 2010) to more accurately assess genomic mechanisms and consequences of rapid range expansion.
For example, remote sensing of D. sublineata defoliation and expansion has shown that tamarisk continuity and area width predict dispersal distance along a riparian corridor (Ji et al., 2017), whereas rangeExpansion assumed a continuous habitat and natural dispersal (Peter & Slatkin, 2013). Further work investigating the ongoing range expansion of D. carinulata should also examine possible roles for few loci of large effect (Dlugosch et al., 2015) and plasticity (Bay et al., 2017) operating in the evolution of diapause induction in D. carinulata.

| Eco-evolutionary processes influence the impacts of biocontrol
Evolutionary processes have long been of interest in the field of biocontrol (Simmonds, 1963), and has included efforts to mitigate potential negative effects of losing genetic diversity by augmenting population sizes, and of avoiding ecological mismatch through introducing individuals from deliberately targeted locations in the native range. Despite this long interest, only in recent years are the potential consequences of eco-evolutionary processes on the success of biological control programs being acknowledged and explored (Szűcs et al., 2019). Here, we provide further evidence that more detailed and nuanced information, including genomic data, can help us better understand the eco-evolutionary processes occurring among introduced biocontrol agents. Our work specifically documents population expansion despite a substantial genetic bottleneck, differential establishment and spread among source populations, and differential admixture among those populations.  (Gilman & Behm, 2011;Seehausen et al., 2008). Considering that previous laboratory experiments with hybrids created in the laboratory showed changes in phenotypes related to both fecundity and host preference (Bitume et al., 2017), the abundance of hybridization we observed warrants further study. These laboratory results have not been verified in the field, and host preference was measured to the third generation (Bitume et al., 2017). A recent regional decline in Diorhabda population density, and extirpation from some previously occupied areas, has been noted in Texas, Oklahoma, and Kansas (Knutson et al., 2019). This underscores the possibility that hybrid breakdown could compromise fitness of Diorhabda in the field, an interpretation consistent with the observation that up to 57% of field-collected hybrids displayed abnormal genital sclerites (Knutson et al., 2019). Testing of these traits (e.g., host choice and fecundity) in field-collected, genotyped populations is critical to better understand changes in risk or efficacy of the biocontrol program due to hybridization. Our dataset could be used to develop a panel of markers for more rapid and cost-effective identification of hybrids with targeted sequencing over time to test these hypotheses (e.g., RAD-capture, GTseq; Meek & Larson, 2019;Reid et al., 2020).
Our population genomics approach presents a much-needed tool to monitor biocontrol releases of multiple populations and cryptic species, highlighted by a notable discrepancy between morphological analyses (Knutson et al., 2019). The laboratory crosses found by Bitume et al. (2017) to be the most fecund relative to parental types, D. carinata × D. sublineata, were the hybrid pairs we found to be most abundant and widely distributed here, but they were not previously described in the morphological analysis of hybridization in this region. One possible explanation is that the sampling design of Knutson et al. (2019) did not include sites farther north into New Mexico, where the bulk of our D. carinata × D. sublineata hybrids was found. However, we also found some of these hybrids in Texas, in close proximity to the locations sampled by Knutson et al. (2019).
These results highlight the value of population genomics to monitor hybridization in cryptic species. Our dataset could be used to improve the accuracy of morphological hybrid identification by validating morphological markers (Griffin et al., 2020;Padial et al., 2010).
Our inferences regarding the mechanisms and impacts of range expansion and hybridization are currently limited because Diorhabda collections were sometimes transported without detailed documentation regarding population sizes or population sources. Therefore, we cannot determine based on these data alone whether any hybrid genotype or ancestry combination is more successful without more complete records. For example, we know that many D. carinulata releases were made in New Mexico (Table S2), but the source, precise release sites, precise release numbers, and establishment rates are largely unknown. In general, the practice of anthropogenic movement, often undocumented within management units, presents an interesting trade-off. On the one hand, this increases the availability and likely efficacy of biocontrol across users, but on the other, it makes it more difficult for biocontrol research to understand the patterns of range expansion and adaptation. For example, with these data, we could perform simulations and compare our results against informed null hypotheses or priors to better calibrate test of range expansion and test for selective introgression (Estoup & Guillemaud, 2010;Hopper et al., 2019;Moest et al., 2020). Evolutionary biologists, biocontrol scientists, and the stakeholders of target invasive species would be well served with improved records and catalogs of genetic material from biocontrol agent source populations, releases, and follow-up monitoring. Currently, there is no entity charged with genomic monitoring of biocontrol releases and these efforts rely on short-term funding and idiosyncratic academic-governmental relationships for each biological system.
The draft assembly of D. carinulata is one of only two currently available reference genomes of biocontrol agents of invasive plants, both in the family Chrysomelidae, subfamily Galerucinae , an inviting opportunity for broader comparative work. The resource we developed here is unique in that it can be used for four intentionally released biocontrol agents (the four Diorhabda spp.), compared with Ophraella communa, which was not intentionally released and is spreading adventively (Müller-Schärer et al., 2014). Nonetheless, only two genomes for any biocontrol agents of invasive plants represent a substantial missed opportunity considering that there are over 332 established invasive plant biocontrol agent species worldwide (Schwarzlander et al., 2018). Further development of the D. carinulata reference genome, including annotation, would greatly improve our ability to identify SNPs, structural variants, and genes associated with traits related to efficacy and safety as a biocontrol agent, for example host choice and diapause induction. Still another opportunity is presented by coupling genomic studies of biocontrol agents with genomic resources from the invasive pests (Lee et al., 2018) to examine co-evolutionary interactions .
The application of genomic approaches in biocontrol systems has the potential to improve both our understanding of contemporary evolutionary processes and management of invasive species and conservation (Leung et al., 2020;Muller-Scharer et al., 2020;Roderick & Navajas, 2003;Sethuraman et al., 2020;Szűcs et al., 2019). Specifically, our results provide a baseline time point upon which we can build to further disentangle the mechanisms of rapid range expansion and consequences of hybridization in Diorhabda. More generally, we highlight possible predictors of human-mediated translocations and range expansion outcomes, including native range species boundaries and dispersal behavior, and show how genomic tools in a biocontrol system can test these predictions.

ACK N OWLED G EM ENTS
The 10X

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
Code, tabular results, and supplemental tables are hosted at https:// github.com/Astah lke/Diorh abaPo pulat ionSt ructure and will also be posted on Dryad. The reference genome will be available on NCBI under accession JAJJBS000000000 following processing.