Go west: Population genomics reveals unexpected population fluctuations and little gene flow in Western hemisphere populations of the predatory lady beetle, Hippodamia convergens

Abstract Hippodamia convergens—the convergent lady beetle, has been used extensively in augmentative biological control of aphids, thrips, and whiteflies across its native range in North America, and was introduced into South America in the 1950s. Overwintering H. convergens populations from its native western range in the United States are commercially collected and released across its current range in the eastern USA, with little knowledge of the effectiveness of its augmentative biological control. Here we use a novel ddRADseq‐based SNP/haplotype discovery approach to estimate its range‐wide population diversity, differentiation, and recent evolutionary history. Our results indicate (1) significant population differentiation among eastern USA, western USA, and South American populations of H. convergens, with (2) little to no detectable recent admixture between them, despite repeated population augmentation, and (3) continued recent population size expansion across its range. These results contradict previous findings using microsatellite markers. In light of these new findings, the implications for the effectiveness of augmentative biological control using H. convergens are discussed. Additionally, because quantifying the non‐target effects of augmentative biological control is a difficult problem in migratory beetles, our results could serve as a cornerstone in improving and predicting the efficacy of future releases of H. convergens across its range.


| INTRODUC TI ON
Large-scale releases of translocated or captive-raised organisms have been commonly used worldwide throughout the last century, including for conservation purposes (e.g., reintroduction or relief of inbreeding depression), commercial enterprises (e.g., logging, aquarium pet trade), sporting activities (e.g., hunting and fishing), and for biological control (Laikre et al., 2010).Despite the pervasiveness of these mass releases, non-target effects of translocations of native species are largely ignored in policy and research.While population augmentations can be demographically and economically beneficial, they can also adversely impact intraspecific genetic variation by causing the breakdown of local adaptations, loss of genetic variability, and the reshaping of population structure (Laikre et al., 2010;Simberloff & Stiling, 1996;Terui et al., 2023;Wajnberg et al., 2001).
Genetic data are important sources of information for gaining insight into historical and recent evolutionary history of species that have had their populations established or augmented by large-scale releases of individuals, as occurs during biological control (Jones et al., 2023;Kajita et al., 2012;Li et al., 2019Li et al., , 2020Li et al., , 2021Li et al., , 2023;;Sentis et al., 2022;Sethuraman et al., 2020;Wang et al., 2023).
Population genomics, in particular, provides a powerful approach toward understanding how genetic variation has been impacted on a recent time scale (Maigret et al., 2020;Nunziata et al., 2017), including the time frame encompassing human-mediated movements and augmentations of populations during biological control.These insights can be important for at least two primary reasons.First, the pathway to successful pest suppression for many biocontrol introductions is often unknown.Second, by characterizing the degree to which population structure has changed across the native range of a species over long-term time scales, and by estimating the impact of gene flow and population admixture between native and translocated populations (e.g., Li et al., 2020), the context for interpreting the non-target impacts of introduction efforts becomes available.
For example, loss of genetic variation through biocontrol efforts could limit the potential for populations to adapt to future conditions, ultimately decreasing long-term fitness of native and released populations (Frankham, 2005;Rhymer & Simberloff, 1996;Roderick & Navajas, 2003).Alternately, human-mediated releases may act to homogenize populations via admixture, and thereby reduce the potentially deleterious effects of localized pervasive inbreeding thus reinforcing the need to track the effects of gene flow and admixture due to human-mediated releases of species.
There is a long history of using insect predators for biological control, with billions of individuals being released over the past 100 years (Van Lenteren et al., 2003).Natural enemies are released to control pest populations, with or without the goal of permanent establishment of populations (Hajek & Eilenberg, 2018;Heimpel & Mills, 2017).Predatory coccinellids are widely used in biological control, with numerous native and exotic lady beetle species sold commercially and distributed throughout North America (Obrycki & Kring, 1998;White & Johnson, 2010).Population genetic studies have been useful in reconstructing population history of introduced lady beetle species that have successfully established populations in their introduced range (Lombaert et al., 2010;Kajita et al., 2012;Sethuraman et al., 2018).Less studied, but equally important is characterizing the population dynamics of a native species of lady beetle used in biological control (but see Sethuraman et al., 2015Sethuraman et al., , 2018)).
Convergent lady beetles, H. convergens, are one of the most abundant and widespread Hippodamia species in North and Central America, ranging from southern Canada and throughout the USA, and into central America (Gordon, 1985, Global Biodiversity Information Facility [www. GBIF. org], Lost Lady Beetle website [www.lostl adybug.org]).This species has been established in its non-native range in South America, including in Peru and Chile, where millions of individuals were introduced over multiple years (Hagen et al., 1976, Global Biodiversity Information Facility [www. GBIF. org]).First used as biological control agents in the early 1900s (Carnes, 1912), H. convergens was one of the most commonly used natural predators for control of aphids and whiteflies in the USA (Van Lenteren et al., 2003).During the 1900s and early 2000s, millions of adult individuals were collected by commercial insectaries annually from overwintering sites in California and sold for release (Dreistadt & Flint, 1996;Obrycki & Kring, 1998) throughout the USA.While H. convergens adults can still be purchased online from a number of sources, recent guidelines developed for using commercially available insect predators have not included H. convergens as a recommended predator for augmentative releases (LeBeck & Leepla, 2021).Additionally, selected insectaries, e.g., Arbico-Organic, are no longer selling H. convergens due to the effects of adverse climatic factors on populations and over-harvesting of overwintering aggregations (https:// www.arbic o-organ ics.com/ categ ory/ ladybugs).A recent study employing seven microsatellite loci found California genotypes admixed into eastern U.S. populations with weak population structure across the range, a potential non-target effect of augmentation (Sethuraman et al., 2015).Alternatively, this pattern could be the result of natural movements.Hippodamia convergens is capable of long-distance migrations (Rankin & Rankin, 1980), often dispersing multiple miles (Hagen, 1962).
In this study, we expand upon previous research by using genomescale data sampled across a larger geographic range of H. convergens.
Using large numbers of single-nucleotide polymorphisms (SNPs), we can refine our understanding of population structure and the genetic effects of releasing large numbers of western U.S. individuals across the larger range of the species.Further, SNP-based demographic inference allows for the testing of complex evolutionary histories, including timing and origins of population introductions, asymmetric levels of migration, and population size history (Fraser et al., 2015;McCoy et al., 2014).
Here, we used genome-wide SNP markers to examine genetic  S1).Locality information was not available for the commercially purchased individuals; however, we assume they were collected from geographical locations in Arizona and California and represent native populations from these areas.For each sampling location we collected 1-12 individuals; however, two sampling locations were geographically close and pooled for analyses (Table S1).
We performed double digest restriction-site associated DNA (ddRAD) sequencing using the protocol outlined in Peterson et al. (2012).Briefly, we extracted DNA from whole individuals (after gut removal) using a Qiagen DNEasy tissue kit (Qiagen, Valencia, CA, USA).We digested 1000 ng of DNA with the restriction enzymes EcoRI and NlaIII.Custom inline barcodes and P1/P2 adapters were ligated onto the digested DNA fragments and uniquely barcoded samples were pooled and size-selected between 473 and 579 bp using a Pippin Prep (Sage Science).Size-selected DNA was then amplified using High-Fidelity DNA Polymerase (Bio-Rad) with 12 cycles.
During PCR, a unique Illumina index sequence and Illumina sequencing primers were incorporated on ends of samples, so that each individual had a unique combination of inline and index sequences.All samples were then pooled and sequenced on a single lane of Illumina HiSeq 2500 at Florida State University's College of Medicine using 150 bp paired-end reads.To increase sequence diversity in the initial portions of sequence reads, the library was sequenced using an additional 1% spike in of PhiX control library.
We used standard Illumina chastity filtering of our sequence data and performed downstream processing using STACKS v2.0Beta6 (Catchen et al., 2011(Catchen et al., , 2013)).The process_radtags script was used to perform demultiplexing and quality-filtering.The denovo_map.pl script was used with default settings to identify unique RAD loci and call SNPs.The populations script was used to generate population datasets with loci that occurred in at least half of the sampling sites, in at least 50% of individuals at each site (r = 0.5), and with a minimum coverage of 10 reads (m = 10).After an initial run of the STACKS pipeline we excluded 8 individuals due to excessive missing data (>50% missing genotypes across loci; Table S1), bringing the total number of populations sampled down to 17.Further filtering was performed using vcftools v0.1.16(Danecek et al., 2011) to retain bi-allelic SNPs genotyped across at least 50% of all individuals.Data was filtered to exclude SNPs with minor allele frequency (MAF) less than 0.05, which have been shown to create biases in quantifying genetic connectivity (Benestan et al., 2015;Roesti et al., 2012).To avoid the inclusion of linked sites, only a single SNP was sampled from a locus.Sequence (haplotypic) data were also generated for analyses of evolutionary history (see description below).Additional de novo assembly and filtering based on more conservative filters (PHRED Q score of ≥33, maximum of 5 low quality bases per read, minor allele frequency of ≥0.05, maximum of 8 indels, minimum of 4 samples per locus, maximum missing data of 25% per locus) using iPyRAD (Eaton & Overcast, 2020) was also performed to obtain a smaller subset for population structure analyses.We also ascertained that there were no highly differentiated loci across all populations, as determined by estimation of outliers in Weir and Cockerham's Fst using OutFLANK (Whitlock & Lotterhos, 2015).Briefly, we computed Fst across all loci, between all sampled populations, and used the OutFLANK function in R to estimate deviations from expected Fst distribution under neutrality.Outlier loci were then ascertained at a p-value cutoff of 0.05 after FDR correction.
To generate sequence (haplotypic) datasets, we concatenated reads one and two and used the entire 295-bp sequences in these analyses.Loci were filtered to include only unlinked loci, but freely recombining within loci, represented in 90% of sampled individuals, resulting in a 165 locus dataset.homozygosity (H obs ), nucleotide diversity (π), number of private loci, and the mean inbreeding coefficient per population (F IS ).Pairwise F ST values were calculated for all populations in Arlequin with 10,000 permutations for each to determine significance (p < 0.05).

| Population genomic statistics
To visualize the F ST matrix we created a UPGMA dendrogram and heatmap using the function hclust in the R package ggdendro (de Vries & Ripley, 2016).

| Population structure
We used two different approaches to infer range-wide population structure and admixture of individuals.First, we used discriminant analysis of principal components (DAPC) in the R package adegenet (Jombart & Ahmed, 2011) to partition genetic variation into clusters by maximizing variance between clusters and minimizing variance within clusters.We explored a range of cluster (K) numbers from 1 to 16 and used the estimated BIC from each model to identify the optimal K. To identify the optimal number of principal components retained in each analysis, we used cross-validation (xvalDapc with 10 retained).Scatterplots and barplots were used to visualize individual assignment to clusters.
Second, we used a model-based approach in the program ADMIXTURE v1.3 (Alexander et al., 2009), which uses a maximumlikelihood approach to assign individuals to genetic clusters.Five replicate analyses were performed for K = 1-16, with the optimal K chosen as having the lowest cross-validation error across replicates (Alexander & Lange, 2011).We also evaluated levels of K above the identified optimum, as these may also be useful in explaining the data (Meirmans, 2015).We used the web server CLUMPAK to generate assignment plots across multiple values of K (Kopelman et al., 2015).

| Analysis of molecular variance
Using the optimal clustering suggested by DAPC and Admixture analyses, we also performed an AMOVA with 1000 permutations, to estimate the degree of population structure and the covariance structure (1) within individuals in each sampled population, (2) within populations in estimated clusters, and (3) among clusters.

Pairwise differentiation and its significance (computed as Weir and
Cockerham's Fst) between clusters was also estimated using 100 permutations.

| Contemporary gene flow estimation
To estimate contemporary gene flow between clusters identified by ADMIXTURE and DAPC (which also conform to their sampled geographical regions), we performed a Bayesian Markov Chain Monte Carlo (MCMC) analysis using BayesAss3-SNPs (Mussmann et al., 2019), which implements an extension of BayesAss v.3.0 (Wilson & Rannala, 2003) for large SNP datasets.We used 10 million iterations, discarding 1 million as burn-in, and sampled every 1000 iterations for estimates of posterior densities of migration estimates.
Sufficient mixing of chains and convergence of MCMC were determined using TRACER v1.7.1 (Rambaut et al., 2018).Modes of posterior density estimates were obtained and compared across three runs (with different random number seeds).

| Ancestral demographic model testing and gene flow estimation
Building on the resolution of genetic clusters, we used two approaches to infer population history and test models of gene flow across the range of H. convergens.First, we performed demographic model testing using a composite-likelihood modeling approach in fastsimcoal2 v2.5.2.21 (fsc2; Excoffier et al., 2013), which uses the site frequency spectrum (SFS) to infer demographic parameters.To construct the folded joint SFS for each population pair we used the dadi.Spectrum.from_data_dictcommand in dadi v1.6.3 (Gutenkunst et al., 2009).STACKS data filtering required an SNP to occur in ≥50% of individuals from each population, so the SFS was then downprojected to 50% to ensure the same number of gene copies were sampled for all loci.SFS entries with less than 10 SNPs were pooled (-C10 option in fsc2).Invariable sites were not included in the SFS, so to scale estimation of other parameters from the SFS we fixed the ancestral population size to be 714,000 as calculated directly from our data (Excoffier et al., 2013;Lanier & Knowles, 2015; Supporting information for details).
Population divergence models were constructed using the three population clusters identified in DAPC and Admixture analyses.A total of nine different models were generated covering various histories of population divergence, with and without migration, and with population sizes allowed to vary over time (Figure S1).In addition to these three population models, we also tested pairwise population models, which reduced the number of estimated parameters and accounted for working with relatively small numbers of loci.Parameters estimated from models included the effective population size (N e ) of each population, the divergence time for each splitting event (T DIV ), and migration rates (m).We assumed a generation time of 1 year, which is typical for H. convergens except when environmental conditions and prey availability allow for a second generation, when converting estimates to years (Hagen, 1962;Hodek, 2012;Nedved & Honek, 2012).Defined parameter ranges were uniformly distributed with N e ranging from 10 to 10,000 and T DIV from 10 to 10,000.A minimum of 100,000 simulations were performed to estimate the SFS, with a minimum and maximum of 10 and 100 loops (ECM cycles).The stopping criterion, defined as the minimum relative difference in parameters between two iterations, was set to 0.001.A total of 100 replicate runs were performed per model and the overall maximum likelihood (ML) run was retained for parameter estimation.The relative likelihood across compared models was generated using Akaike Information Criteria (AIC) as outlined in Excoffier et al. (2013).Confidence intervals for parameters in the best-supported model were obtained with a parametric bootstrap approach by simulating 100 SFS with the same number of SNPs from the ML point estimates.
Second, to estimate demographic parameters (effective population sizes, divergence times, and migration rates) under the most supported population tree from runs of fastsimcoal2, and to test scenarios of augmentation and gene flow between sampled populations of H. convergens, we utilized the parallel MCMC method implemented in IMa2p (Hey & Nielsen, 2007;Sethuraman & Hey, 2016).IMa2p analyses use sequence data as input instead of SNP genotypes, so to perform these analyses we used a data set comprising 165 unlinked, and freely recombining-within loci (145 bases each).Assuming a population tree based on the sequence of branching events in the best-fitting model identified in fsc2, we performed a long run of the M mode (MCMC) with 150 chains distributed across 15 CPUs on the University of Kentucky's Lipscomb High Performance Computing Cluster.The MCMC was allowed to burn-in for a period of 16 h, and genealogies were sampled thereafter for 32 h after burn-in.Upper limits on prior values of all parameters were set based on the recommendations of Hey (2011) and adjusted (along with the length of runs) based on several short runs to ensure that the chains were mixing, swapping sufficiently, and reached stationarity prior to sampling.Marginal posterior density estimates of all parameters and 95% confidence intervals were computed on demographic scales by using the Drosophila melanogaster mutation rate of 3.5 × 10 −9 per site per generation (Keightley et al., 2009).To test alternative models of gene flow, an additional L (likelihood ratio test) mode run was performed based on computing joint posterior densities by the method of Hey and Nielsen (2007) and Nielsen and Wakeley (2001).
Three alternate demographic scenarios were tested against the fully nested model containing all bi-directional gene flow parameters be-

| ddRAD dataset and summary statistics
We generated a total of ~215 million paired sequence reads, with an average of 2,239,182 ± 1,244,709 reads per individual (Table S2).
After excluding eight individuals with high missing data and filtering data as outlined above, we generated a data set comprising 9824 bi-allelic SNPs (one SNP sampled per locus).After screening for loci with MAF < 0.05, Hardy-Weinberg Equilibrium (p < 0.05), missingness, linkage disequilibrium, multi-allelic sites, we also constructed a dataset with 1175 bi-allelic SNPs across a total of 78 individuals from 17 sampling locales.No sites were identified across the 1175 loci as Fst outliers by OutFLANK, and therefore no further filtering was performed (Figure S11).Generally, population genetic diversity as measured through summary statistics was comparable among all sampling sites (Table 1).All further analyses of genetic structure and population differentiation were performed with (a) the untrimmed dataset with 9824 SNPs, (b) with the filtered dataset with 1175 SNPs, and (c) with the filtered dataset with 1175 SNPs, but removing individuals that were sampled from the 1990s (to prevent temporal bias) to ensure concordance.

| Genetic structure and population differentiation
DAPC analysis assigned individuals to three genetic clusters (Figure 2a) corresponding to (1) Arizona + California sites (hereafter referred to as the "western" cluster), ( 2) sites from the central and eastern U.S. (east of the Rocky Mountains, hereafter referred to as the "eastern" cluster), and (3) a cluster comprising the Chilean site.The two commercially ordered populations from Arizona and California clustered with western populations.Individuals were assigned to their respective cluster with high probability.These geographic genetic clusters were genetically distinct from each other, with no overlap in ordination space.
ADMIXTURE analyses produced similar results (Figure 2c).Pairwise F ST values were not significantly different from zero for comparisons among sites within the eastern and western clusters (Figure S2; Table S3).However, F ST values were significant for comparisons among sites between the three identified genetic clusters (F ST = 0.13-0.49,p < 0.05), with the Chilean population identified as the most differentiated from Southern California (Riverside County).

| Ancestral demographic inference
Model testing in fsc2 favored a topology (model 2) with an early split between the eastern and western + Chile populations, followed by  a subsequent split between the western and Chile population, and with the full set of migration parameters between both ancestral and contemporary populations (Figure 3; Tables 2 and 3).Model 2 was slightly favored over a full migration model that identified a more recent common ancestor between the eastern and western populations, with a more divergent Chilean population (Model 3; ΔAIC = 0.78).There was little support for a full migration model that identified a more recent common ancestor between the eastern and Chilean populations (Model 1; ΔAIC = 88.69).All evaluated models with a more restricted set of migration parameters (or no migration) received little support (ΔAIC > 141).This included models consistent with biocontrol efforts, in which gene flow moved unidirectionally from western populations into both eastern and Chilean populations.
Bayesian MCMC-based inference of demographic history under the most supported population tree from fastsimcoal2, using an isolation with migration model with IMa2p (Sethuraman et al., 2015) estimated large bidirectional historical gene flow between eastern and western populations, and eastern and Chilean populations.No significant migration was estimated in either direction between the western and Chilean populations.Estimates of effective population sizes (population mutation rates) were concordant with fastsimcoal2, with the eastern and western populations estimated to be several orders larger than the Chilean population.Common ancestral populations were also estimated to be relatively small, indicative of a recent (post-divergence) population size increase in both eastern and western populations.In short, there is significant bidirectional historical migration between the eastern and western populations, and the eastern and Chilean populations (Table 4).Demographic estimates of effective population sizes, divergence times, and scaled migration rates were comparable between the results of fsc2, and IMa2p, with predominantly overlapping confidence intervals (Table 3), with relatively large eastern and western populations, compared to Chile.

| Contemporary gene flow
Estimates of contemporary gene flow using BayesAss3-SNPs indicated that the majority of gene flow occurred within identified clusters (Table 5), with a negligible fraction of migrants within each cluster classified as migrants.All three clusters had inbreeding coefficients of ≥0.15, indicating the presence of non-random mating within clusters, but little to no contemporary gene flow across clusters.

| DISCUSS ION
The goal of this study was to investigate the impact of large-scale augmentative releases of H. convergens on its population structure and demographic history.Interestingly, our results indicate that releases with corresponding high gene flow from western into eastern populations have had little impact on population genetics of the species to date, with strong population structuring across the range and little signal of admixture of western genotypes into eastern populations.We discuss these findings in light of biological control and their implications for range wide population structure.

| Genetic structure in US
Our results suggest historically strong genetic divergence between eastern U.S., western U.S., and Chilean populations of H. convergens.Estimates of contemporary gene flow indicated little to no continuing gene flow between our identified clusters.Meanwhile, demographic modeling of ancestral processes revealed population expansions of eastern and western populations with high gene flow from west into east, and a small, isolated Chilean population.
We used two demographic inference approaches to explore ancestral population history (fastsimcoal2 and IMa2p), with comparable parameter estimates generated between models.However, uncertainty is associated with absolute values of parameter estimates from both programs which were scaled with a mutation rate from Drosophila melanogaster, and should be interpreted with caution.Instead, we focus on general patterns, which are corresponding between programs.Additionally, we acknowledge that the sampling design has a disproportionate temporal distribution,  Both IMa2p and fastsimcoal2 analyses indicate asymmetrical gene flow between all population pairs.For fastsimcoal2 confidence intervals for migration were wide and often encompass zero, indicating low resolution of these parameters.We suspect that identification of migration may reflect low power to distinguish between models of isolation using a low number of SNPs, as simulation studies have found for IMa2 (Carstens et al., 2009;Hey et al., 2015).Population expansion over time is apparent in all analyses.These results are in contrast to an earlier microsatellite study, which detected admixture of western genotypes into eastern populations (Sethuraman et al., 2015).We speculate that the lack of west-east admixture documented in this current study may be related to the timing of releases of western populations of H. convergens for biological control when densities of local H. convergens are relatively low.Additionally, the longevity of these released beetles might be relatively short due to their storage at low temperatures and a percentage of the western H. convergens females may have already mated at their aggregation sites prior to mass collection, storage at insectaries and release.These results are especially interesting, considering that H. convergens is highly mobile, having been found to travel hundreds of miles (Hagen, 1962).Formation of overwintering aggregations of adults occurs in late summer, with as many as 10 6 beetles converging on  184,264 (3,485-273,352) 73,964 (39,447-108,481) 2,524,867 (465,210-3,157,104) 3,382,642 (2,593,688-3,372,781) 1,653,579 (278,416-2,234,104) 2,440,828 (2,115,384-2,948,718) 76,408 (12,024-100,239) 32,544 (24,654-42,406) Nm

| Chilean population founding
We found the Chilean population has a much lower effective population size than eastern and western U.S. populations.This supports the hypothesis that the Chile population was recently introduced using commercially harvested populations from the western USA (Hagen et al., 1976).However, our estimates of re-

| Impact of agriculture/Bio-control
Results from both IMa2p and fastsimcoal2 suggest that effective population size has increased over time across all structured populations of the species and that the contemporary effective population sizes are large.This is contrasting to the previous microsatellite study finding population declines (Sethuraman et al., 2018), and speculation that native population of H. convergens has decreased after initialization of European agricultural processes, or in response to competition from invasive non-native predatory coccinellids, including the considerably larger Eurasian seven-spotted beetle, Coccinella septempunctata, and the Asian harlequin lady beetle, Harmonia axyridis.However, it is important to note that the large effective sizes that characterize both eastern and western U.S. populations limit our ability to detect recent changes in population size.It is therefore possible that despite our results, populations could be currently in decline.The data collected here may therefore serve as a valuable baseline for genetic monitoring into the future (Schwartz et al., 2007).

| Importance for biological control
An appreciation of the evolutionary history of species that are manipulated by humans may be important for our understanding of the benefits, limitations, and consequences of our activities.Following the large-scale changes in North American agroecosystems during the past 500 years, the native natural enemy fauna (e.g., arthropod predators) colonized these new agricultural systems to find prey.Thus, many species of North American insect predators have adapted to introduced agricultural crops from Europe and Asia, but our understanding of the genetic bases of these adaptations is limited.
Augmentative releases of H. convergens represent a unique example of biological control and allow for the examination of several potential ecological and genetic nontarget effects of these releases on other predatory species and local populations of H. convergens (Saito & Bjornson, 2008, 2013).Many augmentative releases of H. convergens have appropriately targeted pest species in agricultural systems in California and Arizona (Dreistadt & Flint, 1996;Flint & Dreistadt, 2005;Hagen, 1962;Hagler, 2009).But, for at least the past 75 years, field collected H. convergens from the western USA have been sold and released east of the Rocky Mountains.
Our analysis of SNP variation of North American H. convergens indicates that historical events (e.g., glaciation, introduction of western European agriculture) and not recent human movements of western U.S. populations of H. convergens explain the current distribution and population structure of this predatory insect species.
Thus, the current distribution and population structure of H. convergens in North America can be explained from historical events, our analyses showed that annual augmentative releases are not affecting local populations.Our previous studies documented plasticity in the response of eastern and western populations of H. convergens to aphid prey with no hybrid vigor or heterozygote advantages in hybrids (Grenier et al., 2021).Based on our results, we tentatively propose that augmentative releases of western U.S. populations of H. convergens do not have adverse non-target genetic effects on eastern U.S. populations.However, the potential long-term ecological non-target effects of distributing pathogens and parasitoids within adult H. convergens from western collection sites on eastern populations of ladybeetle populations remains to be determined (Bjornson, 2008;Sato & Bjornson, 2008).All estimates were performed using BA3-SNPS (Mussmann et al., 2019), which extends the method of BayesAss v.3.0 (Wilson & Rannala, 2003).

Eastern
diversity and population genetic structure of H. convergens across its native range in the United States, and an introduced population in Chile.The aims of this study were to (1) examine the population structure of H. convergens across its range, (2) compare levels of genetic diversity, and (3) examine the population history including human-mediated and historic gene flow, population size history, and timing of population founding.Based on previous studies and history of population augmentation, we hypothesized: (1) admixture of western U.S. genotypes into eastern U.S. populations, (2) increases in contemporary gene flow, consistent with augmentation, and (3) a founder effect with corresponding low genetic diversity of the Chilean population. 2 | ME THODS 2.1 | Sampling, ddRAD Sequencing, and SNP genotyping We collected genome-wide SNP data from 83 H. convergens individuals sampled from 16 sites broadly distributed throughout their known US range, including 18 individuals from two commercial suppliers in Arizona and California (Arbico and Rincon-Vitova, respectively), and 8 individuals sampled from one site in Chile representing a putative recently established population in South America (Figure 1; Table Summary statistics were generated using the populations script in STACKS and with the program Arlequin v3.5.2.2 (Excoffier & Lischer, 2010).For each population we calculated observed F I G U R E 1 Geographical map of Hippodamia convergens sampling locations utilized in this study.Numbers in parentheses indicate the total number of sequenced samples included in downstream analyses from each population.
tween ancestral and contemporary populations: (1) a model of no immigration from eastern U.S. into western U.S. populations, (2) a model of no migration from Chile into the western U.S. populations, and (3) neither migration from Chile or eastern populations into the western populations.
Cross-validation indicated K = 2 (c.v.error = 0.4920) as the optimal model with an eastern cluster and western + Chilean cluster.A K = 3 (c.v.error = 0.4922) model split the western and Chilean populations into separate clusters.Models of K ≥ 4 (c.v.error > 0.5214) produced new clusters that primarily created admixed individuals within the eastern and western populations with no further geographic partitioning, indicating that a K = 3 best represents the uppermost level of hierarchical geographic genetic structure across our study system.ADMIXTURE and DAPC analyses with the untrimmed dataset of 9824 SNPs and with 1175 SNPs with sampling locales from 1990s removed also indicated support for the presence of population structure at K = 2 or K = 3 (Figures S5-S10), congruent with the above findings.AMOVA analyses estimated that a majority of variation (59.96%,SS = 9255.21,df = 139) in the SNP data was explained by variability among individuals within all sampling sites.38.51% of variation (SS = 4093.56,df = 2) was captured by variability among eastern, western, and Chilean clusters, while variation among populations within each cluster only contributed 1.53% of the total variation (SS = 1126.922,df = 14).

F
Best fitting topology and evolutionary model from the fsc2 analyses, showing an early split of the eastern population of Hippodam convergens from the common ancestor of the western and Chilean populations, with some degree of continued asymmetric gene flow between them after divergence.

TA B L E 2
Relative likelihood of the assessed models of population history for Hippodamia convergens.Full descriptions of each model are provided in FigureS1.comprising samples from 1992 to 2015-with univoltine clutches, this could very well mean at least 23 recent generations between sampled individuals.Considering a recent study by Sethuraman et al. (2018) that estimated a ~12-fold decline in effective population sizes in the recent (4-5 generations) past, the strong patterns of differentiation between eastern and western populations could potentially be additionally driven by the effects of drift in small, declining populations, apart from the apparent lack of gene flow and admixture.
cent (contemporary) gene flow using BayesAss indicate very little degree of continuing gene flow between the Chilean and mainland USA populations.These observations were also complemented by summary statistics (F ST ) and population structuring programs (DAPC, ADMIXTURE), which detected strong population structure with little to no admixture between eastern and western populations.This suggests that commercially released western collected individuals may not be successfully breeding and establishing gene flow in the East.Isolation of the Chilean population was evident, with many pairwise FST values >0.40 (p < 0.05).This strong genetic population differentiation may be largely due to drift within the Chilean population, post-founding.
This work was supported by NSF ABI 1564659, NSF CAREER 2042516 to AS.This work was funded by the National Institute of Food and Agriculture, U.S. Department of Agriculture, Hatch TA B L E 5 Mean contemporary migration rate estimates (fraction of individuals that are migrants from the cluster on the left to the cluster on the top), and standard deviations around the mean shown in parentheses between the three identified clusters of Hippodamia convergens (eastern, western, and Chilean).
Summary statistics as calculated in the populations script in STACKS and adegenet with 17 populations and 9824 SNPs.N = average number of individuals genotyped at each locus, private = number of variable sites unique to each population, H obs = observed homozygosity.
Parameter estimates in fsc2 were obtained as maximum likelihood estimates with CIs obtained by parametric bootstrapping.Parameter estimates in IMa2p were obtained as marginal posterior density estimates with CIs estimated as 95% highest posterior density intervals.
a Note that the Ancestral Ne was fixed in fsc2 analyses.TA B L E 4Likelihood ratio tests of different demographic models of H. convergens evolution, indicating support for "model 2" with significant bidirectional asymmetric historical migration rates.