Genomic signatures of admixture and selection are shared among populations of Zaprionus indianus across the western hemisphere

Abstract Introduced species have become an increasingly common component of biological communities around the world. A central goal in invasion biology is therefore to identify the demographic and evolutionary factors that underlie successful introductions. Here we use whole genome sequences, collected from populations in the native and introduced range of the African fig fly, Zaprionus indianus, to quantify genetic relationships among them, identify potential sources of the introductions, and test for selection at different spatial scales. We find that geographically widespread populations in the western hemisphere are genetically more similar to each other than to lineages sampled across Africa, and that these populations share a mixture of alleles derived from differentiated African lineages. Using patterns of allele‐sharing and demographic modelling we show that Z. indinaus have undergone a single expansion across the western hemisphere with admixture between African lineages predating this expansion. We also find support for selection that is shared across populations in the western hemisphere, and in some cases, with a subset of African populations. This suggests either that parallel selection has acted across a large part of Z. indianus's introduced range; or, more parsimoniously, that Z. indianus has experienced selection early on during (or prior‐to) its expansion into the western hemisphere. We suggest that the range expansion of Z. indianus has been facilitated by admixture and selection, and that management of this invasion could focus on minimizing future admixture by controlling the movement of individuals within this region rather than between the western and eastern hemisphere.

range(s). For example, they may compete with or prey upon native species (Brown et al., 2002;Wanless et al., 2007). Introduced species can also pose economic threats, act as crop pests, and/or vector disease (Pimentel et al., 2005). Understanding the demographic and evolutionary processes that underlie successful biological introductions is therefore an important challenge in modern evolutionary and conservation biology.
Studies of introduced species have identified a number of demographic or evolutionary processes that may contribute to successful introductions (Allendorf & Lundquist, 2003;Whitney & Gabler, 2008). For example, multiple colonization events and hybridization (and subsequent admixture) between differentiated lineages (i.e., populations or species) have both been identified as processes that contribute to genetic diversity within introduced populations (Facon et al., 2008a). Sources of genetic variation within introduced populations have been of broad interest because genetic variation is likely to be important for populations to respond to novel biotic and abiotic selective pressures that they experience in their introduced range. However, multiple colonization events and admixture can also generate novel genotypes that show low fitness when divergent alleles interact negatively, such as when individuals display outbreeding depression (Chapman et al., 2009;Frankham et al., 2011). Processes that increase genetic diversity within introduced populations and the fitness consequences of that genetic diversity need to be understood in order to fully appreciate adaptation (or maladaptation) in introduced species.
Research using phenotypic data has shown that species can rapidly adapt to novel environments they experience in their introduced range (Colautti & Barrett, 2013;Lee, 2002;Prentis et al., 2008). However, it is also possible that adaptations within introduced populations occur before their introductions, conferring traits that help facilitate the successful colonization of new geographic regions. This scenario of adaptation has been termed "anthropogenically induced adaptation to invade" to describe situations where populations adapt to anthropogenic environments (e.g., cities, farmland, or orchards) encountered in their native range prior to range expansion (Hufbauer et al., 2012). Understanding both the shared colonization history and local adaptation within geographically widespread populations of introduced species is therefore central to our understanding of biological introductions and our ability to devise appropriate strategies for their management.
Understanding the geographic origins of introduced populations, alongside the evolutionary processes that operate upon them (e.g., admixture and local adaptation), is also central to our ability to effectively manage biological introductions. For introductions that are geographically widespread, knowledge of shared versus independent aspects of their colonization and evolution will inform whether a single management strategy can be applied across broad geographic regions or whether each introduced population needs to be treated as a unique case. For example, researchers could monitor the movement of individuals between specific sets of countries to limit the opportunity for admixture or the movement of adaptive alleles among regions.
Genetic data are a powerful tool that can be used to estimate demographic and evolutionary events associated with introduced populations or range expansions (Barker et al., 2017;Bock et al., 2015;Dlugosch & Parker, 2008;Fraimout et al., 2017;Kolbe et al., 2004;Lee, 2002;Olazcuaga et al., 2020). Indeed, genetic data have been used to generate insights into biological invasions that include identifying multiple colonization events and admixture within introduced populations (Barker et al., 2017;Dlugosch & Parker, 2008;Facon et al., 2008a;Gibson et al., 2020;Kolbe et al., 2004;Michaelides et al., 2018;Simon et al., 2020). Genome-scale data can also be used to identify regions of the genome, and candidate genes, with evidence for selection either between native and introduced populations or among populations within species' introduced ranges (Campbell-Staton et al., 2020;Olazcuaga et al., 2020). However, demographic analyses of introductions are typically conducted separately, and using independent data, from studies exploring selection (Johri et al., 2020). Whole-genome sequence data provides a means to jointly explore demographic and selective processes operating within introduced populations, for example, by testing phylogenetic relationships among populations at regions of the genome with evidence of selection compared to putatively neutral regions of the genome. The parallel study of demography and selection in introduced populations has the potential to yield novel insights because it could be used to estimate the geographic or evolutionary origins of alleles that underlie traits that facilitate biological introductions or drive adaptation to novel environments (Calfee et al., 2020). Information on the origins of introductions, from the level of individuals to adaptive alleles, could then be used to inform approaches to mitigate the negative impacts or spread of introductions, in situations where that is the desired outcome (Oduor et al., 2015;Viard et al., 2020).

| 6195
COMEAULT ET AL. Comeault et al., 2020). Current evidence therefore suggests that individuals sampled in northern North America may be migrants that colonize orchards in the relatively warm summer months, and that the range expansion in this region is ongoing. The recent and potentially dynamic nature of Z. indianus's range expansion into North America highlights the need to better understand genetic relationships among populations and whether those populations have locally adapted to environments in their introduced range. Here, we used whole-genome sequences collected from native and introduced populations of Z. indianus to test alternate colonization scenarios, quantify genetic relationships among populations, and explore the geographic (and temporal) dynamics of admixture and selection acting in this species. Our results show that Z. indianus in the western hemisphere are genetically differentiated from Z. indianus found across geographically distant locations in Africa and share a pattern of mixed ancestry across their genomes, suggesting an admixture event that predated (or occurred early during) their introduction into the western hemisphere. We also find a strong signal of selection shared across introduced populations in the western hemisphere, consistent with a scenario of either shared selection across introduced populations, or selection in the ancestor of the lineage that went on to colonize regions in the western hemisphere.

| Population sampling, sequencing, and genotyping
We analysed genome sequences generated from 67 Z. indianus sampled from four locations in their introduced range (Medellín, Colombia, n = 4; Eastern USA, n = 25; Hawaii, USA, n = 4; India, n = 1) and from four locations across their native range in Africa (Zambia: n = 6; Kenya: n = 7; Senegal: n = 14; São Tomé: n = 6; Table   S1). We also analysed one Z. gabonicus and 9 Z. africanus collected from two locations in Africa (São Tome and Kenya). Zaprionus gabonicus and Z. africanus are the two most closely related species to Z. indianus, and each of these species are reproductively isolated from one another (Yassin et al., 2008). A total of 56 of the genomes in our data set were previously analysed to study levels of genetic diversity across different populations and species of Zaprionus (Comeault et al., 2020), and one is from publicly available data used to generate a draft genome assembly for Z. indianus collected in India (Khanna & Mohanty, 2017). Here we add four genome sequences from a population in South America (Medellín, Colombia) and 14 genomes from five locations in the eastern United States (Florida, Pennsylvania, New Jersey, New York, and North Carolina; see Table   S1). All sequences other than the Z. indianus sample from India were generated from single individuals that were either wild-caught and preserved in ethanol or collected as a first-generation offspring of a wild-caught female. Individuals were sequenced to a mean depth of 20-49× using paired-end Illumina reads as described in Comeault et al. (2020). The sequence for Z. indianus from India was generated from two males from a single isofemale line that was collected in Punjab, India (Khanna & Mohanty, 2017).
Raw sequence data was initially parsed and barcodes were removed by the University of North Carolina's high-throughput sequencing facility. We used the bwa mem algorithm (v0.7.15) to map reads, for each individual, to a previously published Z. indianus reference genome generated from an isofemale line established from a female collected in Florida in 2014 (Comeault et al., 2020).
In species with heterogametic sex chromosomes (e.g., XY or ZW sex determination), the smaller effective population size of the sex chromosomes can exacerbate the impact that genetic bottlenecks have on genetic diversity (Belleghem et al., 2018;Pool & Nielsen, 2007). Lower diversity on the X can, in turn, lead to elevated estimates of genetic differentiation along the sex chromosomes (Cruickshank & Hahn, 2014). It is therefore particularly important to consider the genomic location of differentiation in introduced species, as demographic changes (e.g., bottlenecks) are frequently associated with biological introductions (Frankham, 2005;Lee, 2002). To facilitate comparisons between the autosomes and the X chromosome, we used coverage to identify putative-X and putativeautosomal scaffolds in the reference genome used for this study.
We calculated normalized mean sequencing depth (i.e., sequencing depth per scaffold divided by mean sequencing depth across all sequenced sites) across scaffolds for five male and four female Z. indianus, and identified X scaffolds as those in which normalized mean sequencing depth was <0.85 for at least four of the five males and >0.85 for at least three of the four females (for details regarding the choice of threshold see Supporting Information and Supporting Information Data; Comeault et al., 2021). Scaffolds smaller than 100 kb were deemed too small to make reliable assignments and scaffolds with a mean normalized read depth >1.5 in at least seven out of the nine samples (independent of sex) were deemed putative repeat elements and not further assigned. Finally, we aimed to identify scaffolds belonging to the Y, defined as those with a mean normalized coverage of greater than 0.25 in at least four of the five males and <0.25 in at least three of the four females. This approach did not detect any putatively Y scaffolds. Our approach resulted in 94% (137.2 Mb) of the assembly being assigned to autosomes or the X chromosome, with 24.6% (33.7 Mb) being identified as belonging to the X chromosome (7.7 Mb were located on scaffolds that were too small and 0.8 Mb on scaffolds with abnormally high coverage).
Our estimate of X chromosome size is within the range of X chromosomes of other Drosophilid flies (e.g., ~23% to 68% of the genome belong to the X for Drosophila melanogaster, D. busckii, and D. pseudoobscura; searched on https://www.ncbi.nlm.nih.gov/genom e/).
In addition to the genome-wide SNP data set, we genotyped each individual's mitochondrial genome. Because mitochondrial haplotypes are maternally inherited and do not undergo recombination, they can be utilized to estimate the number of colonization events occurring during range expansions associated with biological invasions (Facon et al., 2008b;Kolbe et al., 2004;Michaelides et al., 2018). We mapped raw sequence reads, for each individual, to an assembled mitochondria of Z. indianus (assembled from same isofemale line from Florida used for the reference genome described above; Supporting Information Data) following the same procedure described above for the nuclear genome. Using the processed BAM files, we generated a mitochondrial genome sequence, for each individual, using the -doFasta option in angsd (Korneliussen et al., 2014).
When calling mitochondrial sequences, we filtered reads with map quality <30 and mapping base quality <18 (-minMapQ and -minQ options, respectively).

| Population structure
In 2008 Yassin and colleagues generated mitochondrial haplotype data from CO-I and CO-II genes sequenced from 23 Z. indianus populations sampled from Florida, eastern South America, Africa (including Madagascar), and India, but did not find any evidence of population structure associated with geography (Yassin et al., 2008).
However, we previously identified relatively strong genetic differentiation among Z. indianus populations in their native and introduced ranges using whole-genome data (F ST = 0.14-0.19; Comeault et al., 2020). Here we leverage whole-genome sequence data to further explore relationships among Z. indianus populations. We first conducted a set of analyses to quantify population structure across our samples. Specifically, we carried out principal component analysis was run using the -admix method with default values on genotype likelihoods estimated with angsd using the option "-SNP_pval 1e-6" (GATK method; Korneliussen et al., 2014;McKenna et al., 2010). snphylo was run on hard-filtered SNPs using a linkage disequilibrium (LD) threshold of 0.2 to thin sites and allowing a maximum of 15% of individuals to be missing genotype information at a given site (−l and −M options in snphylo, respectively). astral was run using 7085 gene trees that were randomly sampled from 14,165 gene trees constructed from alignments of phased 500-SNP nonoverlapping genomic windows generated along the 40 largest scaffolds of our assembly (61.2 Mb or ~42% of the reference genome sequence; Supporting Information). Gene trees were constructed using raxml (v8.2.4; (Stamatakis, 2014)) with optimization of substitution rates under the GTR + GAMMA substitution model and 20 runs on distinct starting trees (see Supporting Information and Supporting Information Data for details).
We also estimated phylogenetic relationships among individuals based on mitochondrial genome sequences. We aligned mitochondrial genomes using the mafft aligner (v7.407) and estimated a mitochondrial tree using raxml (v8.2.4; (Stamatakis, 2014)) with optimization of substitution rates under the GTR + GAMMA substitution model and 20 runs on distinct starting trees. Support was estimated for internal branches of this tree using 100 bootstrap replicates.
Finally, to allow comparisons between autosomes and the X chromosome, we calculated nucleotide diversity within nonoverlapping 5 kb genomic windows for each Z. indianus population using vcftools (v0.1.15; (Danecek et al., 2011)) and F ST between populations using angsd (v0.920). When calculating F ST , we first estimated the site frequency spectrum (SFS) and genotype likelihoods, for each population, using Z. africanus as the outgroup, and filtered SNPs using options "-minMapQ 1", "-minQ 20", "-setMinDepth N*6", and "-setMaxDepth N*60" options, where N is the number of individuals sampled from a population. We then estimated all pairwise 2D-SFS using angsd's "realSFS" tool, and F ST in 5 kb genomic windows using angsd's "realSFS fst index" and "realSFS fst stats2" tools (Supporting Information Data). We compared median estimates of nucleotide diversity and genetic differentiation within genomic windows on the X chromosome and autosomes for populations in Z. indianus's native range, in their introduced range, and between their introduced and native ranges.

| Fine-scale relationships across the genome
Results from the analyses of population structure described above clearly identify genetic differentiation among Z. indianus populations at two scales: between populations in the western hemisphere and Africa, and among populations within both native and introduced ranges (see Results). We next explored the relative support for relationships among east African, west African, and introduced (i.e., western hemisphere and India) lineages, across the genome, using "topology weighting by iterative sampling of subtrees" (twisst ;Martin & Belleghem, 2017). This approach allowed us to quantify phylogenetic relationships (or shared ancestry) between a focal introduced population (eastern USA, Colombia, Hawaii, or India) and west and east-African lineages. We estimated topology weights for the three and west African lineages are more closely related to each other than either are to the east African lineage (hereafter referred to as "west African ancestry", with respect to the introduced lineage), the introduced and east African lineages are more closely related to each other than either are to the west African lineage ("east African ancestry"), or the two African lineages are more closely related to each other than to the introduced lineage ("diverged"), respectively.
We estimated relative weights for each of the three topologies populations we included all individuals (Hawaii, Colombia, and India) or subsampled seven individuals (Senegal) or 12 individuals (eastern USA). We ran twisst using the "fixed" method with support for the alternate topologies (i.e., topology weights) estimated from a random sample of 600 subtrees per window. We summarized weights as supporting a given topology if more than 50% of sampled subtrees supported that topology. All other windows were classified as "ambiguous".
Under a scenario where the introduced populations were independently colonized by different populations from the native range, we do not expect patterns of ancestry across the genome to be shared across introduced populations. To test this prediction, we summarized ancestry information for each genomic window, for each introduced population, as the difference between topology weights for topology 1 (shared west African ancestry) and topology 2 (shared east African ancestry). We then calculated correlations in this "ancestry score" between all pairs of focal introduced populations.
In addition to the window-based approach implemented with twisst, we estimated sharing of derived alleles among populations of Z. indianus with treemix. Unlike the three-population window-based approach we used when running twisst, treemix allows us to test for multiple admixture events and across all populations simultaneously.
We first used treemix to infer maximum likelihood ancestry graphs with and without migration among Z. indianus populations using allele frequency estimates from each population we sampled. We specified Z. africanus as the outgroup, grouped SNPs in windows of 500 to account for nonindependence among adjacent SNPs (-k option in treemix), and ran treemix allowing for 0-5 migration events across the tree. . We computed f 4 statistics holding Z. africanus as the outgroup ("A" taxon in the aforementioned tree). Because treemix analyses utilize information from allele frequencies within each population, we did not analyse the Z. indianus sample from India because this sequence was derived from two males collected from a single isofemale line.

| Demographic modelling
We took a demographic modelling approach to explore support for different demographic events occurring during Z. indianus's expansion into the western hemisphere. We first computed SFS for populations sampled in São Tomé, Zambia, and the eastern USA using angsd (v0.920) with options "-domajorminor 1 -gl 1 -domaf 1 -dogeno 3 -doCounts 1 -dopost 2 -doHWE 1 -minHWEpval 0.01 -minMapQ 20 -minQ 20 -doSaf 1". The two African populations were chosen to reflect the two differentiated lineages we sampled in African and the population from the eastern USA was chosen because we had the largest sample size from this population (24 individuals) and this population represents the most recently colonized in the western hemisphere. SFS were polarized using Z. africanus (-anc option in angsd). We then estimated the joint-SFS for the three focal populations using angsd's "realSFS" function.
We inferred demographic parameters and model-fit under 8 different demographic scenarios using the unfolded joint-SFS and fastsimcoal2 v.2.6.0.3 (Excoffier et al., 2013). Three demographic scenarios allowed for a split between the eastern USA population independent fastsimcoal runs, each optimizing parameters for 100 ECM cycles and estimating the expected SFS using 200,000 coalescent simulations (Supporting Information for details). We report AIC and inferred demographic parameter maximum-likelihood point estimates for the three runs that received the lowest AIC under each demographic scenario.

| Local and shared selection experienced during range expansion
Results from the analyses of genetic differentiation introduced above support a scenario where Z. indianus colonized the Americas from the same ancestral population and/or with a shared out-of-Africa demographic history. However, they do not preclude the possibility that different introduced populations are adapting to different local environments using different adaptive genetic variation. We used two approaches to test for shared versus independent selection acting across introduced populations of Z. indianus in the Americas (i.e., those from Colombia, Hawaii, and eastern USA). comparisons). These three levels of comparison allowed us to identify loci that show accentuated differentiation in a given introduced population relative to west-or east-African lineages, among introduced populations, and among native populations, respectively. We estimated PBS within nonoverlapping 5 kb windows using angsd (v0.920) as described above when estimating F ST (section 2.2).
Because this approach was based on windows, we analysed windows on scaffolds identified as putatively belonging to the X chromosome separately from those found along putatively autosomal scaffolds. Genomic windows with a PBS in the top 99% of values for windows found on X scaffolds or along autosomal scaffolds were classified as outliers. Windows that were outliers in more than one of the introduced populations (when compared against west and east African populations) were classified as "shared invasive outlier" windows. These windows have the strongest evidence of experiencing selection early during Z. indianus's expansion into their introduced range or being subject to parallel selection in multiple introduced populations. Genomic windows that experience selection within a single introduced population were identified based on having PBS in the top 99% of values in the analysis that included all three introduced populations (i.e., "within-introduced range" comparison). Finally, genomic windows with a PBS in the top 99% of values in a within-native range comparison were classified as windows experiencing selection in Z. indianus's native range. We tested whether the overlap in the number of outlier windows identified among these different comparisons was greater than expected by chance using randomization tests.
We also used baypass (v2.2; Gautier, 2015) and an independent test of evidence of shared selection across all introduced populations in the western hemisphere. baypass implements a Bayesian hierarchical model (Coop et al., 2010;Gautier, 2015) to estimate loci that show accentuated differentiation among populations and allows the user to specify covariables to test for associations between loci and those covariables. For this analysis, we used geography as a covariable (western hemisphere vs. Africa) and identified SNPs that showed allele frequency differences between the western hemisphere and Africa with the C 2 contrast statistic (Olazcuaga et al., 2020). We considered SNPs with an empirical p-value < .05 and FDR < 0.1 as the best candidates for being subject to selection between Z. indianus's introduced and native ranges (see Supporting Information for further details).

| Genotyping
Using whole-genome sequence data we identified 22,911,754 hardfiltered SNPs across all samples included in this study. As expected,

| Population structure
Principal component analysis (PCA), genetic clustering, and phylogenetic analyses all identified genetic differentiation between introduced and native populations (Figure 1). Among Z. indianus samples, the majority of genetic variation was structured between individuals from introduced and native parts of the species' range (PC2 accounts for ~8.6% of genetic variation; Figure 1a; PC1 separates the two species Z. indianus and Z. africanus and accounts for 39.0% of genetic variation [not shown]). Phylogenetic analysis with snphylo and astral provided further support for differentiation between introduced and native populations: all individuals sampled from the western hemisphere were more closely related to each other than to individuals sampled from Africa, and all African individuals were more closely related to each other than to individuals from the western Hemisphere (Figure 1c; Figure S4 for astral tree). Phylogenetic analyses also indicate differentiation between west African and east African populations (Figure 1c; Figure S4 for astral tree). haplotypes. This analysis also suggests a relatively recent expansion within Africa, with no notable phylogenetic clustering of mt haplotypes among individuals sampled from this continent (Figure 2a).
The single Z. gabonicus carries a mt haplotype similar to the Z. indianus from the western hemisphere. This individual also clusters with Z. indianus from the western hemisphere in analyses using genomewide data (Figure 1a,b). However, we do not know the exact collection location or history of the isofemale line that this individual is from, so do not analyse it further.

| Fine-scale relationships across the genome
We next took a phylogenomic approach to test relationships among populations across nonoverlapping genomic windows, each containing 500 SNPs. Topology weights provide evidence for ongoing lineage sorting and weak phylogenetic divergence across Z. indianus populations ( Figure 3). Specifically, the most common outcome of the TWISST analysis was that a window received ambiguous support (i.e., all three topologies shown in Figure 3a received a weight <0.5; range across comparisons: 31.36%-62.74% of windows; grey bars in Figure 3b). Ambiguous support could be the result of ongoing lineage sorting (i.e., a lack of differentiation or divergence) or a lack of sufficient variation to construct a reliable phylogenetic tree. We found that windows that received ambiguous support had, on average, 83.5 segregating sites, while windows that received support for one of the alternate topologies had an average of 87.9-93.2 segregating sites ( Figure S5). However, many windows classified as "ambiguous" contained a number of segregating sites similar to windows that received unambiguous topology weights, suggesting that ambiguous support was driven by a lack of differentiation rather than a lack of information ( Figure S5). We conducted additional analyses to test the hypothesis that populations in the western hemisphere share a genomic signature of a single admixture event that predated a shared range expansion across the western hemisphere (Section 3.4).
We analysed allele frequencies to estimate support for ad-  (Figures 4b and 3, respectively) therefore both support admixture among populations in Africa and mixed African ancestry within introduced populations in the western hemisphere.

| Demographic modelling
We conducted demographic modelling to test whether our data supported a demographic history including admixture over histories that lacked admixture ( Figures S1-S3). Across eight different demographic histories, each run under three different scenarios of population size change, those that modeled an admixture event were consistently preferred over those without admixture (Table S3).
Across the three scenarios of population size change, the one that allowed for exponential growth in each contemporary lineage was strongly preferred over scenarios where populations were either a constant size or experienced discrete changes in size. The overall best-fit demographic history was one where the North American lineage was colonized by the western African lineage after there was a large admixture event between the two African lineages (scenario F I G U R E 3 Widespread variation in phylogenetic relationships across the genomes of Zaprionus indianus. Vertical bars in (a) represent genomic windows and are coloured based on weights for each of the three possible four-taxon unrooted topologies (see inset). We find that, for the majority of windows, support for any one of the three topologies tended to be ambiguous (i.e., all topologies received support less than 0.5). (b) Counts of windows that supported different topologies, for each population tested. Blue bars show the number of windows that support shared west-African ancestry (topology 1), pink bars shared east-African ancestry (topology 2), orange bars "divergent" ancestry (topology 3), and grey bars where neither of the three tested topologies received a majority of topology weight ("topology" 4).  (Table S3). Under the best-fit demographic scenario, the timing of the split between the North American and African lineages was 603 generations ago (~35-60 years ago assuming a generation time between 3 and 5 weeks) and the timing of admixture between the two African lineages was 427,950 generations ago, with the African lineages experiencing ongoing gene flow throughout their history (Table S4).

| Selection among introduced populations of Z. indianus
We next estimated regions of the genome with evidence of selection in Figure S7-S13), and 74 (19.3%) and 2 (1.2%) were identified in all three populations (blue bars; Figure 5a), for windows on the autosomes and X chromosome, respectively. We generated expected numbers of overlapping outlier windows by randomization (100,000 replicates) and found that for all comparisons made above, the observed number of overlapping windows was greater than expected by chance (all empirical p-values < .0001; Figure S14). Only the situation where 2 windows were shared among all three populations was recovered during randomization, and this was only in 1 of the 100,000 independent samples.
We also identified 66 (autosomal) and 13 (X chromosome) windows that were outliers in at least one of the three between-range comparisons and at least one of the two within-native range comparisons (17.2% and 7.5% of unique outlier windows; orange bars in Figure 5a; empirical p-value from randomization = 0; Figure   S14). These windows represent a shared signal of differentiation between populations in the western hemisphere and Africa, and among African populations. Finally, we found 92 (autosomal) and 8 (X chromosome) windows that were outliers in between-range and Analysis with baypass identified 98 SNPs as outliers, all of which were located in outlier windows identified in the PBS analysis. baypass outliers were distributed across 19 scaffolds. Nine of these SNPs are located on putatively autosomal scaffolds and 89 are on putatively X scaffolds.
We used topology weights inferred with twisst to estimate the origin of putatively adaptive alleles with evidence of selection in at least two of the between-range comparisons (i.e., 158 and 29 PBS outlier windows on the autosomes and X chromosome, respectively; blue-grey bars in Figure 5a). We focused on these windows because they represent the windows with the strongest evidence of being under selection in introduced populations in the western hemisphere, while still providing a reasonable number of windows to explore ancestry across. We found that the distribution of ancestry (i.e., topology weights) across these outlier windows was significantly different from the genomic background for all three populations in the introduced range in the western hemisphere (Pearson's Chi-squared tests; all p < .03; Figure 5b). Most notably, putatively selected genomic windows were much more likely to support a topology where the introduced population has diverged from both African lineages and less likely to be in a genomic region with ambiguous phylogenetic relationships (Figures 3 and 5b; orange and grey bars, respectively).
This pattern is not surprising given that the PBS relies on differentiation to identify putatively selected regions. We also found a stronger correlation in ancestry across autosomal outlier windows compared to nonoutlier windows in all pairwise comparisons between populations from the western hemisphere (Figure 5c), but this was only the F I G U R E 5 Evidence of shared selection among populations of Zaprionus indianus in the western hemisphere. (a) Overlap in outlier windows in different population comparisons. Grey bars show the number of unique windows with evidence of selection ("all unique") based on the PBS being in the top 99% for at least one of the comparisons between a population in the western hemisphere and African populations. Blue-grey and blue bars show the number of windows with evidence of selection in two ("shared two") or all three ("shared three") of the comparisons made between populations in the western hemisphere and those in Africa. Yellow and orange bars show the number of "all unique" windows that also display evidence of selection within Africa ("+Africa") or among introduced populations ("+Introduced"). (b) Ancestry estimates based on topology weights reported from twisst for genomic windows that overlapped a window with evidence of selection in at least two introduced populations in the western hemisphere ("shared two" category in panel (a); refer to Figure 3 for summaries of genome-wide patterns). (c) Correlation in ancestry estimates from twisst between populations in the western hemisphere ("within west. hem.") versus between populations in the western hemisphere and the sample from India ("west. hem. vs. India") for different genomic regions (2) (13)

| DISCUSS ION
Introduced species are becoming an ubiquitous feature of biological communities around the globe (Capinha et al., 2015;Helmus et al., 2014;Levine, 2008;Simberloff et al., 2013). As such, understanding the demographic and evolutionary origins of introduced species is important if we are to understand evolution and biodiversity in the Anthropocene. Research has shown how introduced populations can rapidly differentiate from populations in their native range and adapt to their introduced range (Dupuis et al., 2018;Koch et al., 2020;Ma et al., 2020;Stuart et al., 2021;Zayed & Whitfield, 2008); however, the origin of that adaptive variation is seldom known (however, see Calfee et al., 2020). In this manuscript, we leveraged genome sequences to provide evidence that introduced populations of Z. indianus found at geographically distant locations in the western hemisphere share ancestry with differentiated populations across their native African range (Figures 3 and 4), have a shared set of loci that have been subject to selection in the western hemisphere (Figure 5a), and that those loci are located in genomic regions carrying west-African and east-African ancestry (Figure 5b). Below we discuss how patterns of variation shared among the genomes of these flies helps inform our understanding of the evolutionary processes shaping their introduction into the western hemisphere.

| Differentiation among populations
We found evidence for genome-wide genetic differentiation between geographically-distant populations in the western hemisphere and native populations in Africa (Figures 1-4). Using phylogenomic analysis, we show that the introduced populations in the western hemisphere also harbour ancestry shared with both east and west-

| A signal of admixture in introduced populations of Z. indianus
Admixture has been shown in a number of species that have recently expanded their ranges in association with human activities.  (Oduor et al., 2015;Viard et al., 2020). An important aspect of admixture in introduced species that requires further study is the phenotypic effects of that admixture and whether admixture actively promotes or facilitates range expansion and/or adaptation to novel environments.   (Booker et al., 2020;Li et al., 2012).

| Selection in the introduced range
Combining genomic and phenotypic data with experiments estimating fitness in different environments would help to address this challenge. For example, functional analyses of candidate genes could be used to test whether genetic differentiation has resulted in associated shifts in phenotypic traits of interest (Brand et al., 2020). One could then use experiments or estimates of selection in natural populations to confirm the role of particular genotypes and/or phenotypes in affecting an individuals' performance or fitness in different environments (Barrett et al., 2008;Marques et al., 2017;Nosil et al., 2018;Powell et al., 2020).

| The source of putatively adaptive genetic variation in introductions
We found that regions of the genome that are strongly differentiated between introduced and native populations of Z. indianus harbour both west-and east-African ancestry, suggesting that admixture between these two lineages has played a role in introducing adaptive genetic variation into the population expanding across the western hemisphere. The relative frequency of genomic windows with shared east or west-African ancestry is different between differentiated windows and the genomic background (compare Figures   5b and 3b, respectively). In order to better understand the adaptive consequences of admixture in populations of Z. indianus further work that identifies potential traits affecting fitness, and their underlying genes, is needed. Quantitative-trait locus and admixture mapping represent promising approaches through which this goal could be achieved (Buerkle & Lexer, 2008;Malek et al., 2012;Powell et al., 2020;Rieseberg et al., 2003). Without knowledge on the specific phenotypes contributing to fitness in different environments across Z. indianus's introduced and native ranges we cannot comment on the relative importance of admixture for adaptation in this system. However, highly differentiated genomic windows that harbor east-or west-African ancestry provide an obvious starting point from which to explore links between genetic variation, phenotypic variation, and fitness. Identifying these links could provide valuable insight into the geographic and evolutionary origins of the genetic variation that help facilitate successful range expansions.

| CON CLUS IONS
Understanding when and where admixture and local adaptation occur during the evolutionary history of introduced species is an important challenge in modern evolutionary biology. Previous work has shown how introduced species can experience admixture within their introduced range and adapt to novel environments they experience (Barker et al., 2017;Colautti & Barrett, 2013;Dlugosch & Parker, 2008;Gibson et al., 2020;Kolbe et al., 2004;Olazcuaga et al., 2020;Popovic et al., 2020;Simon et al., 2020). However, adaptations that help facilitate range expansions associated with human activities may also evolve in the invading species prior to colonization and range expansion (Hufbauer et al., 2012). Here we have provided support for the latter scenario in populations of Z. indianus, where there is a signature of admixture and selection that is shared across geographically widespread populations in the western hemisphere. Further work is needed to increase our understanding of the adaptive consequences of admixture and the timing of selection on phenotypes that help to facilitate range expansions displayed by many invasive, weedy, or human-commensal species.

ACK N OWLED G EM ENTS
We thank S. Ingley, M. Cenzer, S. Rudman, and J. Jaenike for providing flies from locations in the eastern USA, and P. Schmidt, E. analysis, decision to publish, or preparation of the manuscript.

CO N FLI C T O F I NTE R E S T
The authors declare no conflicts of interest.