Locus- specific introgression in young hybrid swarms: Drift may dominate selection

Closely related species that have previously inhabited geographically separated ranges are hybridizing at an increasing rate due to human disruptions. These human-mediated hybrid zones can be used to study reproductive isolation between species at secondary contact, including examining locus- specific rates of introgression. Introgression is expected to be heterogenous across the genome, reflecting variation in selection. Those loci native hybridization. We genomic cline We 11.4% of cline the genome wide and 17.6% of all excess rates of introgression. Based on simulations, we believe that many of these markers the genome- wide average due to drift, rather than because of selection, and we suggest that these simulations can be useful as a null distribution for future studies of genomic clines. Future work on red deer and sika could determine the policy implications of allelic- due to drift rather than selection, and could use replicate, distinct hybrid down those loci that to


| INTRODUC TI ON
The rate of hybridization between closely related species that have recently come into secondary contact is increasing, due to human-assisted migration and environmental change (Grabenstein & Taylor, 2018;Parmesan & Yohe, 2003). While such hybridization is not necessarily negative (Hamilton & Miller, 2016), in many cases hybridization can cause problems for native species. If F1s are | 2105 inviable or sterile then hybridization is a loss of reproductive effort (Allendorf et al., 2001). Alternatively, the presence of viable, fertile hybrid offspring can lead to populations with large numbers of hybrids, and in the most extreme cases, whole populations comprised only of hybrid individuals (Allendorf et al., 2001). Biodiversity can be lost through hybridization, either if all remaining members of a species are hybrids (extinction via hybridization; Allendorf et al., 2001;Allendorf & Luikart, 2009;Rhymer & Simberloff, 1996;Todesco et al., 2016), or if particular endemic alleles are replaced by novel alleles introduced by backcrossing and driven to fixation via selection (as described by Petit, 2004).
Hybrid zones, whether naturally occurring or due to human interference, can be used as "natural laboratories" for research into selection and the genetics of reproductive isolation between species (Hewitt, 1988). The rate of introgression of alleles between species is expected to be heterogenous across the genome, reflecting variation in selection (Baack & Rieseberg, 2007). Backcrossing coupled with recombination will separate haplotypes that are commonly found together and create novel haplotypes on which selection can act on alleles in unique genetic backgrounds (Arnold et al., 1999). Alleles that move quickly across the species barrier are assumed to be under positive selection in their new genetic background, while alleles that do not introgress between species are candidates for contributing to reproductive isolation (Baack & Rieseberg, 2007). Drift will also be acting on these alleles, particularly if hybridization is rare or one of the parental populations is small. In these cases, we expect substantial variation in the degree of introgression across loci, as a result of the sampling error introduced by reproduction and recombination (Baird et al., 2003). If non-native alleles are increasing in frequency, whether due to selection or drift, we should apply the precautionary principle until we can determine the effects (positive or negative) of these alleles.
Identifying those endemic alleles that are most likely to be replaced by novel alleles gives a target for policy makers to reflect upon and consider protecting.
Geographic cline analyses have been used to determine the extent of hybridization between two species at a contact zone (Barton & Gale, 1993;Barton & Hewitt, 1985). Traditionally, the width of these geographic gradients of allele frequencies can be used to infer selection on each allele as it introgresses from one species to another across a landscape (Mallet et al., 1990). Genomic clines, which replace geographic gradients with hybrid indices, have been used in the same way, and have the advantage that they can be applied even when hybrids have a mosaic distribution, or exist as a hybrid swarm (Gompert & Buerkle, 2011Lexer et al., 2007). Genomic clines use a multinomial regression that predicts the probability of a particular genotype ( ) given a hybrid index (h), where: Here, is analogous to the location of the cline centre and can be interpreted as the direction of introgression, i.e., a positive means excess ancestry from species A to species B and negative means excess ancestry from species B to A. is analogous to the width of the cline and can be interpreted as the strength of the barrier to gene flow (Janoušek et al., 2015). Positive is interpreted as a narrow cline, where introgression is impeded, and negative is a wide cline, where introgression is faster than expected based on the genomic expectation (Gompert & Buerkle, 2009). α and β are not explicitly expected to covary with each other (although they are not fully independent), nor are α and β necessarily expected to covary with divergence estimates between the parental species in the system such as F ST (Charlesworth, 1998).
However, those loci that are both highly diverged between species (i.e., high F ST ) and slow moving (large positive β) are good candidates for loci involved in reproductive isolation (Gompert & Buerkle, 2009;Lexer et al., 2007), particularly if they are not expected to be highly diverged because of other genomic constraints (i.e., recombination cold spots; Burri et al., 2015;Cruickshank & Hahn, 2014). Studies of naturally occurring hybridization regularly find many markers, spread across the genome, with significant α and β estimates, and typically find more loci that are significant for α than β (but see Pulido-Santacruz et al., 2018 who found no divergent α or β SNPs between either woodcreeper (Willisornis) or antbird (Xiphorhynchus) species pairs). For example , Janoušek et al., (2015) found that as many as 70% of SNPs diverged from genome-wide expections in a house mouse (Mus musculus musculus and M. m. domesticus) hybrid zone, Parchman et al., (2013) using 59,100 SNPs found more than 1000 significant for α and more than 400 significant for β between manakin species (Manacus candei and M. vinellinus), and Sung et al., (2018) reported ~30% of 45,384 SNPs with significantly diverged α and ~1% of SNPs with significantly diverged β between iris species (Iris hexagona and I. fulva). The vast number of reported genome wide SNPs with excess α and β from many systems are unlikely to all be related to selection, especially given that selection must be extremely strong to be detected at the genome-wide level in artificial selection studies (e.g., Castro et al., 2019). Simulations of admixed populations that varied in population size found that, particularly with a population size of only 100, both α and β estimates could be quite variable, and when loci under selection were simulated, particularly when there was weak selection and low levels of admixture, there were high false discovery rates (Gompert & Buerkle, 2011). Before genomic regions can be considered candidates to be responding to selection, careful consideration of expections due to nonselective forces must be undertaken (Gompert & Buerkle, 2011).
The red deer (Cervus elaphus) is an emblematic animal native to Scotland. It was named as one of "Scotland's big 5" in a campaign to increase engagement with wildlife ran by the Scottish Government between (NatureScot, 2016Scottish Wildlife Trust, 2013), and is known for its large body size, large antlers and bright red summer coat. Red deer are abundant through much of Scotland and they are popular for hunting (deer stalking) and with tourists and unpopular for their ecological impacts, particularly on young trees.
The red deer is currently a species of least concern, but the greatest threat to them in Europe is introgression by Japanese sika (C. nippon; IUCN, 2020). Physically smaller sika were introduced to Scotland in the late 19 th century, and have since hybridized with the red deer (Ratcliffe, 1987). In some parts of the Kintyre peninsula, Argyll, more than 40% of sampled phenotypic red deer and sika individuals are hybrids according to 50,000 SNP markers, with the majority being the result of multiple generations of backcrossing (McFarlane, Hunter, et al., 2020;. Hybrid deer tend towards an intermediate phenotype and thus are smaller, have smaller antlers, and are more likely to have the spots typical of sika than parental species red deer (Bartos et al., 1981). Initial hybridization may be constrained by the substantial size difference between species, but it is clear that at least some F1s and many backcrosses are fertile (Harrington, 1979;McFarlane, Hunter, et al., 2020;. While there is a trend from red deer in the north to sika in the south of Kintyre, the distribution of hybrids does not follow a cline, being instead concentrated in specific areas (Senn et al., 2010), and we have thus recently redefined this system as a "bivariate hybrid zone" (McFarlane, Hunter, et al., 2020;McFarlane & Pemberton, 2019;. Additionally, in a study using 20 microsatellite markers, there was no evidence that the number of hybrid individuals was changing over a period of 15 years (Senn et al., 2010). Taken together, the red deer and sika system in Scotland is an excellent model for understanding how hybridization between a native and introduced species can play out genetically.
In this study, we sought evidence among red-sika hybrids that specific genome regions have introgressed more or less than expected under neutrality, in ways that might be interpreted as being due to selection. We used 45K SNP genotypes in 222 Kintyre hybrid deer to estimate genomic clines and show that, as in the other studies cited above, many loci exceed background expectation in terms of direction of introgression α and cline width β. We paid particular attention to the X chromosome, as we have previously found relatively more diagnostic and ancestry informative markers on the X than on the autosomes (McFarlane, Hunter, et al., 2020;. We then conduct population genetic simulations to investigate admixture scenarios that shed light on the likely roles of drift and selection in generating these results. Scotland (now Forestry and Land Scotland) as part of normal deer control measures. Deer were shot as encountered, without regard to the phenotype of the animal (Smith et al., 2018). Sample collection consisted of ear tissue and has been previously described elsewhere (Senn & Pemberton, 2009;Smith et al., 2018). Samples were either preserved in 95% ethanol or frozen for long-term storage.

| DNA extraction and SNP genotyping
SNPs were genotyped on the Cervine Illumina iSelect HD Custom BeadChip using an iScan instrument following manufacturer's instructions as in Huisman et al., (2016). When this SNPchip was developed, SNPs were selected to be spaced evenly throughout the genome based on the bovine genome with which the deer genome has high homology (Johnston et al., 2017). In the present study we have used the bovine map as this allows use of all of the SNPs, including those that are not polymorphic in red deer, and thus were difficult to map.
The majority of the 53K SNPs (45K after quality control; see below) included were selected to be polymorphic in red deer, 4500 SNPs were selected to be diagnostic between either red deer and sika or red deer and wapiti (Cervus canadensis) (Brauning et al., 2015). While one pool of 12 sika from Kintyre were whole genome sequenced during the development of this SNP chip, the main focus was on polymorphic SNPs in red deer on Rum, a well-studied, isolated island population of red deer in the inner Hebrides (Brauning et al., 2015).
We used the DNeasy Blood and Tissue Kit (Qiagen) according to the manufacturer's instructions to extract DNA for SNP analysis, with the exception that we eluted twice in 25 μl buffer TE to obtain DNA at a sufficiently high concentration. Concentration was assayed using the Qubit dsDNA BR Assay Kit (Invitrogen). Any samples below 50 ng/μl were vacuum-concentrated, re-extracted or omitted from SNP analysis. We used a positive control twice on each 96 well plate to check for consistency between batches (Huisman et al., 2016). We scored genotypes using GenomeStudio using the clusters from Huisman et al., (2016), and clustered SNPs manually if they could not be resolved in these clusters (McFarlane, Hunter, et al., 2020;. All quality control was carried out in PLINK (Purcell et al., 2007).
We excluded individual samples with a call rate of <0.90, and deleted loci with a minor allele frequency of <0.001 and/or a call rate of <0.90 (as in McFarlane, Hunter, et al., 2020;. We did not exclude SNPs based on Hardy Weinberg Equilibrium (HWE) as highly differentiated markers between red and sika are not expected to be in HWE.
In McFarlane, Hunter, et al., (2020)) and McFarlane, Senn, et al., (2020)), we used ADMIXTURE (Alexander et al., 2009) to assign a Q score to each individual. Using the credible intervals (CI), we assessed individuals as pure sika, if the CI overlapped 0, pure red deer if the CI overlapped 1 or hybrid if the CIs overlapped neither 0 or 1 (McFarlane, Hunter, et al., 2020;. Of the 513 genotyped deer from Kintyre, 222 were assigned as hybrids, 159 as red deer and 132 as sika. We use these species assignments in the analyses in the present paper.

| Diversity
We estimated genetic divergence between red deer and sika in Kintyre using the hierfstat package in R (Goudet, 2005). We compared only individuals that previous analysis identified as pure species red deer or sika (McFarlane, Hunter, et al., 2020; and we estimated F ST at each individual locus following Nei (1987). We used a linear model in R (R Core Team, 2013) with F ST as the response variable, and the X chromosome as a reference to ask how the F ST of SNPs on the autosomes differed from those SNPs on the X chromosome.

| Bayesian genomic clines
The genomic clines method is designed to detect loci with alleles that have introgressed at rates that deviate from genome-wide expectations, as those alleles that move faster than expected might be under selection in the novel parental genomic background and those loci that move more slowly than expected might be related to post zygotic reproductive isolation (Lexer et al., 2007). We used the program bgc (Gompert & Buerkle, 2012) to estimate Bayesian genomic clines across the hybrid individuals in our population. bgc compares the genotype of each locus in each individual to that individual's hybrid index to estimate values of α, which is comparable to a geographic cline centre and β, comparable to a geographic cline slope (Gompert & Buerkle, 2012).
Red deer and sika were each assigned to parental populations, and all admixed individuals were put into a "hybrid population". This is in contrast to some previous analyses in which individuals are separated based on whether they are from a population in which admixture occurs (Royer et al., 2016;Taylor et al., 2014;Trier et al., 2014). We calculated allele frequencies for the two parental populations using PLINK (Purcell et al., 2007), while hybrid genotypes were considered individually. We ran bgc on both all 44,997 SNPs and on thinned SNPs (n = 6684, thinned according to recombination in PLINK, r < 0.2), and found the same frequency of SNPs significant for α and β across the genome, so below we report analyses run on all SNPs. We ran bgc five independent times, for 50,000 iterations each time, with a burnin of 25,000 and a thinning interval of 200, and assessed convergence by eye. To be as conservative as possible when determining which loci significantly deviated from the genome-wide expectation, we used the widest possible confidence intervals for each locus from the five chains (Janoušek et al., 2015). Loci with credible intervals that did not overlap with 0 are referred to as "excess" loci. Additionally, we assumed a normal distribution for each α and β with the same mean and standard deviation as the empirical data. We then asked which SNPs had α or β estimates in the 2.5% upper and lower tails of this distribution. Those loci outside of the 95% distribution are referred to as "outlier loci".

| SLiM simulations to characterize expectations under drift
We wanted to determine the impact of population size and history on the potential role of drift in hybridized populations. Theoretically, there is an expectation that rare, recent hybridization should result in extremely variable rates of introgression across the genome (Baird et al., 2003). We used SLiM (Haller & Messer, 2017) to build some simple models that varied the rate of admixture, the length of time admixture has been occurring and the abundance ratio of each parental type population (1:1 or 3:1). We simulated 1000 individuals with a single chromosome of 1e 7 markers, split into two populations of either 500 of each species or 250 and 750 of the two species, and allowed these parental populations to evolve for 3000 generations with a standard rate of neutral mutation (0.01), typically resulting in an F ST between 0.40 and 0.60. Note that we did not simulate any markers to be under selection. We then allowed migration and hybridization between the two populations at a given rate (0.002, 0.02, or 0.2) for a given number of generations (10, 100 or 1000). While 1000 generations is substantially longer than the red deer and sika have been in secondary contact, we designed these simulations to be useful to a wide variety of systems, including those with much more ancient hybridization. We then took the SNP data for all individuals alive at 10, 100 or 1000 generations and put them through our PLINK-ADMIXTURE-bgc pipeline (as above). One deviation from the above pipeline is that due to computational constraints bgc was only run for 2500 iterations, with a burnin of 200 iterations and a sampling interval of two. We ran bgc five times for each simulation, and, as with the empirical analyses, categorized loci based on the widest possible CIs. As bgc analyses may not have converged over such a small numbers of iterations, this could lead to wider CIs than if convergence had occurred in all chains, making this analysis conservative with respect to finding excess loci. We ran each simulation 50 times to determine what proportion of markers deviated significantly from the genome-wide expectation. We did not identify outlier loci for α and β, as this is less commonly done in the literature, and is difficult to standardize across studies.  Table S1).

| bgc
We found substantial variation between loci in the location and rate of genomic clines between red deer and sika. Positive α can be interpreted as introgression from red deer to sika, while negative α is introgression from sika to red deer. While most of the 44,997 SNPs that we examined were not significant, there were many SNPs that were excess or outliers compared to the genome-wide expectation based on hybrid indices. As noted above, a SNP was considered significantly excess if the 95% confidence interval did not overlap  β (Table 1). As was the case when we used all the SNPs, we found many more excess loci with positive α than negative α (2534 vs. 7) and many more positive than negative α outlier AIM SNPs (2234 vs. 0), suggesting more extreme introgression from red deer into sika than from sika into red deer. We found fewer positive than negative excess β AIM SNPs (101 vs. 2301), and fewer positive than negative outlier β AIM SNPs (4 vs. 2179). Similarly to when we examined all SNPs, excess and outlier α and β SNPs were found across the genome. In contrast to when we examined all SNPs, there was a substantially higher proportion of AIM SNPs that were different than the genome wide expectation (69.3% DM&AM significant excess vs. 19.1% from all SNPs and 65.5% AIM significant outlier vs. 9.5% from all SNPs).

| SLiM simulations
Across the scenarios that we simulated, we found that the majority of simulated loci were not significant for either α or β estimates.
However, we did find that in cases where there had only been 10 generations of admixture, and a low level of hybridization, most loci had either a positive or negative β estimate, suggesting faster or slower than expected movement through the cline (Figure 3, panels "sle", "slo" and "sme"). While the proportion of loci with significant β decreased with increasing number of generations and increased admixture, there are loci with significant β found in every other simulated scenario, with sometimes as many as 40% of loci introgressing at extreme rates when compared to the average rate of introgression F I G U R E 2 (a) α estimates with 95% credible intervals for SNPs significantly different from zero ("excess"), from a bgc analysis of a red deer x sika hybrid swarm in Kintyre, Scotland. α = 0 can be interpreted as the genomic cline centre, positive α estimates indicate alleles that are more shifted from red deer into sika than the genome-wide expectation, and negative αs indicate alleles shifted from sika into red deer. SNPs are displayed on the bovine chromosomes rather than the deer linkage map. (b) β estimates with 95% credible intervals for SNPs significantly different from zero ("excess"), from a bgc analysis of a red deer x sika hybrid swarm in Kintyre, Scotland. β = 0 can be interpreted as the average rate of introgression, positive β estimates are indicative of a narrow cline, and slow introgression, while negative β estimates are analogous to faster than average introgression. SNPs are displayed on the bovine chromosomes rather than the deer linkage map across the entire genome. Additionally, in scenarios where hybridization has been progressing for longer (Figure 3, m and l rows), as many as 15% of loci have negative alpha estimates. This appears to be more extreme with increased rates of hybridization.

| DISCUSS ION
Using 44,997 SNPs, we found extremely variable F ST between red deer and sika across all chromosomes, although the X chromosome had a substantially higher F ST than the autosomes. We also found 5128 α excess SNPs, of which 3618 were outliers and 3618 β excess SNPs of which 3128 were outliers (Table 1). When we compared these excess and outlier SNPs to our list of AIMs, we found a high proportion of AIM loci were excess and/or outliers (Table 1). This suggests that some caution should be used when interpreting the results of genomic clines of diagnostic or ancestry informative markers, as there could be a relationship between informativeness and extreme clines of these markers.
The higher F St on the X chromosome is not unexpected.  & Rieseberg, 2007;Barton, 1979). Genes that are associated with postzygotic isolation have been recorded on the X in other systems (Qvarnström & Bailey, 2009). As we discuss below, those markers with extremely diverged F ST and slow rates of introgression, particularly those on the X chromosome, are good candidates for further examination for a role in reproductive isolation.

F I G U R E 3
We used SLiM (Haller & Messer, 2017) to simulate admixing populations that had been in secondary contact for either a short (s, 10 generations, top row), medium (m, 100 generations, middle row), or long (l, 1000 generations, bottom row) length of time since admixture started. For each length of secondary contact, we also simulated rates of migration and interbreeding between populations, as either low (l, 0.002, left two columns), medium (m, 0.02, middle two columns), or high (h, 0.2, right two columns), and the abundance ratio of each pure population, as either even (e, 1:1) or odd (o, 1:3). Each simulation was run 50 times, no selection was simulated, and we categorized (into nine categories; legend) the direction and rate of introgression among simulated hybrid individuals using bgc. Overall, introgression at most loci did not deviate from genome-wide expectation, but especially in cases with a short time since admixture started and a low rate of admixture (top, left two panels), many loci introgressed faster than genome-wide expectation despite the total absence of any selection in the simulations approximately 800), as it was a function of the admixture rate, and the stochasticity built into these individual-based simulations. In any case, the 222 deer hybrids from Kintyre are substantially fewer than the 500 or 1000 hybrid individuals that were simulated in the best performing models by Gompert and Buerkle (2011). This is good reason to be cautious about interpreting excess or outlier α estimates as evidence for selection on these loci.
We found substantially more loci with positive than negative excess and outlier α's, indicating that more alleles have shifted from red deer into sika than from sika into red deer. There are three possible explanations for this, which are difficult to distinguish in our study system. First, there could be asymmetry in backcrossing, such that there is more backcrossing into sika than there is into red deer.
This was previously indicated in an analysis of microsatellite data by Goodman et al., (1999) who estimated that the rate of backcrossing into sika was twice the rate of backcrossing into red deer (H = 0.002 vs. H = 0.001), although based on mitochondrial DNA, it is clear that backcrossing does proceed in both directions (Smith et al., 2018).
Second, the pattern of increased positive vs. negative α estimates could be due to marker selection. The SNP chip we used was mainly designed to provide polymorphic loci for studies within red deer, and has just 2250 SNPs that were selected to be diagnostic between red deer and sika (Brauning et al., 2015), although ultimately only 629 SNPs are diagnostic in our study population (McFarlane, Hunter, et al., 2020;. Across the other loci on the SNP chip, the sika population is less diverse than the red deer probably due to a demographic history of bottlenecks and the fact the chip was primarily designed for use in red deer. These two features together make it difficult to document whether shared alleles are introgressing from sika into red deer, whereas it is easier to document the introgression of private alleles from a large, outbred, polymorphic population of red deer into sika. Further, it is difficult to quantify the relative contribution of each of these processes. The third possible mechanism explaining the seemingly higher proportion of red deer alleles introgressing into sika than in the other direction is that, as sika are an introduced species in the UK, it is possible that some alleles that are introgressing from red deer to sika are indeed the result of adaptive introgression, because they increase the fitness of hybrids. Adaptive introgression can involve a faster response to selection in a new environment than selection on a new mutation since the allele is already proven, albeit in a different background (Hedrick, 2013), and has been suggested to be a potentially positive conservation outcome of anthropogenic hybridization (Hamilton & Miller, 2016). Without fitness estimates, it is extremely difficult to demonstrate adaptive introgression in wild populations (Taylor & Larson, 2019), making it difficult to tease apart these three possibilities.
Empirically, we found 3349 (~6.7%) SNPs with a negative, excess β estimate (3006 negative β outliers), suggesting that these SNPs were introgressing faster than expected between red deer and sika.
While red deer and sika have been hybridizing in Scotland for at least 6-7 generations, it is possible they may have hybridized prior to introduction to Scotland, as hybridization was reported in the Irish source population before animals were introduced to Kintyre (Powerscourt, 1884). Either way, this is a case of recent hybridization. The rate of backcrossing has previously been estimated using 11 microsatellite markers as 0.002 into sika and 0.001 into red deer (Goodman et al., 1999), which is consistent with our simulated "low" admixture parameter. The ratio of red deer to sika is variable across Kintyre (Smith et al., 2018). Thus, our empirical work is most consistent with the "sle" or the "slo" simulations, where we found that most SNPs were excess β, either positive or negative (Figure 3). Thus, we found substantially fewer significant negative β SNPs than we may have expected from the simulations, highlighting that these simulations are just a toy example, rather than a highly accurate simulation of this natural system. For comparison, many studies of hybridization that have used bgc have not found significant β estimates. For example, a recent study of ibis (Plegadis falcinellus, P. chihi, P. ridgwayi) hybridization using diagnostic markers found no significant negative β SNPs, in spite of the ibis hybrid zone probably only being 60 or so years old (Oswald et al., 2019). In contrast, a study of recent sole (Solea aegyptiaca x S. senegalensis) hybridization found 52% of all loci exhibited an extreme β value, with 26% of all loci exhibiting a negative β estimate (Souissi et al., 2018 (Baird et al., 2003), the potential for extremely different results depending on the marker panel used (Table 1), the age of a hybrid zone, and rate of admixture between species ( Figure 3). As such, a more comprehensive meta-analysis approach is probably needed to understand factors driving genomic cline variation across taxa.
Although we cannot be sure that any loci demonstrate selection in our study system we found a number of SNPs that exhibited extreme introgression as judged by α or β estimates.  Figures S2 and S3), suggesting little evidence for strong selection or "islands of differentiation" (Cruickshank & Hahn, 2014;Turner & Hahn, 2010;Wolf & Ellegren, 2017). Simulations of genomic clines that included epistatic interactions on reproductive isolation (i.e., Bateson-Dobzhansky-Muller interactions; Dobzhansky, 1937;Muller, 1940) are difficult to detect using bgc (Gompert & Buerkle, 2011), so we would not claim the lack of evidence in this case as evidence of the absence of genes involved in reproductive isolation in this system. Substantially more work is needed to address this question.
There is an expectation that when there is recent, rare hybridization, the genomic outcome of introgression is extremely stochastic (Baird et al., 2003), and it has previously been noted how difficult it is to derive a null distribution for locus-specific introgression (Gompert & Buerkle, 2011). Drift can substantially increase or decrease the frequency of different blocks, in the complete absence of selection.
This is consistent with what we saw in our SLiM simulations, where, when we simulated 10 generations of admixture with a rate of admixture of 0.002, we found in some cases that 50% of markers had wider clines and 50% of markers had narrower clines than predicted from the genome-wide expectation ( Figure 3). As noted above, the hybrid population sizes also varied with admixture rate, particularly when hybridization was rare and had only been ongoing for 10 generations (scenarios sle and slo). This is consistent with untargeted sampling in wild populations, as, if hybridization is recent and rare, there will be proportionately fewer hybrids in the population. This confirms that extreme β estimates should not be taken as evidence of selection (Gompert & Buerkle, 2012), or of adaptive introgression (Taylor & Larson, 2019), as this introgression happens in the absence of selection. This is particularly true when hybridization is recent and rare, leading to relatively few hybrids in the population.
Previous neutral simulations of 25 generations of admixture with an admixture rate of 0.2, comparable to our she and sho simulations but with a simulated population size of 100, found substantial variation in the estimated α or β estimates, with α being more variable than β (Gompert & Buerkle, 2011). These simulations found that α or β were less variable when the population sizes simulated were 500 or 1000, although some outlier α or β loci were still found in some simulations in these cases (Gompert & Buerkle, 2011). As this pattern was less extreme when hybridization had been progressing for many generations (i.e., 100 or 1000), this provides an additional rationale for researchers to quantify the length of time admixture has been occurring in their system prior to drawing conclusions (Loh et al., 2013;McFarlane & Pemberton, 2019 To conserve a species in the presence of hybridization, we must first quantify both the number of individuals in the population that are hybrids, and the proportion of alleles that could be replaced by introduced alleles, i.e. in line with the gene-based theory of conservation (Petit, 2004). In our study area, we found approximately 43% of individuals are hybrids (McFarlane, Hunter, et al., 2020; and in the present study, we have identified 60 SNPs with both an excessive negative α and excessive negative β estimate, indicative of introgressive alleles moving from the introduced sika into the native red deer faster than expected. These SNPs are spread across 26 different chromosomes. Whether the pattern of these SNPs is the result of selection or drift, it is still the case that there are sika alleles that are spreading into red deer populations via hybridization faster than those at other loci. These are the genome regions that are of potential conservation concern for Scottish red deer as these alleles may most quickly replace their red deer alternates. Techniques such as admixture mapping could be used to try to link SNPs to phenotypes known to be under selection (Buerkle & Lexer, 2008), and then cross check these SNPs against those introgressing fastest. Such gene-targeted conservation is unlikely to be successful (Kardos & Shafer, 2018), particularly since many of the traits of interest in red deer (e.g., redness, antler size and shape, body size) are likely to be polygenic (Santure & Garant, 2018 (Gompert & Buerkle, 2011, so different methods must be employed to distinguish between alleles undergoing adaptive introgression or involved in reproductive isolation and those loci that deviate from genomic expectations due to stochastic processes. One approach would be to study replicate hybrid zones, on the assumption that stochastic processes will act independently in each instance of secondary contact, but selection will not. Loci which have consistent excess β estimates would be the best candidates for being under selection, either for or against introgression into a novel background. In house mice, it was found that 28/41 SNPs had different genomic clines between two replicates, as assessed using a likelihood ratio test that compared the clines, encompassing both α and β, suggesting that few if any of the extreme SNPs could be related to genetic incompatibilities or adaptive introgression (Teeter et al., 2010). While it should be noted that detecting signals of even very strong selection at the genome wide level is extremely difficult, requires substantial power and a strong signal (Castro et al., 2019), those SNPs with extreme β across multiple replicate hybrid zones would be strong candidates for being involved in either adaptive introgression, or impeding gene flow between species. Future research on red deer x sika hybridization could capitalize on replicate hybrid populations across Europe (e.g., Ireland: Smith et al., 2014;Lithuania: Ražanskė et al., 2017;and Poland: Biedrzycka et al., 2012) where the many points of sika introduction have generated natural replications of this.
Genomic clines allow for the estimation of local introgression, allowing us to identify particular regions of the genome that could be involved in adaptive introgression or, conversely, reproductive isolation between species at secondary contact. The difficulty in differentiating the patterns caused by selection versus those caused by drift, as we have simulated here, should not be taken as a deterrent to using such methods. Instead, we believe that there is an opportunity to make use of replicated hybrid zones, including those with varying degrees of hybridization (Mandeville et al., 2019) to compare genomic clines, as those loci with consistent rates of introgression in independent populations are more likely to be under selection. Further, future research could employ meta-analysis techniques across studies and species to quantify rates of introgression across the genome, while controlling for phylogeny, to make generalizations about selection across the genome at secondary contact.

ACK N OWLED G EM ENTS
We thank the Forestry and Land Scotland rangers, especially Fraser