Signals of differential introgression in the genome of natural hybrids of Caribbean anoles

Hybridization facilitates recombination between divergent genetic lineages and can be shaped by both neutral and selective processes. Upon hybridization, loci with no net fitness effects introgress randomly from parental species into the genomes of hybrid individuals. Conversely, alleles from one parental species at some loci may provide a selective advantage to hybrids, resulting in patterns of introgression that do not conform to random expectations. We investigated genomic patterns of differential introgression in natural hybrids of two species of Caribbean anoles, Anolis pulchellus and A. krugi in Puerto Rico. Hybrids exhibit A. pulchellus phenotypes but possess A. krugi mitochondrial DNA, originated from multiple, independent hybridization events, and appear to have replaced pure A. pulchellus across a large area in western Puerto Rico. Combining genome‐wide SNP datasets with bioinformatic methods to identify signals of differential introgression in hybrids, we demonstrate that the genomes of hybrids are dominated by pulchellus‐derived alleles and show only 10%–20% A. krugi ancestry. The majority of A. krugi loci in hybrids exhibit a signal of non‐random differential introgression and include loci linked to genes involved in development and immune function. Three of these genes (delta like canonical notch ligand 1, jagged1 and notch receptor 1) affect cell differentiation and growth and interact with mitochondrial function. Our results suggest that differential non‐random introgression for a subset of loci may be driven by selection favouring the inheritance of compatible mitochondrial and nuclear‐encoded genes in hybrids.


| INTRODUC TI ON
Hybridization is relatively common between closely related species that lack complete reproductive isolation (Capblancq et al., 2019;Mallet, 2005;Nosil, 2012).Hybridization enables recombination between lineages that come into contact, which can lead to genetic homogenization and impede or reverse the process of divergence (Abbott et al., 2013;Gow et al., 2006;Seehausen, 2006;Taylor et al., 2006).Hybrids often exhibit lower fitness than parental species due to endogenous selection (e.g., genetic incompatibilities between parental genomes) or exogenous selection (e.g., inferior fitness in local environmental conditions), processes that can limit hybrid persistence (Barton, 2001;Fierst & Hansen, 2010;Seehausen, 2004).However, in some cases, hybrids may exhibit higher fitness than parental species because of the acquisition of functional novelty or a new combination of advantageous traits (Seehausen, 2004).These novel features can provide fitness advantages to hybrids, for example, by optimizing cellular processes such as metabolism and immune response or by facilitating adaptation to distinct thermal environments (Martin et al., 2020).While hybrids typically express phenotypes that are combinations of parental traits (Grant & Grant, 1994), in some cases, hybrids may instead exhibit mostly or entirely the phenotype of a single parental species (Ferris et al., 1983;Hird & Sullivan, 2009;Jezkova et al., 2013a;Liu et al., 2010;Tegelström, 1987).The absence of obvious phenotypic introgression in hybrids has generally been attributed to either extensive directional backcrossing (i.e., breeding of the hybrids with only one parental species) or reduced introgression between hybridizing taxa due to major chromosomal incompatibilities (Llopart et al., 2005;Rieseberg et al., 1996).
Genetic introgression describes the incorporation of alleles from one population into the gene pool of another population (Anderson, 1949;Anderson & Hubricht, 1938;Harrison & Larson, 2014).Herein, we use the terms "hybridization" and "genetic introgression" to describe separate but related phenomena: hybridization refers to the process of interspecific mating, whereas genetic introgression describes the genetic outcome of hybridization and subsequent backcrossing and natural selection altering the frequencies of subsets of hybrid genotypes.During introgression, alleles that are not targeted by selection or that do not have incompatibilities with the recipient genomic background are expected to introgress freely, assuming that recombination occurs between these and the physically linked targets of selection.We refer to these loci as exhibiting "neutral introgression" (Figure 1), where a hybrid is equally likely to inherit an allele from either parental population.In this scenario, the only expected directional pattern is a decrease in the number of introgressed alleles from one parental population as the frequency of backcrossing with the second parental population increases.
Alternatively, if alleles (or allele combinations across multiple loci) from one parental species increase the relative fitness of individuals in the other parental population, these alleles are more likely to introgress (Harrison & Larson, 2014;Hedrick, 2013;Morales et al., 2017;Pardo-Diaz et al., 2012;Whitney et al., 2006); we refer to these loci as exhibiting "differential introgression" (e.g., Payseur, 2010;Walsh et al., 2016; Figure 1).Differentially introgressed loci may contain or be linked to alleles that confer a selective advantage to the hybrids in their local environment (Hedrick, 2013).These loci may also show patterns of differential introgression because they are involved in complex co-evolved gene interactions (Pardo-Diaz et al., 2012, 2015).For example, mitochondrial-nuclear interactions are fundamental to individual fitness and various biological processes (Burton, 2022;Burton & Barreto, 2012;Chang et al., 2016), including energy production and cellular respiration, both of which depend on coordinated operation between nuclear and mitochondrial processes, a coordination that can be disrupted by introgression (Wolff et al., 2014).As a result, selection will act to optimize these interactions (i.e., positive selection) and eliminate deleterious combinations of interacting alleles (i.e., purifying selection; Hill et al., 2019;Wolff et al., 2014).
It is possible to detect differential introgression by searching for F I G U R E 1 Visualization of neutral introgression and differential introgression scenarios with respect to the extent and pattern of nuclear introgression in Anolis pulchellus hybrids (A.pulchellus with A. krugi mtDNA).In neutral introgression, matings between A. pulchellus males and A. krugi females produce hybrids that exhibit genetic introgression.When alleles are not under selection, introgression occurs indiscriminately across the genome, leading to a pattern of random introgression among hybrids.In this scenario, the introgressed A. krugi alleles will differ among hybrids.In differential introgression, loci are under selection (or linked to a gene or genes under selection), leading to non-random introgression that manifests as identical or enriched patterns of alleles from A. krugi in the hybrids.Blue = A. pulchellus markers; red = A. krugi markers.Blue circles = A. pulchellus mtDNA; red circles = A. krugi mtDNA.
alleles from one species that are at a higher frequency in hybrids than expected at random (Ballard & Melvin, 2010;Boratyński et al., 2014;Doiron et al., 2002;Llopart et al., 2005).In practice, finding conclusive evidence of differential introgression is challenging, especially in natural populations.
Caribbean anole communities can be highly diverse, with up to 11 species coexisting in a single locality (Warheit et al., 1999).Despite frequent interspecific interactions, hybridization between anole species seems to be rare, and apparent hybridization has only been suggested in eight species pairs (Losos, 2009).The rarity of interspecific hybridization in anoles has been attributed to strong premating isolation mechanisms mediated by effective visual species recognition signals that prevent interspecific mating (Leal & Losos, 2015;Losos, 2009).
One such rare example of interspecific hybridization in anoles occurs between the sister species Anolis pulchellus (Puerto Rican Bush Anole) and A. krugi (Olive Bush Anole) in Puerto Rico (Jezkova et al., 2013a).These two species are often syntopic across Puerto Rico and are distinguished by their distinctive body-colour patterns and the coloration of their dewlaps (Figure 1), extensible throat fans used for intra-and interspecific communication (Fleishman et al., 2009(Fleishman et al., , 2016;;Leal, 1999).Both species are locally abundant and widespread; A. pulchellus generally occurs in grassy, sunny, warmer habitats, whereas A. krugi is typically associated with forested or forest-edge, partially shaded, cooler areas (Fleishman et al., 2009;Gunderson et al., 2018;Rivero, 1998;Schwartz & Henderson, 1991;Williams, 1983).A previous study documented unidirectional mtDNA introgression between these two species and inferred that hybridization between these anoles had occurred recently and independently at numerous localities (a minimum of 10 independent hybridization events were estimated; Jezkova et al., 2013a).Individuals of A. pulchellus with foreign (A.krugi) mtDNA phenotypically resemble "pure" A. pulchellus (those with native mtDNA, hereafter referred to simply as "A.pulchellus"), and large parts of western Puerto Rico are exclusively inhabited by A. pulchellus harbouring A. krugi mtDNA, suggesting that the latter may have a selective advantage over pure A. pulchellus (Jezkova et al., 2013a).However, the nature of the putative selective advantage of A. krugi mtDNA or patterns of nuclear genomic variation in these hybrids has not been explored.
In this study, we used genome-wide sampling of single nucleotide polymorphisms (SNPs) to investigate the nuclear genomic composition of A. pulchellus that carries foreign (A.krugi) mtDNA (hereafter referred to as "hybrids").We evaluated the extent to which these hybrids possess nuclear alleles from each parental species and gathered evidence for neutral and differential introgression of A. krugi alleles into the nuclear genome of the hybrids.Alleles experiencing neutral or differential introgression are predicted to yield distinctive genetic signatures (Figure 1).Alleles exhibiting neutral introgression are expected to be present in a random subset of the hybrids, due to stochasticity and a varying number of backcrossing events since hybridization (Figure 1).
In contrast, alleles with evidence of differential introgression should be present in the majority of the hybrids (Llopart et al., 2005(Llopart et al., , 2014;;Rieseberg, 2011;Rieseberg et al., 1996).We identified SNPs with signatures of differential introgression using a methodology that we developed and made available in the R package HybridFindR.We also identified signatures of differential introgression using existing methods that account for the demographic histories of parental and hybrid populations.We identified genes linked to each SNP that exhibited a signal of differential introgression by mapping our data to the genome of the congeneric species A. carolinensis (Alföldi et al., 2011).Finally, we conducted gene ontology (GO) analysis to identify molecular functions associated with those genes.We hypothesized that the introgression of A. krugi alleles into the nuclear genome of the hybrids is non-neutral and that a subset of introgressed alleles exhibits signals of differential introgression and is associated with genes that influence mitochondrial function (Jezkova et al., 2013a).

| Sampling design
We generated two SNP datasets, a medium-density (MD) dataset targeting ~10,000 loci/individual for 80 individuals and a high-density (HD) dataset targeting ~150,000 loci/individual for 10 individuals (Figure 2, Table S1), to complement mtDNA datasets previously developed for 211 A. krugi (Rodríguez-Robles et al., 2010) and 309 A. pulchellus and hybrids (Jezkova et al., 2013a).For the MD SNP dataset, samples included hybrids (n = 24 samples, 9 localities representing ≥10 independent hybridization events), A. pulchellus (n = 27, 10 localities) and A. krugi (n = 29, 10 localities; Figure 2 and Table S1).This sampling encompasses the major mitochondrial lineages, spans the geographic distribution of each group (A.pulchellus, A. krugi, hybrids), and includes individuals with published mtDNA sequences (Figure 2; Jezkova et al., 2013a;Rodríguez-Robles et al., 2010).The HD dataset consisted of four hybrids representing four independent introgression events: two A. pulchellus and four A. krugi (Figure 2 and Table S1).We assumed that hybrids between A. pulchellus and A. krugi can be determined using mtDNA sequences and that wherever hybridization between A. pulchellus and A. krugi occurred, A. pulchellus harbours A. krugi mtDNA haplotypes.Hybrids and A. pulchellus are geographically separated, with populations of hybrids being restricted to the western part of Puerto Rico and A. pulchellus occurring throughout the remainder of the island.Nevertheless, hybrid and pure A. pulchellus are known to co-occur at one locality in central western Puerto Rico (locality 17; Figure 2).

| Laboratory methods
We generated the MD genome-wide SNP dataset using doubledigest RAD (restriction-site associated DNA) sequencing (ddRADseq: Jones et al., 2012;Peterson et al., 2012).Genomic DNA was simultaneously cut with the rare-cutting restriction enzyme SbfI (5′-CCTGCA*G + G − 3′; New England Biolabs Inc.) and the commoncutting restriction enzyme Sau3AI (5′-*GATC -3′; New England Biolabs Inc.).Barcoded Illumina adapter oligonucleotides, which included an 8 bp unique molecular identifier, were ligated to the ends of digested DNA to allow for hierarchical pooling, multiplexing of samples and PCR clone filtering.Following adapter ligation, samples were pooled into sets of eight and size-selected for a range of 340-600 bp using a Blue Pippin device (Sage Science).After size selection, samples were amplified with pool-specific indexed primers and amplification products were further pooled into two groups (based on molarity calculations) on a Bioanalyzer (Agilent) using a DNA 7500 chip.We sequenced the final pooled libraries using 100 bp paired-end reads on an Illumina HiSeq 2000 lane.
We obtained the HD SNP data using single-digest RAD sequences from Floragenex, Oregon, which generated and sequenced RAD libraries following the methods outlined by Baird et al. (2008), Hohenlohe et al. (2010) and Emerson et al. (2010).Genomic DNA was cut with the PstI restriction enzyme; adaptor oligonucleotides and individual barcodes were ligated to the ends of digested DNA; and the resulting fragments were sequenced on the Illumina GAIIx platform with single-end 80-bp chemistry.
For both RAD datasets, we removed PCR clones, quality-filtered and demultiplexed read data using the functions clone_filter and pro-cess_radtags in Stacks v1.42 (Catchen et al., 2013), requiring a minimum Phred quality score of 33, and using default parameters for all other options.For both MD and HD datasets, we used the dDocent pipeline for quality trimming and for mapping the reads to the assembled genome of A. carolinensis (genome assembly AnoCar2.0;1,78 Gb; Alföldi et al., 2011) using BWA (Li, 2013; see Table S1 for information on genome assembly).We filtered the variants produced by dDocent using FreeBayes (Garrison & Marth, 2012), following the general steps recommended in the dDocent documentation (Puritz, Hollenbeck, et al., 2014;Puritz, Matz, et al., 2014) using vcftools and vcflib (Danecek et al., 2011).We also filtered variants (i.e., single-nucleotide polymorphisms; SNPs) to include those called in 50% of individuals (--max-missing 0.50) and with a minimum mean read depth among samples of 5 (--min-meanDP 5).Individuals with 60% or greater missing data were excluded, and we removed loci with minor allele frequencies lower than 0.05 (--maf 0.05) and SNPs that were in strong linkage disequilibrium (LD; Rho ≥0.96).LD was estimated using a custom R script developed by Finger et al. (2022).When SNPs were found to be in LD, we randomly selected one to include  in our analysis.We also analysed the sequences in dDocent de novo, that is, without mapping them first to the A. carolinensis reference genome (Alföldi et al., 2011).The de novo approach allowed us to analyse reads that could not be mapped due to the divergence between A. carolinensis and A. krugi and A. puchellus (see Supplemental Information for detailed de novo methods and results).
We estimated the individual hybrid index (HI) and 95% confidence intervals for each dataset using bi-allelic SNPs and the Bayesian Markov chain Monte Carlo (MCMC) method implemented in the R-package gghybrid v.1.0.0 (Bailey, 2020).The HI represents an estimate of the proportion of alleles that were inherited from one of the two parental species and is dependent on the allele frequencies of parental populations (Buerkle, 2005).HI estimates range from 0 to 1, with extreme values (i.e., values close to 0 or 1) corresponding to pure individuals of each parental species.When parental individuals are fixed for different alleles and half of the hybrid alleles are from either parental population, we expect the HI to be 0.5 for all hybrids (see Table S2 for a visualization).Alternatively, when parental populations (S0 and S1 populations; S = Source) are polymorphic and the S0 population possesses S1 alleles, HINDEX estimates are no longer 0.5, even though hybrid genotypes are half S0 and half S1 alleles (see Table S2 for a visualization).We used the recommended settings and filtered our datasets using a minimum minor allele frequency of 0.1 and a minimum allele frequency difference (minimum difference between parental populations) of 0.25 to remove uninformative loci.We ran 10,000 MCMC iterations with a burn-in of 5000 iterations and visually inspected the MCMC chains to ensure mixing and convergence (Figure S1).We assigned localities of A. krugi and limits).This method accommodates codominant markers and markers that are not fixed between taxa, making it appropriate for closely related species (Buerkle, 2005).We first estimated the HI for the entire SNP dataset and then separately for each chromosome to determine whether the HI estimation was consistent across the nuclear genome.Due to the small number of SNPs on individual microchromosomes, we considered all SNPs on microchromosomes as a single group for comparative purposes.
We developed a new approach to test whether the observed pattern of differential introgression events across multiple individuals was likely due to random chance or if instead selection may be promoting the differential introgression of individual SNPs (Figure 3).
This approach was developed because hybrids do not form a typical clinal hybrid zone between parapatrically distributed taxa, where the degree of introgression decreases with the distance from the contact zone.Therefore, our dataset is not suitable for geographic cline analysis that requires a wide range of HI estimates along the hybrid zone.Our procedure tests the null hypothesis that at each SNP, the proportion of alleles from one parental species in the genomes of the hybrids is due to neutral introgression.This assessment is achieved by using permutations of population labels of individuals (hybrids vs. parental); it calculates the proportions of the selected set of parental alleles in permuted hybrids and then compares these proportions with the empirical proportions observed from the selected parental alleles in the hybrid samples.We assigned A. krugi as the parental species to perform our comparisons, and for each hybrid SNP we identified the corresponding parental allele using source alleles (alleles from parental populations) previously determined by HI estimation.
For each hybrid SNP, we calculated the proportion of A. krugi alleles in hybrids to represent our observed statistic.Next, we permuted the population labels of individuals 100,000 times and calculated the proportion of A. krugi alleles in permuted hybrids, to generate a null distribution of permutations' proportions.We then calculated a p-value as the proportion of random permutations that yielded a frequency of introgressed alleles greater than the frequency derived from the observed data.We corrected our data for multiple comparisons using the Bonferroni method (Dunn, 1961) and considered any SNP with a corrected p-value less than 0.05 to be a candidate SNP exhibiting differential introgression.We then calculated the percentage of A. krugi and A. pulchellus alleles that were randomly and differentially introgressed in the genomes of the hybrids.For example, if 367 of 3140 SNPs were differentially introgressed with an upper HI index of 0.161 (16.1% of A. krugi loci in the hybrid genome), then 11.6% (367/3140) of A. krugi alleles were differentially introgressed and 4.5% (16.1%-11.6%)were randomly introgressed.
One drawback of our analysis is that it does not account for the demographic histories of parental and hybrid populations.
Therefore, we assessed whether genetic drift could account for the observed distribution of the introgressed alleles.We did this in two steps.First, we performed neutral demographic simulations to test competing hypotheses for patterns of diversification among A. pulchellus, A. krugi and hybrid populations using three-dimensional (3D) joint frequency spectra (JSFS) of genetic variation using the program moments v.1.1.13(Jouganous et al., 2017).
We tested five models that differ in the patterns of gene flow (asymmetric or symmetric gene flow; Figure S2) between parental and hybrid populations.We built folded JSFS and down-sampled genotypes to minimize missing data and maximize the number of segregating sites using the program easySFS (https:// github.com/ isaac overc ast/ easySFS).Models were optimized and ran following the methods of Leaché et al. (2019), with python scripts for performing model fitting downloaded from github.com/ dport ik/ momen ts_ pipeline (Portik et al., 2017).We performed modelling with folded JSFS because outgroup data were not available.We conducted four rounds of optimizations for each model, using the parameters from the best replicate as starting values for the next round of optimization.Replicate analyses were carried out to ensure that the optimization routine was stable.Before the final analysis, we used the replicate with the highest likelihood for each model to calculate Akaike information criterion (AIC) scores (Burnham & Anderson, 2004), and the model with the lowest AIC was considered the best model.We then converted parameter estimates of theta (θ) and migration (M ij ) for the best model to effective population sizes (N e ) using the equation N e = θ/4 μL, where μ represents the mutation rate and L is the number of loci multiplied by their length.We estimated migration rates between populations (m ij ) with the equation m ij = M ij /2N e and divergence times (t) using the equation t = T/2N e g, where g is generation time.
In the second step, we used coalescent simulations to determine whether the observed allele frequencies could be generated by genetic drift.We used parameters inferred by the best-performing model in the previous step as input into msprime v.1.0.2 (Baumdicker et al., 2022).We simulated 1000,000 loci across two separate runs for each dataset.Simulations were processed with PGDspider v.2.1.1.5(Lischer & Excoffier, 2012); custom scripts are available at https:// github.com/ kfarl eigh/ Anoli sSims .We used processed files from each simulation run as input into gghybrid, calculated the observed proportion of parental alleles in the hybrids and compared the distribution of simulated frequencies to our empirical distribution.We compared distributions with the Kolmogorov-Smirnov test (Massey, 1951) to determine whether the distributions were significantly different.We compared the two simulated distributions to ensure that they yielded consistent results.
A second potential drawback of our approach is that the null distribution generated by permutations can be influenced by the sample sizes of the populations and the amount of backcrossing.
Nevertheless, our sample sizes were approximately equal (A. krugi,25;A. pulchellus,25;Hybrids,23), and because the levels of backcrossing in the hybrids are unknown, we did not include this parameter in our model, which should result in a more conservative test of differential introgression.
We identified loci harbouring SNPs with a signal of differential introgression and linked those loci to known genes and their associated functions based on the A. carolinensis genome annotation (Alföldi et al., 2011).We used the Genome Data Viewer (https:// www.ncbi.nlm.nih.gov/ genome/ gdv/ brows er/ genom e/? id= GCF_ 00009 0745. 1) to determine whether an SNP falls within an annotated gene transcript in A. carolinensis.In instances where an SNP occurred within an annotated gene transcript, we performed GO analyses of genes containing these SNPs using QuickGO (Binns et al., 2009) to infer their possible biological relevance to hybrids (see Appendix S1 for methods used to annotate de novo candidate SNPs exhibiting differential introgression).
We then created two datasets each for the MD and HD datasets by mapping sequence reads onto the genome of A. carolinensis (reference-based; Alföldi et al., 2011) and by clustering reads de novo (i.e., reference-independent; see Section 2.2 for details).After removing individuals with greater than 60% missing data, we were left with 73 individuals in the MD dataset and 10 individuals in the HD dataset (Figure 2, Table S1).The datasets comprised 3140 SNPs and 14,444 SNPs when reads were mapped to the A. carolinensis genome (Alföldi et al., 2011), with an average depth of 7.5 and 88.9 for the MD and HD datasets, respectively (see Appendix S2 for de novo results).

| Analysis of SNP datasets
We analysed the MD dataset in a phylogenetic context using Neighbour-Net, which nested all hybrids within a clade that also included pure A. pulchellus (Figure 4a).This nuclear SNP tree contrasts with mtDNA-based results, which grouped hybrids with A. krugi and with foreign mtDNA (hybrids) are differentiated along the second PCA axis, which explained 13.5% and 13.7% of the variance in the two datasets, respectively (see Appendix S2 for de novo results).
HI estimates for each dataset revealed that hybrid individuals displayed HIs that were distinct from (i.e., outside the range of) pure A. pulchellus (Figure 5b,c, Figure S4a,b), but that showed partial A.
krugi ancestry (Table 1, Table S3; Figure 5b,c, Figure S4a,b).For the MD dataset, these hybrids had average HI values ranging from 0.09 to 0.16.HI values based on the HD dataset for such hybrids were slightly higher, ranging from 0.21 to 0.22 (Table 1, Table S3; see Appendix S2 for de novo results).HI estimates at the chromosome level generally followed a similar pattern, with hybrids falling outside of the parental limits in the HD dataset (Figures S5 and S6).
However, the HI estimates at the chromosome level in the MD dataset do not fall outside the parental limits because HI values derived from individual chromosomes displayed higher variance (indicated by larger confidence intervals; Figure S4).This finding is likely driven by the reduced number of chromosome-specific SNPs in the MD dataset (Figures S4 and S6).
Differential introgression analysis of the MD dataset identified 367 and 591 candidate SNPs with a greater proportion of A. krugi and A. pulchellus alleles in hybrids than expected by chance, respectively (Figure 6; see Appendix S2 for de novo results).Demographic modelling for the MD dataset using moments identified the best model of diversification as the model with asymmetric gene flow between A. pulchellus and hybrid individuals and asymmetric gene flow between A. krugi and hybrids (Table S4).Migration rates into hybrids were estimated to be greater than migration from hybrids to either A. pulchellus or A. krugi (Table S4).Effective population size estimates were similar for A. pulchellus and hybrid populations and were smaller than the estimate for A. krugi (Table S4).Differentially introgressed candidate SNPs with a greater proportion of A. krugi alleles in hybrids than expected by chance can be found throughout the A. carolinensis genome: on autosomes and sex chromosomes, as well as on unplaced scaffolds (Figure S8).We identified 97 reference SNPs that fell within the boundaries of annotated genes with known function, whereas 8 SNPs mapped to annotated genes with no known function.Thirty-six of these outlier SNPs associated with known genes are located in protein-coding regions of genes (i.e., exons); the other 69 are found in non-coding regions of the genes (i.e., introns; Table S5).The remaining outlier SNPs (N = 262) map to non-coding regions.Candidate genes (linked to outlier SNPs) have functional annotations related to cellular metabolism, development, immunity and structural components within the cell (Table 2, Table S5).Three of the candidate genes that displayed a greater proportion of A. krugi alleles in hybrids, notch receptor 1 (notch1), delta-like canonical Notch ligand 1 (dll1) and jagged1 (jag1), are components of the Notch signalling pathway.This pathway is known to directly interact with the mitochondrial proteome by down-regulating key proteins in respiratory complex I and the glutamine catabolic process, which decreases dependence on exogenous glutamine for cell survival (Basak et al., 2014).Another candidate gene, NADH: Ubiquinone Oxidoreductase Subunit B11 (ndufb11), codes for a protein that is an accessory subunit of respiratory complex I (Stroud et al., 2016).
Differentially introgressed candidate SNPs with a greater proportion of A. pulchellus alleles in hybrids than expected by chance are also distributed throughout the A. carolinensis genome.We identified 84 reference SNPs that are within functionally annotated genes, whereas 7 SNPs map to annotated genes with unknown function.Twenty-six of the SNPs associated with known genes are located in protein-coding regions of candidate genes (i.e., exons); the other 59 are found in non-coding regions of genes (i.e., introns; Table S5).The remaining outlier SNPs map to non-coding regions (N = 506).Candidate genes harbouring differentially introgressed A. pulchellus alleles are related to cellular metabolism, development and structural components within the cell (Table S5).Two genes, acylglycerol kinase (agk) and pwwp2b, are known to be associated with thermoregulation (Prentki & Madiraju, 2008) and thermogenesis (Yan et al., 2021), respectively.

| DISCUSS ION
Introgression can be an important source of adaptive genetic variation in nature (Hedrick, 2013).Adaptive introgression can introduce alleles associated with traits that increase individual fitness because they are better suited to the biotic and abiotic factors of a particular environment or associated with "repair" genes that have accumulated deleterious variants due to mutation and genetic drift.In this study, we developed an approach to identify patterns of differential introgression in naturally-occurring hybrids resulting from mosaic sympatry.Our approach leverages recent bioinformatic methods to identify parental source alleles (Bailey, 2020) and then employs a permutation procedure to determine whether the observed proportion of an allele from a particular parental population is greater in Sample numbers correspond to those in Table S1.

| Genomic composition of A. pulchellus hybrids
Genome-wide estimation of the ancestry of genetic markers in hybrid individuals revealed that hybrid genomes are predominately derived from A. pulchellus and that only 10%-20% of hybrid genomes come from A. krugi ancestry (Table 1; Figure 5b,c).This higher degree of A. pulchellus ancestry is consistent with the pulchellus-like phenotype of the hybrids and is qualitatively similar to patterns of uneven hybrid ancestry observed in other taxa (Llopart et al., 2005;Rieseberg et al., 1996).Typically, reduced ancestry from one parental lineage can be attributed to one of three, non-mutually exclusive processes: (i) continuous backcrossing of hybrids with one of the parental lineages (Bachtrog et al., 2006;Rand et al., 2006), (ii) restricted introgression that results in underrepresentation of one of the parental genomes in the hybrids due to intrinsic factors, such as chromosomal incompatibilities or strong epistatic interactions in the nuclear genome (Llopart et al., 2005(Llopart et al., , 2014;;Rieseberg, 2011;Rieseberg et al., 1996) and (iii) subsequent selection acting in favour or against one or more parental alleles in the hybrids, and backcrosses due to extrinsic factors, such as environmental conditions.
While our data do not allow us to explicitly test among these three processes, below we discuss expectations for each process and discuss whether they are consistent with our results.

Anolis pulchellus Anolis krugi Anolis pulchellus with
A.krugi mtDNA backcrossing should also lead to variation across hybrids in both the number and genomic locations of A. krugi alleles.This is because, in the absence of selection, backcrossing should result in stochastic inheritance of parental alleles and because different individuals may be the product of different numbers of backcrossing events with unique outcomes and distinct recombination events.Our results suggest that backcrossing alone does adequately explain the reduced representation of A. krugi alleles in the hybrids.First, all hybrids exhibited a similar proportion (10%-20%) of introgressed A. krugi loci.Second, all our analyses suggest the presence of differentially introgressed alleles in the genomes of the hybrids.In particular, the coalescent simulations suggested that the empirical frequencies of differentially introgressed alleles differed significantly from the simulated frequencies, as many A. pulchellus and A. krugi alleles are present at higher frequencies in hybrids than what would be expected at random.Indeed, only ca. 5% of A. krugi loci exhibited a signal of random introgression in the hybrid genomes.Although this percentage was higher in the de novo assembly (57% of de novo SNPs showed a signal of random introgression; Appendix S2), our findings indicate that a large proportion of the A. krugi alleles in the hybrids exhibit differential introgression.We believe that the discordance between inferences from the reference genome and de novo inferences may be due to the loci that exhibit random introgression representing neutral loci, which generally accumulate substitutions faster than non-neutral loci and would therefore be less likely to map to the reference genome of a different species.
The higher number of randomly introgressed loci in the de novo dataset can also be caused by uncorrected LD, given that the location of loci in the genome is unknown.
Restricted introgression may also lead to a reduced representation of the genome of one of the parental species in the hybrids.
Restricted introgression can occur due to intrinsic factors that limit recombination, such as the presence of chromosomal divergence, non-linear linkage blocks (which prevent recombination), deleterious interspecific interactions between loci (Aeschbacher et al., 2017;Llopart et al., 2005Llopart et al., , 2014;;Martin et al., 2019;Rieseberg, 2011;Schumer et al., 2018) or broad selection against minor-parent ancestry (in a genome background dominated by another parental species) due to disruptions in epistatic interactions among coevolved loci from the major parental lineage.Restricted introgression is typically observed in deeply divergent species pairs, as greater divergence times increase the probability of chromosomal incompatibilities arising between hybridizing species (Ayala et al., 1974; Harrison TA B L E 1 Average hybrid index (HINDEX) estimates, 95% confidence interval (in parentheses) for Anolis pulchellus, hybrids (A.pulchellus with A. krugi mtDNA), and A. krugi, the parental limit for each source population (A.pulchellus and A. krugi) for the medium-and high-density SNP reference datasets, and the number of SNPs included in each analysis.

Parental limits (A. pulchellus, A. krugi) SNPs
Medium-density SNP reference 0.001 (0-0.016)0.122 (0.092-0.161) 0.999 (0.997-1) 0.0145, 0.997 2188 High-density SNP reference 0 (0-0.002)0.219 (0.211-0.226) 0.999 (0.998-1) 0.003, 0.998 6606 Note: HINDEX represents an estimate of the proportion of alleles that were inherited from one of the two parental species, and in this study, HINDEX is an estimate of the proportion of alleles that were inherited from A. krugi.Parental limits are used to determine whether an individual exhibits signals of hybridization.An individual that falls outside of the parental limits can be considered a hybrid.

F I G U R E 6
The proportion of Anolis krugi (red) and A. pulchellus (blue) alleles in hybrids at the 367 candidate SNPs exhibiting a signal of differential introgression in the medium-density SNP reference dataset.
TA B E 2 Gene ontology (GO) information for the 11 candidate, differentially introgressed SNPs discussed in Sections 3.2 and 4.2.Note: Candidates were found to be located within a gene transcript in the genome of Anolis carolinensis (see Section 2.3 for detailed methods; Alföldi et al., 2011).For the nine reference candidates, columns represent the SNP location on the A. carolinensis assembly, the gene name, the genomic element where the SNP is located (i.e., intron, exon) and the GO categories for each gene, which can provide information about their biological relevance (i.e., cellular component (C), molecular function (F), biological process (P)).Reference SNPs are named based on the chromosome or unplaced scaffold where they are located (characters before the underscore) and their position on that chromosome or unplaced scaffold (characters after the underscore).For the two de novo candidates, columns represent the dDocent contig that candidates are located on, the gene name, the expected value (E-value) from BLAST (Altschul et al., 1990) and the GO categories for each gene (see Appendix S1 for methods used to annotate de novo candidate SNPs).The GO information for all candidate SNPs identified as exhibiting differential introgression appears in Table S5.

Genomic
(cblb) is involved in immune Gamma-aminobutyric acid receptor subunit alpha-5 (gabra5) is involved in the development of the nervous tissue; and acylglycerol kinase (agk) is involved in thermoregulation (Table 2, Table S5).
The contrasting thermal preferences of A. pulchellus and A. krugi are likely to impose distinct thermal constraints on these anoles.As previously stated, A. krugi prefers partially shaded, cooler habitats, whereas A. pulchellus prefers open, sunnier, warmer habitats.In anoles and other lizards, species that inhabit partially shaded conditions tend to be active at cooler temperatures, which likely influences their performance curves (Gunderson & Leal, 2016).Anolis pulchellus hybrids may therefore exhibit a performance curve with a "broader shoulder" and reach optimal performance at lower temperatures.If so, this thermal profile may allow hybrids to outperform pure A. pulchellus at cooler temperatures.A broad shoulder performance curve can also increase the daily activity times of hybrids by allowing them to be active in early morning and/or late afternoon periods during which ambient temperatures tend to be cooler.For ectotherms, total activity time is thought to be critical for population persistence (Gunderson & Leal, 2016;Huey et al., 2010).

Candidate genes associated with differentially introgressed loci in
A. pulchellus have been previously implicated in limb development, in particular the sonic hedgehog gene (shh; Tickle & Towers, 2017) and genes associated with Notch signalling (Mašek & Andersson, 2017).
Limb length is an adaptive trait in anoles (Donihue et al., 2018;Losos et al., 1997), and a recent study of A. sagrei identified signals of positive selection in a genomic region that contains several candidate genes that are involved in limb development (Bock et al., 2021).
Both the forelimbs and hind limbs of A. krugi are longer than those of A. pulchellus (Losos, 1990), and in anoles, longer limbs provide enhanced maximal sprint speed, which permits rapid locomotion to capture prey and escape predators (Losos & Irschick, 1996), a trait that may be selectively advantageous.We currently lack morphological data on A. pulchellus hybrids, and it will be informative to determine whether limb length differs between pure A. pulchellus and hybrid individuals.Similarly, additional work is needed to understand how the Notch pathway impacts limb development in anoles.
Together, our findings suggest that the observed signatures of differential introgression in A. pulchellus hybrids may be caused by selection on physiological and morphological traits.Still, we recognize that genes we identified to exhibit differential introgression may not be under selection themselves but instead be linked to genes under selection that were not represented in our functional inferences.

| Mitochondrial-nuclear interactions
Mitochondrial-nuclear (mito-nuclear) interactions are fundamental to most eukaryotes, and many biological processes, such as energy production, cellular respiration, cellular development and regulation, hinge on these coordinated actions between genes of the nuclear and mitochondrial genomes (Currat et al., 2008;Hedrick, 2013;Toews & Brelsford, 2012;Wolff et al., 2014).Mito-nuclear genomes of non-avian reptiles encode 13 protein-coding genes, but the function of mitochondria can depend on >1000 different proteins encoded by mitochondrially targeted genes in the nucleus (Sloan et al., 2017).Anolis krugi's mitochondrial introgression into A. pulchellus was previously attributed to selection on the mitochondrial genome (Jezkova et al., 2013a).Adaptive introgression of foreign mtDNA may occur in species or populations that have suffered from the accumulation of deleterious mitochondrial mutations (Llopart et al., 2014;Sloan et al., 2017), although adaptive introgression has also been linked to thermal tolerance and local adaptation to different climatic conditions (Sloan et al., 2017;Toews & Brelsford, 2012).
Recent studies have shown that mitochondrial introgression may be accompanied by preferential introgression of regions that are enriched for nuclear-encoded mitochondrial targeted genes (Beck et al., 2015;Morales et al., 2017), and mito-nuclear co-introgression has been indirectly inferred to affect basal metabolic rate (Boratyński et al., 2016).Our results contribute to this growing set of examples and indicate that differential introgression at some nuclear loci in A. pulchellus hybrids may be driven by selection to maintain functional mitochondrial-nuclear interactions with A. krugi's mtDNA genome.The Notch signalling pathway, which is associated with some of the differentially introgressed SNPs in hybrid A. pulchellus, is an evolutionary conserved pathway (Shi & Wang, 2017) that is known to affect the mitochondrial proteome and is crucial for cellular development (Bray, 2006) and metabolism (Basak et al., 2014).
The three genes involved in Notch signalling that are differentially introgressed in A. pulchellus hybrids encode proteins that represent the major components of this pathway (receptor; notch1 and ligands; dll1, jag1: Shi & Wang, 2017).This finding suggests that Notch signalling may be important in our study system and that differential introgression may be driven by the need to preserve or optimize this pathway (Hill et al., 2019;Wolff et al., 2014).Specifically, the activation of notch1 alters 10 unique proteins by down-regulating protein expression in respiratory complex I and the glutamine catabolic process; 8 of those proteins belong to mitochondria-localized metabolic pathways, including oxidative phosphorylation, glutamine metabolism, the Krebs cycle and fatty acid oxidation (Basak et al., 2014).
Metabolism is critical for cellular function and generally increases with temperature, as higher temperatures require more adenosine triphosphate production, thus increasing metabolism (Clarke & Fraser, 2004).This may be particularly important in our system, as ectothermic vertebrates, such as Anolis lizards, display a strong reliance on environmental thermoregulation (Martin et al., 2020).The Notch signalling pathway also affects the regeneration and metabolism of nicotinamide adenine dinucleotide + hydrogen (NADH) and nicotinamide adenine dinucleotide phosphate (NADPH), and respiratory complex I in general, the first and largest component of the mitochondrial oxidative phosphorylation system (Brandt, 2006).This too may be important in our system, as respiratory complex I plays a central role in energy metabolism and overall respiration (Sharma et al., 2009).Therefore, alterations to the function or composition of respiratory complex I may affect metabolism and cellular respiration, which in turn may influence the fitness of A. pulchellus hybrids.Our Approximate sampling localities of individuals used to generate the medium-density (N = 73) and high-density (N = 10) SNP datasets used in this study: Anolis pulchellus (A.pulchellus with native mtDNA, dark blue circles, N = 25), hybrids (A.pulchellus with A. krugi mtDNA, light blue circles, N = 23) and A. krugi (red circles, N = 25).Circle size is proportional to sample size.The localities for which a single sample was used for the high-density SNP dataset (N = 10) are indicated with an asterisk (*) next to the locality code.The locality codes of A. pulchellus are numerical, whereas the locality codes of A. krugi represent municipality abbreviations, following Jezkova et al. (2013a) and Rodríguez-Robles et al. (2010).The list of all samples and sampling localities is listed in TableS1.(b) Approximate sampling localities with known mtDNA assignments for A. pulchellus (dark blue circles), A. krugi (red circles) and hybrids (light blue circles) identified inJezkova et al. (2013a).
with A. krugi mtDNA

A
. puchellus with native mtDNA as parental populations.Parental limits are calculated using the 95% confidence intervals of the parental populations and are defined as the maximum value for the S0 population (A.pulchellus) and the minimum value for the S1 population (A.krugi).These limits are used to determine whether hybrid individuals are more similar to one source population (i.e., whether the hybrids fall within the parental limits) or whether they are a mix of the two parental populations (i.e., hybrids fall outside the parental F I G U R E 3 Visualization of the procedure used in HybridFindR to identify SNPs exhibiting signals of differential introgression in Anolis pulchellus hybrids (A.pulchellus with A. krugi mtDNA).First, we used gghybrid(Bailey, 2020) to identify the A. krugi allele for each SNP.We then determined the proportion of A. krugi alleles in the hybrids to calculate the observed statistic.Next, we permuted the population tags and estimated the proportion of A. krugi alleles in permuted hybrids to calculate the permuted statistic.Permutations were repeated 100,000 times to generate a null distribution.Finally, we compared the observed statistic to the null distribution, calculated a p-value, and applied the Bonferroni correction to identify SNPs that may exhibit signals of differential introgression.

Figure 4b ;
Jezkova et al., 2013a).The PCA analysis of the MD and HD datasets yielded a signal of two population clusters along the first axis (corresponding to A. pulchellus and A. krugi; Figure 5a, Figure S3); these clusters explained 40.1% of the variance in the MD dataset and 49.7% of the variance in the HD dataset.Anolis pulchellus with native mtDNA (pure individuals) Figure S7).

F
I G U R E 4 (a) Neighbour-Net trees depicting the genetic relationships among Anolis pulchellus (A.pulchellus with native mtDNA, dark blue), A. krugi (red) and hybrids (A.pulchellus with A. krugi mtDNA, light blue).The trees were inferred from the de novo clustering of (a) the medium-density SNP dataset including all individuals (N = 73) and (b) the mtDNA dataset (N = 73; Jezkova et al., 2013a).
to neutral introgression alone.This approach may be particularly useful in systems where the geography of introgression does not resemble a traditional clinal hybrid zone or where multiple independently localized instances of hybridization are detected.Using our new locus permutation approach and coalescent simulations, we searched for SNPs exhibiting a signal of differential introgression, presumably due to selection, on the allelic content of hybrid genomes and linked these SNPs to known genes and their corresponding molecular functions.These candidate genes can thus be studied in detail in future experimental or comparative studies.Our approach enabled us to explore how differential introgression may have shaped the genomic ancestry and architecture of hybrid A. pulchellus, individuals that phenotypically resemble pure A. pulchellus but possess the mtDNA of A. krugi.This instance of natural hybridization and mitochondrial introgression is particularly interesting because i) documented cases of hybridization between sympatric anoles are remarkably rare in nature(Losos, 2009), ii) mitochondrial introgression is unidirectional, suggesting hybridization is dominated by matings between A. pulchellus males and A. krugi females and iii) mitochondrial introgression is geographically widespread, resulting in what appears to be a complete replacement of the native A. pulchellus mtDNA by that of A. krugi throughout a large area in western Puerto Rico(Jezkova et al., 2013a).
Scatterplot of the first two axes of the principal component analysis (PCA) for Anolis pulchellus (A.pulchellus with native mtDNA, dark blue circles), A. krugi (red circles) and hybrids (A.pulchellus with A. krugi mtDNA, light blue circles).PC1 and PC2 were derived from the de novo clustering of the mediumdensity SNP dataset (N = 73).Inset: Results of a PCA analysis of de novo clustering of the high-density SNP dataset (N = 10).The percentage of variance explained by each axis is indicated in parentheses.(b, c) Hybrid index graphs with A. pulchellus (blue circles) and A. krugi (red circles) as parental references for the (b) medium-density and (c) high-density SNP datasets for hybrid A. pulchellus (A.pulchellus with A. krugi mtDNA, grey circles).Lines extending from each point are the 95% confidence interval for each individual.Dotted blue and red lines indicate the parental limit for each reference population.
binding, C: adherens junction, C: apical plasma membrane, P: positive regulation of Notch signalling pathway, P:glomerular visceral epithelial cell development Reference chr1_223343339 Delta-like canonical notch ligand 1 (dll1) Exon F: calcium ion binding; F: notch binding; P: astroctye development; P: cell differentiation Reference chr11_40594200 Spectrin beta, non-erythrocytic 5 (sptbn5) Intron C: cytoskeleton, C: cytoplasm, P: actin filament capping, F: actin binding, respiratory chain complex I assembly, C: mitochondrial respiratory chain complex I, C: mitochondrial respiratory chain complex I, C: membrane, C: integral component of membrane Reference chr3_170687954 Cbl proto-oncogene B (cblb) Intron P: immune response, C: membrane raft, P: negative regulation of epidermal growth factor-activated receptor activity, P: signal transduction, Fion binding; F: chromatin DNA binding; P:animal organ regeneration; -A receptor activity; P: associative learning; P: brain development; P: nervous system process