Whole‐genome analysis of multiple wood ant population pairs supports similar speciation histories, but different degrees of gene flow, across their European ranges

Abstract The application of demographic history modelling and inference to the study of divergence between species has become a cornerstone of speciation genomics. Speciation histories are usually reconstructed by analysing single populations from each species, assuming that the inferred population history represents the actual speciation history. However, this assumption may not be met when species diverge with gene flow, for example, when secondary contact may be confined to specific geographic regions. Here, we tested whether divergence histories inferred from heterospecific populations may vary depending on their geographic locations, using the two wood ant species Formica polyctena and F. aquilonia. We performed whole‐genome resequencing of 20 individuals sampled in multiple locations across the European ranges of both species. Then, we reconstructed the histories of distinct heterospecific population pairs using a coalescent‐based approach. Our analyses always supported a scenario of divergence with gene flow, suggesting that divergence started in the Pleistocene (c. 500 kya) and occurred with continuous asymmetrical gene flow from F. aquilonia to F. polyctena until a recent time, when migration became negligible (2–19 kya). However, we found support for contemporary gene flow in a sympatric pair from Finland, where the species hybridise, but no signature of recent bidirectional gene flow elsewhere. Overall, our results suggest that divergence histories reconstructed from a few individuals may be applicable at the species level. Nonetheless, the geographical context of populations chosen to represent their species should be taken into account, as it may affect estimates of migration rates between species when gene flow is spatially heterogeneous.


| INTRODUC TI ON
Reconstructing divergence histories using genetic data has become gold standard in speciation genomics (Ravinet et al., 2017), which has been eased by the development of sequencing technologies and inference tools (Beichman et al., 2018). Classically, the speciation history between two related species is inferred using samples from a single pair of populations, one from each species, with the purpose of estimating key evolutionary parameters such as divergence times, migration rates, and effective population sizes (e.g., Nadachowska-Brzyska et al., 2013;Sutra et al., 2019;Yagi et al., 2019). Such an approach implicitly assumes that the divergence history between the two sampled populations is representative of the divergence history of the species as a whole, that is, across their whole ranges. This assumption is expected to hold if species diverge in allopatry without gene flow. However, outside of studies on parallel evolution, where multiple population pairs are routinely analysed (see, e.g., Flanagan et al., 2021;Rougeux et al., 2017;van Belleghem et al., 2018), this assumption is rarely explicitly tested; and when speciation occurs with gene flow it is unclear to what extent parameter estimates may fluctuate across the ranges of both species.
Gene flow between two diverging lineages can vary through time (Sousa & Hey, 2013) and space. For instance, secondary contact between two species after a range expansion is likely to only affect populations in a specific geographic region (e.g., Green et al., 2010).
In such cases, reconstructing the divergence history between the two species would depend on the geographic location of the set of populations sampled, because populations also evolve in space (see Bradburd & Ralph, 2019 for a recent review on spatial population genetics). While some studies have previously reconstructed the speciation history between several species using multiple population pairs (e.g., Chueca et al., 2021;Filatov et al., 2016;Garcia-Erill et al., 2021;Ito et al., 2020;Pabijan et al., 2017;Rougemont & Bernatchez, 2018;Stankowski et al., 2020;Zieliński et al., 2016), to our knowledge variation inferred between the multiple comparisons has not been reported. In this study, we investigate how the divergence history between two species may vary depending on the geographic location of the pair of heterospecific populations considered, using mound-building red wood ants.
Red wood ants of the Formica rufa species group (Hymenoptera, Formicidae) play important ecosystemic roles in boreal forests (Frouz et al., 2016;Stockan et al., 2016) and are a good system to study variability in the divergence history across species geographical ranges. This is because the Palaearctic F. rufa species group encompasses up to 13 species (Seifert, 2021), many with different distribution areas that probably experienced gene flow or secondary contact in different regions. Phylogenetic studies using mitochondrial markers suggest that speciation within this group took place in the Pleistocene, in the last 500,000 years (Goropashnaya et al., , 2012. While their speciation history is unknown, they may have diverged in different forest refugia during Pleistocene glaciations . Among the F. rufa species group, F. polyctena and F. aquilonia are two nonsister species with contrasting distributions.
Within the European part of their Palaearctic ranges, F. aquilonia occurs in Fennoscandia, Russia and mountain areas in Central Europe (i.e., Alps), whereas F. polyctena occurs in Central Europe and in Southern parts of Fennoscandia (Seifert, 2018). Both species overlap in Switzerland (where they occupy different altitudinal niches) and around the Baltic Sea, and natural hybrids have been characterized in Southern Finland (Beresford et al., 2017;Kulmuni et al., 2010Kulmuni et al., , 2020.
Here, we reconstructed the speciation history between F. polyctena and F. aquilonia using whole-genome data by contrasting mul-  Cherix et al., 2012) and sympatry in Finland (where hybridization has been characterized; Beresford et al., 2017). Our results suggest that divergence started in the Pleistocene and that it occurred with continuous asymmetric gene flow from F. aquilonia to F. polyctena. Interestingly, all sample comparisons supported divergence with gene flow, with comparable divergence times ranging from 523,900 to 561,745 years ago, in agreement with previous findings. Nevertheless, we found support for bidirectional gene flow at recent times only in Finland, where it could be mediated by hybrids (even if our Finnish samples were collected away from any known flow elsewhere. Overall, our results suggest that divergence histories reconstructed from a few individuals may be applicable at the species level. Nonetheless, the geographical context of populations chosen to represent their species should be taken into account, as it may affect estimates of migration rates between species when gene flow is spatially heterogeneous.

K E Y W O R D S
demographic inference, divergence with gene flow, Formica red wood ants, secondary contact, site frequency spectrum, sympatry hybrid population). By using multiple sampling locations distributed across both species ranges, we were able to draw a consistent picture of the speciation history (i.e., divergence times, ancestral effective population sizes and ancestral migration rates) while uncovering variability in migration rates, which could be explained by local opportunities for gene flow.

| Study system
Formica polyctena and F. aquilonia are two Palaearctic ant species of the Formica genus that inhabit coniferous and mixed forests. They are haplodiploid (females are diploid and males are haploid) and arrhenotokous (mothers produce male offspring from unfertilized eggs; De La Filia et al., 2015). They are social insects and, as such, labour is divided between reproductive queens and workers. As both of these species are polygynous, each nest may have hundreds of egg-laying queens of different ages. The species are also supercolonial, meaning that local populations comprise large colonies of many cooperating nests (hereafter, population is used as a synonym for supercolony). Polygyny results in low relatedness among individuals sampled within the same nest and/or supercolony (e.g., Sundström et al., 2005). New queens are produced in the Spring and they may shed their wings without a nuptial flight. As such, matings can happen with males from their own population. Finally, if long-range dispersal occurs, it is probably male-biased (Maeder et al., 2016).

| Sampling
Our main aim was to understand to what extent samples from different geographical locations yield similar speciation histories and parameter estimates. To address this, females (workers) of each species were sampled from several locations across Europe ( Figure 1). For F. polyctena, sampling was carried out in two locations in Switzerland (East and West) and in the Åland islands (Finland). We targeted the Åland islands, specifically, in an attempt to sample nonadmixed F. polyctena, since F. aquilonia has not been reported there (Sorvari, 2021). For each sampling site, three diploid workers were sampled from different populations, and/or different nests within the same population, whenever possible (Table 1). In addition, one more worker of each species was collected in Southern Finland, where hybridization between both species has been previously documented (Beresford et al., 2017). For F. aquilonia, sampling was carried out in Switzerland (East), Scotland and Finland (both Central and Southern Finland). Overall, 10 workers were sampled for each species (20 individuals in total, Table 1).

| Morphological identification
Prior to sequencing, a subset of samples (at least one nest per geographical location, Table 1) were morphologically identified at the species level using numeric morphology-based alpha-taxonomy (NUMOBAT). This method produces species assignment probabilities at the nest level and is based on 16 morphological characters, which were measured for six individual ants per nest (CS, CL/ CW, SL/CS, nCH, OccHL, nGU, GUHL, nPN, mPnHL, nMes, nMet, MetHL, nPr, EyeHL, nSc and SL/Smax;Seifert, 2018Seifert, , 2021; see Table   S1 for a brief description of these morphological characters).

| DNA extraction and sequencing
For all samples, DNA was extracted from whole-bodies with a SDS (sodium dodecyl sulphate) protocol. DNA libraries were constructed using NEBNext DNA Library Prep Kits (New England Biolabs).
Only biallelic SNPs with quality equal or higher than 30 were kept. In order to refrain from removing entire sites when only a subset of individuals had inadequate genotype calls, individual genotypes with genotype qualities lower than 30 were coded as missing data. Genotypes with depth of coverage lower than eight were also coded as missing data, after which sites with missing data across more than half of our samples were removed.
To remove genotyping errors that cause sites to show excessive heterozygosity (e.g., due to unresolved paralogues or alignment errors), we first pooled all samples together, purposefully creating a Wahlund effect. After this, we applied a filter based on Hardy-Weinberg equilibrium and excluded sites displaying heterozygote excess (p < .01; see, e.g., Pfeifer et al., 2018).
We applied a filter based on sequencing depth by setting individual-specific thresholds: sites were only kept if their coverage was between half and twice the mean individual value. Finally, we removed sites located on the third chromosome, also known as the social chromosome. This chromosome harbours genes responsible for polymorphism in social organization in Formica species, controlling if a colony is headed by one (monogynous) or multiple (polygynous) queens (Brelsford et al., 2020). Recombination is rare between monogynous and polygynous alleles of this chromosome (supergene, Brelsford et al., 2020), leading to the maintenance of ancestral polymorphisms across Formica species, which could bias our demographic inference.

| Population structure
Population structure was studied by means of two individualbased methods, principal component analysis (PCA) and sNMF clustering analysis (Frichot et al., 2014), the latter of which estimates individual ancestry coefficients. These analyses were performed in r (v3.6.3; R Core Team, 2017) using the lea package (v3.0.0; Frichot & François, 2015) with a thinned data set, keeping every 200th SNP (resulting in an average distance of 28 kb between consecutive SNPs, 7693 SNPs overall). For the sNMF analysis, 20 independent runs were performed for a given number of ancestral clusters K, with K ranging from 1 to 8 (i.e., 20 runs were performed for each K value). The K value with lowest crossentropy obtained by sNMF was considered as the best K value, and the run with the lowest cross-entropy for the best K value was considered as the best run.
Observed and expected heterozygosity, inbreeding coefficients (F IS ) and pairwise fixation indices (F ST ; Weir & Cockerham, 1984) were calculated using custom scripts. For F IS , confidence intervals were estimated using 10,000 bootstraps over loci. Pairwise F ST values were also computed between populations using the snprelate package (v1.20.1; Zheng et al., 2012).

| Demographic modelling
To document the divergence history across the species ranges, we compared alternative demographic models using demographic parameters inferred from the site frequency spectrum (SFS, see "SFS  The sympatric population pair was formed by Finnish populations of both species. Each model was run 100 times, with 80 iterations per run for likelihood maximization, and 200,000 coalescent simulations per iteration to approximate the expected SFS. The mutation rate was assumed as 3.5 × 10 −9 per bp per haploid genome per generation, which is approximated from estimates available for social insects (Liu et al., 2017).
In the Formica genus, young queens can start laying eggs in their first years of life and have been estimated to live up to five years for F. polyctena (Horstmann, 1983), with queens of different ages within a single nest (i.e., overlapping generations). As such, we assumed a generation time of 2.5 years.
After obtaining point parameter estimates and likelihoods for all models and sample comparisons (see below), the parameter estimates inferred by the model with the highest composite likelihood were considered as the best history for each pair of samples.

| Speciation history between F. polyctena and F. aquilonia
To ascertain whether the speciation history between F. aquilonia and F. polyctena is different across both species ranges, we considered four overall divergence scenarios: allopatry, sympatry, isolation after migration, and migration after isolation ( Figure 2). All  For the last two models, the resize was assumed to coincide with a change in the migration rates ( Since our data set included linked sites, fastsimcoal2 estimates are based on composite likelihoods, preventing application of AIC for model choice.

| Introgression from unsampled ("ghost") populations
In order to rule out the possibility that the signal of gene flow we detect between F. aquilonia and F. polyctena is actually caused by gene flow from an unsampled (so-called "ghost") species into either of the focal species, we modelled two different "ghost" scenarios, which are based on species relationships within the F. rufa species group as described in Goropashnaya et al. (2012). The first scenario models ghost introgression from F. rufa, which is a sis-

| SFS characteristics
To perform the demographic analyses detailed above, we built SFSs using data from two populations (2D-SFS) using custom R scripts the individuals selected to be resampled in each window were the ones with higher amounts of data in that specific window (in each window, the individuals kept are those with less missing data at the sites included in that window). Overall, each comparison was carried with data from at least two distinct individuals per geographic location, a number of individuals not expected to impact model selection (Fraïsse et al., 2018).
All SFSs included the number of monomorphic sites, which, in conjunction with a mutation rate, allows scaling parameter estimates inferred by the models (e.g., to obtain divergence times in number of generations). We estimated these numbers using the proportion of polymorphic sites in relation to the total number of callable sites of individuals in a specific data set. The total number of callable sites was obtained for each individual using mosdepth (v0.2.9; Pedersen & Quinlan, 2018) considering individual depth of coverage thresholds defined earlier for SNP calling.

| Confidence intervals
We used a nonparametric block bootstrapping approach to obtain confidence intervals for parameters of the "Sympatry" model with asymmetric migration (

| Impact of linked selection
Genome-wide heterogeneity in both migration rates and/or effective population size due to the effect of linked selection can impact demographic history inference (e.g., Tine et al., 2014). Since fast-simcoal2 cannot take this heterogeneity into account, we tried to minimise the extent of linked selection in our SNP data set by excluding sites within genes or 10 kbp of the nearest coding region and rerunning all analyses with this reduced data set. While we acknowledge that 10 kbp is a rather short distance (despite high recombination rates usually observed in social Hymenoptera, e.g., Sirviö et al., 2011), increasing this distance to 20 kbp would exclude too many SNPs (88% of the sites excluded overall, vs. 75% at 10 kb) for robust inference .

| Sequencing and SNP calling
Illumina sequencing yielded on average 7.12 Gb of raw data per sam-

| Species identification
A subset of samples used for genomic analyses were also used in morphological species identification. As such, they were included in all analyses with the prior that they may be admixed themselves (i.e., expecting support for models including gene flow either from F. aquilonia or an unsampled species, see below).

| Summary statistics and genetic structure
The principal component analysis (PCA) performed on the thinned data set clearly separated both species along the first principal component (PC, Figure 1), which explained ~29% of variation and was statistically significant (p < .01; Figure S1). Finnish individuals of F. polyctena were plotted closer to F. aquilonia individuals, when compared to other non-Finnish F. polyctena individuals.
The sNMF analysis considered one to eight possible ancestral clusters (K). Cross-entropy analysis revealed that the best K value for our data was two ( Interspecific differentiation ranged from 0.256 to 0.497, the lowest F ST being observed in Finland (see Figure S4).
Mean expected heterozygosity (H e ) per sampling location ranged from 0.120 to 0.185 (  For all these comparisons, divergence in sympatry, with negligible gene flow at current times, was found to have the highest composite expected likelihood ( Figure 3 and Table 4; Tables S3-S5 for parameter estimates obtained with all models). These results held even when excluding sites in coding regions and within 10 kb of coding regions (Tables S7-S10). The direction of gene flow, migration rates, divergence times and ancestral population sizes were all consistent between comparisons involving different pairs of locations ( Figure 3a,b,c, Table 4). The most parsimonious explanation for this consistency is that different pairs of sampling locations share a similar divergence history. Hence, taken together, we consider that our results support a scenario of divergence in sympatry. The expected data sets approximated by the "Sympatry" model with asymmetric migration (Figure 2c) fit the observed data reasonably well ( Figures   S5 and S6), suggesting that it captures key demographic events shared by all sampling locations. For all comparisons, the models with the second highest likelihoods are also scenarios of divergence with gene flow, either before or after a period of isolation (Tables S2-S4). The overall trends in parameter estimates (e.g., ancestral population effective sizes, migration rates; see below) are maintained,

Formica aquilonia
Switzerland but variation in parameter estimates between population pairs is far greater in models with second highest likelihoods compared to the model with the highest likelihood. Importantly, the likelihood of the"Allopatry" model with ten parameters was consistently lower than that of the best model, indicating that model complexity did not drive our results (Tables S3-S6). We also note that the allopatry models without gene flow can be seen as nested within models with gene flow. Given that the estimated scaled migration rates (2Nm) are much higher than zero (e.g., ancestral 2Nm between 0.50 and 1.25 across comparisons, despite a search range between 10 −10 and 20, see Tables S2-S6), provides further evidence for gene flow.
Our analyses of several sampling location pairs yielded consistent results on four aspects of the divergence history between F. aquilonia and F. polyctena (see Table 4  All times are given in number of generations and represented proportionally to each other across panels, as the time of divergence in (a) was taken as reference. All effective sizes are given in number of haploids. Sizes at a given time period (i.e., before or after the size change) are represented proportionally to each other across panels, with the F. polyctena sizes in (a) serving as reference, that is, all recent, post size-change N e estimates are proportional to each other but not to ancestral, presizechange N e estimates, while all ancestral N e estimates are proportional to each other but not to recent estimates). Arrows indicate the number of migrants per generation, their size is representative of this value. The direction and colour of the arrows are indicative of the direction of the gene flow. Note that, while recent migration rates after the size change are not represented in (a) and (b), these migration rates are different from 0. Scaled migration rates (2Nm) are 0.01 from F. aquilonia to F. polyctena and 7.29 × 10 −5 from F. polyctena into F. aquilonia for (a), and 4.04 × 10 −5 from F. aquilonia into F. polyctena and 2.88 × 10 -6 from F. polyctena into F. aquilonia for (b) (see Tables S3-S6). Confidence intervals are displayed in Table 4 F. polyctena at a rate averaging 0.5 migrants per generation (2Nm ,   Table 4). This signature of asymmetrical gene flow was found in all the models that considered gene flow during divergence (Tables S3-S6).

| Past gene flow between F. polyctena and F. aquilonia cannot be explained by migration from an unsampled sister species
As several species from the F. rufa species group are known to hybridize (Seifert, 2021), we tested whether the patterns of gene flow inferred between F. polyctena and F. aquilonia could be caused by migration from an unsampled, "ghost" species. The likelihood of these models with "ghost" introgression was lower than all other "nonghost" models, suggesting that gene flow between F. polyctena and F. aquilonia during divergence was the most parsimonious scenario for the samples considered in our study (see Tables S3-S6 for the parameter estimates obtained with the four "ghost" models for all population pairs).

| DISCUSS ION
Divergence history between two related species is often inferred using samples from a single population pair, assuming these samples capture the divergence history throughout the species' ranges. Yet, this assumption is rarely tested. In this study, we test it by using whole-genome data from samples collected across the European distributions of Formica polyctena and F. aquilonia to reconstruct their speciation history. Using a moderate number of individuals from each geographic sampling location, we were able to infer consistent speciation histories across distinct sample pairs, both in terms of mode of divergence and parameter estimates (e.g., divergence times, ancestral population sizes). We consider that this consistency supports our general conclusion that F. aquilonia and F. polyctena diverged with gene flow. Our approach also underlined how present-day species distributions affect the inference of the divergence history. Interestingly, and in line with expectations, we detected reduced gene flow at recent times between allopatric sampling locations while gene flow increased between sympatric Finnish samples in recent times. Finally, our modelling results ruled out the alternative scenarios where gene flow inferred between F. polyctena and F. aquilonia would actually be caused by gene flow from a third, unsampled species. Below, we discuss the implications of our findings regarding divergence in the F. rufa species group, and, more generally, the insights gained from contrasting two species across their ranges in terms of speciation research.

| Samples from multiple locations yield a similar divergence history between the wood ants F. polyctena and F. aquilonia
Analyses carried out with several heterospecific samples across the species ranges supported the same scenario: F. polyctena and F. aquilonia diverged with asymmetric gene flow. Formerly, the possibility of divergence in allopatry in different glacial refugia had been discussed for F. rufa species group ants .
However, it is not surprising that we found evidence for gene flow between F. polyctena and F. aquilonia, as several Formica species retain the ability to interbreed and produce viable offspring Purcell et al., 2016;Seifert & Goropashnaya, 2004) and interspecific gene flow has been previously described in many ant species (Feldhaar et al., 2008;Seifert, 2009Seifert, , 2019Steiner et al., 2011). Yet, one of our most unexpected results is that gene flow during divergence is consistently inferred to be asymmetrical, with gene flow from F. aquilonia to F. polyctena. This would be observed if prezygotic isolation mechanisms were stronger in F. aquilonia than in F. polyctena or if F. aquilonia males were more likely to disperse than F. polyctena males. This signal of unidirectional gene flow could also be linked to a difficulty for F. polyctena individuals to find conspecific mates, which would be expected if this species has a smaller population size than F. aquilonia, as is consistently inferred by our analysis (see below).
While the analyses of multiple heterospecific sample pairs supported the same divergence scenario, they also yielded similar and F. aquilonia) around 490,000 years ago (Goropashnaya et al., , 2012. In all heterospecific population pairs considered, the effective size of the ancestral population of both F. polyctena and F. aquilonia was inferred between 460,000 and 500,000 haploid individuals. After the divergence of these species, F. aquilonia was always inferred to have a larger ancestral N e than F. polyctena. Consistent N e estimates were obtained across models, indicating that samples of each species from different locations probably shared the same ancestral population. Due to the supercoloniality of both F. polyctena and F. aquilonia, we suggest that these species could follow the dynamics of a metapopulation (i.e., different supercolonies would be demes within a metapopulation). This could inflate our effective population size estimates, as coalescence of lineages within the same deme is expected to be faster than between lineages in different demes in a metapopulation (Wakeley, 2004). Under this scenario, lineages would be "trapped" within their demes before being able to coalesce with lineages from distinct demes. The maintenance of these lineages over longer time scales would therefore inflate the estimated effective population sizes. However, as both species in our study have similar colony structures, they should both be equally impacted by this overestimation.
While the two study species are known to hybridize in Southern Finland , they can also hybridize with other closely related species from the F. rufa species group such as F. lugubris or F. rufa (Seifert, 2021;Seifert & Goropashnaya, 2004;Seifert et al., 2010). In our "ghost" introgression models, we tested all pos- To our knowledge, this work represents the first instance where speciation histories between two ant species were reconstructed using genome-wide data. However, as for many non-model organisms, some key parameters required for inference remain unknown.
The generation time we used was based on F. polyctena queen longevity estimates (Horstmann, 1983), while the mutation rate was approximated using data from other social insects (honeybee and bumblebee mutation rates; reviewed in Liu et al., 2017). While these approximations should not bias our inferences for one species more than for the other, some uncertainty is associated with the date estimates provided throughout our manuscript. For example, increasing the generation time would result in older divergence estimates in absolute time (i.e., years). Finally, we attempted to control for linked selection that could generate genome-wide heterogeneity in migration rates and/or effective population size. We did this by rerunning all demographic models after removing sites within or close to genic regions, which yielded comparable results to those obtained with all sites. However, the wood ant genome is compact, so picking sites far from genic regions is difficult. Future work (following e.g., Tine et al., 2014) should fit models allowing for variation in effective migration rate and population size to study the impact of linked selection on the reconstruction of demographic histories in wood ants.

| Present-day context of heterospecific sample pairs induces variability in inferred migration rates
By comparing samples from multiple locations across the ranges of both species, our approach pinpointed commonalities in the divergence histories across all comparisons, providing a robust picture of the speciation history between F. polyctena and F. aquilonia. It also allowed us to document variation in the inferred estimates of certain parameters. As such, we can interpret this observed variation in parameter estimates in light of the present-day context of the heterospecific populations under consideration (i.e., whether or not they are presently in contact).
The Finnish F. polyctena population is clearly different from other populations considered when it comes to its recent effective population size (N e ). Instead of contracting population size, akin to the other populations of both F. polyctena and F. aquilonia, estimates suggest that this population has recently expanded. This seems unlikely given that F. polyctena is at its range margins in Finland (Stockan et al., 2016), and, based on two surveys, is the minority species in Finland (Punttila & Kilpelainen, 2009;Sorvari, 2021 Figure S3). Finnish samples were also more genetically diverse than their conspecific populations sampled outside Finland ( withstand increased exposure to sunlight (Punttila, 2020). This phenomenon might increase opportunities for direct contact between these two species due to closer proximity between heterospecific nests. The second scenario would involve the F. polyctena × F. aquilonia hybrid populations, which are common in Southern Finland (Beresford et al., 2017;. These hybrid populations could mediate gene flow between the species via backcrosses between hybrids and individuals of each species (e.g., indirect gene flow). Elucidating these causes would require a dense survey of both nonadmixed and hybrid wood ant populations in Finland. As demonstrated here, present-day sympatric versus allopatric sample pairs may yield different inferences of recent, but not older, history. This is because, deep enough in time, the samples of each study species will share the same ancestral population, so their present-day context does not affect the inference of old events. However, the longer it may take to reach the most recent common ancestor (MRCA) between any two populations, more variation may be expected in inferred demographic parameters.

| Implications for speciation research
Opportunities for, for example, gene flow may vary in both time and space, introducing variation in estimates of migration rates. For example, if two species would have utilized multiple glacial refugia throughout their ranges, in which opportunities for gene flow within and between the species were not homogeneous, it could introduce variation in the estimates of, for example, past gene flow (e.g., Schuler et al., 2019;Wielstra et al., 2021). Further, should the populations of one or both of the species be in contact with a third in a certain portion of their ranges, then they may inadequately represent the history of their species (e.g., Rautsaw et al., 2020).
Finally, it is also reasonable to hypothesize that species with low N e may be particularly sensitive to geographical variation between both intra and interspecific populations, resulting in higher variation in parameter estimates when using different pairs of populations. This would be due to increased genetic drift, which is known to drive quick differentiation between populations (e.g., Hoeck et al., 2010;Kvie et al., 2019).
In order to obtain an accurate picture of both past and recent demographic events, we suggest that when designing a speciation genomic study, it may be beneficial to sample multiple locations throughout the ranges of the studied species (as is routinely done when studying parallel evolution), sequencing only a few individuals per location. This approach would be especially valuable in cases where the ranges of two species overlap in only one or few areas (e.g., Mettler & Spellman, 2009; this study). In conclusion, we found that the mode of divergence and most parameters were consistently estimated by the four different population pairs used in this study.
However, as this study represents only one instance where variation in parameter estimates has been studied and reported, more studies with other taxa are needed to confirm our findings and aid in the establishment of guidelines regarding sampling for speciation history reconstruction.

O PE N R E S E A RCH BA D G E S
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://doi.org/10.5061/ dryad.j6q573ngd.

DATA AVA I L A B I L I T Y S TAT E M E N T
The raw sequencing data has been deposited at the European