Population genomic analyses reveal hybridization and marked differences in genetic structure of Scurria limpet sister species with parapatric distributions across the South Eastern Pacific

Abstract The study of sister species that occur in parapatry around biogeographic transition zones can help understand the evolutionary processes that underlie the changes in species composition across biogeographic transition zones. The South Eastern Pacific (SEP) coast is a highly productive coastal system that exhibits a broad biogeographic transition zone around 30–35°S. Here, we present a comparative genome‐wide analysis of the sister species Scurria viridula and Scurria zebrina, that occur in parapatry and whose poleward and equatorward range edges intersect in the 30–35°S SEP biogeographic transition zone. We sampled 118 specimens sourced from nine sites from Tocopilla (22°S) to Chiloé (41°S) including one site where both species overlap and analyzed over 8000 biallelic single nucleotide polymorphisms. We found evidence of hybridization between these species in the contact zone and found significant but contrasting population structures for both species. Our results indicate that the genetic structure in S. viridula, which is currently expanding its range poleward, follows a simple isolation by distance model with no traces of natural selection (no evidence of outlier loci). In contrast, S. zebrina, which has its equatorward range edge at the transition zone, displayed a pronounced genetic break approximately at 32–34°S, along a region of marked environmental heterogeneity in association with a semi‐permanent coastal upwelling regime. For S. zebrina we also found 43 outlier loci associated with this genetic break, with a significant proportion of them clustering in a single linkage group. This marked difference in the presence of outlier loci between species suggests that they could be responding differently to local environmental challenges found at their overlapping geographic range edges, thus providing important new insights about genomic changes around biogeographic transition zones in sister species and the forces that shape genetic diversity in intertidal marine species.


| INTRODUC TI ON
The world's oceans are in stark contrast to most terrestrial ecosystems as they are embedded in a water matrix with very few physical barriers that can prevent dispersal and where most organisms have at least one mobile phase during their life cycle that is suited for long-distance dispersal (Hellberg, 2009;Kinlan et al., 2005;Lester et al., 2007;Shaw et al., 2019). Despite this high potential for dispersal, species distribution is far from being homogeneous, which is attested by the rich number of biogeographic transitions described to date (Spalding et al., 2007). Transition zones between biogeographic regions are paradigmatic, representing good systems for studying the origin of marine biodiversity (Angert et al., 2020;Dawson et al., 2002;Haye et al., 2014;Johannesson et al., 2020;Kerr & Alroy, 2021). These zones often display important changes in species composition, and for species that are present across them, changes in species abundance, or changes in the genetic composition (genetic structure) are also commonly reported (Bowen et al., 2016;Golla et al., 2020). In this sense, comparative studies of sister species that are in contact over a transition zone offer a good opportunity to explore the historic and evolutionary processes associated with their population divergence (Johannesson et al., 2020).
The South Eastern Pacific (SEP) extends over 6000 km from northern Peru to the southern tip of South America. It is a heterogeneous seascape characterized by an intense and spatially structured coastal upwelling circulation (Aravena et al., 2014) that makes it one of the most productive marine ecosystems on the planet (Thiel et al., 2007). This region is also characterized by a complex biodiversity distribution that inspired several authors to define different biogeographic units (see Lara et al., 2019). Possibly the most widely accepted division is the one by Camus, 2001, with two biogeographic provinces with distinct species assemblages separated by a transition zone. The Peruvian Province extends from 4 to 30°S, the transition zone called the intermediate area expands from 30 to 42°S and the Magellanic Province expands from 42°S down to 56°S (Camus, 2001). Much attention has been given to the biogeographic break located at 30°S that separates the Peruvian province from the intermediate area. This break is the distributional boundary for many species (both equatorward and poleward) and has been documented as an effective barrier to gene flow for some species that have distributions that cross this break (for examples see Haye et al., 2014Haye et al., , 2019Montecinos et al., 2012;Tellier et al., 2009) but not for others (such as in Cárdenas et al., 2009;Rojas-Hernandez et al., 2016). In addition, long-term surveys have shown that environmental characteristics, particularly upwelling, not only are highly heterogeneous along the Chilean Pacific coast but also change dramatically across the 30°S latitude (Aravena et al., 2014;Broitman et al., 2018;Lara et al., 2019;Torres et al., 2011). Thus, populations of wild benthic organisms inhabiting this region can experience differences in their exposure to coastal upwelling. These observations have fueled different hypotheses regarding the role of this break as a barrier to gene flow and expansion of species distribution. Yet the mechanisms and underlying evolutionary processes behind these patterns remain unclear. It is in this geographic context that the limpets of the genus Scurria are found.
The genus Scurria (Gray, 1847) belongs to the most ancestral group of living Gastropods, the order Patellogastropoda (Nakano & Ozawa, 2007;Nakano & Sasaki, 2011). Patellogastropod limpets are abundant inhabitants of intertidal and subtidal rocky shores throughout the global oceans and play important ecological roles in littoral marine ecosystems (Aguilera et al., 2013;Espoz et al., 2004;Nakano & Ozawa, 2007). The genus Scurria encompasses 8 described species based on morphology and DNA (16S) data, all endemic to the SEP (Espoz et al., 2004), but a recent unpublished work using six molecular markers suggests the presence of 10 species in this group (Asorey, 2017). Here, we present the results of a comparative population genetic survey of Scurria viridula (Lamark, 1819) and Scurria zebrina (Lesson, 1830). These sister species diverged approximately 15 My ago (Nakano & Ozawa, 2007) and occur in parapatry along the SEP coast of Chile. S viridula is distributed along the Peruvian province between 12 and 33°S, while S. zebrina is distributed along the transition zone between 31 and 42°S ( Figure 1). Both species overlap on a narrow zone of approximately 250 km around the 31-33°S transition zone (Aguilera et al., 2020). Previous studies indicate that this overlap is recent and due to the southwards range expansion of S. viridula in the last 50 years (Rivadeneira & Fernández, this genetic break, with a significant proportion of them clustering in a single linkage group. This marked difference in the presence of outlier loci between species suggests that they could be responding differently to local environmental challenges found at their overlapping geographic range edges, thus providing important new insights about genomic changes around biogeographic transition zones in sister species and the forces that shape genetic diversity in intertidal marine species.

K E Y W O R D S
intertidal, parapatry, population genomics, RADseq, Scurria, South Eastern Pacific 2005). Both species occupy a similar ecological niche but have been found to segregate spatially at a fine scale in the zone where they cooccur (Aguilera et al., 2013). In this study, we test for the presence of hybrids among these two species in the contact zone. We also describe and compare the genetic structure of these sister species in an effort to begin understanding the evolutionary history that resulted in the current distribution of these species across a transition zone (Sousa & Hey, 2013).

| Sample collection and study
We collected muscle tissue samples from randomly selected S. vir-  Table 1.

| Ethics statement
This research was undertaken in agreement with the policies and procedures of the Universidad Austral de Chile bioethical committee. Samples were collected under permit R. Ex N 2036, 2019 from the Subsecretaría de Pesca y Acuicultura (SUBPESCA) of the Chilean government.

| RAD sequencing
DNA was extracted using the GenJet Genomic DNA purification kit (Thermo Scientific™, Waltham, MA, USA) following the manufacturer's protocol. Genomic DNA was shipped to Floragenex (Oregon, USA) where libraries were prepared using the restriction enzyme F I G U R E 1 Map showing the location of the distribution and sampling sites for this study SphI, fragment size selection of 300-500 bp, and individual 10 bp barcodes for each sample. Two libraries were constructed and sequenced using Illumina 4000 technology, 101 bp single end.
Read length after trimming and filtering was 91 bp. Only reads with an average phred-score >20 in a 10 bp window were kept for further analyses. Sequences with >1 bp ambiguity in barcodes or with one or more ambiguities in the restriction site were removed.
Demultiplexed reads were aligned to the reference genome of the congeneric species Scurria scurra (unpublished data) using bow-tie2 (Langmead & Salzberg, 2012) with "-sensitive" settings. Aligned read files were then used to generate a catalog of RAD loci using ref_map.pl in STACKs version 2.53. We then used the populations module in Stacks to calculate population genetic statistics and to filter loci. We kept only loci present in 95% of the samples, with a minor allele frequency of at least 3% and a maximum observed heterozygosity of 0.6. Only one SNP per locus was kept using the flag "-write-random-snp". Individuals with an average coverage across loci below 5x were removed from the dataset. We produced three datasets with these settings: one including samples from both species, one including only S. viridula samples, and one including only S. zebrina samples. In both single species datasets, all putative hybrids were not included for analyses. Genetic summary statistics were estimated using the population module of STACKs.

| Potential hybrids
Since these two species became in contact recently, we evaluated if there were genetic signals of hybridization. We expected that if hybrids exist, these should be found within the overlapping distribution area as Scurria limpets are considered to have low dispersal potential (Haye et al., 2014). To identify potential hybrids between species we analyzed the complete dataset (data from all individuals from both species) and performed a principal components analysis (PCA) using the function "glPca" from the Adegenet package (Jombart & Ahmed, 2011). We used LEA (Frichot & François, 2015) to estimate the ancestry of individuals under the hypothesis of two species (K = 2) with alpha = 10. We also used the program NewHybrids v1.1 (Anderson & Thompson, 2002) to estimate the posterior distribution that individual samples fall into different hybrid classes. Because NewHybrids cannot handle large numbers of loci, we filtered the full dataset and removed all loci with missing data, loci with minor allele frequency <0.4 (since the number of samples per species was similar, we expected a minor allele frequency ~0.5 for loci in which different alleles were fixed in each species). We also removed loci that were less than 500 kb apart.
The filtered dataset used included 225 loci. Individuals of S. viridula from the northernmost sampling site (TO) and individuals of S. zebrina from the southernmost sampling site (CH) were assigned as pure individuals for each species, respectively. We used the R package ParallelNewhybrids (Wringe et al., 2017) to run NewHybrids in R. We ran the program five times using Jeffrey's prior for both allele frequencies (theta) and mixing proportions (pi) with a burn-in of 20,000 followed by 50,000 sweeps.

| Population structure
We performed a PCA analysis for each of the two species separately as described in the previous section. We also ran LEA to estimate the most likely number of genetic clusters for each species by running the snmf algorithm for values of K = 1 to K = 10. For each species, TA B L E 1 Sampling information and genetic diversity summary statistics for each sampling location are estimated from species-specific datasets we also tested for Isolation by Distance by running a Mantel test to estimate the correlation between a pairwise F ST distance matrix and a Euclidean geographic distance matrix. Mantel tests were performed using the hierfstat R package with 1000 permutations for significance. We also performed an analysis of molecular variance (AMOVA) using the "poppr.amova" function from the poppr R package (Kamvar et al., 2014).
We tested for loci that exhibited significant deviations from neutral expectations in each species. We did this by performing two different outlier tests in each dataset. First, we used the R package "OutFLANK" (Whitlock & Lotterhos, 2015). Genetic diversity values in S. zebrina were on average two times higher than in S. viridula.

| Isolation by distance
There was a strong and significant correlation between the genetic distance and geographic distance matrices for S. viridula (Mantel r = 0.968, p = .0223). For S. zebrina, the correlation between genetic and geographic distance matrices was significant but weaker (Mantel r = 0.522, p = .008). A visual inspection of the correlation between genetic and geographic distance matrices clearly shows that for S. viridula genetic distance between populations increases linearly with geographic distance. For S. zebrina, this relationship is not linear ( Figure 6).

| Outliers
The

| DISCUSS ION
Our results indicate that two sister species with parapatric distributions, S. viridula and S. zebrina, appear to be hybridizing in the geographic zone where they overlap. Despite that both species display significant genetic structures within their geographic ranges, S. zebrina has twice the genetic diversity of S. viridula. Our results also indicate that the genetic structure of these two species is shaped by different evolutionary mechanisms. The genetic structure in S. viridula is consistent with a simple neutral genetic divergence model with isolation by distance, while S. zebrina appears strongly influenced by divergent natural selection (with several outlier loci identified by two different methods) with a genetic break at 32-34°S. Below, we discuss these results in the geographic context where these species are found. We also discuss the perspectives and novel opportunities this system offers to study hybridization and to better understand  Ntuli et al., 2020;Zardi et al., 2007).
Our results strongly suggest that these two species hybridize in the geographic area where they co-occur. While our sampling effort is not sufficient to characterize the degree and frequency of hybridization, our results do suggest that in this area, hybrid individuals with different degrees of introgression are rather common.
Our results also indicate that introgressions seem to be occurring in both directions. These results suggest that neither of the species has developed efficient mechanisms of reproductive isolation.
According to Rivadeneira and Fernández (2005)   introgression suggests that hybrids of these two species are fertile and reproductive isolation has not yet been achieved. Similar introgression patterns have been also reported for other congeneric marine gastropods in places where they occur in sympatry (Costa et al., 2020;Galindo et al., 2021;Hirase et al., 2021). Scurria, like most Patellogastropoda (Kolbin & Kulikova, 2011) are broadcast spawners and as such, the number and diversity of possible reproductive barriers might be limited compared with other organisms with different reproductive strategies (Nydam et al., 2017). Further studies that include laboratory crosses, genetic field surveys with higher sampling effort, and experiments to measure the performance of hybrids are needed to test these hypotheses.
The biogeographic zone between 30 and 35°S is characterized by a marked transition in oceanographic conditions, with a weak and persistent upwelling regime to the north and a variable strong upwelling to the south of this point and down to 39°S (Broitman et al., 2018;Hormazabal et al., 2004;Lara et al., 2019). This transition is also a biogeographic and phylogeographic break for numerous organisms including algae, intertidal barnacles, gastropods, and crustaceans (Ewers-Saucedo et al., 2016;Haye et al., 2014Haye et al., , 2019Sánchez et al., 2011;Thiel et al., 2007). Within both species, the major observed genetic partition clearly separated the region where both species overlap (30-34°S) from the rest of the samples. These genetic transitions coincide to some extent with two known biogeographic and phylogeographic breaks at 30 and 35°S, respectively (Lara et al., 2019). First, our results indicate that there is a significant genetic structure in S. viridula that separates Paposo (25°S) from the remaining sites to the south (29-31°S). This genetic transition coincides approximately with the 30°S biogeographic break. We note, however, that for S. viridula, the genetic composition of samples between 29 and 31°S is homogeneous and differentiation further north follows neutral expectations and isolation by distance. Our sampling design indicates that a genetic break in this species, if present, lies somewhere between 25 and 29°S. Yet, the clear pattern of isolation by distance and the lack of selection footprints suggests that genetic differentiation for S. viridula is rather gradual and could be explained simply by the limited dispersal potential in these organisms (Pinsky et al., 2017). Further sampling between 25 and 25°S would help elucidate if this is a genetic break or if genetic differentiation is indeed gradual. For S. zebrina, the genetic break occurs between 31 and 34°S and close to the biogeographic break around 35°S previously reported for benthic macroinvertebrates (Lancellotti & Vásquez, 1999) and that appears to be associated with major Andean river outflows (Lara et al., 2019). A phylogeographic break at 35°S has been reported in the ascidian Pyura chilensis (Quesada-Calderón et al., 2021) and in the beach-dwelling isopod Excirolana hirsuticauda . The latter species, E. hirsuticauda, has two phylogeographic breaks/transitions, one at ca. 30°S and the other at 35°S, coincident with the phylogeographic breaks for S. viridula and S. zebrina, respectively. It is interesting to note that the genetic break for S. zebrina around 35°S displays clear signals of natural selection. However, an alternative hypothesis (coupling hypothesis) is that intrinsic pre-or post-zygotic genetic incompatibilities are in fact responsible for this genetic differentiation pattern but are coupled with exogenous barriers associated with ecological selection (Bierne et al., 2011). A more detailed sampling scheme around this genetic break would help to further explore this idea. Taken together, these results add to the growing body of the literature that shows that this coastal area around 30-35°S is a natural laboratory for the study of speciation in marine organisms.
Patterns of genetic diversity and genetic structure differed between species and suggest that these two species, despite their ecological similarities, differ considerably in the main evolutionary mechanisms that are shaping their current genetic structure. S. zebrina displayed consistently higher within population genetic diversity and significant signals of divergent natural selection in shaping genetic structure. In contrast, genetic diversity in S. viridula was half of what was observed in S. zebrina, and genetic structure followed neutral expectations. It is interesting to note that for both species within population genetic diversity diminishes gradually from north to south. Lower within population genetic diversity is often associated with the edge of the range of distribution (Eckert et al., 2008) and lower population sizes (Frankham, 1996;Willi et al., 2006). Thus, these results suggest that ancestral populations of both species originated at the northern end of their respective distributions and have expanded towards the south. Under this scenario, these species may have originated in allopatry and became in contact recently following the southwards expansion of S. viridula (Rivadeneira & Fernández, 2005). However, further analyses that test different demographic history alternatives will be required to test this hypothesis. Higher genetic diversity observed in S. zebrina suggests that this species might have larger effective population sizes. Previous studies have shown that both species have similar census sizes along most of their geographic distributions but that densities of S. zebrina increase threefold around 32-33°S (Aguilera et al., 2013(Aguilera et al., , 2020. Interestingly, this area is adjacent to the genetic break we report for this species. This genetic break in S. zebrina (34-35°S) displays clear footprints of divergent selection, including 43 outlier loci. We also found that a large proportion of these outliers was concentrated in one of the linkage groups, a pattern that is consistent with the idea of divergent selection operating over a narrow genomic region (Nosil et al., 2009). In addition, we observed that for all outlier loci, one allele is fixed or near fixation at the locality north of the break compared to other localities further south. Previous studies have shown that the region between 30 and 33°S is characterized by strong betweensite differences in patterns of environmental variability associated with coastal upwelling circulation such as sea surface temperature, pco 2 , and pH (Broitman et al., 2018;Lardies et al., 2021;Meneghesso et al., 2020;Navarrete et al., 2005). Taken together our results strongly suggest that the Huentelauquén population of S zebrina (31.6°S) could have adapted to the heterogeneous environmental conditions of this transition zone, which would explain its high densities, high genetic diversities, and natural selection footprints. We note, however, that it is also possible that these highly divergent regions could be shaped by differential introgression resulting from the hybridization of these two lineages and complex demographic histories (Duranton et al., 2018). Further studies with better genome coverage (i.e., whole-genome re-sequencing) will help distinguish these scenarios.
Previous studies that have evaluated phenotypic plasticity and responses of these two species to different environmental conditions from localities in this geographic zone where they overlap (30-33°S) have shown that S. viridula is more phenotypically plastic than S. zebrina in the region where both species reach similarly higher population densities when compared to the center of their ranges (Aguilera et al., 2020;Broitman et al., 2018;Lardies et al., 2021).
In contrast, S. zebrina samples from Huentelauquén (31.6°S) were highly differentiated from other populations further south, away from this transition zone and this differentiation involves several loci that appear to be under divergent selection. Indeed, the phenotypic response of range edge populations at Talcaruca for S. zebrina, were significantly different from populations inside the range (Broitman et al., 2018). Our genetic results agree with the heterogeneous between-site phenotypic responses reported previously for these species. Specifically, in studies of geographic variation in phenotypic plasticity of morphological and physiological traits of both species, which appear to correlate to the heterogeneous seascape of carbonate conditions and sea surface temperature across this biogeographic break in the SEP (Broitman et al., 2018;Lardies et al., 2021).
Finally, further studies that evaluate phenotypic plasticity outside this transition zone for these two species and further genetic studies that reconstruct the demographic history of divergence of these lineages could help better understand the links between phenotypic responses and genetic structure.

ACK N OWLED G M ENTS
PS-A acknowledge support from Fondo Nacionanal de Ciencia y Tecnología (FONDECYT 1190710 and 1200636) and Millenium Nucleus NUTME (ANID NCN19_056), BRB, PAH, and MAL acknowledge support from Millennium Institute SECOS (ANID ICN2019_015) and FONDECYT 1190444. BRB was further supported through FONDECYT 1181300 and 1221669 and Millenium Nucleus UPWELL (ANID NCN19_153). We thank M. H. Oróstica for his valuable comments on the initial version of this manuscript. We thank Felipe Pontigo, Paula Ramirez, and members of the Saenz and Chango labs for assistance in the field.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
Individual fastq data files are available at the SRA repository of NCBI under Bioproject number PRJNA825578. Raw SNP data for all three datasets used in this study is available at DRYAD (https://doi. org/10.5061/dryad.k6djh 9w8p).