East is East and West is West: Population genomics and hierarchical analyses reveal genetic structure and adaptation footprints in the keystone species Paracentrotus lividus (Echinoidea)

The Atlanto‐Mediterranean edible purple sea urchin, Paracentrotus lividus, is a commercially exploited keystone species in benthic communities. Its browsing activity can deeply modify the littoral landscape, and changes in its abundance are of major conservation concern. This species is facing nowadays contrasting anthropogenic pressures linked to predator release, exploitation and sea warming. Management of this key species requires knowledge of its genetic structure, connectivity and local adaptation. Our goal was to assess the current global status of the species under a genomic perspective.


| INTRODUC TI ON
The population structuring of living organisms is shaped by the combination of different evolutionary forces such as migration, genetic drift or selection. Historically, the selection component has been underestimated in relation to other processes in population genetic analyses. However, the rise of genomic studies during the last decade has opened new windows for research including the analysis of outlier loci to look for genomic signals of local adaptation or associated to environmental cues (Benestan, Ferchaud, et al., 2016). The genotyping of thousands of markers spanning the whole genome has increased the power to detect even subtle genetic patterns previously unnoticed using markers such as microsatellites or mitochondrial DNA sequences (Bradbury et al., 2015(Bradbury et al., , 2010. Furthermore, the analysis of outlier loci has revealed additional levels of structuring in many cases (Milano et al., 2014;Vandamme et al., 2014) showing trade-offs between selection and dispersal (Carreras et al., 2017).
This additional improvement of population genomics, when compared to conventional population genetics, is especially relevant in marine organisms, as they include widely dispersing species and feature large population sizes rendering genetic differentiation across populations difficult to detect (Lamichhaney et al., 2012;Xuereb et al., 2018). For this reason, the inclusion of adaptive genetic variation is necessary to identify population structuring mediated by local adaptation to environmental conditions (Sandoval-Castillo, Robinson, Hart, Strain, & Beheregaray, 2018). The use of genome-wide markers increases the chance of identifying regions under selection and has been recommended for delineating conservation and management units of natural populations (Funk, McKay, Hohenlohe, & Allendorf, 2012).
The assessment of genomic signals of selection on non-model organisms is usually performed in two different ways (Ahrens et al., 2018). On one hand, outlier analyses (OAs) are based on the search of genome-wide markers showing outlier values of genetic differentiation (F ST ) among the studied populations as they are considered to be candidate loci for local adaptation (Carreras et al., 2017).
However, the power to find such outlier loci depends, by definition, on the mean values of genetic differentiation found among populations. Thus, if different levels of genetic differentiation exist (e.g. a deep genetic break separating two groups of genetically differentiated populations) the power to detect outliers and signals of local adaptation at the lower levels (i.e. within groups) may be compromised and obscured by patterns occurring between genetic groups. On the other hand, environmental association analyses (EAAs) use environmental variables to find genomic signatures associated to them, and thus potentially under selection (Benestan, Quinn, et al., 2016).
Again, these environmental variables can show different patterns in the different genetic groups (that likely occur in different geographic zones), and thus, only the associations related to the major breaks may be revealed. If the overall genetic structure is not accounted for in the assessment of genomic signals of selection, the results are likely to be partial. Under these circumstances, a hierarchical approach combining the analyses at different levels is fundamental to get out the most of genetic data (Carreras-Carbonell, Macpherson, & Pascual, 2006). Although several constraints have been described for the analysis of candidate loci under selection using either OAs or EAAs (Ahrens et al., 2018), the impact of hierarchical genetic structuring on the search of selection signatures at different scales has not been addressed, and this may be a critical issue, especially on species with wide distribution ranges.
In this study, we applied a genomics approach to assess the population structure of the edible purple sea urchin Paracentrotus lividus (Lamarck, 1816). This species has a wide distribution range across the Mediterranean Sea and north-east Atlantic. This sea urchin is a key component in benthic communities due to its grazing activity that regulates the density of macrophyte populations (Agnetta et al., 2015;Benedetti-Cecchi, Bulleri, & Cinelli, 1998;Palacín, Giribet, Garner, Dantart, & Turon, 1998;Sala, Boudouresque, & Harmelin-Vivien, 1998;Wangensteen et al., 2011). Changes in its abundance are therefore of major concern for the conservation of littoral communities. This species is nowadays facing contrasting anthropogenic pressures. While overfishing of predators can promote high densities of sea urchins (Hereu, Zabala, Linares, & Sala, 2005;Sala et al., 1998), this species is also exploited for their roe (Barnes & Crook, 2001), and it has shown early signs of collapse in some areas. This decline is probably the result of the combined impact of over-exploitation, climate change due to its temperature tolerances (Privitera, Noli, Falugi, & Chiantore, 2011;Yeruham, Rilov, Shpigel, & Abelson, 2015) and possible competition with other species (Rilov, 2016;Yeruham et al., 2015). A sound knowledge of the genetic structure, connectivity and local adaptation of this key species is necessary for a proper management.
The adults of P. lividus are known to have limited mobility (Palacín, Giribet, & Turon, 1997) but their long pelagic larval duration (PLD) of 20-40 days (Pedrotti, 1993), heterogeneity in settlement (Hereu, Zabala, Linares, & Sala, 2004;Tomas, Romero, & Turon, widely in Paracentrotus lividus, the species is sensitive to dispersal barriers, displays isolation by distance and faces local selective pressures associated to environmental conditions, all of which can render it more vulnerable than previously thought.

K E Y W O R D S
benthic communities, engineer species, genotyping by sequencing, oceanographic discontinuities, outlier analysis, population genomics, sea urchin 2004) and broadcast spawning behaviour (Minchin, 1992) suggest that this species may have low genetic differentiation among localities. Previous studies assessing the population structure of P. lividus using different sets of mitochondrial and nuclear DNA markers found significant genetic differentiation only across two known oceanic discontinuities, the Almería-Oran front separating the Atlantic and the Mediterranean populations (Calderón, Giribet, & Turon, 2008;Duran, Palacín, Becerro, Turon, & Giribet, 2004), and the Adriatic-Ionian front (Maltagliati, Giuseppe, Barbieri, Castelli, & Dini, 2010). Temporal variability among different cohorts of the same localities was also assessed suggesting little temporal variation, with some exceptions associated with abnormalities of circulation patterns (Calderón, Palacín, Palacín, & Turon, 2009;Calderón, Pita, Brusciotti, Palacín, & Turon, 2012) or chaotic genetic patchiness (Couvray & Coupé, 2018). However, it has been suggested that P. lividus might present higher levels of structuring across the Atlantic and Mediterranean than previously suggested (Penant, Aurelle, Feral, & Chenuil, 2013). This potential structuring at finer scale was evaluated within the Adriatic Sea with a population genomics approach using 2b-RAD coupled with Lagrangian simulations (Paterno et al., 2017). This first genomic study showed differentiation between the Adriatic-Ionian populations and two external populations (France and Tunisia), although no fine-scale structuring was found within the Adriatic. Furthermore, this study did not provide a general picture for P. lividus, given that its spatial scope did not cover the wide range of the species' distribution.
In this context, we used genotyping by sequencing (GBS) on 241 individuals of P. lividus sampled from 11 localities spanning its Atlanto-Mediterranean distribution range. The specific aims of this study are (a) to determine the global population structure of this keystone species using a panel of genome-wide markers, (b) to test the existence of selective pressures by identifying candidate loci under selection between and within genetic groups, (c) to relate the signal of these candidate loci to environmental variables and (d) to ascertain adaptation drivers by relating candidate loci under selection to known functions through gene annotation. Finally, we discuss the implications of our results in the context of management and conservation considering the species' current exploitation and foreseeable population regression due to environmental alteration and the complexity of its distribution range.

| Genotyping by sequencing and loci selection
A GBS protocol (Elshire et al., 2011) was carried out at the "Centre Nacional d'Anàlisi Genòmica" (CNAG). The genome was digested using the restriction enzyme ApekI (selected based on fragment size distribution in a preliminary assay), and the resulting fragments were ligated to individual barcodes and to common adapters. DNA fragments were paired-end sequenced (2 × 100 bp) in an Illumina HiSeq 2000 platform using 96 multiplexed individuals in each lane of a flow cell. Each individual was sequenced two or three times in order to increase sequence depth, totalling 10 lanes for this study.
The raw Illumina sequences were analysed using the GIbPSs toolkit, designed to analyse data from massive sequencing of nonmodel organisms (Hapke & Thiele, 2016). GIbPSs can handle pairedend sequences; it does not require the same sequence length at different loci and implements an indel detection method to flag potentially problematic loci that would include alleles of different sizes.
All sequences were trimmed to 80 bp to discard the lower quality bases at the ends, and the 5' restriction sites were removed. The default values for filtering were modified to increase stringency, and thus, sequences shorter than 32 bp, with N's, or with a phred score quality lower than 22 in a sliding window of 5 bp, were discarded.
A minimum overlap of 5 bp was required to assemble forward and reverse sequences of a paired-end read, and thus, the maximum length of a locus after assembling both paired-end sequences was 152 bp. Staggered read pair problems arose with loci shorter than the read length because the two restriction sites were very close.
In these cases, forward and reverse reads corresponded to exactly the same fragment including the restriction sequence at both ends.
Depending on the length of the fragment, the initial 80 bp trimming may not completely cut off the restriction enzyme sequence at the 3' end of each read, while the restriction sites at the 5' ends were removed. In these cases, the program could not overlap the sequences and appended them. Those duplicate sequences were identified, and only the forward read was kept for further analyses.
Locus search with GIbPSs was first carried out separately for each individual. Identical sequences in a given individual were grouped in the same cluster (svar in the terminology of the program). Different svars were grouped for each individual into loci using the default distance settings. A minimum depth of at least 5 sequences was necessary for a locus to be deemed as valid. We defined as alleles of individual genotypes those svars representing more than 30% of the total number of reads in a locus. Thus, heterozygote individuals were those having two alleles above this threshold as genotypes based on an uneven distribution of alleles were removed. Loci were identified as haplotype loci, based on the procedure used by GIbPSs and other software for "de novo" loci identification. Thus, all the polymorphic sites of the different sequences (svars) of each locus were included for identifying alleles. We used haplotype loci as previous studies have shown that working with sequence variants (multiallelic) instead of using only one SNP per locus (biallelic) is more informative (Casso, Turon, & Pascual, 2019;Ryman et al., 2006 haplotype loci dataset and for the requirements to use SNPs in some programs (see below) we also selected from each retained haplotype locus only the first SNP, using the option -uS 2 of GIbPSs, to obtain a new dataset, henceforth called the SNP loci dataset.

| Population structure
Hardy-Weinberg equilibrium (HWE) was assessed for each locus in each sampled locality with the "pegas" R package (Paradis, 2010 to avoid using the combination of paralogous loci in the same locus as they are expected to be consistently out of HWE across populations (Benestan et al., 2015).

Pairwise differences between localities were calculated with
Nei's estimator of F ST (Nei, 1987) in the "hierfstat" R package, using 999 permutations to calculate the respective p-values (Goudet, 2005) with the B-Y FDR correction. The results were represented with heatmaps and dendrograms with the "gplots" R package (Warnes et al., 2016). Observed and expected heterozygosities, allelic richness and F IS values per sampling location were calculated using the R packages "genetics" and "hierfstat" (Goudet, 2005; Warnes, Gorjanc, Leisch, & Man, 2019). In order to identify isolation by distance (IBD), a Mantel test was performed with the "vegan" R package (Dixon, 2003;Oksanen et al., 2019) using the shortest distance by sea between localities measured with Google Earth.
A discriminant analysis of principal components (DAPC) was performed with the "adegenet" R package (Jombart, 2008). We retained a total of N/3 PCs, being N the number of individuals, and 3 discriminant functions using localities as prior grouping information. Additionally, we used the Bayesian information criterion (BIC) to detect the optimal number of genetic clusters. We performed a k-means clustering from k = 1 to k = 10, using 10 5 iterations, and, in this case, we retained 250 PCs as the cumulative variance explained by the eigenvalues of the PCA was saturated at this value.
The results obtained were used to separate our samples per cluster that were represented in new DAPCs to better identify the genetic structure within each genetic cluster. Again, we retained a total of N/3 PCs and 3 discriminant functions using localities as prior grouping information. Additionally, population structuring was also evaluated using STRUCTURE, which implements a Bayesian clustering method to identify the most likely number of populations (K).
We used the strategy and parameters described in the literature (Evanno, Regnaut, & Goudet, 2005), and thus, we carried out 10 runs per each value of K ranging from 1 to 10. We used the model of correlated allele frequencies and a burn-in of 50,000 followed by 500,000 Markov Chain Monte Carlo steps. We estimated the ad hoc statistic ΔK in order to infer the most likely number of populations using STRUCTURE HARVESTER (Earl & vonHoldt, 2012). The 10 runs of STRUCTURE for the most probable K were averaged using CLUMPP version 1.1.2 (Jakobsson & Rosenberg, 2007). To test the resolution of detecting the population genetic structure of the SNP loci dataset compared to the haplotype loci dataset, we also carried with the SNP loci dataset a DAPC analysis, computed pairwise F ST values among localities and performed a STRUCTURE analysis to identify the number of genetic groups as described above.

| Identification of loci candidate for local adaptation
Four different sample groupings were used to detect candidate loci among the haplotype loci dataset based on outlier F ST (OAs): "All populations," "Atlantic versus Mediterranean," "Atlantic" and "Mediterranean." In the "All populations" grouping, all samples were included and grouped by sampling locality to obtain a general overview of the outlier loci across the whole geographica range sampled.
In the "Atlantic versus Mediterranean" grouping, all the samples were also included but they were combined in the two groups as identified by the k-means method in DAPC belonging to the Mediterranean and the Atlantic clusters (see Section 3). This analysis was set in order to detect outlier loci differentiating the Atlanto-Mediterranean transition. Finally, each cluster was analysed independently with their samples grouped by locality in order to explore the outliers of the internal structuring within each group ("Atlantic" or "Mediterranean").
In the four groupings, outliers were identified with Arlequin The F ST matrices of all subsets of outliers were compared to environmental variables that could explain the genetic structuring found.
Both monthly mean sea surface temperature and salinity data were obtained from Copernicus Marine Environment Monitoring Service database (CMEMS, http://marine.coper nicus.eu, product identifier: MULTIOBS_GLO_PHY_REP_015_002) using the R package "ncdf4" version 1.16 (Pierce, 2017) from January 1993 to December 2016 (288 months) and averaged. For each abiotic variable, we computed distance matrices using the pairwise differences in the mean, maximum, minimum and range of the variable. These distance matrices were compared individually with the F ST matrices using partial Mantel tests as implemented in the R package "vegan" (Dixon, 2003;Oksanen et al., 2019) and using the geographic distance among populations as a controling variable.
We used a redundancy analysis (RDA) (

| Regionally missing loci
We looked for geographic patterns with regionally missing loci, defined as the haplotype loci that did not pass the <30% missingness filter and were therefore discarded from the final dataset. For each one of the four identified areas (Atlantic, transition zone, western Mediterranean and eastern Mediterranean, see Section 3, Figure 2), we performed independent analysis using the non-filtered loci dataset. For each one of these partial datasets, we applied the same filters used for the whole dataset, as some of the filters may be dependent on the samples analysed. For all loci present in at least one of the partial datasets but discarded in the whole dataset, we estimated their frequency (percentage of individuals with these loci per population). We plotted their frequency across all localities using a heatmap and dendrogram with the "gplots" R package (Warnes et al., 2016).

| BLAST of candidate loci for local adaptation
The sequences of the candidate loci for local adaptation, found with either OAs (outlier loci) or EAAs (association loci), were blasted against the genome of Strongylocentrotus purpuratus in the NCBI database (https ://blast.ncbi.nlm.nih.gov/Blast.cgi) using the blastn option. We also blasted the sequences of all the regional missing loci that were completely absent in some of the partial datasets. The scaffold accession number, nucleotide position, name of the gene where the sequence is located (if applicable) and position (exon, intron or intergenic) were recorded for matches with E-values below 1 × 10 −4 . Sequences with a positive blastn hit (i.e. below the E-value threshold) were analysed using Blast2GO 4.1.5 software (Gotz et al., 2008). First, a blastx algorithm was applied using the NCBI Blast service with a taxonomy filter for "sea urchin" (taxid 7625) and the same Evalue of 1 × 10 −4 used for the blastn analysis. Positive matches were mapped and annotated using the default parameters of the program. Sequences with a blastn hit for which an ontology term could not be found were manually checked using UniProt to identify potential functions as described for other echinoderms. We used the GO terms of the annotated genes to produce treemaps of the biological samples using REVIGO (Supek, Bošnjak, Škunca, & Šmuc, 2011).

| Locus identification
For the 241 assayed specimens, 4,613,169 reads were obtained on average per individual (minimum 800,000 reads, Table 1

| Population genetic diversity
The selected loci had different lengths, from 32 to 152 bp (123.6 ± 34.3, mean ± SD), and the number of polymorphic positions in each locus was variable (25.9 ± 13.7). Therefore, the total number of alleles per locus had a wide range, with 64 loci having more than 100 alleles and 9 loci having only 2 alleles ( Figure S1).
The mean number of alleles per locus was 31.9 ± 21.8. However, the mean number of alleles found within a given locality was more homogeneous, with an average of 7.3 ± 0.2 alleles per locus (  (Table 1). Interestingly, a total of 5 loci were missing in all individuals of the two Atlantic localities, and an additional 6 loci were missing in all individuals from Crozon (French Atlantic).
All other markers were found in at least some individuals from all localities.

| Population structure and connectivity
F ST pairwise distances were low but significant after FDR correction in 37 out of 56 comparisons (Figure 2a; Table S1). Two clusters were identified following the Bayesian information criterion (BIC) that could be associated to Atlantic and Mediterranean affinities ( Figure S2a; Table S2). All the individuals from the Atlantic localities were assigned to one cluster. The localities from the Alboran Sea (TA and LH) had a mixed composition, with a decreasing abundance of Atlantic-type individuals as we move eastwards following the coastline, which is consistent with the idea that this is a transitional area. All other localities were made up exclusively of  individuals of the second cluster. Therefore, we will refer to all the individuals of cluster 1 as the Atlantic cluster and all the individuals of cluster 2 as the Mediterranean cluster (Table S2) Figure S4).

Moreover, the number of clusters inferred by STRUCTURE (K = 4)
was not clearly observed in the assigned individual probabilities ( Figure S4).
Considering all localities and the haplotype loci dataset, genetic differentiation significantly increased with geographic distance (Mantel test, R 2 = .618, p = .002). Furthermore, this relationship was significant, albeit with a lower correlation coefficient (R 2 = .514, p = .003) when only Mediterranean localities were considered, indicating that the significant isolation by distance detected is not only due to the genetic distances separating the Atlantic and the Mediterranean populations. Furthermore, the Mantel tests plots indicated that these results are unlikely to be caused by the presence of two differentiated groups but to a progressive isolation by distance ( Figure S5).

| Signals of selection
The number of outlier loci detected by OAs was larger when comparing all samples grouped by locality (Table 2). Arlequin software and Lositan software were more conservative and generally detected fewer outlier markers than BayeScan, and thus, they were used for outlier selection in most searches (Table 2). After retaining the loci of the two more conservative programs, we did not detect any outlier locus presenting lower F ST than expected under neutrality, and thus possibly under balancing selection. Furthermore, the different groupings not only revealed different number of outliers but also the outliers found were different in many cases (Figure 3).
Most outlier loci found using all the samples grouped by locality ("All populations" outlier dataset) were also detected when comparing the Mediterranean and Atlantic clusters. Nevertheless, only c. 50% of outlier loci found within the Mediterranean and within the Atlantic clusters were also found in the "All populations" outlier dataset. The outlier loci identified when comparing the two clusters were mostly different to those identified within clusters (Supplementary Dataset). Mean global F ST value per locus was always larger when using outlier than no outlier loci for any the three different programs used (Arlequin, Bayescan and Lositan) and by the four different outlier analysis performed ( Figure S6).
As expected, the range and the mean values of pairwise genetic distances were larger using any set of outlier markers than using all loci (Table 2). Pairwise genetic differences using all loci (Figure 4a) and the three sets of outlier loci (Figure 4b (Table S5). Nonetheless, this set of outliers showed a marked differentiation between western and eastern Mediterranean localities, as well as within the latter (Figure 4d; Table S5). DAPC representations obtained with the different sets of loci ( Figure S7) were basically similar to those obtained with all loci (Figure 2 and Figure   S3), albeit the locality closer to the Gibraltar Strait (TA) appears separated in the representation of the Mediterranean cluster with the outliers found in the groupings of "All localities" and "Atlantic versus Mediterranean." The 31 outliers found in the Mediterranean grouping showed overall less resolution, but allowed a clearer and significant differentiation among the Turkish and Adriatic localities as well as between the eastern and western Mediterranean localities ( Figure S7; Table S5).
Several pairwise population distances for several environmental variables were significantly correlated to F ST matrices based on outlier loci after controlling for the effect of geographic distance with partial Mantel tests (Table S6) Between 50% and 71% of the candidate loci identified with the different analysis (OAs and EAAs) and groupings had positive blastn hits (E-value below 1 × 10 −4 ) to Strongylocentrotus purpuratus NCBI sequences, while 27% of the regional missing loci (Table 2) A total of 195 out of the 285 candidate loci found to be associated with a gene were located completely or partially in exonic regions. Around one third of the loci with a blastn hit had associated GO functions in all cases (Table 2). When considering all samples, GO functions were related mainly to apoptotic process, lysosomal transport and organization, response to oestrogen, protein ubiquitination and nucleobase-containing compound catabolism anatomical structure development among others ( Figure S9a, Supplementary Dataset). When considering only the Mediterranean samples, similar functions appeared, such as apoptotic process, protein ubiquitination, anatomical structure development, while new functions appeared, such as vesicle-mediated transport or response to mechanical stimulus ( Figure   S9b). Regarding genes potentially preventing random mating, we found 1 gene coding for a receptor for egg jelly that was an outlier when considering either all populations or the Atlantic and TA B L E 2 Comparative results of the analysis of candidate loci for selection  (12) Note: The first row shows the results of the original dataset with all loci for comparison. The other rows show the results of the analyses of candidate loci for selection using three different methodologies as stated in the first column: outlier analysis based on F ST (OAs), environmental association analysis (EAAs) using RDA and regional missing loci. Different groupings were also used. For OAs: "All localities" includes all samples grouped by locality, "Atlantic versus Mediterranean" includes all samples grouped by their membership to the Atlantic or Mediterranean clusters, "Mediterranean cluster" includes all samples of the Mediterranean cluster grouped by locality, and "Atlantic cluster" includes all samples of the Atlantic cluster grouped by locality. The individuals in the Atlantic and Mediterranean clusters are those defined by the k-means clustering analysis with k = 2. For EAAs: "All localities" includes all localities and "Mediterranean" includes the Mediterranean localities excluding the transition zone. For each analysis, we provide the number of outlier loci identified, the programs used (the two most conservative in each case). For the OAs, we also provide the maximum and mean pairwise F ST . The number of outlier loci with BLAST hits is given in the last column, and those to which a candidate function could be assigned are indicated in parentheses.
Mediterranean clusters and a sperm-associated antigen, found by RDA when considering all samples (Supplementary Dataset).

| D ISCUSS I ON
Using thousands of genome-wide markers, we detected so far unnoticed patterns of genetic structure among populations of P. lividus spanning most of its distribution range. We detected a gradient of differentiation following a longitudinal dimension, overlain by a major differentiation at the Atlantic-Mediterranean transition.
Furthermore, specific analyses using candidate markers for adaptation enabled us to document population structure potentially driven by selection also at fine-scale level, using either OAs or EAAs.
Different candidate markers for selection were detected when searched between and within subsets of genetically grouped individuals (Atlantic and Mediterranean clusters), likely indicating different selective pressures that would remain otherwise undetected without the hierarchical analysis approach followed in the present study, considering first all localities and subsequently analysing each identified genetic group. According to the genomic signals of association to environmental variables, adaptation to maximum salinity and temperature appeared as an important driver of the transition between Atlantic and Mediterranean basins, while temperature range associated signals were detected within the Mediterranean.
We also highlight how the use of the complete sequence information in the selected loci give a more detailed picture of genetic structure than using a single variable position (SNP) per locus.

| The Atlanto-Mediterranean transition
The major signal of genetic structuring detected in our data is the shift generated by the separation between Atlantic and Mediterranean populations. The existence of a genetic break associated with the oceanographic barrier to dispersion represented by the Almería-Oran front had been previously detected in P. lividus (Calderón et al., 2008;Duran et al., 2004) and is a common feature of many species spanning the Atlanto-Mediterranean transition (Pascual, Rives, Schunter, & Macpherson, 2017;Patarnello, Volckaert, & Castilho, 2007). However, our study unveils subtler patterns than previously found. The break between the Atlantic and the Mediterranean is not a clear-cut one. Rather, there is a transitional zone across the Gibraltar Strait through the Alboran Sea, as observed in other species (Pascual et al., 2016;Schunter et al., 2011). The Alboran Sea has a complex hydrology, with a system of gyres (Millot, 2005 suggests that the Atlanto-Mediterranean break is also a major driver for the overall genetic structuring caused by selection. This is reflected by the fact that these two outlier subsets correlate with the same environmental variables, such as maximum temperature and salinity. The impact of these variables in the structuring of the whole dataset is also found in the analysis of association loci with F I G U R E 3 Venn diagrams showing the number of candidate loci for adaptation identified using the different groupings and methodologies. The methods include "Out," as the detection of outlier loci based on F ST and "RDA" as the detection of association loci based on redundancy analysis. The groups include "All," by using all samples grouped by locality; "AvM," by using all samples grouped as belonging to either the Atlantic or Mediterranean clusters; "M" by all samples of the Mediterranean grouped by locality; and "A" by using all samples of the Atlantic grouped by locality Ion transport has been frequently related to osmoregulation (Guh, Lin, & Hwang, 2015;Marshall & Grosell, 2005). Environmental values obtained from our sampling sites showed drastic changes from west to east, correlating with well-known longitudinal gradients of oceanographic parameters (UNEP/MAP, 2012). Even at small geographic scales, salinity values can change from c. 36.2‰ at the Atlantic side of the Gibraltar Strait to over 37.5‰ at the east of the Almería-Oran front (Brankart & Brasseur, 1998;Hidalgo et al., 2015;Pascual et al., 2016). These gradients effectively pose selective pressures on a group known to have limited abilities to regulate their osmosis and ion concentrations (Binyon, 1972). Behavioural changes related to different salinities have been described in echinoderm species (Agüera, Schellekens, Jansen, & Smaal, 2015;Talbot & Lawrence, 2002) indicating that salinity may be a strong selective pressure for echinoderms. Interestingly, we found a locus under selection (in both the "all localities" outliers and the "Atlantic versus Mediterranean" outliers) that matched a receptor protein for egg jelly (REJ7, locus 1,361,733 in Supplementary Dataset). The REJ proteins are involved in sperm-egg binding (Gunaratne et al., 2007) and are thus potential sources of assortative mating dependent on compatible protein types and potential drivers for structuring or even speciation (Vilela-Silva, Hirohashi, & Mouro, 2008). For this reason, sequencing of some REJ proteins has been used to test for population structuring associated with egg-sperm recognition in sea urchins (Muller, Heyden, Bowie, & Matthee, 2012), in a manner similar to that described for the bindin proteins (Calderón, Turon, Turon, & Lessios, 2009 in all localities regardless of their situation in the distribution area indicating that Atlantic and eastern Mediterranean populations are not marginal and thus not more prone to genetic drift. The frequency patterns of these regional missing loci can be the result of historical processes, mutations appearing at a given locality and migrating towards neighbouring localities and further maintained by adaptive processes. The blast results pointed to several genes associated to ciliary movement, and once again to ion transport. As the same biological processes found in the candidate loci explaining the Atlanto-Mediterranean transition seem to also impact these regional missing loci, selection determining their frequency seems a plausible explanation. This within Mediterranean structuring also shows some within sea signals of the role of selection. The candidate loci found with either OAs and RDA when considering only the Mediterranean populations resulted on new sets of loci that are not driving the Atlanto-Mediterranean transition (Figure 3) and that showed a clearly different pattern of genetic variation (Figures 4d and 5b). The power of detecting outlier loci by OAs relies on the baseline F ST defined by neutral markers (Ahrens et al., 2018). Thus, the presence of a strong barrier to dispersion, such as the Almería-Oran front, increases the maximum values of F ST defined by the neutral differentiation and thus the threshold to define outliers ( Figure S5). A similar effect could be found while looking for association loci with RDA, as local environmental patterns may be obscured by general gradients when performing EAAs, and thus, the results are influenced by the scale and design of sampling (Ahrens et al., 2018). Consequently, the associated loci within the Mediterranean remained undetected until we analysed this region independently. Our results indicate that, in order to extract all information from the data, it is not enough to perform a candidate loci analysis considering all populations studied. Instead, a careful hierarchical analysis from the global, considering all localities, to the partitioned datasets, following genetic groups, allows a more in-depth study of the selective pressures acting between and within groups. In our case, the loci detected within the Mediterranean had BLAST hits that corresponded to functions of the cell cycle and DNA repair. These functions can be related to response to DNA damage (Bartek & Lukas, 2007;Ermolaeva & Schumacher, 2014). Which kind of stress induces these selective responses cannot be ascertained at present. In any case, it is remarkable that no lysosomal ion transport or organization functions were found, thus suggesting that within the Mediterranean, the role of salinity on the overall structuring might be much less important than when considering the whole distribution of the species, with temperature range emerging as more highly impacting population differentiation.

| Conservation and management implications
Population genomic results are key for conservation and management of ecologically relevant species. In the case of P. lividus, changes in the population dynamics can have a double-sided effect on benthic ecosystems. On one hand, it controls the abundance of macrophytes, so any increase in its densities can result in loss of habitat-forming algal species and generation of barren grounds (Agnetta et al., 2015;Sala et al., 1998). On the other hand, P. lividus at low densities can control opportunistic algal proliferation (Palacín et al., 1997;Bulleri, Benedetti-Cecchi, & Cinelli, 1999), generating patches for settlement of organisms and keeping the overall diversity. Thus, the density of this species should be finely tuned to keep an ecological balance, and human impacts are altering this balance in both senses. While overfishing of predators result in increased sea urchin densities in some areas, P. lividus populations face several threats in others. For instance, P. lividus is intensely harvested in many countries. Episodes of mortality have been attributed to global warming (Girard, Clemente, Toledo-Guedes, Brito, & Hernández, 2012;Yeruham et al., 2015), and the tropicalization of the Mediterranean may also be favouring the competing sea urchin Arbacia lixula to the detriment of P. lividus populations (Gianguzza et al., 2011;Pérez-Portela et al., 2019;Templado, 2014;Wangensteen, Dupont, Dupont, Casties, Turon, & Palacín, 2013;Wangensteen, Turon, Turon, Casso, & Palacín, 2013). Management of this key species requires knowledge of its genetic structure, connectivity and local adaptation. Population genomics has shown that genetic interchange occurs widely but has some limitations: the species is sensitive to dispersal barriers and displays isolation by distance. It has also been shown that a variety of selective pressures can be acting on the genetic differentiation of the populations, salinity changes likely prominent among them but also temperature at regional level.
In addition, although having a long PLD, there is not a panmictic population, and thus not a single genetic pool in the area studied.
Consequently, then, P. lividus could be more vulnerable than previously thought, and current population declines in large areas (Rilov, 2016;Yeruham et al., 2015) should raise concern and promote management measures.

ACK N OWLED G EM ENTS
We thank Emma Cebrian and Sandra Garcés for help with the specimens' collection, and Silvia Frias for help during DNA extractions. This work was supported by the Spanish Government projects ChallenGen CTM2013-48163 and PopCOmics CTM2017-88080 (MCIU, AEI/FEDER, UE). This is a contribution from the Consolidated Research Group "Benthic Biology and Ecology" SGR2017-1120 (Catalan Government).

DATA AVA I L A B I L I T Y S TAT E M E N T
Raw read data from all individuals will be submitted to an NCBI SRA Bioproject upon acceptance. A list of genotypes of the haplotype loci dataset for all individuals, including sampling location, will be available in an ARLEQUIN file as a Supplementary Dataset upon acceptance.