What DNA barcodes reveal: microhabitat preference, hunting strategy and dispersal ability drive genetic variation across Iberian spider species

The current rate of species loss calls for immediate actions to preserve biodiversity and ecosystem functioning. Cataloguing species richness and composition, and revealing how diversity is geographically distributed are the first steps towards designing efficient conservation strategies. Here, we aim to determine diversity patterns and potential drivers of taxonomic and genetic diversity and population structure of Iberian spiders. We used a community level perspective, analysing more than 3000 DNA barcode sequences representing ~370 spider species dwelling in white‐oak forest habitats across the Spanish National Park network. By combining and comparing morphological and DNA barcode‐based species delimitation methods, we assessed their performance and identified putative factors behind cases of incongruence. Our findings uncovered potential overlooked diversity as suggested by the geographic patterns of genetic variation and put a red flag on those taxa that may be undergoing overlooked evolutionary or ecological processes. Spider functional traits associated with foraging strategy, microhabitat preference, ballooning ability and circadian activity explained the observed patterns of population structure across species but did not explain variation in genetic diversity. Overall, our study represents a major step forward in the understanding of large‐scale diversity patterns in Iberian spiders at the community level and provides relevant information to guide future conservation strategies of the so‐far largely overlooked invertebrate diversity.


Introduction
Current biodiversity loss, mostly driven by human activities, is a major concern at the global scale (Barlow et al., 2016;Doherty et al., 2016;Socolar et al., 2019). Species are becoming extinct at a rate a 1000 times faster than the natural background rate of extinction (Pimm et al., 2014).
In this context, understanding diversity patterns and the processes that shape them is of utmost importance for developing more efficient conservation strategies. The first step towards this goal is cataloguing the species present at a given area. However, this may be a daunting task, especially for megadiverse and extremely abundant groups such as arachnids, insects and molluscs (Peters & Wassenberg, 1983;Chapman, 2009), whose morphologic identification is very time-consuming and requires the increasingly more difficult to find specialised taxonomic expertise. DNA barcoding approaches are widely used to overcome the limitations of identifying large numbers of specimens (Blagoev et al., 2015;Astrin et al., 2016) by using a small fragment of DNA as a species identifier. The animal DNA barcode is a fragment of around 658 bp from the cytochrome c oxidase subunit I (COI) mitochondrial gene, which has been shown to effectively discriminate between related species (Barrett & Hebert, 2005). DNA barcoding is accelerating diversity inventories by automatising the identification process and provides finer-scale taxonomic resolution than morphological identification. Moreover, DNA barcoding requires little taxonomic expertise when the reference DNA barcode library is already available and provides species names to different life stages or even remnants of individuals (Victor et al., 2009;Zeale et al., 2011). Also, DNA barcode libraries are essential for studies using high-throughput sequencing technologies, such as metabarcoding, which rely entirely on such libraries to assign species names to the obtained sequences (Hajibabaei et al., 2011;Yu et al., 2012).
DNA barcoding has not only been used for species identification but also for preliminary delimitation of species boundaries. Morphological circumscriptions may disagree with molecularbased delimitations. These incongruences may convey different specific ecological and evolutionary processes, such as the presence of cryptic species, deep population structure, introgression/ hybridization events, incomplete lineage sorting (ILS), or even misidentifications. Multiple criteria and approaches for delimiting species boundaries based on DNA barcodes or other single markers have been devised (Casiraghi et al., 2010). For instance, the distance-based 'DNA barcode gap' identifies thresholds or cut-off values comparing the differences between the lowest interspecific genetic distance and the highest intraspecific distance by conducting pairwise comparisons among all available sequences. Other more sophisticated species delimitation methods include the Barcode Index Number (BIN) system, which is an example of a genetic distance algorithms method implemented in the Barcode of Life Data System (BOLD) (https://www.boldsystems.org/index.php), whereas the Poisson Tree Processes method (PTP) (Zhang et al., 2013) and its multi-rate version mPTP (Kapli et al., 2017) are phylogenyaware approaches.
Intraspecific genetic diversity is a measure of the overall heritable variation within a species. Estimates of intraspecific genetic diversity are affected by demographic fluctuations in effective population size, mutation rates and coalescence time (Ellegren & Galtier, 2016;Salinas-Ivanenko & Múrria, 2021). For instance, habitat stability favours large effective population sizes and thus increases genetic diversity (Papadopoulou et al., 2009;Múrria et al., 2015). The population structure describes how intraspecific genetic diversity is spatially distributed within and across populations. Several factors can affect population structuring, such as population bottlenecks, life history, selection or gene flow (Charlesworth & Willis, 2009;Slatkin, 1987). Isolation by distance (IBD) analyses are commonly used to elucidate the spatial patterns of genetic variation (Schäfer et al., 2001;Pope et al., 2006). IBD is found when the relationship between genetic differentiation among populations and geographic distance is positive. Alternative patterns indicate that different underlying processes are acting on the group of study. For example, high levels of population genetic differentiation regardless of the geographic distance may uncover overlooked species, while the lack of genetic differentiation across large areas may indicate high levels of genetic flow as a result of high connectivity or dispersal ability (Fig. 1).
Spiders are among the most diverse orders of arthropods with 128 families and more than 49 000 species described (World Spider Catalog, 2020). Spiders play a key role as top predators of arthropods in most terrestrial ecosystems (Nyffeler & Birkhofer, 2017) and are in turn a valuable component of the diet of other predators such as birds (Ramsay & Houston, 2003). Because of their species-richness and abundance, morphological identification of spiders is a daunting task. Even more challenging, immature stages are usually not identifiable due to the lack of developed genitalic structures, which are the main source of species diagnostic characters. In this context, the use of DNA barcoding approaches for accelerating spider inventories has become increasingly popular (e.g. Astrin et al., 2016;Tyagi et al., 2019;Kennedy et al., 2020).
Spiders provide a diveristy of measurable functional traits, which may impact their genetic diversity and population structure. While some spiders are very poor dispersers and have restricted ranges, others are capable of long-distance dispersal. For instance, ballooning is a spider-specific strategy of passive aerial dispersal consisting in the use of a thread of silk as a parachute, which is then carried by wind currents and atmospheric electric fields (Morley & Robert, 2018). Several studies found ballooning dispersal behaviour to be related to small body sizes (Dean & Sterling, 1985;Greenstone et al., 1987) and vegetation structure (Blandenier, 2009). Dispersal by ballooning is associated with high gene flow and weak population structure, even in mountainous landscapes (Botham et al., 2020) and can influence the occurrence and the composition of spider communities (Bonte et al., 2003a(Bonte et al., , 2004Jiménez-Valverde et al., 2010). In contrast, the level of habitat specialisation seems to be negatively related to the ballooning ability (Bonte et al., 2003b), as a high tendency to balloon increases the risk of reaching potential unsuitable habitats and therefore should increase population structuring. Habitat specialisation has also been found to have a weak or no contribution to explain variation in distribution patterns (Bonte et al., 2004).
Although several studies have investigated genetic diversity and population structure in spiders for describing phylogeographic processes (Ramírez & Chi, 2004;Settepani et al., 2014), they mostly focused on a single or few species. Conversely, phylogenetic studies usually consider large numbers of species, but the intraspecific analyses are either weak or absent. The novelty of our study is to relate the geographic distribution of intraspecific genetic variation to functional traits using a multi-species approach. Therefore, our purpose is not to taxonomically redefine the species studied, but to put a red flag on those groups that may be undergoing interesting evolutionary or ecological processes and whose taxonomy would be worth revising. Here, we aim to determine the diversity patterns at the community level and the drivers of taxonomic and genetic diversity of Iberian spiders. The specific objectives are (i) to build a DNA barcode reference library of Iberian spiders to facilitate and automatise spider identification; (ii) to assess the ability of molecular tools used at the community level to delimit Iberian spiders species compared to morphology, and to investigate cases of incongruence; (iii) to uncover morphologically overlooked diversity (i.e. cryptic diversity) assessing geographic patterns of genetic variation; and (iv) to determine relations between spider functional traits and patterns of intraspecific nucleotide genetic diversity and population structure across species.

Samplings and study area
We collected the studied spiders in May-June 2013 and 2014. The sampling design consisted in 16 1 -ha plots distributed in white oak forests in six national parks across the Iberian Peninsula (Fig. 2). We used the semi-quantitative COBRA (Conservation Oriented Biodiversity Rapid Assessment) sampling protocol (Cardoso, 2009) for capturing spiders. COBRA combines timed direct capture, foliage beating, vegetation sweeping and pitfall trapping (active for 2 weeks) (see details in the study by Crespo et al., 2018).

DNA extraction, amplification and sequencing
We stored the spiders in absolute ethanol at À20 C, and we used a ZEISS Stemi 2000 stereomicroscope and the 'Araneae: Spiders of Europe' database to identify most of the species (Nentwig et al., 2017). We performed individual DNA extractions and amplified a 658 bp fragment of the animal DNA barcode, the mitochondrial gene cytochrome c oxidase subunit I (COI), by PCR as detailed in the study by Crespo et al. (2018), using LCO1490 and HCO2198 (Folmer et al., 1994) as the referred combination of primers. PCR products were cyclesequenced in both directions at Macrogen Inc. (Spain). There was no evidence of nuclear copies of mitochondrial DNA (NUMTs), and in the cases with presence of non-target sequences (proteobacteria or contamination), we repeated the amplification and sequencing. We uploaded all sequences to BOLD and Genbank (Bensons et al., 2017) (accession numbers MH630465-MH630994, MT215600-MT215716, MT607653-MT607961, MW996898-MW999147 and MZ824404).

Species delimitation analyses
To assess the ability of our DNA barcode library to unequivocally assign a sequence to a particular species, we performed an initial barcode gap analysis for every species by comparing the highest intraspecific uncorrected P-distance to the lowest interspecific (nearest neighbour) distance using MEGA7 v7.0.26 (Kumar et al., 2016). The existence of a barcode gap, i.e., a separate distribution of the frequency of intraspecific and interspecific distances, guarantees that unknown sequences can be readily assigned to a species. Alternatively, a small or absent barcode gap blurs species boundaries, making it difficult or impossible to associate a given sequence to a unique species. Albeit fast, the DNA barcode gap approach may suffer from inaccuracy or lead to spurious results (Spooner, 2009;Elias et al., 2009;Hamilton et al., 2014). Xysticus ferrugineus Menge, 1876, Alopecosa cuneata (Clerck, 1757) and Phrurolithus nigrinus (Simon, 1878) contained only one individual so their barcode gap could not be computed.
We also performed molecular species delimitation analyses using two contrasting methods, namely the graph-theory-based BIN system and the character-based mPTP model. BIN follows a Refined Single Linkage (RESL) algorithm combined with Markov clustering. Taking into account all records in the BOLD database, all sequences >500 bp were automatically assigned to BIN clusters, which included highly similar sequences and represent molecularly delimited candidate species. Alternatively, new BINs were created if a good match was not found (Ratnasingham & Hebert, 2013). The mPTP delimitation relies on the relative branch length information provided by a gene tree to identify a transition point between inter-and intra-specific divergences, i.e., distinguishing putative speciation from coalescence events. We inferred a maximum-likelihood gene tree from all the COI sequences. We aligned them using the online version of MAFFT v7 (Katoh et al., 2019) with the G-ins-I algorithm. We used the IQ-TREE software v1.6.1 (Nguyen et al., 2015;Chernomor et al., 2016) to infer a maximum likelihood tree using the edge-linked partition model and the GTR + I + G model of nucleotide substitution, as it is the most complex and parameter-rich model and the large number of sequences allowed for a correct estimation of the parameters. The analysis was partitioned by codon position. We assessed branch support using ultrafast bootstrap with 1000 replicates. We ran phylogenetic analyses remotely at the CIPRES cluster (Miller et al., 2010), and we ran the mPTP model locally using the programme mPTP v 0.2.4 (Kapli et al., 2017, available at https://github.com/Pas-Kapli/mptp). We first identified the minimum branch length (0.000888), which was subsequently used to delimit clusters and assess their support by means of three Markov Chain Monte Carlo sampling chains ran for 50 million generations, discarding the first 2 million MCMC steps as burnin.
We compared the morphologically defined species to the results of the two molecular delimitation methods. We classified molecular clusters in 'match', 'split', 'merge' and 'mixture' depending on whether they assigned specimens to the same morphological species, split specimens of the same species into more than one cluster, included all the specimens of more than one species, or partially combined specimens from different species, respectively.

IBD analyses
We assessed patterns of IBD for all species involved in 'splits' in both BIN and mPTP by plotting the uncorrected pairwise genetic distance between pairs of individuals against their geographic distance. For this analysis, we assigned unassigned specimens due to their short sequence length to their closest BIN based on the topology of the COI tree. This was the case of Eratigena picta (Simon, 1870) MD321, which was assigned to the BIN BOLD:ADM0499, Ero aphana (Walckenaer, 1802) MD323 to BOLD:AAO2255, Paidiscura pallens (Blackwall, 1834) MD1405 to BOLD:AAO0598, Paidiscura pallens MD1410 to BOLD:AAO0598, Zodarion styliferum (Simon, 1870) MD3518 to BOLD:ADM0571 and Zodarion styliferum MD3519 to BOLD:ADM0571.
We used IBD to assess whether the groups resulting from cases of molecular split of morphologically delimited species were randomly distributed across sites, reflected population structure within a species or, alternatively, supported the existence of co-occurring independent evolutionary lineages (incipient species). To do so, we assigned each morphospecies to 'allopatric', 'sympatric' or 'no-speciation' split categories based on the patterns recovered in their IBD plots (see Fig. 1). In the cases of allopatric or sympatric splits, specimens were revised in search for possibly overlooked morphological differences.

Genetic diversity, population structure and functional traits
We ran the analyses of intraspecific genetic diversity and population structure only for species with five or more individuals captured in at least two parks (Table 1). For the intraspecific genetic diversity, we calculated the nucleotide diversity (π) of each species as the average number of nucleotide differences per site with the nuc.div function in the package 'pegas' (Paradis, 2010) in R v.3.6.3 (R Core Team, 2020). We assessed the level of genetic structure among populations (Phi PT ) by means of an Analysis of MOlecular VAriance (AMOVA) performed also with 'pegas'. We used the package 'ape' (Paradis & Schliep, 2018) to compute the input matrix of Euclidean genetic distances for the AMOVA analysis, setting the model to raw and pairwise deletion to false and we assessed the significance by performing 10 000 permutations. Negative AMOVA values (intra-population differentiation higher than inter-population differentiation) can have different interpretations, but they are usually the result of artefacts due to relatively small sample sizes and we treated them as 0 (Meirmans & Hedrick, 2011).
We investigate the relationship between intraspecific genetic diversity and nine functional traits. These traits included four morphological measurements, namely (i) width and (ii) height of the prosoma and (iii) length of the fang and (iv) of the first tibia. We made the latter two proportional to prosoma width to avoid correlation, and we standardised all morphological traits by substracting the mean and dividing by standard deviation. The remaning traits corresponded to five ecological traits, namely (i) the dispersal ability by ballooning (three categories), (ii) circadian activity (diurnal or nocturnal), (iii) foraging strategy (seven categories), (iv) trophic specialisation (generalist or specialist) and (v) preferred microhabitat or vegetation layer as inferred by the capture methods (four categories from ground  Table S1 for trait codification, and see Macías-Hern andez et al., 2020b for trait relevance and justification). We used the R package 'ade4' (Dray & Dufour, 2007) to generate a Gower dissimilarity matrix to which we applied a principal coordinates analysis (PCoA) to represent the overall functional space including all species. We controlled for the effect of phylogenetic relatedness in statistical models using the time-calibrated phylogeny in the study by Macías-Hern andez et al. (2020a). This phylogeny was inferred from six genes and topological constraints based on transcriptome information (Fernandez et al., 2018) for 1450 species of spiders, including those in the present study. We pruned the phylogenetic tree retaining only our species of interest using the function drop. tip in the package 'ape'.
In the PCoA, we tested the phylogenetic trait conservatism with Pagel's lambda (λ) (Pagel, 1999) using the position of each species on the PCo1 and PCo2 axes with the phylosig function in the R package 'phytools' (Revell, 2012). Values of λ close to 1 indicate that there is a strong phylogenetic signal, while values close to 0 indicate a low phylogenetic signal (i.e. trait states are more randomly distributed across the phylogeny). We assessed the Spearman correlation of each trait on the PCo1 and PCo2 axes using the cor function in the R package 'stats' (R Core Team, 2020). We ran phylogenetic generalised least squares models (PGLS) using the function pgls in the R package 'Caper' (Orme et al., 2013) to test the correlation between population structure Phi PT (obtained from AMOVA analyses) and the position on the PCo1 and PCo2 axes, as well as between the nucleotide diversity π and the position on the PCo1 and PCo2 axes, by taking into account the phylogenetic non-independence between species.

Results
We successfully sequenced 3207 barcodes corresponding to 1344 distinct haplotypes. They belonged to 371 morphospecies out of the 377 (98.4%) previously identified using morphology (see Crespo et al., 2018 for details) and to 39 families (70% of those reported in the Iberian Peninsula). The morphospecies in the dataset were represented by nine individuals on average and 75 of them (20.2%) were singletons. Most haplotypes were exclusive of a single individual, but 499 haplotypes were recovered in more than one individual. The number of individuals sharing the same haplotype ranged from 2 to 47 (in Theridion harmsi Wunderlich, 2011). Most species showed a clear barcode gap (Supporting Information Fig. S1), insofar as the average maximum intraspecific divergence was 2.07%, while average distance to the nearest neighbour was 9.73%.

BIN analysis
Only 95 COI sequences could not be assigned to any BIN because they did not comply with the minimum length requirements, which left 13 species and four families out of this analysis. The BIN system clustered the remaining Graphic summary of the ecological traits. Silhouettes depict some traits used in this study, namely dispersal by ballooning, preferred microhabitat or vegetation layer and foraging strategy. Foraging strategy includes five types of web (orb webs, space webs, sheet webs, tube webs and sensing webs such as those of trap-door species) and two types of non-web hunting strategies (ambush hunters and active hunters). Spiders and webs are not to scale. sequences into 441 BINs, in contrast with the 358 identified morphospecies. The correlation between the number of BINs for a species and the number of specimens analysed was significant, but weak (r s = 0.1, P < 0.001). The BIN/ morphospecies ratio for the 35 families ranged between 1 and 2, with an average of 1.3. There was a perfect match between 269 morphospecies (75.1%) and a particular BIN, while 70 morphospecies (19.6%) were split into more than one BIN. The average number of BINs per species in 'split' cases was 2.1, ranging from 2 to 6. The cases of 'merge' were uncommon, affecting only 16 morphospecies (4.5%) and in all the cases, they involved pairs of species. There was one single case (0.8%) of 'mixture', which involved specimens of three closely related species of the Theridion melanurum Hahn, 1831 species complex, which shared six BINs. Isolation by distance (IBD) plots of the genetic distance (%) in relation to the geographic distance (km) of representative morphospecies split in both delimitation methods. The plots illustrate cases of allopatric patterns (a and b), sympatric patterns (c) and no apparent geographic pattern (d). Intraand inter-cluster distances according to BIN delimitation are represented with triangles and circles, respectively. Intra-and inter-cluster distances according to mPTP delimitation are represented in blue and orange, respectively.

mPTP analysis
The inferred phylogenetic tree (Fig. S2) was well resolved, with supports being in general higher towards the tips. For the 34 families represented by more than one taxon, 24 were recovered as monophyletic. At the morphospecies level, all except 16 were monophyletic. The mPTP model sorted the 3207 sequences (371 morphospecies) into 365 clusters. Again, the correlation between the number of clusters and the number of sampled specimens per species was significant, but weak (r s = 0.07, P < 0.001). The ratio between the number of clusters and morphospecies for the 39 families ranged from 0.25 to 1.71, with an average of 0.9. Cases of 'match' represented 66.6% of the morphospecies, whereas 'merges' (24%) and 'splits' (8.9%) were less common, and 'mixtures' were marginal (0.5%). The number of cases of 'merge' for the mPTP model were higher than in BIN analyses. These 'merge' cases were found in 26 pairs, seven triads, two tetrads and one pentad. However, 15 of the 'merge' cases included morphospecies from different genera, and three of them were even from different families, which hints to some sort of artefact. The mPTP model found fewer 'splits' than the BIN delimitation, affecting 33 morphospecies that were divided in two or more clusters. The average number of clusters per species in cases of 'split' was 2.3, ranging from 2 to 6. The mPTP model found one single case of 'mixture', which surprisingly involved the distantly related linyphiids Microneta viaria (Blackwall, 1841) and Neriene clathrata (Sundevall, 1830).

IBD analyses
The total number of morphospecies consistently involved in 'split' cases in both BIN and mPTP methods was 27. In general, all these morphospecies were split in the exact same way by both delimitation methods, except for: Aelurillus luctuosus (Lucas, 1846), Drassodes lapidosus (Walckenaer, 1802), Eratigena picta, Evarcha falcata (Clerck, 1757) and Segestria senoculata (Linnaeus, 1758). Eleven out of the 27 morphospecies showed an 'allopatric' pattern ( Fig. 1), whereas 13 species showed a 'sympatric' pattern. The remaining three species did not conform to any of the former patterns. Figure 4 illustrates the IBD plots of four specific cases (see Supplementary Material Fig S3 for the IBD plots of all 27 species). Eratigena montigena (Fig. 4a) and Phrurolinillus tibialis (Fig. 4b) were examples of geographic divergence and allopatric splits, with different genetic clusters corresponding to different localities. For E. montigena, the pairwise distances between distant localities (5.5-7.2%) were above the average maximum intraspecific divergence (2.07%), while they were low within the same or nearby localities (0-0.9%) and corresponded to intracluster distances. Similarly, the pairwise distances of P. tibialis within each plot ranged from 0% to 0.5%, while the distances between its two populations, barely separated by 28 km, were much higher (6.7-7.1%). The species Nuctenea umbratica exemplified a pattern of 'sympatric' divergence (Fig 4c), with low within-cluster (0-1.4%) but high between-cluster pairwise divergences (10.6-11.2%) found within the same locality. Finally, Drassodes lapidosus provided an example of the absence of a clear pattern in the distribution of genetic variability (Fig 4d).
For 67 species, at least five individuals were distributed in at least two parks. The unidentified morphospecies Lathys sp.1 lacked some functional trait information and was removed from further analyses. The remaining 66 species were used for testing the relationship between functional traits and intraspecific genetic variation. PGLS models found a significant negative correlation between genetic population structure (Phi PT ) and PCo1 (estimate = À0.56, P = 0.008, r 2 = 0.09). This result indicates a negative gradient of genetic structuring from spiders that live on the ground level and are active hunters to spiders that build space webs and live in higher layers of vegetation. Similarly, we also found a positive correlation between genetic population structure and PCo2 (estimate = 0.65, P = 0.04, r 2 = 0.05). In this case, the positive gradient goes from diurnal to nocturnal species, which are poor dispersers and show high genetic structuring (Fig. 6). Nucleotide diversity (π) was not significantly correlated with either PCo1 (P = 0.08) or PCo2 (P = 0.72).

A barcode library for Iberian spiders
Our study is the largest DNA barcoding study of spiders performed in the Iberian Peninsula to date (3207 barcodes generated) and contributed DNA barcodes of 133 species not previously available. The 371 species for which we provide DNA barcodes represent a quarter of the total number of species recorded in the Iberian Peninsula (Branco et al., 2019). This is especially remarkable given that the Iberian spider diversity is exceptionally large, Spain ranking third among the most diverse countries in Europe, after France and Italy (Nentwig et al., 2017). In addition, Iberian spiders (and communities) are highly endemic (Branco et al., 2019;Malumbres-Olarte et al., 2020). Almost all the analysed species (98.9%) possessed haplotypes that were unique to them, which highlights the need of a local DNA barcode reference library for the effective identification of Iberian species. Moreover, the majority of morphospecies (94%) presented a barcode gap, which is critical for species identification and delimitation by DNA barcoding. In the few cases where our DNA barcode library was unable to provide a species-level identification, it still narrowed it down to two closely related species. These results support the ability of a DNA barcode library to provide accurate identifications for Iberian spiders.
The COI phylogenetic tree (Fig. S2) was well resolved and supported at the species level, with only 16 out of the 371 morphospecies recovered non-monophyletic. As expected from a fast-evolving single mitochondrial marker (Bidegaray-Batista & Arnedo, 2011), deeper nodes were poorly resolved and less supported.

Comparative performance of alternative single markers approaches to species delimitation
Most of the morphospecies matched a single molecular cluster, approximately three quarters in BIN and two-thirds in mPTP. However, while the BIN approach had a 'splitter' tendency (358 morphospecies in 441 BINs), the mPTP model tended to increase the 'merge' cases, generating fewer clusters than predefined morphospecies (371 morphospecies in 365 mPTP clusters). Single-locus species delimitation methods are known to overestimate species diversity because they likely reflect population structure rather than actual species (Carstens et al., 2013;Hawlitschek et al., 2018;Miralles and Vences, 2013). However, the BIN approach has been found to perform better in terms of correspondence with morphologically delimited species (Blagoev et al., 2013;Huang et al., 2019). If one assumes that morphological variation accurately reflects genome-wide divergence (Muster and Michalik, 2020), then the BINs approach is the single-gene delimitation method that provides a more accurate estimate of the actual species diversity. Also, the proportion of splits in the BIN approach (19.6%) was very similar to the values reported in a similar study conducted on Canadian spiders (19.3%) (Blagoev et al., 2015).
Conversely, the cases of BINs with merged species were infrequent and biologically feasible, inasmuch all the species pairs merged were extremely similar and only recognisable by subtle genitalic differences (Crespo et al., 2018). For example, females of the species Agyneta rurestris and A. pseudorurestris are morphologically indistinguishable, and males only differ by a little tooth on the lamella of the copulatory bulb. In fact, the specific status of these species has been questioned (Bosmans et al., 2010). In contrast, the mPTP model merged morphospecies that were clearly diagnosable. Several studies conducted in organisms such as golden orb-web spiders (Čandek et al., 2020), trapdoor spiders (Huey et al., 2019) or frogs (Correa et al., 2017) have reported a tendency of the mPTP model to overmerge. Surprisingly, in a few cases, the mPTP model clustered together morphospecies from different genera or even families for unknown reasons. The PTP methods have been shown to be sensitive to factors such as the accuracy of the input tree, population sizes, divergence times or the ratio of population size to divergence time, and to ongoing gene flow (Luo et al., 2018). In our study, the cases of merged morphospecies could be partially explained by the assumption of reciprocal monophyly inherent to the tree-based methods of species delimitation (Fujisawa & Barraclough, 2013), such as mPTP. For instance, the single case of mixture in the mPTP model involved Neriene clathrata, whose individuals were nested within those of Microneta viaria, rendering the last species paraphyletic. The mPTP split M. viaria in six MOTUs, one of them also containing N. clathrata sequences.
Finally, the only case of 'mixture' using the BIN method involved three species of cobweb spiders that have been recently shown to have been involved in old introgression events .

Uncovering geographic patterns of intraspecific genetic variation across species
A total of 27 morphospecies were consistently split by both BIN and mPTP methods. Interestingly, none of those species were frequent ballooners (see Carvalho & Cardoso, 2014), about half were rare ballooners, and the rest were classified as occasional ballooners. Ballooning is the main means of long-range dispersal in spiders, so it is not surprising that most of the species presenting distinct intraspecific population structure showed a low tendency to balloon.
A negative correlation between morphological disparity and genetic differentiation could hint at the potential existence of cryptic species (Struck et al., 2018). Although single molecular markers may not accurately reflect genome-wide patterns of genetic differentiation, they may identify situations that deserve further scrutiny. Here, we focused on the cases of splits of single morphospecies into two or more genetic clusters, using IBD plots to reveal the underlying mechanisms of the geographic patterns of genetic distribution. Distinguishing those mechanisms is critical because genetic differentiation in overlapping (sympatric) taxa may indicate reproductive isolation, while genetic divergence between geographically or ecologically distant populations may be better interpreted as population structure rather than species differences.
We identified 13 sympatric and 11 allopatric splits, while in three cases, there were no clear geographic patterns of intraspecific genetic diversity distribution. Among the sympatric splits, we found examples of potential overlooked species. For instance, we observed a single female of the orb weaver N. umbratica co-occurring with other conspecifics yet showing high genetic divergence (Fig. 4c). This specimen turned out to be the smallest individual (3.6 mm of prosoma length, the rest ranging from 3.9 to 5.8) and showed slight differences in the epigyne. Other examples of potential overlooked species were found among splits that showed allopatric patterns. For example, we observed that populations of Eratigena montigena from the northern parks (Picos de Europa) split into a different cluster from those in the southern parks (Monfragüe and Cabañeros). Morphological re-examination revealed variation in the male palp tibial apophysis and the shape and thickness of the distal end of the conductor that matched the north-south split. Further sampling in intermediate regions and additional sequencing of nuclear genes would be required to confirm the existence of two recently diverged species.
Allopatric differentiation was also found in Sierra Nevada, which suggests a potential role of this mountain system as a phylogeographic break. The guardstone spider Phrurolinillus tibialis and the ant spider Selamia reticulata split into different clusters corresponding to the southern (PS1) and northern (PS2) slopes of Sierra Nevada (in the case of Selamia, the last cluster also included individuals from the central parks). Although the two plots were barely 30 km apart, they were separated by elevations above 2500 m, which may constitute an insurmountable barrier for the dispersal of these taxa.

Population structure is associated with functional traits
Although genetic diversity (i.e. nucleotide diversity of mtDNA) and functional traits were decoupled, we uncovered a significant correlation between those traits and population structure. In general, active hunters, ground-dwellers and nocturnal species tend to be more genetically structured, while species living in higher layers of the vegetation, those mostly using space webs to capture prey, those active at daylight and good ballooners, were less genetically structured. Our analyses revealed two groups of highly correlated functional traits. The first group of traits included the active hunter and space web foraging strategies and preferred microhabitats 'ground level', 'herbaceous' and 'very low canopy'. The fact that the active hunting strategy co-varied with the ground dwelling habitat probably reflects a higher efficiency of this foraging strategy in covering larger areas in the ground. Conversely, space webs seemed to co-vary with 'herbaceous' and 'very low canopy' microhabitats found at the medium vegetation level. Although some webs such as those of some theridiids are located almost at ground level and have threads that attach to the ground and that aim at wandering prey, most spatial webs are built at a certain distance from the ground and are intended to capture flying insects (Cardoso et al., 2011). The second group of traits included the circadian activity and the dispersal ability by ballooning. A higher tendency to balloon was associated with diurnal spiders, possibly because high air temperatures and low wind velocities are particularly important for the initiation of ballooning dispersal (Vugts & Van Wingerden, 1976;Reynolds et al., 2007). Although whether wind speed is higher or lower during daytime is not clear and might depend on the season (Pérez et al., 2004;Emeis et al., 2007), temperature is usually warmer during daylight, which may explain the association between ballooning and diurnality. We did not recover a significant relationship between ballooning tendency and small body size. This is hardly surprising inasmuch most studies supporting this relationship were based on the measurements of the individuals dispersing by ballooning, in many cases including immatures (Dean & Sterling, 1985;Greenstone et al., 1987), while we used the measurements of the adult stage of each species. Ballooning dispersal is not restricted to spider species with a small adult size because the juvenile stages of spiders with large adult sizes can also balloon (Humphrey, 1987;Bonte et al., 2003b).
The two groups of highly correlated functional traits showed significant phylogenetic signal. These results are congruent with the observation that most of these traits are known to be highly conserved at the genus or even family levels (Cardoso et al., 2011).
The deep population structure recovered in active ground dwelling hunters could be explained by the high habitat fidelity of ground species. Such habitat fidelity would result in more fragmented suitable habitat patches, and consequently less connected populations, as has been observed in the purse-web spider Atypus affinis Eichwald, 1830 (Pétillon et al., 2012). In fact, Malumbres-Olarte et al. (2020) found ground-dwelling species to have higher levels of endemicity than species living higher in the vegetation. Interestingly, the ballooning capacity was found to be positively correlated with the PCo1, just like vegetation layers 'herbaceous' and 'very low canopy' and opposite to the tendency to live on the ground. Blandenier (2009) found ground-dwelling spiders to be present in ballooning captures almost in the same proportion as spiders living in the vegetation, although ground spiders were rare in the case of species from forested areas. Bell et al. (2005) suggested that species living at ground level may be less exposed to wind and convective currents compared to spiders living higher in the vegetation, but it is also possible that ground-dwelling spiders simply climb to higher vegetation layers when they need to disperse by ballooning. Also, spider populations appear to be more genetically structured in species that are nocturnal and do not use ballooning for dispersal and less genetically structured in species active at daylight. There is little evidence of a direct connection between the circadian activity and population structure. Most likely, whether spiders are diurnal or nocturnal explains population structure indirectly by affecting the ballooning ability. As the main means of long-range dispersal in spiders, ballooning is known to contribute importantly to the connectivity and gene flow among their populations (Bonte et al., 2003a;Dimassi et al., 2016, but see Ramirez & Haakonsen, 1999), thus affecting the levels of genetic population structure and the level of endemicity (Malumbres-Olarte et al., 2020). Although many spiders are able to disperse by walking, mostly in search of mating partners, their impact to connect populations at the geographical scale of this study (from 0.11 to 721 km) is probably negligible. The positive relationship between ground dwellers and population structure was further supported by the fact that species involved in 'splits' mostly included poor or no ballooners, so the lower levels of airborne dispersal would contribute to reducing the gene flow between populations.
Our former discussion was based on the assumption that mtDNA information accurately reflects population history and demography, which may not always be true. Mitochondrial genes, such as COI, are maternally inherited, so they may only reflect the population structure of females in species with different dispersal behaviours between sexes. Additionally, ecological and evolutionary processes such as incomplete lineage sorting or hybridization may negatively affect the results of single-gene population structure analyses, which could be minimised by using additional markers. Also, the modest number of localities and of individuals sampled per species could also affect the estimates of genetic diversity and population structure and ultimately the mechanisms described therein.

Conclusions
Here, we present the most extensive DNA barcoding study conducted on Iberian spiders to date. The assembled DNA barcode library will not only ease the identification of single specimens, but it will also facilitate the automatisation of the identification of bulk samples using high-throughput sequencing (HTS) approaches (e.g. metabarcoding, genomic skimming) for largescale cataloguing projects. Because of the emergency for the rapid inventory and monitoring of the declining biodiversity, the use of HTS approaches is becoming more widespread. However, morphologically validated reference sequences are still essential to secure identification and to link the wealth of genetic data to name-based heritage information. Here, the distancebased delimitation method (BIN) outperformed the characterbased method (mPTP) in terms of matching to morphospecies reference. The patterns of geographic mtDNA variation revealed by IBD plots allowed us to uncover instances of overlooked diversity but also of potential phylogeographic breaks. Our multi-taxon comparison of mtDNA variation suggested that functional traits of spiders, specifically active hunting and ground dwelling, as well as nocturnal activity and ballooning ability may partially explain population structure. The largescale patterns of genetic variation in spider communities across the Spanish National Park network revealed in our study and their relationship to species traits will contribute important information to evaluate and guide future conservation and management strategies. thank Luis C. Crespo for contributing insights into the interpretation of the results, and Xavier Vivancos for providing valuable advice on the analyses with R. The authors thank the two anonymous reviewers whose comments/suggestions helped improve and clarify this manuscript. MD was supported by an APIF predoctoral grant from the Universitat de Barcelona (2017). This study was funded by the Organismo Aut onomo de Parques Nacionales (OAPN #485/2012) to MA. Additional support was provided by 2017SGR73 from the Catalan Government.

Conflict of Interest
All the authors declare that they have no conflict of interest. There are no disputes over the ownership of the data presented in the paper and all contributions have been attributed appropriately.

Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon request.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.

Fig. S1
Barcode gap for the 371 species of Iberian spiders with two or more sampled individuals. Maximum intraspecific divergence is plotted against nearest-neighbour distance. Grey triangles represent morphospecies consistently merged in both BIN and mPTP, while black circles show the rest of possible combinations. Points above the diagonal line indicate species with a barcode gap.  Isolation by Distance (IBD) plots of the genetic distance (%) in relation to the geographic distance (km) of all morphospecies split in both delimitation methods. Intra-and intercluster distances according to BIN delimitation are represented with triangles and circles, respectively. Intra-and intercluster distances according to mPTP delimitation are represented in blue and orange, respectively.
Table S1 Categorization of the traits used in this study and their corresponding states.