Biodiversity, biogeography, and connectivity of polychaetes in the world's largest marine minerals exploration frontier

The abyssal Clarion‐Clipperton Zone (CCZ), Pacific Ocean, is an area of commercial importance owing to the growing interest in mining high‐grade polymetallic nodules at the seafloor for battery metals. Research into the spatial patterns of faunal diversity, composition, and population connectivity is needed to better understand the ecological impacts of potential resource extraction. Here, a DNA taxonomy approach is used to investigate regional‐scale patterns of taxonomic and phylogenetic alpha and beta diversity, and genetic connectivity, of the dominant macrofaunal group (annelids) across a 6 million km² region of the abyssal seafloor.


| INTRODUCTION
In the abyssal depths (3000-6000 m) of the central and eastern Pacific Ocean lies the world's largest exploration frontier for marine minerals, the Clarion-Clipperton Zone (CCZ). The CCZ is a 6 million km² region between the Clarion and Clipperton Fracture Zones, characterised by areas with high densities of polymetallic nodules that are rich in a range of industrially important metals, such as the battery metals cobalt and nickel. As of 2022, there are 17 active exploration contracts for marine minerals in the CCZ, regulated by the International Seabed Authority (ISA). The ISA has also approved 13 Areas of Particular Environmental Interest (APEIs) as part of the Regional Environmental Management Plan (REMP) (ISA, 2021; Lodge et al., 2014).
Commercial-scale nodule mining is predicted to have significant impacts on abyssal benthic ecosystems (Washburn et al., 2019). Direct impacts include the removal and destruction of biota with associated habitat loss (Jones et al., 2017), while indirect impacts include modification of the water column and sediment through the generation of sediment plumes (Gillard et al., 2019; Smith et al., 2020). It has been known for some time that biodiversity at the abyssal seafloor is surprisingly high, given the low food availability, compared to other sedimented marine habitats (Hessler & Jumars, 1974). Previous local and regional studies of the dominant polychaete fraction of the abyssal Pacific macrofauna (Glover et al., 2002) supported the idea that species diversity in the Pacific was higher than in other regions, but a lack of genetic data has precluded accurate studies of regional diversity and connectivity, which are essential for the further development of the ISA REMP (ISA, 2021).
Despite the growing importance of the CCZ, only a limited number of studies have attempted to place the biodiversity of the abyssal Pacific into a broader global context (Christodoulou et al., 2019; Cordier et al., 2022; Glover et al., 2002; Lejzerowicz et al., 2021).
A recent analysis of polychaete and crustacean diversity from published quantitative studies showed wide variation in both abundance and diversity, likely driven by sampling biases as well as environmental variables (Washburn, Menot, et al., 2021). A major issue is the lack of consistent taxonomies, which arises because most of the species are undescribed and no regional field guides or identification keys are available (Glover et al., 2018). The ability to rapidly obtain DNA barcodes from the studied specimens and make the data available in open repositories (Glover et al., 2016) is now enabling these comparisons to be made, whilst open taxonomic works (e.g., Wiklund et al., 2017, 2019) should eventually enable morphological field guides to be created. Alongside this improvement in taxonomic methods, it is now also becoming possible to study other facets of abyssal diversity, such as beta diversity (Brix et al., 2020; Christodoulou et al., 2020) and phylogenetic diversity (Christodoulou et al., 2019; Macheriotou et al., 2020). Beta diversity, as it was originally defined, is species turnover, typically over an environmental or spatial gradient (Whittaker, 1972). It is now commonly apportioned into true species turnover (replacement of species) and nestedness (in which assemblages represent a subset of a broader species pool) (Baselga, 2010). These new analyses are providing data on regional shifts in species composition and lineage turnover, which are critical for setting conservation priorities and for evaluating the ecological and evolutionary dynamics of communities in an ecosystem where the mechanisms supporting high diversity are still poorly understood (McClain & Schlacher, 2015).
… biodiversity. Connectivity analyses were based on haplotype distributions for a subset of the studied taxa.
Results: DNA taxonomy identified 291-314 polychaete species from the COI and 16S datasets respectively. Taxonomic and phylogenetic beta diversity between sites were relatively high and mostly explained by lineage turnover. Over half of pairwise comparisons were more phylogenetically distinct than expected based on their taxonomic diversity. Connectivity analyses in abundant, broadly distributed taxa suggest an absence of genetic structuring driven by geographical location.
Main Conclusions: Species diversity in abyssal Pacific polychaetes is high relative to other deep-sea regions. Results suggest that environmental filtering, where the environment selects against certain species, may play a significant role in regulating spatial patterns of biodiversity in the CCZ. A core group of widespread species have diverse haplotypes but are well connected over broad distances. Our data suggest that the high environmental and faunal heterogeneity of the CCZ should be considered in future policy decisions.

KEYWORDS
abyssal, beta diversity, biodiversity, biogeography, connectivity, deep-sea mining, phylogeny, polychaeta, polymetallic nodules, population genetics

The traditional hypothesis for abyssal sites is that there is a broadly distributed regional species pool of 'cosmopolitan' taxa for which turnover is low or driven by local-scale niche partitioning, together with a very long list of extremely rare taxa that appear to have high turnover with distance (Glover et al., 2002; McClain, 2021). Increased use of genetic barcoding has, however, shown many of these 'cosmopolitan' species to actually be cryptic species complexes (e.g., Havermans et al., 2013). This traditional biogeographic hypothesis has been shown to be the case in the CCZ, with geographic distance driving species turnover in polychaetes (at least in rare species), with nestedness remaining low. Recent molecular work in the CCZ has also revealed high levels of phylogenetic diversity in abyssal brittle-stars (Christodoulou et al., 2019), and high phylogenetic rarity in nematode assemblages (Macheriotou et al., 2020).
Population genetic studies are likewise key to providing insights into the drivers of biodiversity patterns in the abyss. There are only three original studies of soft-sediment dwelling macrofaunal connectivity in deep-sea systems, which all suggest species can range >1000 km in some habitats, but also show high genetic variability (relative to shallow-water examples) (Bober et al., 2018; Etter et al., 2011; Janssen et al., 2019). In the CCZ, there is no evidence that this genetic heterogeneity is driven by physical barriers to gene flow (Janssen et al., 2019). For a common nodule-dwelling demosponge species in the eastern CCZ, evidence suggests that observed genetic connectivity might be explained by predominant oceanographic currents (Taboada et al., 2018).
In this paper, we present and analyse the largest dataset yet accrued for annelid polychaetes across the CCZ, all identified through integrative DNA taxonomy based on species hypotheses linked to openly databased and archived reference material that is available in an associated open-access taxonomic data paper (Wiklund et al., in press). We focus our investigations on the three following areas: (1) baseline levels of biodiversity in the seafloor communities in a regional context (to measure and evaluate potential future impacts); (2) regional biogeographic patterns (critical for the determination of potential protected areas); and (3) population connectivity between regions (needed to determine the resilience and recovery of the system post-impact). As our analyses provide new information in all three areas, they will be essential for the planning of conservation strategies in this vast and commercially strategic region.

| Study areas
All study areas are located within the Clarion-Clipperton Zone (CCZ) in the northeast equatorial Pacific Ocean, at depths from 3932 to 5055 m (Figure 1). The CCZ displays gradients in environmental conditions, including surface primary productivity and seafloor sediment characteristics (Washburn, Jones, et al., 2021), with changes occurring across an east-west and north-south axis. Mining exploration contracts from the ISA, as well as reserve areas and areas protected from mining, extend from 115°W to approximately 158°W, and from 22°N to 2.5°N. The exploration contract areas considered in this study were granted by the ISA to: the Federal Institute for Geosciences and Natural Resources of Germany (BGR, Germany); InterOceanMetal Joint Organisation (IOM, a country conglomerate involving Cuba, Bulgaria, Poland, the Russian Federation, Slovakia, and the Czech Republic); Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER, France); UK Seabed Resources Ltd (UK-SRL, United Kingdom); Global Sea Mineral Resources NV (GSR, Belgium); and Ocean Mineral Singapore (OMS, Singapore). The ISA has also designated 13 APEIs as part of the REMP for the CCZ, of which two sites in APEI-6 were considered for diversity analyses.
One specimen that was opportunistically sampled from APEI-7 was included in connectivity analyses (Figure 1).

| At-sea sampling
Complete description of the DNA taxonomy pipeline used in the collection of samples new to this study is provided in Glover et al. (2016). Abyssal benthic specimens were collected from UK-1, GSR, OMS, and APEI-6 using a variety of oceanographic sampling gear including box cores, epibenthic sledges (EBS), remotely operated vehicles (ROV), and multi-cores. Live-sorted specimens were stored in individual microtube vials containing an aqueous solution of 80% non-denatured ethanol, numbered, and kept chilled until return to the Natural History Museum, London, UK.
Genetic data from a total of 1866 specimens were included in diversity analyses, with 1177 specimens having COI sequences, and 1101 specimens having 16S sequences. Specimen IDs, location, and accession numbers can be found in Appendix S2.

| Species delimitation
FIGURE 1 (a) Map of the nodule exploration contract areas, reserved areas, and Areas of Particular Environmental Interest (APEI) in the Clarion-Clipperton Zone (CCZ), central Pacific Ocean, showing the areas considered in this study (in colour). APEI-7 was only considered for connectivity analyses. (b) Enlarged map indicating the sites considered in diversity analyses. Shapefiles are courtesy of the International Seabed Authority, Kingston, Jamaica.

Molecular species delimitation was performed on the 16S and COI datasets separately. The distance-based molecular species delimitation methods, automatic barcode gap discovery (ABGD; Puillandre et al., 2012) and CD-HIT-Suite (Huang et al., 2010), were used to separate specimens into molecular operational taxonomic units (MOTUs, hereafter referred to as species). Distance-based approaches work on the assumption of a nucleotide distance threshold, below which specimens are considered conspecific and above which they are considered to belong to different species. These methods detect the point at which this 'barcode-gap' occurs and sort sequences into putative species based on this threshold. ABGD analyses were undertaken on the web interface (https://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html) using the Kimura-2-parameter model for both COI and 16S datasets. A range of relative gap width values (0.5-1.5) were tested and a value of 0.8 was used for analyses. CD-HIT-Suite analyses were undertaken on the web interface (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi?cmd=cd-hit-est), using a pre-defined sequence similarity threshold of 97% (Janssen et al., 2019). The results of delimitations were compared against morphological species identifications of specimens new to this study to check for consensus, and as the ABGD results were most congruous with morphological delimitations, these delimitation results were used for subsequent analyses.
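The threshold logic behind distance-based delimitation can be illustrated with a minimal sketch. This is not the ABGD or CD-HIT algorithm: it assumes pre-aligned sequences, uses the uncorrected p-distance rather than the Kimura-2-parameter model, takes a fixed threshold instead of inferring the barcode gap, and groups specimens by single linkage. The function names and toy sequences are hypothetical, and Python is used purely for illustration (the analyses in this study were run through the web interfaces named above).

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_by_threshold(seqs: dict, threshold: float) -> list:
    """Single-linkage grouping: specimens closer than `threshold` share a MOTU."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy alignment (10 sites per specimen).
toy = {
    "spec1": "ACGTACGTAC",
    "spec2": "ACGTACGTAT",  # 1 of 10 sites differs from spec1
    "spec3": "TTGTACCTGA",  # 5 of 10 sites differ from spec1 and spec2
}
motus = cluster_by_threshold(toy, threshold=0.2)
print(len(motus), "putative species")
```

With a threshold of 0.2, spec1 and spec2 fall into one MOTU and spec3 into another; lowering the threshold below 0.1 would split all three.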
Feeding guilds were determined at a family level (from sequence ID) following Jumars et al. (2015).

| Species diversity
All data analyses were conducted using R v.4.0.2 (R Core Team, 2020), on both COI and 16S datasets separately. To compare diversity between areas, rarefaction curves were computed based on the total number of individuals and the total number of species from each sampled area. Based on these data, the expected number of species was calculated for 20 individuals (ES(20)) from each area, and 150 individuals (ES(150)) for all areas combined. Standard Shannon (H′) and Simpson (D) diversity indices were also computed. Non-parametric species richness estimators were calculated for each area to represent regional diversity (Basualdo, 2017), using the package 'SpadeR' (Chao et al., 2016). While rarefaction compares observed richness among samples, richness estimators evaluate the total richness of a community (Shen et al., 2003).
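The quantities named above can be sketched as follows. This is an illustrative Python re-implementation, not the R code used in the study. It assumes Hurlbert's formula for the expected number of species in a random draw of n individuals, the natural-log Shannon index, the Gini-Simpson form (1 − Σp²) of Simpson's index (the source does not state which form was used), and the classic abundance-based Chao1 estimator as one example of a non-parametric richness estimator. The abundance vectors are hypothetical.

```python
from math import comb, log


def expected_species(counts, n):
    """Hurlbert rarefaction: expected species in a random draw of n individuals."""
    total = sum(counts)
    if n > total:
        raise ValueError("n exceeds the number of individuals sampled")
    # math.comb returns 0 when fewer than n individuals remain without species i.
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts)


def shannon(counts):
    """Shannon index H' using natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def gini_simpson(counts):
    """Simpson diversity expressed as 1 - sum(p_i^2) (assumed form)."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Classic Chao1 richness estimate from singleton and doubleton counts."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    s_obs = sum(1 for c in counts if c > 0)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2  # bias-corrected form when f2 == 0


# Hypothetical abundances per species in one area.
area = [5, 3, 1, 1]
print(expected_species(area, 5), shannon(area), gini_simpson(area), chao1(area))
```

Rarefying to a common n (20 individuals per area in this study) is what makes areas with very different sampling effort comparable, whereas Chao-type estimators extrapolate beyond the observed species count.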

| Beta diversity and phylogenetic diversity
The New Normalised Expected Species Shared (NNESS) (Gallagher, 1996) index was calculated and used to perform a distance decay analysis. Distance decay looks for a correlation between taxonomic similarities and geographic distance among pairs of areas. NNESS values were computed from the probabilities of species occurrences in random draws of m individuals, using the package 'ness' (Menot, 2019). Low values of m give high weight to dominant species, whereas high values of m give a higher weighting to rare species. The ideal m value is the one providing the highest Kendall correlation between the similarity matrices for m = 1 and m = m max. Significance was tested using non-parametric Mantel tests between NNESS and geographic distance matrices, using Spearman Rank correlation coefficients and 999 permutations for obtaining p-values.
Taxonomic (species) beta diversity (TBD) was quantified for all pairs of areas using the Jaccard (β Jac ) dissimilarity index from presence-only data. This index was decomposed into its components: turnover (β jtu ; replacement of species), and nestedness (β jne ; assemblages with few species represent subsets of richer sites) (Baselga, 2010). To account for the phylogenetic relatedness between species, phylogenetic beta diversity (PBD) was also computed for all pairs of areas from two ultrametric Bayesian trees produced utilising the separate 16S and COI datasets, using the unweighted UniFrac metric of community dissimilarity (UF; derived from the Jaccard dissimilarity index) (Lozupone & Knight, 2005). UniFrac distance, which measures the phylogenetic distance between sets of taxa in terms of the branch length unique to each sample, was calculated for both phylogenetic trees. This metric was also decomposed into its turnover (UF tu ) and nestedness (UF ne ) components. It is worth noting that nestedness in this framework is not an absolute measure of how nested two assemblages are, but rather a measure of the dissimilarity caused by richness gradients among nested assemblages (Baselga, 2010; Leprieur et al., 2011).
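The decomposition of Jaccard dissimilarity can be made concrete with a short sketch. It assumes the Jaccard-family partition commonly attributed to Baselga, in which, for a shared species and b and c species unique to each area, total dissimilarity is (b + c)/(a + b + c), turnover is 2·min(b, c)/(a + 2·min(b, c)), and nestedness-resultant dissimilarity is the remainder. The function name and species sets are hypothetical, and this is not the code used in the study.

```python
def jaccard_partition(site_a: set, site_b: set):
    """Return (total, turnover, nestedness) Jaccard dissimilarity for two areas."""
    if not site_a or not site_b:
        raise ValueError("both areas must contain at least one species")
    a = len(site_a & site_b)   # species shared by both areas
    b = len(site_a - site_b)   # species found only in the first area
    c = len(site_b - site_a)   # species found only in the second area
    total = (b + c) / (a + b + c)
    turnover = 2 * min(b, c) / (a + 2 * min(b, c))
    nestedness = total - turnover
    return total, turnover, nestedness


# A species-poor area that is a strict subset of a richer one: pure nestedness.
print(jaccard_partition({"sp1", "sp2", "sp3", "sp4"}, {"sp1", "sp2"}))
# Equal richness with species replaced: pure turnover.
print(jaccard_partition({"sp1", "sp2", "sp3"}, {"sp1", "sp4", "sp5"}))
```

The two toy cases show why the components are interpreted differently: the first pair differs only through a richness gradient, the second only through species replacement.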
The turnover fraction of both metrics (β jtu and UF tu ) was used to perform hierarchical clustering and non-metric multidimensional scaling (nMDS), with clustering methods as in Bribiesca-Contreras et al. (2019). A null model approach was used to test whether PBD was explained by variations in TBD alone. The evolutionary relationships between samples were randomised 1000 times by rearranging the tree tip labels while keeping species diversity constant. Standardized effect size (SES) was computed as in Leprieur et al. (2012), which indicates if communities are more (≥1.96) or less (≤−1.96) phylogenetically diverse than expected based on TBD alone.
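The tip-shuffling null model can be sketched on a toy tree. The example below assumes a rooted tree stored as a list of (branch length, descendant tips) pairs, computes total unweighted UniFrac only (not its turnover component), and standardises the observed value against the shuffled values as (observed − null mean)/null SD. The tree, communities, and function names are hypothetical and simplified relative to the procedure of Leprieur et al. (2012).

```python
import random
from statistics import mean, stdev

# Hypothetical rooted tree ((a:1,b:1):2,(c:1,d:1):2) as (length, tips below branch).
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"c"}), (1.0, {"d"}),
    (2.0, {"a", "b"}), (2.0, {"c", "d"}),
]


def unifrac(branches, comm_a, comm_b):
    """Unweighted UniFrac: unique branch length / total branch length observed."""
    unique = shared = 0.0
    for length, tips in branches:
        in_a, in_b = bool(tips & comm_a), bool(tips & comm_b)
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    return unique / (unique + shared)


def ses_unifrac(branches, comm_a, comm_b, n_perm=1000, seed=0):
    """Standardised effect size of UniFrac under random tip relabelling."""
    rng = random.Random(seed)
    tips = sorted(set().union(*(t for _, t in branches)))
    observed = unifrac(branches, comm_a, comm_b)
    null = []
    for _ in range(n_perm):
        shuffled = tips[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(tips, shuffled))
        # Species lists per community stay fixed; only their tree positions move.
        permuted = [(l, {relabel[t] for t in ts}) for l, ts in branches]
        null.append(unifrac(permuted, comm_a, comm_b))
    sd = stdev(null)
    return (observed - mean(null)) / sd if sd > 0 else float("nan")


ses = ses_unifrac(BRANCHES, {"a", "b"}, {"c", "d"})
print(round(ses, 2))
```

Because the species lists are untouched by the shuffle, taxonomic beta diversity is identical in every permutation, so a large positive value indicates more phylogenetic dissimilarity than the species turnover alone would produce.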

Non-parametric Mantel tests between input distance matrices were based on Spearman Rank correlation coefficients and 999 permutations for obtaining p-values. Pairwise correlations were tested between (a) UF and geographic distance (great circle distance) between areas, (b) UF tu and geographic distance between areas, and (c) UF ne and geographic distance between areas.

| Biogeographic analyses
Since sampling effort was different between areas, species abundance data was 'chord' transformed to explore differences in relative abundance (Legendre & Borcard, 2018). After transformation, nonmetric multi-dimensional scaling (nMDS) ordination was performed with Bray-Curtis distance (Bray & Curtis, 1957). An UpSet plot was used to show the distribution of unique shared species across all sampled areas (Lex et al., 2014).
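The effect of the chord transformation on unevenly sampled areas can be shown with a small example. The sketch assumes the standard definitions: each area's abundance vector is divided by its Euclidean norm, and Bray-Curtis dissimilarity is Σ|x − y| / Σ(x + y). The abundance values and function names are hypothetical.

```python
from math import sqrt


def chord_transform(abundances):
    """Scale an abundance vector to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in abundances))
    if norm == 0:
        raise ValueError("area has no individuals")
    return [v / norm for v in abundances]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two non-negative vectors."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


# Two hypothetical areas with identical relative composition,
# one sampled ten times more intensively than the other.
well_sampled = [10, 20, 0]
poorly_sampled = [1, 2, 0]

raw = bray_curtis(well_sampled, poorly_sampled)
transformed = bray_curtis(chord_transform(well_sampled), chord_transform(poorly_sampled))
print(round(raw, 3), round(transformed, 3))
```

On raw counts the two areas appear highly dissimilar purely because of sampling effort, whereas after transformation the dissimilarity is essentially zero, which is the behaviour the transformation is intended to provide.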

| Connectivity analyses
To investigate genetic diversity and connectivity patterns across the eastern CCZ and adjacent areas, relatively common species across the different localities were chosen (Appendix S3). The six species selected for connectivity analyses were: Bathyglycinde cf. profunda (Hartman & Fauchald, 1971) (Rouse & Pleijel, 2001). For these species, there were at least 10 sequences available for each of the 16S and COI genetic markers, which were analysed separately.
For the six species and the two sets of markers, haplotype number (h), haplotype diversity (Hd), nucleotide diversity (π), and number of segregating sites (S), were estimated using the R packages 'pegas' (Paradis, 2010) and 'ape' (Paradis & Schliep, 2019) (R script can be found in Stewart et al., 2023). Mismatch distribution plots, indicating the frequencies of pairwise distances, were generated for both 16S and COI for each of the six species using the 'pegas' function 'MMD'.
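The summary statistics listed above can be illustrated on a toy alignment. This sketch is not the 'pegas' code used in the study; it assumes aligned sequences of equal length with no gaps or ambiguity codes, Nei's haplotype diversity n/(n − 1)·(1 − Σp²), nucleotide diversity as the mean pairwise proportion of differing sites, and a mismatch distribution given as counts of pairwise differences. The sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_stats(seqs):
    """Return h, Hd, pi, S and the mismatch distribution for aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two aligned sequences of equal length")

    haplotypes = Counter(seqs)
    h = len(haplotypes)                                    # number of haplotypes
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in haplotypes.values()))

    # Number of differing sites for every pair of sequences.
    diffs = [sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)]
    pi = sum(diffs) / len(diffs) / length                  # per-site nucleotide diversity
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)  # segregating sites
    mismatch = Counter(diffs)                              # pairwise-difference counts
    return h, hd, pi, s, mismatch


# Hypothetical alignment of four specimens, four sites each.
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
h, hd, pi, s, mismatch = haplotype_stats(toy)
print(h, round(hd, 3), round(pi, 3), s, dict(mismatch))
```

Here three haplotypes occur among four specimens, two sites are variable, and the mismatch distribution has one pair with no differences, three pairs with one difference, and two pairs with two differences.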

| Alpha diversity
A total of 1866 samples were included in the diversity analyses, with 1101 samples represented by 16S data and 1177 samples represented by COI data. ABGD results were most congruous with morphological delimitations and so these delimitation results were used for subsequent analyses. Based on this species delimitation, the 16S dataset contained 314 species, while the COI dataset contained 291 species. 51% of the 16S species were represented by a single specimen (singletons), and 52% of the COI species were singletons.
Both sampling and sequencing were uneven between regions, most notably in the BGR area from which there were 459 COI sequences, but only 96 16S sequences (Table 1). Non-parametric extrapolation analyses (Chao and Jackknife estimators) predicted a total number of species across all areas between 473.9 ± 17.9 and 580.7 ± 30.9 for the 16S dataset, and between 441.9 ± 17.4 and 538.8 ± 30.1 for the COI dataset (Table 1). For both datasets, the highest number of extrapolated species was predicted in UK-1B (16S: 201-277, COI: 155-305). Both IFREMER and BGR had notably lower estimated species numbers in the 16S dataset than in the COI dataset (Table 1). … ES(150) values are presented in Table 2.

| Taxonomic and phylogenetic beta diversity
Both 16S and COI datasets showed CCZ polychaete communities to be characterised by high levels of taxonomic beta diversity (Appendix S4). Taxonomic turnover between regions explained 59% of the UF tu variation for the 16S dataset, while taxonomic turnover explained 83% of the UF tu variation in the COI dataset.
However, within the COI dataset, even when the turnover of species was complete (β Jtu = 1), UF tu still discriminated between regions with different degrees of phylogenetic dissimilarity (Appendix S1, Figure 1). When there was zero nestedness in the species composition between regions (β Jne = 0), the regions could be phylogenetically nested (Appendix S1, Figure 1).

TABLE 1
Observed and estimated species richness (±SE) for each individual sampled area in the Clarion Clipperton Zone, and for the areas as a whole (Total).

| Biogeographic patterns
Cluster analyses using the turnover components of TBD and PBD found different patterns within both the 16S dataset (Figure 5a) and the COI dataset (Figure 5c). For the 16S dataset, hierarchical clustering recovered 4 optimal clusters for β Jtu and 6 for UF tu (Figure 5a), and these clusters varied in composition. In comparison, clustering of the COI dataset recovered 3 optimal clusters for β Jtu and 4 for UF tu (Figure 5c). Ordination of Bray-Curtis distance between areas, produced from chord-transformed taxonomic data, produced similar clustering patterns for both datasets, with APEI-6-NE separated furthest from all other areas (Figure 5b,d).
The distance decay of similarity showed no statistically significant correlation between NNESS and geographic distance for either dataset when all data were included (Figure 6a,b). However, for the COI data, when the analysis was restricted to the better sampled sites up to 500 km apart, there was a significant negative correlation between similarity and distance (Figure 6d; Pearson's Correlation Coefficient, t = −3.89, p < .05). For the 16S dataset, total PBD had a significant positive relationship with geographic distance (Appendix S1, Figure 2). However, no significant relationships were found between UF tu and UF ne with distance (Appendix S1, Figure 2). For both datasets, phylogenetic turnover was high, and phylogenetic nestedness low between areas regardless of distance.

| Genetic connectivity
The six species selected to explore patterns of regional genetic diversity have been reported from several localities in the CCZ, mainly eastern (APEI-6, BGR, GSR, IFREMER, IOM, OMS, and UK-1), but Paralacydonia cf. weberi was also collected in the western CCZ in APEI-7, separated by over 3000 km. So far, four of these species seem to be restricted to the CCZ. However, the Bathyglycinde cf. profunda specimens analysed here share COI haplotypes with specimens collected in the Cape and Guinea Basins in the Atlantic Ocean (Böggemann, 2009).

| Biodiversity in the CCZ
Species are a fundamental unit of biodiversity and evolution (Sites & Marshall, 2004). However, there is a dearth of knowledge on the taxonomy and distribution of deep-sea species owing largely to a lack of data. Here, a molecular species delimitation approach was applied to explore levels and patterning of polychaete diversity across a 3 million km² region of the eastern Pacific. There have been few papers that have compared macrofaunal biodiversity in the CCZ with other regions, although diversity is often hypothesised to be high (e.g., Janssen et al., 2015). We provide here some lines of new evidence that do support this, at least for macrofaunal polychaetes (individuals >300 μm). Species estimators predict >550 polychaete species in the region (from an actual species number of 315), higher than reported elsewhere.
Additionally, when we place the results in a global context (Table 3), it is apparent that diversity in the CCZ samples is 40%-50% higher than comparable abyssal samples in the Atlantic and higher than in other bathyal or abyssal habitats globally (Feder et al., 2007; Glover et al., 2001, 2002; Hilbig et al., 2006; Neal et al., 2011; Schaff et al., 1992).
These results must, however, be treated with caution, as studies referenced in Table 3 used morphological methods to determine species, whilst here we have used a DNA-based delimitation method that is likely to detect many cryptic species (Knowlton, 1993).
… These areas also had the largest numbers of shared species, as well as similar family and feeding-guild compositions. Significant positive correlations between polychaete abundance and POC flux at the seafloor across the CCZ have been previously reported (Washburn, Menot, et al., 2021). In low productivity systems, such as the abyssal seafloor, models predict that food chain length is positively correlated with resource availability (Post, 2002).

TABLE 2
Estimates of polychaete diversity on a regional scale adapted from Neal et al. (2011).

| Beta diversity and phylogenetic diversity in the CCZ
When all data were examined, although beta diversity and phylogenetic diversity were high for all sites, we did not find a statistically significant linear relationship between geographic distance and species similarity (measured by NNESS). However, when COI data was examined for sites only up to 500 km distance, there was a negative relationship between geographic distance and similarity. One explanation for this is that the closer sites (e.g., BGR, OMS, and UK-1) are better sampled, and hence the signal is better resolved. This supports previous findings by Bonifácio et al. (2020), who report a significant negative relationship between NNESS similarity and geographic distance when excluding APEI-3, a site on the other side of the Clarion Fracture. It could also be the case that APEI-6-NE, one of the furthest sites from all others and also located on the other side of the Clarion Fracture, was causing most of the low NNESS values between sites. Therefore, by removing sites with pairwise distances above 500 km, we potentially remove the confounding factor of the fracture zone. The nMDS results also placed APEI-6-NE as the most distinct from other sites, further suggesting that the Clarion Fracture potentially acts as a biogeographic barrier.

FIGURE 3 UpSet plots of unique polychaete species between sampled areas for the (a) 16S dataset, and (b) COI dataset. Coloured bars show the total number of species identified in each area. Black dots show intersections between areas, with the bar above representing the number of species unique to that intersection.
The inverse relationship between shared species and distance could also result if a subset of rare species have restricted ranges, while more abundant species tend to be widely distributed (Washburn, Menot, et al., 2021). Such patterns have been observed in other better studied ecosystems, in which rarity is often correlated with small species ranges (Pimm et al., 2014). Another explanation is that the relationship is significant, but non-linear, with species similarity decreasing linearly between sites up to around 500 km before levelling off owing to a currently unknown phenomenon. This relationship, once completely resolved, could be used to model extinction risk at regional scales in the CCZ.
While these findings may suggest that dispersal limitation plays the main role in controlling assemblage structure across the CCZ, the high genetic turnover (UF tu ) regardless of distance between sites found here suggests that environmental filtering and niche-based processes may also play a role in species distributions.
Patterns of PBD are almost completely unknown for the deep sea (Janssen et al., 2015), with nearly all analyses to date considering only turnover in taxonomic composition. A spatially extensive analysis of taxonomic, functional, and phylogenetic beta diversity in deep-sea bivalves found rates of distance decay of similarity with environmental distance that were 8- to 44-fold steeper than with spatial distance between sites (McClain et al., 2012). However, both Janssen et al. (2015) and McClain et al. (2012) only considered total PBD, which hinders the assessment of the relative roles of neutral and niche-based processes in shaping PBD patterns (Leprieur et al., 2011, 2012). This is because spatial turnover and nestedness are two antithetical phenomena impacted by different environmental and biological processes (Leprieur et al., 2009). Beta diversity, and more recently PBD, is increasingly acknowledged as key to understanding local and regional diversity dynamics, which translate into conservation-relevant insights into large-scale diversity maintenance (Socolar et al., 2016; Winter et al., 2013). This represents a significant area requiring more study across the CCZ, if biodiversity is to be effectively protected.
When compared against a null model, non-random patterns of lineage turnover were found for 61%-77% of pairwise comparisons. Higher PBD than expected based on TBD can be attributed to several processes, including: the replacement (turnover) of entire phylogenetic lineages as a response to environmental gradients, past speciation and extinction events, dispersal limitation, niche-based processes, or a combination of all four (Graham & Fine, 2008). Similar results have been documented between communities delimited by strong environmental gradients (Graham et al., 2009), or large geographic distances (Fine & Kembel, 2011; Morlon et al., 2011). The relative influences of deterministic and stochastic processes in regulating patterns of biodiversity remain of fundamental interest to studies of deep-sea evolution and ecology (Rex & Etter, 2010). Integration of phylogenetic information into diversity and community-ecology frameworks can enhance our understanding of the varying roles of ecological and evolutionary processes in shaping observed patterns of alpha and beta diversity.
The phylogenetic distinctness of APEI-6-NE in both datasets, when combined with observed low alpha diversity, suggests the presence of deterministic factors shaping the community composition at this site. We also found no significant relationship between geographic distance and PBD, suggesting that the high SES values between regions were driven by environmental rather than spatial gradients. Environmental heterogeneity across the CCZ is well documented (Menendez et al., 2019; Mogollón et al., 2016; Volz et al., 2018, 2020; Washburn, Menot, et al., 2021), and this is likely to result in environmental filtering, whereby local communities experiencing different environmental conditions (e.g., POC flux, nodule coverage, sediment geochemistry) contain different phylogenetic lineages. Further analyses involving local-scale environmental data are therefore vital for elucidating the processes driving observed patterns in taxonomic and phylogenetic turnover and will be important in effective conservation of the CCZ abyssal fauna.

| Biogeographic patterns and species ranges
FIGURE 5 Clustering analyses of polychaete species composition. (a) tanglegram of hierarchical clustering results from β jtu and UF tu of the 16S dataset; (b) 2D non-metric multi-dimensional scaling results based on Bray-Curtis distance between chord-transformed data of the 16S dataset; (c) tanglegram of hierarchical clustering results from β jtu and UF tu of the COI dataset; (d) 2D non-metric multi-dimensional scaling results based on Bray-Curtis distance between chord-transformed data of the COI dataset. Branches of dendrograms are coloured based on results of cluster analysis.

The development of evidence-based environmental management in the CCZ is reliant on data evaluating the biogeography and connectivity of species at a regional level (Glover et al., 2016). This study presents the largest published analysis of polychaete community data, using internally consistent taxonomy, from across the CCZ, which allows us to make some preliminary observations as to the biogeography of the region. Species ranges varied greatly, with some polychaete species ranging nearly 5000 km (between the most distantly sampled areas APEI-7 and UK-1), while over 50% of species were found only at a single site.
The APEI network was originally designed to capture the full range of seafloor habitats within the CCZ, to create reserve areas that are biogeographically representative of the wider region (Wedding et al., 2013). Here, a very preliminary assessment as to the representativeness and appropriateness of APEI-6 to preserve polychaete biodiversity can be conducted. Between 37.5% and 70% of the species identified in APEI-6-SW and -NE were unique to the area, and both areas formed distinct taxonomic and phylogenetic clusters.
Caution must be taken when interpreting these results, however, because limited sampling was conducted at these two sites, which may influence the community patterns observed. Differences between sites may also be significantly influenced by the Clarion Fracture, which bisects APEI-6 approximately through the centre. APEI-3, which largely sits to the north of the Clarion Fracture, has been shown to host significantly different communities of polychaetes, nematodes (Hauquier et al., 2019), and ophiuroids (Christodoulou et al., 2020). It is therefore possible that the Clarion Fracture acts as a physiographic barrier, affecting dispersal and therefore the taxonomic composition of communities on either side.
Geochemical data suggest that APEI-6 is different from UK-1 sites ~500 km away (Menendez et al., 2019). Both nodules and sediments from sites within APEI-6 displayed significant chemical differences from those in UK-1, including higher levels of iron and cobalt (Menendez et al., 2019). The presence of nodules plays a key role in structuring metazoan communities and diversity within the CCZ, providing hard substrate for the growth of sessile mega- and macrofauna such as cnidarians, polychaetes, and sponges, as well as habitat for meiofauna such as nematodes, tardigrades, and harpacticoid copepods (De Smet et al., 2017; Miljutina et al., 2010; Pape et al., 2021; Simon-Lledó et al., 2019). It has been suggested that sediment shear strength can influence macrofaunal diversity patterns (Chuar et al., 2020); however, it is currently unknown if subtle changes in nodule geochemistry influence biodiversity and community composition. The available biological and geochemical evidence suggests that some areas of APEI-6 are partially representative of the exploration areas to the south, yet are different in several key characteristics including oxygen penetration depth, sediment grain size, and nodule shape and size. However, thus far, sampling of APEIs has been very limited (Washburn, Menot, et al., 2021) and increased sampling is necessary to fully assess the representativeness of the complete network of 13 APEIs.

| Population connectivity in the CCZ
We have found a high degree of connectivity between populations of macrofaunal species, but it should be noted that these samples are limited to the more abundant and potentially more widespread species. Barriers to dispersal, such as unidirectional current regimes or geography, and differences in habitats and ecosystems are factors that shape the spatial distribution of species (Slatkin, 1987).
In the deep sea, bathymetry has been suggested as an additional barrier more important to gene flow than geographical distance (see Taylor & Roterman, 2017), as seen in the case of scavenging amphipods (Havermans et al., 2013). The lack of evidence for population structure in the polychaete species investigated here suggests that the broadly similar depths of the CCZ abyssal plain favour gene flow at the time scales relevant for mitochondrial data (Hellberg et al., 2002). These results also indicate that the polychaete species investigated here have the potential for long-range dispersal, as found in the other common abyssal polychaete genera Prionospio and Aurospio (Guggolz et al., 2020). This contrasts with studies on connectivity and gene flow from annelids at shelf depths and non-abyssal deep-sea sites such as hydrothermal vents, which often indicate population structure at 500 km geographical scales (David & Cahill, 2020; Kojima et al., 2012; Vrijenhoek, 2010; Williams et al., 2016).
We also report high levels of genetic diversity in the sediment-dwelling species, which, following the conventional understanding of coalescence, would indicate shrinking populations (Kuhner, 2009). We interpret this pattern in our data to result from under-sampling, since it seems unlikely that the relatively stable abyssal-plain environment would cause several populations to decline. In contrast with the sediment-dwelling species, Neanthes goodayi, a nodule-dwelling species, shows low genetic diversity for COI, the most variable genetic marker. However, Nicomache cf. benthaliana did not show low genetic diversity, despite also inhabiting nodules. With the assumption that nodule-dwelling species experience the same general biological and physical factors such as currents, depths, and productivity, this difference indicates that nodule habitat endemism provides a geomorphic factor that impacts species genetic variation through factors such as the patchy nature of the habitat (Zeng et al., 2020). Information provided here on genetic diversity and gene flow should, however, be taken with caution, as it only uses two mitochondrial genetic markers. Other genetic markers, such as microsatellites or genome-wide scans (e.g., RADseq: restriction-site associated DNA sequencing), may be able to provide finer-scale resolution for genetic connectivity studies (Andrews et al., 2016).
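To make the notion of "genetic diversity" at a single marker concrete, the sketch below computes haplotype diversity, h = n/(n - 1) × (1 - Σ p_i²), where n is the number of sequences sampled and p_i is the frequency of haplotype i. This section does not state which diversity statistic was used, so treating haplotype diversity as the measure is an assumption, and the haplotype labels are hypothetical rather than taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Haplotype diversity h = n/(n-1) * (1 - sum(p_i^2)) for one marker."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq_freq)


# Hypothetical haplotype labels per individual (NOT data from this study).
low_diversity_sample = ["H1", "H1", "H1", "H1", "H1", "H2"]
high_diversity_sample = ["H1", "H2", "H3", "H4", "H5", "H6"]

print(haplotype_diversity(low_diversity_sample))
print(haplotype_diversity(high_diversity_sample))
```

A sample dominated by one haplotype gives a value near 0, while a sample in which every individual carries a different haplotype gives a value of 1; with small samples the estimate is imprecise, which is consistent with the caution about under-sampling above.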
We cannot report on the population connectivity of the rare species that represent over half of the individuals collected, since in most cases, these are represented by a single specimen. Basic biological knowledge would suggest that this vast number (hundreds) of species do have viable populations; however, we do not know how they are distributed in space. There is the possibility that they have viable populations elsewhere in the abyss and that the specimens sampled here belong to sink populations supported by larval dispersal from elsewhere (Hardy et al., 2015; Rex et al., 2005). Alternatively, they may have local populations that go through boom-bust cycles dependent on food availability, and should this be the case, they are not protected from extinctions in the face of nodule mining. At present, the only known solution to this problem is increased sampling, but more detailed genomic analyses may also provide some help in understanding their demographic histories (e.g., Hohenlohe et al., 2021). From a management perspective, we can hypothesise that the abundant and well-connected species so far studied may well be able to recolonise defaunated regions, assuming environmental conditions are suitable. For the rare species (the majority), we must await further data.

| CONCLUSION
Significant policy decisions on deep-sea mining are on the near horizon, and lawmakers need access to accurate biodiversity data to make informed choices. Results presented here provide an insight into both taxonomic and phylogenetic beta diversity for macrofauna inhabiting a potential deep-sea mining zone, suggesting that biodiversity in the abyssal Pacific is higher than in other comparable abyssal settings. The turnover of lineages and species was consistently high between sites, regardless of geographic distance, and comparison against a null model identified that phylogenetic turnover was typically higher than would be expected based on differences in species composition. These results suggest that environmental filtering rather than dispersal limitation plays a greater role in regulating spatial patterns of biodiversity in the CCZ, highlighting the importance of considering the biogeochemical representativeness of designated protected areas if they are to succeed in preserving the CCZ fauna.
Taken together, these results advance our understanding and give new insights into the mechanisms and processes influencing these abyssal communities. The greater our understanding of the processes driving patterns of biodiversity, the greater chance we have of being able to effectively mitigate the effects of potential environmental change caused by nodule mining.

ACKNOWLEDGEMENTS
We thank the masters, crew and technical staff on the RV Melville,

CONFLICT OF INTEREST STATEMENT
The authors declare there is no conflict of interest.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/ddi.13690.

DATA AVAILABILITY STATEMENT

FIGURE 7 TCS haplotype networks based on COI and 16S sequences of six polychaete species. Circles are proportional to the number of samples and all regions are colour coded. White filled circles indicate missing haplotypes. Branches are not scaled to the number of nucleotide substitutions; substitutions between haplotypes are indicated by vertical bars along the branches. Images of species are not to scale.