Life goes on: Spatial heterogeneity promotes biodiversity in an urbanized coastal marine ecosystem

Both human populations and marine biodiversity are concentrated along coastlines, with growing conservation interest in how these ecosystems can survive intense anthropogenic impacts. Tropical urban centres provide valuable research opportunities because these megacities are often adjacent to mega‐diverse coral reef systems. The Pearl River Delta is a prime exemplar, as it encompasses one of the most densely populated and impacted regions in the world and is located just northwest of the Coral Triangle. However, the spatial and taxonomic complexity of this biodiversity, most of which is small, cryptic in habitat and poorly known, make comparative analyses challenging. We deployed standardized settlement structures at seven sites differing in the intensity of human impacts and used COI metabarcoding to characterize benthic biodiversity, with a focus on metazoans. We found a total of 7184 OTUs, with an average of 665 OTUs per sampling unit; these numbers exceed those observed in many previous studies using comparable methods, despite the location of our study in an urbanized environment. Beta diversity was also high, with 52% of the OTUs found at just one site. As expected, we found that the sites close to point sources of pollution had substantially lower diversity (44% less) relative to sites bathed in less polluted oceanic waters. However, the polluted sites contributed substantially to the total animal diversity of the region, with 25% of all OTUs occurring only within polluted sites. Further analysis of Arthropoda, Annelida and Mollusca showed that phylogenetic clustering within a site was common, suggesting that environmental filtering reduced biodiversity to a subset of lineages present within the region, a pattern that was most pronounced in polluted sites and for the Arthropoda. The water quality gradients surrounding the PRD highlight the unique role of in situ studies for understanding the impacts of complex urbanization pressures on biodiversity.


| INTRODUC TI ON
Human activities, whose impacts are felt globally, regionally and locally, are fundamentally reshaping patterns of biodiversity.While much attention has been given to the impact of global climate change on ecosystems, local and regionally pervasive stressors also heavily influence patterns of marine species distributions across a variety of scales (Halpern et al., 2008).This is particularly true along the coasts, where sewage, sedimentation (caused by coastal development, riverine inflow, land reclamation and dredging), aquaculture and agricultural and industrial effluents have collectively degraded water quality, shifted species distributions and altered food webs (Altieri et al., 2017;Carpenter et al., 2008;Folke et al., 2004;Orth et al., 2006;Otaño-Cruz et al., 2019).
Coastal population density is three times higher than the global average, with many megacities situated where rivers meet the sea (Konishi, 2000).This urbanization produces strong and complex pollution gradients, providing opportunities to study the effects of multiple stressors on biodiversity under the global umbrella of climate change (Leray et al., 2021).Although widespread, the impacts of urbanization on community-level species richness and phylogenetic diversity are poorly understood generally (Halpern et al., 2008;Lu et al., 2020), and are particularly understudied in coastal marine environments (McDonald et al., 2020).
As such, studies of biodiversity near coastal cities, which experience the complexities of integrating industry, human population growth, coastal development, food security and environmental conservation, are both broadly relevant and challenging (Bindoff et al., 2022).
The need to study marine diversity around tropical coastal cities is especially urgent, given the enormous biodiversity, much of it uncharacterized, that their waters potentially shelter and the limited ability of many of these cities to control local impacts.Many of these cities are sited in locations that once hosted thriving and highly diverse coral reef ecosystems, which are particularly sensitive to human impacts.The region surrounding the Pearl River Delta (PRD, Guangdong Province, China), hosting a population of ~120 million people, provides an important example (Duprey et al., 2020).
The coastal ecosystems of the South and East China Sea are among the most highly impacted on Earth (Halpern et al., 2008), yet marine and freshwater taxa are the focus of only 7% and 20% of recent urban biodiversity studies, respectively (reviewed in McDonald et al., 2020).Within the relatively small geographical footprint of the PRD, there exist large gradients in coastal infrastructure and environmental pollution derived from diverse sources: water quality degradation over the last several decades has led to the extirpation of some or all coral species close to urbanized areas, while speciesrich coral communities remain in less impacted regions nearby (Cybulski et al., 2020;Duprey et al., 2020).This variability allows a unique insight into how human impacts related to urbanization structure coastal marine biodiversity.
Urbanization can affect marine biodiversity through both physico-chemical and biological-ecological processes.An organism must be able to tolerate the physical and chemical conditions of its environment, possess the ecological traits to compete for limiting resources and avoid predation.Fewer species are expected to have the physiologies and ecological traits required to cope with highly disturbed environments, which can lead to decreased species diversity through trait-based filtering of community members (Chalmandrier et al., 2022).Where tolerance traits are restricted to particular phylogenetic lineages, environmental filtering can also drive decreased phylogenetic diversity (Thakur et al., 2017;Webb et al., 2002).
Conversely, the impact of ecological selection on biodiversity can have the opposite effect on phylogenetic structure.Where environmental filtering is less pronounced, ecological processes, specifically competition, tend to drive trait diversification so that a broad range of functional, and presumably phylogenetic, diversity will be maintained (Helmus & Ives, 2012).In these cases, phylogenetic structure will be patterned as overdispersion, that is more evenness than expected among the branches of the phylogenetic tree, to minimize trait overlap (Chase & Leibold, 2004;Webb et al., 2002).Thus, gradients of human activities that alter habitat quality (e.g.pollution had substantially lower diversity (44% less) relative to sites bathed in less polluted oceanic waters.However, the polluted sites contributed substantially to the total animal diversity of the region, with 25% of all OTUs occurring only within polluted sites.
Further analysis of Arthropoda, Annelida and Mollusca showed that phylogenetic clustering within a site was common, suggesting that environmental filtering reduced biodiversity to a subset of lineages present within the region, a pattern that was most pronounced in polluted sites and for the Arthropoda.The water quality gradients surrounding the PRD highlight the unique role of in situ studies for understanding the impacts of complex urbanization pressures on biodiversity.

K E Y W O R D S
ARMS, biodiversity loss, COI, DNA metabarcoding, environmental filtering, phylogenetic structure, urbanization, water quality and ocean sprawl) and/or impact ecological networks (e.g.resource extraction) can have complex effects on community assembly, biodiversity and ecosystem functions (Todd et al., 2019).
Most studies of the impacts of anthropogenic stressors on ecosystem processes have focused on larger organisms, such as fish and coral on coral reefs.However, the bulk of eukaryotic marine biodiversity is cryptic, being comprised of very poorly known communities of marine micro-metazoans, particularly annelids, mollusks and arthropods, that are difficult to identify and sample (Carvalho et al., 2019;Pearman et al., 2020).Here we used long-established, standardized biodiversity samplers (Autonomous Reef Monitoring Structures, ARMS) to examine how the biodiversity of these communities varied with human impacts in an area of intense coastal urbanization.These samplers, when combined with molecular approaches to identify community members, are ideal for characterizing these typically neglected portions of marine biodiversity.They have been deployed in a wide array of settings, including coral reefs around the world (Carvalho et al., 2019;Pearman et al., 2016;Plaisance et al., 2021;Ransome et al., 2017;Thomasdotter et al., 2023;Timmers et al., 2020), but less commonly in anthropogenically disturbed areas (Pennesi & Danovaro, 2017).They thus provide data that is otherwise unavailable and can contribute to the ultimate goal of more efficient and scientifically informed management of these and other urban marine ecosystems (Airoldi et al., 2020).

Al
We deployed six ARMS at each of seven sites across the Hong Kong Special Administrative Region (HKSAR) located to the east of the Pearl River Delta (Figure 1a,b; Table S1).All ARMS were deployed near the shore at approximately 4 m depth.The habitat types were primarily defined by hard coral, large rocks, sand and/or rubble, with the exception of the soft sediment habitat at SSW and the absence of live coral at CI (Table S1).To maximize the potential to collect species that are seasonally restricted, we deployed ARMS in two temporally overlapping sets.One set (n = 3 ARMS) was deployed from January 2018 to January 2019 and another set (n = 3 ARMS) from July 2018 to July 2019.
Three sites that are well documented for having chronic eutrophication were chosen to represent 'high impact' (Duprey et al., 2016(Duprey et al., , 2017)): San Shek Wan (SSW) is directly exposed to high nutrients and sedimentation from the Pearl River (Zhou et al., 2012); Peng Chau (PC) lies near the outflow of Hong Kong's largest wastewater treatment plant; and Centre Island (CI) is located deep within Tolo Harbour, where polluted groundwater drives ubiquitous red tides (Hua et al., 2008).An additional three sites, located in areas further from urbanization and relatively more flushed by marine waters, were chosen to represent relatively 'low impact': Tung Ping Chau (TPC), Bluff Island (BI) and Cape D'Aguilar (CDA).These locations host diverse coral communities, including nutrient-sensitive indicator taxa such as corals in the genus Acropora.A seventh site, Sham Wan (SW), lies in a transition zone at the edge of the Pearl River influence and was considered 'medium impact' (Figure 1b) (Lai et al., 2016).
These qualitative assessments of environmental conditions were then quantified using water data that are collected monthly by the HKSAR Environmental Protection Department (EPD) at designated monitoring stations throughout the region.The EPD reports monthly water quality parameters from three depths (surface water: 1 m depth, middle water: half the sea depth and bottom water: 1 m above the seabed) at locations slightly offshore.To characterize the environmental conditions of our study sites during ARMS deployments, we extracted data from January 2018 to August 2019 for each of the nearest monitoring stations or calculated the mean of the 2-3 nearest stations (Figure 1b; Table 1).For each station, data from one depth were selected according to the depth of the ARMS (3-4 m) relative to the maximum depth at the monitoring station (Table S2).
Briefly, the ARMS structure was taken apart by detaching the individual plates in a container with the seawater in which they were retrieved and transported.Motile organisms were removed first by hand and then through sieves in three size fractions: 106-500 μm, 500 μm-2 mm and >2 mm.The plate surfaces were then photographed, and sessile organisms were scraped off each plate and homogenized in a blender.DNA metabarcoding was used to assess bulk diversity in the two smaller motile fractions and from the scraped sessile fraction.Each metabarcoding sample was rinsed thoroughly over a 40μm mesh, first with filtered seawater and then with 90% EtOH.Samples were then transferred to 95% EtOH and stored at −20°C.Given that they are a small fraction of total diversity, only a subset of large motile specimens (those >2 mm) and selected sessile specimens representing unique morphotypes were collected for barcode vouchering to help identify sequences from metabarcoding.These samples were photographed alive, identified to the lowest taxonomic group possible, and a tissue subsample was preserved in 95% EtOH.
The motile bulk fractions were prepared for DNA extractions with a decanting step (to remove sediments).Motile and sessile fractions were weighed to retain up to 10 g of raw material.We used a modified version of the DNeasy PowerMax Soil Kit (Qiagen) protocol for DNA extractions of the bulk fractions.To avoid shearing of genomic DNA, we replaced step 4 of the protocol with an overnight proteinase K digestion step.We further purified motile DNA extracts using a DNeasy PowerClean Pro CleanUp kit following the manufacturer's protocol.To increase DNA concentrations where necessary, we used Sodium Acetate 0.3 M in EtOH to precipitate the DNA, rinsed twice with 70% EtOH, air dried and re-suspended in smaller volumes of 10-50 μL of TE.The cytochrome c oxidase subunit I (COI) gene was targeted as a proxy for metazoan diversity (the bulk of the sequences) with the primer sets mlCOIintF and jgHCO2198 (Leray et al., 2013).Extracted DNA from each sample was amplified in triplicate using 10-30 ng of template DNA in 20 μL volumes with the reaction mixture containing 1 μL of each primer (10 pmol/μL), 1.4 μL of 50X dNTP Mix (10 mM ea.), 2 μL of 10X Advantage 2 PCR Buffer (Takara Bio, USA) and 0.4 μL of 50X Advantage 2 Polymerase Mix (Takara Bio, USA) with the addition of 0.5 μL Bovine Serum Albumin (BSA) (20 mg/mL).After a denaturing step of 1 min at 95°C, 16 cycles consisting of 10 s at 95°C, 30 s at 62°C and 1 min at 72°C were performed, followed by 20 cycles consisting of 10 s at 95°C, 30 s at 46°C and 1 min at 72°C before the final elongation at 72°C for sequencing with a minimum of 100,000 reads per sample.Raw reads are deposited at Figshare under the private link: https:// figsh are.com/s/ 9fe2f 15033 4506e 9fa10 , and will be updated to publicly available project DOIs upon manuscript publication.For tissues subsampled from large motile individuals, genomic DNA was extracted using the standard phenol-chloroform protocol.Amplification of the COI barcode region and Sanger sequencing followed standard protocols (Geller et al., 2013).Raw FastQ files are available at figshare (https:// doi.org/ 10. 6084/ m9.figsh are.24862782 & https:// doi.org/ 10. 6084/ m9.figsh are.24862863) and will be made publicly available on BOLD with a project DOI upon manuscript publication.

| Environmental data
To assess the differences in water quality among sites, a total of 12 variables representing the most important aspects of seawater quality were selected: 5-day biochemical oxygen demand, ammonia nitrogen, dissolved oxygen, faecal coliforms, orthophosphate phosphorus, phaeo-pigments, total inorganic nitrogen, total Kjeldahl nitrogen, total nitrogen, turbidity, salinity and volatile suspended solids (Kwong et al., 2022; Table 1).Data from the 18month deployment period were averaged, scaled across sites and visualized with Euclidian distance-based cluster analysis in a heatmap using the R package ComplexHeatmap.We also considered the full set of data as monthly means for each site and calculated a dissimilarity matrix based on Euclidian distance.We tested this matrix for multivariate homogeneity of group dispersions across impact groups and sites, and then tested differences between sites using a PERMANOVA (vegan; Dixon, 2003) with impact and site (nested within impact) as fixed factors and month included as a random factor.We follow up with post-hoc analysis using the pairwiseAdonis package and Bonferroni correction for multiple comparisons, where appropriate.Finally, we visualized the data as monthly means in ordination space using nonmetric multidimensional scaling and plotting with the R package ggord (Kwong et al., 2022).

| Metabarcoding-COI
Sequences from the three fractions were analysed in parallel.Primers were trimmed using the programs cutadapt (Martin, 2011).The forward and reverse sequence reads were truncated using DADA2 (Callahan et al., 2015) to 200 and 150 base pairs, respectively, due to a drop in base pair quality scores past these sequences' length (phred score <30).The DADA2 pipeline was then used to merge paired ends, dereplicate, infer amplicon sequence variants (ASV) and remove chimeras.Sequences were then filtered based on their amino acid translation.Using MASCE (Ranwez et al., 2011), we aligned our sequences to the Moorea BIOCODE library, a curated reference barcode dataset containing 7675 COI sequences from 30 animal phyla, and translated their amino acid sequence using the invertebrate protein translation code.Sequences that had zero stop codons, zero frameshifts, zero insertions and no more than three deletions were retained (Leray et al., 2013).The remaining ASVs were clustered using VSEARCH (Rognes et al., 2016) into OTUs based on 97% sequence similarity.OTUs were assigned to taxonomic groups using a combination of iterative approaches.First, query sequences were compared to a custom-built database including the CO-ARBitrator database (Heller et al., 2018), sequences from GEOME (Deck et al., 2017) and 735 barcodes generated herein from large motile macrofauna in VSEARCH within QIIME2 (Bolyen et al., 2019).A minimum percent match of 95% sequence similarity was used for taxonomic assignment to species level where possible.Remaining unassigned OTUs were subjected to a second pass using VSEARCH on the same database with an 80% minimum percent match.OTUs that remained unidentified with VSEARCH were compared to the whole NCBI NT database (retrieved May 2021) using BLAST searches (word size = 7; max e-value = 5e-13) and assigned the taxonomic information of the lowest common ancestor of the top 100 hits.Assignments were then manually curated.Ultimately, the taxonomic assignments at the 95% level were so few (see Section 3) that all subsequent analyses were conducted at the phylum level, that is OTUs per phylum.Statistics and figures were created using the 'phyloseq' package in R (McMurdie & Holmes, 2013).Sampling effort across ARMS and sites was assessed with rarefaction curves using the R package 'vegan'.All OTU counts were transformed to presence/ absence (0/1) data prior to analysis.

| OTU richness and diversity
After merging data at the ARMS level (merging sessile and two motile fraction data), OTU accumulation curves were generated using the 'specaccum' function within VEGAN.Means and standard deviations were based on the exact method with 500 permutations.To contextualize our diversity data with other regions, we included published data from ARMS diversity surveys, which use the same standardized sampling and analysis protocols, conducted in the Red Sea (Pearman et al., 2018), the West Atlantic (Leray & Knowlton, 2015) and Singapore (Ip et al., 2022).Given the nature of these large datasets, estimates were extracted from published figures using Web Plot Digitizer (version 4.5).
To understand patterns in alpha diversity, we assessed OTU richness at both the ARMS and site level.The proportional distribution of unique OTUs among phyla, including those unassigned to a phylum, was visualized for each ARMS using ggplot2.OTU data from all ARMS collected within a site were merged for visual comparisons of the cumulative number of unique OTUs found at a given site.To compare alpha diversity by impact, we used a linear mixed effects model with impact as a fixed effect and site and ARMS_ID as random effects (with ARMS_ID nested within site) in the R package 'nlme' (Pinheiro et al., 2023).Since site-level replication for the medium-impact designation was lacking, only high-and low-impact sites were compared.This analysis was repeated with and without the inclusion of unassigned OTUs.The linear mixed effects model was also used to compare patterns of OTU richness within each of the three richest phyla: Annelida, Arthropoda and Mollusca.
Beta diversity differences between sites and their impact status were assessed with the phyloseq package.A Jaccard dissimilarity matrix and principal coordinate analysis ordination were generated from the presence/absence of OTUs on each ARMS and visualized with respect to site and deployment season with ggplot2.We generated a distance matrix of OTU presence/absence data based on Jaccard distances and tested this matrix for multivariate homogeneity of group dispersions across impact groups and sites.We then tested differences between sites using a PERMANOVA (vegan) with impact and site (nested within impact) as fixed factors.We followed up with post-hoc analysis using the pairwiseAdonis package and Bonferroni correction for multiple comparisons, where appropriate.We also visualized, using an upset plot, how broadly OTUs were distributed by examining the taxonomic profiles (phylum level) of OTUs specific to a single site and those found within a particular impact group.

| Community phylogenetic structure
Phylogenetic diversity was analysed within the three most diverse phyla: Arthropoda, Annelida and Mollusca.OTU sequences were subset by phylum assignment, re-imported to Qiime2 and aligned using the MAFFT alignment tools (Katoh & Standley, 2013).The Qiime2 alignment mask was used to mask highly variable positions before a phylogenetic tree was inferred using the Fasttree program and rooted at its midpoint.Since phylogenies were built from a coding gene, the mean nearest taxon distance (MNTD) was used to detect phylogenetic structure arising near branch tips, as opposed to ancestral structure detected by the mean pairwise distance (MPD).
In R, cophenetic distances were calculated for each tree, and the package picante (Kembel et al., 2010)  proportional occurrence that MNTD obs was ranked higher or lower than MNTD null .We expected that environmental filtering would cause significant clustering at the site and ARMS level, as measured by a negative z-score (and p > .05).We expected that ecological processes would lead to significant evenness at the site and/or ARMS level as measured by a positive z-score (and p > .95).

| Site characterization
In situ monitoring broadly confirmed the site assignments as high, medium and low impact based on standardized data from multiple measured parameters.In general, water quality was relatively poor at CI, SSW and PC compared to TPC, BI and CDA, with SW, the medium-impact site, sharing characteristics of each (Figure 1c; Figure S2).Heatmap clustering (Figure 1c) of annual mean parameters grouped high-impact sites SSW and PC into one cluster and all other sites into a second.Within the second cluster, CI was the most distant outgroup.Low-impact sites BI and TPC grouped closest, with CDA being the next most similar.The medium-impact site, SW, clustered closest with, but outside of, the low-impact site cluster.Relative to low-impact sites, high-impact sites were mainly characterized by high levels of faecal coliforms, turbidity, volatile suspended solids, nitrogen (though this varied by form), with low dissolved oxygen and salinity (Figure 1c).We confirmed multivariate homogeneity of group dispersions for Site (F (5,66) = 1.85; p = .12)and Impact (F (1,70) = 2.51; p = .12)group data.PERMANOVA analysis of the dissimilarity matrix generated from the full set of monthly water quality data confirmed that there were significant differences between impact groups (F (1,71) = 2.83; p = .001)and sites (Impact: Sites = F (4,71) = 1.98; p = .003;Table S3).Post-hoc comparisons following PERMANOVA do not allow for interaction terms or nesting; however, including the site factor alone, significant pairwise differences were mostly restricted to between groups.TPC and BI were significantly different from each of the high-impact sites, SSW, PC and CI.CDA was significantly different from PC but not the others (p < .05;Table S4).The only pairwise comparison that pointed to a within-impact group difference was that between PC and CI (p < .05;Table S4).Subsequent NMDS visualization confirmed that high-and low-impact sites tended to cluster separately, driven by differences in faecal coliforms, nitrogen, turbidity, DO and salinity, but that CI and the medium-impact site SW had some months of overlap with each group (Figure S2).

| OTU richness and diversity
Of the 42 ARMS deployed, one from SW, the medium-impact site, could not be relocated and was assumed to have been removed by currents or during fishing activities.Sequencing was successful for 119 of 123 size fraction samples; one sessile sample spilled during processing, and three sessile samples failed to amplify despite repeated attempts (Tables S1 and S5).An average of 127,313 reads per sample were generated.After removing singletons, filtering out low quality, non-functional protein translations, Blast+ assigned bacteria and six contaminant sequences from our negative controls, our data yielded 7184 OTUs.Among the three size fractions, the sessile portion tended to have the highest OTU richness as well as the highest number of unique OTUs (Table S5).
Taxonomy was assigned to 3617 of the 7184 OTUs after the three threshold passes: 576 OTUs to a species at the 95% similarity level, 2159 OTUs to a phylum based on 80% similarity and 882 OTUs to a phylum using the LCA approach.These assignments represented 31 eukaryotic phyla, 21 of which were metazoans.The most species-rich phyla were Arthropoda ( 1281OTUs), Annelida (623 OTUs) and Mollusca (372 OTUs) (Figure 2).
Rarefaction curves were near saturation for most sites, indicating sufficient sequencing depth, with the exception of ARMS collected at SW (Figure S3).In fact, SW hosted the highest mean number of OTUs per ARMS (Figure 2; Table S5) and the highest estimated diversity (Figure S3).Comparisons of high-versus lowimpact sites showed that, whether unassigned OTUs were included or not, OTU richness varied significantly with impact (All OTUs: β = 191.5;SE 55.0, p = .02;Assigned OTUs: β = 100.4;SE 23.9; p = .01;Figure 2; Table S5).OTU richness was 44% higher at low-impact sites (mean OTUs per site = 1149) relative to highimpact sites (mean OTUs per site = 799; Figure 2).Similar patterns were seen when comparing the mean number of OTUs per ARMS between high-and low-impact sites (Table S5).Isolating the three most OTU rich phyla showed that arthropod OTU and annelid OTU richness within these groups was significantly lower at high-impact sites (Arthropoda: β = 45.4;SE 13.6; p = .03;Annelida: β = 21.4;SE 6.7; p = .03;Figure S4a,b), but this was not the case for mollusks (β = −4.7;SE 5.6; p = .45;Figure S4c).
The PCoA of Jaccard distances demonstrates that ARMS communities at high-impact sites differed substantially in composition from the three low-impact sites and the medium-impact site (Figure 3).
ARMS from high-and low-impact sites clustered separately in community space along the first principal component, which accounted for 12.7% of the variation in the data (Figure 3).High-impact sites were further separated from each other along the second principal component, which accounted for an additional 8.7% of the variation, while low-impact sites had much lower variance along this axis.
OTUs were highly site specific, with 52% and 63% of OTUs found only at a single site in assigned OTUs and all OTU datasets, respectively (Figure 4).Among the assigned OTUs, 1449 (37%) were present only within low-impact sites, while 964 (25%) were present only within high-impact sites; these impact-and site-specific OTUs represented a range of phyla ('unique' in Figure 4).There were 110 OTUs categorized as indicator species of low-impact sites, that is OTUs present at all low-impact sites and no high-impact sites.Indicator species of high-impact sites were fewer at just 22 OTUs ('indicator' in Figure 4).

| Community phylogenetic structure
Patterns in community phylogenetic structure, as measured by MNTD, varied among the phyla examined (Figure 5; Figures S6 and  S7).For arthropod OTUs, site-level MNTD analysis showed significant phylogenetic clustering throughout both high-and low-impact sites (3/3 high-impact and 3/3 low-impact sites, p < .05; Figure 5a), but not at Sham Wan (SW; 0/1 medium-impact site).Phylogenetic clustering of arthropod communities was also detected broadly at the ARMS level, with significantly negative MNTD values measured at 17 of 18 high-impact, 3 of 5 medium-impact and 15 of 18 lowimpact ARMS (p < .05; Figure 5b; Figure S7c).Annelid OTUs showed significant clustering at all high-impact sites (p < .05; Figure 5a), but single high-impact site, SSW (Figure 5a), while the ARMS level analysis detected clustering in three ARMS from PC, a high-impact site (Figure 5b; Figure S7a).Positive z-scores, a sign of significant evenness in phylogenetic structure, occurred most often among mollusks from low-impact sites, but this was only found to be significant for a single ARMS (TPC-W2; Figure 5b).

| DISCUSS ION
Coastal urban centres can have intense impacts on nearby marine ecosystems, which historically host high levels of biodiversity (Costello & Chaudhary, 2017;Halpern et al., 2008;Myers et al., 2000;O'Hara et al., 2021;Pandolfi et al., 2003).Using globally standardized ARMS samplers combined with metabarcoding of the 'cryptobiome' (the small and/or hidden communities of organisms), we found that biodiversity within a highly urbanized coastal marine ecosystem rivalled or exceeded levels found at similar spatial scales within coastal marine habitats examined globally (Al-Rshaidat et al., 2016;Carvalho et al., 2019;Pearman et al., 2018;Ransome et al., 2017) (Figure 6).As expected, nearness to point sources of human impacts within the region was correlated with substantially reduced biodiversity, but all sites maintained a similar proportion of OTUs per phyla.Furthermore, being highly distinct from less polluted sites and from each other, these sites contributed substantially to the total animal diversity of the region.Phylogenetic analysis of the three major phyla suggests that environmental conditions can act as a filter, restraining the diversity of arthropods and annelids, but not mollusks, found at a particular site to a subset of lineages present within the region.Thus, metabarcoding of ARMS communities provides a critical perspective on the complex effects of urbanization in situ with important information for the management and conservation of biodiversity in the face of global coastal urbanization.

| Overall biodiversity patterns
Our comparison of biodiversity across human-impacted sites in the PRD, and across several tropical and subtropical locations elsewhere, is derived from using standardized sampling techniques The Hong Kong Register of Marine Species currently has recorded ~5400 marine species in the region (Astudillo et al., 2023), whereas we detected >7000 OTUs from just 41 small benthic samplers placed at seven locations.This highlights the massive amount of previously unaccounted biodiversity that can be revealed by metabarcoding the 'cryptobiome' of small and/or hidden organisms within marine ecosystems.Second, arthropods and annelids are consistently the most significant contributors to species richness within benthic marine communities (Al-Rshaidat et al., 2016;Carvalho et al., 2019;Ip et al., 2022;Leray & Knowlton, 2015;Pearman et al., 2018Pearman et al., , 2020;;Ransome et al., 2017).Third, most sequences detected using metabarcoding approaches cannot be readily identified.Among all OTUs in this study, ~53% remained unassigned even at the phylum level.This proportion is higher than, but in line with, results from other ARMS-based metabarcoding studies and highlights the huge gap in our accounting and understanding of marine biodiversity.
Furthermore, only 7% of the sequences from this study matched previously barcoded species at a 95% similarity level.Among those species matches, over 15% were COI species barcodes generated by our own efforts in sequencing macrofauna, despite these sequences making up an exceedingly small proportion of the global database.
While diversity patterns followed the same trends with and without the inclusion of unassigned OTUs, the poor representation of regional taxa within major publicly available databases demonstrates the need for more targeted biodiversity and barcoding efforts within understudied regions.
Despite natural freshwater inputs generally decreasing salinity within high-impact sites, coastal marine communities around the PRD were once more spatially homogenous, with corals being abundant across the seascape (Cybulski et al., 2020;McCorry & Blackmore, 2000).However, rapid urbanization of the upper Pearl River Delta and subsequent peaks in sewage-associated eutrophication, with relatively little change in salinity and temperature regimes over the last 40 years, have been clearly linked to the decline in corals nearest to the estuary (Duprey et al., 2020).This degradation of water quality has driven a change to more distinct coral communities within the region, including the almost complete extirpation of environmentally sensitive Acropora species near point sources Beta diversity of OTUs.PCoA analysis illustrating dissimilarities in community composition based on Jaccard dissimilarity matrices.Analysis was conducted at the ARMS level, and ellipses were drawn around sites grouped by impact using normal data ellipses computed using a multivariate t-distribution and a confidence interval of 95%.Sites are distinguished by symbol colour and deployment dates by shape of symbol.
of pollution (Cybulski et al., 2020;Duprey et al., 2020;McCorry & Blackmore, 2000).ARMS metabarcoding data confirmed that diversity differences in the benthic communities within high-and low-impact sites remain pervasive, though the exact environmental drivers of community-level changes are difficult to disentangle and likely differ among taxa (Figure S2).In more recent years, including during the ARMS deployments, water quality data show clear improvements from pollution levels previously recorded at our highimpact sites (Duprey et al., 2017).This provides an opportunity for future community-level studies on ecosystem resilience and recovery within the context of intensifying climate change.
Relative to low-impact sites, high-impact sites had lower OTU richness (Figure 2b).However, despite being 44% less diverse than low-impact sites, 30% of regional OTUs (27% of assigned OTUs) were found exclusively within the high-impact sites (unique in Figure 4), identifying them as a significant biodiversity reservoir.Surprisingly, our medium diversity site, SW, was by far the most OTU rich, despite the data coming from five ARMS at a single site (1 ARMS was lost).
Without additional replication, it is difficult to identify what aspects of this location support such high levels of biodiversity.
Not only did high-and low-impact sites form distinct clusters in community space, but beta analysis also revealed higher variation in community composition among high-impact sites relative to low-impact communities (Figure 3).Environmental stressors, including nutrient pollution, can drive increases in beta diversity in other systems as well.The microbiomes of corals and other hosts under stress tend to be more variable than those in a healthy state.
Recently termed the 'Anna Karenina principle', the observation that the microbial diversity of healthy individuals 'look alike', while the microbiomes of unhealthy individuals are more varied, with 'each unhealthy in its own way' (Zaneveld et al., 2017) may also be applicable to community biodiversity and ecosystem health.Indeed, eutrophication has been identified as one of the most significant and widespread threats to aquatic ecosystem stability (Smith & Schindler, 2009).

| Evidence for environmental filtering and differences among phyla
Our expectation was that the environmental conditions, particularly at high-impact sites, would limit phylogenetic diversity within those sites and lead to significant phylogenetic clustering.This was generally true; clustering, as indicated by negative z-values in MNTD analysis, was widespread among the high-impact sites (Figure 5).
However, evenness, predicted to be more common at low-impact sites due to the effects of competition, was rarely observed, and the intensity and consistency of this signal varied among phyla (Figure S7).
In arthropods, significant phylogenetic clustering within an ARMS or site was common at both high-and low-impact sites.The existence of clustering even at low-impact sites may reflect the fact that anthropogenic impact within low-impact sites was still pronounced with, for example, higher nitrogen concentrations than is typical in oceanic coral reef habitats (Duprey et al., 2016).The ubiquity of observed phylogenetic clustering herein suggests that at least some groups of arthropods are particularly sensitive to environmental stress.
Annelids, like arthropods, showed significant evidence of environmental filtering at high-impact sites but much less evidence of filtering at low-impact sites.Marine annelids constitute a large proportion of benthic community biomass, including a broad spectrum of functional and morphological groups (Dean, 2008) F I G U R E 6 OTU accumulation curves.OTU richness per ARMS was used to compare the diversity of community data sets in several studies using ARMS as biodiversity collectors in the Pearl River Delta (PRD; this study), the Red Sea (Pearman et al., 2018), the West Atlantic (Leray & Knowlton, 2015) and Singapore (Ip et al., 2022).For the PRD, we also plotted OTU data from lowimpact and high-impact sites separately for comparison.Error bars on PRD data represent the standard deviation over 500 permutations.For each location, the accumulated richness of all OTUs, including those without taxonomic assignments, is shown.
study (Figure 2).Annelids have been shown to be useful indicators of ecosystem responses to changes in environmental conditions with groups that differ in sensitivity to organic nutrient pollution as well as heavy metal toxicants (Dean, 2008;Sobczyk et al., 2020).
While many annelids in this study were found to be site specific, 21 annelid OTUs were identified as potential indicators of low human impacts broadly, that is found at all three low-impact sites and no high-impact sites (indicator species in Figure 4).However, only three OTUs were identified as high-impact indicators, suggesting that the most important environmental features for structuring annelid diversity varied more among high-impact sites.Improvements in the taxonomic assignments of OTUs are needed to shed light on those specific environmental drivers and add to a broader understanding of annelid functional and phylogenetic groups.
Conversely, mollusks showed little significant patterning in phylogenetic structure across sites or impact groups (Figure 5).As with other phyla, there was a trend for high-impact sites to have negative MNTD values indicative of clustering, but in contrast to the other phyla, many low-impact sites had positive MNTD values indicative of evenness, and few values were statistically significant in either direction for either ARMS or sites (Figure 5).At the phylum level, the diversity of form and function within the Mollusca is unparalleled (Sigwart et al., 2021).When highly diverse groups are analysed, evenness among shorter branches of closely related species and genera can be overshadowed by clustering at deeper levels (Silvertown et al., 1999(Silvertown et al., , 2001;;Webb et al., 2002Webb et al., , 2006)).Data on the diversity and distribution of Strombidae, a large group of Indo-Pacific gastropods, indicated that species richness was a poor predictor of the morphological diversity within a site (Roy et al., 2001).Interestingly, the OTU richness of mollusks, when analysed separately, was not significantly reduced within high-impact sites.
As the taxonomic and functional resolution of metabarcoding assignments improves, these datasets may provide further insight on ecological and evolutionary mechanisms that underpin community assembly and perhaps more clarity on why phyla show distinct patterns.Given that clustering was relatively widespread, it's likely that various environmental features (e.g.salinity, nutrients, turbidity or combinations thereof) can serve as filters of phylogenetic diversity in marine environments.Overall, our data follow a trend in the literature on plants and animals broadly for clustering to be more often detected in phylogenetic analyses than evenness (reviewed in Webb et al., 2002).This furthers the debate about whether ecologically relevant traits, that is those related to competition or resource exploitation, can be consistently linked to phylogenetic relatedness given that they can involve combinations of numerous characters and contextual trade-offs between them (Best & Stachowicz, 2013;Cavender-Bares et al., 2009;Webb et al., 2002Webb et al., , 2006)).However, the increasing frequency of extreme weather events spurred by global climate change (i.e.heatwaves and intense rainfall) may increase the strength and spatial scale of environmental filtering towards high tolerance and/or quickly evolving lineages, and further disrupt the ecological relationships that shape community and phylogenetic structure.

| Urban marine biodiversity and conservation
Globally, urbanization is causing extreme alterations to natural habitats, but previous ARMS-based studies have focused on relatively pristine areas.In terrestrial systems, which make up the bulk of community-level urbanization studies, urbanization tends to drive biotic homogenization among cities (McKinney, 2006;see Lokatis & Jeschke, 2022).In marine environments, changes in coastal structure and intensive alterations to water quality, for example extreme eutrophication, can lead to the extirpation of native species, allowing for the survival of only a subset of species that can cope in highly impacted landscapes (Airoldi, 2000;Airoldi & Bulleri, 2011;Altieri et al., 2017;Chihoub et al., 2020).In the region we studied, oceanographic conditions produce environmental gradients of various inputs where, for example, poor water quality can be diluted by oceanic flushing at relatively small spatial scales, leaving a patchwork of distinct communities.However, local stressors can act synergistically with global climate change, which is expected to drive region-wide increases in extreme heatwaves and rainfall events (Tan et al., 2022;Wang et al., 2021).While historically uncommon (Duprey et al., 2020), a recent extreme heating event triggered unprecedented coral bleaching throughout the region, which is often considered a thermal refugium (Zhao et al., 2023).Extreme rainfall events increase the delivery of terrestrial sediments, which constrain coral distributions (Goodkin et al., 2011), and nutrients, which are known to exacerbate coral bleaching and disease (Thurber et al., 2014).Concomitant with coastal infrastructure and the extirpation of natural barriers such as mangroves and seagrasses, this can increase the intensity and spatial extent of water quality degradation (Chen et al., 2021;Zhang et al., 2009).
Within the Pearl River Delta, areas with the poorest water quality had reduced benthic marine species and phylogenetic diversity.However, species at highly impacted sites were not just a subset of more broadly distributed regional fauna, but almost entirely distinct communities (Figures 3 and 4).The nascent field of urban marine ecology suggests that heterogeneous mosaics may be a common outcome of coastal urbanization, due to the characteristics of the drivers as well as the marine environment itself (Todd et al., 2019).For example, differences in flow regimes and oceanic flushing can allow for pronounced gradients in nutrient enrichment and sediment pollution at fine scales (Arévalo et al., 2007;Chen et al., 2019), and artificial structures can be quickly colonized by novel assemblages (Heery et al., 2018).In this study, the complex habitat provided by ARMS was quickly colonized, even in areas with little structural complexity otherwise (i.e. the soft, uncompacted sediments in San Shek Wan), attesting to the ability of a phyletically diverse, distinct set of marine benthic organisms to tolerate poor water quality.
Given the increasing threats to biodiversity, its conservation is becoming a ubiquitous goal of management and restoration.Yet the strategies and even the metrics of success are not well defined (Bonebrake et al., 2019;Carvalho et al., 2017;Fontoura et al., 2022; 1 min.Triplicate samples were combined prior to attaching indexing barcodes and sequencing adaptors.Samples were sequenced at the Genomics Core Centre for PanorOmics Sciences (CPOS) of The University of Hong Kong on an Illumina MiSeq with Pair-End 251 bp F I G U R E 1 Sampling sites, point sources of human impacts and water quality.(a) Map showing urbanization around the Pearl River and Pearl River Delta, with the region of study identified by the black box.(b) Region of study showing sampling locations marked as stars and colours showing high impact (red), medium impact (yellow) and low impact (blue), with point sources of high impacts marked by arrows.Nearby points indicate the location of water quality monitoring stations.(c) Water quality parameters measured monthly from January 2018 to August 2019 at monitoring stations nearby study sites (see panel b), averaged and scaled with Heatmap colours reflecting standard deviations from the mean.Clustering is based on Euclidean distances.Data are collected and made publicly available by the Hong Kong SAR, Environmental Protection Department.Map images from: https:// earth.google.com/ .Map lines delineate study areas and do not necessarily depict accepted national boundaries.
was used to calculate MNTD within the three phyla at the ARMS and site level.Null communities were generated by randomly shuffling species across the tips of the phylogeny in 999 simulations.The standardized effect size (z-score) of the observed MNTD obs versus MNTD null was calculated by (MNTD obs − MNTD null )/s.d.(MNTD null ).p-Values represent the elsewhere only at a single low-impact site, TPC (p < .05 in 3/18 ARMS and 1 of 3 sites).ARMS level analysis again showed complementary patterns, with 15 of 18 ARMS from high-impact sites, but only 3 of 18 ARMS from low-impact sites and 2 of 5 ARMS at the medium site showing significant clustering (p < .05;Figure5b; FigureS7b).Few instances of significant phylogenetic structure were detected among molluscan OTUs.At the site level, significant clustering occurred at a F I G U R E 2 Phylum-level diversity of COI sequences.(a) Proportion of OTUs per phylum per ARMS sampled across seven locations and two seasons.'Unassigned' indicates OTUs that could not be matched at the phylum level.Each column is labelled by its location as well as the season during which it was deployed/ retrieved (S: summer/summer, W: winter/ winter).(b) The number of unique OTUs per phylum from each site, excluding unassigned OTUs.
, and they indeed made up a large proportion of the genetic diversity in this F I G U R E 5 Community phylogenetic structure.Phylogenetic structure was assessed within the three most diverse phyla at the (a) site level and (b) ARMS level (labels indicate the location and the season in which the ARMS was deployed/retrieved; S: summer/ summer, W: winter/winter).Mean nearest taxon distances (MNTD) were calculated for OTUs within each phylum.Standardized effect sizes (z-scores) of MNTD in observed versus randomized communities were used to identify significant patterns in clustering and evenness.Black asterisks denote significantly lower than expected observed MNTD reflecting phylogenetic clustering (*p < .05,**p < .01,***p < .001),while white asterisks denote significantly higher than expected MNTD reflecting phylogenetic evenness (*p > .95,**p > .99,***p > .999).
Means and standard deviations in high-and low-impact sites for the water quality indicators of interest recorded at sites from January 2018 to August 2019.