Large‐scale eDNA metabarcoding survey reveals marine biogeographic break and transitions over tropical north‐western Australia

Environmental DNA (eDNA) metabarcoding has demonstrated its applicability as a highly sensitive biomonitoring tool across small spatial and temporal scales in marine ecosystems. However, it has rarely been tested across large spatial scales or biogeographical barriers. Here, we scale up marine eDNA metabarcoding, test its ability to detect a major marine biogeographic break and evaluate its use as a regional biomonitoring tool in Australia.


| INTRODUC TI ON
Broad-scale biomonitoring of marine environments is integral for the detection of biological changes, stressors and shifting baselines over large spatial and temporal scales (Dafforn et al., 2016).
Typically, these approaches utilize rapid assessment methods, such as underwater visual census (UVC), marine manta tow and baited remote underwater video (BRUV) surveys (Ellis et al., 2011;Gaertner et al., 2013;Piacenza et al., 2015) that provide information to distinguish broad-scale indicators and subsequently direct further research efforts to areas of interest. However, the application of these techniques is not suitable in marine environments with limited visibility and other safety hazards to divers, for example the presence of saltwater crocodiles (Crocodylus porosus).
The advent of environmental DNA (eDNA) metabarcoding coupled with next-generation sequencing (NGS) has enabled the genetic detection and profiling of a wide range of biota present in environmental samples (e.g. water, scat and soil etc.). Environmental DNA metabarcoding has the potential to be utilized as a sensitive, cost-effective, and rapid broad-scale biomonitoring tool and is particularly well suited to marine environments (Thomsen et al., 2012;Thomsen & Willerslev, 2015;Valentini et al., 2016).
Importantly, the collection of surface water (or at depth with a water sampler) for eDNA analyses bypasses logistical and safety hazards associated with visual surveillance work in turbid and dangerous marine environments. Furthermore, eDNA-derived compositional data can provide greater biological coverage to distinguish spatial and habitat variation, identify network associations, trophic structure, biological invasions and the presence of critically endangered species (Valentini et al., 2016). Whilst eDNA metabarcoding has demonstrated its applicability across small, yet highly sensitive, spatial (Jeunen et al., 2019;O'Donnell et al., 2017;Port et al., 2016;West et al., 2020) and temporal scales (Berry et al., 2019) in marine ecosystems, it is in a preliminary stage of being scaled up and tested across broader regional scales (Aglieri et al., 2020;Fraija-Fernández et al., 2020).
The extensive coastline of north-western Australia (NWA) supports a diverse array of tropical marine habitats and biota, extending from offshore coral reefs on the edge of the continental shelf, to coastal intertidal sand, rock and reef habitats, constituting 12 distinct bioregions (Wilson, 2014). A profound change in the underlying geomorphology of the Canning and Kimberley basins, from Cretaceous-Cainozoic sedimentary (largely sandstone) rocks to Proterozoic metasedimentary, metamorphic and igneous rocks, has shaped various coastal marine habitats in these bioregions (Wilson, 2014). The Canning bioregion comprises coastlines typified by benthic soft substrates, such as intertidal sand and mudflat habitats with very little coral reefs, whilst the Kimberley bioregion is dominated by rocky, intertidal platforms, fringing coral and offshore coral reefs, and substantial mangrove habitat (Richards et al., 2018;Wilson, 2014). Environmental conditions and connectivity patterns across these bioregions are additionally shaped by various oceanic currents, immense tidal systems (macrotides ranging up to 11 m in the Kimberley), seasonal discharge and extreme turbidity from major rivers (Semeniuk, 1993;Thackway & Cresswell, 1998). Temperature varies between the bioregions and also across subregions, semiarid in the Canning, sub-humid in the southern Kimberley, humid in the northern Kimberley and sub-humid in the northeast Kimberley (Cresswell & Semeniuk, 2011).
This environmental variation is purported to contribute to a major biogeographic break at Cape Leveque -the tip of the Dampier Peninsula, demarcating the border between the Kimberley and Canning bioregions -see Figure 1 (Travers et al., 2010;Wilson, 2014). A significant change in the fish assemblage composition across Cape Leveque (Hutchins, 2001a) likely reflects the latitudinal transition in benthic substrates, overlaid on a strong bioregional effect reflecting various habitat, tidal and riverine discharge influences (Travers et al., 2006(Travers et al., , 2010. Population connectivity studies in bony fish (stripey snapper; Lutjanus carponotatus and blackspotted croaker; Protonibea diacanthus) and corals (Isopora brueggemanni and Acropora aspera) further revealed a genetic transition zone across Cape Leveque, with dispersal and gene flow likely constricted by extreme tidal flushing at the head of King Sound Taillebois et al., 2017;Underwood et al., 2017).
The aim of this study was to conduct a broad-scale multimarker eDNA metabarcoding survey across the extensive coastline of NWA in order to: (a) detect the purported biogeographic break across Cape Leveque using eDNA-derived bony fish, shark and ray and aquatic reptile taxonomic compositional data, (b) update distributional information for endangered elasmobranchs, such as sawfish (family Pristidae) and the northern river shark (Glyphis garricki), marine turtles (superfamily Chelonioidea) and for data-deficient taxa such as sea snakes (subfamily Hydrophiinae), and (c) evaluate the overall strengths and weaknesses of eDNA metabarcoding as a biomonitoring tool when used across a broad geographic region.
Given the remoteness of NWA, long-term monitoring programmes are sparse, particularly for species that are not of commercial value . As such, there is a great potential to integrate eDNA metabarcoding as a long-term, large-scale biomonitoring tool the sustainable management and conservation of marine biodiversity in this unique marine region.

K E Y W O R D S
biogeographic, biomonitoring, elasmobranch, environmental DNA, Kimberley, large-scale, marine biodiversity, marine reptile, teleost, threatened species in NWA, capable of providing distribution information on a wide variety of taxa.

| DNA extraction
DNA was extracted from half of the membrane using a DNeasy Blood and Tissue Kit (Qiagen) with the following modifications: 540 μl of ATL lysis buffer, 60 μl of Proteinase K and a 3-hr digestion at 56ºC. Extracts were eluted in 100 μl of Buffer EB. This was completed within four weeks of collection. Extraction blank controls were processed in parallel with all samples to detect any cross-contamination. Genomic DNA extracts were then stored at −20°C.

| Metabarcoding assay design, amplification and library sequencing
Three PCR metabarcoding assays were employed: 16S Fish, COI Elasmobranch and 16S Reptile (Table 1)  The 16S Reptile assay has been recently designed to amplify northern Australian aquatic reptiles, such as sea snakes, turtles and crocodiles (West et al., 2021).
Following quantitative PCR-based (qPCR) quantification to optimize levels of input DNA (Murray et al., 2015), final qPCR was performed in a single step using fusion tagged primer architecture consisting of Illumina compatible sequencing adaptors, a unique index (6-8bp in length) and a respective primer sequence for each assay. All qPCR reactions were prepared in dedicated clean room facilities at the TrEnD Laboratory, Curtin University, and are described in detail in Section S1. Quantitative PCR amplicons were pooled at equimolar ratios based on qPCR ΔRn values and size-selected using
We then consolidated our assay data into three taxonomic-based TA B L E 1 PCR assay information for marine eDNA metabarcoding across the Canning/Kimberley bioregions

Target length (bp)
Annealing temp

(°C)
Primer reference Note: Three primer sets: 16S Fish, 16S Reptile and COI Elasmobranch corresponding to the mitochondrial 16S rDNA and COI regions were applied to all collected seawater samples. In the primer name, "F" refers to the forward primer and "R" to the reverse primer.

16S Fish
datasets: Actinopterygii (bony fish), Elasmobranchii (elasmobranchs, i.e. sharks and rays) and Reptilia (reptiles), which allowed us to examine community composition by discrete taxonomic group, rather than by individual metabarcoding assays which contained some overlap in taxonomic detections.

| Statistics
Community composition variation was analysed across the study region, with a particular focus across the purported biogeographic break at Cape Leveque. In order to control for the effect of differing habitats, variation was tested between sites within inshore, coastal and nearshore estuarine habitats independently. Presence-absence data for each taxonomic dataset (bony fish, elasmobranchs and reptiles) were converted to Jaccard similarity matrices and tested for compositional variation by a distance-based linear routine (DistLM) using a step-wise selection procedure and adjusted R 2 criterion in the PERMANOVA + add-on (Anderson et al., 2008) (Table S1); longitude was omitted due to collinearity with latitude. Temporal variation in sampling was not included, given earlier research indicated that season only has a small influence on inshore reef fish composition in this region (Travers et al., 2006). DistLM analyses are capable of handling unbalanced designs (Anderson et al., 2008), in this case, where there are an unequal number of sites in the spatial predictor variable groups. We did, however, run an additional DistLM routine on a subset (Sites 1-11 and 42-44) of the data assemblages to assess the bioregional influence across 14 sites (seven sites directly on either side of the purported biogeographic break). Site variation was visualized by principal coordinate analysis (PCO) using the stats function "cmdscale" and predictor variables overlaid using the vegan "ordisurf" function (Oksanen et al., 2019) in R Studio (v1.1.423; R Core Team, 2015). Observed taxonomic richness at each site was tested for significance between bioregions and subregions using ANOVA and graphed in ggplot2 (Wickham, 2016) in RStudio. Additionally, similarity percentage analyses (SIMPER) were conducted in PRIMER to identify the top inshore and coastal taxa that contribute to pairwise dissimilarity between bioregions and subregions, where significant in the DistLM analyses. This elucidated whether variation in community composition between the bioregions and subregions is driven by uneven taxonomic richness and/or variation in compositional diversity.

| Sampling and sequencing statistics
The three eDNA metabarcoding assays yielded a total of 57,311,878 sequencing reads. The mean number of filtered sequences (post-quality, denoizing and chimera filtering) was 75,800 ± 41,071  (Table S2). ASV accumulation curves based on the addition of each sampling replicate per site indicated that four one-litre water replicates (selected a priori to sampling) were just shy of maximizing ASV richness for each of the three assays ( Figures S1-S3). On fitting a polynomial curve to the median accumulation curve for each assay, it was extrapolated that an average of 6.9, 6.1 and 6.1 one-litre water replicates would be required to maximize ASV richness for 16S Fish, COI Elasmobranch and 16S Reptile assays, respectively. The rarefac- ASVs, 40,820 total reads). These species were targeted for fisheries and/or commercial research on the sampling vessels with the exception of pilchards, which were utilized as bait for BRUV deployments. Only compromized ASVs were removed from subsequent analyses. For example, we retained 15 barramundi ASVs that were not detected in filtration and/or extraction blanks. We also detected salmon (genus: Salmo; 1 ASV, 9 total reads), which has previously been detected as a sporadic reagent contamination in both our workflows and other laboratories , and as such was entirely removed. We also omitted all ASVs that produced detection hits for taxa outside of our targeted taxonomic groups of bony fish, elasmobranchs and reptiles. This included humans (Homo sapiens), chicken (Gallus gallus) and horse (Equus caballus).

| Overall diversity
A total of 310 taxa (ranging from family to species-level assignments; 4.9 ± 9.9 ASVs per taxa) were detected by the 16S Fish assay, 139 taxa (1.3 ± 1.0 ASVs per taxa) by the COI Elasmobranch assay and 181 taxa (2.9 ± 5.2 ASVs per taxa) by the 16S Reptile assay, prior to subsampling ( Figure 2). Collectively, the three metabarcoding assays yielded 453 identifiable taxa, representing 96 families within 41 orders of bony fish, elasmobranchs and aquatic reptiles (Table S3). Forty-four elasmobranch taxa (class: Chondrichthyes, subclass: Elasmobranchii) were detected from 11 families within four orders (Table S3). The two most speciose elasmobranch families were the Carcharhinidae (requiem sharks; 15 taxa) and Dasyatidae (stingrays; 14), which collectively comprised over half of the total detected shark and ray taxa (Table S3). We detected five elasmobranchs that are listed as either "Endangered" or "Critically Endangered" on the IUCN Red List and are under various national and state protection management (see Table 2); these taxa were the largetooth sawfish Only five reptile taxa (class: Reptilia) were detected from five families within three orders (Table S3): the saltwater crocodile (Crocodylus porosus), the black-headed python (Aspidites melanocephalus), Stokes's sea snake (Hydrophis stokesii), the white-bellied mangrove snake (Fordonia leucobalia) and the green turtle (Chelonia mydas). Given the low frequency of detection of these taxa, reptiles were excluded from all multivariate analyses.

| Bony fish composition
Bony fish composition was examined independently in each habitat type (inshore, coastal and nearshore estuarine), excluding the mid-shelf habitat which was only comprised of one site. In regard to inshore bony fish compositions, a distance-based linear model   Figure 3a. Cumulatively, the formed models explained between 20.1% and 26.7% of total fitted variance between inshore fish assemblages (Table 3; Table S4).
Taxonomic richness of inshore bony fish did not significantly differ between the two bioregions (Table S5; Figure S13). This indicates that the detected bioregional variation was not influenced by uneven taxonomic richness, but solely compositional variation. Similarity percentage analysis (SIMPER) was used to identify prominent inshore bony fish taxa contributing most to pairwise dissimilarity between the Canning and Kimberley bioregions (Table   S6). This indicated a higher detection rate of sardinella (genus: Amblygaster), purple tuskfish (Choerodon cephalotes) and chub mackerels (genus: Rastrelliger) in the Canning bioregion, whilst the Kimberley region had a higher detection rate of Spanish mackerel (genus: Scomberomorus), giant trevally (Caranx ignobilis) and blue tuskfish (Choerodon cyanodus).
In examining coastal fish assemblages (those restricted to the South and North Kimberley subregions), a DistLM analysis indicated that subregion was a highly significant predictor variable, explaining 17.9% of fitted variance (Table 3; Table S7; Figure 3b). For bony fish composition in nearshore estuarine sites (only surveyed in the South Kimberley), depth was the only significant predictor variable, explaining 12% of the fitted variance (Table S8; Figure 3c).

| Elasmobranch composition
Elasmobranch composition across all inshore sites was found to be driven by subregion and depth, explaining 24.3% of fitted variance (Table 4; Figure 4a; Note: These were constructed using a sequential step-wise selection procedure and adjusted R 2 criterion. Significant codes are as follows: 0 < 0.001 "***," 0.001 < 0.01 "**"and 0.01 < 0.05 "*." The predictor variables highlighted in bold are significant (p < .05). Full DistLM results, including marginal tests and best solutions, are provided in Tables S4, S7 and S8. the North Kimberley (Table 4; Table S9). Cumulatively, the significant spatial predictor variables that formed models explained between 23.6% and 34.3% of total fitted variance between inshore sites (Table 4). Taxonomic richness of inshore elasmobranchs did not differ significantly between the three subregions (Table S10; Figure S14); however, within the Dampier Peninsula only two sites (post-subsampling) had detectable traces of elasmobranch taxa.
SIMPER analysis was used to identify prominent inshore elasmobranchs contributing most to pairwise dissimilarity between the three subregions (  Figure 4b). For the nearshore estuarine sites, there were no significant tested predictor variables that could explain the fitted variance (Table 4; Table S13; Figure 4c).

| Bony fish compositional transitions across NWA
The Canning and Kimberley bioregions have some of the least impacted marine and coastal ecosystems in the world (Halpern et al., 2008) with over 1,500 reported species of bony fish (Fox & Beckley, 2005;Moore et al., 2014Moore et al., , 2020. This synthesis is the result of numerous surveys and museum records since the 1880s (Hutchins, 2001b;Moore et al., 2014Moore et al., , 2020Paxton et al., 2006). Studies of inshore soft substrate and reef fish fauna across NWA (Travers et al., 2006(Travers et al., , 2010(Travers et al., , 2012  shelf region (Fox & Beckley, 2005;Hutchins, 1997Hutchins, , 2001a. In our study, which examined the inshore bony fish compositions that  Note: These were constructed using a sequential step-wise selection procedure and adjusted R 2 criterion. Significant codes are as follows: 0 < 0.001 "***," 0.001 < 0.01 "**"and 0.01 < 0.05 "*." The predictor variables highlighted in bold are significant (p < .05). Full DistLM results, including marginal tests and best solutions, are provided in Tables S9 and S12. environments (Simpfendorfer et al., 2016). With the exception of reef-associated species (MacNeil et al., 2020), existing compositional data from coastal species are based on limited, and invariably biased, observations from commercial fisheries (Braccini & Taylor, 2016;Field et al., 2012;McAuley et al., 2005). In examining shark and ray eDNA-derived compositional data across the Canning and Kimberley regions, we detected a significant subregional influence on inshore species. This reflected variation across the purported biogeographic break between the Dampier Peninsula (Canning) and South Kimberley and also between the South and North Kimberley regions.
This widespread subregional influence on inshore shark and ray composition across NWA was consistent with observations from elsewhere across northern Australia where composition and relative abundance have been shown to vary markedly at a range of spatial and temporal scales (Espinoza et al., 2014;Harry et al., 2011;Taylor & Bennett, 2013;White & Potter, 2004;Yates, Heupel, Tobin, Moore, et al., 2015;Yates et al., 2015a).
Northern Australia has a comparatively high elasmobranch biodiversity that includes many large-bodied and highly mobile species (Last & Stevens, 2009). Variability in species composition is not only influenced by regional conditions, but also reflects the complex life-history strategies of many elasmobranchs that includes behaviour such as inshore nursery usage (Simpfendorfer & Milward, 1993;Yates, Heupel, Tobin, Moore, et al., 2015;Yates et al., 2015b), partitioning by size and sex (Knip et al., 2012;Yates, Heupel, Tobin, Moore, et al., 2015), and seasonal migration between tropical and temperate waters (Braccini et al., 2018;Heupel et al., 2015). Disentangling such patterns is beyond the capability of presence-absence data alone; however, these data can nonetheless assist in corroborating existing patterns in composition as well as identify new ones.

| A new approach for surveying endangered, elusive and data-deficient taxa in NWA
The coastline of NWA exhibits high turbidity resulting from immense tidal action and seasonal discharge from major rivers (Semeniuk, 1993;Thackway & Cresswell, 1998  Section S3 for further discussion). Despite a low detection rate, this is the first study to our knowledge, to have detected crocodiles and sea snakes using an eDNA approach under field conditions.
A species-specific eDNA assay has previously been developed to detect the largetooth sawfish (P. pristis) across northern Australia (Simpfendorfer et al., 2016). A significant finding in our study was the detection of three out of the four globally endangered sawfish taxa (family: Pristidae) found in Australia using a metabarcoding approach.
The largetooth sawfish is a euryhaline elasmobranch species that was once globally distributed in tropical marine, estuarine and freshwater environments of the Eastern and Western Atlantic, Eastern Pacific and Indo-West Pacific; however, population declines and extirpation have led to significant range contractions (Kyne, Carlson, et al., 2013).
It is currently listed as Critically Endangered on the IUCN Red List of Threatened Species and is a protected species in Australia; Northern Australia may be the last viable stronghold for the Indo-Pacific population and likely comprises a large proportion of the remaining global population (Kyne, Carlson, et al., 2013;Last & Stevens, 2009 The northern river shark (G. garricki) is considered a rare species with limited distribution information and population estimates available (Field et al., 2013). All identified populations are considered to be of high conservation value and as such, recreational fishing of the species is banned under Australian federal law. The detection of the northern river shark in this study extends its current known distribution in Western Australia from scattered sightings in the King Sound (Compagno et al., 2008;Thorburn & Morgan, 2004, Ord River, King River and Joseph Bonaparte Gulf (Pillans et al., 2009) to the Gairdner River and the Walcott River in the South Kimberley region.
These additional distribution records of endangered elasmobranchs will contribute to recovery plans and management arrangements and are already contributing to locations where additional surveys will be undertaken.

| CON CLUS ION
This large-scale eDNA metabarcoding study across the coastlines of North-western Australia was able to detect a purported marine biogeographic break between the Canning and Kimberley bioregions.
This demonstrates that eDNA metabarcoding is a highly sensitive detection tool, capable of producing large amounts of high-resolution (e.g. to a species level) presence-absence data that can discern finescale patterns across large geographic regions. Further broad-scale applications of this technique could be used to potentially reveal marine biogeographic breaks in other regions. For example, significant phylogeographic structure in mantis shrimp and seahorses in southeast Asia is claimed to reflect historical oceanographic divisions; the former exhibits divergence along a sharp genetic break between the Indian and Pacific Ocean regions (previously separated during the Last Glacial Maximum), whilst the latter is separated into east and west lineages reminiscent of the terrestrial Wallace's Line (Barber et al., 2000;Lourie & Vincent, 2004). Environmental DNA metabarcoding could be used to assess whether these phylogeographic breaks reflect wider biogeographic partitioning in marine community compositions.
The eDNA samples resulting from this study will be archived and available for further assay applications extending beyond the taxonomic groups targeted in this study. Additionally, our sequencing data can be retrospectively analysed with expanding databases to resolve ambiguous taxonomic assignments. Our georeferenced sites herein will be used as a baseline for future eDNA biomonitoring and notably will direct targeted surveying for the critically endangered elasmobranchs across NWA. We anticipate that eDNA metabarcoding will be integrated into broad-scale monitoring tool kits, particularly in northern Australia, where it circumvents many of the logistical and safety limitations of visual surveillance. At present, this technique is limited in its ability to provide quantitative data in relation to population sizes and biomass. However, its ability to produce multi-taxon and potentially even whole-ecosystem data, without the need for taxonomic expertise, is both time-and cost-efficient. Amidst global population declines and resource limitations, innovative approaches to whole-ecosystem and biodiversity surveying are required to underpin the best practice management of fisheries, tourism and commercial interests in this remote region of Australia. We advocate that eDNA offers a promising demand-driven solution that is fast gaining traction when planning and executing biodiversity surveys.

ACK N OWLED G EM ENTS
We acknowledge the Department of Primary Industries and Regional Development (WA), BMT Oceanica and the Australian Research Council (LP160100839) for project funding. We give special mention to the crew of the RV Naturaliste, Sam Moyle, Laura Fullwood, Dion Boddington and Gabby Mitsopoulos for fieldwork assistance. Gratitude is also extended to the Wunambal Gaambera, Dambimangari, Mayala, Bardi Jawi, Nyul Nyul, Jabirr Jabirr and Yawuru people, the traditional owners of the areas sampled and to their indigenous rangers for assistance with sampling. We would like to thank Kate Sanders for providing the sea snake databases for reptile assay development. For access to the Zeus supercomputer, which sped up much of our bioinformatic processing, we would like to thank Pawsey Supercomputing Centre (Kensington, WA). Lastly, we would like to give thanks to everyone at the TrEnD Laboratory for invaluable eDNA assistance across the duration of the project.

PEER R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13228.

DATA AVA I L A B I L I T Y S TAT E M E N T
Demultiplexed (unfiltered) metabarcoding sequencing data and taxonomic (read abundance) matrices are available for download on Dryad Digital Repository (https://doi.org/10.5061/dryad.8kprr 4xmm).