Guidelines and considerations for metabarcoding-based port baseline biodiversity surveys: towards improved marine non-indigenous species monitoring

Monitoring introduction and spread of non-indigenous species via maritime transport requires performing port biological baseline surveys. Yet, the comprehensiveness of these surveys is often compromised by the large number of habitats present in a port, the seasonal variability of the inhabiting communities and the time-consuming morphological approach used for taxonomic classification. Metabarcoding represents a promising alternative for rapid comprehensive port biological baseline surveys, but, before this technique can be routinely applied in this context, standardized protocols should be developed. We applied metabarcoding using two alternative barcodes (based on the Cytochrome Oxidase I or the 18S ribosomal RNA gene) to about two hundred port samples collected i) from diverse habitats (water column – including environmental DNA and zooplankton, sediment and fouling structures), ii) at different sites within the port (from inner to outer estuary), and iii) during the four seasons of the year. Comparing the biodiversity metrics and taxonomic composition derived from each sample group, we investigated the impact of the different strategies for metabarcoding-based port biodiversity baseline surveys. Each sampling method resulted in a distinct community profile and water samples alone using universal primers did not produce comprehensive macroorganismal biodiversity to substitute organismal sampling. Sampling at different seasons and locations resulted in higher observed biodiversity, but nestedness analyses suggested that sampling could be reduced to two seasons. Metabarcoding also allowed to detect previously recorded non-indigenous species as well as to reveal presence of new ones, even if in low abundance. We provide the first comprehensive evaluation of metabarcoding for port biodiversity baseline surveys from which we have derived guidelines for how, when and where samples should be collected. These guidelines are key to develop the cost-effective and standardizable approach needed for monitoring non-indigenous species introduction to ports via ballast water and hull fouling, particularly relevant with the recent entry into force of the International Convention for the Control and Management of Ships’ Ballast Water and Sediments.


Introduction 48
Globalization has led to an increased maritime transportation, with more and 49 larger ships than ever before transferring species into ports via ballast water and hull 50 fouling (Seebens, Gastner, & Blasius, 2013). Among the thousands species arriving daily 51 (Carlton, 1999), some are Non-Indigenous Species (NIS) and can become invasive, 52 unbalancing the equilibrium of the recipient ecosystem (Bax, Williamson, Aguero, 53 Gonzalez, & Geeves, 2003). Thus, being important gateways for introduction of NIS, 54 ports need to be monitored to provide information required by legal frameworks aiming 55 at controlling biological invasions (Lehtiniemi et al., 2015). 56 One of these legal frameworks is the International Convention for the Control 57 and Management of Ships' Ballast Water and Sediments (IMO, 2004), currently in force. 58 This convention requires that ships treat their ballast water before its release to port, 59 unless they show that the risk of transferring NIS between the donor and recipient ports 60 is limited (David, Gollasch, & Pavliha, 2013). Such risk assessment relies on the 61 combination of port environmental (temperature and salinity) and biological 62 information. The obtention of biological information is done by cataloguing biodiversity 63 through Port Biological Baseline Surveys (PBBS). Given the diverse range of habitats 64

Raw read preprocessing, clustering and taxonomic assignment 191
After quality checking of demultiplexed paired-end reads with FASTQC (Andrews, 2010), 192 forward and reverse primers were removed using Cutadapt (Martin, 2011) with the 193 "anchored 5' adapter" and for "paired end reads" options and with the "linked adapter" 194 option for COI and 18S respectively. Forward and reverse reads were merged using PEAR 195 ( 2) the NICS dataset to identify non-indigenous indicator taxa of each sampling site. 224 These analyses were based on the IndVal index calculated as the product of the degree 225 of specificity (measuring the uniqueness to a sampling method or site) and the degree 226 of fidelity (measuring the frequency of occurrence within a sampling method or site) of 227 an OTU to a given sampling condition. Statistical significance of associations was 228 assessed by performing 10,000 permutations. The effects of season and locality on taxa 229 targeted through PBBS communities were tested for significance using a permutational 230 multivariate analysis of variance (PERMANOVA) after checking for multivariate 231 homogeneity of group dispersions (betadisper). PERMANOVA and betadisper were 232 performed on OTU abundance (Hellinger distance) and presence/absence (Jaccard 233 dissimilarities). The contribution of replacement (changes in OTU identity) and 234 nestedness (richness differences where one sample is a subset of a richer sample) to 235 beta diversity of taxa targeted through PBBS between seasons and between sites was 236 computed using the relativized nestedness index of Podani and Schmera (2011) based 237 on Jaccard dissimilarities matrix. For each sampling method, replacement and 238 nestedness were calculated between pairwise comparisons of 1) samples belonging to 239 the same site but sampled at different seasons (season variation of beta diversity), and 240 2) samples belonging to the same season but sampled at different sites (spatial variation 241 of beta diversity). The mean proportion of nestedness and replacement contribution to 242 beta-diversity between sites and between seasons was then calculated. 243 244

Results 245
Communities retrieved by each sampling method and genetic marker 246 A total of 5,718,639 and 7,055,675 COI and 18S barcode reads were kept for 247 analysis (Table S1) The percentage of reads assigned to species or higher taxonomic levels differed 260 between sampling methods and barcodes. The majority of the reads were assigned to 261 species level for zooplankton nets (COI: 84% and 18S: 75%), settlement plates (COI: 69% 262 13 and 18S: 69%) and sediment grabs (COI: 64%); for filtered water, although 51% of reads 263 were assigned to species level with 18S, only 8% could be assigned to a species with COI 264 (Fig. 3A). Filtered water analyzed with COI had also the largest proportion of reads that 265 could not be assigned to phylum (87%). This is possibly due to the amplification of non-266 metazoan taxa, which are underrepresented in the BOLD database (Fig. S3), which was 267 confirmed with BLAST searches against GenBank (data not shown). 268 Both barcodes detected a wide range of eukaryotic groups (Fig. 3B). With COI, 269 the greatest majority of reads belonged to metazoans, with Arthropoda dominating all 270 sampling methods except sediment grabs, dominated by Annelida. With 18S, a more 271 diverse spectrum of taxa was retrieved, including phytoplankton and macroalgae; as 272 expected, the number of reads assigned to these phyla in zooplankton nets and 273 settlement plates was very low in comparison to filtered water. 274

Distribution of taxa targeted through port biological baseline survey 276
A total of 16,828 and 9,091 OTUs were kept as taxa targeted through PBBS for 277 COI and 18S respectively. From those, indicator analysis identified respectively 2,600 278 (15%) and 1,700 (19%) OTUs significantly associated to one of the sampling methods. 279 Settlement plates were associated with 1,268 (COI) and 808 (18S) OTUs, zooplankton 280 nets with 1,001 (COI) and 315 (18S) OTUs, and sediment grabs with 238 (COI) OTUs. Only 281 12 OTUs (9 of which were metazoans) were indicators of filtered water with COI while 282 580 (3 of which were metazoans), were associated to this sampling method with 18S. 283 We observed expected strong associations of some taxa to one particular sampling 284 method: fouling bivalves (Ostreoida, Mytilida) with settlement plates, copepods 285 (Calanoida, Cyclopoida) with zooplankton nets and sea urchins (Spatangoida) with 286 14 sediment grabs (Fig. 4). Yet, less obvious associations were also observed such as 287 barnacles (Sessilia) and two polychaeta orders (Spionida, Phyllodocida) with 288 zooplankton nets or dinoflagellates (Peridiniales and Gymnodiniales) with settlement 289 plates, illustrating the advantages of a complementary sampling approach to recover 290 the diversity of these taxonomic groups. 291 The need for complementarity of sampling methods was further confirmed by 292 PCA on PBBS target taxa (Fig. 5). For both barcodes, expected taxonomic differences 293 were found between the sampling methods: sediment grabs were characterized by the 294 polychaeta Maldane glebifex (Fig. 5B), zooplankton nets were distinguished by several 295 copepods (Fig. 5B, D) and settlement plates, by the encrusting Semibalanus balanoides 296 (Fig. 5D). Filtered water communities were characterized differently according to each 297 barcode. For COI, filtered water samples were not differentiated as a distinct group and 298 were in general close to those retrieved with zooplankton samples collected at the same 299 season (Fig. 5A). In contrast, for 18S, filtered water was different due to the presence of 300 the phytoplankton species Phaeocystis globosa (Fig. 5D); yet, when targeting only 301 metazoan taxa, the patterns observed with COI were similarly to the ones retrieved with 302 18S (Fig. S4A). Concerning filtered samples, when looking only at metazoan taxa, the 303 proportion of OTUs detected was largely different compared to the other methods (Fig.  304   S5). For instance, Decapoda orders had a low OTU diversity with filtered water (COI: 305 0,7%; 18S: 7%), whereas more OTU diversity could be detected for Calanoida (COI: 8%; 306 18S: 35%), Sabellida (COI: 15%; 18S: 28%) and Leptothecata (COI: 13%; 18S: 58%). In 307 general, filtered water recovered the smallest metazoan OTU diversity. 308 309

Influence of sampling seasonality and locality on detected biodiversity 310
Seasonal variations contributed significantly to differences in community 311 composition of zooplankton nets and filtered water but not of sediment grabs (Table  312 S2). Total OTU richness and unique OTU richness varied between seasons for all 313 sampling methods (Fig. 6A, B). Spring and late summer were generally richer than 314 autumn and winter, excepted for filtered water communities depicted with 18S. For 315 sediment grabs and zooplankton nets with 18S, a strong proportion of OTUs found in 316 autumn and winter were a subset of OTUs retrieved in late summer and spring. Indeed, 317 the nestedness component of sediment grab assemblages represented between 60 to 318 70% of total compositional variations among these pairs of seasons (Fig. 6D). For 319 zooplankton net assemblages, 63 and 72% of nestedness was observed between late 320 summer and spring and between late summer and autumn respectively, suggesting that 321 late summer recovered a majority of spring and autumn diversity (Fig. 6D). Thus, the 322 combination of spring and late summer for sediment grabs retrieved 85% of the total 323 OTU richness over the four seasons, while for zooplankton nets, in late summer alone 324 71% was retrieved (Fig. 6C, E). For filtered water and settlement plates, seasonal 325 community changes were driven by both OTU replacement and nestedness, with similar 326 relative contributions (Fig. 6D). The seasonal influence observed in intra-port samples 327 was also observed between ports (Fig. S6). The differences in communities between 328 Bilbao, A Coruña and Vigo (ports belonging to the same ecoregion but separated by over 329 500 km) during the same season was generally smaller than that between seasons in the 330 same port, indicating that communities were driven by seasonality rather than location 331 ( Fig. S6 and Table S3). This pattern was more pronounced with 18S than with COI. 332 Locality within the Bilbao port appeared to impact on benthic assemblages (Table  333 S2), since sites outside the estuary (sites 1 to 3) were different from site 4 inside the 334 estuary; mainly, the polychaete Maldane glebifex was less abundant in site 4 (Fig. 5B). 335 Regarding zooplankton nets, site 4 was the main driver of difference in communities as, 336 when not considered, no significant differences between sites were observed (Table S2). 337 Each site harbored a similar proportion of the total OTU richness found by each sampling 338 method (Fig. 7C). This proportion did not exceed 60%. Indeed, OTU replacement 339 contributed more to community variation among sites than nestedness, especially with 340 sediment grab and settlement plates (Fig. 7D). OTU replacement was more important 341 when comparing sites from outside the estuary (sites 1, 2 and S) with site 4 inside the 342 estuary, while it contributed less to community variation among sites 1,2 and 3. This is 343 congruent with site 4 having generally more unique OTUs than the other sites (Fig. 7B). 344 An exception was observed for filtered water with COI, where site 4 had the lowest OTU 345 richness and unique OTU richness, and where nestedness contributed more to the 346 variation in community composition in comparison to sites 1 and 2 (Fig. 7A, B, D). 347 348

Detection of Non-Indigenous and Cryptogenic Species (NICS) 349
Our port baseline biodiversity survey detected 79 NICS, among which 29 were 350 previously recorded in the port Bilbao ( In order to be comprehensive, port baseline surveys need to have an extensive spatial 368 and temporal coverage so that most biodiversity is captured by the collected samples. While it is expected that increased temporal coverage will retrieve more taxa, 425 the costs associated to extensive sampling demands an evaluation of the benefits gained 426 by sampling at each season. Indeed, the HELCOM/OSPAR protocol limits sampling to late 427 summer for sediment and fouling, and to spring and late summer for plankton. Here we 428 20 show that for sediment and fouling, spring sampling provides higher diversity than late 429 summer. Although sampling in late summer could be appropriate for morphological 430 taxonomy because of the more abundance of adult individuals, for metabarcoding, 431 sampling in spring is preferable because during this season i) sizes of organisms are less 432 variable and thus metabarcoding is less likely to under detect small organisms (Elbrecht,433 Peinert, & Leese, 2017) and ii) species diversity is at its maxima due to being a high Concerning spatial sampling, we observed no significant differences between 447 sites when using eDNA, except between the only site located in the inner estuary, and 448 the others and only when using zooplankton, and significant differences overall when 449 using sediment grabs. The proportion of the total OTUs found in a single site was low, 450 suggesting that spatially comprehensive sampling is crucial to recover the port´s 451 biodiversity. Interestingly, site 4 was not only the most different from all four sites but 452 was also the one that recorded the largest number of NIS. In coherence with our 453 findings, it has been observed that brackish environments favors NIS settlement (Zorita 454 et al., 2013) because these species usually support wider range of salinity (Cardeccia et  455 al., 2018). Thus, samples for port monitoring should include those with a wide range of 456 abiotic conditions and covering priority sampling, such as highly active ship berths, 457 potential reservoirs of newly arrived NIS (Hewitt & Martin, 2001). 458 459

Metabarcoding provides valuable information on Non-Indigenous Species 460
Our metabarcoding port biodiversity baseline survey detects species previously 461 recorded in the port of Bilbao and species known to be present in the port´s large marine 462 ecoregion. Importantly, it also detects presence of seven species for which no records 463 exist so far in this ecoregion, highlighting the potential of metabarcoding for early NICS 464 detection. Nonetheless, not all NICS species previously recorded in Bilbao were found 465 by our analyses. This might be due to these species not being present in the port at the 466 time of our survey, to biases of the metabarcoding process such as differential DNA 467 extraction, primer non-specificity, etc (