High‐throughput metabarcoding of eukaryotic diversity for environmental monitoring of offshore oil‐drilling activities

As global exploitation of available resources increases, operations extend towards sensitive and previously protected ecosystems. It is important to monitor such areas in order to detect, understand and remediate environmental responses to stressors. The natural heterogeneity and complexity of communities means that accurate monitoring requires high resolution, both temporally and spatially, as well as more complete assessments of taxa. Increased resolution and taxonomic coverage is economically challenging using current microscopy‐based monitoring practices. Alternatively, DNA sequencing‐based methods have been suggested for cost‐efficient monitoring, offering additional insights into ecosystem function and disturbance. Here, we applied DNA metabarcoding of eukaryotic communities in marine sediments, in areas of offshore drilling on the Norwegian continental shelf. Forty‐five samples, collected from seven drilling sites in the Troll/Oseberg region, were assessed, using the small subunit ribosomal RNA gene as a taxonomic marker. In agreement with results based on classical morphology‐based monitoring, we were able to identify changes in sediment communities surrounding oil platforms. In addition to overall changes in community structure, we identified several potential indicator taxa, responding to pollutants associated with drilling fluids. These included the metazoan orders Macrodasyida, Macrostomida and Ceriantharia, as well as several ciliates and other protist taxa, typically not targeted by environmental monitoring programmes. Analysis of a co‐occurrence network to study the distribution of taxa across samples provided a framework for better understanding the impact of anthropogenic activities on the benthic food web, generating novel, testable hypotheses of trophic interactions structuring benthic communities.


Introduction
Traditional biodiversity assessments of marine sediments are based on taxonomic determination of fauna using morphological characters (Gray 2000;Diaz et al. 2004). However, the temporal resolution varies between environmental monitoring programmes (for Norwegian Standard sampling guidelines, see ISO 5667-19:2004 andISO 16665:2014) and relatively low frequency sampling limits in-depth understanding of stressors (Chariton et al. 2010a;Baird & Hajibabaei 2012). Another constraint is that monitoring is typically focused on macrofauna (>1 mm fraction). To generate a more complete and mechanistic understanding of marine benthic ecosystems, the meio-and microfauna should also be assessed (Bourlat et al. 2013). This may be especially important to identify early-warning signs of disturbance (Cowart et al. 2015). Microbial eukaryotes are of vital importance to marine ecosystems and, together with prokaryotes, represent the base of the food web. Hence, alterations in the composition of this base can generate structural changes affecting the whole ecosystem.
In spite of the benefits of increased taxonomic and temporal coverage, it is typically not considered economically viable for routine monitoring programmes (Borja & Elliott 2013;Bourlat et al. 2013) and most current monitoring programmes are limited by the need for highly trained taxonomists to laboriously examine samples by microscopy (Hynes 1994;Maurer 2000). This creates an analytical bottleneck, limiting more comprehensive sampling. In addition, accuracy is decreased due to variable taxonomical skills, as well as problems associated with cryptic species complexes (Chariton et al. 2014;Cowart et al. 2015). Consequently, marine reports and inventories frequently fail to detect or determine the composition of significant taxonomic groups, reducing their value for ecosystem assessment (Schander & Willassen 2005;Bourlat et al. 2013).
Developments in DNA sequencing technology may offer a solution to these shortcomings, with competitive speed, cost and ease of use (Baird & Hajibabaei 2012). Sequencing of environmental DNA using a taxonomic marker such as 18S rRNA allows assessment of a broad taxonomic spectrum of the eukaryotic community (Chariton et al. 2014;Cowart et al. 2015;Gong et al. 2015;Lallias et al. 2015;Zimmermann et al. 2015). Commonly referred to as metabarcoding, this provides a more complete description of biodiversity including macro-and meiofauna (Baird & Hajibabaei 2012). This, in turn, can improve the understanding of structural changes in benthic ecosystems, enabling better assessment and prediction of anthropogenic effects (Baird & Hajibabaei 2012).
Indeed, studies investigating the diversity of eukaryotic micro-organisms using metabarcoding in marine sediments are becoming more common and have confirmed their key role in maintaining ecosystem function (reviewed in Bik et al. 2012a).
A number of studies have evaluated the accuracy of metabarcoding using mock communities with known compositions of vertebrates (Valentini et al. 2015), zooplankton (Brown et al. 2015;Elbrecht & Leese 2015) or ciliates (Gong et al. 2013). The performance of metabarcoding has also been directly compared to traditional morphology-based identification in environmental sediment samples (e.g. Hajibabaei et al. 2011;Chariton et al. 2014;Kermarrec et al. 2014;Cowart et al. 2015;Pochon et al. 2015;Zimmermann et al. 2015). Perhaps most relevant for monitoring programmes, a number of studies have evaluated the ability of metabarcoding to identify shifts induced by contamination and other anthropogenic impacts on total benthic communities (Chariton et al. 2010b(Chariton et al. , 2014Santos et al. 2010;Bik et al. 2012b), or within particular groups such as macroinvertebrates (Hajibabaei et al. 2011;Lejzerowicz et al. 2015) or foraminifera (Pawlowski et al. 2014;Pochon et al. 2015). These studies revealed a far greater biodiversity than microscopy and generally confirmed the capacity to separate samples based on contamination or other environmental characteristics. Thus, they verify the potential of metabarcoding to improve our understanding of anthropogenic impacts and to guide regulatory agencies and stakeholders in coastal zone management. Nonetheless, limitations in metabarcoding, including lack of reference data, technical artefacts and variable degrees of correlation to traditional morphology-based species assessments (Brown et al. 2015;Cowart et al. 2015), remain. Further studies and the establishment of clearer methodological guidelines (Baird & Hajibabaei 2012;Lallias et al. 2015) will be important in overcoming the above limitations.
We assessed the use of metabarcoding, using the 18S rRNA gene, for biomonitoring of samples adjacent to offshore oil-drilling platforms on the Norwegian continental shelf. In particular, we focused on the relationships between sequencing data and environmental parameters, in comparison with classical morphologybased characterization. We identified a number of putative indicator taxa, whose abundances were strongly correlated with contamination levels, and evaluated a co-occurrence network based on sequencing data, to establish possible trophic links and other ecological interactions in the benthos.

Study site and samples
Sediments from the Norwegian continental shelf in the North Sea, in the Troll-Oseberg offshore oil-drilling region (Region III), were studied (Fig. S1, Supporting information). We analysed 45 sediment samples from different oilfields as part of ongoing periodic sampling for environmental monitoring in 2010 by DNV for Statoil Petroleum AS (DNV 2011). All samples were located within 250-5000 m from oil platforms. The fields studied were Oseberg C, D and G ('OSEC', 'OSED' and 'OSEG', respectively); Veslefrikk ('VFR'); and Tune. Sediments from Oseberg and VFR were composed of fine to medium-fine sand, whereas Tune had more coarse sand. Individual samples were also taken from the oilfields Fram West (station A02) and Huldra (station 9). Table S1 (Supporting information) lists all samples with station of origin, geographical coordinates and results of physiochemical measurements as they appear in the monitoring report by DNV (2011). Parameters include distance from platform; depth; sediment grain size and composition (share of sand, silt/mud and gravel); total organic material (TOM); total hydrocarbons (THC); polycyclic aromatic hydrocarbons (PAH); naphthalene phenanthrene and dibenzothiophene (NPD); and heavy metals Ba, Cd, Cr, Cu, Hg, Pb and Zn.
For several of the samples studied, traditional microscopy-based taxonomic classification and enumeration of the macrofauna were available (DNV 2011). These data were transformed to allow statistical analysis, and duplicate taxa with ambiguous names were merged (e.g. 'spp.' vs. 'indet'; Table S2, Supporting information).

Sample processing, DNA extraction, amplification and sequencing
Sample processing and DNA extraction were essentially performed as optimized for marine sediments by Lekang et al. (2015). Aliquots of 50-100 g of sediment were transferred from each sample to a 250-mL plastic container (Kautex Textron) and fixed with 96% ethanol (to a final concentration of 80%). Samples were stored at À20°C until further analysis. Ethanol was removed prior to DNA extraction by centrifugation at 6000 g followed by pipetting. Genomic DNA was extracted from ten 0.5 g replicates of each sediment using PowerSoil â DNA extraction kits (MO BIO Laboratories Inc., Carlsbad, CA), to decrease heterogeneity and stochastic extraction effects. An equal volume of each replicate extraction was pooled prior to PCR amplification.
PCR amplification targeting the V4-V5 region of the 18S rRNA gene was carried out using the barcode-and adapter-linked primers F566 and R1200 (Hadziavdic et al. 2014), with eight replicates per sample, each using 2.5 lL pooled, extracted DNA; with 25 lL HotStar Taq Master Mix (Qiagen), 0.5 lM of each primer and 1 lg/ lL of BSA (Fermentas). A C100 Thermal Cycler (Bio-Rad) was used with the following program: 95°C for 15 min and 35 cycles of 95°C for 45 s, 60°C for 45 s, 72°C for 1 min and a final extension step of 72°C for 10 min. Resulting amplicons were visualized on a 1.5% agarose gel electrophoresis stained with GelRed (Biotium). Successful PCR replicates were pooled, and further concentrated using a vacuum centrifuge. Concentrated amplicons were purified to remove primers and PCR reagents using Agencourt AMPure XP (Beckman Coulter Inc.) following the manufacturer's recommendations. DNA concentrations were determined by Quant-iT TM PicoGreen â dsDNA quantification kit (Invitrogen) and an ND3000 fluorospectrophotometer (Nanodrop Technologies Inc.), using bacteriophage k DNA (Invitrogen) to produce the DNA standard curve. Finally, amplicon libraries were pooled in equimolar amounts. Pyrosequencing was performed using a Genome Sequencer FLX (454 Life Sciences) using titanium reagents at the Norwegian Sequencing Center (University of Oslo, Norway). Three runs were carried out using one-region gaskets, with 3-19 barcoded samples included per run.

Sequence analysis and multivariate statistics
Demultiplexed flowgram data (SFF files) were processed using AmpliconNoise (Quince et al. 2011) with default parameters. This program removes low-quality reads including those not matching the forward primer and performs iterative probabilistic clustering based on flowgrams and base-called sequences, to correct artefacts from PCR and sequencing. Removal of chimeric sequences was performed using Perseus, and sequence data sets pooled and subjected to maximum distance (complete linkage) hierarchical clustering using NDist and FCluster (Quince et al. 2011), retaining read abundance information from AmpliconNoise. An OTU frequency table was derived for a divergence cut-off of 2% (OTU 98 , used for subsequent analysis). Alternative OTU tables used for evaluating the effect of varying the OTU cut-off were also derived using 1% (OTU 99 ), 3% (OTU 97 ) and 5% (OTU 95 ). A fifth 'plankton-filtered' OTU 98 table was derived by manually removing all OTUs classified as taxa thought to be dominated by pelagic organisms (OTU F ). Representative OTU 98 sequences were taxonomically classified using CREST (SilvaMod) with default parameters except for a minimum bitscore of 100 (Lanz en et al. 2012). Based on classification, two additional OTU distribution tables were derived as follows: exclusively metazoans and nonmetazoans.
Calculation of diversity estimates, multivariate statistics and visualization was performed using the R vegan package (Oksanen et al. 2013). Rarefied OTU 98 richness estimates (expected richness in a random subsample) were calculated based on the lowest number of reads for each taxonomic subset (total communities, metazoans and nonmetazoans; function rarefy). Rarefied Pielou evenness estimates were obtained using Shannon diversity estimates divided by the logarithm of rarefied richness. All analyses were limited to these rarefied diversity estimates for amplicon data sets, as opposed to microscopy-based species composition where other measures were taken to prevent sampling bias (DNV 2011).
Sample-specific read abundances for OTU distribution tables, as well as the classical morphology-based taxon table, were transformed into relative abundances using the function DECOSTAND (method = 'total', i.e. total reads for each OTU divided by the total number of reads in the corresponding sample). All subsequent analysis was based only on relative abundance data. This approach, as opposed to rarefaction of abundance data, was motivated by the fact that valid read abundance data are not omitted, thus inflating random variance in sample dissimilarity estimates (McMurdie & Holmes 2014).
Relative abundance estimates were subjected to Hellinger-transformation before calculation of Bray-Curtis dissimilarity matrices. Nonmetric multidimensional scaling (NMDS) using METAMDS, permutational ANOVA (PERMANOVA) using ADONIS, and Mantel tests (including partial) were performed using these dissimilarity matrices. PERMANOVA was carried out only with parameters that were significantly correlated (P < 0.05) to the resulting NMDS coordinates using function ENVFIT. Parameters were divided into two subsets: (i) contamination independent (depth, grain size and %sand) and (ii) contaminants. Parameters from each subset were added sequentially in order of correlation strength to NMDS (best fit first) and subsequently removed unless found to also be significant in the PERMANOVA model. For Mantel tests, a geographical position distance matrix was calculated using simple Euclidean distances (function DIST). Parameter dissimilarity matrices were calculated for the two parameter subsets (i) and (ii) using log-transformation (VEGDIST option log).
Taxon distribution was studied at order rank as determined by CREST, manually including taxa of lower ranks lacking child nodes at order level. Metazoan and nonmetazoan taxa were separated as for OTU data and the top 20 taxa of each subset visualized using the R GGPLOT2 package (Wickham 2009). Putative indicator taxa were identified using Kendall rank correlations between relative abundances and contamination-related parameters measured. A P-value cut-off of 0.05 after Bonferroni correction was used. Co-occurrence network analysis based on all taxa with average relative abundance above 0.01% was performed with the R package IGRAPH (Csardi & Nepusz 2006) using Kendall rank correlation. All correlations resulting in s coefficients above 0.45 or below À0.45 were visualized as edges and taxa with at least one edge included in further analysis.

Biodiversity and environmental parameters
Amplicon sequencing resulted in over 1.6 million reads after quality and chimera filtering for the 45 samples, yielding 26 213 OTUs (including 37% singleton reads, 2% maximum divergence). Table S3 (Supporting information) provides a detailed overview of read counts, OTUs and diversity estimates, including data from classical monitoring based on morphology (DNV 2011). The number of reads per sample varied greatly, resulting in a trend of higher OTU richness and share of singletons with read depth. Thus, sequencing depth was insufficient to cover the full biodiversity of most sediment samples studied, although this trend can partially be attributed to remaining sequence artefacts (Quince et al. 2011). To compensate, only rarefied OTU richness and evenness were considered, that is expected richness of theoretical subsamples corresponding to the smallest sequencing depth. Rarefied richness ranged from 455 OTUs in Tune-01 to 1254 in OSEG-06-07. OSEG-06-07 also had the highest rarefied Pielou evenness, whereas Tune-08 had the lowest evenness. Neither rarefied richness nor evenness correlated to sequencing coverage (P = 0.9 and 0.7, respectively).
Concentration of contaminants associated with effluents from oil drilling varied considerably between samples and oilfields. The Veslefrikk samples VFR-04 and 05 were the most contaminated in terms of total hydrocarbon (0.1 and 1 g THC/kg, respectively) and Ba concentrations (>5 g/kg), whereas THC was below detection limits and Ba was <50 mg/kg at other sites (DNV, 2011; Table S1, Supporting information). All significant correlations between diversity indices and environmental parameters are illustrated in Fig. 1 as a network annotated with coefficients (Kendall's s) and P-values (mentioned below; see also Fig. S2,Supporting information). No significant relationship could be identified between sequencing-based diversity estimates and corresponding data from classical morphology-based monitoring. However, both rarefied OTU richness and evenness correlated significantly with the ratio of bacterial to eukaryotic rRNA copies as determined by qPCR (B/E, Fig. 1A). This indicates that sediments with more dominant bacterial food webs tended to have more diverse eukaryotic communities, although bacterial diversity as such was not evaluated.
Total rarefied OTU richness correlated negatively to Cd concentration (Fig. 1B). Similar to Ba, elevated Cd is associated with effluents from offshore drilling. The same comparisons resulted in significant correlations using all alternative OTU divergence cut-offs (OTU 99 , OTU 97 and OTU 95 ). Rarefied richness using these alternative cut-offs showed strong linear correlations with the default OTU 98 rarefied richness (Fig. S3, Supporting information).
Metazoan sequence reads contributed 22% of total abundance, ranging from 6% in VFR-20 to 61% at Tune-03 (Table S3, Supporting information). This share (see 'M' in Fig. 1B) was negatively correlated to several contaminants including Ba, Pb, Cu and THC. Communities with a lower relative abundance of metazoans tended to have higher metazoan diversity (Fig. 1B). The highest rarefied metazoan OTU richness was encountered at VFR-K1, the third most contaminated station in terms of Ba and THC.

Multivariate analysis of community structure
Ordination based on molecular OTU composition (NMDS) resulted in one distinct cluster containing the samples collected from VFR and the single sample from nearby Huldra ( Fig. 2A). Samples from Oseberg C and D formed distinct clusters, although both overlapped with samples from Oseberg G and the latter with Tune. This pattern indicates that geographical location of the oilfield had a strong effect on community composition. However, it is not sufficient to distinguish geographical or dispersal effects from contrasting environmental properties or contamination levels. Several such parameters correlated significantly with the NMDS coordinates including bottom depth, percentage of sand, median grain size (in φ-scale), total organic material (TOM) and concentration of heavy metals (see Fig. 2A and Table 1). The two most contaminated samples VFR-05 and VFR-04 (in terms of THC and Ba) appear strongly associated with Ba and Cd concentrations, as expected. Morphology-based monitoring (DNV, 2011) confirms the strong effects on metazoan community structure at these sites.
To evaluate the influence of OTU divergence cut-off and filtering of OTUs likely to be pelagic (detritus), the above analysis was repeated using the corresponding alternative OTU tables. Approximately 25% of OTUs, representing 43% of reads, were removed based on their taxonomic classification and removed from the filtered OTU table (OTU F ; see Table S4, Supporting information). A represents significant (P < 0.05) positive Kendall correlations (s > 0) and Network B negative correlations (s < 0). Thickness of edges represents strength of correlation, and different coloured nodes represent different types of parameters according to the legend. S 0 = rarefied OTU richness, S = taxon richness (microscopy-based), J 0 = rarefied Pielou evenness (H 0 /ln(S 0 )), J = estimated Pielou evenness (microscopy-based), M = metazoan proportion of read abundance, B/E = bacterial rRNA copy number abundance relative eukaryotic as determined by qPCR. PAH and NPD were not included due to limited data.
Almost half of the OTUs (49%), representing 30% of reads, were from taxa that we could not distinguish based on taxonomic classification alone and these were retained in OTU F . NMDS clustering patterns for these alternative OTU distributions and correlations to physiochemical parameters were generally consistent, although the Tune, OSEC and OSED data sets were generally more clearly separated for OTU F (Fig. S4, Supporting information). Further, filtering or alternative cut-offs improved correlation of several physiochemical parameters, although not in a consistent manner (Table S5, Supporting information), with NPD, Cu and Zn correlated significantly to only the OTU F and OTU 97 NMDS, whereas Ba only correlated with that of OTU 95 or OTU 98 (default), A separate NMDS was carried out, including only molecular OTUs from metazoa. This resulted in a similar ordination pattern, except for larger overlap between Oseberg and Tune samples and different positions of Fram and Huldra samples (Fig. S5, Supporting information). All but two measured parameters correlated more strongly to the resulting ordination space based on metazoan composition (see Table 1). A corresponding analysis based on nonmetazoan taxa resulted in an ordination pattern visually similar to that based on all OTUs and an intermediate strength of parameter correlations (Table 1 and Fig. S6, Supporting information). Thus, when dividing the community data into metazoan and nonmetazoan taxa, both subsets corresponded better to environmental parameters than the total data set, similar to the trends observed with diversity estimates (Fig. 1).
NMDS based on morphology (microscopy data) (DNV 2011) resulted in a pattern similar to those based on metabarcoding (Fig. 2B). However, the locations (oilfields) appearing to harbour the most heterogeneous communities based on metabarcoding formed much smaller clusters, and different oilfields were generally better separated. Further, correlations were stronger for all parameters except distance from the drilling site (see Table 1). This is consistent with metazoan composition being more sensitive to environmental parameters and contaminants. It also suggests that morphology-based monitoring was more sensitive than sequencing-based assessments in this respect. However, all environmental parameters correlating significantly to the morphologybased NMDS coordinates were also significant in the metabarcoding-based NMDS (metazoan subset) and in three cases with better P-values in spite of lower 'goodness of fit' (Cd, Pb and Zn; Table 1).
The influence of environmental parameters on molecular OTU composition across all sample sites was verified using PERMANOVA, indicating that depth, sand and grain size together explained 16% of the variability (data available for 40 samples). By adding the heavy metals Cd, Cr and Ba to the model, 26% was explained. Including TOM (n = 32 samples), allowed 33% of variability to be explained. The same model based on the metazoan OTUs resulted in a similar result (R 2 = 32%). Results based on morphology explained 62% of the variability (n = 32). PERMANOVA models based on the 'plankton-filtered' OTU F distribution resulted in essentially identical significance and residual vs. modelled R 2 values. The model including TOM was not significant for alternative OTU cut-off distributions, and heavy metals were not significant for OTU 97 . However, residual R 2 was lower for OTU 95 compared to the default OTU 98 for the first two models (explaining 18% and 28% of variance, respectively). For OTU 99 , residual variance was consistently higher.
Mantel tests (see Table 2) were used to verify results from PERMANOVA and make further comparisons between data sets. This analysis verified that community dissimilarity correlated significantly to the combined influence of all environmental parameters (r = 0.47, P < 0.001). However, geographical distance was more strongly correlated than environmental parameters (r = 0.52). Geographical distance was also correlated to differences in contamination among the samples studied (r = 0.48). Partial Mantel tests compensating for either the combined influence of depth, sand and grain size (r = 0.33, P < 0.001), or geographical distance (r = 0.16, P = 0.03) verified that contamination-related variables alone contributed significantly to OTU composition (Table 2).
We also verified that relative abundances of molecular OTUs added significant explanatory power compared to transformation into presence/absence, when correlated to all environmental parameters (r = 0.19, P = 0.007; Table 2). Community dissimilarities based on morphology strongly correlated to those based on metabarcoding (r = 0.66) and to environmental parameters (r = 0.60, P < 0.001; Table 2).

Taxonomic composition, indicator taxa and network analysis
In total, 68% and 62% of the metazoan and nonmetazoan sequence reads, respectively, could be classified taxonomically, by including all taxa at order rank (Note: in some cases, lower ranks were used when taxonomic information at the order rank was missing; 91 and 75%, respectively, could be classified to class level). The most abundant metazoan orders based on sequencing were the benthic copepods Harpacticoida, followed by the polychaetes Sabellida and Spionida (Fig. S7, Supporting information). Several samples were strongly dominated by other taxa, and no obvious trends were identified with respect to differences between oilfields. Nonmetazoan taxon distributions appeared more consistent, with the most abundant orders being Thalassiosirales followed by Chaetocerotales (both diatoms) and Thaumatomonadida (cercozoan flagellates). Nonmetazoans appeared to include detritus from planktonic organisms to a large extent, although making such a distinction was often uncertain at order rank for many taxa. The majority of nonmetazoan reads were classified as taxa lacking multicellular organisms, although multicellular algae or fungi were also present.
The distribution of abundant metazoan taxa based on morphology (DNV 2011) also indicated considerable differences between oilfields (Fig. S8, Supporting information). The two most contaminated samples VFR-04 and VFR-05 were dominated by Chaetozone setosa, whereas no such shift was identified from sequencing data. Several of the most abundant species identified by morphology also belonged to orders dominating sequencing data, including Spionida (C. setosa, Spiophanes bombyx and Aonides paucibranchiata) and Sabellida (Galathowenia oculata and Owenia fusiformis). Table 1 Correlation of environmental parameters to NMDS clustering pattern. R 2 values for linear correlation of parameter vectors with maximal correlation to NMDS space resulting from Bray-Curtis distance of Hellinger-transformed OTU or taxon abundance values are displayed and annotated with significance (* < 0.05, ** < 0.01, *** < 0.001) as determined by function ENVFIT  A number of metazoan and nonmetazoan indicator taxa were identified, with relative abundance significantly correlated to contamination levels (Tables 3-4 and Fig. S9, Supporting information). Metazoans sensitive to contamination were the gastrotrich order Macrodasyida, the flatworms Macrostomida and the tube-dwelling anemones Ceriantharia. In contrast, Spionida appeared more abundant in sediments with higher levels of THC and heavy metals. Compared to metazoans, more nonmetazoan indicator taxa were identified, including several orders of ciliates (Euplotida, Philasterida, Pleurostomatida and Heterotrichida), Naviculales (diatoms), Labyrinthulida and Perkinsida, as well as the aquatic fungi Chytridiomycota and the oomycetes Olpidiopsidales.
Different taxa are expected to interact and influence each other in sediment communities. To identify such ecological interactions, Kendall rank-correlation network analysis was used. Thus, six modules of interconnected co-occurring taxa were identified (see Fig. 3A). The two largest modules were connected to each other by three negative correlations and appeared highly interconnected internally. The largest of these (Module 1) contained several indicator taxa, suggesting that sensitivity to contamination may influence abundance both within and across modules (e.g. Module 5 or Pholadomyodia). Module 2 also contained indicator taxa, indicating a more complex interaction mechanism with Module 1. Several co-occurrence links represented putative trophic interactions (Fig 3B), such as metazoa grazing on ciliates and ciliates grazing on heterotrophic flagellates as well as the detrital component of the sediments.

Discussion
This study illustrates the utility of metabarcoding to assess anthropogenic impacts on benthic eukaryotic community structure and arrives at conclusions compatible with those previously obtained by classical morphology-based monitoring techniques (DNV 2011) in a cost-effective, faster and more comprehensive manner. This supports projections that metabarcoding of environmental DNA has several advantages over classical monitoring (Santos et al. 2010;Baird & Hajibabaei 2012;Cowart et al. 2015;Zimmermann et al. 2015).
Whereas morphology-based monitoring is typically limited to macro-invertebrates, metabarcoding can assess a substantially larger fraction of the biodiversity, including meio-and microfauna that are difficult to identify using morphological characteristics. However, abundances derived from sequencing do not compare directly to morphology-based methods, and similarly, OTU richness is not analogous to morphology-based diversity indicators (Chariton et al. 2014). Nonetheless, we found rarefied OTU richness and evenness of nonmetazoans to be negatively influenced by several contaminant concentrations. Corresponding data from morphology-based monitoring showed conflicting trends with respect to evenness (negatively influenced, Table 2 Mantel test statistics. Permutation-based Mantel tests were used to evaluate the correlation between two dissimilarity matrices ('explanatory' and 'dependent' variables below). Using partial Mantel tests, we intended to compensate for the influence of a third dissimilarity matrix ('conditioning variables') in order to evaluate the effect of explanatory on dependent variables, independent of conditioning variables (by keeping the correlation structure constant between conditioning and explanatory matrices). Bray-Curtis dissimilarity was used to derive community dissimilarities, log-transformation for environmental parameters and simple Euclidian distance for geographical positions  Fig. 1A). Rarefied metazoan OTU richness showed no clear trends. This supports earlier suggestions that the response of macro-and meiofauna to oil contamination differs from that of the microfauna (Ansari & Ingole 2002) and that nonmetazoan diversity may be a more sensitive indicator of pollution in the studied environments. Both biodiversity and community composition analyses, including relative abundance data, demonstrated that incorporating nonmetazoan organisms in the assessment generated additional information that could not be obtained from larger metazoans alone. In accordance with earlier studies (Santos et al. 2010;Bik et al. 2012b;Chariton et al. 2014;Cowart et al. 2015;Lallias et al. 2015), we identified several taxa among the nonmetazoan meio-and microfauna, as potential bio-indicators of contamination. Metazoans, on the other hand, only constituted a minority of such taxa (n = 8/36). This is supported by previous studies of oil contamination as well as ecological theory, suggesting that responses of micro-organisms to environmental changes are often faster and more pronounced due to their large surfaceto-volume ratio, high abundance and short generation time (Santos et al. 2010;Payne 2013).
However, NMDS indicated that metazoan communities as a whole appeared more sensitive to contamination (Table 1). The low number of putative metazoan  indicator taxa identified (Table 2) may also be due to lower accuracy in evaluating relative abundances when using a molecular approach. As opposed to smaller organisms, metabarcoding as implemented here is limited to targeting traces of DNA from macroinvertebrates, rather than complete specimens. We expect this fact to increase heterogeneity associated with metazoan relative abundance estimates considerably. Changes in some indicator taxa (Tables 2 and 3) may be indirectly caused by contaminants. One such example was Perkinsida, which is an order of mainly parasitic protists on hosts such as molluscs (genus Perkinsus, present among reads classified to higher ranks). Its negative correlation with contaminants may be due to decreasing abundance of the host organism. There were also indicators that positively correlated to contaminants, such as Microascales. This fungal order has been associated with degradation of toluene (Prenafeta-Bold u et al. 2006). Another example was the metazoan polychaete Chaetozone setosa (order Spionida), which has been characterized as highly tolerant to hydrocarbons, even increasing in abundance as a result of contamination (Hiscock et al. 2004). Data from morphology-based monitoring also confirmed that the two most contaminated samples were dominated by this species (DNV 2011; Fig. S8, Supporting information).
Using metabarcoding, we may also gain knowledge regarding trophic interactions in the ecosystem, contributing to a more mechanistic understanding of responses to anthropogenic impacts. In the co-occurrence network (Fig. 3), interactions between detritus, ciliates, heterotrophic flagellates, metazoans, parasites and decomposers were suggested. In particular, a co-occurring group of organisms including ciliates, metazoans and decomposers was identified (Module 1), including several taxa that were negative indicators of contamination. Decreased abundance of organisms within this module was associated with increased abundance of several other modules (and vice versa), including primary producers (some of which likely did not belong to the active benthic community, but rather were detritus deposited in the sediments), as well as heterotrophic flagellates (e.g. modules 2 and 5). This may result from reduced grazing when the abundance of ciliates declines due to contaminations.
Other taxa classified as primary producers were included in Module 1, notably Hemiaulales. Although the majority of organisms in this order are pelagic photosynthetic diatoms, some have been reported to colonize benthic sediments down to 10-14 cm (Yahia-Kef ı et al. 2005;Cibic & Facca 2010). This order could therefore represent an active member of benthic Fig. 3 Kendall rank-correlation-based cooccurrence network of taxa based on relative abundances across samples. Taxa are represented by circles in the correlation network (A) and coloured according to expected position in a simplified food web model (B) adapted from Thingstad et al. (2008). Edges represent Kendall correlations above the threshold for inclusion (s > 0.45). The size of circles is proportional to average relative abundance across data sets (cut-off for inclusion = 0.01%) and the thickness of edges proportional to strength of correlation.
communities. Another example is the Pleurostomatida. The most abundant genus detected in this ciliate order was the Litonotus (and its most common species L. pictus). Members of this taxon are predatory ciliates, feeding on other protists including other ciliates (Coats & Clamp 2009). This may explain its co-occurrence links to four other ciliate taxa. It also illustrates the limitations of simplified food web models such as that used in Fig. 3 for understanding ecosystem dynamics. The co-occurrence network structure and indicator taxa obtained should mainly be regarded as a basis for forming ecological hypotheses. However, it illustrates an important strength of the sequencing-based approach for monitoring total eukaryotic diversity, compared to metazoans alone. This permits us to explore possible biological interactions between taxa, enabling a more holistic approach to environmental monitoring and providing a functional interpretation of ecosystems impacted by anthropogenic stressors.
To assess any overall change between the prokaryotic and eukaryotic fraction as a function of the environmental parameters measured here, we conducted a qPCR analysis using universal prokaryote and eukaryotic primers on all samples. Relative bacterial abundance was positively correlated to rarefied eukaryotic richness and evenness, suggesting that the bacterial community is another important factor determining eukaryotic diversity. Indeed, bacteria compose a substantial part of the biomass in marine sediments and contribute significantly to nutrient cycling (Nodder et al. 2003). Considering this, it would be interesting to include prokaryotes in future evaluations of sequencing-based monitoring.
Using a partial Mantel test, we verified the utility of incorporating relative abundance data when calculating community dissimilarities. This resulted in significantly better correlation with environmental parameters, compared to using presence/absence alone, which in effect discards useful abundance data. This is also illustrated by the fact that many indicator taxa identified in this study showed a decrease in abundance in the presence of pollutants, although they were also detected at the most contaminated sites (Fig. S9, Supporting information). This supports the notion that read abundance data are meaningful as a semiquantitative indicator when comparing community structure between biological samples (Pilloni et al. 2012;D'Amore et al. 2016). Most studies pioneering metabarcoding for environmental monitoring have focused on presence/absence of taxa (Bik et al. 2012b;Chariton et al. 2014;Lallias et al. 2015) and may consequently have overlooked some potential indicator taxa identified here, whereas the use of foraminiferal metabarcoding for monitoring of fish-farming sites did successfully utilize read abundance data (Pawlowski et al. 2014;Pochon et al. 2015). However, these studies are not directly comparable, because relative abundance data can be more accurately estimated for unicellular protists such as foraminifera, compared to metazoa.
Although metabarcoding holds promise for routine environmental monitoring, there are methodological aspects that require further consideration. For example, despite synthesizing relatively long amplicons, reducing amplification efficiency of degraded DNA, we still detected several pelagic taxa. Nonetheless, considering the importance of detritus as an energy source for sediment communities, its composition and subsequent degradation can provide useful information about the food web in healthy and disturbed sediments. The influence of dead biomass could be minimized in future studies using propidium monoazide treatment (Nocker et al. 2010), or by targeting RNA instead of DNA (see, e.g. Orsi et al. 2013), although it may remain useful (Cowart et al. 2015) to complement such analyses with intentional sequencing of detritus DNA in sediments as has been carried out previously (Bohmann et al. 2014).
Our attempt to remove OTUs that were likely of pelagic origin did not change results based on multivariate models (NMDS and PERMANOVA) in any meaningful manner, although OTUs representing almost half of the sequencing reads were removed. However, taxonomic resolution or information about specific taxa was often lacking, raising questions about the systematic reliability of such an approach. It would certainly not be possible to standardize or replicate in a manner suitable for routine monitoring, lacking a thoroughly curated reference database at higher taxonomic levels. This also illustrates another limitation of metabarcoding, namely the reliance on taxonomic reference databases (Blaxter et al. 2005). We have overcome this limitation when comparing community composition and diversity, using taxonomy-independent OTUs defined by de novo clustering. As databases and tools for taxonomic classification and prediction of ecological function improve, so will the analysis of indicator taxa and co-occurrence in ecosystems. Phylogenetic distances, such as UniFrac, may also improve the accuracy of community dissimilarity matrices (Lozupone & Knight 2005), although such measures depend on the quality of phylogenetic alignments used to define reference taxa.
The taxonomic markers used can also influence taxonomic coverage and the ability to discriminate between taxa; much like similar morphology can limit the ability to discriminate taxa in classical approaches. Cowart et al. (2015) have begun to address this issue by comparing morphological identification to cytochrome oxidase I and 18S rRNA-based metabarcoding. More such comparisons are needed to improve metabarcoding approaches in different types of samples and over time.
Metabarcoding may 'overestimate' diversity compared to microscopy-based species identification due to intraspecific variation in the 18S rRNA V4-V5 region targeted here, even when using divergence cut-offs as high as 10% (Brown et al. 2015). However, such divergence varies considerable between taxonomic groups and cut-offs as low as 1% may inappropriately cluster together closely related zooplankton species such as Daphnia pulex and D. pulicaria, or 'Artemia salina' spp. Thus, there is no 'perfect' cut-off for OTU clustering, analogous to morphological identification of species, suggesting that group-specific or multiple cut-offs should instead be applied (Brown et al. 2015). On the other hand, less well-studied species, particularly protists, may also harbour groups of ecologically divergent subspecies or populations, not easily distinguishable using microscopy. We consider separation of such groups into separate OTUs to be as important as avoiding artefactual OTUs due to technical errors and intragenomic or intrapopulation variation.
We applied a 2% sequence divergence cut-off, as a compromise maximizing the possibility of distinguishing such ecologically relevant groups, while compensating for most PCR-and sequencing errors remaining after the application of AmpliconNoise (Quince et al. 2011). This is in line with recommendations by Lie et al. (2014) and Mohrbeck et al. (2015) based on different SSU rRNA regions. We expect a smaller influence of errors compared to the above studies because the denoising and Needleman-Wunsch alignment-based clustering applied here is more appropriate for the indel errors associated with 454 pyrosequencing. Alternative OTU cut-offs were, however, also evaluated. Rarefied OTU richness showed strong linear correlation with the default divergence and consistent correlation patterns with physiochemical parameters. Multivariate analyses based on resulting community dissimilarities also generated similar results, suggesting that ecological patterns persist independent of the divergence cut-off chosen.
We anticipate that the accuracy of metabarcoding can be improved relative to this study using newer sequencing platforms such as Illumina MiSeq, allowing a considerably higher read depth and better per-base sequence quality (although lower read length), at a lower cost. As sequencing technology continues its rapid development and becomes more affordable, alternative approaches will also become economically viable for routine biomonitoring, such as metagenomics or 'total RNA' metatranscriptomics, allowing for reconstruction of full-length rRNA genes and profiling of the most abundant proteincoding transcripts (Epelde et al. 2014).
In conclusion, the metabarcoding approach presented here shows considerable promise for routine monitoring of marine sediments affected by anthropogenic activities. Microscopic analyses of macrofauna have been used in monitoring programmes for decades, and the long time series generated have value for historical comparisons, even though they do not provide a comprehensive assessment of the structure and function of the sediments. In general, any change to current regulatory practices will require stringent comparison of metabarcoding over time with classical monitoring. Documentation of added value in terms of comprehensiveness, and better temporal and spatial resolution at cost-effective levels for operators, will be required to inform any new monitoring regulations and policies. However, current routine monitoring is standardized, and current methods are used within long time series for comparisons. Therefore, any change to current practice will require that new sampling designs, processing and handling as well as analyses methods are compared over time, and improvements documented to regulatory authorities.
In this study, we demonstrated the application of metabarcoding for monitoring of sediment biodiversity in association with oil-drilling activities. We were able to identify effects of oil-drilling operations on both the metazoan and nonmetazoan benthic compartments, the latter of which is not typically assessed by classical monitoring programmes. Using a network of co-occurring taxa, we further illustrated the potential for ecological insights when using such an approach. Compared to morphology-based monitoring, metabarcoding permits evaluations of more samples at a corresponding or lower cost, generating increased temporal and spatial resolution together with a more complete ecosystem assessment, important for understanding the effect of anthropogenic stressors at regional and global scales. publication elsewhere while under consideration. The Norwegian Research Council and Statoil funded the project. The work was conducted, analysed and evaluated by independent research institutes, and the financing of a PhD stipend by Statoil was in no way contingent on, or linked to, specific research outcomes.       Table S1 All samples with station of origin, geographical coordinates and results of physiochemical measurements as they appear in the monitoring report by DNV (2011).

Table S2
Data from traditional microscopy-based taxonomic classification used in the study (DNV, 2011).

Table S3
Detailed overview of read counts, OTUs and alpha diversity estimates, including data from classical monitoring based on morphology.

Table S4
Taxa and number of OTUs affected by manual filtering.

Table S5
Correlation of environmental parameters to NMDS clustering patterns resulting from alternative subsets of OTUs or clustering cutoffs (1, 3 and 5% maximum distance).