High throughput shotgun sequencing of eRNA reveals taxonomic and derived functional shifts across a benthic productivity gradient

Benthic macrofauna is regularly used in monitoring programmes, however the vast majority of benthic eukaryotic biodiversity lies mostly in microscopic organisms, such as meiofauna (invertebrates < 1 mm) and protists, that rapidly responds to environmental change. These communities have traditionally been hard to sample and handle in the laboratory, but DNA sequencing has made such work less time consuming. While DNA sequencing captures both alive and dead organisms, environmental RNA (eRNA) better targets living organisms or organisms of recent origin in the environment. Here, we assessed the biodiversity of three known bioindicator microeukaryote groups (nematodes, foraminifera, and ciliates) in sediment samples collected at seven coastal sites along an organic carbon (OC) gradient. We aimed to investigate if eRNA shotgun sequencing can be used to simultaneously detect differences in (i) biodiversity of multiple microeukaryotic communities; and (ii) functional feeding traits of nematodes. Results showed that biodiversity was lower for nematodes and foraminifera in high OC (6.2%–6.9%), when compared to low OC sediments (1.2%–2.8%). Dissimilarity in community composition increased for all three groups between Low OC and High OC, as well as the classified feeding type of nematode genera (with more nonselective deposit feeders in high OC sediment). High relative abundant genera included nematode Sabatieria and foraminifera Elphidium in high OC, and Cryptocaryon‐like ciliates in low OC sediments. Considering that future sequencing technologies are likely to decrease in cost, the use of eRNA shotgun sequencing to assess biodiversity of benthic microeukaryotes could be a powerful tool in recurring monitoring programmes.


Introduction
Biodiversity is decreasing globally due to human alteration and pollution of terrestrial and aquatic environments (Brondizio, Settele, Díaz, & Ngo, 2019).Essential ecosystem services affiliated with human health, such as availability of food, clean water, and recreational areas are dependent on biodiversity (Cardinale et al., 2012;Pan, Marcoval, Bazzini, Vallina, & Marco, 2013).In addition to the provision of ecosystem services, biodiversity losses have also been linked to a decrease in ecosystem stability (McCann, 2000).Anthropogenic pressure on coastal aquatic ecosystems by e.g.climate change, eutrophication and contaminant pollution threatens the diversity of many organisms in these systems (Pan et al., 2013).Such threats on coastal ecosystems should be taken seriously because coastal zones are transitional areas directly adjacent to human settlements between land and sea, and impacted areas are predicted to increase in both number and area with a continued climate change scenario (Levin et al., 2001;Rabalais, Turner, Díaz, & Justić, 2009).It is therefore essential to understand how the diversity of organisms living in coastal zones respond to changes in environmental gradients and anthropogenic pressure (Snelgrove, Thrush, Wall, & Norkko, 2014).
Ciliates are used as bioindicators in e.g.aquaculture (Stoeck, Kochems, Forster, Lejzerowicz, & Pawlowski, 2018), wastewater treatment plants, and monitoring of eutrophication and chemical pollution (Chen, Xu, Tam, Cheung, & Shin, 2008;Foissner, 2016;Pawlowski, Lejzerowicz, Apotheloz-Perret-Gentil, Visco, & Esling, 2016).In natural aquatic environments, the diversity and community composition of ciliates are influenced by e.g.salinity, pH, and anthropogenic pollution (e.g.Gong et al., 2015;Jiang, Xu, Hu, Warren, & Song, 2013).One of the main merits of assessing the diversity of protists as bioindicators is their documented rapid change to environmental conditions (Payne, 2013).The assessment of microeukaryotic biodiversity is therefore a good proxy in monitoring programmes to study changes in ecosystems.However, these communities are rarely studied together and challenges still include being able to investigate multiple communities from bulk sediment samples without time consuming activities involved in studying the benthic microeukaryotic fraction such as sieving, sorting, and microscopy.
In the last ten years, environmental DNA (eDNA) and RNA (eRNA) metabarcoding studies targeting the 18S rRNA marker gene (or 18S rRNA for eRNA) have been extensively conducted to study microeukaryotes (Creer et al., 2016;Forster et al., 2019;Pochon et al., 2015).Such tools have drastically reduced the time needed to taxonomically classify organisms compared to morphological taxonomic techniques that also involves sieving and sorting of organisms (Carugati, Corinaldesi, Dell'Anno, & Danovaro, 2015).However, limitations exist with metabarcoding such as non-optimized PCR protocols and primer bias when targeting multiple taxa (Kelly, Shelton, & Gallego, 2019), and limitations of available species in reference databases when taxonomically classifying sequences.Compared to metabarcoding that typically yields ~60 000 sequences per sample (Singer, Fahner, Barnes, McCarthy, & Hajibabaei, 2019) shotgun sequencing can generate millions of sequences per sample.New bioinformatic tools that can today taxonomically classify these large datasets within minutes to hours (e.g.Wood, Lu, & Langmead, 2019) and estimate relative abundances at species or genus level (e.g.Lu, Breitwieser, Thielen, & Salzberg, 2017).However these methods rely on the reference databases to classify taxonomy and are therefore likely to become more precise over time when databases grow.While eDNA makes it possible to assess the biodiversity of both living organisms plus non-degraded DNA from dead organisms, eRNA is targeting mainly living organisms or RNA derived from organisms of recent origin in the environment (Cristescu, 2019;Wood et al., 2020).It is therefore valuable to investigate if eRNA combined with shotgun sequencing, thereby bypassing PCR limitations of metabarcoding, is a useful approach to assess differences in the biodiversity of active multiple communities from highly diverse and densely inhabited environments such as sediments.
Here we assessed the biodiversity and community composition of three microeukaryotic groups in sediment samples: nematodes, forams, and ciliates, along an OC gradient in a coastal archipelago in the Gulf of Finland, Baltic Sea.The aim was to investigate if eRNA shotgun sequencing, without any sieving or sorting of samples (i.e.bulk sediment), could be used to detect differences in biodiversity of multiple microeukaryotic communities for biomonitoring purposes.This is possible because this method is not based on amplification of known markers and avoids common limitations of metabarcoding such as: i) PCR primers only targeting certain species; ii) amplifying certain species more than others, and iii) the amount of cycles and type of polymerase used has been shown to influence diversity and community composition (Kelly et al., 2019;Nichols et al., 2018).Additionally, we assessed if changes in nematode functional ecology (feeding type) as a response to the OC gradient could be detected.We expected that nematode deposit feeders would have higher relative abundance in stations with higher OC.This approach was coupled to the latest sequencing platform (Illumina NovaSeq S4, yielding ~87 million read-pairs per sample in our study) which has been demonstrated to detect significantly more taxa compared to Illumina MiSeq sequencing that is the most used technology for metabarcoding studies (yields ~60 000 read-pairs per sample) (Singer et al., 2019).To analyze this large dataset, we used new bioinformatic tools to estimate taxonomic classifications and relative abundances (Kraken 2 + Bracken 2.5 combination).The Gulf of Finland is characterized by strong environmental gradients associated with eutrophication (Andersen et al., 2015;Villnäs et al., 2019).This contributes to spatially heterogenous benthic macro-communities in terms of diversity and composition in this ecosystem (Bonsdorff, Laine, Hanninen, Vuorinen, & Norkko, 2003).The Gulf of Finland is therefore a well-suited system to investigate if a similar heterogeneity exists in active microeukaryotic communities.

Field sampling
Sediment was collected on board R/V Electra during 2018 September 20-23 in the Gulf of Finland (Baltic Sea) close to the Tvärminne Zoological Station, Finland.A total of seven stations were visited along coastal gradients in depth and OC (0-4 km, 10-45 m water depth; Figure 1).The bottom water in the study areas at the time of sampling was oxic with 7.6-8.7 ml/l O2 measured by oxygen probes equipped on a CTD instrument (full details in Broman, Sun, et al., 2020).The stations were divided into four low % OC shallow sites (stations 11, 12, 15, 16; 1.2-2.8% OC) and three sites with higher % OC and depth (stations 7, 10, 13; 6.2-6.9 % OC), following a station labelling system used during reoccurring monitoring in the Tvärminne region (Table 1).Triplicate sediment cores (labelled A, B, C), retrieved in rinsed acrylic core liners, were collected from each station with a GEMAX twin gravity corer (height: 80 cm, inner diameter: 90 mm).The top 0-2 cm sediment surface layer was sliced into autoclaved 215 ml polypropylene containers (Noax Lab).After slicing, the sediment was directly aseptically homogenized inside the containers and 2 cm 3 sediment transferred into 2 ml cryogenic tubes (VWR) that were immediately flash frozen at -196ºC.The samples were transported on dry ice and stored at -80°C until RNA extraction.The remaining sediment in the 215 ml containers were stored at -20°C for sediment C and N content and pore water chemistry analyses.

Sediment and pore water chemistry analyses
The remaining sediment in the frozen 215 ml containers were thawed, homogenized, and 15 cm 3 sediment from each sample was dried at 60ºC for seven days for C/N analysis.In addition, 20 cm 3 of sediment from each sample was centrifuged at 2200 × g to extract the pore water for ammonium (NH4 + ) and phosphate (PO4 3-) analyses.The dried sediment was ground, homogenized, and 1 cm 3 dry weight sediment per sample stored in a desiccator prior to freeze drying, re-grinding, re-homogenization and treated with HCl to remove inorganic carbon.
Samples were subsequently weighed into tin capsules.Concentrations of total OC and total nitrogen were determined on an elemental analyzer (Flash 2000, Thermo Scientific).The pore water was collected after centrifugation by filtering 10 ml of the supernatant through a 0.45 µm polyethersulfone membrane filter (Filtropur S 0.45, Sarstedt).NH4 + and PO4 3-were determined colorimetrically (Multiskan GO spectrophotometer, Thermo Scientific) and NH4 + analysis followed the modified salicylate-hypochlorite method by Bower and Holm-Hansen (1980), and PO4 3-analysis followed the standard methods for seawater analyses (Grasshoff, Kremling, & Ehrhardt, 2009).NH4 + values were first reported in Broman, Sun, et al. (2020).

RNA extraction and sequencing
Sediment was thawed within minutes inside the cryotubes and ~2 g of material was added into the RNeasy PowerSoil bead tubes and was extracted using the same kit (RNeasy PowerSoil, QIAGEN).After RNA extraction, any remaining DNA was removed with DNase treatment using the TURBO DNA-free kit (Invitrogen), followed by bacterial rRNA depletion using the RiboMinus Transcriptome Isolation Kit (bacteria version, ThermoFisher Scientific).A 2100 Bioanalyzer (Agilent) was used to confirm that no DNA contamination was present in the samples.Library preparation followed the TruSeq RNA Library Prep v2 kit (Illumina) without including the poly-A selection step.This procedure does not include an amplification of a marker gene and therefore avoids PCR limitations common for metabarcoding studies as mentioned in the introduction.The samples were sequenced at the Science for Life Laboratory, Stockholm on a single Illumina NovaSeq6000 S4 lane using paired-end 2 × 150 bp read technology.A full list of sample names, sequences yielded, quality scores, read lengths etc. are available in Supplementary Data 1.
Small subunit (SSU) rRNA sequences were extracted from the quality trimmed data using SortMeRNA 2.1b (Kopylova, Noé, & Touzet, 2012) with the databases supplied with the software.Taxonomic classification was conducted with Kraken2 2.0.7 (Wood et al., 2019) using paired-end reads against the SILVA SSU r132 NR99 (Quast et al., 2013)  Kraken2 uses a k-mer based approach to classify sequences, and a lowest common ancestor (LCA) algorithm to determine where unclassified sequences belong on a taxonomic tree (Wood et al., 2019).To estimate the relative abundance of each taxon Bracken 2.5 was used on the Kraken2 outputs with default settings set to genus level (i.e. a count threshold of 10) (Lu et al., 2017).Bracken 2.5 uses a Bayesian algorithm method to estimate the genus level read abundance (or species, we chose genus for higher accuracy) of Kraken2 sequences classified higher up on the taxonomic tree (Lu et al., 2017;Wood et al., 2019).This is important, because without estimating read abundance to genus the unclassified reads on higher taxonomic levels will underestimate relative abundances of genera (Lu et al., 2017).The Bracken output reports were combined into a biom-format file with the python package kraken-biom 1.0.1 (using parameters: ---fmt hdf5 -max D --min G), and the biom-format file was converted to a tax table using the python package biom-format 2. 1.7 (McDonald et al., 2012).The 18S rRNA eukaryotic data was extracted, normalized as relative abundances (%), and analyzed in the software Explicet 2.10.5 (Robertson et al., 2013).Results for i) Nematoda (NCBI NT classifications, on average 478,331 sequences per sample); ii) Foraminifera (NCBI NT, average 13,913 sequences), and iii) Ciliophora (SILVA, average 774,027 sequences) were extracted and analyzed separately.The NCBI NT database was used for the Nematoda and Foraminifera data because, 1) the SILVA database is known to contain errors in the nematode classifications (Broman et al., 2019;Holovachov, Haenel, Bourlat, & Jondelius, 2017), and the NCBI NT has previously been used to discern differences in nematode communities on a spatial scale in the Baltic Sea (Broman et al., 2019); and 2) the SILVA database gave inaccurate classifications for Foraminifera, resulting in the identification of taxa never discovered in the Baltic Sea (more details in the discussion).

Nematoda functional ecology analyses
Nematode genera were classified into feeding types based on their known buccal cavity morphology in available literature according to Wieser (1953).Each genus was categorized into the four feeding types described by Weiser: 1A) selective deposit feeder, 1B) non-selective deposit feeder, 2) epistrate feeder, and 2B) predator/omnivore.In addition, the maturity index (MI) of each nematode community was calculated to infer changes in the life history characteristics of nematode genera.MI was calculated according to Bongers, Alkemade, and Yeates (1991) by assigning colonizer-persister (cp) values to nematode genera on a scale from 1 to 5 based in available literature.Low cp-values indicate nematode genera that can be classified as colonizers (short life cycle, high reproduction rates, high colonization ability and tolerance to disturbance) while high cp-values represent persisters (nematode genera that display long life cycles, few offspring, low colonization ability and high sensitivity to disturbance).MI could then be calculated from: where ν(i) is the cp-value of genus i and f(i) is the frequency of genus i.

Statistics
Rarefaction curves of sequence counts versus the taxonomic classifications were conducted in the R package vegan 2.5.6 (Oksanen et al., 2018) using the rrarefy function with default settings.Species richness (Chao1) and alpha diversity (Shannon's H) were calculated in the software Explicet 2.10.5 for each taxonomic group (Nematoda, Foraminifera, and Ciliophora).
Before calculating Shannon's H index the data was sub-sampled to the lowest sample size and bootstrapped × 100 (Nematoda 79,815 counts, Foraminifera 2473 counts, Ciliophora 299,504 counts).Non-metric multidimensional scaling (NMDS) plots showing beta diversity were based on the presence/absence (Sørensen dissimilarity index) and Bray-Curtis dissimilarity index (based on relative abundance) using the software past 3.26 (Hammer, Harper, & Ryan, 2001).The difference in read abundance between the high and low OC stations for Nematoda feeding type data was normalized and statistically tested using the R package DESeq2 1.26 with default settings (Love, Huber, & Anders, 2014).The DESeq2 output was plotted using the ggplot2 package in R (Wickham, 2016).Differences between groups on alpha diversity metrics (Chao1, Shannon's H), relative abundance of taxonomic groups, and maturity index for nematodes were tested with univariate statistics conducted in the software IBM SPSS Statistics 26.First, Shapiro-Wilk tests were used to check if the data was normally distributed.
Differences between groups in normally distributed data were tested with One-Way ANOVA tests, while non-parametric data were tested with Mann-Whitney U tests.PERMANOVA tests (9999 permutations) were conducted in the software past 3.26 and used to identify differences in beta diversity between stations, based on presence/absence, relative abundance, and Hellinger transformed data (i.e.square rooted relative abundances) (Legendre & Gallagher, 2001).To investigate if the abiotic variables (% OC, % N, PO4 3-, NH4 + , and water depth) were associated with the community composition canonical correspondence analysis (CCA) was conducted in the R package vegan 2.5.6 with the cca function and plotted using the ggplot2 3.2.1 package.The input data for the CCAs were the measured abiotic variables and relative abundances of the different taxa.Significant associations between abiotic variables and community compositions were tested for CCA axis 1 and axis 2 with PERMANOVA tests (9999 permutations) using the function envfit included in the vegan package.To detect if measured abiotic variables (rather than solely water depth) significantly explained community compositions of the three studied groups the adonis2 function in the R package vegan 2.5.6 was used.The Bray-Curtis dissimilarity matrix for each study group was loaded with the abiotic data and the abiotic variables were added in sequential order after water depth.Mantel tests (Mantel, 1967) of the of Bray-Curtis dissimilarity distances were used to test the correlation with OC by using the function mantel.rtest in the R package ade4 1.7.13 (Dray & Dufour, 2007), after turning the OC data to a distance matrix with the dist function with default settings.

Sediment 18S rRNA community
The most abundant microeukaryotic taxonomic groups in our sediments included nematodes, arthropoda (mainly copepods), rotifers, and single-celled eukaryotes such as Bacillariophyta (mainly diatoms), ciliates, and Kraken2 unclassified eukaryotic sequences that Bracken2 distributed to protists Malawimonadidae and Hemimastigophora (Figure 2a-b and Supplementary Data 3).There was a significant difference in community composition when testing the stations grouped as Low OC against High OC (Sørensen dissimilarity index test (presence/absence data), PERMANOVA, pseudo F = 6.71,P = 0.0001; Figure 2c).This was also significant when tested with Bray-Curtis dissimilarity index based on relative abundance data (PERMANOVA, pseudo F = 4.73, P = 0.0007), as well as Hellinger transformed data (pseudo F = 5.36, P = 0.0001).For this study we focused on three microeukaryotic groups used in biomonitoring: Nematoda (average 4% of all eukaryotes), Foraminifera (average 0.15% of all eukaryotes), and Ciliophora (average 7% of all eukaryotes).

Alpha and beta diversity for nematodes, foraminifera, and ciliates
Rarefaction analyses showed that the majority of the genera had been detected in the samples (Supplementary Figure 1).The species richness Chao1 index and Shannon's H alpha diversity index were significantly lower for Nematoda and Foraminifera at the High OC stations compared to Low OC (Chao1: One-Way ANOVA test for each group; Nematoda, F(1,19) = 32.7,P = 0.000016; Foraminifera, F(1,19) = 57.0,P = 0.0000004; Figure 3a-b 3de).No significant difference in species richness or Shannon's H alpha diversity was observed for Ciliophora when comparing High OC stations with Low OC (Figure 3c).A full list of Shannon's H values is available in Supplementary Data 4.
Beta diversity was also significantly different between stations for all three groups, with the presence/absence Sørensen dissimilarity index test (PERMANOVA) when testing the stations grouped as Low OC against High OC (PERMANOVA) test for each group, Nematoda, pseudo F = 11.4,P = 0.0001; Foraminifera, pseudo F = 25.5, P = 0.0001; Ciliophora, pseudo F = 5.1, P = 0.0001; Figure 4).Similar results were also observed when using the Bray-Curtis dissimilarity index based on relative abundances as well as Hellinger transformed data (Supplementary Figure 2) CCAs based on the relative abundance of genera showed that the measured abiotic variables (water depth, sediment % C and % N, plus pore water NH4 + and PO4 3-) were associated with the High OC stations for all of the three studied taxonomic groups, i.e.Nematoda, Foraminifera, and Ciliophora (Figure 5).The CCA analysis showed that 67 %, 77 % and 66 % of the total constrained inertia for nematodes, foraminifera, and ciliates, was explained with the five environmental variables here studied, respectively.There was also a significant association between all five abiotic variables and the community composition for each studied group (PERMANOVA test, Nematoda, R 2 = 0.76-0.83,P < 0.001; Foraminifera, R 2 = 0.52-0.94,P < 0.05; Ciliophora, R 2 = 0.53-0.85,P < 0.01; Supplementary Data 5).
Moreover, adonis PERMANOVA tests showed that OC was a significant variable determining community composition for all three taxonomic groups, even when accounting for the variance explained by depth (Nematoda pseudo F = 6.06,Foraminifera pseudo F = 12.85, Ciliophora pseudo F = 6.01; all P < 0.001; see Supplementary Data 5 for results of all variables).OC was also tested separately with mantel tests with the Bray-Curtis dissimilarity distances for each of the three taxonomic groups, and was positively correlated with the community composition (Nematoda r = 0.58; Foraminifera, r = 0.76; Ciliophora, r = 0.54; all P = 0.0001).

Differences in Foraminifera community structure
Looking closer at Foraminifera, the genera with a high relative abundance such as Elphidium had a significantly higher relative abundance among the Foraminifera at the High OC stations (66.3 ± 2.7 %) compared to Low OC (16.2 ± 2.1 % (% denote portion of Foraminifera community); Mann-Whitney U test, U = 0, P = 0.000007; Figure 6b).On the other hand, the genus Rhizammina had a significantly higher relative abundance at the Low OC stations (17.9 ± 3.1 %) when compared to High OC (1.0 ± 0.8 %; Mann-Whitney U test, U = 2, P = 0.000027; Figure 6b).In addition, we also detected genera with a low relative abundance that showed a significant difference, although with high variation, between Low OC and High OC stations.

Differences in Ciliophora community structure
Examples of Ciliophora genera that were significantly different between the Low OC and High OC stations included Cryptocaryon that had a significantly higher relative abundance at the Low OC stations (17.4 ± 1.4 % compared to High OC 12.4 ± 1.4 % (% denote portion of Ciliophora community), Mann-Whitney U tests, U = 21, P = 0.018; Figure 6c).Instead, the genus Spirotrachelostyla had a significantly higher relative abundance at High OC (2.9 ± 0.5 %) when compared to Low OC (0.8 ± 0.1 %, Mann-Whitney U test, U = 9, P = 0.00066; Figure 6c).Ciliophora with low relative abundance that had significant difference, although with high variation, between Low OC and High OC stations included e.g.Bresslaua and Epiphyllum having a higher relative abundance at Low OC (2.0 ± 2.5 % and 1.8 ± 0.8 % compared to 0.1 ± 0.1 and 0.9 ± 0.5 % at High OC, respectively), and Zosterodasys having a higher relative abundance at High OC sites (2.8 ± 1.0 % compared to 1.3 ± 1.0 % at Low OC; Mann-Whitney U tests, P < 0.05; Fig. 6).A full list of taxonomic classifications and sequence counts for all three studied groups are available in Supplementary Data 6.

Nematoda functional ecology
The maturity index calculated from classified Nematoda genera showed no difference between High OC and Low OC stations (1.9 ± 0.1 maturity index for all samples, One-Way ANOVA test, Supplementary Data 7).Considering that values closer to one indicate habitat colonizers (and values closer to five indicate habitat persisters) the nematode communities in this study are considered colonizers.Looking closer at the classified feeding type of the nematodes the Genera classified as non-selective deposit feeders (1B, following the classification systems by Wieser (1953)) had a significantly higher number of reads in the High OC stations when compared to Low OC (log2 fold change 1.79, DESeq2 analysis, false discovery rate (FDR) < 0.01; Figure 7).In contrast, the Low OC stations had significantly more genera classified as selective deposit feeders (1A, log2 fold change 1.62) and predator/omnivores (2B, log2 fold change 1.40) (FDR < 0.01 and FDR < 0.05, respectively; Figure 7).A full list of all maturity index and feeding type classifications and their relative abundance per Nematoda genera is available in Supplementary Data 7.

Discussion
In this study we investigated if current sequencing technology and eRNA shotgun sequencing has the power to differentiate changes in biodiversity of multiple microeukaryotes in bulk sediment samples.We focused on nematodes, forams, and ciliates which are useful bioindicators and known to change in diversity and community composition in relation to environmental change (Gong et al., 2015;Ingels et al., 2009;Martins et al., 2015;Pawlowski et al., 2014;Ristau et al., 2015).The results showed a difference in community structure for each of the communities along the OC gradient in the study area.For example, the nonselective deposit feeding nematode genera Sabatieria and Axonolaimus (Schratzberger, Warr, & Rogers, 2007) had a higher relative abundance at the High OC stations.Potentially this could be a beneficial feeding strategy at the deeper stations where the sediment consists mainly of decayed organic particles and bacteria as food (and is reflected in the nematode feeding type analysis; Figure . 7).Sabatieria are typical nematodes found in organic rich sediments, and have been identified in sediments also containing other non-selective deposit feeders such as Daptonema (Armenteros et al., 2009;Broman et al., 2019;Montagna & Harper, 1996;Schratzberger, Warr, & Rogers, 2006).Interestingly, the genera Daptonema and Desmolaimus (also a non-selective deposit feeder (Schratzberger et al., 2007)) had a higher relative abundance at the Low OC stations.The Low OC stations had more nematodes classified as selective deposit feeders and predator/omnivores, suggesting different kinds of food and increased competition for the available food in these sediments.Nematodes of the genus Sabatieria are known to also inhabit deeper layers of the sediment in the Baltic Sea (Nascimento et al., 2008) and it is possible that such a response to increased competition in the top sediment layer influenced the relative abundance of this genus in Low OC sediments.In addition, the chemistry data indicate that the High OC sediments had higher concentrations of dissolved phosphate compared to the Low OC stations, which indicate more reduced conditions and generally a thinner oxic zone (Bonaglia, Deutsch, Bartoli, Marchant, & Brüchert, 2014).
This could be beneficial for Sabatieria which is known to be resistant to low oxygen conditions (Broman, Bonaglia, et al., 2020).These nematode genera (Axonolaimus, Daptonema, and Sabatieria) have previously been detected in other basins of the Baltic Sea using 18S rRNA gene metabarcoding (Broman et al., 2019), and here their presence was confirmed by shotgun sequencing.
The foram genera Elphidium and Rhizammina showed contrasting patterns in the dataset, with Elphidium having higher relative abundance at High OC stations, and Rhizammina at the Low OC stations.Both Elphidium and Rhizammina are known to exist in the south-western Baltic Sea (Frenzel, Tech, & Bartholdy, 2005;Schweizer, Polovodova, Nikulina, & Schönfeld, 2011), and to our knowledge, this is the first study using molecular data to investigate diversity of forams in the Gulf of Finland.Many benthic forams depend on high saline conditions because they build shells (so called tests) with calcium carbonate (Charrieau, Filipsson, Nagai, et al., 2018), while some species instead agglutinate sediment particles (Charrieau, Filipsson, Ljung, et al., 2018).Considering the low saline conditions in our study area it is likely difficult for calcitic forams to fully develop calcified tests.In a study by Charrieau, Filipsson, Ljung, et al. (2018) species belonging to the calcitic foram genera Elphidium and Ammonia were found in the Southern Baltic Sea at slightly higher salinities, but with dissolved tests.It is therefore possible that many of the calcite forming forams found in our study had none, partly developed, or dissolved tests.Previous morphological studies have shown that the community composition of forams change in response to OC enrichment as observed in the north Atlantic (Alve et al., 2016) and Mediterranean Sea (Jorissen et al., 2018).Even though such studies are missing for the Baltic Sea, our data indicate that Elphidium increased in relative abundance to OC enrichment.The morphospecies Elphidium excavatum has been found in OC-rich, brackish sediments in Japan (Takata, Takayasu, & Hasegawa, 2006), however it is not certain that the same species is present in our study.Similarly to nematodes in our study, forams also showed a lower alpha diversity at the High OC stations.This finding is in accordance with previous metabarcoding work by Pawlowski et al. (2014) that also found benthic forams to have a lower alpha diversity and different community composition as a response to high organic matter areas (fish farms, North Atlantic, Scotland).For taxonomic classification of protists, SILVA is one of the recommended options when classifying 18S rRNA sequences (Creer et al., 2016).However, we were still surprised to see many differences in classified Foraminifera genera between the SILVA and NCBI NT databases.For example, SILVA reported a high relative abundance of genera (e.g.Calcarina, up to 35% in the offshore stations) never previously detected in the Baltic Sea (Supplementary Data 6).Almost 100 foram species have been reported from the south-western Baltic Sea, but very few studies investigating forams in the central and north Baltic Sea are available (Frenzel et al., 2005).Hard-shelled (calcitic and agglutinated) forams have low densities in our study area and soft-shelled (organic) forams are not often studied morphologically.The NCBI NT data also reported potential alien species such as Planoglabratella previously detected in shallow New Zealand sediment (Hayward, 1999) and this could also be due to database limitations (Fig. 6).It is therefore possible that the differences we observed between databases are due to a limited number of 18S rRNA foram sequences in the databases.There are specific foram databases such as the PFR 2 (Morard et al., 2015).However, this database is focusing on oceanic planktonic forams which are absent from the Baltic Sea.Nevertheless, we report good results with the NCBI NT database.
Regarding the ciliate community, Stoeck et al. (2018) used metabarcoding and showed that benthic ciliate communities in the vicinity of fish farms (i.e.areas with high organic matter) had a lower alpha diversity and different community composition compared to non-affected reference sites.In our study we also observed a significant difference in ciliate community composition between the Low OC and High OC sites, and although there was a decrease in alpha diversity it was not significant.Neither was there a difference between ciliate genera with a high relative abundance (except for Cryptocaryon).Therefore, the difference observed in beta diversity is likely due to differences in low abundant genera.The higher variance in ciliate alpha diversity (compared to nematodes and forams) indicates that higher replication and associated statistical power is required to detect differences in ciliate diversity between the stations.The ciliate genus Cryptocaryon was more prominent at the Low OC stations.This is a marine genus known to include parasitic species targeting fish (Wright & Colorni, 2002).
However, low-saline (5-7 ppt) variants of Cryptocaryon have previously been described (Yambot, Song, & Sung, 2003).Cryptocaryon-like ciliates have previously been detected phylogenetically in the more saline (~14 ppt) deeper waters of the Baltic Sea (Stock, Jürgens, Bunge, & Stoeck, 2009), and both SILVA and NCBI NT (and manually checking classified sequences via BLAST) confirmed Cryptocaryon-like ciliates in our samples.Potentially because the Low OC stations were located in shallow areas closer to the shore, the Cryptocaryon-like ciliates detected in this study might be adapted to lower salinities (~7 ppt) and related to host organisms residing in these more diverse and euphotic habitats.In addition, considering that the summer heatwave of 2018 was one of the most intense ever recorded in the study area (Humborg et al., 2019), the warmer waters might attract Cryptocaryon-like ciliates which are typically more common at temperate and tropical temperatures (Colorni & Burgess, 1997).As far as we are aware, this is the first time such species have been reported from the Gulf of Finland based on molecular data.Finally, measurement of more geochemistry variables such as iron and sulfate, and sediment microprofiles (e.g.oxygen profiles) could help to further explain differences in nematode, foram, and ciliate diversity and taxonomy between low and high OC stations.
Limitations of this study include the relatively small sample size for RNA extraction (2 g).
A previous study investigating the effect of sample size on diversity estimations of microeukaryotes and metazoans using metabarcoding, found that larger volumes of sediments are necessary to accurately estimate small-scale spatial heterogeneity in biodiversity (Nascimento, Lallias, Bik, & Creer, 2018).Here we used the available commercial extraction kit that could process the largest input of sediment volume.It will be useful for future studies to develop larger kits for eRNA extraction and investigate similar effects of sample size on biodiversity assessment based on eRNA.Nevertheless, even with this sample size we report clear differences in biodiversity of multiplied communities including metazoans along the environmental gradients.We also used a kit to deplete bacterial rRNA in the laboratory, and this might have influenced the eukaryotic rRNA results.However, this influence should be similar among all samples and have a negligible impact in the results here shown.Previous studies focusing on eukaryotic RNA data backed up with microscopy has shown that bacterial rRNA depletion did not change the main findings as determined by microscopy (Broman, Bonaglia, et al., 2020;Broman, Varvara, Dopson, & Hylander, 2017).Organisms contain multiple ribosomes and while this is not an issue when analyzing species richness or beta diversity with a presence/absence index, for community composition organisms with a large number of ribosomes could skew the proportions.This issue also exists in DNA metabarcoding studies, with many eukaryotic organisms carrying multiple genome copies per cell (i.e.polyploidy) (Edgar, Zielke, & Gutierrez, 2014) and for prokaryotes that can also be polyploid (Soppa, 2017).With the shotgun sequencing approach there is not a specific region of the 18S rRNA targeted by PCR and instead all regions are sequenced randomly.This could potentially influence the classification for certain taxa such as Foraminifera where some 18S rRNA regions have been shown to be more precise than others (Pawlowski & Lecroq, 2010).However, in this study the goal was to investigate if changes in biodiversity of multiple communities along environmental gradients could be detected using eRNA shotgun sequencing.For this purpose getting a large number of reads covering as much as possible of the 18S rRNA region for different taxa is likely a benefit.Methods used in this study (and metabarcoding studies as well) relies heavily on the information in the reference databases when classifying taxonomy.In this study we set the lower limit to "genus" during taxonomic classification to compare relative abundance, as species would likely decrease the accuracy.However, we discovered that a large portion of unclassified eukaryotic sequences by Kraken2 had been distributed to protists groups Malawimonadidae and Hemimastigophora.These two groups have been found in soil, as well as freshwater for Malawimonadidae (Adl et al., 2019;Lax et al., 2018).Considering Kraken2 only detected a few hundred sequences for these taxa, while Bracken2 distributed millions of unclassified eukaryotic sequences to these taxa it is possible that this is an effect of the information available in the databases.For example, Hemimastigophora was recently phylogenetically placed outside all eukaryote supergroups (Lax et al., 2018), and this could explain why Bracken2 would distribute unclassified sequences to such taxa.When databases grow with more reference species for microeukaryotes tools such as Kraken2 are likely to become more accurate for both genus and species level.The cost of shotgun sequencing on the latest platform is still quite high (several thousands of USD per Illumina NovaSeq S4 lane, yielding ~2000 million read-pairs) compared to metabarcoding of marker genes (a few thousand USD per Illumina MiSeq flowcell, yielding ~18 million read-pairs; yields are based on information from SciLifeLab, Stockholm).However, there has been a large decrease in sequencing cost over the past 20 years (Wetterstrand, 2020) and if this trend continues, alongside streamlined bioinformatic protocols, large scale eRNA shotgun studies could be a future possibility in biomonitoring programmes.
Shotgun sequencing "catches" all organisms in the sample including both prokaryotes and eukaryotes (Zepeda Mendoza, Sicheritz-Pontén, & Gilbert, 2015).rRNA has a relatively short lifespan in the environment (Blazewicz, Barnard, Daly, & Firestone, 2013) and sediment surfaces in shallow water systems are highly active environments The approach used in this study therefore likely targeted active organisms present in the study area, as well as eRNA derived from other motile organisms that were originally not located in the collected sediment (up to ~13h or longer in biofilm, see Wood et al., 2020).However, there have been studies that indicate that rRNA might be stable for long periods, potentially up to several years in deep sea sediments, therefore potentially failing to reflect live communities better than eDNA in such stable environments (Brandt et al., 2019;Orsi, Biddle, & Edgcomb, 2013).It is possible that rRNA is also prevalent for some time in other sediments including shallow systems.However, because coastal ecosystems are highly active environments rRNA degradation is expected to be faster.eRNA sequencing is therefore a potential useful method to study benthic communities, especially considering that a substantial portion of sediment consists of longlasting dead organic matter (Burdige, 2006).For example, forams are a known microfossil group and with the use of DNA extraction it has been possible to study the ancient DNA of these organism (Lejzerowicz et al., 2013).However, limitations such as rRNA stability, especially below the oxic sediment surface, cannot be ruled out.Nevertheless, for biomonitoring studies changes in biodiversity and taxonomy are studied over time from the same stations and this might be less problematic.Shotgun sequencing of eRNA has been used in a wide variety of marine studies, including investigations of prokaryotic communities (Broman, Sachpazidou, Pinhassi, & Dopson, 2017;Broman, Sjöstedt, Pinhassi, & Dopson, 2017;Klindworth et al., 2014;Urich et al., 2014), marine viruses, (Culley, Lang, & Suttle, 2006), sediment eukaryotic metatranscriptomes (Broman, Varvara, et al., 2017), nematodes in oxygen deficient sediment (Broman, Bonaglia, et al., 2020), old marine groundwaters in the deep terrestrial biosphere (Lopez-Fernandez et al., 2018), and has also been used in similar environments such as soil ecosystems (Urich et al., 2008).Many studies have used eRNA to study prokaryotes (as mentioned above and e.g.Cottier et al., 2018), however there is paucity of studies using eRNA to assess the biodiversity of microeukaryotes in sediment.In addition, active bacterial communities could also be an interesting method to monitor changes in OC content.Although 18S rRNA metabarcoding has gained popularity to investigate such communities (see e.g.Birrer et al., 2018;Comeau, Lagunas, Scarcella, Varela, & Lovejoy, 2019;Rodríguez-Martínez et al., 2020), we have here shown that eRNA shotgun sequencing is also a viable approach to detect differences in diversity and community compositions for multiple communities as a response to different environmental conditions.Even though not directly compared in this study, shotgun sequencing avoids PCR limitations of metabarcoding such as i) PCR primers only targeting certain species; ii) amplifying certain species more than others, and iii) the amount of cycles and type of polymerase used has been shown to influence diversity and community composition (Kelly et al., 2019;Nichols et al., 2018).In addition, eRNA shotgun studies also provide information on all organisms over a large range of trophic stages in the sediment, and it is also possible to study the RNA transcripts of expressed genes to estimate oxidation and reduction processes from prokaryotic metabolism (see e.g.Broman, Sjöstedt, et al., 2017).Finally, common shotgun sequencing bioinformatic pipelines are intricate and include many different software which increases the complexity of the data analysis.Here, conversely, we followed a protocol with few and straightforward steps, including: 1) Quality trimming (removal of Illumina adapters and phiX control sequences, quality trimming, and verifying final quality); 2) extraction of SSU rRNA sequences from the dataset; 3) taxonomic classification of the SSU rRNA sequences using Kraken2 against the NCBI NT and SILVA databases; and 4) estimation of relative abundance at genus level using Bracken2 (for more details on the kraken2+bracken2 combination see Wood et al., 2019).
These new bioinformatic tools make it less daunting and possible to classify large datasets containing hundreds of millions of sequences within minutes to hours which would previously have taken several weeks using traditional aligners.As such this approach is closer to current metabarcoding bioinformatic pipelines with relatively straightforward steps (see e.g. the DADA2 pipeline Callahan et al., 2016).Moreover, ongoing developments in machine learning could make eRNA shotgun sequencing a powerful tool for future biomonitoring programmes.
Such innovative approaches combined with eRNA shotgun sequencing would allow to bypass some of the database limitations when assigning taxonomy as the data can be incorporated into taxonomy-free models (Cordier et al., 2018;Cordier, Lanzén, Apothéloz-Perret-Gentil, Stoeck, & Pawlowski, 2019).

Conclusions
Here we have shown that eRNA shotgun sequencing is a useful tool to study the biodiversity of benthic microeukaryotes.The latest sequencing technology yields tens of million sequences per sample and this makes it possible to investigate the biodiversity of multiple communities.
In our study we focused on three microeukaryotic groups (nematodes, forams, and ciliates).
We were able to detect a decrease in biodiversity for nematodes and forams in sediments with higher OC, when compared to low OC sediments.Moreover, we detected differences in beta diversity for all three groups between the stations along the OC gradient, as well as in the functional ecology of nematodes (i.e.feeding type).Considering that future sequencing technologies are likely to develop and decrease in cost, shotgun sequencing of eRNA to assess biodiversity of benthic microeukaryotes could be a useful method in recurring monitoring programmes.Taken together, eRNA shotgun sequencing and new bioinformatic tools give the opportunity to simultaneously study a large diversity of microeukaryotes within a reasonable time frame.These methods also make it possible to avoid any biases introduced by PCR amplification, and thus captures the whole environmental diversity in the samples.were first reported in (Broman, Sun, et al., 2020).Each circle represents one sediment core.
The P values shows the statistical significance (PERMANOVA) between the abiotic data and community composition when tested between Low OC and High OC stations.

Figure 2 .
Figure 2. Pie charts showing eukaryotic taxonomic groups on the highest level (based on all

Figure 3 .
Figure 3.The boxplots show the species richness Chao1 and Shannon's H alpha diversity index

Figure 4 .
Figure 4. NMDS plots showing the beta diversity of the three studied taxonomic groups in the

Figure 6 .
Figure 6.The stacked bars show the taxonomic classifications for the groups (a) Nematoda,