Pleistocene glacial and interglacial ecosystems inferred from ancient DNA analyses of permafrost sediments from Batagay megaslump, East Siberia

Pronounced glacial and interglacial climate cycles characterized northern ecosystems during the Pleistocene. Our understanding of the resultant community transforma-tions and past ecological interactions strongly depends on the taxa found in fossil assemblages. Here, we present a shotgun metagenomic analysis of sedimentary ancient DNA ( seda DNA) to infer past ecosystem- wide biotic composition (from viruses to megaherbivores) from the Middle and Late Pleistocene at the Batagay megaslump, East Siberia. The shotgun DNA records of past vegetation composition largely agree with pollen and plant metabarcoding data from the same samples. Interglacial ecosystems at Batagay attributed to Marine Isotope Stage (MIS) 17 and MIS 7 were characterized by forested vegetation ( Pinus , Betula , Alnus ) and open grassland. The microbial and fungal communities indicate strong activity related to soil decomposition, especially during MIS17. The local landscape likely featured more open, herb- dominated areas, and the vegetation mosaic supported birds and small omnivorous mammals. Parts of the area were intermittently/partially flooded as suggested by the presence of water- dependent taxa. During MIS 3, the sampled ecosystems are identified as cold- temperate, periodically flooded grassland. Diverse megafauna ( Mammuthus , Equus , Coelodonta ) coexisted with small mammals (rodents). The MIS 2 ecosystems existed under harsher conditions, as suggested by the presence of cold- adapted herbaceous taxa. Typical Pleistocene megafauna still inhabited the area. The new approach, in which shotgun sequencing is supported by metabarcoding and pollen data, enables the investigation of community composition changes across a broad range of taxonomic groups and inferences about trophic interactions and aspects of soil microbial ecology.


| INTRODUC TI ON
Ongoing climate change affects biota at the ecosystem level (Pecl et al., 2017;Tylianakis et al., 2008;Walther, 2010). Amplified arctic warming causes northern boreal and arctic ecosystems to experience more extreme climate change than other ecosystems in the Northern Hemisphere (Miller et al., 2010) fostering the northward extension of shrubs and trees into tundra ecosystems (Myers-Smith et al., 2019;Tape et al., 2006). Due to the complexity of biotic and abiotic interactions, predictions of species distribution patterns at ecosystem scale are difficult to establish (HilleRisLambers et al., 2013;Walther et al., 2002).
Macrofossils mostly originate from plants growing at the study site and represent local conditions. However, the number of identifiable macrofossils is affected by taphonomy and preservation, which can vary strongly among taxa (e.g., Foote & Raup, 1996;Kienast et al., 2001;Tomescu et al., 2018). Similar issues affect vertebrates (Behrensmeyer, 1988;Gardner et al., 2016). Pollen records reflect regional ecological changes well, but information is constrained by different pollen productivities and transportability (Birks et al., 2012;van der Knaap, 1987). Moreover, the taxonomic resolution of pollen is limited, typically in trees and shrubs to genus/subgenus, but more variably from species to family in herbaceous taxa (Beug, 1961;Moore et al., 1991). In a rigorous comparison, pollen and sedaDNA (metabarcoding) from a lakesediment core (Polar Urals; Clarke et al., 2019) yielded a similar floristic richness, 114 and 137 vascular plant taxa, respectively, but each method featured a different configuration of taxa. How sedaDNA represents vegetation at a sampling site requires further work as both taphonomy, for which investigation methods are still in their infancy, and geomorphological processes, which are site dependent, impact the scale of the sedaDNA signal (Alsos et al., 2018;Giguet-Covex et al., 2019). However, there is increasing evidence that DNA from lacustrine sediments reflects the catchment vegetation, while DNA from terrestrial sediments (e.g., palaeosols) provides a highly local signal (Edwards, 2020).
Here, we present data showing ecosystem-wide changes between Pleistocene glacial and interglacial intervals in East Siberia inferred from sedaDNA shotgun sequencing, plant metabarcoding, and pollen analysis. Five sediment samples from the Batagay megaslump are analyzed. The dating of samples is imprecise: the oldest dates from the period relating to MIS 17 to MIS 16 (~650 kyr), another from MIS 16 to MIS 7 (~600 to ~200 kyr), the third to MIS 3 (~50 kyr), and the two youngest samples date to MIS 2 (~27 and ~23 kyr).

| Site description
The Batagay megaslump (67.58°N, 134.77°E) is located on a hillslope underlain by permafrost just of the Yana River in northern Yakutia, near the town of Batagay (Figure 1). It is the world's largest known retrogressive thaw slump. The slump formed during recent decades and was ~1.8 km long and >0.8 km wide in 2019 (Kunitsky et al., 2013;Murton et al., in revision). More than 60 m deep, it exposes a discontinuous sequence of Middle Pleistocene to Holocene permafrost deposits ranging in age from >MIS 6 ( Ashastina et al., 2017) and even >MIS 17 to 16 (Murton et al., 2021) to MIS 1.
The climate in the region of the Yana Highlands is continental subarctic (defined by K̈oppen et al., 2011) and characterized by low precipitation and the largest seasonal temperature range globally.
The mean air temperature between 1988 and 2017 was −40°C in winter (December to February) and 13.7°C in summer (June to August). The mean annual precipitation was 203 mm. Both air temperature and precipitation have increased at Batagay since the midtwentieth century (Murton et al., in revision).
The Batagay megaslump is located within the northern taiga vegetation zone of the upper Yana floral district (Isaev et al., 2010).
The modern vegetation is dominated by larch (Larix gmelinii) taiga.
Siberian dwarf pine (Pinus pumila) is also common around the study site. The local flora includes shrubs such as Alnus, Betula, and Salix and dwarf shrubs such as Ledum, Vaccinium, Arctous, and Empetrum.
The ground is moist and covered with a thick layer of lichens and mosses. Only a few grass species and herbs such as Polygonum, Corydalis, and Pedicularis occur (Ashastina, 2018). The local fauna is typical for Yakutian taiga comprising small mammals such as arctic hare (Lepus arcticus) and Siberian chipmunk (Eutamias sibiricus), which are prey for birds such as the northern goshawk (Accipiter gentilis) and the golden eagle (Aquila chrysaetos). Other predators include brown bear (Ursus arctos), lynx (Lynx lynx), and smaller species such as mink and wolverine (Mustelidae family). Wolves (Canis lupus) predate bigger prey such as reindeer (Rangifer tarandus) and elk (Alces alces). Insect-eating birds come to the taiga to breed, whereas seedeaters or omnivorous birds such as finches (Fringillidae), sparrows (Passeridae), and crows (Corvidae) are present all year round.

| Fieldwork and subsampling
During fieldwork in late July and early August 2017, we studied several cryostratigraphic units of the Batagay megaslump and took permafrost sediment and ground-ice samples for cryostratigraphic, sedimentological, geochemical, stable-isotope, chronological, and palaeoecological analyses. Before sampling, the exposure was cleaned and described. Ten samples were taken for sedaDNA analysis using a battery-driven hand-drill or hammer and axe cleaned with bleach between each sample. Samples were placed in two whirlpack bags, stored for several hours in a plastic container placed directly on the permafrost table, covered by a thick moss layer, and then moved to a freezer. Samples were kept frozen during transportation to the DNA laboratory in Potsdam.
Five samples were selected for this study based on the cryostratigraphic and chronological data (Table 1; Figure S1).
Subsampling took place in a climate chamber, in a different building than that housing the molecular genetics laboratories. The temperature was set to −10°C to prevent thawing of the core material.
The subsampling procedure was carried out following the protocol described by Zimmermann, Raschke, Epp, Stoof-Leichsenring, Schwamborn, et al. (2017) using the same equipment, treatments, and facilities. Ten subsamples were collected: two per sample for both sedaDNA and palynological analyses and stored at −20 and +4°C, respectively.

| Chronology
The ancient permafrost exposed at the Batagay megaslump is the oldest known in northern Eurasia (Murton et al., 2021). The pilot chronology of the exposed permafrost sediments is based on four dating techniques: radiocarbon dating of organic material, optically stimulated luminescence (OSL) dating of quartz grains, postinfrared infrared (pIRIR) luminescence dating of K-feldspar grains, and 36 Cl dating of ice wedges. Estimated ages for the sampled cryostratigraphic units are provided in Table 1 (for details, see Murton et al., 2021). The lower ice complex, hosting the sample B17-D3, developed at least 650 kyr ago, likely at some time within MIS 17 or 16.
The lower sand unit, discordantly overlying the lower ice complex, was formed at some time between MIS 16 and MIS 6 but is likely nearer in age to the latter (Ashastina et al., 2017). Sample B17-D5 was taken from the lowermost part of this unit and may be partially  Figure S1). Sample names within the manuscript consist of the sample ID and its associated MIS. In this way, we provide age context to the reader, even if we constrain the age information later on.
The complete pollen record is available in Table S1. The extracts with a concentration lower than 3 ng/μl were concentrated with GeneJet PCR purification kit (Thermo Fisher) and samples were diluted to a final concentration of 3 ng/μl.

| Metabarcoding approach
The PCRs were performed with modified trnL g and h primers (Taberlet et al., 2007;

| Shotgun approach
The DNA libraries were prepared following the single-stranded DNA library preparation protocol of Gansauge et al. (2017)

| Metabarcoding
Filtering, sorting, and taxonomic assignments of the metabarcoding sequences were performed with the OBITools package (Boyer et al., 2015). With the illuminapairedend function, we merged forward and reverse reads, and demultiplexing and sample sorting were performed with the ngsfilter function. With the obigrep command, sequences shorter than 10 bp and less than 10 read count were excluded. Duplicated sequences were merged with obiuniq and cleaning of potential PCR or sequencing errors was performed with the obiclean function. Two reference databases were used for taxonomic assignments. The first one, ArctBorBryo, is based on the quality- Final taxonomic assignments were determined by selecting the best identity given by EMBL or the ArctBorBryo database; if both assignments showed the same sequence identity, the taxonomic name was selected from ArctBorBryo database because of its specificity to arctic and boreal vegetation. Only sequences assigned with a best identity of 100% and present in at least two replicates of a sample or two different samples were used for this study. The sequences assigned to a taxonomic level higher than family were removed from the dataset. Sequences were also removed if more than 0.2% of their total read counts were present in the extraction blanks TA B L E 2 Summary of reads assigned with kraken2 to main taxonomic groups  (Table S2). Sequence data of the PCR replicates were merged to represent samples and sequences assigned to the same taxon were merged.
Before proceeding to the data analysis, a rarefaction analysis was performed based on the minimum number of sequence counts (n = 84,348, for sample B17-D3_MIS17-16) to normalize the total counts of each sample (https://github.com/Stefa nKrus e/R_Raref action). The complete metabarcoding record, before and after rarefaction, is available in Table S2.

| Shotgun
Demultiplexed FASTQ files obtained from the two sequencing runs were quality checked, trimmed, and paired-end merged using fastp (Chen et al., 2018). Statistics of both sequencing runs are reported in Table S3. We then merged both sequencing run files and proceeded to taxonomic classification using kraken2 (v. 2.1.2, Wood et al., 2019) against two different databases.
The first database used was the non-redundant nucleotide database (nt) from the National Center for Biotechnology Information (NCBI) (ftp://ftp.ncbi.nlm.nih.gov/blast/ db/FASTA/ nt.gz; downloaded in May 2020), and the NCBI taxonomy (retrieved via the kraken2-build command). It was used with a confidence threshold of 0.05, first, to differentiate Eukaryota from Prokaryota and Viruses and, second, for the taxonomic assignment of Prokaryota (Bacteria and Archaea) and Viruses.
The sequences previously unassigned and assigned to Eukaryota against the nt database were extracted and taxonomically reassigned using kraken2 with default confidence parameters against a second, customized, database of curated full chloroplast and mitochondrion genomes from the NCBI Reference Sequence Database (Refseq, https://www.ncbi.nlm.nih.gov/refse q/, 10,010 sequences downloaded in December 2020).
Using the nt database to taxonomically assign our reads to Prokaryota and Viruses and the curated Refseq database for Eukaryota, we created different subsets for working taxonomic groups. We included bacterial families (34,521,704 reads assigned to a family), Archaea (317,626 reads), and Viruses (42,112 reads). For the Eukaryota, we investigated families from Viridiplantae (435,733 reads), Fungi (348,619 reads), and selected terrestrial classes representing Metazoa with at least 0.5% of total metazoan reads (6887 total reads): Insecta (1493 reads), Mammalia (294 reads), and Aves (29 reads). Only families that represented at least 0.5% of the total reads assigned to each subset were kept. The only exception was for the Viridiplantae, where we kept all families that represented at least 0.2% of total reads assigned, the rationale being to capture as many as possible of the families detected via the other methods. An overview of the read count and relative proportion of the selected families is summarized in Table 2.
If the abundance of reads detected in the blanks for an assigned family exceeded 1% of its total reads, this family was removed from the dataset. Furthermore, Primate reads were removed from the dataset as probable human contaminants. Finally, as shotgun sequencing almost inevitably yields "unlikely" taxa, only the families likely to be present in the study area were kept; therefore, we excluded metazoan aquatic classes and aquatic families of Mammalia.
Detailed information about reads present in the blanks and the families excluded is reported in Table S3.
Resampling to the sample with the minimum read counts was performed with the rarefy function of the "vegan" R package (v. 2.5-7 Oksanen et al., 2020) using R v. 3.6.1 (R Core Team, 2019) for all the subsets, except the metazoan ones. Because few reads were assigned to the selected metazoan classes, differences in families detected per sample are reported instead of relative abundance, as was done in other phyla. Detailed information about lower taxonomiclevel taxa detected from selected families is reported in Table S3.

| Damage pattern analysis
When performing data analysis on a shotgun sequenced dataset, we were particularly careful that the investigated signal comes from the ancient DNA of past organisms and not from modern contamination.
Higher nucleotide misincorporation (cytosine deamination or C to T substitution) rates or DNA damage patterns can be observed on the 5′-overhang and statistically estimated. Here, we used the mapDam- Finally, the damage pattern investigation confirmed the ancient origin of our shotgun sedaDNA signal. Indeed, we identified damage patterns for all investigated subsets except Metazoa as not enough reads were aligning to the reference genomes and no damage pattern investigation was possible. A detailed description of the tested signal is available in Figure S2.

| Plants (Viridiplantae)
All the palynological, sedaDNA metabarcoding, and shotgun proxies Other woody taxa, such as Betulaceae (Alnus, Betula) in the pollen record and sequences assigned to Menispermaceae in the shotgun data, were also abundant in both interglacial spectra. The three methods also detected numerous herbs including Poaceae and other families typical of steppe communities, such as Asteraceae (Artemisia). Rosaceae was also detected with the metabarcoding approach and Solanaceae with the shotgun approach in B17-D3_ MIS17-16 and B17-D5_MIS7.
The three methods used revealed that the glacial spectra had a high proportion of non-woody taxa, in particular, a high relative abundance of Asteraceae and Poaceae. B17-D10_MIS3 was particularly rich in Plantaginaceae (Plantago) and Boraginaceae (Eritrichium) detected both using metabarcoding (13.4% of Plantaginaceae, 6.7% of Boraginaceae) and shotgun (13.5% of Plantaginaceae, 2.5% of Boraginaceae). Both B17-D7_MIS2 and B17-D6_MIS2 samples were characterized by a high abundance of Cyperaceae, Caryophyllaceae, and Fabaceae (including Astragalus and Oxytropis).

| Mammals (Mammalia)
Overall, this class had few reads assigned to family level, especially for the deepest and oldest sample (B17-D3_MIS17-16): six reads assigned to selected families. Selected families were detected even at higher confidence thresholds (--confidence 0.5 with kraken2), to taxa such as Rangifer (Cervidae), Bison (Bovidae), Equus (Equidae), Prionodon (Viverridae), and even Mammuthus (Elephantidae) but almost no exotic terrestrial taxa were found (Table S3). In addition, among the few metazoan selected families detected, three (out and insectivores in the hedgehog family (Erinaceidae).

Evidence of larger mammals, including horses (Equidae) and
cervids (Cervidae), occurred again in the two samples dated MIS2.
Predators such the Canidae and even the Felidae were detected.

| Birds (Aves)
Only very few reads were assigned to this class and no representative of the selected families was present in all five samples (Figure 2b).
In B17-D10_MIS3, the woodland tit family of Paridae was detected. We also detected sandpipers (Scolopacidae) which, like Paridae, need an availability of invertebrates as a food source in addition to water; a need which is shared by herons (Ardeidae) that were also detected. Ground-dwelling rails (Rallidae) were detected too.
In B17-D7_MIS2 and B17-D6_MIS2, the avian community was similar, characterized by ground-dwelling birds from Phasianidae.
Birds of prey such as falcons (Falconidae) in B17-D7_MIS2 and eagles (Accipitridae) in B17-D6_MIS2 were also detected. In addition, some bird families that need water habitat such as sandpipers (Scolopacidae) or cranes (Gruidae) were detected in B17-D6_MIS2.
Lastly, families that typically need proximity to woodlands were detected in both samples, such as Paridae in B17-D7_MIS2 or doves and pigeons (Columbidae) in B17-D6_MIS2.

| Insects (Insecta)
Reads were assigned to Insecta in every sample, even at higher thresholds (--confidence 0.5). For the three most recent samples (B17-D7_MIS2, B17-D6_MIS2, and B17-D10_MIS3), between 137 and 374 reads were assigned to selected Insecta families, while only 73 were assigned to the oldest sample, B17-D3_MIS17-16, and 385 to B17-D5_MIS16-6. Culicidae, the mosquito family, and Tettigoniidae, bush crickets, were the only two families (out of the 31 selected) to be detected in all samples. In terms of insect communities, a clear difference was observed between the three most recent and the two most ancient ones (Figure 2b).
In B17-D5_MIS16-6, we also observed a high abundance of skippers (Hesperiidae), butterflies (Nymphalidae), and moths (Yponomeutidae). Several families that were only present in this sample indicate the local presence of dead wood, for example, flat bark beetles (Cucujidae) and termites (Rhinotermitidae), while the abundance of mayflies (Siphlonuridae) suggests a wetter environment. This sample is also the only one in which we detected skin beetles (Dermestidae).

| Prokaryotes (Bacteria, Archaea) and Viruses
We identified a core community consisting of Actinobacteria (especially the soil-associated families Nocardioidaceae, Streptomycetaceae, Microbacteriaceae, and Mycobacteriaceae) and Haloarchaea (Figure 2c). Beside this shared community, we detected several distinctive patterns along the depositional sequence that likely reflect changes to the microbial community in response to environmental conditions.
In the most ancient sample (B17-D3_MIS17-16), Actinobacteria (Micrococacceae; 8.5% of all selected families of bacteria), particularly the genera Arthrobacter and Pseudoarthrobacter (Table S3), and Sphingomonadaceae (4.5%), had a higher relative abundance when compared to younger samples. Halophilic archaea were also well represented in the oldest samples, especially by Natrialbaceae

| DISCUSS ION
The Batagay megaslump data date back discontinuously to MIS 17-16 (~650 kyr) or older. Hence, this study presents one of the few records from northern Eurasia that reach back to the Middle Pleistocene Ashastina et al., 2018;Tarasov et al., 2005;Wetterich et al., 2019). It is also one of the first to successfully investigate sedaDNA as old as MIS 17 (Parducci, 2019) and identify ancient DNA damage patterns for several representatives of investigated phyla supporting the signal's authenticity ( Figure S2).
Furthermore, we provide unique snapshots of Pleistocene glacial and interglacial biota that include all biological kingdoms. Previous shotgun sequencing investigations have been limited to fewer kingdoms (e.g., Murchie et al., 2021;Pedersen, 2016). Our results indicate distinct ecosystem-wide differences between forested (interglacial) and non-forested (glacial) intervals (Figure 3).  warm stages (i.e., interglaciations) by intense permafrost degradation with widespread thermokarst (e.g., Kienast et al., 2011;Reyes et al., 2010). This suggests that other interglacial and glacial deposits between MIS 17 and MIS 7 may not have been preserved. Further dating is needed to constrain the chronology of the sediments exposed in the Batagay megaslump.

| Interglacial communities
The  (Bakker et al., 2016;Smith et al., 2016). This confirms the hypothesis that large herbivores play a role in maintaining forests as a mosaic landscape composed of closed canopy woodland and open areas (Vera, 2000). Our data indicate that forest-dwelling taxa including viverrids (Viverridae) were common during MIS 7, as well as smaller taxa from the squirrel family (Sciuridae). The high proportion of insects in this sample includes forest-specific families such as flat bark beetles (Cucujidae) and termites (Rhinotermitidae) supporting the inference of forest conditions. Furthermore, the presence of parasitic insects such as Trichogrammatidae suggests a high host density, that is, other insects or mammals (Abrams, 2000).
In the MIS 7 sample, the microbial community is rather distinctive, owing to the marked increase of taxa involved in the degradation and cycling of organic matter in soil, which include the fungus Pseudogymnoascus sp. and bacteria such as Nocardioidaceae and Clostridia. Members of Pseudogymnoascus are associated with decaying roots or plants, especially in boreal forests (Sigler et al., 2000).
They can form mycorrhizal associations with arctic shrubs, which align with the increased presence of shrubs and woody taxa in MIS 7 (Semenova et al., 2015). Similarly, Nocardiodaceae of the genera Nocardioides, Corynebacterium, and Pseudonocardia are aerobic saprophytic bacteria, considered to be consumers of organic material.
Many, especially Nocardioides, secrete extracellular enzymes, and degrade and metabolize a wide range of natural organic compounds, thus being important players in the turnover of organic matter in ecosystems, including cold ones (Babalola et al., 2009;Perez-Mon et al., 2020). They have a copiotrophic lifestyle, implying that they are fast-growing and metabolically versatile, and are consequently fast responders to increased nutrient availability (Perez-Mon et al., 2020).
Clostridium is a dominant bacterium of the permafrost subsurface environment and related DNA sequences have been detected in ancient samples up to ~1 Myr (Burkert et al., 2019;Liang et al., 2019).
It can hydrolyze biopolymers such as cellulose, lignocellulose, and lignin, in deep sediments under anoxic conditions and produce alcohols, organic acids (e.g., lactate, acetate, butyrate), and hydrogen as metabolic by-products (Ueno et al., 2016). These compoundsespecially lactate-can then be fermented by Propionibacterium spp.
to propionate, acetate, and carbon dioxide, which can be assimilated by methanogenic archaea (Methanosarcinaceae) with a final synthesis of methane (Drake et al., 2009). Thus, these data suggest high organic matter turnover going on during MIS 7, both at the surface and subsurface. In permafrost and cold ecosystems generally, this is directly related to an increase in dissolved organic carbon mobilized through soil horizons, which is a consequence of relatively high ambient temperature (Bracho et al., 2016;Selvam et al., 2017). This may well mean that climate and environmental conditions were mild during this period.

| Glacial communities
Our data clearly indicate that herb communities (mostly Asteraceae, Poaceae, Cyperaceae) dominated the vegetation in the Batagay area under glacial conditions during MIS 3 and MIS 2. This is in line with the palaeo-vegetation reconstruction based on plant macrofossils and pollen previously reported for this site .
The data can represent trophic interactions (Krebs et al., 2014). indicates the availability of water in the area, which points to at least an intermittent but substantial moisture supply. This is in accordance with an increased presence of Flavobacterium spp. and other minor taxa (Tenacibaculum, Formosa, Cellulophaga, Lacinutrix), which are considered as "first responders" to phytoplankton blooms due to their activity in degrading algal cells and detritus (Williams et al., 2013;Zeder et al., 2009). This may relate to spring bloom of snow algae or algae in permafrost polygonal or thaw ponds during the cold MIS 3.
The reconstructed glacial mammoth steppe ecosystems lack modern analogs. It is hypothesized that the extinct megaherbivores played a critical role in Pleistocene ecosystems. They directly affected the abiotic component of the system by functioning as "nutrient pumps" (Gross, 2016), allowing distribution of nutrients in the ecosystem. The community structure of the mammoth steppes can be compared to that of the modern savannah (Zimov et al., 2012).
Indeed, the megafauna taxa, by their trampling and sapling consumption, may, during milder climatic intervals, have suppressed colonization by trees, keeping the open vegetation largely intact (Bakker et al., 2016;Dale Guthrie, 2001), with the exception, perhaps of woodland patches in habitats with a favorable mesoclimate (Chytrý et al., 2019;Edwards et al., 2014

| Potential and limitations of the sedaDNA shotgun approach applied to ancient permafrost sediments
Using the sedaDNA shotgun approach, we were able to identify taxa from multiple kingdoms and infer trophic structure and biotic interactions at the ecosystem level. Information about vegetation was in some cases confirmed by the more established methods of metabarcoding and pollen. However, we faced some challenges, in particular the handling of taxonomic assignment to optimize the ratio between false negative and false positives (Guillera-Arroita et al., 2017). Highly curated databases are often used to limit the biases of false positives during taxonomic assignments via filtering out unlikely taxa from the database (Pedersen, 2016). In our opinion, such methods are too sensitive to false negatives, particularly with respect to groups where we have only limited knowledge on the past biota and which are poorly presented in the databases. Therefore, we used non-curated databases and worked with likely families to allow the reconstruction of communities to be as broad as possible, limiting the false negatives but reducing the impact of false positives with a taxonomic assignment at higher taxonomic levels.
Furthermore, we opted for a more stringent taxonomic assignment using kraken2 v.2.1.2, which requires 2 hit groups (set of overlapping k-mers with the same minimizer) in comparison to the previous version kraken2 v.2.0.7-beta requiring only 1 hit group. While this leads to fewer reads assigned especially for very short DNA fragments (i.e., as those originating from old sediment up to MIS17), it minimizes the rate of false positive, thus increasing the overall confidence that the taxonomic assignments correspond to real signals.
This would also provide a stronger support to the reads assigned, for example, to Metazoa, which are known to be underrepresented in sedaDNA metagenomics when compared to other taxonomic groups as reported also by others .
To overcome the technical challenges of the metagenomics analyses of sedaDNA and further reduce taxonomic assignment biases such as false positive rates, one could improve the quality of metagenomic reference databases that are incomplete, contaminated, biased toward modern taxa, domain specific and have a skewed taxonomic representation (Breitwieser et al., 2019;Steinegger & Salzberg, 2020). The use of other tools is also advised, such as alignment-based classification with MALT (MEGAN Alignment Tool) or de novo assembly of contigs and investigation of metagenomeassembled genomes (MAGs; Herbig et al., 2016;Hübler et al., 2019;Sangwan et al., 2016) but they are still computationally very intense and will need further optimization for use with large databases.

| CON CLUS IONS
We reconstructed the palaeoecology associated with permafrost sediments from the Batagay megaslump in eastern Siberia using se-daDNA shotgun sequencing, metabarcoding, and pollen analyses.
We inferred ecosystem-wide information on glacial and intergla-

CO N FLI C T O F I NTE R E S T
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.