DNA metabarcoding for biodiversity monitoring in a national park: Screening for invasive and pest species

DNA metabarcoding was utilized for a large‐scale, multiyear assessment of biodiversity in Malaise trap collections from the Bavarian Forest National Park (Germany, Bavaria). Principal component analysis of read count‐based biodiversities revealed clustering in concordance with whether collection sites were located inside or outside of the National Park. Jaccard distance matrices of the presences of barcode index numbers (BINs) at collection sites in the two survey years (2016 and 2018) were significantly correlated. Overall similar patterns in the presence of total arthropod BINs, as well as BINs belonging to four major arthropod orders across the study area, were observed in both survey years, and are also comparable with results of a previous study based on DNA barcoding of Sanger‐sequenced specimens. A custom reference sequence library was assembled from publicly available data to screen for pest or invasive arthropods among the specimens or from the preservative ethanol. A single 98.6% match to the invasive bark beetle Ips duplicatus was detected in an ethanol sample. This species has not previously been detected in the National Park.


| INTRODUCTION
The worldwide decline in biodiversity currently presents an urgent challenge facing humanity, and slowing down or halting this decline is an objective of broad international political agreement (Thomsen & Willerslev, 2015). A major barrier to achieving this objective is the lack of knowledge of biodiversity states and patterns on a global scale (Geijzendorffer et al., 2016; Lindenmayer et al., 2012).
A well-designed monitoring effort should provide an early warning of changes in the ecosystem which could otherwise become problems that are difficult or impossible to remediate (Bohmann et al., 2014; Lindenmayer et al., 2012). One such change is the introduction of animal and plant species to non-native geographical areas.
Accurate, rapid identifications of invasive species are needed to better manage the risks associated with alien species. An estimated 1% of all neozoans and neophytes become invasive with serious economic impacts (Meyerson & Reaser, 2002; Williamson, 1996). Some taxa which are innocuous or only minor pests in their native regions have unforeseen consequences after arriving in new areas lacking microbial control, competition or predators. For example, of the six most serious forestry pests introduced in North America, only the European gypsy moth had pest status in its indigenous range (Cock, 2003). In New Zealand, the introduced painted apple moth, Orgyia anartoides (Walker, 1855), from Australia was predicted to cause €33-205 million in damage if it was not eradicated (Armstrong & Ball, 2005).
Traditional biodiversity monitoring has relied on visual observation and identification of species and counting of individuals. These efforts may be hampered by a lack of available taxonomic expertise for morphological identifications, as well as nonstandard sampling techniques (Beng et al., 2016; Corlett, 2017; Ji et al., 2013; Thomsen & Willerslev, 2015). To meet the urgent need for accurate large-scale biodiversity monitoring, molecular methods have been applied in recent years, particularly since the advent of DNA barcoding (Hebert, Ratnasingham, & de Waard, 2003). DNA barcoding, the characterization of sequence variation in a standard DNA fragment, is a broadly applicable and objective method that increases the speed and taxonomic resolution of specimen identification while reducing costs. DNA barcoding and, more recently, metabarcoding (Hajibabaei, Shokralla, Zhou, Singer, & Baird, 2011), in which genetic material is extracted from mixed or bulk samples, amplified, sequenced by high-throughput sequencing (HTS) and analysed holistically, thus help to augment biodiversity monitoring efforts (Ji et al., 2013).
DNA barcoding and metabarcoding also permit species-level identifications when only eggs, larvae or parts of specimens are available for analysis. These may be intercepted at borders (e.g. wooden pallets at airports, ports, railway stations) as they are transported by vectors or accidentally by humans, such as in the ballast waters of ships, or with animals and plants in the food trade (Borrell, Miralles, Do Huu, Mohammed-Geba, & Garcia-Vazquez, 2017). For these reasons, HTS has been considered the ideal method for early warning of invasive species (Comtet, Sandionigi, Viard, & Casiraghi, 2015).
In terrestrial ecosystems, macroinvertebrates are often stored directly in ethanol following their collection. DNA can subsequently be harvested either directly from the specimens or from the preservative.
Maceration of the specimens followed by extraction of DNA from a subsample of the homogenate is commonly practised (Yu et al., 2012), and it is probably both the simplest and the most effective way of securing a representative DNA extract from a bulk sample for metabarcoding (Elbrecht et al., 2017). However, there is a growing need to integrate sequence-based with morphological research (Silva-Santos, Ramirez, Galetti, & Freitas, 2018), and specimens must sometimes be kept intact for subsequent morphological control. Therefore, the efficiency and effectiveness of various nondestructive methods of sample preparation and DNA extraction from mixed samples for metabarcoding are a subject of ongoing research.
Additionally, an issue impacting the ability of metabarcoding to recover sequences representing the total biodiversity of a holistically homogenized sample is the bias in primer competition due to unequal specimen size (Elbrecht & Leese, 2015; Elbrecht et al., 2017; Leray & Knowlton, 2015). Larger specimens have more biomass and thus more DNA to contribute to lysed tissue pools. Therefore, larger individuals become overrepresented in sequencing results, and smaller ones underrepresented, increasing the risk of failure to detect taxa with small body sizes. Nondestructive ethanol-based DNA extraction methods have been recommended for their potential to provide solutions to sampling and vouchering challenges of metabarcoding (Hajibabaei, Spall, Shokralla, & van Konynenburg, 2012); and specifically, an ethanol filtration method has been shown to exhibit weak or even no correlation between specimen biomass and read numbers (Zizka, Leese, Peinert, & Geiger, 2019), thus potentially remediating the size-bias problem. As an objective of the present study is qualitative biodiversity analysis of mixed samples of invertebrates, we decided to supplement the standard homogenized tissue DNA extraction method with ethanol-based methods in 2018, in order to improve taxon recovery rates.
The aims of the present study are to (a) perform biodiversity analysis comparing collection sites in and around the Bavarian Forest National Park (Nationalpark Bayerischer Wald, NPBW) and in two study years; and (b) construct a custom database of potential pest and invasive arthropod species in Germany based on public data sets and literature, and use it to screen our samples for these taxa.
The results reported in this study derive from two major DNA barcoding campaigns: "Barcoding Fauna Bavarica" (BFB, www.faunabavarica.de, Haszprunar, 2009) and the "German Barcode of Life" project (GBOL, www.bolgermany.de, Geiger, Astrin, et al., 2016), which aim to establish a DNA barcode reference library for all German species. Since their initiation in 2009, DNA barcodes for more than 23,000 metazoan species in Germany have been assembled. Through the analysis of more than 250,000 specimens, the SNSB - Bavarian State Collection of Zoology (ZSM, see www.barcoding-zsm.de) has made a major contribution to parameterization of the global DNA barcode library maintained in the Barcode of Life Data System (BOLD, www.boldsystems.org, Ratnasingham & Hebert, 2007). Currently, the DNA barcode library created by researchers at the ZSM represents the second-most comprehensive library of any nation, with good coverage for Coleoptera, Diptera, Heteroptera, Hymenoptera, Lepidoptera, Neuroptera, Orthoptera, Araneae and Opiliones, and Myriapoda (see Table 1).

DNA extraction from preservative ethanol
For extraction of DNA from the preservative ethanol, we followed protocols employed by Hajibabaei et al. (2012). This evaporative ethanol technique was performed on five samples (1 May to 1 July) from each of the nine traps in 2018. A 50-ml aliquot of preservative ethanol was taken from each bottle. From this, two 1-ml aliquots were placed into Eppendorf tubes and allowed to dry overnight at 56°C. Fifty microlitres of molecular-grade water was added the next morning, and the tubes were vortexed. Afterwards, DNA extraction was performed on the entire 50-µl sample using the DNeasy Blood and Tissue kit.
For another five samples (trap T3-50B 2018; 2 July, 1 August, 2 August, 1 September, 2 September II) a 50-ml aliquot of ethanol was used for filtration of DNA and tissue residuals using analytical test filter funnels (0.45 µm, Fisher Scientific) equipped with a water jet pump. After ethanol was filtered, the filter funnels were lysed overnight at 56°C. DNA extraction was performed using the DNeasy Blood and Tissue kit following the manufacturer's instructions and eluted into 50 µl of molecular-grade water.

TABLE 2 Locations of the nine Malaise traps deployed in this study in 2016 and 2018
subsequent DNA extraction. PET bottles (500 ml) were filled with sufficient amounts of lysis mixture (9:1 insect lysis buffer/Proteinase K) and incubated overnight at 56°C. For DNA extraction, 1 ml of the lysate was used following the above-mentioned methods using the DNeasy Blood and Tissue kit. The remaining bulk sample was then dried, and the residual insect lysis buffer was discarded. Samples were then homogenized as described in Section 2.2.1 above. Amplification success and fragment lengths (~350 bp) were checked by gel electrophoresis on a 1% agarose gel.

| Pre-processing and clustering of sequence data
All FASTQ files generated were combined although they were sequenced on separate runs throughout the study period. Sequence processing was performed with the vsearch version 2.4.3 suite (Rognes, Flouri, Nichols, Quince, & Mahé, 2016) and cutadapt version 1.14 (Martin, 2011). Because some runs did not yield reverse reads of sufficiently high quality to enable paired-end merging, only forward reads were utilized. Forward primers were removed with cutadapt. Quality filtering was with the fastq_filter program of vs-

| Pest and invasive species custom reference libraries
Reference sequences for species from the following sources were  package BOLD (Chamberlain, 2018). Of the 1,004 total species names, 361 were found in BOLD. These were exported as a tab-separated file and processed into FASTA format with Linux command lines. The remaining species were searched for on NCBI GenBank (advanced search, criteria including ["COI" OR "CO1" OR "COXI" OR "COX1"]). Forty-one of the species names were found and downloaded as FASTA files. To combine the sequences from both sources into a single database and blast, we used BOLD_NCBI_Merger (Macher, Macher, & Leese, 2017
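The conversion of the tab-separated BOLD export into FASTA was done with Linux command lines that are not given here. As a rough illustration only, the same step could look like the following Python sketch. The column names `processid`, `species_name` and `nucleotides`, the header layout, the removal of alignment gaps and the toy records are all assumptions made for this example, not details taken from the study.

```python
import csv
import io


def bold_tsv_to_fasta(tsv_text, id_col="processid",
                      name_col="species_name", seq_col="nucleotides"):
    """Convert tab-separated records to FASTA text.

    Column names are assumptions about the export layout; adjust as needed.
    Rows without a sequence are skipped.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        # Assumption: gap characters in exported sequences are not wanted
        # in a reference database, so they are stripped here.
        seq = (row.get(seq_col) or "").replace("-", "").strip().upper()
        if not seq:
            continue
        header = ">{}|{}".format(row[id_col], row[name_col].replace(" ", "_"))
        records.append(header + "\n" + seq)
    return "\n".join(records) + ("\n" if records else "")


# Toy input with made-up identifiers and sequences (not real BOLD records).
example = (
    "processid\tspecies_name\tnucleotides\n"
    "X1\tGenus speciesA\tACGT--ACGT\n"
    "X2\tGenus speciesB\t\n"
)
fasta = bold_tsv_to_fasta(example)
print(fasta, end="")
```

Keeping the record identifier and the species name in the header makes it possible to trace each later match back to its source record.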

| Biodiversity analysis
As DNA metabarcoding is not quantitative (Krehenwinkel et al., 2017

| Biodiversity analysis (BOLD BIN-based database)
A total of 19,727 OTUs were produced by the pipeline. Of these, 12,513 matched at ≥73% identity to the database downloaded from BOLD. After filtering for alignment lengths of ≥100 bp, E-value of 10e-6 and ≥97% identity to the reference sequences, 5,782 matches remained. The majority of matches belonged to Arthropoda, with the majority of those belonging to Diptera (3,169), Hymenoptera (1,173), Lepidoptera (527) and Coleoptera (411). Table 3 lists total BIN detections broken down by order in 2016 and 2018, and the proportion of BINs which were recovered in both years (percentage overlap). Total read numbers produced per sample are given in Table S1, and rarefaction curves for BINs detected are in Figure S1.
(Zhang, 1994). Eurasian in origin, it was introduced to the USA in the 19th century. We detected its sequences at 100% match to the database in Malaise it was found in the same trap but more frequently: in every collection through August, and also in trap T1-34 in the first collection of June (Figure 6). Interestingly, we also observed similar patterns of presence/absence for E. tedella and its parasite, Lissonota dubia (Figure 7).
AAB6845 with Dendrolimus pini (Linnaeus, 1758), which is known throughout most of Europe, including Germany. This result illustrates that, because a small custom database was used for this task, consisting of only species of interest, hits must be investigated further when the possibility exists that a specimen actually belongs to a closely related species not in this database. Therefore, it is probable that the latter was the species which was collected. Further integrative taxonomic study is needed to examine whether superans may better be downgraded to subspecies rank or synonymy of pini.
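The hit-filtering step reported at the start of this section (alignment length ≥100 bp, E-value of 10e-6, ≥97% identity) can be sketched as a simple predicate over match records. This is a minimal illustration, not the pipeline used in the study: the `Hit` record, its field names and the toy hits are hypothetical, and the E-value is treated here as an upper bound, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One OTU-to-reference match (hypothetical record layout)."""
    otu: str
    target: str
    identity: float   # percent identity
    aln_len: int      # alignment length in bp
    evalue: float


def passes(hit, min_identity=97.0, min_len=100, max_evalue=10e-6):
    """Apply the three thresholds named in the text.

    Note: the literal "10e-6" equals 1e-5 in Python notation; whether
    1e-6 was intended is not clear from the text, so the value is kept
    as written and treated as a maximum (an assumption).
    """
    return (hit.identity >= min_identity
            and hit.aln_len >= min_len
            and hit.evalue <= max_evalue)


# Toy hits with made-up values, one failing each criterion in turn.
hits = [
    Hit("OTU_1", "BIN_A", 98.64, 313, 1e-50),  # passes all thresholds
    Hit("OTU_2", "BIN_B", 96.90, 313, 1e-50),  # identity too low
    Hit("OTU_3", "BIN_C", 100.0, 80, 1e-20),   # alignment too short
    Hit("OTU_4", "BIN_D", 99.00, 150, 1e-3),   # E-value too high
]
kept = [h.otu for h in hits if passes(h)]
print(kept)
```

Expressing the thresholds as keyword arguments makes it easy to rerun the filter with a stricter identity cut-off when a hit needs closer inspection.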

| Economically important terrestrial arthropods and other species of interest
Ips duplicatus (BOLD: ACD5566) matched at 98.64% identity to the database in Malaise trap T3-50 (inside the National Park), collection 2 July 2018, filtered ethanol sample (Table 4). I. duplicatus is endemic to northern Europe, where it is a pest of pine trees (Pinus spp.); it is unknown whether it additionally poses a threat to biodiversity.
The species was unknown in Germany at the time of publication of the warning list, but has recently been spreading southward, through central, eastern and southern Europe (Fiala & Holuša, 2019). Although another congeneric species, Ips typographus (Linnaeus, 1758) (BOLD: ACT0826), a keystone pest species in the Bavarian Forest National Park (Müller, Bußler, Goßner, Rettelbach, & Duelli, 2008), was also detected in the same trap at 100% identity, these two species' barcode sequences cluster less closely together, and they do not share a BIN. The present result is therefore likely to be a case of correct molecular identification of I. duplicatus, and to represent the first detection of this invasive saproxylic beetle in the National Park.
Detection frequencies of species of interest could also be examined. Same-time detection of host and parasite species was observed, in Epinotia tedella and Lissonota dubia, in both study years (Figures 6 and 7). These results provide support for the use of metabarcoding as a reliable method for informing phenologies of individual species. It is noteworthy, too, that detection patterns of Lymantria dispar, a known pest, potentially suggest an increase in its abundance throughout the National Park. Efforts to track the spread of pest and invasive arthropods should be continued, and metabarcoding represents a viable time- and cost-efficient method for their early detection. We think that incorporating biodiversity data from various sources, such as bulk data on BOLD, will be valuable for ongoing monitoring efforts.
Spatial biodiversity analysis
(Table 4). Of two potential matches above 97% identity to database sequences, one was a participant in BIN-sharing, clustering together with an endemic species.
Therefore, Ips duplicatus was the only mo-
Additionally, the fact that this species was detected exclusively by ethanol filtration provides further support for our recommendation of the use of multiple methods of DNA extraction in conjunction for metabarcoding efforts, whenever possible.
With the rapidly growing demand for large-scale biodiversity data, metabarcoding has gained popularity as the method of choice for any major biomonitoring initiative. Our study shows that the method qualifies as a cost- and time-efficient alternative to traditional approaches. However, despite its apparent advantages, more research is needed to overcome its current limitations in both the laboratory and informatic areas. We encourage further studies towards this aim, to investigate patterns of biodiversity across all varieties and scales of ecosystems and environments, in order to increase the ability of scientists to effectively manage resources and conserve the biodiversity upon which life on Earth depends.

ACKNOWLEDGEMENTS
We express our sincere gratitude to Olaf Schubert for administrating the collection of the specimens from the Malaise traps in the field. Without his dedicated work this study would not have been possible. We would also like to thank Dr Marina Querejeta Coma for her diligent work on the laboratory portion of this study, as well as