Optimized and affordable high‐throughput sequencing workflow for preserved and nonpreserved small zooplankton specimens

Genomic analysis of hundreds of individuals is increasingly becoming standard in evolutionary and ecological research. Individual-based sequencing generates large amounts of valuable data from experimental and field studies, while preserved samples are an invaluable resource for studying biodiversity in remote areas or across time. Yet, small-bodied individuals or specimens from collections are often of limited use for genomic analyses due to a lack of suitable extraction and library preparation protocols for preserved tissue or small amounts of tissue. Currently, high-throughput sequencing in zooplankton is mostly restricted to clonal species that can be maintained in live cultures to obtain sufficient amounts of tissue, or relies on a whole-genome amplification step that comes with several biases and high costs. Here, we present a workflow for high-throughput sequencing of single small individuals that omits the need for prior whole-genome amplification or live cultures. We establish and demonstrate this method using 27 species of the genus Daphnia, aquatic keystone organisms, and validate it with small-bodied ostracods. Our workflow is applicable to both live and preserved samples at low cost per sample. We first show that a silica-column-based DNA extraction method resulted in the highest DNA yields for nonpreserved samples, while a precipitation-based technique gave the highest yield for ethanol-preserved samples and provided the longest DNA fragments. We then successfully performed short-read whole genome sequencing from single Daphnia specimens and ostracods. Moreover, we assembled a draft reference genome from a single Daphnia individual (>50× coverage), highlighting the value of the workflow for non-model organisms.


| INTRODUCTION
Small-bodied species are particularly interesting for ecological and evolutionary research as they have the potential to rapidly adapt to environmental change due to higher levels of genetic variation (Ellegren & Galtier, 2016), shorter generation times (Blueweiss et al., 1978), and potentially faster molecular evolution (Martin & Palumbi, 1993; Thomas, Welch, Lanfear, & Bromham, 2010).
Additionally, they often play an important role in food webs and ecosystems (Sommer et al., 2012; Sommer, Gliwicz, Lampert, & Duncan, 1986). Most phylogenetic groups have a right-skewed size distribution, meaning they typically contain more small-bodied than large-bodied species (Kozłowski & Gawelczyk, 2002). Practical considerations, such as fewer sampling restrictions for invertebrates or lower storage costs, can also make it easier to work with small organisms.
Over the last centuries, more than 81 million animal specimens, over 42 million of which are arthropods, have been collected and preserved by researchers and are stored by institutions all over the world (GBIF.org, 2020). These specimens represent an invaluable source of information on extant biodiversity as well as on extinct populations and species. For example, the effect of natural selection on the demise of the Passenger Pigeon (Ectopistes migratorius) was reconstructed using mitochondrial and nuclear DNA extracted from museum samples (Murray et al., 2017). Preserved samples can also be used to analyse time series to understand how populations react to environmental change, such as species invasions or increased human disturbance. For example, Hauser, Adcock, Smith, Bernal Ramirez, and Carvalho (2002) used decades-old archived fish scales to demonstrate the loss of genetic diversity as a consequence of human overexploitation of the New Zealand snapper (Pagrus auratus). Today, samples are still routinely preserved in ethanol during field trips to remote, inaccessible areas (Camacho-Sanchez, Burraco, Gomez-Mestre, & Leonard, 2013), when it is infeasible to bring individuals back to the laboratory alive or to cryopreserve them in the field.
Recently, high-throughput sequencing techniques have revolutionized the field of biology in general and evolutionary biology in particular (Schuster, 2008). These methods enable more accurate estimation of population structure, gene flow, and genetic variation compared to previous methods that relied on a limited number of markers (Gilbert et al., 2015). The mapping and characterization of genes involved in adaptation is now feasible even for non-model organisms (Ekblom & Galindo, 2011; Stapley et al., 2010).
However, to take full advantage of these exciting new possibilities, adequate quantities of DNA are required for the preparation of adapter-ligated libraries for high-throughput sequencing. The limited amount of available tissue from small-bodied species or valuable museum specimens currently constrains their usability for cutting-edge genomic technologies. Small amounts of input DNA can be problematic for library preparation: they can lead to an incomplete representation of the genome in the sequencing data when, by chance, parts of the genome are not sufficiently amplified or sequenced due to low initial copy numbers in the input DNA or library, respectively.
Moreover, more polymerase chain reaction (PCR) cycles are needed during the library preparation when initial copy numbers are low.
This results in an increase of both PCR amplification errors and sequence duplication rates. The former affects error rates during genotyping, whereas the latter increases sequencing effort and costs, as more sequencing is required to compensate for redundant, duplicate reads. Consequently, despite the wealth of samples and their ecological and evolutionary significance, many preserved samples are not yet accessible to high-throughput sequencing methods and, hence, the full scientific potential of such collections cannot be utilized (Wandeler, Hoeck, & Keller, 2007).
High-throughput sequencing of small organisms has so far been performed by including a whole genome amplification (WGA) step (Cruaud et al., 2019; Grealy, Bunce, & Holleley, 2019; Lack, Weider, & Jeyasingh, 2017). This technique enables the use of small amounts of DNA but introduces biases due to PCR selection, PCR artefacts, and PCR drift (Sabina & Leamon, 2015). Additionally, this extra step adds considerable costs and time. A commonly used alternative to WGA is collecting individuals in the field and establishing clonal (Innes & Ginn, 2014; Schaffner et al., 2019) or large inbred (Benesh, 2019) cultures in the laboratory. While this has enabled the first population genomic study in Daphnia, culturing lineages in the laboratory has several major disadvantages. It only works for species that are clonal or easy to inbreed and that can be kept in a laboratory setting. Furthermore, the survival of individuals in the laboratory and the establishment of clonal lineages is not random and introduces biases. Moreover, mutations can occur in cultures that introduce genetic variation among clonal individuals (Keith et al., 2016), even though this might be negligible over few generations (Dukić, Berner, Haag, & Ebert, 2019). A third alternative for sequencing of small organisms is pooling individuals for sequencing (Pool-seq), which captures allele frequencies but cannot be used for individual-based analyses, e.g., genome-wide association studies or pedigree analyses (Futschik & Schlötterer, 2010). Additionally, due to practical problems in the equimolar pooling of individuals, Pool-seq can suffer from inaccurate calling of rare variants, incorrect allele frequency estimation (Anand et al., 2016), and elevated estimates of population differentiation (Dorant et al., 2019). Hence, until now most population analyses in small-bodied zooplankton have been restricted to Sanger-sequencing-based methods (Koenders, Schön, Halse, & Martens, 2017; Ma, Hu, Smilauer, Yin, & Wolinska, 2019; Schwentner, Combosch, Pakes Nelson, & Giribet, 2017). In this study, we compared and modified several different DNA extraction methods in an effort to identify methods yielding high DNA quantities and qualities from small aquatic invertebrates. Further, we investigated the effect of ethanol preservation on extraction success and yield. We then produced a total of 24 high-quality whole genome sequencing (WGS) libraries from minimal DNA amounts without prior whole genome amplification for several Daphnia species and ostracods (another group of crustaceans typically between 0.5 and 2 mm in size).
Next, we showed that a very basic de novo assembly can already be produced from a single individual sequenced to >50× coverage using our new workflow. We demonstrated the suitability and reliability of our protocol by mapping our libraries to different reference genomes and calculating the concordance of genotyped sites from sequenced libraries from the same individual and extraction.
Additionally, we constructed a tree from whole mitochondrial sequences and performed a principal component analysis on nuclear variants to demonstrate the validity of the generated data. Finally, we combined all these approaches into a workflow (Figure 1) detailing the steps from sample to high-quality sequencing result.

| Samples
A total of 27 Daphnia species and the ostracod species Eucypris virens were used for extractions (Table S1 and Methods S1). For Daphnia samples, we measured body length (henceforth referred to as "BL") of individuals as the distance from the top of the eye to the anterior base of the spine. Additionally, the number of eggs and embryos in the brood pouch ("eggs") was counted from pictures taken prior to extraction. For each extraction, a single individual, either living (henceforth referred to as "nonpreserved") or preserved in ethanol ("ethanol-preserved"), was used. Nonpreserved samples were either obtained from laboratory cultures or collected from Lake Constance with plankton net hauls (200 µm mesh size).

FIGURE 1 Flowchart depicting the workflow presented here. Blue boxes indicate steps of the workflow, and orange boxes show physical objects resulting from these. Several quality control steps are included in the workflow to ensure fragmented and contaminated samples can be removed.
Ethanol-preserved samples comprised various species collected by collaborators and had been stored in ethanol between one and 29 years prior to extraction (Table S1). The exact storage conditions at collection for the samples are not known, but all were stored in >70% ethanol on receipt.

| Extraction kits and procedure
A few test samples were extracted with a phenol-chloroform-isopropanol (PCI) extraction protocol (Green & Sambrook, 2017); however, due to consistently poor results we did not continue with this approach. The GeneJET, Qiagen Micro, and DNAdvance kits, the MasterPure Complete DNA and RNA Purification Kit (Lucigen), and a modification of the HotSHOT protocol (Truett et al., 2000) by Montero-Pau, Gómez, and Muñoz (2008) were then tested on nonpreserved samples, covering commonly used approaches for DNA extraction. The two most promising kits were then also tested with ethanol-preserved samples. An overview of all kits, including the protocols used, is given in Table S2; the modifications and additional information are given in Methods S1.
Before extraction, all samples were washed three times in autoclaved water.

| Library preparation and sequencing
Three different methods were tested for whole-genome library preparation: the original NEBNext Ultra II FS DNA Library Prep Kit for Illumina (NEB) and two modifications of the Illumina Nextera DNA Library Prep Kit (Baym et al., 2015; Therkildsen & Palumbi, 2017).
The protocol of Baym et al. (2015) was modified slightly to allow for lower DNA input and to optimize the DNA fragment size distribution. Further information on the kits and modifications is available in Methods S1. Final libraries were either size-selected individually at 410-800 bp using a PippinPrep (Biozym Scientific), with a 1.5% cassette and internal markers, or after pooling at BGI Shenzhen.
Libraries were then sequenced with unrelated libraries on three HiSeq X10 lanes at BGI Shenzhen in 150 bp paired-end mode. All sequenced libraries are listed with details on DNA input amount and library preparation protocol in Table S3.

| Extractions
To assess the effects of several different factors (extraction kit, sample preservation, duration of preservation, size of individuals, and number of eggs) on DNA extraction success (success: ≥0.4 ng total DNA; failure: <0.4 ng total DNA) and on DNA yield (ng total DNA), we compared generalized linear models with different combinations of explanatory variables. Models were compared with the Akaike information criterion (AIC) and the Bayesian information criterion (BIC); when the two criteria supported different models, we chose the model with the lowest BIC score, as BIC should select the correct model when the sample size is much larger than the number of parameters used in the models (Aho, Derryberry, & Peterson, 2014). If the best model included extraction kit as a significant explanatory factor, post hoc tests were performed with the R package emmeans (Lenth, 2019) and corrected for multiple testing with Tukey's range test. All analyses were performed in R 3.6.1 (R Core Team, 2019) and the generalized linear models were fitted with lme() from the R package nlme (Pinheiro, Bates, DebRoy, Sarkar, & R Core Team, 2019). All compared models with their respective AIC and BIC values are given in Table S4.
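As an illustration only, and not the authors' exact analysis script, the model-comparison logic described above could be set up in R roughly as follows. The data frame and column names (extractions.csv, kit, preserved, years_stored, body_length, eggs, yield_ng) are hypothetical, and plain glm() is used here for the sketch rather than lme() from nlme.

library(emmeans)

extractions <- read.csv("extractions.csv")                     # hypothetical file: one row per extraction
extractions$success <- as.integer(extractions$yield_ng >= 0.4) # success threshold (>= 0.4 ng total DNA) from the text

# Candidate models with different combinations of explanatory variables
m_full <- glm(success ~ kit + preserved + years_stored + body_length + eggs,
              family = binomial, data = extractions)
m_kit  <- glm(success ~ kit + body_length, family = binomial, data = extractions)
m_bl   <- glm(success ~ body_length,       family = binomial, data = extractions)

AIC(m_full, m_kit, m_bl)   # compare candidate models
BIC(m_full, m_kit, m_bl)   # with n >> number of parameters, the lowest BIC is preferred

# If 'kit' is retained in the best model, Tukey-adjusted pairwise contrasts between kits
emmeans(m_kit, pairwise ~ kit, adjust = "tukey")

The same structure applies to the yield models; only the response and error family change.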
Additionally, differences in the peak of the fragment size distribution between kits were assessed with a Kruskal-Wallis rank sum test. Extractions of ethanol- and nonpreserved samples using the GeneJET and MasterPure kits were classified as separate groups. Pairwise tests were performed with Wilcoxon signed-rank tests, with correction for multiple testing according to the Benjamini-Hochberg procedure. The PCI extractions were not included in the statistics as the sample size was too low.
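A minimal sketch of this fragment-size comparison in R, assuming a hypothetical data frame frag with a numeric column peak_bp and a grouping column group (kit × preservation combination); note that pairwise.wilcox.test() defaults to the unpaired rank-sum form, whereas the text describes signed-rank tests.

frag <- read.csv("fragment_peaks.csv")   # hypothetical input: peak_bp, group

# Overall test for differences between groups
kruskal.test(peak_bp ~ group, data = frag)

# Pairwise Wilcoxon tests with Benjamini-Hochberg correction
pairwise.wilcox.test(frag$peak_bp, frag$group, p.adjust.method = "BH")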

| Whole genome sequencing
The major steps of all analyses are outlined below; for details see Methods S1. The expected genome sizes of the samples were estimated from the raw reads with GenomeScope (Vurture et al., 2017) using the histogram calculated with the count and histo functions of Jellyfish 2.3.0 (Marçais & Kingsford, 2011) with canonical 21-mers. Contamination introduced during the library preparation from various pro- and eukaryotic sources (Methods S1) was estimated from raw reads using Kraken 2.0.9 (Wood, Lu, & Langmead, 2019) and FastQ Screen 0.14 (Wingett & Andrews, 2018). Mitochondrial genomes were assembled from the sequencing reads, including for the ostracod samples (Table S3). Then, sequences of the 13 protein-coding genes as well as the 12S and 16S rRNA genes, as identified by MITOS (Bernt et al., 2013), were independently aligned with additional sequences from NCBI (Table S5) using the MAFFT online server with basic settings (Katoh, Rozewicki, & Yamada, 2019).
A maximum likelihood tree was reconstructed with IQ-TREE 1.6.9 (Nguyen, Schmidt, von Haeseler, & Minh, 2015). The reliability of the estimated tree was assessed with 10,000 replicates of both ultrafast bootstrapping (Hoang, Chernomor, von Haeseler, Minh, & Vinh, 2017) and the Shimodaira-Hasegawa-like approximate likelihood ratio test (SH-aLRT), which compares the likelihood value of the current tree with that of the best alternative (Guindon et al., 2010).
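For intuition, the k-mer-based genome size estimate used above rests on dividing the total number of error-filtered k-mers by the depth of the main k-mer peak. GenomeScope fits a full mixture model to the Jellyfish histogram; the back-of-envelope version can be sketched in R as below. The file name reads.histo and the error-depth cutoff are hypothetical, and in highly heterozygous samples the main peak may correspond to the heterozygous rather than the homozygous peak.

h <- read.table("reads.histo", col.names = c("depth", "count"))   # Jellyfish 'histo' output: depth, count

h_clean <- subset(h, depth >= 5)                  # drop very low-depth k-mers, mostly sequencing errors

peak_depth  <- h_clean$depth[which.max(h_clean$count)]            # depth of the main k-mer peak
total_kmers <- sum(as.numeric(h_clean$depth) * h_clean$count)     # total number of retained k-mers
genome_size <- total_kmers / peak_depth                           # rough haploid genome size in bases
genome_size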

| Extraction
In total, 1,321 single individuals (1,044 nonpreserved and 277 ethanol-preserved samples) were extracted with five different kits. The second-best model included only BL as a significant factor and was slightly less supported than the best model (ΔBIC ≈ 0.8). MasterPure extractions gave significantly higher DNA yields than GeneJET (p = .0037; Figure 2b).
All statistical models with their respective AIC and BIC are listed in Table S4.
The fragment sizes of extracted DNA varied significantly between kits (p = 1.427 × 10⁻⁷, df = 5, Kruskal-Wallis χ² = 40.098), with column-based kit extractions (GeneJET and Qiagen Micro) resulting in the smallest fragments. The fragment-length peaks of samples extracted with noncolumn kits (DNAdvance and MasterPure) did not differ from each other but were significantly longer than those of samples extracted with column-based kits. Fragment lengths of ethanol-preserved samples were significantly smaller than those of nonpreserved samples for both GeneJET and MasterPure (p < .05, respectively), but MasterPure extractions of ethanol-preserved samples still produced significantly larger fragment sizes than GeneJET extractions of nonpreserved samples (p < .05). The fragments typically showed a narrow distribution around the peak, even though the proportion of smaller fragments seemed to increase with preservation (Figure S1).

| Whole genome sequencing
All library preparation methods produced libraries with concentrations and size distributions suitable for sequencing (Figure S2). Libraries produced using the protocol by Therkildsen and Palumbi (2017) showed an extra peak around 3,000 bp, for which we currently lack an explanation. This protocol also had longer hands-on time due to a second bead clean-up step compared to the protocol by Baym et al. (2015). Mapping rates of the sequenced libraries to the tested reference genomes varied considerably (Figure 3). The mapping rate to the Daphnia genomes was strongly dependent on phylogenetic distance to the reference genome (Figure S3). Only libraries from samples with complete mitochondrial sequence divergence <30% from a reference had mapping rates above 40% to the respective nuclear genome.

| DISCUSSION
Genome-wide high-throughput sequencing of single individuals not only offers large improvements over previous techniques based on fewer markers, such as better phylogenomic estimation (Gilbert et al., 2015), it is also the basis for many analyses such as GWAS or QTL mapping (Korte & Farlow, 2013). Using single individuals instead of pooled samples improves estimates of allele frequencies (Dorant et al., 2019), aids the identification of genes associated with environmental variation (Rellstab, Gugerli, Eckert, Hancock, & Holderegger, 2015) or phenotypes (Kratochwil, Urban, & Meyer, 2019), and helps resolve population structure (Ekblom & Wolf, 2014). Preserved samples from archives and collections stored in museums, institutes, or universities offer vast opportunities for phylogenomic analyses (Evans et al., 2019) or for studying temporal changes. Time series allow quantification of the effects of environmental variation or the strength of selection (Hauser et al., 2002; Schraiber, Evans, & Slatkin, 2016), the investigation of extinct taxa (Murray et al., 2017; Shapiro et al., 2002), and can lead to the identification of new species (Thandar, 2018) or the clarification of species status, which is relevant for conservation (Montano et al., 2018).
Despite these numerous research opportunities, samples of small-bodied individuals or museum samples are strongly underutilized (Cruaud et al., 2019; Derkarabetian, Benavides, & Giribet, 2019) for approaches using high-throughput sequencing techniques due to difficulties in extracting sufficient amounts of high-quality DNA (Grealy et al., 2019; Staats et al., 2013). In this study, we test which DNA extraction methods are best suited for different downstream applications and how sample preservation impacts the results. We successfully extract DNA from individual Daphnia and ostracods from fresh material as well as from specimens stored in ethanol for up to 29 years. Moreover, by reducing the required DNA input to 0.35 ng, our workflow allows WGS without the need for whole genome amplification (Cruaud et al., 2019; Lack et al., 2017), cultures in the laboratory (Cornetti et al., 2019; Lynch et al., 2017), pooling of multiple individuals for extraction (Cornetti et al., 2019; Lynch et al., 2017), or using complete specimens for extractions (Scherz et al., 2019). We provide a workflow (Figure 1) that illustrates the process of obtaining high-quality sequencing results from single small-bodied and preserved samples.

| DNA extraction
Based on our results, we suggest different approaches depending on the downstream applications (Figure 1). In general, we recommend a homogenization step using a lysing matrix, as it improves DNA yield, for example by breaking the carapace of small crustaceans (Athanasio et al., 2016). However, if morphological features need to be preserved for later analyses, it is advisable to replace the homogenization step with a proteinase K digestion, which chitinous exoskeletons can withstand unharmed (Cornils, 2015). For short-read sequencing (e.g., whole genome resequencing) or simple sequence repeat analyses of any type of sample, we recommend the GeneJET DNA extraction kit, as it gave the highest DNA yields (Figure 2), had the shortest hands-on time, and had the lowest price per sample of all tested commercial kits. If the aim is to obtain the maximum yield from ethanol-preserved samples, MasterPure is better suited, as it produced higher yields from ethanol-preserved samples (Figure 2b). If long reads are needed, for example for the characterization of structural variants or de novo genome assembly, we also recommend MasterPure, which resulted in longer fragment sizes for both preserved and nonpreserved samples (Figure 2c). However, due to the high input requirements of long-read sequencing (minimum 10-100 ng DNA; >20 kb fragment size), only a single extraction out of all the extractions we performed would qualify for long-read sequencing, indicating that larger specimens or further modifications of the extraction methods are needed for this kind of sequencing.
We note that some ethanol-preserved samples yielded very small DNA fragments (Figure 2c), indicating strong degradation, and we suggest checking the fragment size distribution prior to library preparation. While we did not test this, such samples could possibly be processed using our workflow by adapting the fragmentation time during library preparation. Alternatively, procedures specifically designed for degraded or ancient DNA, such as the use of DNA-repair enzymes (Carøe et al., 2018; Fulton & Shapiro, 2019; Gamba et al., 2016), could be used. Additionally, while we observed lower DNA quantities from ethanol-preserved samples (Figure 2b), the duration of ethanol preservation had no effect, despite using samples that had been stored in ethanol for up to 29 years. Storage in 95% ethanol is assumed to preserve samples well (Camacho-Sanchez et al., 2013; Vink, Thomas, Paquin, Hayashi, & Hedin, 2005), but several studies working with ethanol-preserved museum samples have shown a decrease in the recovery of ultraconserved elements with increasing preservation time (Blaimer, LaPolla, Branstetter, Lloyd, & Brady, 2016; Derkarabetian et al., 2019; McCormack et al., 2016).
As the effects of preservation (lower DNA yield, smaller fragment sizes) were not correlated with duration of preservation in our study, we speculate that handling before and during extraction causes the increased fragmentation, and that degradation over time is not detectable in this study due to the relatively young age of the samples.
It is worth mentioning that there are commercially available DNA-preserving solutions (e.g., Zymo DNA/RNA Shield, Monarch DNA/RNA Protection Reagent) and an increasing number of studies that propose using RNAlater for DNA preservation as well, owing to its DNA-preserving properties (Choo, Leong, & Rogers, 2015; Gray, Pratte, & Kellogg, 2013; Vink et al., 2005).

| Whole genome sequencing
We present an improved protocol for the Nextera library preparation kit that facilitates working with very small-bodied samples or small amounts of available starting material, for example tissue samples or skin swabs from amphibians. To the best of our knowledge, we are the first research laboratory to routinely and successfully use only 0.35 ng of DNA for shotgun whole genome library preparation of small animals, considerably pushing the lower limit for input DNA below the 1 ng of DNA used by Sproul and Maddison (2017). We acknowledge that there are other sophisticated methods optimized for low DNA inputs, in particular in the field of ancient DNA analysis, which works with samples thousands of years old (Willerslev & Cooper, 2005), or in "museomics", which typically deals with younger samples (up to 200 years old). However, these protocols are primarily optimized to deal with poor DNA quality due to contamination and fragmentation (Rohland, Harney, Mallick, Nordenfelt, & Reich, 2015), are designed to enrich endogenous DNA over contaminants (Horn, 2012), and therefore target only specific parts of the genome (Knyshov, Gordon, & Weirauch, 2019; Suchan et al., 2016). Many of these protocols also still require higher amounts of input DNA or tissue than used in this study (Gamba et al., 2016; Shapiro et al., 2019; Tsai et al., 2019; Vershinina, Kapp, Baryshnikov, & Shapiro, 2020). Therefore, these methods are more targeted to fragmented DNA and less cost- and time-efficient than the presented workflow. The choice of method depends on the sample of interest and the biological question and should always be re-evaluated at the different quality control steps (Figure 1).
We use technical replicates of libraries and a suite of analyses to comprehensively assess the quality of our workflow. Technical variation introduced during the library preparation seems to have been very small, as technical replicates from the same extraction had very low mismatch rates at genotyped sites (0.11 ± 0.10%), which significantly decreased with mean sequencing depth. Contamination levels were assessed and were highly variable (2.82%-25.64%; Figure 3), mostly due to bacterial contamination. Somewhat surprisingly, all samples had very low levels of algal contamination (<0.1%), although individuals were neither treated with antibiotics nor Sephadex beads. Such a laborious decontamination procedure is commonly implemented to remove food algae and the associated microbiome (Cooper & Cressler, 2020) in high-throughput sequencing studies on Daphnia water fleas (Cornetti et al., 2019; Fields et al., 2018). This additional step requires keeping and treating individuals in laboratory culture for several days prior to extraction. Our simple and time-saving approach of washing the samples in autoclaved water before extraction therefore seems to have removed most contamination from algae. Mapping to eukaryotic contamination sources was low, with a single exception (laevis_3), which was contaminated with human DNA (>80%), most likely due to a handling error during extraction. Low levels of mapping to distant genomes are expected even in the absence of contamination due to highly and universally conserved regions between genomes, such as the ribosomal 16S and 23S sequences (Isenbarger et al., 2008) or UCEs (Meiklejohn, Faircloth, Glenn, Kimball, & Braun, 2016). In conclusion, we strongly recommend following protocols to reduce contamination, such as using a sterile bench or dedicated rooms (Fulton & Shapiro, 2019). When mapping our libraries to the available reference genomes, the mapping rates were rather low (<50%, Figure 3). We attribute this in large part to the relatively high genetic divergence between the samples and the reference genomes (>30%), as mapping rates improve with reduced mitochondrial genetic divergence (Figure S3). If the low mapping rates were merely a result of low input DNA for the library preparation, we would expect a difference in mapping rate between the Nextera libraries created with 0.35 ng of DNA and the libraries produced with more DNA (galeata_4: 6.38 ng DNA; galeata_2-2: 1 ng DNA), which is not the case.
As expected, the ostracod samples had low mapping rates to any of the tested references.
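To make the replicate-concordance measure reported above concrete, the quantity is simply the fraction of jointly genotyped sites at which two libraries from the same extraction disagree. A toy illustration in R, with hypothetical genotype vectors encoded as alt-allele counts (0/1/2):

gt_rep1 <- c(0, 1, 2, 0, 1, NA, 2, 0)   # toy genotypes for technical replicate 1
gt_rep2 <- c(0, 1, 2, 0, 2, 1,  2, 0)   # replicate 2 from the same extraction

both_called   <- !is.na(gt_rep1) & !is.na(gt_rep2)                 # sites genotyped in both libraries
mismatch_rate <- mean(gt_rep1[both_called] != gt_rep2[both_called])
100 * mismatch_rate                                                # per cent discordant sites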
To validate our data set against external data, we combined the most complete Daphnia phylogeny (Adamowicz et al., 2009), which is based on mitochondrial sequences, with the mitochondrial sequences assembled from our samples. In the resulting mitochondrial tree (Figure 4a), the position of the samples matched the a priori classification based on morphological characters, showing that it is possible to retrieve reliable mitochondrial sequences with our workflow. The same result was achieved for the ostracod samples (Figure S4), for which no other comparable data are available. As whole genome data are available for only one of the Daphnia species used in this study (Daphnia obtusa) and mapping rates of more distantly related species are very low, the value of a phylogenetic approach based on nuclear variants is strongly reduced. Instead, a PCA was used, which separated all species and the corresponding species complexes along the first two axes (Figure 4b). This approach also allows distinguishing closely related populations of the same species and shows the similarity of the technical replicates (Figure S5). This demonstrates the validity of our proposed method also for nuclear variants, which naturally have a much lower sequencing coverage than mitochondrial sequences.
Sequencing to high coverage with more unique reads will facilitate further analysis of closely related populations, such as the D. cristata samples used in this study.
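A minimal sketch of such a PCA on nuclear genotypes in R, assuming a hypothetical samples × sites matrix of alt-allele counts (0/1/2) with no missing data; the file name genotypes_012.txt and the object names are illustrative, not the authors' actual data set.

geno <- as.matrix(read.table("genotypes_012.txt", header = TRUE, row.names = 1))  # samples x sites, 0/1/2

pca <- prcomp(geno, center = TRUE, scale. = FALSE)   # standard PCA on centred genotypes

summary(pca)$importance[2, 1:2]                      # proportion of variance explained by PC1 and PC2
plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2")   # samples in the space of the first two axes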
Despite the good results obtained with our method, we advise using more than the minimum amount of 0.35 ng of DNA whenever possible, especially for species with larger genome sizes, for which the risk of missing large parts of the genome is increased. Dedicated kits such as NEBNext Ultra II FS can be viable alternatives to the modified Nextera kit, as the latter's low-cost advantage (Baym et al., 2015; Therkildsen & Palumbi, 2017) diminishes when the amount of input DNA is increased, because more of the relatively expensive transposome is required. While the Nextera kit is no longer manufactured, protocols are available for modifying the current Nextera DNA Flex Kit from Illumina (Gaio et al., 2019), which can be adapted in a similar way to the changes proposed here to reduce DNA input below the suggested 10 ng.
In conclusion, in this study we assessed, compared, and optimized previously published methods for DNA extraction as well as various library preparation methods and their modifications. We suggest different kits depending on the type of starting material. The workflow presented here allows for the cost-efficient use of single individuals of small-bodied organisms collected during field trips or routine sampling, whether recent or historic, live or preserved. Instead of extracting a few loci from these samples using Sanger sequencing, the presented workflow allows extracting genome-wide information via reliable high-throughput sequencing. This is achieved without any laborious and costly intermediate steps, such as whole genome amplification or establishing laboratory cultures. While our protocols were tested and optimized using aquatic invertebrates, there is no reason to assume that similar approaches should not be applicable to other small-bodied taxa. They could be used for small insects, both aquatic and terrestrial, for tissue samples of larger specimens when limited tissue is available (e.g., arthropod legs), or when multiple analyses are planned for each specimen. Therefore, we encourage other scientists to use and adapt the workflow we present in this study and to consider the application of high-throughput methods even for samples with limited material and projects with limited funds, to take full advantage of the possibilities offered by genome-wide data.

ACKNOWLEDGEMENTS
This study was funded by the project "… and future" within the framework of the Interreg V programme "Alpenrhein-Bodensee-Hochrhein (Germany/Austria/Switzerland/Liechtenstein)", whose funds are provided by the European Regional Development Fund as well as the Swiss Confederation and Cantons. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

AUTHOR CONTRIBUTIONS
The accession numbers for mitochondrial sequences constructed in this study are given in Table S5. The assembly (CWD21 v0.01) is available on GenBank (JAAVJA000000000).