Genome‐scale target capture of mitochondrial and nuclear environmental DNA from water samples

Abstract Environmental DNA (eDNA) provides a promising supplement to traditional sampling methods for population genetic inferences, but current studies have almost entirely focused on short mitochondrial markers. Here, we develop one mitochondrial and one nuclear set of target capture probes for the whale shark (Rhincodon typus) and test them on seawater samples collected in Qatar to investigate the potential of target capture for eDNA‐based population studies. The mitochondrial target capture successfully retrieved ~235× (90× − 352× per base position) coverage of the whale shark mitogenome. Using a minor allele frequency of 5%, we find 29 variable sites throughout the mitogenome, indicative of at least five contributing individuals. We also retrieved numerous mitochondrial reads from an abundant nontarget species, mackerel tuna (Euthynnus affinis), showing a clear relationship between sequence similarity to the capture probes and the number of captured reads. The nuclear target capture probes retrieved only a few reads and polymorphic variants from the whale shark, but we successfully obtained millions of reads and thousands of polymorphic variants with different allele frequencies from E. affinis. We demonstrate that target capture of complete mitochondrial genomes and thousands of nuclear loci is possible from aquatic eDNA samples. Our results highlight that careful probe design is crucial, taking into account the range of divergence between target and nontarget sequences as well as the presence of nontarget species at the sampling site. eDNA sampling coupled with target capture approaches provides an efficient means with which to retrieve population genomic data from aggregating and spawning aquatic species.


| INTRODUCTION
Population genomic analyses have become an efficient way of studying population structure, demography, selection and dispersal in numerous species. Advances in DNA sequencing technology have enabled researchers to sequence hundreds of whole-genomes with reasonable effort and for reasonable costs (Schwarze et al., 2020).
While these advances have improved opportunities for studying natural populations in the wild, many species, especially large marine species, remain difficult to sample efficiently and noninvasively.
Furthermore, obtaining permits (e.g., CITES, export, import) for sampling tissue can be cumbersome and time-demanding, and international transport of animal samples can be disruptive to project logistics.
While eDNA studies have already made large contributions to biodiversity research at the species level, the potential for eDNA methods in retrieving population genetic information has only just begun to be explored (Adams et al., 2019;Sigsgaard et al., 2020).
It was recently shown that mitochondrial control region (CR) haplotype frequencies found in tissue samples of whale shark (Rhincodon typus) in Qatar were mirrored in eDNA metabarcoding (Taberlet et al., 2012) of seawater samples from the same study site (Sigsgaard et al., 2017). Similar results were later obtained by Parsons et al. (2018) for harbour porpoises Phocoena phocoena and by Baker et al. (2018) for killer whales Orcinus orca.
However, the fragmented nature of eDNA, along with the limited read lengths available using Illumina sequencing, have restricted eDNA metabarcoding to focus on relatively short amplifiable regions of the mitochondrial DNA (mtDNA). While a short variable marker can successfully provide haplotype information (Sigsgaard et al., 2017;Turon et al., 2020), it provides limited resolution. Future population genetic eDNA studies would therefore benefit from a greater coverage of mtDNA variation, and ideally from incorporating nuclear DNA (nuDNA) markers. As all mtDNA segments are physically linked, they do not provide independent information. Hence, analysis of nuclear DNA should, if possible, be the preferred option, also to avoid bias due to mtDNA being maternally inherited.
For eDNA research to produce more powerful population genetic inferences, the potential for analysing a greater part of the mitogenome and to include multiple markers of nuDNA therefore needs to be investigated. Because environmental water samples contain DNA from various nontarget species, for example more than 99% when working with eukaryotes as the target group (Stat et al., 2017), target capture approaches are a promising alternative to shotgun sequencing (Sigsgaard et al., 2020). Target enrichment via DNA hybridization capture ("target capture") (Gnirke et al., 2009), is a well-tested method for obtaining DNA data from samples with high nontarget content. In short, custom biotinylated RNA baits hybridize with complementary DNA sequences from the sample, and nonhybridized sequences are washed away, ultimately enriching the sample for the target DNA, while avoiding issues related to PCR bias (Polz & Cavanaugh, 1998).
Target capture is well known from ancient DNA (aDNA) research, for enriching endogenous components of DNA from samples of, for example, bone or hair (Carpenter et al., 2013; Cruz-Dávalos et al., 2017; Paijmans et al., 2016). Recently, the approach has also been implemented on ancient (Slon et al., 2017) and contemporary as a potential alternative approach for species detection in ichthyoplankton swarms. Furthermore, single taxon capture probes have been developed for contemporary eDNA from water samples to evaluate species detection efficiency (Wilcox et al., 2018). Pinfield et al. (2019) applied whole-genome enrichment capture with RNA baits followed by shotgun sequencing of eDNA samples, but not enough killer whale DNA was retrieved to conduct population genetic analyses and infer a potential source population. The whale shark feeding aggregation studied by Sigsgaard et al. (2017) provided ideal conditions for testing a population-level eDNA approach, as many individuals are concentrated in a small area, and reference mtDNA sequences were available from both Qatar and other parts of the world. Recent efforts into sequencing the whale shark genome (Hara et al., 2018; Read et al., 2017; Weber et al., 2020) have now enabled the design of genome-wide capture probes for the species and mapping of potential whale shark sequences obtained from eDNA target capture.
We developed and tested one mitochondrial and one nuclear set of target capture probes for the whale shark to investigate the potential for extracting population genomic data from eDNA samples. We successfully retrieved (a) eDNA reads spanning the entire mitochondrial genome of whale sharks, which furthermore matched previously known haplotypes, and (b) nuclear reads covering multiple loci in the whale shark genome. As an interesting addition, we also retrieved a large amount of reads from the fish species mackerel tuna (or kawakawa) (Euthynnus affinis), the eggs of which are the probable cause of the whale shark aggregations in the area (Robinson et al., 2013). These data enabled us to investigate patterns of mitochondrial sequence-to-probe similarity in relation to coverage obtained and to estimate allele frequencies at multiple loci from nuclear reads of E. affinis.

| Sample collection, extraction and initial testing
Two 1-L water samples were filtered through sterile 0.22-µm Sterivex-GP filters (Merck Life Science) directly from a boat at the Al Shaheen oil field in Qatar on September 1, 2016. The two samples were collected from surface water in the middle of an aggregation of >50 whale sharks visible by eye. We did not investigate the presence of other species at the sampling site, but the whale sharks are thought to aggregate in these waters to feed on the eggs of spawning Euthynnus affinis (Robinson et al., 2013;Sigsgaard et al., 2017), which we thus expected to be highly abundant. The filters were immediately put on ice and stored at −20°C until DNA extraction. Separate DNA extractions were carried out for the two samples using the DNeasy Blood & Tissue kit (Qiagen).
The manufacturer's protocol was slightly modified, using four times more AL buffer and proteinase K and 3 hr of incubation.
Samples were initially screened for whale shark eDNA with two sets of species-specific TaqMan qPCR systems (TAG Copenhagen) (Text A in the Appendix S1).

| Development of nuclear target capture system
Using the published whale shark genome (Read et al., 2017), we designed a bait system (59,941 probes total, targeting ~0.1% of the genome) aimed at enriching primarily for nuclear intron fragments of whale shark DNA. We expect introns to be more variable, as they are subject to fewer functional constraints, and thus to provide more information for population genetic inferences (Li, 1997).
However, a small proportion of exon baits were also included. The nuclear bait set costs ~€160 per reaction including the design process (but not including library kit and indexing), with a minimum of 16 reactions. For details on the nuclear capture design see Text B in the Appendix S1.

| Development of the mitochondrial target capture systems
A "myBaits Mito" kit (Catalogue no. 303096) was designed by Arbor Biosciences from the mitogenome of the Taiwanese whale shark specimen sequenced by Read et al. (2017) (NCBI accession no. NC_023455.1). A probe system with 80-bp probes and 4× tiling was created to capture the entire mitogenome. We specifically kept nuclear and mitochondrial target capture separate, as the multicopy nature of organellar genomes is known to cause sequencing output to be dominated by organellar DNA, with minimal amounts of potential nuclear DNA being captured and sequenced (Andermann et al., 2020;Falk et al., 2012). The mitochondrial bait set costs ~€30 per reaction including the design process (but not including library kit and indexing), when purchasing 96 reactions at a time.
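The tiling scheme can be illustrated with a short sketch. It assumes that "4× tiling" means each base is covered by four overlapping probes, so that a new 80-bp probe starts every 20 bp, and that probes are allowed to span the origin of the circular mitogenome. The function name `tile_probes` and its parameters are hypothetical and do not reflect the manufacturer's design software.

```python
def tile_probes(sequence, probe_len=80, tiling=4, circular=True):
    """Return (start, probe_sequence) tuples tiled across `sequence`.

    Assumption: tiling density is defined as probe_len / step, so 80-bp
    probes at 4x tiling start every 20 bp.
    """
    if probe_len % tiling != 0:
        raise ValueError("probe_len must be divisible by tiling")
    n = len(sequence)
    if n < probe_len:
        raise ValueError("sequence is shorter than one probe")
    step = probe_len // tiling
    if circular:
        # Extend the template so that probes can wrap around the origin.
        template = sequence + sequence[:probe_len - 1]
        last_start = n - 1
    else:
        template = sequence
        last_start = n - probe_len
    return [(start, template[start:start + probe_len])
            for start in range(0, last_start + 1, step)]


if __name__ == "__main__":
    toy_genome = "ACGT" * 50  # 200-bp toy sequence, not a real mitogenome
    probes = tile_probes(toy_genome)
    print(len(probes), "probes of length", len(probes[0][1]))
```

On a circular template whose length is a multiple of the step, every position ends up covered by the same number of probes, which is the property that makes tiling density a convenient design parameter.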

| Library preparation and sequencing
Fragment sizes of the raw eDNA extracts were initially visualized on a 4200 TapeStation (Agilent). The two samples were then pooled into one in equal volumes to ensure sufficient starting material, and thus now represent a single sample of 2 L of filtered water. The pooled sample was sonicated on an S220 Focused-Ultrasonicator (Covaris), aiming for a fragment size of ~250 bp. A single library was built using the Accel-NGS 2S Hyb DNA Library Kit (Cat. No. 23096) (Swift Biosciences) and used as input for both the mitochondrial and nuclear capture. We used 13.33 µl eDNA template (~200 ng total) in the library preparation, and the capture reactions were carried out following the supplied protocol, running seven precapture PCR cycles, 48 hr of hybridization at 65°C, and 14 post-capture PCR cycles. The final, enriched products from the mitochondrial and nuclear capture were then purified and sequenced (301 bp paired-end) in two separate runs on a MiSeq (Illumina) at the Department of Biology, Aarhus University.

| Mitochondrial whale shark capture
Mitochondrial paired-end reads were filtered and collapsed using adapterremoval version 2 (Schubert et al., 2016), specifying a minimum Phred quality score of 20 and a minimum read length of 40 bp. Reads were first searched against the published whale shark mitogenome using blastn and with only a ≥70% sequence similarity criterion, acknowledging the highly variable D-loop region (Brown et al., 1986).
The retained reads were then searched against the entire nucleotide database in GenBank. As this database contains mitochondrial sequences from multiple whale shark individuals, only reads with whale shark as best blastn hit and a minimum sequence similarity of 98% were retained after dereplication of identical sequences with vsearch-2.14.2 (Rognes et al., 2016). All retained mitochondrial reads were imported into geneious (Kearse et al., 2012), where data were visualized and all subsequent analyses were carried out. All plots were made using the R package "ggplot2" (Wickham, 2016).
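The retention and dereplication logic described here can be sketched as follows. This is a simplified illustration and not the vsearch implementation: it assumes that each read has already been annotated with the species of its best blastn hit and the percentage identity of that hit, and the input tuple layout and the function name `filter_and_dereplicate` are hypothetical.

```python
from collections import Counter


def filter_and_dereplicate(hits, target="Rhincodon typus", min_identity=98.0):
    """Keep reads whose best hit is the target species, then collapse duplicates.

    hits: iterable of (read_sequence, best_hit_species, percent_identity).
    Returns a list of (sequence, count), most abundant first.
    """
    kept = [seq.upper() for seq, species, identity in hits
            if species == target and identity >= min_identity]
    counts = Counter(kept)  # identical sequences collapse into one entry
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    example = [
        ("ACGT", "Rhincodon typus", 99.0),
        ("acgt", "Rhincodon typus", 98.0),    # duplicate of the first read
        ("GGCC", "Euthynnus affinis", 100.0), # best hit is a nontarget species
        ("TTAA", "Rhincodon typus", 97.5),    # below the identity threshold
    ]
    print(filter_and_dereplicate(example))
```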

| Mitochondrial nontarget capture
As cocaptured nontarget reads could be of interest for evaluating capture efficiency, we performed a blastn search on all quality filtered reads for the mitochondrial capture and extracted all sequences from the three species with higher numbers of hits than whale shark, that is E. affinis, skipjack tuna (Katsuwonus pelamis) and striped bonito (Sarda orientalis). We mapped these sequences to the mitogenomes of their respective species and inspected both coverage distribution and variable sites, with a minor allele frequency (MAF) filter of 5%.
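A minimal sketch of a 5% MAF filter on per-site base counts is given below. It assumes that the minor allele is the second most frequent base at a site and that MAF is that base's read count divided by the total depth; the data layout and function names are hypothetical and are not taken from the software used in this study.

```python
def minor_allele_frequency(base_counts):
    """Return (minor_allele, maf) for one site, or (None, 0.0) if monomorphic.

    base_counts: dict mapping base -> number of reads supporting it.
    Assumption: the minor allele is the second most frequent base.
    """
    depth = sum(base_counts.values())
    ranked = sorted(base_counts.items(), key=lambda item: (-item[1], item[0]))
    if depth == 0 or len(ranked) < 2 or ranked[1][1] == 0:
        return None, 0.0
    allele, count = ranked[1]
    return allele, count / depth


def variable_sites(pileup, min_maf=0.05):
    """Return {position: (minor_allele, maf)} for sites passing the MAF filter."""
    passing = {}
    for position, counts in pileup.items():
        allele, maf = minor_allele_frequency(counts)
        if allele is not None and maf >= min_maf:
            passing[position] = (allele, maf)
    return passing


if __name__ == "__main__":
    toy_pileup = {
        10: {"A": 95, "G": 5},   # MAF 0.05, retained
        11: {"C": 99, "T": 1},   # MAF 0.01, removed
        12: {"G": 100},          # monomorphic, removed
    }
    print(variable_sites(toy_pileup))
```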
In order to investigate capture efficiency, we aligned mitogenomes of the three scombrid species (accession nos. E. affinis NC_025934, K. pelamis JN086155, and S. orientalis AP012949) to the whale shark mitogenome (accession no NC_023455). We calculated similarity between the mitogenomes of each scombrid species to the whale shark using a sliding mean across the entire mitogenome. For every base pair, we included the 5 bp before and after that position to determine similarity (11 bp in total). As aligning these sequences inevitably led to gaps in the alignment, the alignment was longer than the actual mitogenome. In order to relate sequence similarity to the sequencing depth, we therefore disregarded gaps inserted in the scombrid mitogenomes for both similarity and coverage scores (Liu et al., 2019).
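The sliding-mean similarity can be sketched as follows, under two stated assumptions: windows are truncated at the ends of the sequence rather than wrapped around the circular genome, and a column where only the whale shark sequence carries a gap counts as a mismatch. Alignment columns with a gap in the scombrid sequence are dropped first, so the output has one value per scombrid base and can be compared directly with per-base coverage. The function name `sliding_similarity` is hypothetical.

```python
def sliding_similarity(reference_aln, query_aln, flank=5):
    """Per-base similarity of `query_aln` to `reference_aln` in a sliding window.

    Both inputs are rows of a pairwise alignment with '-' for gaps.
    flank=5 gives the 11-bp window (5 bp before, the focal base, 5 bp after).
    """
    if len(reference_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    # One match/mismatch flag per query base; query gap columns are skipped.
    matches = [1 if ref == qry else 0
               for ref, qry in zip(reference_aln.upper(), query_aln.upper())
               if qry != "-"]
    scores = []
    for i in range(len(matches)):
        window = matches[max(0, i - flank):i + flank + 1]
        scores.append(sum(window) / len(window))
    return scores


if __name__ == "__main__":
    whale_shark = "ACGTACGTAC-GTACGT"
    scombrid    = "ACGTTCGTACAGT-CGT"
    for position, score in enumerate(sliding_similarity(whale_shark, scombrid)):
        print(position, round(score, 2))
```

Averaging the returned list gives a single mean similarity for the pair, which is one simple way to summarize overall divergence between two aligned mitogenomes.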

| Nuclear whale shark capture
Nuclear paired-end reads were filtered and collapsed exactly as the mitochondrial reads. The reads were blastn searched against the whale shark genome, and all hits with ≥97% match were retained. These reads were then blastn searched against four other chondrichthyan genomes (Australian ghostshark Callorhinchus milii, little skate Leucoraja erinacea, cloudy catshark Scyliorhinus torazame, and brownbanded bamboo shark Chiloscyllium punctatum) used for probe design (Text B in the Appendix S1), as well as against the entire nucleotide database from GenBank (downloaded September 2019). All reads with a highest or tied match to the whale shark genome were retained after dereplication, and subsequently mapped to the whale shark genome using bwa-0.7.17. The mapped reads were filtered for a minimum mapping quality (MAPQ) of 20 using samtools-1.9. Variants were called using samtools and bcftools, and filtering of nuclear variants was done using snpsift-4.3t (Cingolani et al., 2012).
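The "highest or tied match" criterion can be sketched as below. The text does not state which blastn score was compared across databases, so the use of bit score here is an assumption, as are the database labels and the function name `reads_with_target_best_or_tied`.

```python
from collections import defaultdict


def reads_with_target_best_or_tied(hits, target_db="whale_shark"):
    """Return the read IDs whose best score against `target_db` is not beaten.

    hits: iterable of (read_id, database_label, bitscore), one row per hit.
    A read is retained if it has a hit in the target database and that hit
    scores at least as high as its best hit in every other database.
    """
    best = defaultdict(dict)  # read_id -> {database_label: best bitscore}
    for read_id, database, bitscore in hits:
        if bitscore > best[read_id].get(database, float("-inf")):
            best[read_id][database] = bitscore
    retained = set()
    for read_id, per_database in best.items():
        if target_db in per_database and \
                per_database[target_db] >= max(per_database.values()):
            retained.add(read_id)
    return retained


if __name__ == "__main__":
    example_hits = [
        ("r1", "whale_shark", 200), ("r1", "genbank", 180),   # target is best
        ("r2", "whale_shark", 150), ("r2", "genbank", 150),   # tied, retained
        ("r3", "whale_shark", 100), ("r3", "catshark", 120),  # beaten, dropped
        ("r4", "genbank", 90),                                # no target hit
    ]
    print(sorted(reads_with_target_best_or_tied(example_hits)))
```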

| Nuclear nontarget capture
As a large proportion of nuclear reads were assumed to stem from E. affinis, and as there is no complete nuclear genome available for this species, reads were mapped to the genome of the confamilial species Atlantic bluefin tuna (Thunnus thynnus) (accession no. GCA_003231725.1) using bwa-0.7.17 and samtools-1.9. Here, we implicitly assume that a high abundance of mitochondrial reads from E. affinis would correlate with a similarly high abundance of nuclear reads within a sample. The genetic distance between E. affinis and T. thynnus is about 11.3% (inferred from aligning mitochondrial genomes, accession nos. E. affinis NC_025934 and T. thynnus AP006034). Reads were dereplicated and mapped and variants were called and filtered exactly like the whale shark reads above.

| Mitochondrial capture of whale shark eDNA
Both samples tested in the initial qPCRs were confirmed to contain mitochondrial whale shark DNA (Ct-values of replicates: 28.63, 29.07, and 31.42, 31.45, respectively). MiSeq data for the pooled sample provided an initial 14.7 million reads passing the quality and length requirements. After filtering to ensure whale shark was the best hit with at least 98% sequence similarity, 27,875 reads were retained (~0.19% of all reads on target). After dereplication, a total of 16,486 unique reads were retained, of which 16,474 mapped to the whale shark mitogenome (NC_023455). With an average read length of ~240 bp (min: 63 bp, max: 589 bp; Figure S1) and a very even distribution of reads across the mitogenome, we thus obtained a ~235× (min: 90×, max: 352×) coverage per base position of the whale shark mitogenome (Figure 1).
FIGURE 1 A graphic and clockwise overview of the coverage obtained from mapping putative whale shark reads from the mitochondrial capture to the whale shark mitogenome (accession no. NC_023455.1). The innermost line depicts individual base pair coverage (coloured from low [red] to high [light blue]), and the outer circle represents the annotated whale shark mitogenome for reference
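As a rough consistency check, the on-target proportion and the expected mean depth can be recomputed from the read counts reported in this section. The mitogenome length of 16,875 bp used below is an assumption that is not stated in the text, and the helper names are hypothetical.

```python
def on_target_fraction(on_target_reads, total_reads):
    """Fraction of quality-filtered reads assigned to the target species."""
    return on_target_reads / total_reads


def expected_mean_depth(n_reads, mean_read_length, genome_length):
    """Mean depth if reads were spread evenly: total mapped bases / genome length."""
    return n_reads * mean_read_length / genome_length


# ASSUMPTION: approximate length of the whale shark mitogenome in bp;
# this value does not appear in the text above.
MITOGENOME_LENGTH = 16_875

if __name__ == "__main__":
    fraction = on_target_fraction(27_875, 14_700_000)
    depth = expected_mean_depth(16_474, 240, MITOGENOME_LENGTH)
    print(f"on-target reads: {fraction:.2%}")
    print(f"expected mean depth: {depth:.0f}x")
```

Under these assumptions the expected depth comes out within a few percent of the reported ~235×, which is what an even distribution of ~240-bp reads across the mitogenome would predict.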
The genetic variation found in the data set reflected known haplotype variation from Qatar, as we found four sequences with complete coverage of three different D-loop haplotypes previously found by Sigsgaard et al. (2017) using both tissue samples and eDNA samples (2 × DL1-A, 1 × DL1-C and 1 × DL1-D). We furthermore observed the single nucleotide polymorphisms (SNPs) responsible for the haplotypes DL1-B and DL1-E, although we did not recover any single sequences spanning the entire region of these haplotypes.
Furthermore, when applying a 5% MAF filter, we found a total of 29 variable sites throughout the mitogenome (Table 1). In general, these variants corresponded well with previously known variants in the D-loop region based on tissue samples, but we also recovered eight putatively new variants from gene regions that have not been sequenced exhaustively for Rhincodon typus.
[...] Figure S3). For details on coverage distribution, see Figure S4. From the mapped reads, we found 12,411 raw variants, but most of the variants only had 1-2× coverage (Figure 3). The majority of the variants retained (82%) reside in regions targeted by the nuclear capture system. Furthermore, our approach was also successful in capturing both intronic and exonic regions (Text C in the Appendix S1), although only a single exonic variant was retained with a 10× coverage filter.
[...] we here generated a data set with minimal bacterial dominance (see Table S1). While we initially intended to focus solely on optimizing the sequencing output for whale shark sequences, the target capture turned out not to be strictly species-specific. This is in accordance with a previous study on chondrichthyans with myBaits probes, which reported up to 39% divergence between baits and captured targets, although their protocol was optimized for divergent homologue sequence capture through a touchdown gene capture (Mason et al., 2011). In our study, the vast majority of the sequencing data for both capture protocols were from Euthynnus affinis, and although we do observe a pattern of high similarity leading to higher coverage, DNA input from bony fishes cannot be avoided, especially when conserved regions are targeted. We kept the incubation temperature to the maximum recommended (65°C) throughout the capture process, and we would thus expect lower capture rates of highly divergent sequences. However, with our estimated average sequence divergence between the E. affinis and Rhincodon typus mitogenomes of 29.3%, and the large differences in between-species similarity across the mitogenome (Figure 2), some level of nontarget capture is inevitable. While the nuclear target capture retrieved a larger relative proportion of whale shark sequences than the mitochondrial capture (0.55% vs. 0.19% of reads), nontarget capture is probably also unavoidable for nuclear data, especially if probes are designed for exonic regions. Nevertheless, our results indicate that these nontarget data can be highly informative.
FIGURE 3 Overview of variants found using nuclear capture probes when mapping putative whale shark reads to the whale shark genome (accession no. GCA_001642345.2) (a,b) and when mapping quality-filtered, dereplicated raw reads to the Thunnus thynnus genome (accession no. GCA_003231725.1) (c,d). (a,c) The number of variant sites retained as depth filter (i.e., minimum coverage required for a variant to be retained) increases for both "all variants" and "polymorphic variants" (variants where both the reference allele and another allele are present in the data [...]

| Capture efficiency
While designing capture probes enables us to retrieve far more targeted genetic information than metabarcoding and direct shotgun sequencing approaches would permit on water samples targeting single species, it is also an expensive solution. However, it is highly scalable once the probes have been developed, and the price will drop markedly when reordering previously designed probe sets or ordering larger quantities. It would be interesting to compare the resulting fold-increase of targeted capture approaches with direct shotgun sequencing on the same samples in relation to the price, in order to fully understand its merits for environmental samples.

| Mitochondrial capture
The results from mitochondrial capture highlight the strong applicability of this approach, as we were able to obtain a ~235× coverage mitogenome of the whale shark. We acquired data containing a large amount of both known and unknown variation across the mitogenome, with the variation in the D-loop region being in concordance with previous studies on the same whale shark aggregation (Sigsgaard et al., 2017). The finding of three complete previously known D-loop haplotypes, as well as two SNPs indicative of two additional haplotypes, provides support for the whale shark origin of the captured sequences, and we can conservatively suggest that at least five different whale shark individuals contributed mtDNA to the sequenced water sample. To discern between rare variants and [...]
Importantly, a great advantage of the target capture approach is that high-coverage sequence data from entire mitogenomes can be retrieved, instead of relying on a single species-specific metabarcoding marker, as is the current standard (Baker et al., 2018; Parsons et al., 2018; Sigsgaard et al., 2017). Obtaining accurate estimates of the number of individuals contributing to an eDNA sample based on mitogenomic data is at present unfeasible (see Sigsgaard et al., 2020 for a discussion of the challenges associated with identification of individuals), and we here limit ourselves to a conservative minimum [...]
As an unexpected advantage, the massive amounts of nontarget E. affinis DNA contributing to the sequence data simultaneously allowed us to explore mitochondrial variation with much higher coverage from a phylogenetically distant co-inhabitant of the sampling site (Table S2), providing additional insight into the applicability of target capture for eDNA studies.
TABLE 2 Overview of nuclear variants retained with a coverage of ≥10 when mapping to the whale shark genome used for bait design (accession no. [...]

| Nuclear capture
While the shortcomings of relying exclusively on mtDNA for population inferences have long been recognized (Ballard & Whitlock, 2004), eDNA researchers have focused on mtDNA due to its abundance as a multicopy marker, as well as to the large amount of reference data available in public databases. Nuclear target capture is largely unprecedented in eDNA studies, and with a lack of genomic reference data for most nonmodel organisms, this approach warrants extra caution. Here, we used the relevant available nuclear resources (i.e., genomes from R. typus, Callorhinchus milii, Leucoraja erinacea, Scyliorhinus torazame and Chiloscyllium punctatum) to ensure the best possible validation of the whale shark origin of eDNA sequences. Our study was not designed to explicitly estimate error rates, but a conservative estimate would suggest that about half of the raw whale shark variants found here represent sequencing and PCR errors. The coverage levels of the nuclear data retrieved for whale sharks were not sufficient for conducting in-depth population genetic inferences, with only 22 variants passing the 10× depth filter (see also Figure S4 for raw mapping coverage). Furthermore, we are unable to exclude the possibility that some, if not all, of these variants simply represent sequencing errors, as the majority are represented by a single sequence deviating from the reference genome. However, this is an important first step and the first successful attempt at retrieving nuclear information from eDNA samples. Our probe design was optimized for ~60K probes, but it may well have been more beneficial to use fewer probes at higher concentration, focusing on fewer loci, but with higher coverage per locus.
We would also recommend incorporating more genomic resources of both closely and distantly related species during probe design, and to consider using stricter criteria regarding probe similarity (e.g., only including probes with <85% similarity to all other genomic resources), assuming that single species capture is the aim. Additionally, assuming that the quantity of target template could be the limiting factor, filtering more water could perhaps increase the efficiency of capture. Nevertheless, with higher sequencing output, and improved probe design and protocols, higher coverage may well be obtainable from environmental samples. This would enable researchers to shed light on population genomic variation through environmental sampling, rendering eDNA an increasingly useful noninvasive tool for population geneticists in the future.
Importantly, we would argue that the enormous amount of putative nuclear eDNA from E. affinis found in the sequencing data demonstrates this point. As we do not have a genome available for E. affinis, we cannot verify that nuclear reads stemmed from this species. However, the large amounts of mtDNA from E. affinis found here would suggest that E. affinis is the most likely source of nuclear reads. We were able to retain thousands of polymorphic variants with a minimum coverage of 20× (but sometimes as high as >1,000×).
Analysis of nuDNA from environmental samples ultimately resembles a pooled sequencing approach (Sigsgaard et al., 2020), and it is noteworthy that such sequence coverage as obtained here in some respects fulfils the "best practices" requirements for pooled sequencing (Schlötterer et al., 2014). As compared to these "best practices," the problem nevertheless remains that it is difficult to know how many individuals have contributed DNA to the environmental samples and whether DNA contribution is reasonably balanced between individuals. Moreover, when using read counts and MAF to infer an approximated allele frequency spectrum, our data indicate that there is an abundance of rare alleles (an "L-shape"; Figure 4).
Reconstructing allele frequency spectra might serve as a preliminary test of the reliability of allele frequency estimates derived from environmental samples. In a relatively stable population, as would be expected to be the case for E. affinis, L-shaped allele frequency spectra would be expected (Luikart et al., 1998), representing a high abundance of rare alleles. This was roughly in accordance with our results, although the L-shape was not entirely clear-cut, perhaps due partly to sequences from other species misidentified as E. affinis sequences. Multiple other scombrid fishes are known to occur at the sampling site, and if nuDNA sequences from some of these species were retained in the filtered reads and successfully mapped to the Thunnus thynnus genome, this may have obscured our view of allele frequencies.
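An approximate folded allele frequency spectrum of the kind discussed here can be tabulated from per-site read counts as sketched below. The sketch assumes biallelic sites summarized as (reference count, alternative count), treats read proportions as allele frequency estimates, and uses hypothetical names throughout; it is not the procedure used to produce Figure 4.

```python
def folded_spectrum(variants, n_bins=10):
    """Count polymorphic sites per minor-allele-frequency bin over [0, 0.5].

    variants: iterable of (ref_count, alt_count) read counts per site.
    Returns a list of length n_bins; bin i covers MAF in
    [i * 0.5 / n_bins, (i + 1) * 0.5 / n_bins), with MAF == 0.5 in the last bin.
    """
    bins = [0] * n_bins
    for ref_count, alt_count in variants:
        depth = ref_count + alt_count
        minor = min(ref_count, alt_count)
        if depth == 0 or minor == 0:
            continue  # uncovered or monomorphic site
        # Integer arithmetic avoids floating-point errors at bin boundaries:
        # maf / bin_width == minor * 2 * n_bins / depth.
        index = min(minor * 2 * n_bins // depth, n_bins - 1)
        bins[index] += 1
    return bins


if __name__ == "__main__":
    toy_sites = [(98, 2), (95, 5), (90, 10), (60, 40), (50, 50), (100, 0)]
    print(folded_spectrum(toy_sites, n_bins=5))
```

An L-shaped spectrum would show up as counts concentrated in the first bins and tailing off towards a MAF of 0.5.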
Importantly, while tissue-based population genetic studies can rely on genotyping data of single individuals to identify rare alleles, environmental data will have to rely on allele frequencies and allele counts within a sample. Some of the singleton allelic variants obtained will undoubtedly represent sequencing or PCR errors, but the influence of these can be mitigated by applying either an MAF filter or a minor allele count (MAC) filter. However, if sequencing depth is insufficient this will result in a simultaneous loss of true rare variants, and thereby render analyses based on rare variants unfeasible.
Consequently, a high sequencing depth is needed to ensure multiple, independent hits to the minor allele for trustworthy inference.
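The dependence of rare-variant detection on depth can be made concrete with a simple binomial calculation. The sketch assumes that reads at a site are independent draws from the pooled DNA and ignores sequencing and PCR errors, so it gives an optimistic bound rather than an estimate for real eDNA data. The function name `prob_min_observations` is hypothetical.

```python
from math import comb


def prob_min_observations(depth, allele_freq, min_count=2):
    """P(X >= min_count) for X ~ Binomial(depth, allele_freq).

    Interpreted here as the chance that a minor allele at true frequency
    `allele_freq` is seen on at least `min_count` reads at a site with
    `depth` reads (assuming independent, error-free reads).
    """
    p_fewer = sum(comb(depth, k) * allele_freq ** k
                  * (1 - allele_freq) ** (depth - k)
                  for k in range(min_count))
    return 1.0 - p_fewer


if __name__ == "__main__":
    for depth in (10, 20, 100):
        p = prob_min_observations(depth, allele_freq=0.05, min_count=2)
        print(f"depth {depth:>3}: P(minor allele seen >= 2 times) = {p:.3f}")
```

Under these assumptions, a 5% allele is unlikely to pass a minor allele count filter of 2 at 10× depth but is very likely to pass it at 100×, which illustrates why MAC or MAF filters remove true rare variants when depth is low.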
However, sequencing depth does not seem to influence MAF if the infrequent minor alleles (MAF < 0.05) are disregarded (Figure S5). An enormous challenge for population genetic inference from environmental samples based on nuDNA will be the ability to discern between (a) novel genotypes from diverging populations or individuals of the target species, and (b) co-occurring closely related species.
Elucidating allele frequencies from environmental samples will be entirely dependent on the ability to confirm species-level identification at each locus independently. If working in environments where multiple closely related species occur simultaneously, it would be worth considering designing probes in regions with diagnostic SNPs to safely infer species identification. However, population genetic inference would then only be possible if the flanking regions of the diagnostic SNP also hold population-level information for the target species. We would argue that for proper probe design, researchers will need a reference genome of the target species as well as reference genomes from closely related species. To safely infer species-level identification, it would be extremely useful to have multiple genomes of both target and nontarget species. As enough reference data are compiled, even novel genotypes from the target species could be determined based on eDNA, for example by presence of nuclear barcode gaps associated with each locus targeted.

| Implications
This study demonstrates the wealth of genomic information on macroorganisms hidden in water samples and adds evidence to the increasing potential of eDNA as a population genetic tool. The relative read proportion of scombrid sequences from both the mitochondrial and the nuclear capture experiments proves that the capture approach is highly efficient in removing bacterial DNA, providing a massive advantage over shotgun sequencing for eukaryote monitoring (Stat et al., 2017). We show that whale shark DNA from entire mitochondrial genomes and multiple nuclear targets can be retrieved using target capture on water samples from an aggregation site. Sampling in the middle of groups of feeding and spawning fish was probably a close to ideal setting for testing the approach.
However, any sort of aggregating behaviour that concentrates many individuals of a species of interest would probably be advantageous for this approach, and it is possible that even better results could be achieved in different settings, such as by sampling near marine mammals with aggregating haulout behaviour, seasonal shorebirds with feeding stopovers in coastal areas, fishes with schooling behaviour, mass migrating species or species present in plankton blooms.
We show that eDNA from water samples taken near aggregating individuals holds considerable potential in exploring both frequent and rare variants, which would otherwise require many individuals to be directly sampled. The emerging field of population genetics from environmental samples remains in its infancy, but as databases continue to expand to encompass complete mitochondria and nuclear data for more nonmodel organisms, we argue that the usefulness of this approach will increase substantially.
With cross-capture previously being documented for nontarget organisms with up to 39% genetic divergence from target capture probes, combined with the cross-capture seen in this study, we argue that genomic target capture would furthermore hold promise as a multispecies approach in projects focused on entire organism groups (e.g., bony fish or mammals).
In conclusion, our study provides the first steps of baseline information on expected outcomes from population-level target capture experiments on contemporary environmental samples. We show for two marine fish species, R. typus and E. affinis, that population genetic inference from entire mitogenomes and nuclear loci is indeed feasible with eDNA samples. Our study opens new frontiers in eDNA research, and holds great promise for future population genomic research on aggregating and spawning species in aquatic environments.

ACKNOWLEDGEMENTS
We thank Annie Brandstrup, Susanne Nielsen and Britta Poulsen for assistance in the laboratory, and Jesper Bechsgaard, Aslak Kappel Hansen and Sune Agersnap for valuable input on data analysis and data representation. We also thank the three reviewers for valuable We thank GenomeDK and the Bioinformatic Research Center (BiRC) at Aarhus University for providing essential computational resources and support. Lastly, we thank the Qatar Ministry of Municipality and Environment for providing vessel and crew support for the water sampling and Mohammad Al-Jaidah for facilitating the sampling process in Qatar.

CONFLICT OF INTEREST
The authors declare no conflict of interest.