Invasion genetics from eDNA and thousands of larvae: A targeted metabarcoding assay that distinguishes species and population variation of zebra and quagga mussels

Abstract Identifying species and population genetic compositions of biological invasions at early life stages and/or from environmental (e)DNA using targeted high‐throughput sequencing (HTS) metabarcode assays offers powerful and cost‐effective means for early detection, analysis of spread patterns, and evaluating population changes. The present study develops, tests, and applies this method with a targeted sequence assay designed to simultaneously identify and distinguish between the closely related invasive Eurasian zebra and quagga mussels (Dreissena polymorpha and D. rostriformis) and their relatives and discern their respective population genetic patterns. Invasions of these dreissenid mussel species have markedly changed freshwater ecosystems throughout North America and Europe, exerting severe ecological and economic damage. Their planktonic early life stages (eggs and larvae) are morphologically indistinguishable, yet each species exerts differential ecological effects, with the quagga often outcompeting the zebra mussel as adults. Our targeted assay analyzes genetic variation from a diagnostic sequence region of the mitochondrial (mt)DNA cytochrome oxidase I (COI) gene, to assess temporal and spatial inter‐ and intra‐specific genetic variability. The assay facilitates analysis of environmental (e)DNA from water, early life stages from thousands of individuals, and simultaneous analysis of 50–100 tagged field‐collected samples. Experiments evaluated its accuracy and performance using: (a) mock laboratory communities containing known DNA quantities per taxon, (b) aquaria with mixed‐species/haplotype compositions of adults, and (c) field‐collected water and plankton versus traditional sampling of adult communities. Results delineated species compositions, relative abundances, and population‐level diversity differences among ecosystems, habitats, time series, and life stages from two allopatric concurrent invasions in the Great Lakes (Lake Erie) and the Hudson River, which had separate founding histories. Findings demonstrate application of this targeted assay and our approach to accurately and simultaneously discern species‐ and population‐level differences across spatial and temporal scales, facilitating early detection and ecological understanding of biological invasions.


| INTRODUC TI ON
The growing global prevalence of invasive species profoundly shapes contemporary ecosystems (Pimentel, 2014;Simberloff, 2013) and negatively affects commerce (Keller, Lodge, Lewis, & Shogren, 2009;Pimentel, Zuniga, & Morrison, 2005). Ecological effects of invasive species range from competition with native species for food and space, alterations in habitat structure, and spread of pathogens and parasites (Gallardo, Clavero, Sánchez, & Vilà, 2016;Kvach & Stepien, 2008;Simberloff, 2013). The North American Laurentian Great Lakes is one of the most serious examples of a heavily invaded ecosystem, where >184 aquatic invasive species (AIS) have established (NOAA, 2018). Of these, the invasions of two closely related bivalve dreissenid mussels from Eurasia ( Figure 1)-the zebra mussel Dreissena polymorpha (Pallas, 1771) and the quagga mussel D. rostriformis (Deshayes, 1838)-are regarded to have exerted some of the greatest ecological and economic impacts. Dreissenids are extensive biofoulers, prolific filterers, and ecosystem "engineers" that convert soft to hard benthos, often hosting entire communities of accompanying organisms on their shell surfaces and interstices (Stepien et al., 2013;Ward & Ricciardi, 2007).
Over the past 300 years, the zebra mussel has spread throughout much of Europe from its native Ponto Caspian region, and the quagga mussel for the last 130 years (van der Velde, Rajagopal, & Bij de Vaate, 2010;Zhulidov et al., 2010). The first North American appearance of dreissenids was reported in 1986, with the zebra mussel's discovery in Lake St. Clair of the Great Lakes, attributed to ballast water released from one or more trans-oceanic vessels (Hebert, Muncaster, & Mackie, 1989). This was followed by the 1989 detection of the quagga mussel in Lake Erie (May & Marsden, 1992).
Zebra mussel populations in the Great Lakes initially increased very rapidly, whereas the quagga mussel remained rare for several years, being primarily confined to deeper water habitats in eastern Lake Erie (Karatayev et al., 2014;Stepien, Hubers, & Skidmore, 1999). Near-complete replacement of the zebra mussel by the quagga mussel since has occurred throughout the central and eastern Lake Erie basins, with the western basin now harboring a mixed-species community, where shallower waters provide refugia for the zebra mussel (Karatayev et al., 2014). Lake Erie has the longest history of harboring established populations of both species (Benson, 2013), constituting a focal area for this study.
In comparison, the zebra mussel first invaded the Hudson River in 1991 (Strayer et al., 1996), followed by the quagga mussel in 2008 (Strayer

K E Y W O R D S
Dreissena, environmental DNA, Great Lakes, high-throughput sequencing, invasive species F I G U R E 1 (a) Adult zebra mussel Dreissena polymorpha (left) and quagga mussel D. rostriformis (right) (each specimen about 2.0 cm in length, (photographs not copyrighted and used courtesy of USGS NAS; https://nas. er.usgs.gov/)). Dreissena veliger larvae from our plankton samples (b) under compound light microscope (single larva) and (c) using cross-polarized light (multiple larvae) taken by NTM in this study  & Malcom, 2013). This zebra mussel population initially grew very rapidly, accounting for >50% of the benthic biomass by 1992 (Strayer et al., 1996). Genetic analyses using microsatellites showed that zebra mussel populations from the Hudson River and Lake Erie significantly differed, implicating different founding origins (Brown & Stepien, 2010;Stepien et al., 2013). By 2010, the quagga mussel had spread throughout the Hudson River's entire freshwater reach, yet has remained only a small fraction (<10%) of the dreissenid community (Strayer & Malcom, 2013;Strayer, pers. comm.). The respective histories and relative compositions of the two species in the Hudson River and Lake Erie thus comprise an ecological contrast for the present investigation.
Early detection and continued monitoring of dreissenid introductions and spread patterns are high management agency priorities in the United States (NOAA, 2018;USGS, 2018), Canada (DFO, 2018;MNRF, 2018), and Europe (van der Velde et al., 2010;Zhulidov et al., 2010). Ecological effects of dreissenid invasions largely depend on their respective species composition, time elapsed since their establishment, population density, and hydrological characteristics of the waterbody (Benson, 2013). Predicting their ecological impacts thus relies on up-to-date information about their relative species composition and abundances. However, these sister taxa are difficult to distinguish to species as adults, sometimes even by trained malacologists, and cannot be differentiated at early life stages (as eggs or larvae; Nichols & Black, 1994;Claxton & Boulding, 1998).
Dreissenids are prolific, reproducing multiple times each season, with >1,000,000 eggs per female zebra mussel (Borcherding, 1991), and have free-swimming veliger larvae that widely disperse (Ackerman, Sim, Nichols, & Claudi, 1994;Figure 1b-c). Adult mussels attach in tremendous numbers to vessels, tackle, and drifting objects and can survive out of water for a week or longer (Collas et al., 2014;Stepien et al., 2013); they thus are readily transported among bodies of water. New introductions often pose significant biosecurity risks (Robinson, Burgman, & Cannon, 2011), since dreissenid populations often are not identified until they are already established and widespread, and by then are difficult (or impossible) to eradicate (Zaiko, Minchin, & Olenin, 2014). Genetic tools that effectively identify and diagnose dreissenids at early life stages (from plankton) and at low densities (e.g., environmental (e)DNA from water samples) would significantly improve timely detection, identification, and monitoring.

| Research objective and applications
The research objective was to develop, test, and apply a new targeted HTS metabarcode assay that simultaneously identifies and differentiates among all six Dreissena species (see Stepien et al., 2013;Marshall & Stepien, 2019), including zebra and quagga mussels, and discerns their respective population-level variability (i.e., haplotype representation and relative abundances) at all life stages. Experiments tested the assay's accuracy and performance on mixed communities of zebra and quagga mussel species and their haplotypes in: (a) a series of mock laboratory communities containing known quantities of DNA, (b) aquarium experiments, and (c) field-collected eDNA water and plankton (larvae) versus results from traditional sampling.
Dreissenid mussel community compositions were compared between two allopatric ecosystems in Lake Erie and the Hudson River, which have separate invasion histories. Field experiments evaluated community differences across a temporal collection series in Lake Erie encompassing the reproductive season, and along a spatial gradient along the Hudson River. We also compared and contrasted speciesand population-genetic variability from two sites in Lake Erie, one in the western basin where the zebra mussel is more prevalent and the other in the central basin, which is dominated by the quagga mussel.
These are statistically compared with results from the Hudson River sites, which also are dominated by the zebra mussel.
Our assay aimed to simultaneously assess inter-and intraspecific genetic variation from up to thousands of individuals in environmental samples, allowing for detection from eDNA in water and plankton of early life stages. It was designed to alleviate morphological mis-identifications. At the community level, accurate identification of dreissenids to species and assessment of their population genetic variability are projected to facilitate understanding their relative reproductive outputs, spread patterns, ecological interactions, successes, and adaptations over temporal and spatial scales.

| HTS assay design
First, we screened and aligned sequences for the "barcode" region of the mitochondrial (mt) cytochrome oxidase I (COI) gene from NIH NCBI GenBank (https://www.ncbi.nlm.nih.gov/genbank) and our own laboratory databases for all six species of Dreissena along with the two species of its sister genus Mytilopsis (Marshall & Stepien, 2019;Stepien et al., 2013). Primer pairs were designed by evaluating the aligned sequences with ClustalX in MEGA7 (Kumar, Stecher, & Tamura, 2016) and selecting regions of diagnostic variation that are flanked by conserved regions. The aligned database identified 20 unique zebra mussel haplotypes (labeled ZM-A through ZM-T) and 17 quagga mussel haplotypes (labeled QM-A through QM-Q) in a 570-nucleotide (nt) region of COI (Supporting information Tables   A1-A4). Two markers were developed and tested to target different nonoverlapping variable regions of ~200 nucleotides (nt), which encompassed intraspecific variation for zebra and quagga mussels and species-specific diagnosis for all six species of Dreissena, along with the two species of the closely related genus Mytilopsis (see Stepien et al., 2013 for taxonomy). We then focused on testing the two markers on zebra and quagga mussels, which were designed to each distinguish different single nucleotide polymorphisms (SNPs).
It delineates seven zebra mussel and nine quagga mussel haplotypes and also distinguishes among all six Dreissena species (Tables   A3-A4). Our two primer sets distinguish seven of the nine quagga mussel COI haplotypes described by Marescaux et al. (2016), found throughout its native and invasive ranges.

| Mock communities and aquaria experiments
Performance of the assays was tested on eight mock communities containing various known concentrations of DNA zebra and quagga mussel haplotypes (termed amplicon sequence variants (ASVs; see Callahan, McMurdie, & Holmes, 2017)). We thus tested whether the mock communities statistically maintained the relative ASV proportions during HTS (following our method previously published in Klymus, Marshall, & Stepien, 2017). Mixtures of purified DNA containing various proportions of 47-6,000 copies each of five zebra mussel haplotypes (ZM-A, ZM-C, ZM-G, ZM-M, and ZM-O) and three quagga mussel haplotypes (QM-A, QM-F, and QM-G) for the COI gene were analyzed (Table 1). We quantified the target copy number per DNA extraction for each ASV in a series of competitive polymerase chain reactions (PCR) that co-amplified the native template (NT) and an internal standard (IS; see Klymus et al., 2017). Our COIA assay (Tables A1 and A2) was designed to differentiate among three zebra mussel haplotypes (ZM-C, ZM-G, and the ZM-A,M,O group) and three quagga mussel haplotypes (QM-A, QM-F, and QM-G; Tables A1 and A2); thus, six ASVs were identifiable in the mock community with this marker. The COIB assay was designed to differentiate four zebra mussel haplotypes (ZM-G, ZM-M, ZM-O, and the ZM-A,C group) and one quagga mussel haplotype group (QM-A,F,G group); thus, five ASVs would be diagnosable in the mock community (Tables A3 and A4). Adult quagga mussels were collected from Belle Isle in the Detroit River, Detroit, MI (42.34,, and zebra mussels from the western Lake Erie basin in Oregon, OH (41.67,, both in April 2016 ( Figure 2). Three 38-L aquaria, air stones, and tubing were cleaned with 10% bleach and filled with 15 L of dechlorinated tap water. Aquaria were covered in clean plastic wrap and oxygenated throughout the experiments. Prior to introduction of the mussels, the water in each aquarium was thoroughly mixed, and a 500 ml aliquot was taken using a sterile container for eDNA analysis; this served as a negative control. The removed water was replaced with dechlorinated water from a sterile container. Mixed proportions of adults of the two species, ranging from 26 to 36 individuals total, were placed into each aquarium (Table 2). At days 2, 7, and 14, the water in each aquarium was sampled as above for eDNA analysis.
Water samples (including the negative controls) each were filtered through a 0.2-µm polyethersulfone (PES) filter, from which eDNA was extracted using the Qiagen DNeasy ® Blood and Tissue Kit (Qiagen Inc., Valencia, CA, USA), with the amounts of proteinase K and ATL doubled in order to fully submerse the filter. Extracted At the end of the experiment (day 14) and after the final water collection, mussels were removed, air dried for one hour, and individually weighed (g). They then were slowly cooled in a refrigerator for 24 hr and then individually labeled and frozen. DNA from each was extracted, amplified, and Sanger sequenced for the COI gene using primers and methodology from Folmer, Black, Hoeh, Lutz, and Vrijenhoek (1994  They were collected in sterilized bleach-washed containers, one from 15 cm below the surface and the other 15 cm above the benthos (the latter were taken and capped by a team of SCUBA divers from the Cary Institute of Ecosystem Studies, who were careful not to touch or disturb the benthos). Water samples were labeled, placed on ice in the field, and then frozen at −20°C in the laboratory.

| Field eDNA water and plankton samples
Simultaneously, the SCUBA diving team surveyed the adult dreissenid mussel communities by collecting 10-12 rocks (15-40 cm in size), from which the mussels were counted in the Cary Institute laboratory (totaling 2,984 mussels from Stuyvesant, 2,174 from Coxsackie, and 3,345 from Stockport). Length-mass regression analysis was performed on a representative range of sizes for ~300 mussels to estimate respective biomass per species. In the laboratory, half of each water sample (500 ml) was pelleted in 10 sterile 50-mL falcon tubes at 7,500 g centrifugation for 30 min. DNA was extracted with a cetyl-trimethyl ammonium bromide (CTAB) protocol (Turner et al., 2014), and purified with a Zymo Research One Step PCR Inhibitor Removal kit, with the resultant DNA used for HTS.
F I G U R E 2 Site map for dreissenid mussel plankton and eDNA water samples, from the Hudson River (HR, locations 1-6) and Lake Erie (Maumee River's mouth (MRM) and South Bass Island (SBI)) Triplicate plankton samples were collected from four Hudson River sites ( Figure 2) Hosler, 2011) was determined from the mean of the three aliquots per sample, multiplying by the sample volume (15 ml) divided by the filtered volume (~190 L), and reported as the mean of replicates ± standard error (SE). Numbers of Dreissena larvae were statistically compared using paired Student's t tests (Sokal & Rohlf, 1994).
One plankton sample from each site was centrifuged at 6,000 rpm for 5 min, and the ethanol decanted. DNA was extracted from the resultant pellet using the CTAB soil extraction protocol (Turner et al., 2014), with the resultant DNA amplified for HTS.

| MiSeq HTS library preparation
A four-step PCR library preparation was used prior to paired-end HTS on the Illumina ® MiSeq platform (https://www.illumina.com/systems/ sequencing-platforms/miseq.html). After each PCR, products were column cleaned with QIAquick ® PCR purification kits (Q iagen; https:// www.qiagen.com), yielding the template for the subsequent step. The first step of four cycles used assay-specific primers (COIA or COIB) to append 21-nt APEX tails (F:5'GATCAGGCGTCTGTCGTGCTC3', R:5'CTCGACGACAGACGCCTGATC3') to the target regions, which aimed to reduce PCR bias by retaining the relative ASV proportions of the sample throughout library preparation (Krjutškov et al., 2008).
PCRs contained 1X PCR buffer, 0.2 mM dNTPs, 0.5 μM of each primer, 1U AmpliTaq ® , 2.5 μl template DNA, and the necessary volume of ddH 2 0 to constitute 25 μl. Conditions were as follows: 30-s initial denaturation at 95°C, followed by four cycles of 95°C for 30-s, 62°C for 30-s, and 72°C for 1-min, with 2-min final extension at 72°C. Next, the APEX primers were used to amplify targets adhered with the APEX tail, using the same PCR conditions for 35 cycles.
The third step amplified previous products using APEX primers appended with 7-to 17-nt (termed spacers) and a 33-to 34-nt sequencing primer on the 5' end. The spacer region increased nt diversity of sequence reads and enhanced cluster formation, improving sequencing quality (Fadrosh et al., 2014;Wu et al., 2015). Four primer sets were used, which differed in their spacer combinations (see Klymus et al., 2017 primer set followed the library preparation, which was published in Klymus et al. (2017).
To avoid sequencing dimer product, targeted fragments first were size-selected with a 1.5% agarose gel cassette on Pippin Prep (Sage Science). PCR products were sized and quantified on a 2100 Bioanalyzer (Agilent Technologies). A negative control was run for each library preparation step and checked for contamination with gel electrophoresis, with the final step checked on the 2100 Bioanalyzer to verify the absence of product. No contamination was found in any instance. Pooled product concentrations TA B L E 2 Numbers (N), weight (g), and percentage of species and haplotypes (lettered) used in aquarium environmental DNA experiments (Aquaria 1-3) with assay COIA. Mussels were weighed at the end of the experiment (at 14 days)

Species and Haplotype
Aquarium Imaging Center in Wooster, OH (http://mcic.osu.edu/). An additional 40%-50% PhiX DNA spike-in control was added to improve data quality of low nt diversity samples (Klymus et al., 2017). HTS was performed on four MiSeq runs, which also contained tagged samples from other projects to increase diversity; all were pooled to yield >100,000 reads per sample.

| Bioinformatic and statistical analyses
MiSeq results were analyzed using the R package DADA2 (Callahan et al., 2016), except for three mock community samples that had low read depths (COIA mock community 5, and COIB mock communities 2 and 7). We instead merged and trimmed those with the Python package OBITools 1.2.11 (Boyer et al., 2016). For the other samples, the primer and spacer regions first were trimmed using a custom Perl script. Next, using the default settings in DADA2, error rates were estimated, sequences were merged and grouped into ASVs, and any erroneous sequences and chimeras were removed. Chimeras were not removed from the COIB dataset, since DADA2 mistakenly regarded haplotype ZM-E as a chimeric sequence. Sequence ASVs were identified to dreissenid species and haplotypes in reference to COI sequences on GenBank and our own database. Possible erroneous ASVs were eliminated from our environmental datasets by removing any sequence present only in a single sample.
For mock communities, the percentage of reads per ASV was calculated and analyzed using R v3.4.3 (R Core Team, 2015) by regressing the logarithm of their observed versus expected percent read abundances to evaluate goodness of fit and whether the slopes differed from 1.00. Contingency tests (Sokal & Rohlf, 1994) were used to compare HTS reads from (a) aquarium eDNA water having known ASV compositions, (b) Hudson River eDNA water samples with traditional morphological census counts, and (c) plankton across temporal and spatial scales. To compare relatedness among multiple-ASV community assemblages, Nonmetric multidimensional scaling (NMDS; Kruskal, 1964) ordinations were performed using Bray-Curtis dissimilarity indices of the relative abundances of ASVs using the R package vegan (Oksanen et al., 2018) and the metaMDS and anosim functions to parse the data according to collection locations.

| HTS of mock communities and aquaria
Number of HTS reads passing cluster quality filtering ranged from 14,329,380 to 15,237,283 per run (Table A5). Regression analyses revealed significant matches (p < 0.05) between the observed and expected sequence read percentages for the mock community samples with the COIA primers ( Figure 3a; Table 3), corresponding to the expected slope = 1.00. In contrast, the COIB marker displayed a weaker relationship between observed and expected ASV proportions for mock communities 3-5 (p > 0.05; Figure 3b; Table 3 alone (***p < 0.001, Figure 4). For #3, the zebra mussels were relatively small in size in comparison with the quagga mussels; thus, the F I G U R E 4 Species and haplotype proportions of zebra mussels (ZM; Dreissena polymorpha) and quagga mussels (QM; D. rostriformis) from environmental DNA in three aquarium experiments (A-C) using our targeted high-throughput sequencing COIA assay, in relation to biomass (weight, g) and number of individuals (N). Water samples taken on days 2, 7, and 14. Haplotypes are lettered and colored. Roman numerals denote significant concordance and/or differences in relative proportion of sequence reads for each aquarium using contingency G-tests   I  I  I  I  II  I  I  II  III  III  I  II  II  I  F I G U R E 5 Species proportions of zebra mussels (ZM; Dreissena polymorpha) and quagga mussels (QM; D. rostriformis) from plankton and eDNA field experiments using our targeted high-throughput sequencing COIA assay (this paper) and our previously published 16S assay (Klymus et al., 2017). Sampling locations were as follows: Maumee River's mouth (MRM) in Lake Erie (July 1, 2016), South Bass Island (SBI) in Lake Erie ( relative biomass of the latter was much greater than their relative numbers (Table 2).
Species proportions in the HTS reads from aquarium #1 at days 2 and 7 did not significantly differ from the relative numbers of individuals per species or from biomass ( Figure 4a). On day 14, aquarium #1's sequence reads significantly differed from both its numbers of individuals per species (***p < 0.001) and biomass (***p < 0.001). Sequence reads from aquarium #2 significantly differed from numbers of individuals per species and biomass on all three days (Figure 4b). A quagga mussel died in aquarium #2, which went unnoticed. It likely died early in the experiment, increasing the amount of quagga mussel DNA throughout the duration. Sequence reads per species in aquarium #3 did not differ from the number of individuals per species at day 2, but differed from their respective biomass (***p < 0.001; Figure 4c). However, at days 7 and 14, its sequence reads did not differ from biomass per species, but varied from the numbers of individuals per species (***p < 0.001). This result resembled that of aquarium #1.
By the end of the experiment, the relative proportions of all sequence reads per species significantly differed from those at day 2 (day 2 vs. 14: ***p < 0.001 for all three aquaria; Figure 4), revealing a trend that more closely matched biomass at day 14, rather than the numbers of individuals (as occurred on day 2). The change to resembling biomass per species was observed at day 7 for aquaria #2 and #3, but at day 14 in #1.

| HTS of field eDNA water and plankton samples
The  Figure 5). The COIA was dreissenid specific, as 100% of the HTS results corresponded to zebra and quagga mussels. In contrast, the proportion of dreissenid sequences was much lower for the 16S assay (ranging from 0.36% to 80.25%).
Additionally, the proportions of species-specific sequence reads from the Hudson River eDNA water samples were statistically similar to the species ratios of the benthic adults determined by the Cary Institute's traditional morphological identifications ( Figure 6).
In both data sets and at all locations, zebra mussels greatly outnumbered quagga mussels. The Cary Institute's proportions were 95.0%-96.5% (mean = 95.7%) zebra mussels from morphological counts and 91.2%-92.3% (mean = 91.7%) zebra mussels using estimated biomass. Their species' proportions did not significantly differ among the three Hudson River locations. Our HTS reads from eDNA water (surface and benthic samples) similarly yielded 83.5%-97.7% (mean = 92.1%) zebra mussels. Coxsackie, *p = 0.04; Figure 6). The benthic eDNA sequence reads (83.5%-95.5% zebra mussels, mean = 87.8%) more closely matched the Cary Institute's biomass estimates (no significant differences at any site), than their adult mussel counts (significant difference for Stuyvesant, *p = 0.02; Figure 6). The numbers of veligers significantly differed between the Lake Erie sites (Maumee River's mouth and South Bass Island) throughout the sampling season (***p < 0.001; Figure 7a,b). The relative HTS proportions of zebra mussels to quagga mussels In western Lake Erie, the relative HTS proportions of ZM:QM in plankton averaged 70:30% (ranging from 37%-97% ZM) at the Maumee River's mouth and 34:66% (22%-41% ZM) at South Bass Island, which differed significantly in depth (with the Maumee River's mouth being shallower) and were located ~36 km apart.

Physical counts of
Overall HTS proportions of ZM:QM significantly differed between the two locations (***p < 0.001), matching their adult communities.
Species composition of the veliger larvae significantly varied temporally at the Maumee River's mouth (mean **p = 0.008), with fewer zebra mussels and more quagga mussels at the earliest  (Table A6).
When samples were grouped by location, the lowest KDI values occurred between the Hudson River and Maumee River's mouth (KDI = 0.10), and the largest between the Hudson River versus South Bass Island (KDI = 0.58; Table A6).

| Haplotypic diversity
Two rare zebra mussel haplotypes (ZM-C and ZM-G) and one rare quagga mussel haplotype (QM-G) were detected in the HTS analyses of eDNA water and plankton, with all three haplotypes previously known in GenBank. Haplotype ZM-C was present in all but one Lake Erie plankton sample on August 2 (0.87% at MRM and 0.33% at SBI) and occurred throughout all Hudson River sites (0.78% in plankton and 1.37% in eDNA; Figures 7 and 8b). We found the rare haplotype ZM-G throughout the Hudson River (four sites at 0.76% in plankton and all locations at 0.65% in eDNA), which was scarce in Lake Erie (one MRM sample at 0.07% and four from SBI at 0.04%). The QM-G haplotype was discerned in plankton at both Lake Erie locations (0.39% at MRM and 0.98% at SBI) and in the Hudson River (0.18%), but was absent from our Hudson River eDNA samples (Figure 8b).

| D ISCUSS I ON
This study developed and tested a new HTS metabarcoding assay that simultaneously differentiated among dreissenid species and their intraspecific haplotypes from single samples containing DNA from thousands of individuals. Since a single dreissenid mussel adult can release >1,000,000 gametes in a spawning season (Borcherding, 1991), and subsequent larval stages persist throughout the water column from early May through September (Borcherding, 1991;Sprung, 1993), water and plankton provide an opportunity to rapidly survey DNA from vast numbers of individuals, facilitating early and accurate detection of AIS. Estimated numbers of veliger larvae analyzed per our single plankton samples ranged from 2,220 to 14,860 individuals, demonstrating the utility of our assay and approach to assess the genetic variation of thousands of individuals. Thus, our metabarcoding approach offers greater ability to effectively discern species and population genetic diversity versus time-intensive, single individual processing with traditional sampling and sequencing.
This HTS assay also accurately reveals species proportions within varying environmental samples (both water and plankton), as our sequencing results matched those estimated using our laboratory's previously published mollusk 16S assay (Klymus et al., 2017). That F I G U R E 6 Species and haplotype proportions of zebra mussels (ZM; Dreissena polymorpha) and quagga mussels (QM; D. rostriformis) from field eDNA samples using our targeted highthroughput sequencing COIA assay, in relation to estimated biomass (weight, g) and number of individuals (n) in the adult community of the Hudson River. eDNA water samples collected from the surface and the benthos at three sites (Stuyvesant, NY (HR1), Coxsackie, NY (HR2), and Stockport, NY (HR3)).
Haplotypes are lettered and colored. Roman numerals denote significant concordance and/or differences in relative proportion of sequence reads per site using contingency G-tests

HR2 HR3
16S assay, unlike our new COI assay, does not resolve intraspecific variation. Furthermore, the 16S assay amplifies a wide range of taxa and thus might fail to detect a small dreissenid population.
The COI HTS assay was successful for evaluating representative haplotypic proportions of zebra and quagga mussels in labo- For example, a study by Diggins (2001) found that quagga mussels had higher filtration rates (>37%) than zebra mussels of the same size (20 mm), which could lead to higher rates of eDNA shedding, but remains to be experimentally verified. In our study, when the surface and benthic eDNA results were averaged together, their species' proportional sequence reads did not differ from either their morphological counts or estimated biomass at any field location.
These results suggest that slight differences in DNA concentrations can occur throughout the water column, and water collected from near the benthos (where most adults live) would best represent the true dreissenid biomass composition. A qPCR study of zebra mussel eDNA also found correspondence to benthic biomass of adults (Amberg, Merkes, Stott, Rees, & Erickson, 2019). As in the present study, eDNA surveys in the Rhine River using species-specific quantitative (q)PCR assays matched the respective species proportions from field surveys (De Ventura et al., 2017). Their approach differed from ours as they required two different probes to differentiate between the species and were unable to evaluate haplotypic (intraspecific) variation.
F I G U R E 7 Species and haplotype proportions of zebra mussels (ZM; Dreissena polymorpha) and quagga mussels (QM; D. rostriformis) from field plankton samples collected from ( 2432  2223  1881  2527  6384  8379  8037  11704  5263  7087  6897  5783  13053  14858  6840  6175  11780   I  II  III  IV  I  I  I  I  I  I  I  I  II  II  II  II  I  Sequencing results with our HTS assay accurately depicted the two allopatric dreissenid invasions, with the Hudson River and South Bass Island in Lake Erie sites being the most divergent. These results matched the morphological species compositions of the two invasions, with the quagga mussel dominating at South Bass Island (Karatayev et al., 2014) and the zebra mussel most prevalent in the Hudson River (Strayer & Malcom, 2013). As expected, our assay showed that the zebra mussel dominated the plankton from the Maumee River's mouth, since that region has been described as having the sole zebra mussel refugium in Lake Erie (Karatayev et al., 2014). Additionally, our assay displayed temporal resolution at the Maumee River's mouth, whose larval species composition varied throughout the sampling season. Throughout June, its species composition more closely resembled that from the Hudson River, The quagga mussel has been outcompeting the zebra mussel in many invaded European rivers, increasing by an estimated 26% per year in Germany and the Netherlands (Heiler et al., 2013). Large fluctuations in their densities, with quagga mussels eventually displacing zebra mussels, were observed in the northern Dnieper River (Ukraine; Zhulidov et al., 2010). A similar takeover by the quagga mussel might substantially alter ecological processes in the Hudson River. For example, the quagga mussel's displacement of the zebra mussel in the lower Great Lakes increased total dreissenid biomass, as the quagga mussel more readily spreads across soft substrates, magnifying overall ecological impacts (Nalepa, Fanslow, & Lang, 2009). Moreover, selective filtering and phosphorous excretion appear to be contributing to increased harmful algal blooms in the lower Great Lakes and may significantly differ between the two dreissenid species (Conroy et al., 2005;Vanderploeg et al., 2001).
Therefore, it is important to effectively discern the species compositions of dreissenid communities spatially and temporally. This can be F I G U R E 8 (a). Nonmetric multidimensional scaling (NMDS) plot (Bray-Curtis distance metric of relative abundance) showing the relationships among dreissenid mussel communities from field samples using our targeted high-throughput sequencing COIA assay. Purple = Hudson River eDNA water, Red = Hudson River plankton, Green = Maumee River's mouth, Lake Erie plankton, Blue = South Bass Island, Lake Erie plankton. (b) Network showing relationships among zebra mussel (ZM; Dreissena polymorpha) and quagga mussel (QM; D. rostriformis) haplotypes for the mtDNA COI gene region used for our COIA assay. Hatch marks designate the number of nucleotide differences. Circle sizes represent the proportion of sequence reads attributed to that haplotype per sample location. Black circles denote haplotypes from NIH NCBI GenBank that were not found in our environmental samples Although our assay solely was used here for zebra and quagga mussel populations, it could further be applied to distinguish the presence and species identities of the four other Dreissena species, as well as the two related-and also highly invasive-Mytilopsis species (M. leucophaeata and M. sallei). In marine and estuarine harbors worldwide, invasive Mytilopsis spp. exert similar ecological and economic fouling problems as do zebra and quagga mussels (Laine, Mattila, & Lehikoinen, 2006;Sousa, Gutiérrez, & Aldridge, 2009). In addition to the well-known zebra and quagga mussels, two Dreissena species native to the Balkans (D. blanci and D. carinata) recently expanded their ranges and may become invasive throughout Europe (Wilke et al., 2010). Furthermore, D. anatolica is endemic to just a few lakes in Turkey (Marshall & Stepien, 2019;Stepien et al., 2013;van der Velde et al., 2010) and therefore is at risk from competition pressures by future expansions of congeneric invaders. Our assay thus has apparent wide application to track and monitor the distributional trajectories of these taxa, by providing both species identification and their respective intraspecific population data.
Due to the much larger numbers of individuals surveyed (thousands vs. ~100), rarer haplotypes are more likely to be analyzed using HTS than with conventional sampling and sequencing of separate individuals. In our results, the rare haplotypes identified from surface and benthic collections at the same location sometimes differed; we thus recommend that future eDNA studies survey a variety of depths, with triplicate samples, to facilitate thorough evaluation of haplotypic diversity. Samples taken from different depths could be combined during DNA extraction, thereby yielding an enhanced representation of the entire water column, increasing the likelihood of surveying rare haplotypes.
Site occupancy models also could be employed to evaluate possible failure to detect rare haplotypic variants (Dorazio & Erickson, 2017).
All haplotypes identified here for the zebra and quagga mussels are previously known sequences deposited in GenBank. This produces strong evidence that our screening procedure successfully removed any erroneous haplotypes resulting from possible PCR/sequencing error. Haplotype ZM-C previously was identified only in the Mohawk River, NY, a tributary of the Hudson River (Molloy et al., 2010), but here occurred in all of our Hudson River sites (plankton and eDNA), as well as in all but one of our Lake Erie plankton samples. ZM-G previously was described from the Dnieper River (a native habitat; Soroka, 2010) and Lake Superior in the Great Lakes (Grigorovich, Kelly, Darling, & West, 2008), yet appears to be widespread, since we found it in Lake Erie and throughout the Hudson River. Previous investigations described haplotypes QM-G as being widespread in the Dnieper River (where the quagga mussel natively occurs) and throughout its invasive range in Europe (Romania, Germany, France, and The Netherlands), as well as in Lake Huron and Lake Cayuga, NY (Marescaux et al., 2016). That haplotype was not previously known from Lake Erie, where Marescaux et al. (2016) surveyed just ten individuals. The veliger larvae collection and HTS analysis approach used here is much more likely to elucidate the presence of rare haplotypes, such as these, due to the tremendous number of larvae sampled and analyzed.

| The present approach versus other eDNA techniques
Some previous studies found that qPCR is a useful tool for estimating the number of dreissenid mussel sequence copies in a plankton or environmental sample (Amberg et al., 2019;De Ventura et al., 2017;Ram et al., 2011). However, those each required two separate markers (one for zebra mussels and another for quagga mussels) and neither discerned population genetic variability. Ram et al. (2011) used mt16S RNA for zebra mussel and mtDNA COI for quagga mussel, with the two targeted regions differing by 181 nt in length. Amberg et al. (2019) designed COI primers for the zebra mussel, with their future aspirations stated to develop a probe to differentiate quagga mussel eDNA. De Ventura et al. (2017) employed two overlapping regions of mtDNA COI, which had different primers and differed in length by 100 nt. Thus, their respective targeted regions would have differential amplification, and could degrade at different rates in the environment, rendering the two products of the two species not directly comparable. This likely would lead to inaccurate estimates of their relative abundances (see Dejean et al., 2011) in both methods. Two other qPCR assays for dreissenids (Gingera et al., 2017;Peñarrubia et al., 2016) lacked the ability to distinguish between the species. Egan et al. (2015) used light transmission spectroscopy to test water from the hulls of ships from Great Lakes harbors, and were able to detect eDNA from the quagga mussel, but not the zebra mussel, from localities where both were known to be present. Our approach overcomes such issues by using a higher-resolution HTS metabarcoding approach, amplifying the same gene region for both species.
The present targeted metabarcoding HTS assay yields important intraspecific population-level information for dreissenid mussels, beyond the scope of the earlier published methods. For example, the Klymus et al. (2017) 16S assay accurately differentiates between the two species, but that gene region evolves too slowly to provide population genetic resolution. Our study results here discerned haplotypic diversity of zebra and quagga mussels that was undetected by conventional sequencing of adults (Marescaux et al., 2016;Stepien et al., 2013). Similar intraspecific approaches using the mtDNA control region have been developed for the whale shark Rhincodon typus (Sigsgaard et al., 2016) and harbor porpoise Phocoena phocoena (Parsons et al., 2018), and with the cytochrome b region for the invasive silver carp Hypophthalmichthys molitrix .
Our assay is designed to accurately estimate haplotypic frequencies, as well as relative species proportions, which will be useful for elucidating invasion pathways, along with documenting population relationships across temporal and spatial scales.
In summary, this diagnostic approach identifies and determines the relative representation of closely related species and their populations from plankton and eDNA, facilitating community-and population-level detection and ecological comparisons. The new targeted HTS metabarcoding dreissenid assay yields key information about the relative proportions of inter-and intra-specific variation.
Tremendous numbers of individuals can be rapidly and simultaneously assessed, discerning and quantifying the relative abundances of species and their population-level haplotypic diversity among locations. Present-day morphological techniques for early detection of dreissenid invasions from plankton have been deemed insufficient for western U.S. reservoirs (Counihan & Bollens, 2017), with eDNA suggested as a useful alternative (Hosler, 2017). In comparison, our method, which jointly evaluates inter-and intra-specific diversity of plankton and eDNA water samples, allows for analysis of the temporal and spatial trajectories of dreissenid invasions, including potential displacements of zebra mussel populations by quagga mussels. Our assay and approach thus show broad and widespread application as an effective monitoring and research tool for understanding biological invasions and resultant ecological changes. Elz, K. Klymus, E. Kramer, and M. Snyder) for laboratory and/or editorial assistance, and M. Neilson for assistance in obtaining photographs from USGS of zebra and quagga mussels.

CO N FLI C T O F I NTE R E S T
None Declared.

APPENDIX A
TA B L E A 1 Single nucleotide polymorphisms (SNPs) that differentiate zebra mussel (ZM; Dreissena polymorpha) and quagga mussel (QM; D. rostriformis) haplotypes (lettered) for the COIA region (from NIH NCBI GenBank and our records). Haplotypes used for our mock communities are in bold italics. Primer regions are shaded grey.