eDNA metabarcoding in lakes to quantify influences of landscape features and human activity on aquatic invasive species prevalence and fish community diversity

Our goal was to use eDNA metabarcoding to characterize fish community diversity, detect aquatic invasive species (AIS) and assess how measures of community (or AIS) diversity are influenced by lake physical and environmental covariates, measures of hydrological connectivity and human accessibility.


| INTRODUC TI ON
Accurate and efficient monitoring of species community composition and quantification of causal environmental factors associated with species presence, abundance and distribution in aquatic environments necessitates effective sampling methodology and rigorous and innovative analytical designs. There is increasing demand for large-scale studies to quantify aquatic biological diversity at the community or ecosystem scale (Green, 1979;Kennedy et al., 2020;Magurran, 2013). In freshwater ecosystems, environmental DNA (eDNA) metabarcoding is an increasingly popular method for surveying biodiversity without the need to directly observe individual organisms (Hänfling et al., 2016;Olds et al., 2016). The utility of eDNA assays has rapidly been recognized by managers and policymakers (McElroy et al., 2020;Rees et al., 2014). Monitoring approaches that allow for accurate representation of rare and invasive species are important in conservation of biological diversity (Rice et al., 2012;Yates et al., 2019). Unfortunately, traditional survey methods (e.g. seines, electrofishing) are often laborious (in the case of multiple sites and gears), invasive and rely on taxonomic expertise (Creer et al., 2016). The detection of low abundance species can be difficult because of gear biases (Ruetz et al., 2007), as well as the large effort required to detect small numbers of individuals in large aquatic systems (Jerde et al., 2011). Using molecular methods for species identification, however, provides alternatives to traditional fisheries gears for surveying fish biodiversity that yield consistent estimates of species richness for aquatic communities (Hänfling et al., 2016;McElroy et al., 2020;Sard et al., 2019).
The probability of detection can vary greatly from species to species depending on the biology and behaviour of the target organism, and the environmental characteristics and complexity of the lakes they inhabit (Bracken et al., 2019;Li et al., 2019;Rees et al., 2014). In turn, diversity of aquatic habitats may also depend on environmental characteristics, including aquatic connectivity (Fagan, 2002;Lamy et al., 2013;Perkin & Gido, 2012), land-use/land-cover of surrounding landscapes (Jocelyn et al., 2005;Schlosser, 1991) and levels of human disturbance (Esselman et al., 2013). Additionally, the likelihood of false negatives is often high for rare and/or difficult to detect species (Kumar et al., 2020;Mathieu et al., 2020;Pinfield et al., 2019). To address this problem, occupancy modelling has been used to provide improved estimates of species occurrences by integrating data from multiple samples while evaluating the probability of detecting a given species (Kumar et al., 2020;MacKenzie et al., 2002;McClenaghan et al., 2020;Schmidt et al., 2013), in combination with limnological characteristics of aquatic habitats (Harper et al., 2020).
Introduction, establishment and increases in abundance and distribution of AIS are of concern within the Great Lakes region because of the increasing risk of secondary spread into inland water bodies, and the ecological impacts that these species may have on native communities (Coble et al., 1990;Escobar et al., 2018;Sard et al., 2019).
Threats in Michigan are substantial because of the large amount of Great Lakes shoreline (>3,000 miles) and abundance of lakes (~11,000) in close proximity to source Great Lakes AIS populations. Early detection of AIS when populations are in low abundance and restricted in distribution is widely recognized as a critical component of AIS eradication and control (Vander Zanden & Olden, 2008). Environmental DNA methodology, whether on a single species or multiple species level of interrogation, provides sensitive and accurate early detection for AIS surveillance programs and has the added benefit of being able to simultaneously characterize community diversity when using metabarcoding. Of equal or greater importance, and as a call for future work in the field, hypothesis-oriented research is needed to develop predictive relationships between eDNA-based species presence, abundance and diversity measures and the biological and physical features of aquatic ecosystems and adjacent landscapes.
In addition to their utility as methods for early detection of AIS (McElroy et al., 2020;Sepulveda et al., 2020), eDNA metabarcoding surveys also provide an appealing approach for addressing questions related to the assembly of aquatic communities (Sard et al., 2019).
These questions are of increasing importance in aquatic systems as species invasions, climate change and habitat fragmentation pose substantial threats to biodiversity in the 21st century. Theory suggests environmental features (e.g. the size and isolation of habitat patches; MacArthur & Wilson, 1967) play an important role in determining the composition and structure of ecological communities.
Thus, lake size and patterns of connectivity may drive differences in species diversity and AIS presence and prevalence across habitats.
Similarly, biotic interactions (including those with AIS) and anthropogenic influences may also influence the number of species in a habitat or the establishment of invasive species (Gallien & Carboni, 2017).
At the same time, aquatic connectivity could lead to downstream dispersal of DNA molecules from established populations upstream.
Previous studies have documented declining eDNA copy number and rates of detection with distance (Deiner & Altermatt, 2014;Jane et al., 2015;Tillotson et al., 2018). However, results from Deiner and Altermatt (2014) suggest that positive detections may extend as much as 50 km downstream in flowing systems, raising concerns about false positive detections in highly connected networks of aquatic habitats. eDNA metabarcoding surveys, particularly when they are combined with traditional fisheries survey gear and data on features that correlate with diversity and abundance and allow agencies to more effectively direct AIS management activities.

K E Y W O R D S
aquatic invasive species, environmental DNA, fish community diversity, landscape ecology, metabarcoding, occupancy modelling the abiotic characteristics of lakes and surrounding landscapes, can thus help to clarify the relative influences and importance of characteristics such as area and isolation on aquatic community composition and the presence or prevalence of AIS.
When conducting landscape-scale analyses that quantify associations between measures of ecological diversity and environmental covariates or measures of anthropogenic disturbance, grain size, extent size and the spatial distance among sampling locations are important aspects of landscape data sampling (Anderson et al., 2010).
For example, previous studies have demonstrated strong associations between the composition of fish communities and environmental features when sampling was conducted over expansive areas (Sharma et al., 2011). In this study, we sample aquatic communities at a state-wide spatial scale. Information was gathered for 22 lakes in Michigan to compare eDNA metabarcoding estimates of fish community composition and detection of invasive species across a range of environments differing in lake area, connectivity to other aquatic systems, and degree of anthropogenic disturbance. The goals of this project were to identify associations between physical lake characteristics and fish community diversity or relative abundance of AIS, compare detection probabilities of traditional fish sampling gears and eDNA, and use eDNA to identify "hot spots" of AIS to inform proactive management actions in Michigan's inland lake habitats.

| Lake selection
Twenty-two lakes in Michigan were sampled over three consecu- Lakes were chosen based on physical and abiotic attributes including lake size (ha), depth (m), and the lake classification scheme of Wehrly et al. (2012), which incorporates information on lake depth, growing degree days and mean water temperature during the icefree period, to include a representative collection of lakes with diverse fish communities and limnological characteristics (Table 1).
Additional landscape features were also considered during lake selection, including hydrologic connectivity and land use patterns (forest, agriculture, urban).
Physical and environmental information was collected for all 22 sampled lakes from the Lake multi-scaled geospatial and temporal database (LAGOS-NE, https://lagos lakes.org/; Soranno et al., 2017).
Selected lakes include those frequented by anglers (Houghton Lake and Higgins Lake) as well as representatives from counties with the highest numbers of invasive species detections in the state (Cass Lake, Oakland Co. and Thompson Lake, Livingston Co.; Midwest Invasive Species Information Network, https://www.misin.msu.edu).

| Sampling and eDNA extraction
The study also included MDNR Fisheries Status and Trends (ST) survey information for 13 lakes that were sampled in 2016 and 2018 (Table 1). MDNR's ST survey deploys multiple gear types and standardized techniques to characterize fish community composition and detect invasive species in Michigan's lakes (Schneider, 2000). ST survey data were used to compare eDNA metabarcoding and traditional sampling gear (Table 2). Traditional gear types included boat electrofishing (boomshocker), seines, large mesh fyke nets, small mesh fyke nets, trap nets and experimental gill nets (for details on net dimensions and mesh sizes, see Schneider, 2000). Given the variation in sizes and maximum depths of lakes sampled in this study (Table 1), not all traditional gear types were used in all lakes and there was substantial variation in the total number of gear deployments across lakes (Table 2). A total of 499 traditional gear samples were collected across the 13 lakes and the number of sampling locations varied between 18 and 88 locations per lake. Fish collected by the MDNR field crews were identified to species. Fish counts per species were tabulated by gear type, sampling location and lake. eDNA and traditional gear sampling were conducted in the same locations within a lake when possible, with additional randomly selected eDNA samples in lakes with fewer traditional gear sampling locations. In lakes where traditional gear surveys were not conducted, eDNA sample coordinates were randomized. Sample numbers ranged from 30 to 59 across the surveyed lakes, with more sampling effort allocated to larger lakes, although samples were not proportionally distributed based on lake size (  Note: Lake sizes (ha) and landscape-scale environmental covariates are derived from the LAGOS-NE database (Soranno et al., 2017). Lake locations are shown in Figure 1.
a The maximum depth of the lake in metres (UNK -unknown, max depth for Dumont, Haithco, and Ocqueoc Lakes were not available in LAGOS-NE. Ocqueoc max lake depth was retrieved from the MDNR Aquatic Habitat Viewer).
b Lake classification according to Wehrly et al. (2012). No. boat ramps-The total number of boat accesses that were seen on satellite images of the lakes via Google Earth that meet criteria for public accessibility.
h Using estimates of county population from 2017 (from MI Dept. Health & Human Services) and Great Circle distance between county centroids and lake centroids. For each lake, the sum (across counties) of the population size of the county divided by the distance between the county and lake centroids (Euclidean distance, in km). Values in the column are converted to a 0-1 scale by dividing each by the maximum of these summed, distance-weighted, population sizes per lake.  of 2016 water samples (n = 446, from eight lakes), including benthic and surface eDNA sampling, was conducted according to Sard et al. (2019). In 2017, 95 water samples were collected and filtered from two lakes, and in 2018, 565 water samples were collected from 12 lakes. On average ~25.1% of samples per lake were of benthic zone water (range 20.0% to 29.6% per lake) and were collected using a Van Dorn sampler near the lake bottom. For each lake, six to nine negative control samples, consisting of one litre of distilled water ("no-DNA"), were filtered to provide a measure of quality control and quantify any contamination that occurred during water sampling. A Van Dorn negative (distilled water "no-DNA") was taken prior to use in each lake to quantify levels of contamination associated with the sampling device. Filtering (field) negatives were interspersed randomly with water samples to quantify contamination associated with the filtering apparatus. One litre of distilled water was also filtered as a negative control immediately after the last sample was processed to quantify levels of contamination arising during the sample processing. For each of the negatives, distilled water was poured into sterile 1-L wide-mouth Nalgene bottles and filtered for eDNA analysis.

TA B L E 2
Prior to each sampling event, all bottles were soaked in a 20% (V/V) bleach solution for 10 min to eliminate residual DNA contamination (Prince & Andrus, 1992), rinsed with distilled water, and left to dry for 24-48 hr at room temperature. Water samples were filtered on the boat immediately after collection using a Smith-Root ANDe™ eDNA extractions were carried out in a separate room away from the laboratory where genomic DNA was handled. All necessary materials and bench spaces were cleaned prior to use with either 25% (V/V) bleach, UV light, and/or DNA Away (Thermo Fisher Scientific, Inc.). All pipetting was done with pipettors used exclusively for eDNA samples and sterile filter tips. Also, one extraction negative control was included in each extraction event for each lake (some lakes had up to three extraction negatives) to test for contamination during the DNA extraction procedure.

| eDNA library preparation and sequencing
eDNA samples were amplified using primer sets targeting two mitochondrial rRNA gene regions, 12S (Riaz et al., 2011) and 16S (Deagle et al., 2009), to characterize fish community composition.

| Database development
The 12S

| Analysis of eDNA community matrix
Sequencing data from all 22 lakes were combined for subsequent analysis in Mothur v1.39.5 (Schloss et al., 2009)  Prior to analyses, contamination was considered for each sam- All community matrix analyses were conducted in r v3.5.3 (R Core Team, 2020) using the "tidyverse" (Wickham, 2017) package.
For each sample within a lake, the number of species (species richness) detected by 12S and 16S markers was calculated. Estimates of the Shannon index of diversity (Shannon, 1948) for each lake were also generated, under the assumption that metabarcoding read counts reflect relative abundance of fish species.

| Generalized Linear modelling
Generalized linear models (GLMs) were used to identify landscape and lake covariates that may explain variation in alpha diversity and invasive species prevalence across the 22 sampled lakes. Dependent variables evaluated included species richness, Shannon diversity, the proportion of sequencing reads attributed to AIS, and the number of AIS per lake.
All analyses were conducted separately for the two eDNA metabarcoding markers (12S and 16S). We also used species richness from traditional gear surveys as a response variable, for the subset of lakes (n = 13) where MDNR ST survey data were available. Shannon diversity, number of AIS and AIS prevalence from traditional gear survey data were not used as response variables in our GLMs, given expected gear selectivity and variation in effort and gear deployments across lakes.
Climatological, lake morphometric and connectivity data from the LAGOS-NE database (which includes information for more than 50,000 lakes >4 ha in surface area across 17 states ;Soranno et al., 2017) were included as independent variables in modelling analyses for measures of alpha diversity. Specific variables in our modelling analyses included: log-transformed lake area (in hectares), lake class (from Wehrly et al., 2012), stream density in the inter-lake watershed, log-transformed area of upstream lakes (greater than 4 hectares) and per cent developed or agricultural land in a 500 m buffer around each lake. Modelling analyses for AIS species richness and read proportions also included square-root transformed boat ramp density per lake and distance-weighted human population density (calculated as the sum of county census numbers from 2017, weighted by distance to each lake) as surrogates for potential anthropogenic influences. See Table 1 for complete descriptions of the environmental variables used in GLM analyses.
For each response variable, we evaluated single-variable models and two candidate models focused on "natural" (i.e. area, class, stream density, upstream area) and "anthropogenic" characteristics (distance-weighted population size, boat ramps per area and per cent development) of the sampled lakes. Poisson regressions were used for count-based response variables (i.e. number of AIS, total species richness), while normal regressions were used for continuous response variables (i.e. the proportion of reads attributed to AIS, Shannon diversity). Variables were log-or square roottransformed to meet model assumptions. GLMs were generated and plotted by using R functions from the "ggplot2" (Wickham, 2016), "tidyr" (Wickham, 2020), "dplyr"  and "varhandle" (Mahmoudian, 2020) packages. For each set of models, we conducted AICc and BIC model selection using functions from the "MuMIn" (Barton, 2020) and "stats" (R Core Team, 2020) packages.

| Occupancy modelling
A Bayesian hierarchical occupancy model (Kery, 2010;MacKenzie et al., 2002) was used to estimate gear-specific detection probabilities for three individual fish species in lakes that were sampled using both eDNA and traditional gear (n = 13). We focus this analysis on three surrogate species to represent species with similar body morphology and patterns of habitat usage (Gallien & Carboni, 2017;Procheş et al., 2008)  In the model, occupancy was defined at the lake level and it was assumed that occupancy did not change between sampling events.
Therefore, the detection probabilities generated from the analysis can be interpreted as the probability of a species being detected with a single sample of a given gear type, given that the species was present in the lake. The biological model of occupancy was defined as Z ls~B ernoulli(ψ s ), where ψ was the probability of occupancy for each species (s) and Z was the true occupancy for each species at each lake (l). The observation model was defined as Y i~B inomial(p sg , Z ls ), where Y was the detection at a given sample (i) and p was the detection probability for each species (s) and gear (g). The model was analysed using Markov Chain Monte Carlo (MCMC) methods in R with the "jagsUI" package (Keller, 2018). Three independent Markov chains were run with 4,000 iterations following a 1,000 iteration burn-in. Model convergence was assessed by visually inspecting chains and based on Rhat values near one.

| Heat maps
We used spatial interpolation methods to visualize patterns in species diversity and relative abundance of N. melanostomus (a common AIS in the Great Lakes drainage basin; Jude et al., 1992) within lakes (similar to maps in Jo et al., 2017). Given the limited number of water samples collected from each lake, surfaces of total fish community diversity (species richness) and relative abundance (to account for unequal sequencing effort across samples) of N. melanostomus were interpolated using inverse distance weighting (IDW; Bartier & Keller, 1996). Optimal weighting coefficients were identified by leave-one-out cross-validation for each lake (similar to the approach applied for interpolation of rainfall data in Chen & Liu, 2012). Briefly, species richness (or N. melanostomus relative abundance) values associated with each sampled location were sequentially removed from the dataset and predicted by IDW interpolation from the remaining data. Then the root mean squared error (RMSE) of the predictions associated with each weighting coefficient was calculated as, where SR i is the observed species richness (or relative abundance) associated with location i, Ŝ R i is the predicted species richness when that sample was removed, and n is the total number of water samples collected from a lake. Evaluated IDW coefficients ranged from 2 to 21 (larger values more severely limit the influence of distant samples) and we chose the coefficient that minimized RMSE for each lake.

| RE SULTS
The total number of sequence reads (excluding no-DNA controls) generated for the 12S and 16S markers were 9,165,523 and

| Alpha diversity
The number of fish species detected per lake using eDNA markers varied by over 3-fold (11 to 38 species) for 12S, while for 16S the number of species detected varied from 16 to 42 species (Table 2).
The highest number of species was detected in Pentwater Lake, which has a direct connection to Lake Michigan, and Five Channels Dam Pond. The number of species detected per lake was highly positively correlated between 12S and 16S markers (Pearson correlation 0.904, p < .001). eDNA metabarcoding data were also used to estimate Shannon diversity for each of the 22 lakes (Table 2).
There was no significant correlation between Shannon diversity  Table 2).

| Aquatic invasive species detection
Aquatic invasive fish species were detected in 21 lakes using at least one sampling method (Table S1). One to four invasive species were detected per lake (except Manistique Lake where no AIS were detected). Preliminary examination of detection patterns suggested that eDNA sampling was a more sensitive means of AIS detection than traditional gear approaches, as both eDNA markers detected AIS in a larger proportion of lakes (10 out of 13 lakes;  Table S1).
The number of fish AIS detected based on 12S and 16S metabarcoding markers was concordant (Pearson correlation R = 0.647, p = .001), differing by at most one species across the sampled lakes (Table 2). In cases where one marker detected an additional AIS, the detection was based on either one or two positive samples and few total reads (<0.1% of sequencing reads in all instances; Table S1), suggesting that differences are due to detections of low-abundance species (based on eDNA detections and proportional sequence counts). In some cases, the lack of detection may be explained by differences in the power of the markers to resolve sequences to different species or their failure to amplify certain taxa (e.g. P. marinus in Ocqueoc Lake; Table S2), which serves as a further justification for recommendations to use multiple markers in eDNA surveys (McClenaghan et al., 2020;McElroy et al., 2020;Zhang et al., 2018). The fish taxonomic databases for 12S and 16S also differed in species representation, depending on the availability of the genetic information in GenBank.
Fish AIS sequence reads as a proportion of total sequences (an eDNA surrogate measure of species biomass; Kelly et al., 2019) were positively correlated between 12S and 16S metabarcoding markers (Pearson correlation 0.928, p < .001) and varied among lakes from 0.000 (both 12S and 16S; Manistique Lake) to 0.688 (12S, Mullett Lake) and 0.719 (16S, Holloway Reservoir). Thus, at the high extreme, in Holloway Reservoir and Mullet Lake, F I G U R E 2 Proportion of aquatic invasive fish species reads per lake (a-12S, b-16S) for 22 Michigan lakes. More detailed information on the number of aquatic invasive species (AIS) and native species (Non_AIS) is given in Table 2 over half of the lake fish community sequences were from AIS ( Figure 2, Table S1).

| Associations between environmental variables and fish community diversity or AIS presence
Upstream lake area, a measure of hydrological connectivity, was the top supported model for 12S and 16S species richness (Akaike weights = 0.79 and 0.97, respectively), and exhibited a significant positive relationship with observed species richness from both markers (p < .05; Figure 3a,d, Table S3). Lake area was the top  These results further support our interpretation that area upstream is positively associated with increased species richness in the sampled lakes.

| Occupancy analysis
Occupancy analyses were conducted for a subset of species de-

F I G U R E 3
Associations between landscape-scale environmental covariates and fish community diversity using 12S and 16S eDNA metabarcoding: (a, d) upstream lake area (log +1 transformed) vs. species richness, (b, e) lake area (log transformed) vs. Shannon diversity, and (c, f) upstream lake area (log +1 transformed) vs. proportion of sequencing reads attributed to aquatic invasive species (AIS). Lake area is in hectares the traditional gear types deployed for ST surveys (where detection probabilities ranged from 0.01-0.53, 0.01-0.15 and 0.01-0.31 for the three species, respectively; Figure 4). However, eDNA was only significantly better than all six traditional gear types for C. carpio (Figure 4, centre). eDNA detection probability for N. melanostomus was not significantly higher than that of small mesh fyke nets and only the surface eDNA sample detection probabilities were higher than those for seines. In contrast, eDNA detection probability for A.
calva was not significantly different from trap net, boomshocker, or large mesh fyke net sampling gears. Jerde (2021)

| Spatial patterns of species richness and AIS abundance within lakes
Data from Pentwater Lake are presented here as an example of how information from eDNA metabarcoding surveys can be used to characterize spatial patterns in overall fish community diversity and AIS relative sequence abundance within a lake ( Figure 5; see Figure S1 for interpolated surfaces from other lakes). Interpolations for Pentwater Lake suggest higher species richness near the Pentwater River inlet ( Figure 5; see Table S4 for IDW power information), while reads attributed to N. melanostomus appear to be concentrated on the southern and western portions of the lake. Spatially explicit "hot spots" like these could be used in the future to prioritize AIS response efforts by identifying potential locations that would maximize the effectiveness of removal efforts.

| D ISCUSS I ON
The objectives and results from this case study extend historical single species and multi-species descriptive characterizations of species presence/absence, estimates of relative abundance and community diversity, and comparative analyses of the efficiency of different gear types (Hänfling et al., 2016;Sard et al., 2019). Specifically, our research goals were to spatially characterize fish biodiversity and AIS dispersion within lakes and to quantify detection probabilities and evaluate hypotheses regarding associations between AIS presence or species diversity and measures of hydrological connectivity, surrounding landscape features, and measures of human access to lakes.
Regression analyses and modelling as conducted here have been embraced as effective ways to identify causal explanations for invasive species distribution and abundance (Gallien & Carboni, 2017). Novel applications and results described for our case study here may guide future lakescape-scale monitoring and modelling that will advance agency abilities to identify factors that can negatively affect aquatic biodiversity and impact AIS distribution and abundance.
Analyses of associations between environmental variables and measures of fish community diversity and AIS presence demonstrate the importance of watershed connectivity. Best-supported models highlight the influence of upstream lake area (a measure of aquatic connectivity) on species richness of the fish community and the proportion of reads that are attributable to AIS (Figure 3, Table S3). In this study, the highest species richness was detected in relatively small Pentwater Lake, which has one of the largest upstream lake networks (250 lakes larger than 4 ha) among lakes sampled in this study. One potential explanation for the relationship between upstream area and fish species richness is that larger, and presumably more diverse, upstream habitats provide more sources for downstream dispersal and establishment in highly connected lakes (i.e. increasing colonization rates; MacArthur & Wilson, 1967).
Alternatively, the pattern could be the result of downstream transport of eDNA (Shogren et al., 2017) leading to increased detections F I G U R E 4 Estimates of gear-and species-specific detection probability in 13 lakes. Circles (eDNA) and triangles (traditional) represent posterior medians and bars represent 95% credible confidence intervals based on a hierarchical occupancy model in lakes with large amounts of upstream aquatic habitat. However, when species richness from ST survey data (using traditional sampling gears) was used as the response in our GLM analysis, the upstream area model was still favoured (Table S3), supporting our interpretation that increased connectivity, rather than eDNA transport, is a driver of fish diversity in the sampled lakes. Anecdotally, Pentwater Lake (where species richness was highest in both eDNA metabarcoding datasets and ST survey data) has a direct connection to the Great Lakes, which could also increase the diversity of the fish community by allowing for connectivity with, and immigration from, the diverse fish assemblage present in Lake Michigan. Our results also suggest an inverse relationship between lake area and measures of alpha diversity that incorporate relative abundance of community members (Shannon Diversity; Figure 3, Table S3). The relationship between upstream area and proportion of AIS reads is consistent with findings that species diversity, as quantified using either traditional fisheries survey gear or eDNA, scales positively with increasing upstream area. Larger upstream areas could also result in increased human access and use, additional suitable habitat for AIS, and propagule pressure into the focal lake. However, we were not able to explicitly quantify human use in lakes upstream of those sampled in this study. Future work could look to quantify human use and impacts at multiple spatial scales to explore this relationship. Results such as these illustrate the potential for eDNA metabarcoding data to address basic questions in community ecology, including those related to community assembly and invasibility.
In addition to their potential applications in community ecology, Previous studies have suggested that differences in physical environments may affect the distribution of species and that visualization of fish communities within each lake would help to determine representative study sites (Sakata et al., 2020). Here, novel applications of spatial interpolation of eDNA signal (species richness and relative abundance of N. melanostomus) are demonstrated, which could be used in the future to prioritize AIS response efforts by identifying potential locations that would maximize the effectiveness of removal efforts.
Similar spatial interpolation approaches were applied to examine relationships between eDNA signal and abundance of Japanese jack mackerel (Trachurus japonicus) in Maizuru Bay, Japan (Jo et al., 2017). These authors demonstrated a positive correlation between longer (719 bp) eDNA fragments and echo sounder surveys of T. japonicus distribution, suggesting that eDNA data can provide information on patterns of fish abundance and distribution (Jo et al., 2017). Other studies have demonstrated the usefulness of snapshot sampling for species monitoring (Bracken et al., 2019). As in our study, Bracken et al. (2019) used eDNA to reveal "hot spots" of vital habitat locations (including spawning sites) of their target species P. marinus, to help with possible conservation efforts. Implementing similar spatial interpolation approaches, potentially coupled with extensive sampling efforts, as necessary for more sophisticated interpolation methods, to identify "hot spots" of diversity or AIS prevalence would improve future large-scale community diversity studies. While these interpolated surfaces represent hypotheses relating to spatial patterns in relative abundance and species richness, managers may be able to make use of these patterns to guide control and eradication efforts targeting AIS. Specifically, future work should focus on determining the mechanisms that describe the distribution of eDNA in lakes, similar to what has been evaluated for lotic and marine systems (Andruszkiewics et al., 2019;Jane et al., 2015;Shogren et al., 2017;Wilcox et al., 2016). For example, environmental covariates, such as in-lake physical and biological characteristics should be collected to better quantify the relationship between eDNA detections and realized species occupancy and relative abundance Pilliod et al., 2014).

| CON CLUS IONS
The development and implementation of eDNA metabarcoding methods allow the rapid and cost-efficient collection of information on species diversity and composition of fish assemblages in aquatic habitats, which is of particular importance given the current increase in anthropogenic disturbance and associated declines in aquatic biodiversity in those ecosystems. Data in this study demonstrate that eDNA metabarcoding provides a sensitive means of collecting fish community information with several advantages relative to traditional survey gears.
Our study also documents significant positive relationships between the area of upstream lakes and the diversity of the fish community inhabiting a given lake, stressing the role of connectivity in determining fish community diversity. Our case study on 22 Michigan lakes provides several recommendations for future projects using eDNA metabarcoding datasets to address questions related to community assembly and invasibility. Data collected for this project identified several aquatic invasive species in Michigan's lakes that were not detected with traditional gear (Table S1)

CO N FLI C T O F I NTE R E S T
The authors declare that they have no conflict of interest.

PEER R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13370.

DATA AVA I L A B I L I T Y S TAT E M E N T
Datasets and R code for statistical analyses and raw eDNA metabarcoding sequencing data are available on Dryad (https://doi.