Coalescent models characterize sources and demographic history of recent round goby colonization of Great Lakes and inland waters

Abstract The establishment and spread of aquatic invasive species are ecologically and economically harmful and a source of conservation concern internationally. Processes of species invasion have traditionally been inferred from observational data of species presence/absence and relative abundance. However, genetic‐based approaches can provide valuable sources of inference. Restriction site‐associated DNA sequencing was used to identify and genotype single nucleotide polymorphism (SNP) loci for Round Gobies (Neogobius melanostomus) (N = 440) from 18 sampling locations in the Great Lakes and in three Michigan, USA, drainages (Flint, Au Sable, and Cheboygan River basins). Sampled rivers differed in size, accessibility, and physical characteristics including man‐made dispersal barriers. Population levels of genetic diversity and interpopulation variance in SNP allele frequency were used in coalescence‐based approximate Bayesian computation (ABC) to statistically compare models representing competing hypotheses regarding source population, postcolonization dispersal, and demographic history in the Great Lakes and inland waters. Results indicate different patterns of colonization across the three drainages. In the Flint River, models indicate a strong population bottleneck (<3% of contemporary effective population size) and a single founding event from Saginaw Bay led to the colonization of inland river segments. In the Au Sable River, analyses could not distinguish potential source populations, but supported models indicated multiple introductions from one source population. In the Cheboygan River, supported models indicated that colonization likely proceeded from east (Lake Huron source) to west among inland locales sampled in the system. Despite the recent occupancy of Great Lakes and inland habitats, large numbers of loci analyzed in an ABC framework enable statistically supported identification of source populations and reconstruction of the direction of inland spread and demographic history following establishment. Information from analyses can direct management actions to limit the spread of invasive species from identified sources and most probable vectors into additional inland aquatic habitats.

provide valuable sources of inference. Restriction site-associated DNA sequencing was used to identify and genotype single nucleotide polymorphism (SNP) loci for Round Gobies (Neogobius melanostomus) (N = 440) from 18 sampling locations in the Great Lakes and in three Michigan, USA, drainages (Flint, Au Sable, and Cheboygan River basins). Sampled rivers differed in size, accessibility, and physical characteristics including man-made dispersal barriers. Population levels of genetic diversity and interpopulation variance in SNP allele frequency were used in coalescence-based approximate Bayesian computation (ABC) to statistically compare models representing competing hypotheses regarding source population, postcolonization dispersal, and demographic history in the Great Lakes and inland waters. Results indicate different patterns of colonization across the three drainages. In the Flint River, models indicate a strong population bottleneck (<3% of contemporary effective population size) and a single founding event from Saginaw Bay led to the colonization of inland river segments. In the Au Sable River, analyses could not distinguish potential source populations, but supported models indicated multiple introductions from one source population. In the Cheboygan River, supported models indicated that colonization likely proceeded from east (Lake Huron source) to west among inland locales sampled in the system. Despite the recent occupancy of Great Lakes and inland habitats, large numbers of loci analyzed in an ABC framework enable statistically supported identification of source populations and reconstruction of the direction of inland spread and demographic history following establishment. Information from analyses can direct management actions to limit the spread of invasive species from identified sources and most probable vectors into additional inland aquatic habitats.

| INTRODUC TI ON
The establishment and spread of invasive species have caused devastating trophic cascades (Strayer, 2010) that have reduced the abundance (Gallardo, Clavero, Sánchez, & Vilà, 2016) and diversity (Hejda, Pyšek, & Jarošík, 2009) of native species. Rapid and undesirable shifts in biological community diversity often also result in substantial negative economic costs (Walsh, Carpenter, & Vander Zanden, 2016). Given the deleterious effects associated with biological invasions, conservation agencies have emphasized prevention of human-assisted (Lockwood, Cassey, & Blackburn, 2005) introductions of invasive species from native sources along with efforts to limit spread of invasive species in habitats that have experienced introductions. These actions are widely believed to be the most cost-effective control strategies (Mack et al., 2000).
However, unambiguous identification of source populations, vectors of dispersal, and demographic changes associated with colonization are often unavailable (Sakai et al., 2001;Lee, 2002). Such information is critical for recently colonized invaders that are increasingly targeted for management actions because this information can be used to prevent future introductions and spread of invasive species.
Alternatively, genetic data are widely used to infer founding history and population demographic events (Gaither et al., 2013;Lombaert et al., 2011) and to characterize the magnitude and direction of gene flow among locales (Alp, Keller, Westram, & Robinson, 2012;Estoup, Beaumont, Sennedot, Moritz, & Cornuet, 2004). Modelbased approaches that allow researchers to reconstruct histories of species invasions using demographic models (Estoup & Guillemaud, 2010) are becoming more common because models enable the rigorous evaluation of multiple competing hypotheses concerning the founding sources and demographic histories to inform conservation actions.
Much of the ABC work involving invasive species has focused on identifying invasion origins over large spatial scales or on inferring colonization history after long periods of time (dozens to hundreds of generations) following founding event(s) (Estoup et al., 2001;Pascual et al., 2007). Because rapid responses to recently detected invaders are important to the success of management actions (Anderson, 2005), agencies would benefit from information about colonization history during the early invasion stages. However, the strength of bottlenecks, as well as the limited time for population divergence in allele frequency due to genetic drift or occurrence of new mutations in different populations, makes inferences based on population genetic data challenging. Importantly, the generation time of the species and the effective population size (which determines the rate of coalescence in the population, Charlesworth, 2009) need to be considered because they affect the efficacy of using population genetic data for demographic inference (Gutenkunst, Hernandez, Williamson, & Bustamante, 2009).
Prior simulation studies and empirical applications have shown that recent events (on a coalescent time scale) pose substantial difficulty for accurate estimation of model parameters (Benazzo et al., 2015;Robinson, Bunnefeld, Hearn, Stone, & Hickerson, 2014;Robinson, Coffman, Hickerson, & Gutenkunst, 2014;Shafer, Gattepaille, Stewart, & Wolf, 2015). However, inferences on short timescales (<20 generations) may be improved by sampling larger numbers of loci (Robinson, Bunnefeld et al., 2014;Shafer et al., 2015) and/or individuals (Keinan & Clark, 2012;Lombaert et al., 2014). Next-generation sequencing technologies, including massively parallel sequencing of reduced representation genomic libraries (Andrews & Luikart, 2014), have progressed to the point that it is now possible to genotype hundreds of individuals at thousands of loci. Encouragingly, simulation studies suggest that aspects of colonization history can be correctly inferred in as few as 10 generations when using genomic-scale datasets in an ABC framework (Elleouet & Aitken, 2018 (Jude et al., 1992), as a result of human-assisted movements in the ballast water of container ships originating from the Black and Caspian Seas (Brown & Stepien, 2009). Round Gobies were observed in the St. Clair (STC) River in 1990, but by 1993 they had established populations near shipping ports in southern Lake Michigan (LKM) near Chicago, Illinois and central Lake Erie (LKE), near Cleveland, Ohio (Fuller et al., 2018). By 1995, the Round Goby had invaded all five Great Lakes. To date, Round Gobies have colonized much of the Great Lakes shoreline with estimated population sizes in the billions (Johnson, Allen, Corkum, & Lee, 2005). In 1996, the first inland invasion was reported in the Saginaw River Basin in Michigan, and currently, there are numerous examples of inland invasions that resulted from secondary spread from established Great Lakes sources (Fuller et al., 2018). The establishment of Round Goby populations in the Great Lakes and inland waterways within the basin has affected predator-prey dynamics, led to direct competition with native species for resources, and enabled the spread of pathogens (reviewed by Kornis, Mercado-Silva, & Vander Zanden, 2012).
The rapid colonization of the Great Lakes by Round Gobies and their secondary spread to inland systems may have resulted from multiple Great Lakes source populations and different dispersal mechanisms. First, the secondary spread of Round Gobies may have involved both limited natural dispersal and long-distance movements in the ballast water of container ships operating within the Great Lakes (Bronnenhuber, Dufour, Higgs, & Heath, 2011;Johansson et al., 2018;LaRue, Ruetz, Stacey, & Thum, 2011). In addition, anglers moving and releasing unused live bait inland may have aided secondary invasions (Drake & Mandrak, 2014;Johnson, Ricciardi, & Carlton, 2001;Leung, Bossenbroek, & Lodge, 2006). However, the specific source(s) of the inland Round Goby invasions, the demographic history associated with founder events, and the mechanisms by which the species colonized fragmented inland systems remains unclear.
In this study, population genomics data and ABC were used to compare models of the Round Goby invasion around the Lower Peninsula of Michigan in Great Lakes waters, as well as within three inland river basins that are fragmented by one or more dams and impoundments. The candidate models included alternative source populations for each inland invasion and several potential colonization routes. Similarities across the three sampled drainages (e.g., in putative source populations) were of particular interest, in that they could indicate shared mechanisms of inland spread. The power of the ABC approach for model selection and the accuracy and precision of parameter estimates were assessed using simulated data. Management actions informed by research outcomes as described in this study for a single target invasive species would have wide-ranging implications because numerous invasive species have similar life histories and spread dynamics.  Figure 1). These three river basins were chosen because they vary in levels of recreational use, distance from populated areas, and number or type of humanconstructed barriers that impede natural upstream movement, all of which have been identified as important factors associated with the secondary spread of aquatic invasive species (Drake & Mandrak, 2010Johnson, Olden, & Vander Zanden, 2008).

| Field sampling and library preparations
All tissue samples were preserved in 95% ethanol in the field and returned to the lab for DNA extraction.
DNA libraries were prepared using the "BestRAD" protocol de- Detailed sequencing and bioinformatics methods are included in Supporting Information Appendix S1.

| ABC model development
We defined a series of plausible models representing a series colonization of events in the Great Lakes in nearshore habitats around Michigan to characterize relationships among Round Goby populations in the Great Lakes. Specifically, based on USGS observation data (Fuller et al., 2018), ports near Chicago in LKM (1993)  to ascertain whether Round Goby populations in LTB, or in the lower Cheboygan River, were colonized as a result of range expansions from the original introduction in the STC River, or as a result of expansion from the early colonization of LKM harbors. We used the ABC framework to quantitatively evaluate four hypotheses that differed in the location of the juncture between populations derived from population expansion from Lake STC through Lake Huron and expansion through LKM (Supporting Information Appendix S1, Table A1.1). We delineate populations geographically given the large distances separating all sampled populations in the Great Lakes ( Figure 1) and the man-made barriers that separate the inland segments of sampled river drainages (e.g., Holloway Reservoir, Mott Lake, etc.; Figure 1). While downstream connectivity of individual segments is likely, the presence of dams in the system prevents natural dispersal upstream. Furthermore, as our goal is to provide useful information to managers, we maintained the natural geographic organization of individuals into populations, rather than pooling to reduce complexity of population tree topologies projected under alternative hypotheses of invasion spread (gene flow). Pooling sampling locales within drainages would limit plausible colonization and dispersion scenarios to be explored in our models and the efficacy and interpretability of parameter estimates from our analysis.
We tested multiple competing hypotheses to explain how Round Gobies colonized each of the three sampled drainages (Flint, Au Sable, Cheboygan Rivers) in an ABC framework to assess whether consistent patterns among secondary invasions from the Great Lakes to inland systems could be identified (i.e., inland locales shared the same source, exhibited similar directionality [upstream vs. downstream] following colonization, exhibited similar founding effective sizes, etc.). Models shared a common underlying topology (i.e., population tree, see Supporting Information Appendix S1, Figure A1.1), but varied in the source population associated with each inland invasion. For each inland system, we also considered models that allowed separate introductions into multiple inland populations from a shared source (as in Benazzo et al., 2015). Given the spatial separation of the sampled systems ( Figure 1), both alternative and shared source populations were considered in candidate models. Models were informed by data available in the USGS Nonindigenous Aquatic Species database (Fuller et al., 2018). The number of models differed between systems (13 Flint models; 17 Au Sable models; 21 Cheboygan models; Supporting Information Appendix S1, Tables A1.2-A1.4). Given the scale of the analyses conducted, we provide model development details in Supporting Information Appendix S1.

| Simulations and statistical analysis
We used the STACKS parameters noted above to set minimum sample sizes from each population (80% of the number of sampled individuals) to allow assembly of a complete genotypic matrix (i.e., no missing data) for each population to be used in analyses. Briefly, for each SNP locus, n i nonmissing individual genotypes (with n set to 80% of the per-population sample size) were randomly selected without replacement from the initial dataset for each of the i sampled populations. This practice preserved associations between variants at each locus within individuals, but breaks associations between SNP loci within individuals by randomly sampling genotypes for the n i individuals included in the reduced dataset from each population. This process was conducted to improve computational efficiency because preliminary simulations indicated that replicating the pattern of missing data in the observed dataset was timeintensive (~10× increase in time). Importantly, the above procedure Statistics related to genetic divergence included pairwise (and total) F st (using a simple, but computationally efficient estimate based on heterozygosity, G ST ; Nei, 1975), the number of SNP loci that fall along the x and y-axes of the joint site frequency spectrum (jSFS) for each pairwise population comparison (i.e., the number of private sites in each pairwise jSFS), and the mean frequency of private (minor allele) sites in each pairwise jSFS. The additional paired statistics provide information related to the directionality associated with gene flow (Slatkin, 1985). Given the paired nature of several of the above measures of divergence, the total number of summary statistics varied with the number of simulated populations considered for the four ABC analyses conducted here (see Supporting Information Appendix S1). All statistics were calculated in the R statistical computing environment (R Core Team, 2016), using a combination of functions from the "strataG" R package (Archer, Adams, & Schneiders, 2017) and custom scripts (available on GitHub: nicksard/2019_Goby_ABC_scripts). Approximate Bayesian computation analyses were conducted using the R packages "abc" (Csilléry, François, & Blum, 2012) and "abcrf" (Marin, Raynal, Pudlo, Robert, & Estoup, 2017). Model selection analyses were carried out using neural network  and random forest (Pudlo et al., 2015) methods.
Neural networks included five hidden layers and models in the Flint, Au Sable, and Cheboygan analyses were compared at tolerances of 0.001 and 0.005, whereas models tested for the Lower Peninsula analysis were compared at tolerances of 0.01 and 0.005, due to the smaller number of total simulations considered for this analysis. Each random forest analysis used 1,000 trees to predict the most likely model. Models receiving substantial support from these analyses were subsequently used for parameter estimation. Estimated model parameters included contemporary effective size of inland populations, bottleneck severity parameters, and migration rates among locales (see Supporting Information Appendix S1 for more details), again using neural networks  to estimate posterior distributions for model parameters at tolerances of 0.01 and 0.05.

| Evaluating the quality of inferences
Pseudo-observed datasets (PODS) were used to assess the power of our ABC approach for model selection and the accuracy and precision of parameter estimates (Bertorelle et al., 2010). Individual simulations from the reference table (consisting of summary statistics for each of the 1 million replicates per model tested, and the randomly drawn parameter values used to generate each dataset) were used as PODS in leave-one-out cross-validations to test the ability of the ABC analysis to correctly assign known datasets to the model under which they were simulated (Csilléry et al., 2012;Robinson, Bunnefeld et al., 2014;Robinson et al., 2013;Shafer et al., 2015). One hundred cross-validation replicates per model were conducted for each of the simulated models. These analyses allowed the estimation of misclassification rates for each candidate model (similar to type I and type II error rates; Bertorelle et al., 2010, Lombaert et al., 2014. For the random forest model selection analysis, we recorded the "out-ofbag" error rates (proportion of datasets misassigned to alternative models) returned during the construction of the random forest classifier (Pudlo et al., 2015). Due to the number of models considered and the number of populations included in each model, cross-validations to assess model identifiability in the Cheboygan River system were based on a smaller reference table including 100,000 simulations per model (rather than the full one million simulated) and a single tolerance (0.01, corresponding to 21,000 accepted simulations) to reduce the computational burden associated with this analysis.
We expect that this reduced dataset provides a conservative estimate of the ability of ABC to distinguish the candidate models for this system (i.e., we expect either no difference or improvements in model identifiability with the larger reference table used to analyze empirical data).
To   shared histories. For brevity, we report only F st pairwise comparisons; however, information from pairwise jSFS was also used in the ABC analyses below (see Section 2 for more information).

| Reconstruction of secondary dispersal
Our ABC analyses of the colonization history of nearshore Great Lakes waters around Michigan's Lower Peninsula indicated that Round Goby populations from LKM, including LTB, Grand Traverse Bay, and Muskegon River (Figure 1), were colonized from a LKM source, whereas the Cheboygan River population was derived from populations in Lake Huron (Table 2) Lakes sources with the highest support included SAB, Lake STC, and an unsampled LKM population. The same colonization process (multiple introductions from the same source) was also favored using random forest model selection, but from a LKE source (Pr = 0.83).
Analyses of Round Goby colonization of the Cheboygan river system indicated east (Mullett Lake) to west (Burt Lake) colonization within the system ( connects the Cheboygan River and Mullett Lake to Lake Huron, which is similar to the Local_SAB in the Flint River analysis (Figure 1). We lacked the power to estimate parameters associated with migration rates among populations in all inland systems. However, we did observe parameter posterior distributions that differed from prior distributions for bottleneck severity parameters (proportional reduction from contemporary N e ). Initial Round Goby movements within the Great Lakes appear to have been accompanied by relatively minor reductions in N e . We estimate that N e for Round Gobies presumed to have been established via ballast water assisted movement within the Great Lakes were 15% (SF) that of contemporary N e (Supporting Information Appendix S2: Figure A2.2). Posterior distributions for the severity of bottlenecks associated with natural colonization events following initial colonization indicate that initial founding size for Round Goby dispersing naturally from shipping ports was 9% to 10% (95% HDI, Supporting Information Appendix S2: Figure A2.2) that of contemporary N e per population.   Cross-validation analyses indicate that posterior distributions provided informative estimates for some parameters (Supporting

F I G U R E 3
Model support for each hypothesis tested within each approximate Bayesian computation analysis. Model support values for neural network at reported at high and low tolerances. High and low tolerances for the Flint River, Au Sable River, and Cheboygan River analyses were 0.005 and 0.001, respectively, whereas they were 0.01 and 0.005 for the Lower Peninsula analysis. Random forest model support values represent the proportion of votes for a given model during the decision tree process. See Table 1 for location abbreviations  and Tables A1.1-A1.4 (Supporting Information Appendix S1) for descriptions of each model tested. Abbreviated hypothesis names represent the source of the introduction (name to the left of the underscore), and the first location(s) that were founded with the system (to the right of the underscore). If multiple introductions occurred, they are indicated with a "+" (e.g., MTL + HWR represents introductions into both Mott Lake and Holloway Reservoir at the same time) Information Appendix S2: Figure A2.10-A2.13). Importantly, prediction errors were smaller for bottleneck severity parameters associated with founding a river (RF) within an inland system (range: 0.15-0.38) compared to founding a new port in the Great Lakes via movement associated with the shipping trade (SF, range: 0.34-0.71).
Meanwhile, prediction error associated with migration parameters were high (>0.98), indicating that reliable estimates of migration rates could not be obtained in any of the four ABC analyses. In contrast, prediction errors associated with contemporary N e of inland sampling locations were relatively low across all four ABC analyses (range: 0.18-0.57).
Posterior predictive simulations were used to assess how well the most supported models, and associated parameter estimates, fit observed summary statistics from each dataset. In all four ABC analyses, we found that simulated models produced distributions of summary statistics that overlapped with observed data in a majority of cases,

| D ISCUSS I ON
We used population genomic data in an ABC framework to evaluate multiple competing hypotheses to explain the sources and demographic parameters associated with the secondary spread of Round

Gobies around Michigan waters of the Great Lakes and into inland
Michigan waterways. The use of genomic data was critical because thousands of SNP loci allowed detection of subtle signals of colonization history, despite short timescales (<15 generations) since the initial introduction of the species into the Great Lakes and associated inland waters. The quality of inferences from ABC was assessed using cross-validation analyses, which quantified the power for ABC model selection and the accuracy and precision of parameter estimates. Cross-validation results indicated that invasion routes (e.g., upstream or downstream, multiple introductions) in the Great Lakes nearshore habitats and into inland waters could be reliably inferred.
However, model selection was unable to always distinguish source populations. Strong support for specific source populations was indicated for the Flint River, which allowed robust estimation of demographic parameters associated with founding events (N e , bottleneck severity). In other inland systems, model selection supported general F I G U R E 4 Prior (gray lines) and posterior probability densities for contemporary effective population size estimated in the Flint analysis. Dotted and solid lines represent posterior estimates at 0.05 and 0.01 tolerances for each supported model. LocalSAB (blue) and SAB_HWR (green) models are represented. See Table A1.2 (Supporting Information Appendix S1) for description of the models model features (e.g., multiple introductions into the Au Sable River system), though power was insufficient to identify a single source population. Broadly, the model identifiability dynamics in these analyses were consistent with previous research that noted the difficulty in distinguishing structurally similar models in ABC analyses (Cabrera & Palsbøll, 2017). Overall, our results show that ABC provides a flexible framework for reconstructing recent colonization histories for nascent invaders that, when coupled with population genomic datasets, can provide robust inference of certain aspects of invasion history.

| Colonization histories of inland waterways
Findings from the Lower Peninsula analysis support earlier reports based on an unsupervised clustering (STRUCTURE) analysis (Snyder & Stepien, 2016)  In the Flint River, the two well-supported models indicated the system was founded by Round Gobies from SAB. Both models also indicated that initial colonization of the Flint River involved a severe population bottleneck. This could explain both the low diversity and high population divergence as previously noted by other authors (Bronnenhuber et al., 2011;Johansson et al., 2018) and observed as part of this study. The supported models differed in inferences F I G U R E 5 Boxplots depicting 100 leave-one-out cross-validations for neural network analysis at tolerances 0.005 (Light gray) and 0.001 (dark gray) for the two supported models in the Flint River analysis. Abbreviated hypothesis names represent the source of the introduction (name to the left of the underscore), and the first location(s) that were founded with the system (to the right of the underscore). If multiple introductions occurred, they are indicated with a "+" (e.g., MTL + HWR represents introductions into both Mott Lake and Holloway Reservoir at the same time). See Table 1 for location abbreviations and Table A1.2 (Supporting Information Appendix S1) for descriptions of each model tested concerning system colonization events. The local model (Local_SAB) infers that humans (potentially anglers) moved Round Gobies upstream above dams in a stepwise manner, while the SAB_HWR model indicates a greater geographic overland movement and single introduction into Holloway Reservoir from SAB followed by downstream dispersal of Round Gobies to uninhabited river segments below the Holloway Reservoir dam. While both models infer "local" (e.g., SAB) sources, either model could indicate invasions were mediated by local angler-assisted movements or by the commercial bait industry, as has been documented in other systems (Drake & Mandrak, 2010).
Similarities in competing model structure likely affect our inability to distinguish them (Cabrera & Palsbøll, 2017)  obtained under alternative models receiving support (Robinson, Bunnefeld et al., 2014). The bottleneck severity parameter posterior distributions across the supported models indicate weaker founding bottlenecks than estimated for the Flint River. There are several possible explanations for why ABC inference of source population was not strongly supported. First, as noted above, the similarity among candidate models may reduce model identifiability (Cabrera & Palsbøll, 2017). Additionally, the supported models indicate the possibility of considerable migration into the system from multiple sources and subsequent admixture. Furthermore, several features of the parameterized models, including multiple introduction events into the system and a lack of a strong bottleneck associated with colonization, could contribute to the retention of diversity in the Au Sable River populations relative to populations sampled from inland segments of the Flint River system. Given that N e priors performed reasonably well in other ABC analysis, it seems likely that the similarity in alternative models, short timescale of the invasion (i.e., limited genetic differentiation of potential source populations), and influx of diversity into the Au Sable from multiple founding events interact to limit our ability to reliably identify source populations.
In the Cheboygan River, we found support for Round Gobies invading the system through the lock (LocalEast) or via long-distance movement from Saginaw Bay (SAB_MLL), potentially associated with bait release from transient anglers. Differences in contemporary N e for Mullett Lake between the two supported models could be used to test their relative support in subsequent analyses. Such analyses were not completed as part of this study because we did not collect additional, independent genotypic information to estimate contemporary N e . Importantly, both supported Cheboygan River models involve colonization of Mullett Lake before Burt Lake, providing corroborating evidence for this specific colonization pattern. Posterior predictive simulations were conducted as post hoc assessments of the goodness of fit of the parameterized models

| CON CLUS IONS
Testing competing hypotheses concerning aquatic invasive species colonization history using population genetic summary statistics in an ABC framework has become increasingly common. However, many studies have evaluated colonization histories that span broad spatial and temporal scales, which help to amplify signal in the data and facilitate identification of source populations and estimation of associated demographic parameters. The application of SNP data from reduced representation genomic sequencing libraries in an ABC framework is an area of active research (Robinson, Bunnefeld et al., 2014;Shafer et al., 2015). Here, from a genomics perspective, we used SNP data generated from reduced representation libraries and ABC to reconstruct invasion history on a very short time frame (<15 generations). Our results provide crucial insight into the sequence of inland colonization events and estimates of effective population sizes and bottleneck severity in individual systems (along with uncertainty). Importantly, using the ABC framework, we were able to evaluate confidence in inferences being made through cross-validations and posterior predictive simulations. Despite limited signal and a recent time frame, ABC analyses provide useful inference of routes of inland invasion and, to a lesser extent, possible source populations for Round Goby across three Michigan drainages.

| Management implications
Modeling outcomes indicate a pattern of stratified dispersal during inland invasions, defined as an initial colonization event in each system (likely facilitated by human-assisted movements) followed by natural dispersal. Conceptually, the invasion patterns described here are similar to patterns previously suggested for the Round Goby invasion of the Great Lakes (Brown & Stepien, 2009;LaRue et al., 2011;Snyder & Stepien, 2016) and their tributaries (Bronnenhuber et al., 2011). The stratified dispersal hypothesis has repeatedly been suggested as a mode of movement for other invasive species, including crayfish (Puth & Allen, 2005), multiple native and invasive plant species (Myers, Vellend, Gardescu, & Marks, 2004), mussels (Heiler et al., 2013), insects (Muirhead et al., 2006;Suarez, Holway, & Case, 2001), and fish (Johnson et al., 2008).
The ABC analyses conducted here enabled elimination of specific sources or routes for each invasion (despite little structure among potential sources tested, Figure 2) and provided reliable estimates of founding and contemporary N e of the populations evaluated.
Based on the supported models, inferences of how Round Gobies entered and dispersed through the inland system(s) can be made.
Each of these components enables the identification of targeted areas to mitigate against further spread of the species. For instance, SAB source models were supported, at least to some extent, in each This threat can be mitigated with restrictive bait use regulations and working with the commercial bait industry to develop aquatic invasive species hazard analysis and critical control point plans.
Broadly, the identification of invasion routes and estimates of N e aid in furthering our understanding of colonization and spread dynamics that can be useful for rapid response for new invasive species (e.g., fish or crayfish) that would have similar mechanisms for spread (e.g., bait bucket release, natural dispersal, overland dispersal via recreational equipment). For example, understanding Round Goby spread dynamics in a few infested rivers could inform actions for other rivers if they become invaded. Actions could include bait bans, lock closures, targeted outreach, targeted enforcement initiatives, or installation of control or deterrent technologies (e.g., Parker et al., 2015). Collectively, combining genomic-scale datasets in an ABC framework provides useful information to resource agencies regarding historic, recent, and potential future species invasions. Meek, for helpful discussions regarding models development and earlier drafts of the manuscript. Finally, we thank two anonymous reviewers for helpful comment that greatly improved the manuscripts.

CO N FLI C T O F I NTE R E S T
None declared.