Modelling the invasion history of Sinanodonta woodiana in Europe: Tracking the routes of a sedentary aquatic invader with mobile parasitic larvae

Abstract Understanding the invasive potential of species outside their native range is one of the most pressing questions in applied evolutionary and ecological research. Admixture of genotypes of invasive species from multiple sources has been implicated in successful invasions, by generating novel genetic combinations that facilitate rapid adaptation to new environments. Alternatively, adaptive evolution on standing genetic variation, exposed by phenotypic plasticity and selected by genetic accommodation, can facilitate invasion success. We investigated the population genetic structure of an Asian freshwater mussel with a parasitic dispersal stage, Sinanodonta woodiana, which has been present in Europe since 1979 but which has expanded rapidly in the last decade. Data from a mitochondrial marker and nuclear microsatellites have suggested that all European populations of S. woodiana originate from the River Yangtze basin in China. Only a single haplotype was detected in Europe, in contrast to substantial mitochondrial diversity in native Asian populations. Analysis of microsatellite markers indicated intensive gene flow and confirmed a lower genetic diversity of European populations compared to those from the Yangtze basin, though that difference was not large. Using an Approximate Bayesian Modelling approach, we identified two areas as the probable source of the spread of S. woodiana in Europe, which matched historical records for its establishment. Their populations originated from a single colonization event. Our data do not support alternative explanations for the rapid recent spread of S. woodiana; recent arrival of a novel (cold‐tolerant) genotype or continuous propagule pressure. Instead, in situ adaptation, facilitated by repeated admixture, appears to drive the ongoing expansion of S. woodiana. We discuss management consequences of our results.


| INTRODUC TI ON
Invasions of non-native species, defined as a rapid geographical spread and increase in local population size outside the species original range, often threaten native species, communities, and ecosystems (Lockwood, Hoopes, & Marchetti, 2013). Invasive species can cause major ecological and economic problems (Blackburn et al., 2014) and an understanding of what predisposes particular species (and particular populations of those species) to become invasive is one of the central questions in contemporary research in ecology and evolution (Strayer, 2012).
Admixture between individuals from different source populations that meet and interbreed in a non-native range has been considered as one of the key characteristics driving successful biological invasions (Verhoeven, Macel, Wolfe, & Biere, 2011). Admixture can explain an apparent genetic paradox of invasions when genetically depauperate populations adapt rapidly to a new environment (Roman & Darling, 2007). Admixture ameliorates inbreeding depression (Lallias et al., 2015) and enables mixing of genotypes from a different genetic background and the evolution of novel, transgressive phenotypes (Bock et al., 2015). Admixture also implies a stronger propagule pressure, another outcome that is positively related to invasion success (Lockwood et al., 2013). Adaptive evolution on genetic variation in non-native populations may also result in strong selection on traits related to invasiveness, perhaps via genetic accommodation of phenotypic plasticity through its exposure in a novel environment (Bock, Kantar, Caseys, Matthey-Doret, & Rieseberg, 2018).
Sinanodonta woodiana (Lea) is a freshwater mussel with a native distribution in East Asia that has invaded several regions outside its native range over the last 50 years. It is a benthic species associated with soft sediment and is known to tolerate low water quality in terms of organic and inorganic pollution (Li et al., 2015).
Adult S. woodiana combine benthic and filter-feeding modes (Kim, Lee, & Hwang, 2011) and are highly efficient in depleting sestonic food in the environment (Douda & Čadková, 2017). Female S. woodiana brood offspring in their gills and release ripe larvae (glochidia) into the water column where they attach to a fish host to complete development and metamorphose into a free-living juvenile mussel.
While detected in the wild in Europe in 1979 in western Romania (Sárkány-Kiss, 1986) and in southern France in 1982 (Adam, 2010), the species may have been present in aquaculture facilities in Eastern Europe since 1959 (Watters, 1997 Watters, 1997). After an isolated record from France in 1982 with a trace to the Hungarian source (Adam, 2010), local establishment in southern Hungary (1988) and in a system of artificially heated lakes in central Poland (1993), the species appeared to show no further signs of dispersal and it was long considered a thermophilic species (Kraszewski & Zdanowski, 2001) with low invasive potential. However, S. woodiana started to spread in the first decade of the 21st Century (Lajtner & Crnčan, 2011) and its populations are now recorded from much colder habitats, including subalpine lakes in northern Italy (2010) (Kamburska, Lauceri, & Riccardi, 2013) and regions that are subject to relatively long winters (southern Sweden in 2005) (Svensson & Ekström, 2005).
At least two European populations of S. woodiana possess separate morphotypes on the basis of their shell shape, with a distinct coevolutionarily driven response to a parasitic fish in Europe . A recent study of two Italian S. woodiana populations demonstrated that the two morphotypes are not genetically distinct at a mitochondrial marker (Guarneri et al., 2014), suggesting that morphological variability of S. woodiana populations in Europe represents a plastic response to environmental conditions (a common feature of unionid mussels) rather than genetically determined variability (Soroka & Zdanowski, 2001 a set of specifically designed nuclear microsatellite markers (Popa, Bartáková, Bryja, Reichard, & Popa, 2015;Popa et al., 2011 (Adam, 2010), and a group of populations from Poland (PLSZ, PLLI and PLKO) was from a region where S. woodiana was known since 1993 (Soroka, 2000). Mussels were sampled by visual and tactile searching of stream and lake substrates. A piece of mantle tissue was excised from each mussel (Berg, Haag, Guttman, & Sickel, 1995), and the tissue samples (or entire specimens) were preserved in 95% ethanol in the field and subsequently stored at −80°C prior to DNA extraction.

| DNA extraction and genotyping
Genomic DNA was extracted using the NucleoSpin ® Tissue kit. A 370 bp fragment of the mitochondrial gene for cytochrome oxidase c subunit I (COI) was amplified using the primers LCO1490 and HCO2198 (Vrijenhoek, 1994). A set of 17 microsatellite loci developed specifically for S. woodiana  was genotyped using the Qiagen Multiplex PCR kit (QiagenTM). Alleles were scored in GENEMAPPER v. 5.0 (Applied Biosystems, Foster City, USA) and double-checked manually. The presence of null alleles for each locus and population was assessed with FreeNA (Chapuis & Estoup, 2007). Detailed protocols are available in Supporting Information (Appendix S1).

| Mitochondrial data analysis
A total of 127 COI sequences were analyzed (62 produced in this study and 65 retrieved from GenBank; 78 from Europe and 49 from Asia) (Supporting Information Table S1). Genetic diversity within populations and within continents (i.e., native and non-native ranges) was estimated as haplotype diversity (Hd; a probability that two randomly sampled alleles are different) and nucleotide diversity (Pi; average number of nucleotide differences per site in pairwise comparisons among sequences) (Nei, 1987), calculated in DnaSP 5.10.1 (Librado & Rozas, 2009). Sequence variation was visualized as a haplotype network using the median-joining algorithm in Network 5.0.1 (Bandelt, Forster, & Röhl, 1999). The number of base differences per site between groups of sequences was computed in MEGA6 (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013).

| Genetic diversity and structure
Of the 17 originally developed loci, a high frequency of null alleles (mean >0.07 per population) was detected at seven loci and they were excluded from all subsequent analyses to avoid the biases caused by null alleles such as an increase in interpopulation divergence (Chapuis & Estoup, 2007 Slatkin, 1995) and geographic distances (Euclidian distances in km) among European populations (isolation by distance) was analyzed using a Mantel test (999 permutations) in GenAlEx 6.501 (Peakall & Smouse, 2012).
Linear geographic distances were used because of an a priori expectation that long-distance dispersal (between river basins) arose from human-mediated transport rather than natural dispersal on fish hosts. Differences in genetic diversity (allelic richness, the unbiased estimate of the gene diversity (GD: Nei, 1978) and F ST between the introduced European and native Chinese populations were tested using a two-sided permutation test (1,000 permutations) implemented in FSTAT.
The spatial genetic structure of European populations was investigated with the Bayesian approach implemented in STRUCTURE version 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). First, we analyzed native and non-native populations in a single analysis. This analysis provided a clear distinction between native and non-native populations at K = 2, with no mixing between Chinese and European populations at higher K values (Supporting Information Figure S1).
We, therefore, ran a second analysis, in which, only samples from non-native European range were included, because we were primarily interested in the relationship between populations in the nonnative range. In that analysis, we assumed an admixture model, in which the algorithm assigns proportions of individual genotypes to each of the clusters. We performed 20 independent runs for each K value (from 1 to 11), with different values of the Dirichlet α parameter for each assumed cluster. We used a model with correlated allele frequencies (Falush, Stephens, & Pritchard, 2003) and with sampling locations as prior information to assist clustering (i.e., LOCPRIOR model sensu Hubisz, Falush, Stephens, & Pritchard, 2009). Each run included 300,000 burn-in iterations followed by 600,000 iterations.
Post-processing of the STRUCTURE output was performed with the CLUMPAK software (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015) to infer pairwise similarity between Q-matrices for each K. We used the LargeKGreedy algorithm, random input order, and 2,000 repeats. We identified different modes from the results of the 20 runs for each K value at a threshold of 0.9 for similarity scores.
We averaged individual membership proportions for all runs in the same mode and produced summary bar plots for a given K value.
We displayed the probability of the data (ln p(K|D)) for each K value.

| Inference of invasion pathways in Europe by approximate Bayesian computation
For the analysis of invasion pathways in an Approximate Bayesian for ABC analysis (Table 1).
The relationship among the 11 populations was modelled by comparing posterior probabilities for various models (i.e., evolutionary scenarios), using DIYABC 2.0.4 (Cornuet et al., 2014). This coalescent-based method enables the definition and comparison of a set of demographic and evolutionary scenarios with the aim of explaining the evolutionary history and relationships among contemporary populations. DIYABC generates simulated datasets for each scenario and compares selected summary statistics of the real dataset to those of simulated datasets to calculate the posterior probabilities for each tested scenario.
In the absence of a robust a priori hypothesis for the invasion history of S. woodiana in Europe, and given high admixture, we used an exploratory ABC approach. We modelled relationships (origin) within each pair of populations by building 55 independent "pairwise ABC comparisons," an approach similar to Miller et al. (2005) on population triplets. Each comparison was based on the same set of competing models (evolutionary scenarios), the same set of microsatellite markers, and the same prior distribution of parameters. The most likely scenario was always inferred from a set of 12 models (i.e., possible scenarios) for each pair of populations, differing in the identity of the source populations and the presence of admixture (Supporting Information Figure S2). The population pair (populations labelled as F = "first" and S = "second" in Supporting Information Figure S2 and Table S2)  The definition of the 12 scenarios in the DIYABC is reported in Supporting Information Figure S2. The prior distributions of the parameters were uniform, and their ranges were as follows: effective population sizes 10-20,000 individuals, timing of events 2-100 generations ago, admixture rate 0-1, bottleneck duration 0-20 generations, and effective population size during bottleneck (i.e., the number of founders) 2-500. Both ancestral and unsampled populations were "ghost" populations, that is, neither of them was sampled. Given that the ABC analysis does not take geographic information into account, we were able to distinguish between ancestral and unsampled populations using different temporal conditions, with any split from A being considered, a priori, older than all other subsequent events (for the list of conditions between parameters see Supporting Information Figure S2).

| Mitochondrial diversity
An alignment of partial COI sequences (370 bp) obtained from 78 individuals from Europe and 49 individuals from Asia (Supporting Information Table S1)

| Microsatellite genetic diversity
Genetic variability over 10 microsatellite loci in 369 individuals from 16 European populations was two to 16 alleles per locus and population (mean 6.9), lower than in Chinese populations (three to 21 alleles per locus and population, mean: 11.1).

| Microsatellite genetic structure in Europe
A Mantel test indicated a lack of isolation by distance structure among BGIS and BGDA; Figure 3, Supporting Information Figure S4).

| Inference on invasion pathways in Europe
Eleven populations selected for ABC analysis were chosen to represent the genetic and geographic variability of S. woodiana in Europe (Table 1, Figure 3). From the total of 55 ABC analyses, 31 (56%) analyses identified a single best model (i.e., the model with the highest relative posterior probability and 95% CI not overlapping with the 95% CI of the other models in the analysis; Supporting Information Table S4). They are referred to as  Table S5).
The following results (summarized in Figure 4 and Supporting Information Figure S5 Table S2). In two pairs, both top scenarios in the double-winner comparisons indicated a concordant relationship; that is, PLSZ was cofounded from FR and PLLI was cofounded from ROMU (Tables 2 and Supporting Information   Table S2). The ABC analysis demonstrates that FR and ROMU, two "source" populations for further colonization of Europe, were not derived from two independent introductions from the native range but rather represent a single colonization event (Supporting Information   Tables S2 and S4: strongly supported scenarios 2, 7, 11 and 12 and no support for scenarios 1, 4, 5, 9 and 10).

| D ISCUSS I ON
The most likely colonization pathways (indicated by arrows) of S. woodiana in Europe inferred by ABC pairwise comparisons between 11 populations and a putative ancestral source. The relationships are derived from the single-winner scenarios and two cases of the double-winner scenarios. The relationships are supported by admixture scenarios, demonstrating that a population was cofounded from the population indicated by arrows and another (non-identified) population (summarized in Table 2). Red nodes refer to the source populations, blue nodes represent intermediate populations, and green nodes indicate derived populations. The thickness of lines leading from ANCESTRAL is weighted to indicate the strength of support. Dotted lines connect population samples whose pairwise ABC comparison resulted in multiple scenarios being equally supported. Sample codes are given in Table 1 historically derived from other non-native populations and subsequently served as a source of further invasion.
In the wild, Sinanodonta woodiana was first recorded in Europe in 1979 in western Romania (middle Danube basin) (Sárkány-Kiss, 1986), followed by a record of a shell in southern France near Arles (the lower river Rhone basin) in 1982 (Adam, 2010). Its initial spread was slow and restricted to artificially heated waters (Urbanska, Lakomy, Andrzejewski, & Mazurkiewicz, 2012). However, the species became widespread in Europe during the first two decades of the 21st Century , colonizing much of continental Europe, invading as far north as Sweden (Svensson & Ekström, 2005) and including all three southern peninsulas (Lajtner & Crnčan, 2011;Pou-Rovira et al., 2009;Solustri & Nardi, 2006). This rapid expansion by S. woodiana is well supported by our population genetic data. The Sinanodonta woodiana was localized in largely discrete regions of Europe until the early 21st Century, when its rapid expansion through the continent became manifest (Lajtner & Crnčan, 2011), following a typical demographic and temporal pattern seen in many invasive species (Sakai et al., 2001). Originally considered as a thermophilic species (Kraszewski & Zdanowski, 2007;Spyra, Jedraszewska, Strzelec, & Krodkiewska, 2016) with limited invasion potential, S. woodiana has now colonized habitats with low water temperatures (Kamburska et al., 2013). It is possible that overcoming a thermal limitation in its reproduction was the evolutionary innovation that triggered the invasion of S. woodiana across Europe (Douda, Vrtílek, Slavík, & Reichard, 2012;Galbraith & Vaughn, 2009;Kraszewski, 2007). It appears that novel (cold-tolerant) phenotypes arose through in situ adaptation and were perhaps facilitated by repeated admixture. Genetic accommodation of existing phenotypic plasticity driven by adaptive evolution on reaction norms has recently been implicated in the evolution of invasiveness (Bock et al., 2018) (Donrovich et al., 2017) before metamorphosing into a free-living juvenile mussel. That period is sufficient for successful long-distance dispersal associated with the trade in freshwater fishes for aquaculture and angling purposes across Europe (Litvak & Mandrak, 1999). While many unionid mussel species are host specialists (Modesto et al., 2017), S. woodiana has an extremely extensive host range and can utilize all European freshwater fish species hitherto tested . This feature is not unique to invasive S. woodiana populations, because S. woodiana also utilizes an exceptionally broad range of host species in its native range Dudgeon & Morton, 1984). This trait is a highly effective preadaptation for rapid and successful invasion (Torchin & Mitchell, 2004). It appears that the combination of a generalist host utilization and the widespread commercial trade in freshwater fishes has contributed to the rapid invasion of S. woodiana across Europe, as well as in other regions of the world (Lajtner & Crnčan, 2011;Vikhrev et al., 2017;Watters, 1997  . Given its more compatible climate, the River Amur basin is another potential source of S. woodiana in Europe (Watters, 1997), particularly given the well-documented historical commercial links between that region and Eastern Europe (Watters, 1997 (Watters, 1997) and belong to a different phylogenetic lineage ("tropical invasive lineage A") (Bolotov et al., 2016), except for the localized, recently discovered population in Myanmar . Second, a population in a large Siberian river, the Yenisei, has been discovered recently and is limited to a thermally polluted outlet from a power station (Bespalaya et al., 2017). Third, populations from North America were first recorded in 2010 in New Jersey (Bogan et al., 2011), well after the expansion of S. woodiana in Europe (Lajtner & Crnčan, 2011 parasitism by a partial immunization and cross-resistance after parasitism by S. woodiana glochidia, which has the effect of decreasing recruitment of native European mussels (Donrovich et al., 2017).
Unionid mussels are also hosts to a group of parasitic cyprinid fishes, the bitterling (Acheilognathinae). Bitterling oviposits in living unionid mussels, with their embryos developing in the gills of their host mussel over their first weeks of life (Smith, Reichard, Jurajda, & Przybylski, 2004 Testing alternative invasion scenarios in a species that readily overcomes natural dispersal barriers is challenging (Estoup & Guillemaud, 2010). Here, we used a novel exploratory approach by testing all pairwise associations between potentially linked populations using an ABC modelling framework. This approach was computationally feasible for our set of 12 potential scenarios across 11 highly admixed populations, enabling us to test many competing colonization scenarios and to generalize across a multitude of pairwise contrasts, without a priori exclusion of any potential relationship. This method is comparable to the approach of Miller et al. (2005) who compared three introduction scenarios for six populations and shares the rationale with the stepwise procedure that requires straightforward hypotheses about invasion pathways based on historical data and/or pronounced genetic structure (e.g., Konečný et al., 2013;Lombaert et al., 2014) and a tournament approach (Stone et al., 2017). Ultimately, it proved useful in providing insights into the origin of European populations of S. woodiana from complex evolutionary scenarios and a large sample of sampled populations.
Our understanding of biological invasions has been greatly facilitated through testing ecological and evolutionary invasion scenarios using population genetic data. Identifying the source, timing, and process of an invasion contributes to our capacity to respond to the ongoing spread of non-native species, thereby improving our ability to predict the invasiveness potential of particular species, lineages, and genotypes (Lockwood et al., 2013). We combined mitochondrial and nuclear markers to demonstrate that S. woodiana, a highly invasive species of freshwater mussel, has likely colonized Europe from a single source region. There is no evidence of the recent arrival of a novel genotype. A recently evolved cold tolerance that hypothet-