A few north Appalachian populations are the source of European black locust

Abstract The role of evolution in biological invasion studies is often overlooked. In order to evaluate the evolutionary mechanisms behind invasiveness, it is crucial to identify the source populations of the introduction. Studies in population genetics were carried out on Robinia pseudoacacia L., a North American tree which is now one of the worst invasive tree species in Europe. We carried out large-scale sampling in both the invasive and native ranges: 63 populations were sampled and 818 individuals were genotyped using 113 SNPs. We identified clonal genotypes in each population and analyzed population structure between and within ranges; we then compared genetic diversity between ranges, enlarging the number of SNPs to mitigate the ascertainment bias. First, we demonstrated that European black locust was introduced from just a limited number of populations located in the Appalachian Mountains, which is in agreement with the historical documents briefly reviewed in this study. Within America, population structure reflected the effects of long-term processes, whereas in Europe it was largely impacted by human activities. Second, we showed that there is a genetic bottleneck between the ranges, with a decrease in allelic richness and total number of alleles in Europe. Lastly, we found more clonality within European populations. Black locust became invasive in Europe despite being introduced from a reduced part of its native distribution. Our results suggest that human activity, such as breeding programs in Europe and the seed trade throughout the introduced range, had a major role in promoting invasion; therefore, the introduction of the missing American genetic cluster to Europe should be avoided.


| INTRODUCTION
Since their first definition in Charles Elton's book (Elton, 1958), biological invasions have been increasingly studied over the last few decades. Compared to the ecological impacts of many invasive species and the management issues surrounding them, the role of evolution in biological invasions has long been overlooked (Colautti & Lau, 2015). In order to fill this knowledge gap, it is crucial to identify the source populations of the introduction for a better understanding of the evolutionary mechanisms behind invasiveness, such as the role of selection, local adaptation or admixture (Colautti & Lau, 2015; Dlugosch, Anderson, Braasch, Cang, & Gillette, 2015; Keller & Taylor, 2008). The practical applications of such studies are the identification of source risk and the prediction of the invasive potential of a population (Chown et al., 2015).
When a species is introduced to a new range, it is generally expected to experience a genetic bottleneck leading to a loss of genetic diversity (allelic richness or heterozygosity) (Dlugosch et al., 2015). For example, the invasive plant Heracleum mantegazzianum exhibited a lower diversity in the invasive range, attesting to a strong founder event (Henry et al., 2009). However, some studies (Dlugosch et al., 2015; Dlugosch & Parker, 2008) have emphasized that the loss of genetic diversity between the native and invasive ranges was generally small (15%-20% on average); this can be explained by multiple introductions that have limited the loss of diversity, as in the case of Phalaris arundinacea or Prunus serotina (Lavergne & Molofsky, 2007; Pairon et al., 2010). Genetic diversity is even likely to increase in the invasive range if population admixture is high (Dlugosch et al., 2015; Dlugosch & Parker, 2008), although a large increase is rare (Uller & Leimu, 2011); for example, the invasion of Phalaris arundinacea was shown to have been promoted by an increased genetic variation (Lavergne & Molofsky, 2007). Genomic admixture is also likely to have favored the success of Silene vulgaris in its new American range (Keller, Fields, Berardi, & Taylor, 2014; Keller & Taylor, 2010).
In addition to propagule pressure during introduction, the mating system can have a strong impact on the diversity and genetic structure of populations. Clonal or self-fertilizing species are likely to experience a greater loss of genetic diversity, whereas a bottleneck effect may be reduced for outcrossing species (Baker, 1967; Pappert, Hamrick, & Donovan, 2000). For example, the loss of genetic diversity between native and invasive ranges was greater for purely clonal populations of the invasive Oxalis pes-caprae than for sexual ones (Ferrero et al., 2015).
Few studies have been carried out on the numerous invasive species of trees and shrubs, despite their great impact on ecosystems (Richardson & Rejmánek, 2011). Contrary to many herbaceous species, invasive trees (comprising 357 tree species, i.e., nearly 0.5% of all tree species; Richardson & Rejmánek, 2011) have often been voluntarily introduced to their new ranges for horticultural or forestry purposes (Richardson & Rejmánek, 2011), resulting in multiple repeated introductions which may have shaped the diversity of the trees in the introduced range (Hirsch, Richardson, & Le Roux, 2017).
Furthermore, invasive trees are characterized by a longer generation time compared to invasive herbaceous species; this life-history trait may influence the differentiation rate between ranges. In addition, the fact that only a few centuries and tree generations have passed since the first introduction presents a challenge to the study of evolutionary processes in invasive trees (Hirsch et al., 2017). Little research has been carried out on the population genetics of invasive trees in both their invasive and native ranges. To our knowledge, the few existing studies have mostly found evidence of multiple introductions from the native range to the invasive range; for example, Acacia saligna (Thompson, Bellstedt, Richardson, Wilson, & Le Roux, 2015) and Pinus taeda (Zenni, Bailey, & Simberloff, 2014) were widely sampled from their native ranges and exhibit a high level of admixture within the invasive range, whereas Prunus serotina (Pairon et al., 2010) was introduced several times, but from a limited part of its native range.
Black locust, Robinia pseudoacacia L. (Fabaceae), is native to North America and is considered invasive on a world scale (eight regions among the 14 defined by Richardson & Rejmánek, 2011). In Europe, it is now recognized as one of the 100 worst invasive species (Basnou, 2009; DAISIE, 2006).
The native range of this species consists of two disjoint areas on either side of the Mississippi watershed (Little, 1971): the larger area corresponds to the Appalachian Mountains and partially encompasses several current States (Pennsylvania, Maryland, West Virginia, Virginia, North Carolina, South Carolina, Georgia, Alabama, Tennessee, Kentucky, and Ohio), and the smaller area is located further west in the Ozark Mountains (Missouri, Arkansas and Oklahoma). In America, black locust was intensively displaced by settlers because of the undeniable interest in its wood, as stated by Michaux in "Histoire des arbres d'Amériques septentrionales tome III" (1813), or by Cobbett in his book "The Woodlands" (1825). Such displacement has sometimes led to the misinterpretation of its native distribution; for example, in the Gardeners Dictionary (1756-1759), Miller wrongly stated that black locust was native to Massachusetts (Michener, 1988). To date, this species has spread to every state in the contiguous USA and also to British Columbia, Québec, Newfoundland and Labrador in Canada (Schütt, 1994). It was introduced to Europe during the early 17th century and it is now present in all European countries; however, when and how the first introduction occurred is not precisely known.
It is widely written that Jean Robin (1550-1629), botanist to King Henri IV, was responsible for the first European introduction in 1601. A century later, Carl Von Linné gave the black locust its current name, Robinia, in recognition of the work carried out by Jean Robin and his son (Vespasien) in acclimating it to Europe. In fact, 1601 seems a very unlikely date since black locust is absent (Wein 1930 cited by Cierjacks et al., 2013) from the lists edited by Jean and Vespasien Robin in 1601 ("catalogue de son jardin") and in 1620 ("histoire des plantes nouvellement trouvées en l'isle de Virginie"). To our knowledge, the first citation of the species appeared in England in the John Tradescant list in "plantarum in horto Iohannem Tradescanti" (1634) (cited in "Early British botanists and their gardens" by Gunther, 1922); thus he probably sent some seeds to Vespasien Robin in France, who would have sown them and cultivated the trees (such as the one planted in the King's garden in Paris in 1634 ("jardin des plantes"; in Biographie Universelle, 1824)).
In Europe, the first introduction appears to have been followed by a period of interest in its ornamental aspect; however, it subsequently fell into disuse in the early 18th century, as explained in a dictionary from 1722 about the black locust, which was quoted by Nicolas François de Neufchateau in his book "lettre sur le robinier" (1807). In the middle of the 18th century, American explorers returned to Europe and promoted the use of black locust in forestry: For instance, Michaux (1813) described the abundance of this tree in the Allegheny mountains throughout Pennsylvania and West Virginia and indicated that after the end of the 18th century, the tree was appreciated more for the excellent qualities of its wood than for the beauty of its foliage and flowers. At the same time, the English politician William Cobbett, who emigrated to America in the late 18th century, emphasized all the qualities of this tree and promoted its plantation in Europe: "I sold the plants; and, since that time, I have sold altogether more than a million of them," adding that "My seed has always come from the neighborhood of Harrisburgh in Pennsylvania" (Cobbett, 1825).
From this information, we can conclude that the European dissemination of the black locust seems to have experienced a lag phase between the species' first introduction to Europe (possibly from Virginia during the early 17th century) and its rediscovery in the middle of the 18th century, which led to a new wave of introductions that probably came from Pennsylvania and the Virginias. More recently, black locust breeding programs have been carried out in central Europe since the beginning of the 20th century (Keresztesi, 1983; Liesebach, Yang, & Schneck, 2004; Straker, Quinn, Voigt, Lee, & Kling, 2015). Currently, Hungary is the European leader in the production of black locust seedlings, and its selected provenances for wood production are now widely distributed in Europe for new forest plantations (Keresztesi, 1983; Liesebach et al., 2004; Straker et al., 2015). In Europe, the black locust is now recognized as one of the 100 worst invasive species (Basnou, 2009; DAISIE, 2006) and it is considered an invasive tree on a world scale (eight regions out of the fourteen defined by Richardson & Rejmánek, 2011).
Although knowledge about the black locust's genetic diversity is key to developing further ecological or evolutionary studies (Lawson Handley et al., 2011), little information exists about its genetic diversity and structure in introduced ranges, or about its origin and differentiation from the source populations in North America. The only studies we know of in Europe compared four American populations with sixteen German and Hungarian populations (Liesebach & Schneck, 2012; Liesebach et al., 2004); although the results suggested a high genetic differentiation among American populations, they were mostly inconclusive.
Modern molecular and statistical tools used in population genetics have proved to be useful for finding the geographic origins of invasive species, complementing or providing a solution to the lack of available historical knowledge (Besnard et al., 2014; Chown et al., 2015; Cristescu, 2015; Hoos, Whitman Miller, Ruiz, Vrijenhoek, & Geller, 2010). Using SNP markers developed for the black locust (Verdu et al., 2016), we investigated its introduction history and genetic diversity in its native range and European invasive range, in particular by answering the following questions:
(a) Can we identify the native population sources of European black locust?
(b) What is the genetic differentiation within and between ranges?
(c) Can we detect a founder event associated with a loss of genetic diversity?

| Sampling
Sixty-three populations of black locust were sampled in both the native range (29 populations) and the European invasive range (34 populations). Sampling was conducted between spring 2014 and fall 2016 (Table 1 and Supporting Information Appendix S1) by different collaborators using the same protocol: Between 10 and 30 trees were sampled in each population. Samples were collected either in common gardens or in natural populations. A total of 818 individuals were sampled: 402 from Europe and 416 from North America.
Black locust propagates through sexual and asexual reproduction. In common gardens, since trees were grown from seeds of known origin, there was no risk of collecting clones. However, in natural populations, a minimal distance of 25 m was kept between two sampled trees in order to minimize the risk of collecting clones.
Leaves, cambium, buds, or seeds were harvested depending on the season. For leaf sampling, a few leaflets on a green healthy leaf were collected using a manual tree pruner. For cambium sampling, external bark was removed from the trunk with a knife, then five rings of wood were collected using a 1-cm-diameter punch. In the field, samples were put into referenced tea bags and then placed into plastic boxes containing silica gel in order to dry the samples. The silica gel was renewed after 24 hr and 48 hr, and then as often as needed until it no longer changed color. The plastic boxes were then stored at ambient temperature in closed cupboards.
In natural populations, GPS coordinates of either the population or each sampled tree were recorded using a portable GPS (GPSMAP62, Garmin, Olathe, KS, USA). On the campus of Michigan State University, the geographic origins of each mother tree were known and were used for the coordinates of the sampled trees, and the populations were defined by gathering trees from a close geographic location (see Supporting Information Appendix S1).

TABLE 1 General genetic information regarding the sampled populations
Note. The range corresponds either to Europe (EU) or the USA (US) and either the country or the state is indicated. X Long and Y Lat (longitude and latitude, respectively) correspond to the GPS coordinates of the sampled population provided in the WGS84 geographic projection. N is the number of individuals genotyped per population. G is the number of unique genotypes in each population. R is the index of clonal diversity, as defined in the material and methods section. F_IS mean, F_IS LC95 and F_IS HC95 indicate, respectively, the mean F_IS value and the lower and upper bounds of the 95% confidence interval computed using the hierfstat R package for each population. The F_IS values in bold indicate that the 95% confidence interval, calculated using 1,000 bootstrap replicates, does not include zero. Ho is the observed heterozygosity and Hs the expected heterozygosity. Genetic diversity values were calculated using the initial dataset after clone removal (i.e., 113 SNPs and 720 individuals).

| DNA extraction and genotyping
For each individual, either a 1 cm² leaf sample was collected on a leaflet, cambium was manually extracted from one ring of wood, or five buds were collected. The plant material was then crushed using an automated grinder (2010 Geno/Grinder, SPEX SamplePrep, Metuchen, NJ, USA). For four populations (Corphalie, Drewnica, Pinczow and Lewisburg), a few seeds from ten sampled mother trees were scarified and grown in the laboratory (Bouteiller, Porté, Mariette, & Monty, 2017). [...] (Bouteiller et al., 2018).
The whole dataset will be made available on the Open Science Framework repository after acceptance.

| Clone removal
For the analysis of genetic diversity and structure within and between ranges, we chose to identify and remove clones from the analysis using R version 3.3.1 (R Development Core Team, 2016).
Within populations, only markers without missing values were kept, and a pairwise comparison of all genotyped individuals was carried out in order to detect putative clones. As some populations were sampled from trees in common gardens or from laboratory-grown seedlings which originated from seeds (Table 1), they were unlikely to contain clones; however, we checked that no clone was present in these populations and removed these populations before carrying out the subsequent analysis. The difference in clonality between ranges was tested using Pearson's χ² test with Yates' continuity correction in R version 3.3.1 (R Development Core Team, 2016).
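The clone screening and the between-range test described above can be sketched as follows. This is a minimal illustration and not the authors' R code: the genotype coding (0, 1, 2 copies of an allele, `None` for missing), the function names, and the definition R = (G - 1)/(N - 1) for the clonal diversity index are assumptions, since the section does not spell them out. Grouping identical genotypes gives the same result as exhaustive pairwise comparison when only exact matches are counted as clones.

```python
from collections import defaultdict
from math import erfc, sqrt


def group_clones(genotypes):
    """Group individuals of one population that share a multilocus genotype.

    genotypes: dict mapping an individual id to a tuple of per-SNP codes
    (assumed coding: 0, 1, 2 allele copies; None for a missing call).
    Only markers with no missing value in the population are compared.
    """
    ids = list(genotypes)
    n_loci = len(genotypes[ids[0]])
    complete = [j for j in range(n_loci)
                if all(genotypes[i][j] is not None for i in ids)]
    groups = defaultdict(list)
    for i in ids:
        groups[tuple(genotypes[i][j] for j in complete)].append(i)
    return list(groups.values())


def clonal_diversity(n_individuals, n_genotypes):
    """Clonal diversity index, assumed here to be R = (G - 1) / (N - 1)."""
    if n_individuals < 2:
        raise ValueError("at least two individuals are required")
    return (n_genotypes - 1) / (n_individuals - 1)


def yates_chi2(a, b, c, d):
    """Pearson chi-squared test with Yates' correction for the 2 x 2 table
    [[a, b], [c, d]]; returns (statistic, p-value) with 1 degree of freedom."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-squared variable with 1 df.
    return chi2, erfc(sqrt(chi2 / 2))
```

In use, the 2 x 2 table would hold, for each range, the number of individuals that belong to a repeated genotype and the number that do not; the actual counts are given in the results, not here.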

| Molecular genetic structure
After removing clones from the dataset, molecular genetic differentiation was explored both between ranges and among populations within ranges using two approaches.
First, the typology of all sampled individuals from both ranges was assessed using a principal component analysis (PCA), implemented in the R adegenet library (Jombart, 2008; Jombart, Devillard, & Balloux, 2010). [...] Each run corresponded to an MCMC model with a burn-in period of 500,000 iterations followed by 500,000 iterations, which was repeated 10 times (Gilbert et al., 2012). The analysis was first performed using the initial dataset after clone removal to determine the structure of populations from both ranges (K varying from 1 to 20), then it was performed for each range separately (K varying from 1 to 15).
The most probable number of clusters was determined according to Evanno, Regnaut, & Goudet (2005) using the peak in the ΔK parameter calculated with the STRUCTURE HARVESTER software (Earl & vonHoldt, 2012).

| Analysis of genetic differentiation and diversity
The genetic differentiation between populations was analyzed using F_ST indexes (Wright, 1931), which range from 0 (no differentiation) to 1 (complete differentiation). Within and between ranges, F_ST values were calculated with the hierfstat v 0.04-28 R package (Goudet, 2005) according to the method of Weir and Cockerham (1984). The 95% confidence intervals (CI) were estimated by performing 1,000 bootstraps over loci.
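To illustrate what bootstrapping over loci means, the sketch below computes a multilocus F_ST as a ratio of sums over loci and resamples loci with replacement to obtain a percentile interval. It deliberately uses a simple Nei-type estimator, (H_T - H_S)/H_T from population allele frequencies, rather than the Weir and Cockerham estimator implemented in hierfstat, so its values would differ from those reported in the paper. All names and the input layout are hypothetical.

```python
import random


def locus_components(freqs):
    """Return (H_T - H_S, H_T) for one biallelic locus.

    freqs: allele frequency of the same allele in each population.
    """
    k = len(freqs)
    h_s = sum(2 * p * (1 - p) for p in freqs) / k   # mean within-population
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)                   # pooled expectation
    return h_t - h_s, h_t


def fst(freq_table):
    """Multilocus F_ST as a ratio of sums over loci (Nei-type, not W&C)."""
    comps = [locus_components(f) for f in freq_table]
    denominator = sum(h_t for _, h_t in comps)
    if denominator == 0:
        return 0.0
    return sum(num for num, _ in comps) / denominator


def bootstrap_ci(freq_table, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval obtained by resampling loci."""
    rng = random.Random(seed)
    n = len(freq_table)
    stats = sorted(
        fst([freq_table[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int(n_boot * alpha / 2)]
    upper = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper
```

Resampling loci rather than individuals treats each SNP as an independent draw from the genome, which is why the interval narrows as more markers are genotyped.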
Two datasets were analyzed in order to compare genetic diversity between ranges: the initial dataset (113 SNPs) and the additional dataset (251 SNPs). [...] Differences in allelic richness (AR) were determined by performing a nonparametric Wilcoxon paired test among loci between ranges. In order to evaluate differences in the total number of alleles between ranges, a bootstrap over all loci and individuals was computed using 1,000 simulations, and the differences were determined using a nonparametric Mann-Whitney test in R version 3.3.1 (R Development Core Team, 2016).

| Isolation by distance analysis
In natural populations, genetic similarity is expected to be high between spatially close populations and then to decrease among populations with geographic distance; this pattern is known as isolation by distance (IBD). IBD was tested within each range, using [...] (Takezaki & Nei, 1996).
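The test and the distance measures actually used are not recoverable from this section. As an illustration only, IBD is often assessed with a Mantel test that correlates a matrix of pairwise genetic distances with a matrix of pairwise geographic distances; the sketch below assumes that approach, great-circle distances computed from WGS84 coordinates, and hypothetical function names.

```python
import random
from math import asin, cos, radians, sin, sqrt


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    dlon, dlat = radians(lon2 - lon1), radians(lat2 - lat1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * radius_km * asin(sqrt(a))


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def mantel(genetic, geographic, n_perm=999, seed=0):
    """One-sided Mantel test: permute population labels of one matrix."""
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)
```

Rows and columns are permuted together because the pairwise distances are not independent observations; shuffling individual cells would overstate significance.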

| More asexual reproduction in European populations
Overall, a higher clonality was detected in the European populations compared to the American ones, with a significant range effect (χ² = 29.04, df = 1, p = 7.10 × 10⁻⁸). As expected, no clone was found within the common garden populations, nor in the populations obtained from seedlings germinated in the laboratory. When removing these populations from the analysis (thus leaving 280 European and 356 American individuals), 98 genotypes were found with at least one [...]

FIGURE 2 (a) Individual assignment for the most likely number of clusters (where K = 2) as a result of the between-range STRUCTURE analysis. Each colored vertical line represents one individual ancestry membership between the two clusters (orange, cluster K2_1, and blue cluster K2_2). Black vertical lines separate different populations. Both analyses were computed on the initial dataset after clone removal (720 individuals from 63 populations genotyped using 113 SNPs). (b and c) Pie charts of the population assignment in Europe and the USA for the most likely number of clusters (where K = 2) as a result of the STRUCTURE analysis between ranges. In blue, proportion of individuals significantly assigned to cluster K2_1; in orange, proportion of individuals significantly assigned to cluster K2_2; and in purple, proportion of individuals admixed in each population. The native distribution of black locust within America (Little, 1971) is plotted in gray shading, and in Europe it is present almost everywhere from Southern to Northern Europe.

The overall F_IS calculated among European populations of the "clonal dataset" (0.019, 95% CI: −0.018-0.062, estimated by bootstrapping over loci) was significantly lower than the overall F_IS calculated among American populations of the "clonal dataset" (0.11, 95% CI: 0.077-0.14, estimated by bootstrapping over loci).
At the population level, negative F_IS values (Table 1) [...]
In America, the proportion of admixed individuals per population ranged from 0% (Ouachita, Pleasant Hill, US Grp 1, US Grp 3, Victor) [...]

| Significant genetic differentiation among populations in the native range, contrary to the introduced range
There was a significant genetic differentiation among all populations: estimated F_ST among all populations was 5.23% (95% CI: 4.77%-5.70%). Overall, within the native range, black locust populations were clearly genetically differentiated, matching the geographic structure, whereas in the introduced European range, the differentiation between populations was low and no structure was detected across the continent. [...] In the introduced range, the STRUCTURE analysis indicated that populations formed an optimal number of K = 2 clusters (Supporting Information Appendix S2C).
The first cluster [...] Overall, mean individual admixture among populations assigned to this cluster was 70.9%.

| MAF distribution and detection of a bottleneck in the introduced populations
The MAF analysis performed on the initial and additional datasets highlighted a deficit in low-frequency alleles when using 113 SNPs (MAF mode: 0.05-0.15) in both the native and invasive ranges [...]

| DISCUSSION
Thanks to an extensive population genetics analysis in both the native and European invasive ranges, we were able to show that black locust was likely to have been introduced to Europe from a limited part of its northeastern native distribution in the Appalachian Mountains. This founder effect brought about a bottleneck, detected only when we increased the number of SNPs with low-MAF markers. A strong genetic structure was observed in the USA, whereas a much weaker one was detected in Europe. Moreover, asexual propagation was probably more prevalent in the invasive range than in the native one.

| Population genetics and introduction history: European black locust populations close to Northern Appalachian populations
The genetic results suggest that the black locust was introduced to [...] The results obtained using a molecular approach are congruent with historical records pointing to the original sources of black locust in the northeastern part of its native range in the Appalachian Mountains. By reviewing historical studies, we were able to conclude that the first black locusts introduced to Europe during the early 17th century were likely to have come from Virginia, and that further black locusts were introduced during the 17th and 18th centuries from Pennsylvania and West Virginia (Cobbett, 1825; Gunther, 1922; Michaux, 1813). Consequently, taking into account both the historical indications and the genetic proximity of all European black locust populations to a few native ones, it can be hypothesized that no subsequent introductions followed, and that the expansion of the species in Europe through asexual reproduction or seeds resulted from the original black locusts grown in Europe.

| Evidencing the bottleneck depends on the set of genotyped SNPs
Given that only a few American populations are genetically close to the European ones, a bottleneck is expected in European populations. The decrease in genetic diversity was only observed when using a larger number of SNPs, owing to some particular properties of SNPs.
SSRs and SNPs are two widely used markers for genotyping nonmodel species (Morin, Luikart, & Wayne, 2004; Coates et al., 2009; Helyar et al., 2011). SNPs have many advantages: they can be easily developed using NGS; genotyping is easily replicable among laboratories; and SNPs are widely distributed throughout the genome (Coates et al., 2009; Morin et al., 2004; Helyar et al., 2011). However, more SNPs are needed than SSRs in order to reach the same level of precision, essentially because SSRs are multiallelic, whereas SNPs are mainly biallelic (Morin et al., 2004). One major problem in using SNPs is the ascertainment bias (Coates et al., 2009; Morin et al., 2004; Helyar et al., 2011). In particular, SNPs with a high minor allele frequency are more likely to be selected for genotyping populations, and consequently this can alter diversity estimates (Helyar et al., 2011). As a consequence, a genetic diversity analysis conducted with SNP data may lead to false negative or false positive conclusions (Helyar et al., 2011; Morin et al., 2004). Some empirical studies showed that SNPs performed better than SSRs for studying population structure, whereas SSRs were more efficient for estimating diversity (Singh et al., 2013). Our study emphasizes the importance of taking the SNP ascertainment bias into account when comparing genetic diversity among several groups. Using the initial dataset, we observed more frequent minor alleles (modal class 0.05-0.15, Online Resource 3), which confirmed the sampling bias. We partially corrected this bias by using the additional dataset (251 SNPs), where the actual MAF distribution is closer to the expected MAF distribution (modal class 0-0.05, Supporting Information Appendix S6A).
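The MAF comparison discussed above can be made concrete with a short sketch that computes the minor allele frequency of each SNP and reports the modal frequency class. The genotype coding (0, 1, 2 copies of one allele, `None` for missing) and the class edges are assumptions; the edges were chosen only so that the two classes quoted in the text (0-0.05 and 0.05-0.15) appear among them.

```python
from bisect import bisect_right
from collections import Counter

# Assumed class edges, consistent with the classes 0-0.05 and 0.05-0.15.
EDGES = (0.0, 0.05, 0.15, 0.25, 0.35, 0.45, 0.5)


def minor_allele_frequency(calls):
    """MAF of one biallelic SNP from diploid genotype codes (0, 1 or 2)."""
    called = [g for g in calls if g is not None]
    if not called:
        raise ValueError("no genotype call available for this SNP")
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def modal_class(mafs, edges=EDGES):
    """Return the (lower, upper) bounds of the most populated MAF class."""
    counts = Counter()
    for m in mafs:
        # The last class is closed on the right so that MAF = 0.5 is kept.
        k = min(bisect_right(edges, m) - 1, len(edges) - 2)
        counts[k] += 1
    k = max(counts, key=counts.get)
    return edges[k], edges[k + 1]
```

A marker panel free of ascertainment bias is expected to have its mode in the lowest class, because rare variants are the most numerous; a mode shifted to a higher class, as with the 113-SNP set, is the signature of markers preselected for being polymorphic.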
By carrying out the analysis with this additional dataset, we were able to detect a bottleneck (decrease in allelic richness and total number of alleles in the introduced range, Supporting Information Appendix S6B), which would not have been the case if we had only used the initial dataset for studying genetic diversity between ranges (no difference in heterozygosity, allelic richness, and total number of alleles between ranges, Supporting Information Appendix S6B). A loss both in allelic richness and in the total number of alleles points to the occurrence of a bottleneck, whereas heterozygosity is not expected to respond as strongly as allelic richness to a founder event (Dlugosch et al., 2015), as observed in our study. Uller and Leimu (2011) demonstrated that genetic variation between native and invasive ranges was influenced by taxonomy: invasive animals often suffered a loss of genetic diversity between the ranges, whereas invasive plants often exhibited higher genetic diversity in the invasive range. According to these authors, one factor contributing to this pattern is that invasive animal populations are often founded by single introduction events, whereas multiple introductions associated with admixture are more common for plants (Uller & Leimu, 2011). [...] (Yang et al., 2017). Studies on invasive trees are less numerous, but they have generally concluded that multiple introductions occurred (Besnard et al., 2014; Merceron et al., 2017; Pairon et al., 2010; Thompson et al., 2015) without being able to clearly identify the population sources. A recent study on the genetic structure of invasive populations of Acacia saligna within several invasive ranges suggested that multiple introductions occurred from populations distributed throughout the Australian native range (Thompson et al., 2015).

| Structure in the native range was shaped by long-term evolutionary processes, whereas structure in the invasive range reflects human action
Natural evolutionary processes seem to have shaped the genetic diversity and structure of black locust populations in the native range. Three genetic clusters were identified within the native range, with the greatest differentiation between the first cluster in the Ozark Mountains and the two clusters in the Northern and Southern Appalachian Mountains. Together with the pattern of isolation by distance, this suggests the action of natural and long-term evolutionary processes. On the contrary, in Europe, the weak structure in two clusters with a few outlying populations and no isolation by distance would suggest a recent evolutionary history marked by human actions.
The genetic structure detected in the American black locust is congruent with observations in other North American tree species. In North America, glacial refugia have been identified on both sides of the Mississippi River (Hewitt, 2000; Swenson & Howard, 2005) throughout geographic areas closely related to the genetic clusters identified in this study. A large differentiation on each side of the Mississippi River has been recorded for at least one other tree species, the loblolly pine (Pinus taeda L.), which exhibited two distinct genetic clusters (Lu et al., 2016). It is likely that the Mississippi River acted as a physical barrier during postglacial recolonization after the last glacial maximum of the Wisconsinan glaciation, 21,000 years ago (Pessino, Chabot, Giordano, & DeWalt, 2014).
Moreover, similar to our findings, distinct genetic clusters have been identified along a north-south axis of the Appalachian Mountains in several plant species, such as Scirpus ancistrochaetus (Cipollini, Lavretsky, Cipollini, & Peters, 2017), Tsuga caroliniana (Potter, Campbell, Josserand, Nelson, & Jetton, 2017) and Pinus strobus (Nadeau et al., 2015). As described by Swenson and Howard (2005), a historical suture zone has been identified between the Northern Appalachian Mountains and Southern Appalachian Mountains. Consistent with this suture zone, two glacial refugia were detected for P. strobus in the Northern and Southern Appalachian Mountains (Nadeau et al., 2015). It can therefore be assumed that Appalachian black locust genetic structure was driven by the same processes as for other North American trees and reflects postglacial colonization routes originating from glacial refugia on each side of the Appalachian Mountains.
On the contrary, in Europe the weak structure in two clusters with a few outlying populations and no isolation by distance would suggest a recent evolutionary history marked by human actions. [...] (Cierjacks et al., 2013; Vítková, Müllerová, Sádlo, Pergl, & Pyšek, 2017). Moreover, a genetic breeding program has been conducted since the beginning of the 20th century in Hungary (Keresztesi, 1983). The genetic clustering within Europe may result from artificial selection through human-oriented selection and tree breeding, which was initiated in Central Europe in the 18th century. Thus, the European black locust can be considered partially domesticated, which raises the question of which traits influenced invasiveness. Further investigations involving common garden surveys would be necessary in order to assess whether genetic differences resulted in phenotypic differences.

| The role of clonality in shaping genetic diversity in Europe
In general, we found that European populations of black locust were more clonal than American populations. This was demonstrated by comparing the number of clones detected within each range, the index of clonal diversity R, and the inbreeding coefficient F_IS. The same sampling protocol was followed by all field workers, who respected a minimum distance of 25 m between sampled individuals; it can therefore be concluded that the black locust is able to spread more than 25 m by clonal propagation. This result was confirmed when visualizing the mapping of individuals with GPS coordinates (data not shown, available upon request).
A lower FIS in European populations than in American populations indicated an excess of heterozygosity within the former. Clonality usually produces this pattern (Arnaud-Haond et al., 2007; Halkett, Simon, & Balloux, 2005; Stoeckel & Masson, 2014), as clonal reproduction can maintain heterozygosity over generations (Stoeckel et al., 2006). However, no significant relationship was found between the clonal diversity index (i.e., R) and FIS (data not shown).
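To make the two statistics discussed above concrete, the following is a minimal sketch and not the analysis pipeline used in this study. It assumes the clonal diversity index takes the form R = (G − 1)/(N − 1), where G is the number of distinct multilocus genotypes among N sampled individuals (Arnaud-Haond et al., 2007), and that FIS is estimated as 1 − Ho/He without sample-size correction. The function names and the two toy stands are hypothetical and serve only to illustrate how a fully clonal heterozygous stand yields R = 0 together with a strongly negative FIS.

```python
from collections import Counter


def normalize(individual):
    """Sort the two alleles at each locus so ('G', 'A') equals ('A', 'G')."""
    return tuple(tuple(sorted(locus)) for locus in individual)


def clonal_richness(population):
    """R = (G - 1) / (N - 1): 0 for a single clone, 1 if all genotypes differ."""
    n = len(population)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    g = len({normalize(ind) for ind in population})
    return (g - 1) / (n - 1)


def f_is(population):
    """F_IS = 1 - Ho/He, with Ho and He summed over loci (no bias correction)."""
    ho_sum = 0.0
    he_sum = 0.0
    for k in range(len(population[0])):
        pairs = [ind[k] for ind in population]
        # Observed heterozygosity: share of individuals with two different alleles.
        ho_sum += sum(a != b for a, b in pairs) / len(pairs)
        # Expected heterozygosity from allele frequencies: 1 - sum(p_i^2).
        counts = Counter(allele for pair in pairs for allele in pair)
        total = sum(counts.values())
        he_sum += 1.0 - sum((c / total) ** 2 for c in counts.values())
    if he_sum == 0:
        raise ValueError("no polymorphic locus in this population")
    return 1.0 - ho_sum / he_sum


# Hypothetical toy stands: four individuals genotyped at two biallelic SNPs.
clonal_stand = [(("A", "G"), ("C", "T"))] * 4
sexual_stand = [
    (("A", "A"), ("C", "T")),
    (("A", "G"), ("C", "C")),
    (("A", "G"), ("T", "T")),
    (("G", "G"), ("C", "T")),
]

print(clonal_richness(clonal_stand), f_is(clonal_stand))  # 0.0 -1.0
print(clonal_richness(sexual_stand), f_is(sexual_stand))  # 1.0 0.0
```

In this toy case, the stand made of a single heterozygous clone shows the excess of heterozygosity (negative FIS) expected under clonality, whereas the stand with four distinct genotypes in Hardy-Weinberg proportions gives FIS = 0.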
Moreover, within the European range, the two outlying populations, Meppen and Munchenberg, were clearly differentiated regardless of the analysis applied. A previous study of the Munchenberg population (code N° 7, Hasenholz, in Liesebach et al., 2004) showed that this population could clearly be differentiated from all the others. In addition, two major clones were detected which covered 80% of the stand (Liesebach et al., 2004). This is consistent with our findings, since the Munchenberg population exhibited the lowest FIS (−0.180), which indicated an excess of heterozygosity potentially due to a high level of clonality.
In Japan, both clonal and sexual reproduction have been found to promote the spread and invasion of black locust (Kurokochi & Hogetsu, 2014). The reproductive regime is likely to influence invasiveness, and a shift in the mating system between ranges has already been observed for several invasive species (Barrett, Colautti, & Eckert, 2008; Petanidou et al., 2012; Rambuda & Johnson, 2004). Clonal populations can maintain a high level of genetic diversity; however, they can be sensitive to founder events (Barrett, 2010). Clonality is likely to strongly decrease FST and to slowly decrease genotypic diversity in purely clonal populations (Balloux, Lehmann, & De Meeûs, 2003), but partially clonal populations are hard to differentiate from strictly sexual populations (Balloux et al., 2003). A theoretical study showed an advantage of clonal reproduction for species invasiveness. However, the relationship is not linear, and species combining a high clonal rate with a small rate of sexual reproduction would have higher invasiveness (Bazin, Mathé-Hubert, Facon, Carlier, & Ravigné, 2013). Clonal reproduction provides invasive plant species with reproductive assurance. Shifts in the mating system from outcrossing to clonality, due to a strong founder event, have been observed for the invasive Eichhornia crassipes and Fallopia japonica (Hollingsworth & Bailey, 2000).
However, this is not systematic and some pure outcrossing species are successful invaders, such as A. artemisiifolia (Friedman & Barrett, 2008).
It is also possible that a shift toward more clonal reproduction occurred in the mating system of the black locust between the native and the invasive range. This could have been produced by the founder event or by artificial selection, as one traditional way of managing black locust plantations in Europe was to stimulate clonal reproduction by damaging tree roots (François de Neufchateau, 1807; Saint-Jean de Crève & Coeur, 1786).

| CONCLUSION
We found a remarkable congruence between our genetic analysis and historical records regarding the geographic origins of the European black locust, with both approaches pointing to European populations originating from the Northern Appalachian Mountains. The history of black locust introduction is thus a unique pattern among invasive trees, which are commonly characterized by multiple introduction events. As a consequence, only a small part of black locust genetic diversity was introduced to Europe from its native American range.
Furthermore, although black locust suffered a genetic bottleneck and a loss of diversity following introduction, this did not prevent its successful colonization throughout its European range. Moreover, we found some evidence for a shift in mating systems between ranges with an increase in clonality in Europe, resulting from either natural or artificial selection. However, our sampling method was not specifically designed to investigate changes in clonality during and after introduction. Therefore, further studies involving the sampling of extensive populations and plots would be needed to better understand the role of clonality in the success of this species, together with studies on the role of sexual reproduction in the spread of the black locust.

ACKNOWLEDGMENT
We would like to thank all those who helped us sample black lo-

CONFLICT OF INTEREST
None declared.

DATA ACCESSIBILITY
Data are publicly available in the Open Science Framework repository: https://osf.io/k97ax/