Gene flow and population admixture as the primary post-invasion processes in common ragweed (Ambrosia artemisiifolia) populations in France


Author for correspondence:
Young Jin Chun
Tel: +33 3 80 69 32 67

Summary


  • An improved inference of the evolutionary history of invasive species may be achieved by analyzing the genetic variation and population differentiation of recently established populations and their ancestral (historical) populations. Employing this approach, we investigated the role of gene flow in the post-invasion evolution of common ragweed (Ambrosia artemisiifolia).
  • Using eight microsatellite loci, we compared genetic diversity and structure among nine pairs of historical and recent populations in France. Historical populations were reconstructed from herbarium specimens dated from the late 19th to early 20th century, whereas recent populations were collected within the last 5 yr.
  • Recent populations showed greater allelic and genetic diversity than did historical populations. Recent populations exhibited a lower level of population differentiation, shorter genetic distances among populations and more weakly structured populations than did historical populations.
  • Our results suggest that currently invasive populations have arisen from active gene flow and the subsequent admixture of historical populations, incorporating new alleles from multiple introductions.

Introduction


The study of past evolutionary changes is a core subject of evolutionary research and is especially interesting for invasive species which have undergone significant changes in their ecological status. To infer the historical processes related to successful invasion, previous studies have compared the relative levels of genetic diversity and structure between populations from native and introduced ranges (Suarez & Tsutsui, 2008). They provide important insights into the population genetic processes involved before and after invasion, the identification of the most likely genetic sources and the inference of the pathways of spread. However, inferring the past evolutionary processes using data collected from historical populations (established in the past) may further our understanding of the status of populations during the early stages of invasion.

Most invasive populations undergo three fundamental steps to become invasive – initial colonization, a lag phase (a latent period during which invasive species grow slowly) and an explosive increase in the population size and range (Sakai et al., 2001). Although past evolutionary processes may be inferred from the genetic analysis of currently established populations, our knowledge of the post-invasion spread of invasive populations during the early stages of invasion will be furthered by direct assessment of historical populations obtained from herbarium specimens. Moreover, the genetic comparison between historical populations and recently established populations may provide useful insights into the post-introduction evolutionary processes forming currently invasive populations.

The present study compares the genetic diversity and population genetic structure between historical and recent populations of common ragweed (Ambrosia artemisiifolia) introduced in France from North America. We reconstructed historical populations from herbarium specimens dated from the late 19th to early 20th century, representing the early stages of invasion in France. Ambrosia artemisiifolia is one of the most problematic invasive plant species in Europe. Since the first known population was established in the middle of the 19th century (Heckel, 1906), its distribution range has expanded rapidly, beginning in the middle of the 20th century (Chauvel et al., 2006). Massive production of ragweed pollen often causes serious allergic diseases in humans (Laaidi et al., 2003). Invasive populations are also problematic in reducing crop yields in agricultural fields, and represent a significant challenge to the management of natural resources (Kazinczi et al., 2008).

Multiple introductions, gene flow and population admixture are known to shape the evolutionary potential of invasive species (Dlugosch & Parker, 2008). A previous microsatellite analysis of A. artemisiifolia suggested multiple introductions, as the genetic diversity of French populations is comparable with that of North American populations (Genton et al., 2005). Gene flow may increase genetic diversity, diminish the negative effects of bottlenecks and founding events, allow the spread of advantageous alleles and promote rapid evolution (Morjan & Rieseberg, 2004; Busch et al., 2007; Lavergne & Molofsky, 2007), although it has not been examined in invasive populations of A. artemisiifolia. Gene flow may be prevalent in A. artemisiifolia, an outcrossing species, as its pollen can be dispersed to distant populations by wind (Fumanal et al., 2007a), and its seeds can be spread over long distances via streams (Fumanal et al., 2007b). This strong dispersal capability, together with high fecundity (a plant can produce between 300 and 6000 seeds, with a maximum of 14 000), may predispose A. artemisiifolia to be an aggressive invader (Fumanal et al., 2007a). By comparing historical and recent populations using microsatellite markers, we addressed three questions regarding the invasion success of A. artemisiifolia: Is genetic diversity greater in recent populations than in historical populations? How has genetic structure changed from historical to recent populations? Are gene flow and population admixture likely to have shaped currently invasive populations?

Materials and Methods

Population sampling

In 2004, we initiated a study on the historical spread of Ambrosia artemisiifolia L. in France based on the records collected from European herbaria (Chauvel et al., 2006). We found that each historical population was represented by many herbarium specimens collected on the same dates and by the same collectors, but distributed across many herbaria. Therefore, based on the geographical indications of original collection sites in herbarium specimens, we were able to reconstruct historical populations by collecting herbarium specimens. Sampling from herbarium specimens is often limited to a few samples per species, resulting in insufficient sampling and bias for population genetic studies (Nielsen et al., 1999). To alleviate this problem, we collected as many herbarium specimens as possible from nine herbaria across Europe (Bordeaux, Brussels, Clermont-Ferrand, Geneva, Lyon, Marseille, Montpellier, Neuchâtel and Paris). A total of 18 historical and recent populations were sampled from nine geographical sites in France (Table 1, Fig. 1). Recent populations were sampled during the last 5 yr, and historical populations were collected from herbarium specimens whose original collection sites geographically matched the nine sites.

Table 1.   Genetic diversity of historical (H) and recent (R) populations of Ambrosia artemisiifolia
Location/province       ID     Collection year  N        NA    NP    NR    HO     HE     FIS
                        R50    2004             20 (7)   4.43  2.08  0.00  0.671  0.855  0.230¹
Saint-Galmier/Loire     H80    1883             22 (20)  4.41  3.07  4.19  0.751  0.851  0.120¹
Lyon Calluire/Rhône     H243   1901             26 (20)  3.88  1.84  3.15  0.663  0.773  0.146¹
                        R177   2007             12 (7)   4.68  0.36  0.00  0.896  0.893  −0.004
                        R70    2004             20 (7)   4.23  0.80  0.00  0.706  0.832  0.164¹
                        R101   2005             20 (9)   4.43  2.09  0.00  0.689  0.854  0.203¹
                        R180   2007             19 (14)  4.54  2.22  4.07  0.818  0.874  0.067¹
Pont d’Ain Gare/Ain     H186   1934–37          7        3.84  1.73  0.00  0.679  0.798  0.161¹
                        R186   2007             20 (7)

N, sample size (number of subsamples in parentheses); NA, allelic richness; NP, number of private alleles; NR, number of rare alleles (frequency < 0.05); HO, observed heterozygosity; HE, unbiased measure of expected heterozygosity; FIS, inbreeding coefficient.

¹ Significant deviation from zero. Bold values indicate significant differences between historical and recent populations.

Figure 1.

 Geographical locations of Ambrosia artemisiifolia study populations. Population IDs are as in Table 1. The locations of historical populations were assigned to the same locations as recent populations, according to the collection sites described in herbarium specimens.

Microsatellite analysis

Genomic DNA was extracted from dried leaves using a Qiagen DNeasy™ 96 plant kit, and diluted 20-fold. Eight microsatellite loci (Ambart 04, 06, 09, 17, 18, 21, 24 and 27; GenBank accession nos FJ595149–FJ595156), described in Molecular Ecology Resources Primer Development Consortium (2009), were used. PCR was performed using an Eppendorf Mastercycler 5333 thermal cycler (Eppendorf GmbH, Hamburg, Germany) in a final reaction volume of 10 μl containing approximately 10 ng of genomic DNA, 0.04 μM forward primer, 0.16 μM reverse primer, 0.16 μM universal fluorescent M13 primer, 0.5 U Taq DNA polymerase (Qiagen) and PCR buffer, including 67 mM Tris–HCl, pH 8.8, 16.6 mM (NH4)2SO4, 2 mM MgCl2, 0.7 mM β-mercaptoethanol, 0.7 mM each deoxynucleoside triphosphate, 0.05% Brij® 58 (Sigma-Aldrich) and 0.2 mg ml−1 BSA. The amplification profile comprised an initial denaturation at 95°C for 15 min, 30 cycles of 95°C for 30 s, 50°C for 45 s and 72°C for 45 s, eight cycles of 95°C for 30 s, 53°C for 45 s and 72°C for 45 s, and a final extension step at 72°C for 5 min. During the first 30 cycles, the forward primer with an 18 bp M13 tail (5′-TGTAAAACGACGGCCAGT-3′) was incorporated into the PCR products. In the following eight cycles, these products were hybridized with the universal fluorescent M13 primer (Schuelke, 2000). The genotyping of amplified products was carried out on a CEQ 8000 Genetic Analysis System (Beckman Coulter, Krefeld, Germany) using internal size standard-400, and allele sizes were analyzed using the manufacturer’s software. An individual was declared null (nonamplifying at a locus) and treated as missing data after at least two amplification failures.

Data analyses

Microsatellite polymorphism was estimated by the allelic richness (NA) using fstat 2.9.3 (Goudet, 2001). We estimated the number of private alleles (NP), the number of rare alleles (NR, frequency < 0.05), and the observed (HO) and unbiased estimate of the expected (HE) heterozygosity using genalex 6.1 (Peakall & Smouse, 2006). To compensate for the effect of unequal sample sizes between historical and recent populations on genetic diversity measures, we randomly subsampled the larger population of each historical–recent pair down to the sample size of the smaller population, following the recommendation of Leberg (2002). We created 100 subsampled datasets to test whether the estimates of genetic diversity were greater in recent than in historical populations. For the pair of populations with equal sample sizes [Moulins (H68/R68)], we randomly shuffled individuals between populations to calculate the probability of finding a greater difference in diversity measures than the observed difference.
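
The subsampling step described above can be illustrated with a minimal Python sketch. It is not the procedure as run in fstat or genalex: the function names, the genotype encoding (one tuple of two alleles per individual at a single locus) and the fixed random seed are assumptions made for illustration, and the heterozygosity estimator shown is the usual sample-size-corrected form, n/(n − 1) × (1 − Σp²), with n the number of gene copies.

```python
import random
from collections import Counter


def expected_heterozygosity(genotypes):
    """Unbiased expected heterozygosity at one locus.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    """
    alleles = [a for g in genotypes for a in g]
    n = len(alleles)  # number of gene copies (two per individual)
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def subsampled_he(larger, smaller_size, n_rep=100, seed=0):
    """Mean HE of the larger sample over random subsamples of smaller_size."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    values = [
        expected_heterozygosity(rng.sample(larger, smaller_size))
        for _ in range(n_rep)
    ]
    return sum(values) / n_rep


# Hypothetical single-locus data: a larger sample of 20 individuals
# reduced to the size of a smaller sample of 7 individuals.
larger_sample = [(150, 152), (150, 154), (152, 152), (154, 156)] * 5
mean_he = subsampled_he(larger_sample, smaller_size=7)
```

Averaging over replicate subsamples puts the diversity estimate for the larger sample on the same footing as the smaller sample, so differences between the two reflect more than sampling effort.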

The independence of genotype distribution (linkage disequilibrium) between pairs of loci was tested using genepop 4.0 (Rousset, 2008). Levels of significance for multiple tests were determined using sequential Bonferroni adjustments (Rice, 1989). Tests for deviation from Hardy–Weinberg equilibrium at each locus were conducted using exact tests as implemented in genepop. Weir & Cockerham’s (1984) estimator of the inbreeding coefficient (FIS) was calculated over all loci using genetix 4.05 (Belkhir et al., 1996–2004). The significance of the deviation of FIS from zero was tested using 10 000 random permutations. We checked for potential problems with stuttering, large allele dropout and null alleles using micro-checker (van Oosterhout et al., 2004). We found some indications of null alleles (see Results) and therefore calculated maximum likelihood estimates of the null allele frequencies using the expectation maximization algorithm (Dempster et al., 1977) implemented in freena (Chapuis & Estoup, 2007). A genotype dataset corrected for null alleles was also generated.
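
As an illustration of the sequential Bonferroni adjustment mentioned above, the following Python sketch applies the step-down rule to a list of P values: the smallest P is compared with α/k, the next with α/(k − 1), and so on, stopping at the first non-significant test. The function name and the example P values are hypothetical and do not come from this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the test stays significant."""
    k = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # The threshold relaxes from alpha/k to alpha/1 as rank increases.
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # every remaining (larger) P value is non-significant
    return significant


# Hypothetical P values from four tests.
flags = sequential_bonferroni([0.001, 0.04, 0.03, 0.2])
```

In this example only the first test passes: 0.001 is below 0.05/4, but the next smallest value, 0.03, exceeds 0.05/3, so the procedure stops there.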

To account for the high mutation rate of microsatellites in estimating genetic differentiation, we tested whether allele size variation contributed to genetic differentiation by permuting the allele size within loci using spagedi 1.3 (Hardy & Vekemans, 2002). As the null hypothesis was not rejected, we chose FST (Weir & Cockerham, 1984) to estimate the genetic differentiation among populations for our data. Preliminary analysis also revealed weakly structured populations (Nm = 9.8), where FST is a better estimate than RST (Balloux & Goudet, 2002). We tested the significance of genetic differentiation among populations using 1000 permutations and standard Bonferroni corrections without assuming Hardy–Weinberg equilibrium. As FST may be overestimated in the presence of null alleles (Chapuis & Estoup, 2007), we corrected for the positive bias of genetic differentiation by recalculating the FST values using the excluding null allele method in freena. We conducted a Mantel test (Mantel, 1967) between FST uncorrected for null alleles and FST corrected for null alleles with 10 000 random permutations to test whether they were significantly correlated.

Isolation by distance was tested separately for historical and recent populations, using the Mantel test between FST/(1−FST) (FST corrected for null alleles) and the log10-scaled geographic distance matrix between populations in kilometers. The significance of the Mantel test was assessed using 10 000 permutations. To assess the distribution of genetic variation between periods (historical and recent) and among populations within each period, we performed analysis of molecular variance (AMOVA; Excoffier et al., 1992) on the FST matrix corrected for null alleles using genalex. We also performed AMOVA for historical and recent populations separately. The probabilities of variance components were estimated from 10 000 random permutations.
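
The isolation-by-distance test described above can be sketched as a simple permutation Mantel test in Python. This is an illustrative implementation, not the software used in the study: the function names are hypothetical, the test is one-tailed (counting permutations with a correlation at least as large as the observed one), and the inputs are assumed to be symmetric square matrices of pairwise FST and geographic distance in kilometers.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_ibd(fst, dist_km, n_perm=10000, seed=0):
    """Mantel test of FST/(1 - FST) against log10 geographic distance."""
    n = len(fst)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Linearized genetic distance; the diagonal is never used.
    gen = [[0.0 if i == j else fst[i][j] / (1 - fst[i][j])
            for j in range(n)] for i in range(n)]
    geo = [math.log10(dist_km[i][j]) for i, j in pairs]
    observed = pearson([gen[i][j] for i, j in pairs], geo)

    rng = random.Random(seed)
    labels = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute population labels (rows and columns)
        permuted = [gen[labels[i]][labels[j]] for i, j in pairs]
        if pearson(permuted, geo) >= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

Permuting population labels, rather than individual matrix cells, preserves the dependence among pairwise distances that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test.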

To detect the difference in genetic relatedness between historical and recent populations, we calculated Cavalli-Sforza & Edwards’ (1967) chord distance between all pairs of populations using genetix. We used the genotype dataset corrected for null alleles, as recommended by Chapuis & Estoup (2007). This dataset was subsampled 100 times to eliminate the effect of unequal sample sizes between historical and recent populations. The resulting 100 distance matrices were combined using SDM to draw a principal coordinate plot.
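
For readers unfamiliar with the chord distance, the sketch below computes it in Python from allele frequencies. It uses one widely cited formulation, D = (2/(πr)) Σ √(2(1 − Σ √(x·y))), summed over r loci; whether genetix applies exactly this scaling is an assumption here, and the function name and input layout (one dictionary of allele frequencies per locus) are hypothetical.

```python
import math


def chord_distance(freqs_a, freqs_b):
    """Chord distance between two populations.

    freqs_a, freqs_b: lists with one dict per locus, mapping
    allele -> frequency (frequencies sum to 1 within each locus).
    """
    total = 0.0
    for fa, fb in zip(freqs_a, freqs_b):
        alleles = set(fa) | set(fb)
        # Cosine of the angle between the two populations on the
        # hypersphere defined by square-root allele frequencies.
        cos_theta = sum(
            math.sqrt(fa.get(a, 0.0) * fb.get(a, 0.0)) for a in alleles
        )
        # max() guards against tiny negative values from rounding.
        total += math.sqrt(max(0.0, 2.0 * (1.0 - cos_theta)))
    return 2.0 / (math.pi * len(freqs_a)) * total
```

Two populations with identical allele frequencies have a distance of zero, and two populations sharing no alleles at any locus reach the maximum of 2√2/π (about 0.90) under this scaling.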

To compare the population clustering pattern between historical and recent populations, we used a Bayesian approach implemented in structure 2.2 (Pritchard et al., 2000) to cluster similar multilocus genotypes, whilst allowing population admixture and correlated allele frequencies. Because null alleles were present, we conducted the analysis on the original dataset using the option RECESSIVEALLELES = 1 to code null alleles as recessive alleles (Falush et al., 2007). Simulations were run with the number of clusters (K) from one to ten, and were repeated five times for each K to confirm the repeatability of the results. Each run comprised a burn-in period of 10⁵ iterations, followed by 10⁶ Markov chain Monte Carlo (MCMC) steps. As recommended by Evanno et al. (2005), we calculated the ad hoc statistic ΔK on the basis of the rate of change in the log likelihood of data between consecutive K values. We identified the most likely number of clusters (K′) that maximized ΔK and ran an additional 15 simulations at K′. The estimated membership coefficients from the 20 runs at K′ were averaged using the FullSearch algorithm implemented in clumpp (Jakobsson & Rosenberg, 2007), and displayed as box graphs using distruct (Rosenberg, 2004).
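
The ΔK statistic can be sketched in Python as follows. This is a minimal illustration, assuming that ΔK is evaluated from the mean log likelihood across replicate runs, as |L(K + 1) − 2L(K) + L(K − 1)| divided by the standard deviation of L(K); the function name and the log-likelihood values are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K for every K that has both neighbours.

    lnp_by_k: dict mapping K -> list of log likelihoods, one per run.
    Each K needs at least two runs with non-identical values so that
    the standard deviation is defined and non-zero.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # the second-order difference needs K - 1 and K + 1
        second_order = abs(
            mean(lnp_by_k[k + 1])
            - 2 * mean(lnp_by_k[k])
            + mean(lnp_by_k[k - 1])
        )
        result[k] = second_order / stdev(lnp_by_k[k])
    return result


# Hypothetical log likelihoods from two runs at each K.
runs = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-79.0, -81.0],
    4: [-79.5, -80.5],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Because ΔK depends on both neighbours, it cannot be computed for the smallest or largest K tested, so K = 1 can never be selected by this criterion.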


Results

Greater genetic diversity in recent than historical populations

For all pairs of historical and recent populations, the allelic richness (NA) and unbiased measure of expected heterozygosity (HE) were significantly greater in recent populations than historical populations (Table 1). No general difference in the mean number of private (NP) or rare (NR) alleles was found between historical and recent populations. Exceptionally, both NP and NR increased greatly in the recent population of Lyon Calluire (R243) compared with the historical population (H243). No linkage disequilibrium was detected among loci after Bonferroni corrections, and all loci were considered to be genetically independent. Significant deviation from Hardy–Weinberg equilibrium for 32 of 144 tests (18 populations × 8 loci) was detected after sequential Bonferroni adjustment. The significant positive deviation of FIS values for most populations (Table 1) indicated heterozygote deficiency. The diagnostic results using micro-checker found no evidence of stuttering or large allele drop-out for any of the loci. However, the potential occurrence of null alleles was detected in 36 of 144 tests, which mostly corresponded to the deviation from Hardy–Weinberg equilibrium. We included all loci in the data analysis, because null alleles occurred widely across all loci and the populations, and were not attributed to specific loci. Although null alleles may impede the correct inference of population genetic structure, a better alternative to either simply discarding loci or ignoring the null alleles may be to accommodate them in an appropriate analysis, especially when the number of loci is low (Wagner et al., 2006). The frequency of null alleles was estimated for each locus × population combination, and its distribution indicated a median of 0.043 with 25% and 75% quartiles of 0 and 0.097, respectively.

Lower genetic differentiation among recent than historical populations

The global measure of genetic differentiation was low, with FST = 0.0411 and 0.0412 before and after correcting for null alleles, respectively. Before correcting for null alleles, FST was greater for historical (0.058) than recent (0.028) populations (P = 0.042, group test in fstat), which was consistent with the results after correcting for null alleles (0.060 and 0.029 for historical and recent populations, respectively; P = 0.033). The FST values corrected for null alleles were strongly correlated with the uncorrected FST values (R = 0.99, P < 0.0001) in the Mantel test, suggesting that all populations were similarly affected by null alleles.

Tests of isolation by distance were not significant for either historical (R = 0.54, P = 0.082) or recent (R = −0.14, P = 0.277) populations. AMOVA indicated that 92% of the genetic variation lay within individuals, whereas only 0.3% of the variation could be attributed to period (historical vs recent; Table 2). Within each period, a small proportion of the genetic variation was partitioned among populations, but this proportion was greater for historical (6.0%) than recent (2.9%) populations.

Table 2.   Summary table of hierarchical analysis of molecular variance
Source of variation                          d.f.  Variance  % Variance  Statistic     P
Historical and recent populations
 Among period (historical vs recent, S)         1     0.010         0.3  FST = 0.003   < 0.001
 Among populations (P) within period           16     0.147         4.1  FPS = 0.041   < 0.001
 Among individuals (I) within populations     272     0.136         3.8  FIP = 0.040   < 0.001
 Within individuals                           290     3.295        91.8  FIT = 0.082   < 0.001
 Total (T)                                    579     3.588
Historical populations
 Among populations                              8     0.211         6.0  FPT = 0.060   < 0.001
 Among individuals within populations         110     0.235         6.7  FIP = 0.071   < 0.001
 Within individuals                           119     3.084        87.4  FIT = 0.126   < 0.001
Recent populations
 Among populations                              8     0.104         2.9  FPT = 0.029   < 0.001
 Among individuals within populations         162     0.068         1.9  FIP = 0.019   0.015
 Within individuals                           171     3.442        95.3  FIT = 0.047   < 0.001

d.f., degrees of freedom.
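
The percentages and fixation indices in the first block of Table 2 can be recovered from the variance components by simple ratios, as the short Python check below illustrates. The variable names are ours, and the ratios shown are the standard hierarchical definitions (each index is the variance at a level divided by the variance at that level and all levels below it); they are assumed, not confirmed, to match what genalex computes, although they reproduce the rounded values in the table.

```python
# Variance components for the combined analysis (Table 2).
var_period = 0.010        # among periods (historical vs recent)
var_pop = 0.147           # among populations within period
var_ind = 0.136           # among individuals within populations
var_within = 3.295        # within individuals

total = var_period + var_pop + var_ind + var_within

# Percentage of the total variance at each level.
percent = {
    "period": 100 * var_period / total,
    "population": 100 * var_pop / total,
    "individual": 100 * var_ind / total,
    "within": 100 * var_within / total,
}

# Hierarchical fixation indices as variance ratios.
f_st = var_period / total                          # periods vs total
f_ps = var_pop / (var_pop + var_ind + var_within)  # populations in period
f_ip = var_ind / (var_ind + var_within)            # individuals in population
f_it = 1 - var_within / total                      # individuals vs total
```

Rounded to the precision of the table, these ratios give 0.3%, 4.1%, 3.8% and 91.8% of the variance, and indices of 0.003, 0.041, 0.040 and 0.082.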

Closer genetic relationships among recent than historical populations

The principal coordinate axes PC1 and PC2 together explained 40% of the total genetic variation (Fig. 2). In contrast with the widespread dispersion of historical populations in the principal coordinate plot, recent populations showed reduced genetic distances and generally aggregated in the center of the plot [the exception being Pont d’Ain Gare (R186)].

Figure 2.

 Principal coordinate (PC) plot based on Cavalli-Sforza & Edwards (1967) chord distance. Populations of Ambrosia artemisiifolia: historical populations, squares; recent populations, circles.

In the Bayesian clustering analysis, ΔK indicated that three clusters best explained the genetic structuring of historical and recent populations (Fig. 3). Historical populations were roughly assigned to three clusters (Fig. 4), as their membership coefficients were greatest in the gray cluster (H50, 68, 177, 80), dark-gray cluster (H243, 70, 101, 186) and white cluster (H180). Populations assigned to the gray cluster had negative PC1 scores [the exception being Decize (H177)], whereas populations assigned to the dark-gray cluster had positive PC1 scores (Figs 2 and 4). However, this grouping was not applicable to recent populations, because the variation in the size of each cluster across populations was greatly reduced (i.e. the clusters became more evenly distributed; Fig. 4).

Figure 3.

 The logarithm of the probability of the data, L(K) [mean likelihood (open circles) ± SD], and the second-order rate of change in the probability between successive runs, ΔK (filled circles), as a function of K, the number of clusters.

Figure 4.

 Estimated structure of (a) historical and (b) recent populations of Ambrosia artemisiifolia using Bayesian clustering analysis. Each population is partitioned into three clusters (white, gray and dark-gray).

Discussion


Studies of the genetic diversity and structure of invasive species may provide a useful insight into the critical evolutionary processes shaping currently invasive populations. In particular, a comparative study between historical and recent populations allows an assessment of the impacts of evolutionary events occurring during the early and later stages of invasion. Although the genetic diversity of an invasive species may be reduced as a result of bottleneck and founder effects during the early stage of invasion, successful invasions may occur when multiple introductions increase the available gene pool, which, in turn, may be distributed and exchanged across populations by gene flow to increase genetic diversity for successful invasion (Erickson et al., 2004; Marrs et al., 2008; Rosenthal et al., 2008; Andreakis et al., 2009).

The allelic diversity and heterozygosity of recent populations were increased significantly compared with the historical populations (Table 1). This may occur when newly introduced alleles have been incorporated from other genetic sources (i.e. repeated introductions from North America; Genton et al., 2005) and/or when post-invasion gene flow has distributed alleles across populations. An interesting example for the former was Lyon Calluire, which shows a remarkably increased number of private alleles and rare alleles, probably as a result of multiple introductions from North America or other French populations not included in this study. A close examination of herbarium collections found that A. artemisiifolia has been introduced multiple times and has established independently across a broad range since the late 19th century (Chauvel et al., 2006). Populations have shown a rapid increase in size and have colonized a widespread range in France. Historical populations are generally small according to the descriptions in herbarium specimens (Chauvel et al., 2006), whereas some recent populations are often characterized by a large size (> 10 000 plants). Thus, our sampling scheme may represent a relatively smaller proportion of the available gene pool in recent than historical populations, and underestimate the actual allelic diversity in recent populations.

The observed heterozygote deficiency may result from inbreeding, the Wahlund effect or the presence of null alleles. However, inbreeding is the least likely explanation because wind pollination and outcrossing are known to be the prevalent modes of reproduction in A. artemisiifolia (Fumanal et al., 2007a; Friedman & Barrett, 2008). The Wahlund effect is also unlikely because all the study populations appear to have persisted as distinct populations for at least several decades. Null alleles are common for rapidly evolving microsatellite loci (Dakin & Avise, 2004), and thus may be the most probable explanation for the deviation from Hardy–Weinberg equilibrium. The degradation of DNA obtained from herbarium specimens (Miller et al., 2002; Cozzolino et al., 2007) may also generate null alleles.

The low FST and AMOVA results (Table 2) together indicate a low level of genetic differentiation among populations. Importantly, recent populations were less differentiated than historical populations, suggesting that recent populations were shaped by strong gene flow. The lack of isolation by distance indicates that gene flow may occur between distant populations. Principal coordinate and Bayesian clustering analyses also support gene flow and subsequent population admixture. Recent populations generally showed decreased genetic distances (Fig. 2) and were more weakly structured than historical populations (Fig. 4). This was especially evident in the Lyon Calluire (R243), Lucenay-lès-Aix (R70) and Bassens (R101) populations, which showed a disproportionately greater increase in allelic and genetic diversity than other populations (Table 1). These three populations also moved a greater distance than other populations in the principal coordinate plot (Fig. 2), indicating an increased frequency of alleles shared with other populations. Bayesian clustering results also indicated that they became indistinguishable from other populations through reduced membership of the dark-gray cluster and increased membership of the white or gray cluster (Fig. 4).

In sum, our comparative analysis involving historical and recent populations of A. artemisiifolia has revealed that recent populations increased their genetic diversity through strong gene flow and subsequent admixture of different genotypes during rapid colonization and range expansion. By comparing North American and French populations, Genton et al. (2005) reported that invasive French populations of A. artemisiifolia have resulted from the admixture of North American populations introduced multiple times from different genetic sources. Our results highlight the role of gene flow in forming recent French populations from early established populations. Currently successful populations of A. artemisiifolia may have arisen from the interplay of the incorporation of new gene pools by multiple introductions, active gene flow and subsequent population admixture to maintain high genetic diversity, responding to novel selection pressure in the introduced range of France.

Acknowledgements


We are grateful to M. Rausher and two anonymous reviewers for insightful comments on the manuscript. We thank the coordinators of the following herbaria for allowing the collection of samples from specimens: Jardin Botanique de la Ville de Bordeaux (BORD), Vrije Universiteit Brussel (BRVU), Institut des Herbiers Universitaires de Clermont-Ferrand (CLF), Conservatoire et Jardin botaniques de la Ville de Genève (G), Université Claude Bernard (LY), Université de Provence Centre St-Charles, case 4 (MARS), Université Montpellier II (MPU), Université de Neuchâtel (NEU) and Muséum National d’Histoire Naturelle (P). We also wish to thank S. Michel and F. Pernin for technical assistance and M. P. Chapuis for thoughtful comments and suggestions. This work was supported by a post-doctoral grant to Y. J. Chun from the Research and Transfer of Technology, Regional Council of Bourgogne, and the University of Bourgogne.