Trait evolution during a rapid global weed invasion despite little genetic differentiation

Abstract Invasive species often possess a great capacity to adapt to novel environments in the form of spatial trait variation, as a result of varying selection regimes, genetic drift, or plasticity. We explored the geographic differentiation in several phenotypic traits related to plant growth, reproduction, and defense in the highly invasive Centaurea solstitialis by measuring neutral genetic differentiation (F ST), and comparing it with phenotypic differentiation (P ST), in a common garden experiment in individuals originating from regions representing the species distribution across five continents. Native plants were more fecund than non‐native plants, but the latter displayed considerably larger seed mass. We found indication of divergent selection for these two reproductive traits but little overall genetic differentiation between native and non‐native ranges. The native versus invasive P ST–F ST comparisons demonstrated that, in several invasive regions, seed mass had increased proportionally more than the genetic differentiation. Traits displayed different associations with climate variables in different regions. Both capitula numbers and seed mass were associated with winter temperature and precipitation and summer aridity in some regions. Overall, our study suggests that rapid evolution has accompanied invasive success of C. solstitialis and provides new insights into traits and their genetic bases that can contribute to fitness advantages in non‐native populations.


| INTRODUC TI ON
Populations of invasive alien species occur in a range of environments with particular abiotic conditions and biotic selection pressures, which often result in rapid phenotypic changes in the new habitat (Blanquart et al., 2013;Bossdorf et al., 2005;Montesinos, 2021).
Many factors, such as gene flow, genetic drift, mutations, and standing genetic variation, can influence the extent and rate of population differentiation and create a mosaic of geographically independent populations in terms of evolutionary potential (Colautti et al., 2009;Lai et al., 2019;Matesanz et al., 2012;North et al., 2011). The success of an invader is determined by its ability to respond to the conditions it encounters and its rate of adaptation, which is influenced by the interplay between demographic and genetic processes and dispersal abilities (García-Ramos & Rodríguez, 2002). Evidence of rapid evolution during species colonization appears to be a common feature of biological invasions and can increase invader fitness and its impact on the community (Bossdorf et al., 2005;Colautti & Lau, 2015;Prentis et al., 2008). Consequently, biological invasions present unique opportunities to study contemporary evolution and species range limits (Keller & Taylor, 2008;Lee, 2002), with a large body of research devoted to understanding the factors that contribute to invasion success (Enders et al., 2020). Integrating knowledge from invasion biology about species evolutionary responses can be highly informative in terms of biodiversity conservation and predicting species range expansion in the context of climate change (Bock et al., 2015).
Identifying and understanding the processes and mechanisms behind invasive species colonization, establishment, and range expansion, as well as the role played by genetic diversity in invasion success and the traits associated with invasiveness, and how they evolve, remain central to the field of invasion genetics (Bock et al., 2015). During the past 15 years, this has benefited from the rapid progress in genomic approaches that allow us to capture patterns of genome-wide variation and add a much finer resolution into several aspects of the invader success including the colonization history, population demography, and evolution of invasiveness (Matheson & McGaughran, 2022;McGaughran et al., 2021;North et al., 2021). Different species traits have been identified as important determinants of invasive success including a fast growth rate, short life cycles, high reproductive output, good dispersal ability, and a high resource acquisition capacity (Hodgins et al., 2018).
However, given the complexity and dynamics of the invasion process, different sets of traits appear to be important under different contexts and at different stages of invasion (Hodgins et al., 2018;van Kleunen et al., 2015). For species that have recently become invasive, comparison of genotypes and phenotypes in their novel environments relative to that of their native environments ("home vs. away"; Pigliucci, 2001;Hierro et al., 2005) can be highly informative about the phenotypic traits that vary between invasive and noninvasive genotypes, the role of selection and the genetic architecture of invasiveness (Blanquart et al., 2013;Bock et al., 2015;Keller & Taylor, 2008). For example, Turner et al. (2021) used genotyping by sequencing (GBS) and conducted a genome-wide association study (GWAS) on native and invasive populations of Centaurea diffusa Lam. grown under common garden conditions for two generations. They found that invasive plants have significantly larger leaves compared with native plants, a trait that is associated with increased biomass and fitness, and they also identified several potential candidate genes for local adaptation (Turner et al., 2021). Recent studies of native and introduced populations of Mimulus guttatus Fisch. ex DC. generated genome-wide SNPs markers and conducted demographic analyses to reconstruct the species global invasion and understand how the species may evolve toward increased invasiveness (Puzey & Vallejo-Marín, 2014;Vallejo-Marín et al., 2021). Another study by Hodgins et al. (2012) used a microarray experiment to examine differences in gene expression patterns between native and introduced populations of Ambrosia artemisiifolia L. under light and nutrient stress treatments but found no consistent pattern of up-or down-regulation of these genes between ranges. Although during the last decade we have gained important insights into the molecular mechanisms of species adaptation, our understanding of how the genotype translates into phenotype in response to different environments remains limited (Neinavaie et al., 2021).
A powerful approach to study adaptive evolutionary changes in populations of invasive species is to compare the genetically determined quantitative trait differentiation (termed Q ST, Spitze, 1993;Leinonen et al., 2013) with neutral genetic differentiation (F ST , Wright, 1951). However, estimation of Q ST requires prior knowledge about relatedness between sample individuals, which for some species might not be feasible. An alternative to Q ST is P ST in these cases. The phenotypic differentiation index (P ST ) uses purely phenotypic data to compare the combined influence of genetic adaptation, phenotypic plasticity, and genetic drift as causes of population temperature and precipitation and summer aridity in some regions. Overall, our study suggests that rapid evolution has accompanied invasive success of C. solstitialis and provides new insights into traits and their genetic bases that can contribute to fitness advantages in non-native populations.

K E Y W O R D S
biogeography, divergent selection, invasive alien species, P ST -F ST comparison, reproductive strategy, single-nucleotide polymorphisms, yellow starthistle differentiation (Leinonen et al., 2006(Leinonen et al., , 2013. Under the neutral expectation, P ST equals F ST , with any significant deviation indicative of an additional influence of selection on P ST (i.e. a P ST < F ST is evidence of stabilizing selection while a P ST > F ST indicates a history of adaptive divergence, Leinonen et al. (2006). Centaurea solstitialis L. (yellow starthistle, Asteraceae) is an outcrossing annual diploid herb (Irimia et al., 2017) with a genome size of 1C = 851 Mbp (Bancheva & Greilhuber, 2006), native to the Mediterranean region. Seeds of C. solstitialis were introduced across the world during the past 200-250 years as a contaminant of alfalfa seed and established invasive populations in South America, North America, South Africa, and Australasia (DiTomaso et al., 2006). Previous work using population genomic analyses across a broad range of C. solstitialis in Eurasia, North America, and South America has showed that native populations are genetically and geographically structured into four groups including an Asian group (Turkey, Armenia, and Uzbekistan), an Eastern European group, a southern Greece group, and a Western European group (Barker et al., 2017). The same study found evidence that populations in Western Europe are derived from an ancient admixture event among populations from Eastern Europe and Asia and that they served as a bridgehead source for introductions into the Americas (Barker et al., 2017). Moreover, the study by Barker et al. (2017) identified similar levels of genetic diversity between native and introduced ranges and overall low genetic structure in the invaded range as well as evidence of multiple introductions from different sources in some areas of North America. Several phenotypic studies in this species demonstrated a higher performance of invasive genotypes compared to native genotypes including an increase in plant size (Barker et al., 2017;Dlugosch et al., 2015;Eriksen et al., 2012), increased seed size Hierro et al., 2013Hierro et al., , 2020, and increased biomass production (Montesinos & Callaway, 2018a, 2018bWidmer et al., 2007), faster growth rates (Graebner et al., 2012;Montesinos & Callaway, 2018a, 2018b, earlier flowering time (Dlugosch et al., 2015;Eriksen et al., 2012), increased leaf chemical defenses (Sotes et al., 2015), and higher reproductive output (Dlugosch et al., 2015), which suggests that rapid local adaptation has occurred in the introduced range. Still, there are few genetic studies in this species that have aimed to gain insights into evolutionary forces and identify potential adaptive traits contributing to fitness advantages and invasion across the global invasive range. Thus, there is great potential to combine the power of genomic tools with ecological studies to collate such information for this invader.
In this study, we used a combination of approaches to search for genetic signals of divergent selection in C. solstitialis. First, we assessed trait differences among regions and between the native and non-native ranges to gain insights into their evolutionary divergence (P ST -F ST comparison) by growing plant individuals under a common garden environment. We then tested for associations between the phenotypic traits showing differentiation and the environmental variables at the sites of origin. In addition, we included samples from Australia -a region not considered in previous genetic and evolutionary studies of yellow starthistle, to have additional introduced populations with which to test for selection and trait differentiation.  (Hierro et al., 2020). At least 30 individuals were sampled within each population, and population sampling aimed to include a broad geographic and environmental range for each country (hereafter referred to as regions). Ten populations were from Turkey, 10 from Spain, 10 from Argentina, four from Chile, nine from California (USA), and seven from Australia (see Table S1 and Figure 1 for information on sampling sites).

| Common garden experiment
Seeds were germinated in a glasshouse at the Botanical Garden of the University of Coimbra in 50 cell plug trays containing commercial soil (Substratos Profissionais Leal and Soares S.A., Portugal) and seedlings were transplanted into 2 L square plastic pots (one plant per pot) filled with the same type of potting soil about 3 weeks after germination.
Plants were kept in an insect excluded glasshouse where they experienced a Mediterranean climate (sunny and hot summer which is typical for Coimbra) and were grown to maturity and senescence (March-October 2017, which overlaps with the plant life cycle in the wild). We recorded six phenotypic traits on each individual that reached reproductive stage and survived through senescence (N = 370) including (1) days to bolting; (2) days to first flower; (3) length of the largest spine; (4) final plant height; (5) capitula number; and (6) seed mass. These traits relate to plant growth, reproductive output and herbivore defense (flower spinescence) and were shown to be important to C. solstitialis invasive success (Agrawal et al., 2000;Barker et al., 2017;Dlugosch et al., 2015;Hierro et al., 2020). To obtain an F1 generation of seeds for seed mass weighing, we performed experimental crosses between individuals within each of the 50 populations included in this study (between different individuals belonging to the same population i.e., intra-population, see methodology in Irimia et al., 2021) and scored the mass of the seeds with pappus (fibrous outgrowths) for the total number of seeds produced. Final plant height (stem base to the highest point) and spine length (base to tip) were measured to the nearest millimeter using a flexible ruler, at the end of the experiment. A plant was considered to have initiated bolting when a flowering stem of ~5 cm tall started to extend from the basal rosette. At the end of the reproduction period, we counted all the capitula and lateral floral buds on each individual. Once the fruit ripened, we dissected the capitulum and counted the total number of ovules and viable seeds. Seeds obtained from the controlled crosses were stored in paper bags at room temperature for 6 months and then weighed on a Kern ALJ analytical balance (Balingen, Germany) to the nearest milligram.

| Environment characterization
We obtained climate data for the period 1970-2000 for all 50 maternal field sites from WorldClim 2 (Fick & Hijmans, 2017, see www. world clim.com/version2), at a spatial resolution of 2.5 arc-minutes.
The data consisted of 19 bioclimatic variables related to temperature (bio1--bio11) and precipitation (bio12-bio19) for different periods of a year. To quantify climatic differences among regions, we used radar plots and conducted a standardized and centered PCA (principal component analysis) using the (prcomp) function in the package factoextra (Kassambara & Mundt, 2016). We generated a PCA biplot and radar plots to visualize population distribution in the climatic space.

| ddRADSeq library preparation
We selected 194 individuals sampled across the six regions for double digest DNA (ddRADSeq), a reduced representation sequencing and genotyping method that has previously been applied with success to this species (Barker et al., 2017). Genomic DNA was isolated from silica gel dried leaves sampled from 8-week-old individual plants grown in the glasshouse, using the CTAB protocol (Doyle & Doyle, 1987). Leaf tissue was ground to a fine powder with the Tissue Lyser II (Qiagen) for 2 min at 25 Hz. Following extraction, DNA quality and quantity was assessed on a Nanodrop 2.0 spectrophotometer (Thermo Fisher) and a Qubit 2.0 fluorometer (Invitrogen Life Technologies) using the QuantiFlour dsDNA sample kit (Promega). A starting quantity of 500 ng of purified DNA from each individual was digested with the restriction enzymes, Pst1 (recognized sequence and cut site: CTGCA^G), and Mse1 (T^TAA), followed by the ligation of unique paired combinations of individual P1 and P2 barcoded adapters (Table S2)

| Morphological traits, phenotypic and genomic divergence indices
All analyses were conducted in R v3.5.2 (R Development Core Team, 2014). Phenotypic data were checked for heteroscedasticity and normality by Levene's and Shapiro-Wilk tests (Levene, 1960;Shapiro & Wilk, 1965). Trait associations were tested with Pearson's correlation tests. To test the differentiation of phenotypic traits among native (Turkey, Spain) and non-native (Argentina, Australia, California, Chile) regions, we generated generalized linear mixedeffects models by using the glmer function in the lme4 package (Bates et al., 2015). We used region as a fixed factor and population as a random factor and germination time as a covariant to account for potential growth differences between different plants. If the models indicated significant differences among regions, we applied Tukey HSD post hoc tests with p-values < 0.05 to infer which pairs of regions differed. To test for differences in phenotypic traits between native vs. non-native ranges, we generated generalized linear mixed-effects models with range as a fixed factor and population nested by region as a random factor. We used the dropterm function to obtain differences in AIC values across models. We performed a principal component analysis (standardized and centered PCA), to visualize trait differences between the native and non-native ranges We used the psych and stats packages in R (Revelle, 2018) to test for trait correlations, generate pairwise scatter plots, and compute Pearson correlation coefficients.
To compare the level of phenotypic and genetic divergence between native and non-native populations, we calculated P ST , an index that assesses differentiation among populations in quantitative traits. P ST for each trait and each paired region combination as well as between sets of regions combined into native and non-native ranges was calculated using a Bayesian approach following Leinonen et al. (2006). We fitted a linear model with population and region as a random effect and the trait of interest as the response variable using a Gibbs sampler implemented in the software WinBUGS 1.4.3 (Lunn et al., 2000) and specifying the corresponding mean value of F ST (for F ST calculation see the paragraph below on neutral population genetic structure). We assumed trait heritability to be 1 as is standard in these analyses (Leinonen et al., 2006). Posterior distributions were obtained by running five independent chains (50,000 iterations) after a burn-in of 1000 iterations. Bayesian 97.5% credibility intervals were estimated for both P ST and P ST -F ST difference.
If the credibility interval of the P ST -F ST difference was higher than zero, we regarded it as an indication that the expression of the trait tested is putatively under selection. Alternatively, when the P ST -F ST credibility interval overlaps zero, we interpreted that the observed degree of differentiation at quantitative traits could be the outcome of genetic drift (Leinonen et al., 2006(Leinonen et al., , 2013.

| Phenotypic and environmental associations
To overcome collinearity, we used principal component analysis (PCA) to reduce the 19 bioclimatic variables to two or three main principal components that captured at least 85% of the total variance. We then used these principal components to build generalized linear mixed models to quantify the relationship between the phenotypic traits and the environmental characteristics of the maternal field site, at the regional level (i.e. within each region). We tested the effects of the fixed variables using the dropterm function and Chisq test.

| De novo SNP discovery
Sequencing data was demultiplexed or assigned to each sample individual according to their unique paired barcode sequences, and reads were quality filtered using Stacks 2.2 (Catchen et al., 2011(Catchen et al., , 2013 to remove reads containing adapter sequence and low-quality base scores (a phred score below 10). The average number of reads per individual after demultiplexing and filtering was 6.5 M (range 300 K-44 M). All reads were end trimmed to 115 bp (combined paired read size 230 bp) to ensure that they were of the same length before running it through the denovo_map pipeline. We used only the complete paired reads in the mate files 1.fq and 2.fq to perform the alignment in denovo_map.pl with the following set of parameters: a minimum coverage depth of five to create a stack (−m = 5), a maximum mismatch distance of two nucleotides between loci when processing a single individual (−M = 2), a maximum of two stacks at a single locus (−X = 2), and a mismatch distance of two nucleotides between loci (−n = 2), to account for the possibility of fixed differences at loci in individuals when creating the catalog of loci, according to Barker et al. (2017). We also tested a higher stack depth (−m = 10), but this did not significantly change the results. The C. solstitialis individuals were grouped into six geographic regions by supplying the population map into the denovo pipeline. We obtained 526,533 variant sites (unfiltered) across 194 individuals. Fifty individuals had sequenced poorly with less than 500,000 reads each, so we decided to remove them and keep 144 individuals (24 per region × 6 regions) for all the subsequent analysis. VCFtools were used to filter the genotype data for the highest quality genotype calls by filtering out indels, including only bi-allelic sites (-min-alleles 2 -max-alleles 2), including only a minimum genotyping proportion per population of 0.9, and retaining only sites with a minor allele frequency greater than 0.05 in the whole dataset. These stringent filtering steps retained high-quality genotypes for 2138 variant sites (SNPs) across 144 individuals.

| Genomic signatures of selection
We conducted an outlier analysis to detect loci that were particularly differentiated and potentially linked to loci under selection. We used three methods to identify a consensus list of loci and minimize false positives that each individual method might find. The first software, Bayescan, uses a Bayesian likelihood method that assumes a Dirichlet distribution of allele frequencies between populations and a reversible-jump MCMC algorithm to calculate a posterior probability that each locus is under selection (Foll & Gaggiotti, 2008).
We conducted 10 pilot runs each consisting of 5000 iterations, with a burn-in of 50,000 iterations, a thinning interval of 10 and prior odds for neutral model set to 10, for a total of 3 replicates runs. The second software, OutFlank calculates a normal distribution of FST values from all SNPs and detects outliers using left and right-tail trim fractions. This distribution is then used to assign q-values to each locus to detect outliers that may be due to spatially heterogeneous selection (Whitlock & Lotterhos, 2015). Finally, the last software, PCAdapt is based on principal component analysis to detect outliers where each SNP is regressed against each principal component, with outliers extracted using z-scores (Privé et al., 2020). Combining the results from each method, we compiled a list of putative outlier SNPs.

| Population neutral genetic structure
Genetic diversity of each region was calculated at 1975 neutral SNPs (see previous section that identified putatively outlier SNPs that were excluded from this analysis) as H o (observed heterozygosity), H e (expected heterozygosity) , AR (allele richness), and F IS (inbreeding coefficient) using the function divBasic in the R package diveRsity (Keenan et al., 2013). We calculated pairwise region differentiation and its 95% CI between native and non-native ranges using the standardized allelic variance F ST with the diveRsity package (Keenan et al., 2013). We estimated effective population size (Ne) using NeEstimator V2.1 (Do et al., 2014) and the linkage disequilibrium method assuming random mating (Sun & Ritland, 1998).
We assessed neutral population genetic structure at 1975 SNPs in STRUCTURE 2.3.4 (Pritchard et al., 2000) by implementing a model of correlated allele frequencies (Falush et al., 2003) and admixture, and applying the default setting for all other parameters. Ten independent runs for all values of K (number of genetic clusters) between 1 and 8 were run using an MCMC length of 1,000,000 generations following a burn-in of 100,000 generations in accordance with recommended practice (Gilbert et al., 2012). STRUCTURE HARVESTER (Earl & von Holdt, 2012) program was used to carry out downstream processing of STRUCTURE results to calculate Evanno's Δk value to determine the optimal value of K (Evanno et al., 2005) and prepare an input file for CLUMPAK (Kopelman et al., 2015) to generate bar graphs of population structure. Population structure at neutral SNPs was also visualized using a discriminant analysis of principal components (DAPC) (Jombart et al., 2010) as implemented in the R packages, aDeGeneT 2.1.1 (Jombart, 2008) and ade4 (Dray & Dufour, 2007).

| Environment characterization and associations with the phenotypic traits
The PCA on 19 bioclimatic variables showed distinct clustering among regions in terms of climatic space and explained 57.5% variation in the first two PCs axes ( Figure S2). Plants in the two native regions experience a higher temperature seasonality (bio4). By comparison, individuals in the introduced regions experience higher annual mean temperatures (bio1) and mean diurnal ranges (bio2), and different precipitation patterns: a wet summer in Argentina (bio18) and Australia (bio14, bio17, bio18), a wet cold season in California (bio16 and bio19), and a high precipitation seasonality in Chile and California (bio 15) ( Figure S3). Because we found little overlap in several bioclimatic factors across different geographic regions, we conducted a regional analysis to test for associations between the phenotypic traits and the climate within each region. In Turkey, capitula number showed a negative association with PC1 ( Figure 3, Table S4a), which was mainly correlated to the winter temperature and the mean temperature of the wettest quarter (bio11, bio6, bio8, variables that showed negative loadings on the PC1) and to the precipitation of the warmest quarter (bio18, positive loadings on the PC1) (Table S5). Spanish individuals showed no relationship with the climate variables for any of the reproductive traits. In addition, we found no association between the climate and capitula number in Argentinean and Australian individuals (Table S4a). In Chile, capitula number was positively associated with PC1 ( Figure 3, Table S4a), which was mainly correlated to the mean temperature of the warmest quarter and mean temperature of the driest quarter as well as to the annual mean temperature and maximum temperature of the warmest month (bio10, bio9, bio1, bio5, all showing negative loadings on the PC1) and to isothermality and precipitation of the driest quarter (bio3, bio17, both showing positive loadings on the PC1) (Table S5). Lastly, capitula number of Californian individuals was positively associated with PC2 ( Figure 3, Table S4a), which was mainly correlated to the winter temperature and the mean temperature of the wettest quarter (bio6, bio8, bio11, positive loadings on the PC2) and to mean diurnal range (bio2, negative loading on PC2) (Table S5).
Seed mass of Turkish individuals showed a positive association with PC1 ( Figure 3, Table S4b). In Argentina, seed mass was negatively associated with PC2 ( Figure 3, Table S4b), which was mainly correlated to the maximum temperature of the warmest month and precipitation seasonality (bio5, bio15, both showing positive loadings on the PC2) and to winter precipitation and the precipitation of the driest quarter (bio19, bio17, both showing negative loadings on the PC2) (Table S5). We found no association between seed mass and climate in Chilean individuals. Seed mass of Californian individuals was negatively associated with PC2 (Table S4b).

| Candidate outlier SNPs detection
Of 2138 SNP markers, 163 were identified as putative outliers.
Three outlier SNPs were identified by all three methods. Another three outliers were identified both by BayeScan and OutFLANK and one outlier both by BayeScan and PCAdapt. Lastly, 12 outliers were identified both by OutFlank and PCAdapt. Consequently, 19 SNPs in total were identified by more than one method. The remaining 144 loci were identified by one method only as following: BayeScan, 4 loci; OutFLANK, 18, and PCAdapt, 122 ( Figure S4, Table S6). F I G U R E 2 Principal component analysis (PCA) on six phenotypic traits -comparison between native and nonnative ranges. PCA1 and PCA2 together explained 54.2% of the inertia variance in the two axes. The first PC axis was negatively associated with the length of the largest spine and positively associated with the days to first flower while the second PC axis was negatively associated with the number of capitula. The larger symbol of the two groups represents the centroid (i.e., the average coordinates of samples in that group).

| Genetic diversity
In general, genetic diversity was similar across native and introduced ranges in terms of allelic richness, observed heterozygosity, and ex-

F I G U R E 3
Capitula number and seed size variation along climatic gradients in Turkey (native range) and Argentina, Chile, and California (introduced range).

| Population genetic structure
STRUCTURE analysis identified K = 2 as the most probable number of genetic clusters, although some further substructure across regions is also evident (Table S7, Figure S5). In the native area, individuals from Turkey were differentiated from those in Spain, although  same two traits also showed climatic associations in some regions (Table S4). In the pairwise P ST -F ST comparisons, no differences were observed between regions in terms of days to bolting, days to flowering, final plant height, spine length, or capitula number (Table S8).

P ST values for seed mass exceeded F ST values between California and the two native regions (Turkey and Spain) as well as between
California and non-native Chile and Australia, with California showing considerably larger seeds compared with these regions. In addition, differences in seed mass were also found between Argentinean and Turkish populations (Table S8).

| DISCUSS ION
Our results add to the evidence of the rapid evolutionary capacity of C. solstitialis and other invasive plant species more generally.
We identified significant trait divergence for seed size and capitula number between the native and non-native ranges. These traits also seem to be rapidly developing trait-climate associations across several of their non-native ranges. We add to knowledge about the genetic diversity of this species across its global range, in particular through expanding sampling in Chile and new sampling in Australia.
The genetic evidence supports Western Europe, represented by Spain, as a bridgehead source of other non-native regions based on limited genetic divergence between Spain and all other non-native regions studied here.
Our evidence of geographical trait differentiation supports previous findings of increased seed mass in the C. solstitialis nonnative range compared to the native range Graebner et al., 2012;Hierro et al., 2013Hierro et al., , 2020 but do not support other evidence that C. solstitialis had evolved towards larger plant size in the introduced area (Barker et al., 2017;Dlugosch et al., 2015;Eriksen et al., 2012;García et al., 2013;Widmer et al., 2007). The fact that we did not detect differences in plant size may be related to the way we scored this trait by measuring the plant height at senescence. Previous studies have used different proxies for plant size including a morphological index that combines information about the leaf number and leaf size at 5 weeks (Barker et al., 2017;Dlugosch et al., 2015). Others have scored the length and width of the fifth true leaf (Eriksen et al., 2012), the rosette diameter (García et al., 2013), and the fresh weights of seedlings at 2 weeks (Widmer et al., 2007). Some of these studies also involved comparing different geographical regions than ours.
The six regions showed differences in several climatic factors. Australia experience wet summers like those in Argentina. A previous study that compared native populations from Eurasia with non-native populations from western USA (California) found that populations in both ranges tend to occupy a similar climatic niche (Dlugosch et al., 2015). We found several significant correlations between the traits we measured and the 30-year average climate data.
Capitula number showed an association with the winter temperature in both Turkey and California, with more capitula being produced in habitats with moderate winters. In Chile, capitula number increased with decreasing summer aridity. Soil water is a known limiting factor in C. solstitialis invasion (Dlugosch et al., 2015). The species emerges in the fall and overwinters as a basal rosette. Higher water availability, coupled with moderate winter temperatures, could provide an advantage in terms of a faster growth and higher reproductive output (i.e. larger plants producing more capitula, as suggested by the positive correlation we found between these two traits in our study). Similarly, seed mass showed distinct climatic associations in different regions. Seed mass was negatively related to precipitation seasonality in both Turkey and Argentina, with smaller seeds found in highly seasonal habitats. In Argentina, seed mass decreased with a decrease in the winter precipitation and an increase in aridity. In California, seed mass decreased with a decrease in the winter temperature. Our results suggest that different selection pressures in terms of climatic abiotic factors could act on C. solstitialis seed size in different regions. In other invasive species such as Echium plantagineum, populations sourced from hot, arid sites produced heavier seeds than populations from wetter sites, as a strategy to ensure reproductive success in arid environments (Konarzewski et al., 2012).
This does not seem to be the case in our study. Hierro et al. (2020) reported the existence of elevation clines in seed size in C. solstitialis for some of the regions they tested including Caucasus and Anatolia, Western Europe, and Western USA. They showed that seeds at higher elevation were larger than seeds at lower elevation but, also, that seeds in the non-native regions tended to be larger (i.e., highelevation seeds in Eurasia were smaller than high-elevation seeds in the Americas). Their study supports the hypothesis of a steady increase in seed mass as C. solstitialis expanded its distribution across the globe (Hierro et al., 2020) which is coherent with our finding of smaller seeds in the native regions where they generally experience higher seasonal variability. Larger seeds usually produce larger seedlings, which is positively associated with survival, increased competitive ability, and the capacity to withstand different hazards (i.e., herbivory, low nutrients, pathogen attack) which is likely to confer competitive advantages in the non-native regions (Coomes & Grubb, 2003;Hierro et al., 2013).
Levels of genetic diversity were similar between ranges as shown before by previous molecular studies on C. solstitialis (Barker et al., 2017;Eriksen et al., 2014). Our estimates of effective population size were somewhat lower in Chile than in other regions, suggesting that this region might have experienced a founder effect population bottleneck following introductions from Spain (Eriksen et al., 2014;Hierro et al., 2009) or it might reflect smaller current population size. Some of the individuals in Chile showed residual assignments to Turkey, suggesting that populations in Asia Minor might have also been contributors to the colonization of South America (see also Eriksen et al., 2014). However, additional simulations of genetic data are needed to test this hypothesis. Populations in Chile occur in small patches, at low densities, and have a slow spread and reduced impact that contrasts with highly invasive populations in Argentina and California (Andonian et al., 2011). Additionally, our observation of no increase in inbreeding coefficient in invasive regions suggests that no shifts in reproductive system from outcrossing towards selfing have occurred across the introduced range of C. solstitialis, supporting previous findings (Sun & Ritland, 1998 (Colautti & Lau, 2015). Nonetheless, our significant P ST -F ST results support a hypothesis of selective divergence in seed mass as well as previous observations that seed mass is putatively an adaptive trait in some of the species' introduced ranges (Hierro et al., , 2013(Hierro et al., , 2020. Likewise, P ST -F ST analysis at the paired regional level indicated divergent selection for increased seed mass in non-native  . According to Miguel et al. (2017), seed dimorphisms in C. solstitialis relate to a task division strategy in which pappus seeds are involved in dispersal and colonization of new habitats, whereas non-pappus seeds ensure site persistence. Both capitula number and seed size showed a positive relationship with plant height suggesting no trade-off between growth and reproduction.
The evolution of increased biomass (Barker et al., 2017) and competitive abilities (Hierro et al., 2022) previously reported in this species, combined with the evolution of seed size that we found in our study could represent an important part of C. solstitialis invasion success.
Our results on trait evolution need to be regarded cautiously since we used seeds collected in the field, thus, we cannot exclude environment of origin effects on the phenotypic traits we measured.
However, we argue that plant maternal effects are probably minor in C. solstitialis, at least for reproductive traits. Maternal effects seem to be more prevalent at the initial plant stages and seed weight appears to be one of the least plastic plant traits (Hierro et al., 2020). Although desirable, comprehensive sampling was not possible within all regions with C. solstitialis presence (i.e. Caucasus, Eastern Europe, and Pacific Northwest). Therefore, the findings and conclusions that we present here are limited to the regions sampled. More extensive sampling within existing regions would better detect variation at these scales.
Despite the extra preparation, we recommend that future research should aim to measure glasshouse-produced seed from controlled crosses to better control for potential maternal effects. Nonetheless, despite all these challenges, our significant results based on P ST Bayesian 97.5% credibility intervals (Leinonen et al., 2013) are still informative about the traits putatively under divergent selection.
Overall, our findings suggest that genetic drift is unlikely to be the sole force behind the observed biogeographic phenotypic variation in C. solstitialis. Evidence consistent with adaptive rapid evolution in phenotypic traits in the introduced range exemplifies how different environmental conditions across world regions result in unique local evolutionary and ecological dynamics within each region. Our results might be relevant to the management of this invasive weed, indicating that management practices should try to prevent or reduce yellow starthistle seed production and seed bank, thereby limiting further local adaptation.

ACK N OWLED G M ENTS
We are grateful to two anonymous reviewers and to Brittany Barker

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare that they have no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
Sequencing data for each individual can be found on NCBI, BioProject ID: PRJNA950038.