Global patterns of phylogenetic relatedness of invasive flowering plants

The ability to predict which naturalized non‐native species are likely to become invasive can help manage and prevent species invasions. The goal of this study is to test whether invasive angiosperm (flowering plant) species are a phylogenetically clustered subset of naturalized species at global, continental and regional scales, and to assess the relationships of phylogenetic relatedness of invasive species with climatic conditions (temperature and precipitation).

For example, more than 13,000 vascular plant species have been transported by humans outside their native ranges and have successfully established self-sustaining populations in recipient regions without direct human intervention (i.e. naturalized) (van Kleunen et al., 2015).
When a naturalized species spreads broadly and causes negative impacts on the environment and human societies, it is considered an invasive species. Because invasive species pose significant threats to biodiversity, and to ecosystem structure and functioning (Mack et al., 2000), they are among the most concerning global issues (Simberloff et al., 2013).
For a non-native species to become invasive, it has to pass a number of barriers (e.g. dispersal, environmental and biotic; Miller et al., 2017) and progress along a continuum with three stages (i.e. introduction, naturalization and invasion), which is termed the introduction-naturalization-invasion (INI) continuum (Blackburn et al., 2011; Divíšek et al., 2018; Qian, 2023a; Richardson & Pyšek, 2012). Previous studies have shown that the underlying drivers can substantially differ among invasion stages (Dawson et al., 2009, 2013; Richardson & Pyšek, 2012). For instance, socioeconomic drivers underlying colonization effort are deemed crucial for introduction and naturalization (Williamson, 2006), whereas the invasiveness of a naturalized species depends largely on the traits of the species and on its interactions with the native species and physical environment of the invaded region (Pyšek et al., 2009). In particular, the main drivers of the naturalization and invasion stages might differ. The ability to predict which naturalized species are likely to become invasive would be crucial to managing and preventing invasions, and to conserving native biodiversity.
Previous studies have found that similarity in ecological traits between introduced and native plant species facilitates naturalization, but differences in ecological traits between these two groups of plants enhance invasion success (Divíšek et al., 2018). Several studies have shown that traits of noninvasive naturalized species tend to differ from those of invasive species (Gallagher et al., 2015; Hamilton et al., 2005; Pyšek et al., 2009), and that differences in ecological traits between native and invasive species are larger than those between native and noninvasive naturalized species (Sandel & Dangremond, 2012; van Kleunen et al., 2010). It is thought that the pattern for native versus noninvasive naturalized species is driven by mechanisms invoking environmental filtering whereas the pattern for native versus invasive naturalized species is driven by mechanisms invoking limiting similarity (Divíšek et al., 2018). These mechanisms might act at different stages of the INI continuum (Divíšek et al., 2018). At the initial stage of introduction, non-native species that are similar to native species in ecological traits might be more successful in establishing in recipient areas, compared with those that are different from native species in ecological traits, as predicted by the environmental filtering hypothesis (Divíšek et al., 2018; Ricciardi & Mottiar, 2006). At the invasion stage, non-native invasive species are expected to differ from native species in ecological traits in order to avoid niche overlap and competition with resident species or to be able to outcompete resident species in the invaded area, as predicted by the limiting similarity hypothesis (Daehler, 2001; Divíšek et al., 2018; Elton, 1958).
To anticipate which introduced species might become invasive, it is crucial to identify which characteristics are correlated with, and potentially drive, species progressing from the naturalization stage to the invasion stage along the INI continuum (Miller et al., 2017). Although key correlates of invasion success have been identified for certain species, measuring relevant functional traits for a large number of species is often impractical (Gallien et al., 2019). For many plant groups, the ecological features that favour invasion success are poorly understood (Gallien & Carboni, 2017).
Previous studies have shown that many functional traits are phylogenetically conserved (Ackerly, 2009; Donoghue, 2008). Thus, phylogenetic relatedness among species in an ecological community could be used as a proxy of similarity in functional traits among the species (Webb, 2000). Strauss et al. (2006) found that in California non-native grass species that are less related to native grass species are more invasive, which appears to suggest that invasive naturalized species are a phylogenetically more clustered subset of species from the overall naturalized species, compared with noninvasive naturalized species. However, Miller et al. (2017) compared the phylogenetic structure of invasive naturalized species with that of all naturalized species in two tree lineages (acacias and eucalypts) in two families in Australia and found no phylogenetic signal underlying changes between naturalization and invasion stages for either group of trees. These mixed results might be partly because each analysis included species from a single family (a relatively narrow lineage), and partly because the two studies analysed plants with different growth-forms (i.e. grasses vs. trees). Several regional-scale studies (e.g. Qian, 2023a, 2023b; Qian et al., 2022; Qian & Sandel, 2023; Zhang et al., 2021) have compared phylogenetic relatedness of invasive naturalized angiosperm species with that of overall naturalized angiosperm species in regional floras in China, North America, and South Africa, and found that invasive plant species are a phylogenetically clustered subset of all naturalized plant species. Whether the findings of these studies apply to other regions across the world needs to be tested.
Testing whether invasive species are a phylogenetically clustered subset of their naturalized species pool is important. If this pattern is ubiquitous, or nearly so, across the world, then one might be able to identify, at least to some degree, which noninvasive naturalized species have a high potential to become invasive over time in a region by determining phylogenetic relatedness between invasive and noninvasive naturalized species in the region. Understanding which naturalized species on a phylogeny have a high probability of becoming invasive is a critical step towards predicting and preventing biological invasion. Studies investigating whether invasive species are a phylogenetically clustered subset of their naturalized species pool are scarce, and such studies at a global extent are lacking. Moreover, knowledge of the relationship between phylogenetic relatedness of invasive species and climate may shed light on the mechanisms underlying the assembly of invasive species across environmental gradients. Studies exploring such relationships are also scarce.
The objectives of this study are threefold. First, I test the hypothesis that invasive naturalized angiosperm (flowering plant) species across the world are a phylogenetically clustered subset of the pool of angiosperm species in the world. Second, I test the hypothesis that invasive angiosperm species are a phylogenetically clustered subset of the pools of naturalized species in angiosperm assemblages at global, continental and regional scales. Third, I determine whether phylogenetic relatedness of invasive species with respect to their naturalized species pools increases, or decreases, with increasing climate stressfulness.

| Geographic coverage
The basic geographic sampling units of this study are geographic units at the level 3 of the scheme of the International Working Group on Taxonomic Databases for Plant Sciences (hereafter TDWG regions or simply regions), which were grouped into nine continents based on the TDWG continental scheme (Brummitt, 2001). I excluded oceanic islands and the Antarctic. As a result, this study included 290 TDWG regions in seven TDWG continents (i.e. including all TDWG continents except for Pacific and Antarctic continents; Figure S1), which are Europe, Africa, Asia-Temperate, Asia-Tropical, Australasia, Northern America and Southern America (Brummitt, 2001). I termed them TDWG continents or simply continents, as in Brummitt (2001).  (Richardson et al., 2000).

| Species distribution data
Botanical nomenclature of species was standardized according to the World Flora Online (www.worldfloraonline.org), using the package U.Taxonstand (Zhang & Qian, 2022). Infraspecific taxa were combined with their respective species. As a result, the final species list of naturalized angiosperms included 10,858 species, 3769 of which were invasive species. A naturalized species that was defined as an invasive species was considered an invasive species in all regions where the species has become naturalized, because complete invasive species lists for individual regions are generally lacking or, in some cases, are not comparable among regions because different regions use different criteria to define invasive species (Qian & Sandel, 2023). This approach of compiling regional invasive species lists is suboptimal; however, because the same approach was used for all regions and thus there appears to be no systematic bias towards a particular region, it allows a robust comparison of results among different regions. Moreover, because the same approach was used in compiling invasive plant species lists for regional floras in previous studies (e.g. Qian, 2023a, 2023b; Qian et al., 2022; Qian & Sandel, 2023), it allows a direct comparison between the results of different studies.

| Phylogeny construction
The package V.PhyloMaker2 (build.nodes.1; Jin & Qian, 2022; also see Jin & Qian, 2023) was used to generate a phylogeny for the 10,858 species in 2589 genera in the data set of this study, using the megaphylogeny with the package, which is an updated and expanded version of the dated megaphylogeny GBOTB reported by Smith and Brown (2018), as a backbone. All family-level clades in my data set were resolved in the megaphylogeny. Of the genera and species in the data set, 92% and 62%, respectively, were included in the megaphylogeny. I added the genera and species in the data set that were absent from the megaphylogeny to their respective families and genera using Scenario 3 of the V.PhyloMaker2 software (Jin & Qian, 2022), which sets branch lengths of added taxa in a family by placing the nodes evenly between dated nodes and terminals within the family and placing a missing species at the mid-point of the branch length of its genus. I pruned the megaphylogeny to generate a phylogenetic tree retaining only the angiosperm species present in the data set. For the phylogenetic metrics used here (see below), values derived from a fully resolved species-level tree are nearly identical to those from a tree resolved only at the genus level (Qian & Jin, 2021). Because nearly all the genera and the majority of species in my data set were resolved in the phylogeny, the phylogenetic metrics derived from the phylogeny are robust (Qian & Jin, 2021). For analyses involving naturalized and invasive species present in a particular continent or region, I extracted a sub-phylogeny from the phylogeny that included only species in the continent or region.
To determine whether invasive angiosperm species are a phylogenetically clustered subset of species from the global angiosperm species pool, I used the above-described approaches to generate a phylogeny including all known angiosperm species (N = 343,596) based on World Plants (https://www.worldplants.de).

| Phylogenetic relatedness metrics
I used net relatedness index (NRI) and nearest taxon index (NTI) (Webb, 2000; Webb et al., 2002) to measure phylogenetic relatedness among species in each assemblage. These two metrics are among the best metrics measuring phylogenetic relatedness and have been commonly used in the literature on community phylogenetics (Mazel et al., 2016), including studies on naturalized and invasive species (e.g. Qian, 2023a, 2023b; Qian et al., 2022; Qian & Deng, 2023; Qian & Sandel, 2023; Zhang et al., 2021). NRI and NTI are, respectively, based on mean pairwise distance (MPD) and mean nearest taxon distance (MNTD), and are defined as follows (Webb, 2000; Webb et al., 2002): $-1 \times (X_{\mathrm{obs}} - \mathrm{mean}(X_{\mathrm{null}}))/\mathrm{SD}(X_{\mathrm{null}})$, where $X_{\mathrm{obs}}$ is the observed value of MPD or MNTD, $\mathrm{mean}(X_{\mathrm{null}})$ is the average expected value of MPD or MNTD for randomized assemblages, and $\mathrm{SD}(X_{\mathrm{null}})$ is the standard deviation of expected values of MPD or MNTD among randomized assemblages. Positive values indicate that species within assemblages are more closely related than expected for a random draw from the species pool, and represent phylogenetic clustering, whereas negative values indicate that species within assemblages are more distantly related than expected for a random draw from the species pool, and represent phylogenetic overdispersion. These two metrics were calculated using the computationally efficient algorithms implemented in the package PhyloMeasures (Tsirogiannis & Sandel, 2016). NRI represents phylogenetic relatedness at a whole community level whereas NTI quantifies phylogenetic relatedness by incorporating only the distances of the closest relative (Cadotte et al., 2018; Webb et al., 2002). I calculated NRI and NTI for invasive species assemblages at different spatial scales (the world as a whole, each TDWG continent and each TDWG region) using tailor-made species pools.
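The definition above can be illustrated with a minimal Python sketch. It is not the analytical algorithm of PhyloMeasures used in this study; it is a plain randomization version, and the eight-taxon distance matrix, the function names and the random seed are all hypothetical choices made for illustration only.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def mpd(dist, taxa):
    """Mean pairwise distance (MPD) among the taxa of an assemblage."""
    return mean(dist[a][b] for a, b in combinations(taxa, 2))


def mntd(dist, taxa):
    """Mean distance from each taxon to its nearest relative (MNTD)."""
    return mean(min(dist[a][b] for b in taxa if b != a) for a in taxa)


def standardized_index(dist, assemblage, pool, metric, n_null=999, seed=1):
    """Return -1 * (X_obs - mean(X_null)) / SD(X_null) for MPD or MNTD.

    Null assemblages are random draws of equal richness from the pool.
    """
    rng = random.Random(seed)
    observed = metric(dist, assemblage)
    null_values = [
        metric(dist, rng.sample(pool, len(assemblage))) for _ in range(n_null)
    ]
    return -1 * (observed - mean(null_values)) / stdev(null_values)


def toy_distances():
    """Hypothetical pool of 8 taxa in two clades (taxa 0-3 and taxa 4-7).

    Distances are 2 within a clade and 10 between clades.
    """
    def distance(i, j):
        if i == j:
            return 0
        return 2 if i // 4 == j // 4 else 10

    return [[distance(i, j) for j in range(8)] for i in range(8)]


dist = toy_distances()
pool = list(range(8))

# All four members of one clade: expected to look clustered on both metrics.
nri_clustered = standardized_index(dist, [0, 1, 2, 3], pool, mpd)
nti_clustered = standardized_index(dist, [0, 1, 2, 3], pool, mntd)

# Two members from each clade: larger MPD than the average random draw.
nri_mixed = standardized_index(dist, [0, 1, 4, 5], pool, mpd)
```

In this toy pool, the single-clade assemblage yields positive NRI and NTI, whereas the assemblage split evenly between the two clades yields a negative NRI, which matches the sign convention described above.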
To determine whether the global invasive angiosperm species are a phylogenetically clustered (or overdispersed) subset of the global angiosperm flora, I calculated NRI and NTI for the invasive species, using all angiosperm species in the world as the species pool, and compared the observed value of each metric with a null expectation generated from 999 random draws of equal species richness from the species pool. I took the same approach to determine whether invasive angiosperm species in the world and in each of the seven TDWG continents are a phylogenetically clustered (or overdispersed) subset of their respective naturalized angiosperm floras. The species pool used to calculate NRI and NTI included all naturalized angiosperm species of the world or the continent under investigation, depending on whether a global or continental invasive assemblage was considered. For each of the 290 TDWG regions, I calculated three sets of NRI and NTI using species pools of naturalized angiosperm species at three spatial scales (global, continental and regional). For normally distributed data, statistical significance at p-value <.05 is equivalent to NRI or NTI > 1.96 (i.e. significantly more phylogenetic clustering), or < −1.96 (i.e. significantly more phylogenetic overdispersion) (Hortal et al., 2011). I considered p-values of <.05 to be significant and p-values between .05 and .10 to be marginally significant; the latter corresponded to NRI and NTI being between 1.65 and 1.96 or between −1.96 and −1.65.
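The correspondence between the cut-offs of 1.65 and 1.96 and two-sided p-values of .10 and .05 can be checked with a short Python sketch that assumes a standard normal distribution of the index. The function names are hypothetical and are not part of any package used in this study.

```python
from statistics import NormalDist


def two_sided_p(index):
    """Two-sided p-value of a standardized index under a normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(index)))


def classify(index):
    """Label an NRI or NTI value using the 1.65 and 1.96 cut-offs given in the text."""
    direction = "clustering" if index > 0 else "overdispersion"
    size = abs(index)
    if size > 1.96:
        return f"significant {direction}"
    if size > 1.65:
        return f"marginally significant {direction}"
    return "not significant"


# The global values reported in the Results (naturalized species pool)
# serve as example inputs: NRI = 0.31 and NTI = 9.84.
print(classify(0.31), "|", classify(9.84))
```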

| Climate data
Mean annual temperature and annual precipitation have been successfully used as correlates of plant distribution (e.g. Moles et al., 2014; Qian & Sandel, 2017; Ricklefs, 2010). I explored the relationships between NRI or NTI and these two climatic variables, as in Qian et al. (2022). I obtained climate data from the CHELSA climate database (https://chelsa-climate.org/bioclim) for each TDWG region. The mean value of each of the two climatic variables was calculated for each region using 30-arc-second resolution data.
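The regional summary described above, together with the rescaling of each climatic variable to the range 0 to 1 mentioned in the caption of Figure 5, might be sketched as follows. The unweighted mean over grid cells is an assumption (the text does not state whether cells were area-weighted), and the region names and temperature values are invented for illustration.

```python
from statistics import mean


def region_mean(cell_values):
    """Unweighted mean of the grid-cell values falling inside one region."""
    return mean(cell_values)


def rescale_unit_interval(values):
    """Min-max rescaling: the smallest value maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot rescale a constant variable")
    return [(value - low) / (high - low) for value in values]


# Hypothetical grid-cell mean annual temperatures (degrees C) for three
# made-up regions; real input would be the 30-arc-second climate layers.
cells_by_region = {
    "region_a": [-4.0, -2.0, 0.0],
    "region_b": [9.0, 11.0],
    "region_c": [23.0, 24.0, 25.0],
}

regional_temperature = {
    name: region_mean(cells) for name, cells in cells_by_region.items()
}
rescaled = rescale_unit_interval(list(regional_temperature.values()))
```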

| Statistical analysis
I used correlation analysis to assess the relationships between pairs of variables. I considered a correlation (Pearson's correlation coefficient, r) to be strong, moderate or weak if |r| > .66, .66 ≥ |r| > .33 or |r| ≤ .33, respectively (Qian et al., 2019). I used SYSTAT (Wilkinson et al., 1992) for the statistical analyses.
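The thresholds above can be expressed as a small Python sketch. The analyses in this study were run in SYSTAT; the functions below are hypothetical stand-ins that only illustrate how Pearson's r is computed and then labelled as strong, moderate or weak.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Strong if |r| > .66, moderate if .66 >= |r| > .33, weak if |r| <= .33."""
    size = abs(r)
    if size > 0.66:
        return "strong"
    if size > 0.33:
        return "moderate"
    return "weak"
```

Note that the boundary values themselves fall in the lower class: a correlation of exactly .66 is labelled moderate and one of exactly .33 is labelled weak.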

| RESULTS
The entire assemblage of global invasive angiosperm species was a phylogenetically strongly clustered subset of the species of the global angiosperm flora, regardless of whether NRI (8.7) or NTI (22.8) was considered, and the strength of phylogenetic clustering measured by NTI was more than 2.6 times that measured by NRI (Figure 1). When the observed values of NRI and NTI were compared to 999 null assemblages randomly drawn from the global angiosperm flora, the observed values were significantly (p < .05) greater than those of the randomized assemblages, with both observed values lying far beyond the right (positive) tails of the distributions of their respective null assemblages (Figure 1).
When NRI and NTI were calculated for global invasive angiosperm species using global naturalized angiosperms as the species pool, both NRI and NTI were positive (0.31 and 9.84 respectively) but only NTI was significant (p < .05, i.e. NTI > 1.96; Figure 2). When NRI and NTI were calculated for each of the seven TDWG continents with respect to naturalized angiosperm species in each respective continent, NRI was negative in two continents, significantly so (p < .05) in one of them (Europe), and positive in the other five; all of the positive values, except that for Asia-Tropical, were significant (p < .05) or marginally significant (Asia-Temperate; p = .058) (Figure 2). In contrast, NTI was positive for all seven TDWG continents, and was significant (p < .05) for six continents and marginally significant for the other continent (Asia-Tropical; p = .052) (Figure 2). When species that were naturalized in some regions of a continent but native to at least one other region of the same continent were removed from the analysis, the patterns were similar to those of the full analysis (compare Figure 2 with Figure S2).
When NRI and NTI of invasive angiosperm species in each of the 290 TDWG regions were derived using the naturalized angiosperm species of the globe as the species pool, invasive species were phylogenetically clustered in 97.9% of the regions, 87.3% of which (i.e. 248 of 284 regions) were significantly clustered (p < .05) when NRI was considered (Figure S3a), and were phylogenetically clustered in 99.3% of the regions, 98.3% of which (i.e. 283 of 288 regions) were significantly clustered (p < .05) when NTI was considered (Figure S3b). These patterns generally held for individual TDWG continents, that is, invasive species in the vast majority of regions in each continent were phylogenetically significantly clustered, regardless of whether NRI or NTI was considered (i.e. NRI and NTI > 1.96; Figure S3). When NRI and NTI of invasive angiosperm species in each of the 290 TDWG regions were derived using the naturalized angiosperm species of their respective TDWG continent as the species pool, the above-reported patterns generally held, although the percentages of regions with positive and significant values of NRI and NTI decreased slightly (compare Figure S3 with Figure S4).
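Because the percentages above are nested (the second percentage in each pair is a share of the clustered regions, not of all 290 regions), the arithmetic can be made explicit with a short Python sketch. The counts are those reported in the text; the variable names are hypothetical, and the final two values (shares of all 290 regions) are derived here rather than reported in the text.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_regions = 290

# Counts reported in the text for the global naturalized species pool.
nri_clustered, nri_significant = 284, 248
nti_clustered, nti_significant = 288, 283

# Share of all regions that are clustered, then share of those clustered
# regions that are significantly clustered.
nri_summary = (
    percent(nri_clustered, total_regions),
    percent(nri_significant, nri_clustered),
)
nti_summary = (
    percent(nti_clustered, total_regions),
    percent(nti_significant, nti_clustered),
)

# Derived here, not reported in the text: significantly clustered regions
# as a share of all 290 regions.
nri_significant_of_all = percent(nri_significant, total_regions)
nti_significant_of_all = percent(nti_significant, total_regions)
```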
When NRI and NTI of invasive angiosperm species in each of the 290 TDWG regions were derived using the naturalized angiosperm species of the region as the species pool, invasive species were phylogenetically clustered in 83.1% of the regions when NRI was considered (Figure 3a), and were phylogenetically clustered in 76.9% of the regions when NTI was considered (Figure 3b). When NRI and NTI of invasive angiosperm species in TDWG regions derived from their respective regional species pools were analysed for each continent, the number of regions with phylogenetically clustered invasive species substantially exceeded that with phylogenetically overdispersed invasive species in all the seven continents, regardless of whether NRI or NTI was considered (Figure 3), although the proportion of regions with significantly positive NRI and NTI values substantially decreased, compared to that based on either global or continental species pool (compare Figure 3 to Figures S3 and S4). Values of NRI and NTI tended to be greater in regions at higher latitudes (and thus with lower temperatures), and this pattern appeared to be more conspicuous for NRI than for NTI (Figure 4).
When NRI and NTI for regional invasive species assemblages derived from the global naturalized angiosperm species pool were related to mean annual temperature and annual precipitation, the two phylogenetic metrics were or tended to be negatively correlated with the two climatic variables at the global scale (Figure 5), indicating that phylogenetic relatedness of regional invasive species assemblages increased with decreasing temperature and precipitation. For each of the seven individual continents, NRI and NTI for regional invasive species assemblages derived from the global naturalized angiosperm species pool were or tended to be negatively correlated with the two climatic variables in most cases
FIGURE 1 NRI and NTI (red triangle) for global angiosperm invasive species with respect to the species pool including all angiosperm species in the world. Histograms for distributions of NRI and NTI values derived from null assemblages drawn from the species pool including all angiosperm species in the world. Each histogram represents the frequency of NRI or NTI values derived from 999 null assemblages randomly drawn from the species pool.
America); the opposite pattern was observed in the other three continents (Figure 5). When NRI and NTI for regional invasive species assemblages derived from the naturalized angiosperm species pool of each continent were considered, the above-reported patterns generally held or were strengthened (compare Figure 5 with Figure S5).
FIGURE 3 NRI and NTI of angiosperm invasive species for each of the 290 TDWG regions analysed in this study. NRI and NTI were summarized for the globe as a whole and for each of the seven TDWG continents. Blue and red bars represent phylogenetic overdispersion and clustering respectively. NRI and NTI of angiosperm invasive species for each region were calculated using the species pool of naturalized angiosperm species in the region.
At the regional scale examined in this study, there were a number of regions across the globe showing phylogenetic overdispersion of invasive angiosperm species with respect to their regional pools of naturalized angiosperm species; however, the number was much smaller than the number of regions showing phylogenetic clustering of invasive angiosperm species with respect to their regional pools of naturalized angiosperm species. For example, when all regions across the globe were considered, the proportions of regions showing phylogenetic overdispersion and clustering were 16.9% and 83.1%, respectively, when deep evolutionary history was considered, and were 23.1% and 76.9%, respectively, when shallow evolutionary history was considered (Figure 3a,b).

| DISCUSSION
The pattern that a much larger proportion of regions show phylogenetic clustering of invasive species holds for each continent (Figure 3c-p). Taking together the results of analyses at global, continental and regional scales from this study and from previous relevant studies (e.g. Qian, 2023a, 2023b; Qian et al., 2022; Qian & Sandel, 2023; Zhang et al., 2021), one may conclude that invasive species are likely a phylogenetically clustered subset of naturalized species in a study system under investigation.
For invasive species assemblages at global and continental scales, NTI was greater than NRI regardless of which species pool was used (i.e. overall global angiosperm species, global naturalized angiosperm species or continental naturalized angiosperm species). For example, NTI was 2.6 and 31.4 times greater than NRI for overall global invasive angiosperm species with respect to overall global angiosperm species (Figure 1) and global naturalized angiosperm species (Figure 2) respectively. When continental invasive assemblages were considered, NTI was greater than NRI for each of the seven continental regions, with the mean value of NTI being 2.5 greater than that of NRI (Figure 2). However, when individual regional invasive assemblages were considered, the majority (63.1%) of the 290 regional invasive assemblages had greater NRI than NTI, with the mean values being 1.319 and 0.879 respectively. These results suggest that the relative strength of phylogenetic relatedness of invasive angiosperm species depends on the evolutionary depth on which the phylogenetic metric used focuses. NRI assesses phylogenetic relatedness at a deep level of evolutionary history. Phylogenetic clustering of invasive species at a deep phylogenetic level reflects that major lineages (e.g. orders or families) of invasive species are likely to be clustered on the phylogeny, causing a positive NRI. In this case, phylogenetic relatedness based on a metric reflecting a shallow depth of evolutionary history may be in any direction (clustering, overdispersion
FIGURE 4 Geographic patterns of (a) mean annual temperature in °C, (b) annual precipitation in mm, (c) NRI and (d) NTI for invasive angiosperm species assemblages in the 290 TDWG regions used in this study. NRI and NTI of each region were calculated using all naturalized angiosperm species in the world as the species pool.
FIGURE 5 Relationships of NRI and NTI with mean annual temperature and annual precipitation for invasive angiosperm species in the 290 TDWG regions, which were analysed for the globe as a whole and for each of the seven TDWG continents. NRI and NTI of each region were calculated using all naturalized angiosperm species in the world as the species pool. Data of temperature and precipitation for the 290 regions were rescaled to vary between 0 and 1 in either climatic variable. In each panel, the red line is the linear least squares best fit, and the blue lines represent 95% confidence intervals; the lines were used to show a linear trend, not for statistical tests.
The present study showed that phylogenetic relatedness of invasive angiosperm species tended to increase with increasing latitude and with decreasing temperature and precipitation when all regional invasive angiosperm assemblages across the globe were considered (Figures 4 and 5). The negative relationships of phylogenetic relatedness of invasive angiosperm species with temperature and precipitation observed at the global scale hold for the vast majority of the seven continents (Figure 5). These results for invasive angiosperms in regional floras are consistent with those for regional native angiosperm assemblages. For example, phylogenetic relatedness of native angiosperms increases with decreasing temperature and precipitation in both China (Lu et al., 2018; Qian et al., 2019) and North America (Qian & Sandel, 2017). These consistent relationships between native and invasive species, which are further consistent with the prediction of the tropical niche conservatism hypothesis (i.e. phylogenetic relatedness increases with increasing environmental stress; Wiens & Donoghue, 2004), suggest that climatic drivers of phylogenetic relatedness are the same or similar for both native and non-native invasive plant species.
Exceptions may occur in some regions; for example, phylogenetic relatedness of invasive angiosperms increased with increasing precipitation in the Asia-Temperate continent (Figure 5) and in China. However, such exceptions are likely few, and may suggest that in addition to the two climatic variables examined in this study, other factors, such as those affecting spread of naturalized species in general and invasive species in particular, may also be important drivers of phylogenetic relatedness of invasive species in some regions.
The finding of this study that invasive species are generally a phylogenetically clustered subset of naturalized species across the world has a significant implication for biological conservation.
Previous studies have reported that invasive species can cause biodiversity loss in invaded areas (Harron et al., 2020; Rejmánek et al., 2013). With the knowledge of phylogenetic relatedness between invasive naturalized species and those naturalized species that have not become invasive yet, one may assess the probability that a naturalized but not yet invasive species will become invasive over time. If traits related to the invasiveness of the species are generally shared among the species of a lineage and if knowledge on the degree to which one or more invasive species of the lineage can cause biodiversity loss in a particular area is available, one might predict potential biodiversity loss as a result of introducing another species of the lineage into the area, or as a result of another introduced but not yet invasive species of the lineage becoming invasive over time.
Furthermore, when two or more closely related species have already become invasive in a particular area, conservation biologists and managers might develop strategies to control or reduce the biodiversity loss caused by the invasion of a poorly studied species based on its phylogenetic relationship with closely related species whose effects on biodiversity loss have been well studied. However, because many different functional traits may make species invasive, and because the functional similarities that matter cannot be determined solely from phylogenetic relatedness, finding the traits that are relevant to species invasion remains an important goal of invasion ecology.

ACKNOWLEDGEMENTS
I am grateful for constructive comments provided by the subject editor, Dr Severin Irl, and anonymous reviewers. I thank Dr. Meichen Jiang for generating the maps.

CONFLICT OF INTEREST STATEMENT
The author declares no competing interests.

DATA AVAILABILITY STATEMENT
Data used in this paper were obtained from published sources, which were cited in this article, and are available to the public. Naturalized and invasive distributional data are available from Global Naturalized