Phylogenetic relationships of invasive plants are useful criteria for weed risk assessments

Risk assessments are conservation tools used to prevent the introduction of invasive species. Many assessments ask whether a taxon has invasive close relatives, but it is unclear whether this phylogenetic information is useful, and which taxonomic scales (e.g., genus, family) are most predictive of risk. Combining phylogenetic clustering analyses with models predicting invasion risk, we found invasive plants were clustered within nonnative flora of the conterminous United States. Taxonomic information in models improved their predictive capacity; invasion risk for taxa with invasive confamilials, congeners, or sister taxa increased by 9%, 16%, and 19% respectively. Phylogenetic information did not improve inference for species without any congeners, or those from large genera. The most common approach—assessing congeners—is well suited to identify invaders, particularly for genera with 2–10 established species. While existing phylogenetic information can enhance assessments of invasion risk, biologists and regulators should collaborate to improve nonnative species phylogenies.


INTRODUCTION
Human activity and trade have driven the introduction of plants and animals beyond their historic ranges (van Kleunen et al., 2015). Many of these introduced organisms establish self-sustaining populations (hereafter: established), and some spread rapidly and cause harm (hereafter: invasive), negatively affecting native biodiversity [...] about plant traits and other characteristics (Roy et al., 2018). Weed risk assessments (WRAs) would optimally differentiate between invasive and noninvasive plants using a few easy-to-collect criteria for rapid and accurate screening. Unfortunately, WRA criteria vary widely (Bradley et al., 2022; Buerger et al., 2016), which may make WRAs ineffective (Hulme, 2012) and lead to inconsistent invasive species regulations (Beaury et al., 2021). Improving the overall effectiveness of WRAs begins with evaluating the effectiveness of individual WRA criteria.
One criterion often used in WRAs focuses on whether a species is closely related to known invaders. Shared evolutionary history among taxa (hereafter: phylogenetics) should be a strong indicator of invasive potential for introduced species (Omer et al., 2021; Pyšek, 1998; Qian & Sandel, 2023). For plants, traits including rapid growth, high fecundity, and broad environmental tolerance could lead to invasion (Pyšek & Richardson, 2007; Van Kleunen et al., 2010). While we lack measurements of these traits for many species, they are often phylogenetically conserved, suggesting phylogenetic relationships could be proxies for invasiveness (Tucker et al., 2018). Invasiveness is more common in certain families or genera (Kuester et al., 2014; Pyšek, 1998) and invasive species are phylogenetically clustered in some floras (Qian & Sandel, 2023; Qian et al., 2022). However, many invaders do not have aggressive close relatives; thus, phylogenetic relationships are not always clear indicators of invasive potential (Mack et al., 2000; Williamson & Fitter, 1996). Perhaps reflecting this uncertainty, some, but not all, national WRAs include phylogenetics as an indicator of invasiveness (e.g., Koop et al., 2011: "Are there any congeners that are considered significant weeds?"; Pheloung et al., 1999: "Documented evidence that one or more species, with similar biology, within the genus of the species being evaluated are weeds"). Yet it is unclear whether these questions improve overall assessment of the invasive potential of introduced species.
If phylogenetic relationships are predictive of invasion risk, it is also unclear whether the genus level is the most appropriate indicator, as opposed to coarser (confamilial) or finer (nearest relative, or sister taxa) taxonomic scales. The best taxonomic scale for predicting invasive potential likely depends on the evolution of invasiveness (Figure 1). For example, if a suite of traits associated with invasiveness evolved early (in deep evolutionary time) and remained conserved, family-level relationships may predict invasive potential, and would be useful additions to WRAs. By contrast, if traits that confer invasiveness have arisen recently or frequently, the genus or sister species levels would offer better inference, reinforcing the existing criteria within many WRAs. Recent studies on established floras of China and North America indicate that the evolution of invasiveness is more recent (Qian & Sandel, 2023; Qian et al., 2022); however, these patterns may vary across geographies. Understanding both the general phylogenetic structure of invasiveness and how these phylogenetic patterns can be translated into effective screening questions in WRAs is critical for informing policy aimed at preventing the introduction of new invasives.
In this study, we assess phylogenetic relationships among nonnative vascular plant species of the conterminous United States and test a series of models to understand which taxonomic scales are the best for predicting potential invaders.
Specifically, we ask: (1) At what evolutionary scale (early or more recent evolutionary time), if any, are invasive species a phylogenetically clustered subset of the established plant community? (2) How does incorporating phylogenetic information at different taxonomic scales (family, genus, sister species) affect the performance of, and inference from, models intended to predict the likelihood that a species is invasive?
Our analysis provides a reflective look at the ways evolutionary biology currently informs invasive species regulations, and how it can be better leveraged to aid national and subnational policy efforts to reduce the spread of new invaders. While we focus on the United States, dozens of countries have WRA procedures (see Roy et al., 2018), making this work globally applicable.

Comparison groups
We focused on the established (introduced species with self-sustaining populations) and invasive (established species spreading in natural ecosystems and causing harm) plant species within the conterminous United States.
Because most plant introductions in the United States were through the ornamental trade or as crop contaminants (Lehan et al., 2013; Mack & Erneberg, 2002), there is likely selection bias towards sets of traits that either appeal to horticulturalists or enable accidental introduction (Colautti et al., 2006). However, because invasive species are a subset of established ones, comparing these two groups controls for some of this selection bias and allows us to test whether invasive plants are a clustered subset of established species.
FIGURE 1 A conceptual diagram illustrating the relationship between phylogenetic clustering and the expected predictive power for including phylogenetic predictors in models designed to quantify invasion risk. If invasiveness evolved independent of phylogenetic relationships (i.e., no phylogenetic structure), neither coarse (e.g., confamilial) nor finer (e.g., congener or sister taxa) phylogenetic predictors would better predict invasion risk than a simple model with no phylogenetic predictors included. If invasiveness evolved early in evolutionary history and is conserved (i.e., branch-weighted phylogenetic structure), both coarse and fine scale phylogenetic models should be more predictive of invasion risk (evaluated by nonoverlapping uncertainty intervals between the "invasive relatives" and "no invasive relatives" estimates and between the uncertainty intervals of the no phylogenetic predictor model and other models). If invasiveness evolved more recently or frequently in evolutionary time (i.e., tip-weighted phylogenetic structure), only the models with finer-scale taxonomic predictors should be better predictors of invasion risk than the model without phylogenetic predictors.

Data source
We obtained state-level lists of invasive and established species for the 48 conterminous United States and the District of Columbia from Pfadenhauer and Bradley (unpublished data). We merged these lists to generate a list of invasive and established taxa for the region.
We cross-checked our list against the "native status" classifications of the PLANTS database (USDA, 2023) and removed any species classified as native. Our final dataset included 3744 species.

Phylogeny construction
We used the R package V.PhyloMaker2 (v.0.1.0) (Jin & Qian, 2019) to generate a phylogeny of the established species pool (established and invasive species together) using the expanded megaphylogeny (GBOTB.extended.TPL) of Smith and Brown (2018) as a backbone and the build.nodes.1 function (Jin & Qian, 2019). We added taxa that were absent from the megaphylogeny with V.PhyloMaker2, using the scenario "S3" as per Qian et al. (2022).

Measuring phylogenetic clustering
To evaluate the phylogenetic clustering of invasive species within the established species pool, we used two metrics of community phylogenetics: the inverse net relatedness index (-NRI) and inverse nearest taxon index (-NTI) (Webb, 2000; Webb et al., 2002; Kembel et al., 2010).
-NRI and -NTI are estimates of phylogenetic clustering that compare the observed mean phylogenetic distance (MPD) or mean nearest taxon distance (MNTD), respectively, of an observed community to a series of randomly generated ones (see Supporting Information: Supplemental Methods for details).
These two metrics give complementary information about phylogenetic structure. -NRI is a "branch-weighted" metric that estimates the phylogenetic distances across all species in the phylogeny and captures phylogenetic structure in earlier evolutionary time. -NTI is a "tip-weighted" metric that is based on the phylogenetic distance between a given species and its closest relative and captures phylogenetic structure in more recent evolutionary time (Webb et al., 2002; Qian et al., 2022).
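The logic of the two metrics can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R workflow: the toy distance matrix, the taxon labels, and the function names (`mpd`, `mntd`, `ses`) are all hypothetical, and the null model here simply redraws the same number of species at random from the pool. The standardized effect size follows the sign convention used in this paper for -NRI and -NTI, where negative values indicate clustering.

```python
import random
import statistics
from itertools import combinations

def mpd(dist, taxa):
    """Mean pairwise phylogenetic distance among the given taxa."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)

def mntd(dist, taxa):
    """Mean distance from each taxon to its nearest relative within the set."""
    return sum(min(dist[a][b] for b in taxa if b != a) for a in taxa) / len(taxa)

def ses(metric, dist, observed, pool, n_null=999, seed=1):
    """Standardized effect size: (observed - null mean) / null standard deviation."""
    rng = random.Random(seed)
    obs = metric(dist, observed)
    null = [metric(dist, rng.sample(pool, len(observed))) for _ in range(n_null)]
    return (obs - statistics.mean(null)) / statistics.stdev(null)

# Hypothetical pool: 2 families x 3 genera x 3 species, labeled (family, genus, species).
pool = [(f, g, s) for f in "AB" for g in range(3) for s in range(3)]

def toy_distance(a, b):
    """Invented distances: congeners 2, confamilials 10, different families 20."""
    if a == b:
        return 0.0
    if a[:2] == b[:2]:
        return 2.0
    if a[0] == b[0]:
        return 10.0
    return 20.0

dist = {a: {b: toy_distance(a, b) for b in pool} for a in pool}

# "Invasive" subset: two congeneric pairs sitting in different families,
# i.e. clustered near the tips but not deep in the tree.
invasive = [("A", 0, 0), ("A", 0, 1), ("B", 1, 0), ("B", 1, 1)]

ses_mpd = ses(mpd, dist, invasive, pool)    # branch-weighted view
ses_mntd = ses(mntd, dist, invasive, pool)  # tip-weighted view
print(f"SES(MPD) = {ses_mpd:.2f}, SES(MNTD) = {ses_mntd:.2f}")
```

In this toy case the subset looks unremarkable under MPD but clearly clustered under MNTD, which is the pattern a tip-weighted signal would produce.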

Data preparation
We subsetted our dataset to include only species with well-resolved nearest relatives in the dataset (hereafter: sister species). We identified sister species with functions from the R package "phangorn" (Schliep, 2011), as described in Revell (2013). This resulted in a dataset of 858 species pairs (n = 1716 species total). For each species in list one (the first species of a pair), we assessed whether it had (a) an invasive confamilial, (b) an invasive congener, or (c) an invasive sister taxon in list two (the second species of a pair).
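As a rough illustration of the pairing step, the sketch below finds sister pairs in a small tree written as nested tuples and splits them into the two lists. It is not the phangorn procedure used in the study; the tree, the species labels, the `invasive` set, and the function name `sister_pairs` are hypothetical, and tips inside unresolved nodes (polytomies) are assumed to be left unpaired.

```python
def sister_pairs(tree):
    """Return pairs of tips that are each other's only sibling in the tree."""
    pairs = []

    def walk(node):
        if isinstance(node, str):  # a tip has no children to visit
            return
        if len(node) == 2 and all(isinstance(child, str) for child in node):
            pairs.append(tuple(node))  # a resolved two-tip node is a sister pair
            return
        for child in node:
            walk(child)

    walk(tree)
    return pairs

# Hypothetical tree as nested tuples; the three-way Poa node is a polytomy.
toy_tree = (
    (("Rosa_a", "Rosa_b"), "Rubus_a"),
    (("Poa_a", "Poa_b", "Poa_c"), ("Acer_a", "Ilex_a")),
)

pairs = sister_pairs(toy_tree)
list_one = [first for first, _ in pairs]    # focal species
list_two = [second for _, second in pairs]  # their sister species

# Invented invasion status, used only to show how the sister flag is derived.
invasive = {"Rosa_b"}
has_invasive_sister = {sp: sister in invasive for sp, sister in pairs}
print(pairs)
print(has_invasive_sister)
```

Species such as `Rubus_a` and the three `Poa` tips drop out here because they have no single resolved nearest relative, which mirrors why only part of a species pool can be assigned sister taxa.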

Statistical analyses
To understand how incorporating different scales of phylogenetic relationships informs invasion risk, we ran four Bayesian logistic regression models with a Bernoulli likelihood using the R package "brms" (Bürkner, 2017). The first model assessed the likelihood of invasion for species in list one with no phylogenetic information. The other three models assessed the likelihood of invasion for species in list one given an invasive in the same family (confamilial model), genus (congeneric model), or closest relative (sisters model) in list two (see Supporting Information: Supplemental Methods for model details). Phylogenetic predictors of invasion risk may be less reliable for species from large or small genera. To test this, we subsetted our data to include (1) species that were the only members of their genus and (2) species in genera with >10 species and reran our models for these special cases.
We used our models to generate posterior predictions for the likelihood of invasion for each scale (confamilial, congener, sister taxa) of phylogenetic information. In all figures and tables, we present the mean estimates with 50% and 90% uncertainty intervals (UI).
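To make the idea of a posterior mean with a 90% uncertainty interval concrete, the following sketch swaps the brms logistic regression for a much simpler conjugate Beta-Bernoulli model, fitted separately to species with and without an invasive congener. It is an assumption-laden stand-in, not the authors' model: the flat Beta(1, 1) prior, the function name `posterior_summary`, and the counts are all invented for illustration.

```python
import random
import statistics

def posterior_summary(n_invasive, n_total, draws=20000, seed=1):
    """Posterior mean and 90% interval for P(invasive) under a flat Beta(1, 1) prior."""
    rng = random.Random(seed)
    # Conjugate update: Beta(1 + successes, 1 + failures).
    a = 1 + n_invasive
    b = 1 + n_total - n_invasive
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int(0.05 * draws)]
    upper = samples[int(0.95 * draws)]
    return statistics.mean(samples), (lower, upper)

# Invented counts for illustration only (not the study's data).
with_congener = posterior_summary(n_invasive=30, n_total=100)
without_congener = posterior_summary(n_invasive=10, n_total=100)

for label, (mean, (lower, upper)) in [
    ("invasive congener", with_congener),
    ("no invasive congener", without_congener),
]:
    print(f"{label}: mean = {mean:.2f}, 90% UI = [{lower:.2f}, {upper:.2f}]")

# Non-overlapping intervals are the kind of separation Figure 1 describes.
intervals_separate = with_congener[1][0] > without_congener[1][1]
```

With these made-up counts the two 90% intervals do not overlap; with smaller groups the intervals widen and can overlap even when the means differ.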

Phylogenetic clustering
The mean phylogenetic distance of invasive species within the established nonnative species pool was 266.0 (Figure 2a), with a nonsignificant standardized effect size (-NRI) of 0.38 (p = 0.64). The mean nearest taxon distance was 48.2 (Figure 2b) with a statistically significant standardized effect size (-NTI) of −3.6 (p = 0.0001).
When evaluating species with no congeners, the model without phylogenetic predictors estimated the likelihood of invasion to be 0.11 [UI 90 = 0.07, 0.17] (Figure 3b; Table S1). Having an invasive confamilial increased the estimated likelihood of being invasive by 10 percentage points, from 0.07 [UI 90 = 0.02, 0.13] to 0.17 [UI 90 = 0.09, 0.26], compared to species with no invasive confamilial, but the 90% UIs overlapped, so these differences in invasion likelihood cannot be cleanly differentiated (Figure 3b; Table S1). The sister species model did not produce reliable estimates due to low effective sample size, so we do not report its estimates here.
For all four models (no phylogenetic predictors, confamilial, congeneric, and sisters) and in all three cases (all species, no congeners, and >10 congeners), there were no statistically significant differences in model performance (Table S2).

DISCUSSION
Invasive species are consistently a top concern for conservation practitioners and natural resource managers (e.g., Ernest Johnson, 2020). Preventing the introduction of new invasive species is the most effective (Rejmanek & Pitcairn, 2002) and least expensive (Keller et al., 2007) approach for reducing ecological harm. Consistently including criteria in national WRAs that can effectively distinguish invasive from noninvasive plants is thus critical for proactively managing invasive plants (Pheloung et al., 1999; Koop et al., 2011).
On average, we found that having an invasive congener or sister species increased invasion likelihood by 14% and 17%, respectively, when compared to the mean invasion likelihood prediction of the model without phylogenetic information (Figure 3a; Table S1). Additionally, the increases in invasion likelihood were 10% and 13% for invasive congeners and sister species, respectively, relative to the confamilial model's mean predicted invasion likelihood, given an invasive confamilial.
Our study suggests that species with invasive relatives pose a higher risk of invasion, and that phylogenetic questions in WRAs effectively differentiate invasive from noninvasive species, particularly when they are based on finer-scale phylogenetic information (congeners or sister taxa). This finding, combined with significant phylogenetic clustering based on MNTD (Figure 2b), suggests a more recent evolution of invasive traits.
Collectively, these results suggest that including a question about whether a species has an invasive congener would improve risk assessments and lead to better conservation policy. Unfortunately, some widely used WRAs do not include phylogenetic criteria (Baker et al., 2008; "EPPO prioritization process for invasive alien plants," 2012). Whether a candidate species has an invasive congener is relatively straightforward to determine, making phylogenetic criteria both simple and effective for plants and potentially other organisms. For example, CABI (https://www.cabidigitallibrary.org/product/QI) provides an international list of invasive plants, and the Global Plant Invaders database (Laginhas & Bradley, 2022) similarly provides a database of invasive plants in the scientific literature.
In contrast, identifying finer-scale taxonomic relationships like sister species requires more training in phylogenetic methods. Our analyses illustrate the limitations of identifying sister species. While genera were available for all 3744 established nonnative species in our dataset, our phylogeny was only well-resolved enough to assign 46% of species as sister taxa. Well-resolved phylogenies may be less likely for species being assessed for invasion risk, since those are often novel plant species or unique hybrids sourced from regions that are underrepresented in science (e.g., Bradley et al., 2012).
FIGURE 3 Predicted invasion risk for three subsets of established plant species based on the invasiveness of close relatives at different taxonomic scales. (a) The full subset of all paired species. Here, the model with no phylogenetic information predicts an overall invasion likelihood of 12%. Models predict increasingly higher invasion risk at finer taxonomic scales for species with close relatives that are invasive. The predicted invasion risk for subsets of established plants with (b) no congeners and (c) greater than 10 congeners, respectively. In all panels, triangles indicate the presence of a close relative that is invasive while circles indicate the absence of an invasive close relative at the family, genus, and sister species level, respectively. Shapes indicate mean predictions from Bayesian regression models; solid and dotted bars indicate the 50% and 90% uncertainty intervals. Summaries of these results are available in Table S1.
For species lacking congeners, the confamilial model offered some inference (an average 10% increase in risk if a confamilial is invasive vs. not), but invasive and established groups were not well differentiated statistically (Figure 3b). Of the 3744 species in this dataset, 809 (22%) are the sole representative of their genus. As a result, a substantial portion of the established species pool cannot be screened based on the invasion status of congeners, though screening these species based on the invasive status of confamilials may provide limited benefit.
When considering large genera (>10 species), the congener and sisters models offered some inference, with an average increased risk of 10% and 14%, respectively, but all of these models had overlapping 90% uncertainty intervals, suggesting high uncertainty in these estimates (Figure 3c). Fifty-five genera in the established species pool contain >10 species, representing 961 species, or 26% of the established flora.
We suggest that the poor performance of our models in these two relatively common cases highlights opportunities for increased collaboration between phylogeneticists and regulators.
Regulators could identify clades of interest to them (those with many introduced or cosmopolitan species), and phylogeneticists could develop high-quality phylogenies to identify phylogenetic patterns of invasion below the genus level (e.g., subgenus, sister taxa). This would improve our ability to assess risk for species from both small and large genera. Additionally, increased collaboration could produce new ways of sharing and manipulating phylogenetic information that can be performed by nonexperts. This could give regulators the ability to use more powerful phylogenetic approaches in developing their screening protocols for assessing species, which may be particularly useful for the special cases we describe above, as well as for species that lack reliable data about their biological traits and ecological characteristics.
A limitation to our approach is that invasiveness is not a clearly defined, heritable trait like those typically addressed in phylogenetic studies, but rather a combination of multiple biological attributes, ecological circumstances, and human perception (Colautti & MacIsaac, 2004; Wehi et al., 2023). Therefore, the link between phylogenetic structure and our predictive models (Figure 1) is likely an oversimplification of the evolutionary forces that shape biological invasions. Nevertheless, our finding that regulators can use evolutionary relationships to predict invasive potential of species is applicable even if we cannot fully explain the processes underlying these patterns. Further, because there is no standardized process for categorizing species as invasive, our analyses could be sensitive to alternate classification schemes. For this reason, we hope our study provides a framework for practitioners to assess these questions in their own localities and biological systems.
In summary, we observed strong phylogenetic clustering using a tip-weighted phylogenetic metric (MNTD) and not with a branch-weighted metric (MPD), supporting the assertion that invasive traits have likely arisen multiple times in more recent evolutionary history. As a result, asking whether a species has an invasive congener is likely to improve the successful identification of potential invaders, particularly in cases where the species is part of a genus with 2–10 established species. The predictive challenges associated with species in large or single-species genera highlight a need for increased collaboration between evolutionary biologists and plant regulators to better assess these special cases.

ACKNOWLEDGMENTS
We thank T.J. Davies for his guidance regarding our phylogenetic analyses and Audrey Barker-Plotkin for her insights on this project.

FUNDING
This work is a product of the John Wesley Powell Center for Analysis and Synthesis Working Group on Invasive Plant Impacts (Award #G22AC00156-00). Additional support was provided by the National Science Foundation through Interagency Agreement #2135795 to the Powell Center. Additional funding for this project came from the US Geological Survey Northeast Climate Adaptation Science Center Award #G21AC10233-01, US Geological Survey Northeast Climate Adaptation Science Center Award #G19AC00091, and a fellowship from the Lotta Crabtree Trust.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
Data and code used in this study are available at UMass Scholarworks (https://doi.org/10.7275/gfsy-4x17).

FIGURE 2 Phylogenetic clustering of invasive species within the established nonnative flora of the conterminous United States using two metrics: mean phylogenetic distance (MPD) and mean nearest taxon distance (MNTD). Points indicate the observed MPD and MNTD values, respectively, while blue X's and bars represent the means and standard deviations of MPD and MNTD values for 9999 randomized phylogenetic relationships (solid bars indicate one standard deviation and dotted bars two standard deviations). (a) The mean phylogenetic distance of invasive plants in the phylogeny is not significantly different from randomly generated relationships. (b) The mean nearest taxon distance of invasive plants in the phylogeny is significantly more clustered (lower values) than random.