Modelling species distributions and environmental suitability highlights risk of plant invasions in western United States

Non‐native invasive plants impact ecosystems globally, and the distributions of many species are expanding. The current and potential distributions of many invaders have not been characterized at the broad scales needed for effective management. We modelled the distributions of 15 non‐native invasive grass and forb species of concern in western North America to define their environmental niches and predict potential invasion risk.


| INTRODUCTION
Exotic plant invasions are damaging ecosystems around the world, changing ecosystem structure and function, reducing biodiversity, and harming social and economic systems (Mack et al., 2000; Pejchar & Mooney, 2009; Simberloff et al., 2013). Across the western United States, invasive plant species are compounding the ecosystem stresses of changing climate and fire regimes. Swaths of the sagebrush biome have converted from perennial- and shrub-dominated ecosystems to invasive annual plant dominance, severely impacting sagebrush-dependent species and ecosystem functions (Chambers et al., 2019; Coates et al., 2016; Germino et al., 2016). The annual grass Bromus tectorum has been extensively studied due to its ability to form monocultures, increase fire severity and thwart restoration (Balch et al., 2013; Bishop et al., 2019). However, much less is known about other non-native invaders in the region, which also have the capacity to impact ecological processes. Grasses including Bromus rubens, Schismus barbatus and Taeniatherum caput-medusae also alter fire behaviour, contributing to the development of grass-fire cycles that can lead to progressive loss of the native plant community (Fusco et al., 2019). Annual forbs such as Lactuca serriola, Tragopogon dubius, Sisymbrium altissimum and Salsola tragus can displace native vegetation, limit forage and habitat for animals, and facilitate annual grass establishment (Antill et al., 2012; Piemeisel, 1951; Prevéy et al., 2010).
Mitigating the impacts of plant invasions requires rapidly identifying and controlling local populations before they spread farther (Reaser et al., 2020), yet monitoring efforts may not detect non-native species until invasion is well underway. To anticipate future invasions, models of invasion risk based on the current distribution of non-native species have become increasingly common (Abella et al., 2012;Ibáñez et al., 2009). However, modelling suitable habitat for invading species poses unique challenges (Bradley, 2013).
Newly arrived species have not yet explored the range of available environmental conditions and may have patchy and sparse distributions that reflect dispersal limitations (Jarnevich & Reynolds, 2011;Welk, 2004). Because species currently undergoing range expansion are not yet at climate equilibrium, presence data may not represent the full range of suitable habitat and absences may not necessarily represent unsuitable conditions (Uden et al., 2015). Describing habitat suitability thus requires a flexible modelling approach that uses broad-scale, comprehensive data on species occurrences and can distinguish informative and uninformative absences.
Multiple factors define an invasive species' niche and determine whether it establishes and threatens native ecosystems once it arrives. Temperature restricts species' ranges directly based on physiological thermal tolerances and interacts with precipitation to determine ecosystem water inputs (Allington et al., 2013; Bradford & Lauenroth, 2006). Precipitation seasonality influences the timing of available water and its storage across soil layers and thus has important effects on the distribution of plant species based on rooting habits (Lauenroth et al., 2014). At local scales, topography modifies climate conditions that promote or inhibit invasions (Chambers, Bradley, et al., 2014; Peeler & Smithwick, 2018). Soil properties influence availability of water and nutrients to invasive species, and soil type and water availability are useful indicators of ecosystem resistance to invasion (Chambers et al., 2007; Roundy et al., 2018).
Disturbances can facilitate the arrival (Gill et al., 2018) or increase the abundance of non-native plant species (Jauni et al., 2015).
Successful invaders often share traits that allow exploitation of habitat openings, such as short generation times, high fecundity and capacity for long-distance dispersal (Van Kleunen et al., 2010).
Wildland fires suppress native species and create pulses of available resources, which can release invasive plants from competition (Gill et al., 2018;Hanna & Fulgham, 2015;Steers & Allen, 2010).
Anthropogenic land use and infrastructure can also remove competing vegetation and increase resource availability, and the associated movement of equipment and animals provides a vector for the spread of seeds (Gelbard & Belnap, 2003;González-Moreno et al., 2014;Leu et al., 2008).
Our primary objective was to assess the current distributions and environmental suitability for non-native invasive plants and the relative risk of invasion across the arid and semi-arid western USA. We leveraged presence and absence observations from extensive, publicly available vegetation survey and monitoring data (BLM, 2020b;Pyke et al., 2011;USGS, 2012). We defined a model training region for each species using threshold distances from observed presences, in which absences were more likely to represent environments where species may have dispersed but did not establish and inclusion of absence data improved model performance (Barve et al., 2011). We projected the model predictions across ecoregions to identify areas at risk of future invasion.

| Study area
Our study focused on six level II EPA ecoregions in the arid and semi-arid western USA: Cold and Warm Deserts, South-Central and West-Central Semi-Arid Prairies, Western Cordillera and Upper Gila Mountains (US Environmental Protection Agency, 2011). EPA ecoregions are based on abiotic and biotic factors at nested scales for consistent analysis across the continent (Omernik & Griffith, 2014). The level II ecoregions outline the arid and semi-arid western USA, and level III ecoregions within them represent units of soils, climate and vegetation patterns within which species may have similar invasion potential. Invasive species presence data were obtained from more than 3.3 million km² in 28 level III ecoregions (Figure 1, Table 1). We included species presence data from outside the focal level II ecoregions to assess suitable habitat of invasive species and areas at risk of future invasion both within and adjacent to the focal area.

KEYWORDS: arid, AUPRC, climate suitability, forbs, grasses, invasive plants, semi-arid, species distribution modelling, western USA

| Focal species
We identified invasive species of management concern using state weed classifications and expert opinion. We used a cut-off of 600 presence observations of these species in our database, resulting in 15 species: five annual grasses, one perennial grass and nine annual forbs (Table 2). Intentionally introduced species (e.g. perennial grasses for forage or erosion control) were excluded. We classified species by rooting type (deep or shallow) to select the depth to which soil predictor variables were integrated: 0-30 cm for shallow-rooted species and 0-100 cm for deep-rooted species. Nomenclature follows USDA Plants (USDA NRCS, 2020).

| Vegetation survey data
We compiled species presence/absence data from multiple regional-scale data sources. Most data came from recent releases (as of April 2020) of the Landfire Reference Database (USGS, 2012) and the BLM AIM databases, TerrADat and LMF (BLM, 2020a, 2020b). TerrADat and LMF data were collected using standardized monitoring protocols for grassland, shrubland and savanna ecosystems (Herrick et al., 2017). We supplemented these datasets with 826 plots from a Joint Fire Sciences Program study of post-fire reseeding treatment responses, the "Chronosequence Study" (Pyke et al., 2011). This study recorded species composition in paired plots within and outside fire perimeters for 88 fires occurring between 1987 and 2003 in the Central and Northern Basin and Range.
FIGURE 1 Distribution of vegetation survey plots (grey dots) across US EPA level III ecoregions. Table 1 lists the ecoregion name associated with each numeric code

Prior to analysis, plot data were thinned to one plot per grid cell at the 3-arc-second resolution (~94 m) of the digital elevation model (described below). This minimized pseudoreplication while preserving the biologically meaningful topographic differences between cells of the elevation model. If plots from the AIM and Landfire databases fell within the same cell, we chose AIM plots, which were sampled more recently. If two Landfire plots fell within the same cell, we chose the plot that contained the greatest number of focal species, excluding the near-ubiquitous Bromus tectorum. If plots were sampled more than once, we chose the most recent date for AIM and Landfire data and the date when more species were present for the Chronosequence data to provide a comprehensive and up-to-date set of species presence records. The final dataset included 148,404 plots.
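The thinning rule described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in R); the field names (`source`, `lat`, `lon`, `n_focal_species`) and the simple latitude/longitude grid are assumptions made for illustration, and only the AIM-versus-Landfire preference stated in the text is encoded.

```python
import math

# 3 arc-seconds expressed in decimal degrees (grid origin at 0, 0 is an assumption).
CELL_DEG = 3.0 / 3600.0

# Lower value wins: AIM plots are preferred over Landfire plots in a shared cell.
SOURCE_PRIORITY = {"AIM": 0, "Landfire": 1}


def cell_index(lat, lon):
    """Return the (row, col) index of the 3-arc-second cell containing a point."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def thin_plots(plots):
    """Keep one plot per grid cell.

    Preference order: AIM before Landfire, then the plot with the most focal
    species. `n_focal_species` is a hypothetical field assumed to already
    exclude Bromus tectorum.
    """
    best = {}
    for plot in plots:
        key = cell_index(plot["lat"], plot["lon"])
        rank = (SOURCE_PRIORITY[plot["source"]], -plot["n_focal_species"])
        if key not in best or rank < best[key][0]:
            best[key] = (rank, plot)
    return [plot for _, plot in best.values()]
```

Ranking by a tuple keeps the two tie-breaking rules in a single comparison; repeat visits to the same plot would need a separate date-based step that is not shown here.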

| Environmental predictors of presence/absence
We compiled data on climate, topography, soils and factors related to disturbance and dispersal (Table 3). All rasterized data were resampled by bilinear interpolation to the resolution of the digital elevation model to create harmonized prediction surfaces. Climate data were calculated from the TerraClimate monthly products at 150-arc-second resolution (Abatzoglou et al., 2018). Soil data from the National Soil Survey were obtained from gNATSGO (Soil Survey Staff, 2019). Distance from roads, a proxy for human-mediated disturbance and dispersal, was calculated based on the TIGER roads database (US Census Bureau, 2019).
As another metric of disturbance, we included a variable indicating whether each plot overlapped a previous fire footprint.

Annual species may not be detected if sampling occurs before emergence or after senescence, resulting in inaccurate absence records. To incorporate phenology effects, we included day of year of plot sampling as a predictor. When predicting species presence, we set the day of sampling for each species to the day with highest probability of observed presence for that species.
To reduce collinearity in our initial list of 21 possible predictor variables, we removed annual mean and hottest-month temperature (retaining coldest-month temperature), monthly temperature variance (retaining precipitation coefficient of variation), total annual precipitation (retaining climatic water deficit) and sand content (retaining clay and silt content). Remaining variables had Spearman rank correlations of rho < 0.6 across all plot locations. We also removed those variables that never ranked in the ten most influential in preliminary model fitting and variable importance testing, which included topographic position index, soil calcium carbonate concentration, soil available water content and soil depth to restrictive layer. Final models included 13 predictor variables (Table 3).
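As an illustration of the collinearity screen, the following Python sketch computes Spearman rank correlations between predictor columns and flags pairs at or above the 0.6 cut-off. It is a sketch only: the function names are hypothetical, the comparison on the absolute value of rho is an assumption (the text states only rho < 0.6), and the choice of which variable in a flagged pair to drop is left to the analyst, as in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def correlated_pairs(columns, threshold=0.6):
    """Return (name_a, name_b, rho) for predictor pairs with |rho| >= threshold."""
    names = sorted(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = spearman_rho(columns[a], columns[b])
            if abs(rho) >= threshold:
                flagged.append((a, b, rho))
    return flagged
```

Here `columns` maps each predictor name to its values across plot locations; constant columns are not handled and would need to be removed first.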

| Species distribution modelling
We trained and evaluated predictive models of species presence, then used those models to predict invasion risk within model training areas and across ecoregions where the species was observed.
We fit boosted regression trees using the "dismo" and "gbm" packages in R version 3.6.2 (Greenwell et al., 2019; Hijmans et al., 2017; R Core Team, 2019). This method allows for nonlinear relationships and interactions between multiple variables and species presence, is robust to missing data and correlations between variables and combines the strengths of multi-model averaging (fitting many simple regression trees) and boosting (focusing on prediction of hard-to-predict cases, rather than simple majority voting as in random forests) (Compton et al., 2012; Elith et al., 2008; Van Ewijk et al., 2014).
We present our methods in ODMAP format (Zurell et al., 2020) in a supplement (Table S1).

TABLE 1 Focal level III US EPA ecoregions for prediction of potential distributions of 15 invasive plant species in the USA. Numeric codes correspond to the map in Figure 1. Columns: Code, US EPA Level III Ecoregion, Level II Ecoregion

To distinguish between informative and uninformative absences, we adapted methods from species distribution modelling based on presences and pseudoabsences (Iturbide et al., 2015). We developed species-specific buffers from observed presences, beyond which including additional absence data did not improve the models' ability to accurately predict species presence (Barga et al., 2018). These model training buffers approximated the area where species may have already explored; absences within buffers likely represented unsuitable, rather than unexplored, habitat.
We set the buffer distance for each species based on fivefold cross-validation, evaluating a range of possible buffers between 40 and 340 km in 30-km increments. To minimize influence of spatial outliers on model training areas, we defined buffers based on mean distance to the second- through sixth-nearest presences, so that small clusters or isolated presence observations were not automatically included. For each buffer distance, we trained the model on a random sample of four-fifths of the plots within the buffer (training folds), stratified by presence, and tested the model on the remaining fifth (test fold).
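The buffer membership rule above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: great-circle (haversine) distance, the spherical Earth radius and the function names are assumptions, and the text does not specify how distances were computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_dist_to_neighbours(point, presences, first=2, last=6):
    """Mean distance (km) from a plot to its second- through sixth-nearest presences."""
    dists = sorted(haversine_km(point[0], point[1], p[0], p[1]) for p in presences)
    chosen = dists[first - 1:last]
    if not chosen:
        raise ValueError("need at least two presence records")
    return sum(chosen) / len(chosen)


def in_training_area(point, presences, buffer_km):
    """True if the plot falls inside the model training buffer."""
    return mean_dist_to_neighbours(point, presences) <= buffer_km


# Candidate buffers evaluated in cross-validation: 40, 70, ..., 340 km.
candidate_buffers_km = list(range(40, 341, 30))
```

Skipping the nearest presence means a single isolated record cannot pull distant plots into the training area, which matches the stated aim of reducing the influence of spatial outliers.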
We then selected the buffer distance that optimized area under the precision-recall curve (AUPRC) across cross-validation folds. In an invasion-risk framework, it is more important to accurately predict species presences than absences. We therefore selected model evaluation metrics that emphasized correct prediction of presences without discarding absence data. AUPRC measures the trade-off between precision (the proportion of predicted presences that are true presences) and sensitivity (the proportion of true presences that are detected). AUPRC is better suited to assessing model performance on imbalanced datasets (e.g. with many more absences than presences) than the commonly used area under the receiver-operator curve (AUROC) (Lobo et al., 2008; Saito & Rehmsmeier, 2015). AUROC can be high for models that correctly predict absences but fail to detect presences, because both types of errors are weighted equally, whereas AUPRC focuses on correct prediction of presences. Because AUPRC for a null model (random guessing) is equal to species prevalence, the proportion of plots where the species was observed (Carrington et al., 2020), we based our buffer selection on the difference between AUPRC and species prevalence within the buffer. Larger buffers generally decreased AUPRC, but also decreased prevalence. If the difference in the mean net AUPRC (AUPRC minus prevalence) between several buffers was less than 1% of the maximum net AUPRC, the largest of these buffers was selected to include the maximum number of training plots.
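A small Python sketch may help make the net AUPRC and the buffer selection rule concrete. It uses average precision as one common estimator of AUPRC, which is an assumption (the text does not state which estimator was used), and it reads the tie rule as "among buffers whose mean net AUPRC is within 1% of the maximum, take the largest". Function names are hypothetical.

```python
def average_precision(labels, scores):
    """Average precision: mean of the precision values at each true presence,
    with plots ranked from highest to lowest predicted probability."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def net_auprc(labels, scores):
    """AUPRC minus prevalence, the expected AUPRC of random guessing."""
    prevalence = sum(labels) / len(labels)
    return average_precision(labels, scores) - prevalence


def select_buffer(mean_net_by_buffer, tolerance=0.01):
    """Largest buffer whose mean net AUPRC is within tolerance * max of the max."""
    best = max(mean_net_by_buffer.values())
    eligible = [b for b, v in mean_net_by_buffer.items() if best - v < tolerance * best]
    return max(eligible)
```

For example, with mean net AUPRC values of 0.300, 0.299 and 0.250 at 40, 70 and 100 km, the first two differ by less than 1% of the maximum, so the 70-km buffer would be chosen.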
To increase model sensitivity and emphasize accurate prediction of presences, we trained all models on a subset of data from within the invaded-area buffer, such that at least 10% of observations in the training data represented species presences. In each cross-validation step, we selected training data from within the buffer by including all presence observations and adding randomly selected absences until the 10% prevalence threshold was reached. If species prevalence within the buffer already exceeded 10%, this step was skipped.
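The presence-weighted subsampling step can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' R code: the 10% floor is expressed as at most nine absences per presence, and the function name and `seed` argument are hypothetical.

```python
import random


def subsample_to_prevalence(presences, absences, absences_per_presence=9, seed=0):
    """Return (presences, absences) with prevalence of at least 1 / (1 + ratio).

    With the default ratio of 9, presences make up at least 10% of the result.
    All presences are kept; absences are drawn at random without replacement.
    """
    max_absences = absences_per_presence * len(presences)
    if len(absences) <= max_absences:
        # Prevalence within the buffer already meets the floor: skip the step.
        return list(presences), list(absences)
    rng = random.Random(seed)
    return list(presences), rng.sample(list(absences), max_absences)
```

Fixing the ratio as an integer avoids rounding issues when converting a prevalence target into an absence count.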
In all model-building steps, we fit 2,000 boosted regression trees per model, with a tree complexity (maximum possible order of interactions between predictors) of 5 and a learning rate of 0.02. These parameters were derived from preliminary models for each species using the gbm.step method to optimize the number of trees per model based on presence-stratified 10-fold cross-validation and minimization of model deviance, which resulted in between 1,700 and 2,400 trees.

TABLE 2 Non-native invasive species selected for modelling species distributions and invasion risk. Nomenclature is from USDA Plants database (NRCS, 2011)

TABLE 3 (fragment) Ruggedness: mean of the absolute differences in elevation of a cell and its eight neighbours (Wilson et al., 2007)

| Model evaluation
For each species, we calculated AUPRC, prevalence and AUROC, as the mean values from the five model cross-validation steps within the final model training area. We also calculated model specificity, sensitivity, false positive rate and the Symmetric Extremal Dependence Index (SEDI, a model performance measure for predicting low-frequency events (Wunderlich et al., 2019)) at two thresholds of predicted probability of species presence for classifying a species as present at a point: 0.15 (low threshold for maximizing prediction of presence) and 0.5 (more conservative). Choice of a threshold for converting probabilities to expected presence/absence, or integration across possible thresholds as with AUPRC, depends on the data user's priorities.
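The threshold-based metrics can be illustrated with a short Python sketch. The SEDI expression follows the standard definition in terms of hit rate H (sensitivity) and false alarm rate F (false positive rate), SEDI = (ln F − ln H − ln(1 − F) + ln(1 − H)) / (ln F + ln H + ln(1 − F) + ln(1 − H)); the function name and the returned dictionary layout are assumptions, and a plot is classed as present when its predicted probability exceeds the threshold.

```python
import math


def threshold_metrics(labels, probs, threshold):
    """Sensitivity, specificity, false positive rate, precision and SEDI
    for presence/absence labels (1/0) at a given probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_present = p > threshold
        if y == 1 and predicted_present:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_present:
            fp += 1
        else:
            tn += 1
    hit_rate = tp / (tp + fn)
    false_alarm = fp / (fp + tn)
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    # SEDI is undefined when either rate is exactly 0 or 1.
    if 0 < hit_rate < 1 and 0 < false_alarm < 1:
        ln_f, ln_h = math.log(false_alarm), math.log(hit_rate)
        ln_1f, ln_1h = math.log(1 - false_alarm), math.log(1 - hit_rate)
        sedi = (ln_f - ln_h - ln_1f + ln_1h) / (ln_f + ln_h + ln_1f + ln_1h)
    else:
        sedi = None
    return {
        "sensitivity": hit_rate,
        "specificity": 1 - false_alarm,
        "false_positive_rate": false_alarm,
        "precision": precision,
        "sedi": sedi,
    }
```

Calling the function at 0.15 and again at 0.5 on the same cross-validation predictions shows the trade-off described above: the lower threshold raises sensitivity at the cost of more false positives.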
Variable importance was measured as the relative reduction in model mean square error resulting from data partitions based on each variable, averaged across all trees in the final model, as implemented in the "gbm" package (Friedman, 2001;Greenwell et al., 2019). We produced partial effects plots to visualize relationships between individual predictors and species presence (Hijmans et al., 2017).

| Prediction of invasion probability
After evaluating the models, we fit a single final model per species using all presence observations within the training area and enough absence observations from the area to maintain 10% prevalence.
We projected model-predicted probability of occurrence at 3-arc-second resolution across the level III ecoregions where each species was observed, within the focal area.

... its range in several ecoregions, filling in gaps between current presences in the Northwestern Great Plains and potentially expanding throughout the semi-arid prairies (Figure 3).

| Cold desert grasses
The training area for Bromus tectorum included the entire western USA; no vegetation survey plot was more than 340 km from the nearest five presences and more than 20% of plots contained the species (Figure 3). Although B. tectorum occurred across the range of environmental conditions, it was most abundant in Cold Deserts (Figure 3).
Modelled probability of presence was higher with CWD > 500 mm, summer precipitation < 100 mm, and in burned areas (Figure S1c), although burn history was less important than climatic predictors (Table 4). Models had the highest precision of any species and high sensitivity and SEDI (Table 5). Due to high prevalence, B. tectorum models had a relatively high false positive rate of > 0.05 at the 50% probability-of-presence threshold (relative to < 0.03 for all other species).
Taeniatherum caput-medusae was observed in Cold Deserts, particularly those with clayey soils and winter-dominant precipitation (Figure 2, Figure S2a). Its probability of presence was greatest for clay contents from 20%-50%, CWD between 700 and 1,000 mm, and minimum temperatures above −4°C (Figure S1e). Sampling day of year was the most important predictor of presence (Table 4) with highest detection probabilities in early summer and late fall. This species was predicted to expand within the Columbia Plateau and possibly Snake River Plain, Northern Basin and Range, and Western Cordillera ecoregions (Figure 3). Model performance was relatively high as was its local prevalence (> 7% of plots within the training area) (Table 5).
The perennial grass Poa bulbosa was associated with coldest-month temperatures below 0°C (Figure 2). It was predicted to expand within the Columbia Plateau, Eastern Cascades, Sierra Nevada, and Mojave Basin and Range (Figure 3). Its probability of presence was strongly associated with winter-dominant precipitation (PTCOR near r = −1) and sampling in late spring (Table 4, Figure S1f).

| Warm desert grasses
Bromus rubens was confined largely to the hot southwest margin of the focal area. Absence data from up to 340 km of presences improved model performance, defining a climatic niche with coldest-month temperatures above freezing (generally > 5°C), low but highly variable summer precipitation and annual CWD > 1,000 mm (Figure S1d).
Model predictions were robust, with the highest sensitivity and SEDI and high net AUPRC within the 340-km buffer (Table 5). Infilling was predicted within the Mojave Basin and Range, and this species may threaten ecosystems outside of the study area, particularly in southern California (Figure 3).

FIGURE 2 Environmental conditions of plots with species present (grasses)

Draba verna was present in northwestern Cold Deserts and Blue Mountains, primarily in regions with winter-dominant precipitation and CWD < 1,000 mm annually (Figure 4, Figure 5). Like C. testiculata, it was predicted to occur most frequently for spring sampling dates and subfreezing minimum temperatures (Table 6, Figure S1h). This species was also associated with higher silt content. It was projected to expand substantially outside its 40-km model training buffer, primarily in the Middle Rockies (Figure 5).
The observed range of Halogeton glomeratus was centred in the Cold Deserts (Figure 5). It is a halophyte with higher modelled probability in soils with nonzero electrical conductivity (i.e. saline soils) (Table 6). The species occurred in flat areas (low ruggedness) with low summer precipitation (< 100 mm, but positive PTCOR) and high CWD (> 1,000 mm) and had a narrower range of topographic heat load index than most species (Figure 4, Figure S2b). This species was predicted to remain within the 160-km model training buffer, filling in suitable habitat within the Central Basin and Range, Wyoming Basin and Colorado Plateaus (Figure 5). Models for this species had relatively high sensitivity and SEDI, consistent with the strength of salinity as a predictor (Table 5).
Tragopogon dubius had the most eastern range with extensive infilling predicted in the Northwestern Great Plains, Glaciated Plains, northwestern Cold Deserts and Columbia Plateau (Figure 5). Probability of presence was high in regions with minimum temperatures below freezing and CWD below 1,000 mm and either cold-season or warm-season precipitation (Table 6, Figure S1o). It was observed in areas with greater ruggedness than most species (Figure S2b).

| Warm desert forb
Erodium cicutarium inhabited a range similar to the warm-desert grasses (Figure 5) with modelled probability increasing at above-freezing temperatures (especially > 5°C), CWD > 500 mm (especially > 900 mm) and low summer precipitation, even where most precipitation falls in summer (Table 6, Figure S1i). It was also observed and predicted to expand in patches throughout the Cold Deserts. It was most commonly detected in early spring, particularly in areas with lower CWD, but could be detected throughout the year in areas with CWD > 1,200 mm (Figure 4).

TABLE 4 Top five predictors of grass species presence based on final models including all available presence data. Relative influence is the improvement in model mean square error at all nonterminal nodes based on the target predictor, averaged across all trees in the model (Friedman, 2001)

Note: Clay = per cent clay, CWD = climatic water deficit, Day = sampling day of year, EC = electrical conductivity, PPT_summer = summer (June, July, August) precipitation, PPTCV = coefficient of variation of monthly precipitation, PTCOR = correlation between monthly temperature and precipitation, Silt = per cent silt, TMIN = temperature of coldest month (°C).

| Disturbance-driven and widespread forbs
Lactuca serriola was widely distributed throughout the western USA, though it was most common in northern latitudes. It was associated with moderately dry climates (CWD 500-1,000 mm/year) with winter-dominant precipitation (PTCOR < 0) (Figure 4). Few presences were observed in the High Plains or Arizona/New Mexico Mountains, but it was predicted to expand its distribution widely in these regions and the Columbia Plateau (Figure 5). It was frequently observed in previously burned areas and discrete patches of predicted expansion corresponded to fire footprints (Figure 5).
Sisymbrium altissimum was patchily observed in burned areas throughout the Cold Deserts (Figure 5). Burn status contributed to over 40% of regression tree nodes, whereas summer precipitation (greatest probability of presence with < 100 mm) and coldest-month temperatures (greatest probability from 0 to −10°C) each contributed about 11% (Figure S1n). It was observed and predicted to expand in eastern Cold Desert and prairie ecoregions within its 340-km modelled buffer (Figure 5). As with L. serriola, this species did not have a clearly defined climate niche and model performance on training data was intermediate (Table 5).
Lepidium perfoliatum was most common in the Cold Deserts but showed a wide and patchy distribution across the range of climate variables (Figure 5). Presence probability was highest in areas with summer precipitation < 50 mm and CWD > 600 mm (Table 6, Figure S1l). Model performance was relatively poor for this species, consistent with a generalist habitat and seasonal detection constraints (Table 5).
As with L. perfoliatum, Salsola tragus had a wide and patchy distribution, primarily in the Cold Deserts, and was similarly difficult to predict. Observations were associated with flat terrain in basins (narrow range of HLI and ruggedness) (Figure S1b).

Predicted probability of presence was highest in areas with moderately high CWD (typically > 1,000 mm), warm-season precipitation (PTCOR > 0) and nonzero electrical conductivity (Table 6, Figure S2m). In addition to infilling in the western Central Basin and Range, the model predicted extensive expansion into the Southwestern Tablelands, where warm-season precipitation dominates (Figure 5).

TABLE 5 Metrics of predictive ability for models trained and tested within buffered invaded areas. Metrics are means from fivefold cross-validation on all training data within the invaded areas. Sensitivity, false positive rate, precision and the Symmetric Extremal Dependence Index (SEDI) are each calculated at two thresholds for classifying a species as present. The low threshold, considering a species as present if the predicted probability of presence exceeds 0.15, maximizes prediction of presences, while a threshold of 0.5 is more conservative

| Model performance on test data
Overall, the models discriminated between presences and absences in data withheld from the training set (Table 5). Values of AUROC were generally high (0.85 or greater for all species). The more relevant AUPRC metric was usually an order of magnitude higher than its baseline of species prevalence in the test dataset, ranging from 0.228 for B. japonicus (present in only 2.8% of plots within the 40-km invaded-area buffer) to 0.637 for B. rubens (present in 4.4% of plots within a 340-km buffer and occupying a well-defined temperature niche).
Even when trained within buffers and with presence-weighted training data, models were more likely to miss observed presences than to predict presences in areas where none were observed. With a threshold of 50% probability for classifying a species as present in a grid cell, models predicted less than 20% of S. tragus and L. perfoliatum presences in cross-validation. At the 50% presence probability threshold, the false positive rate was less than 2.5% for all species except B. tectorum. Therefore, the models provided a conservative estimate of which areas might contain invasive species.
Model projections beyond training-area buffers generally stayed within the range of environmental conditions covered by the training data (Figure S3) with a few exceptions. For B. japonicus and D. verna, survey points did not cover the coldest parts of the ecoregions, potentially affecting model predictions for D. verna, which was associated with low temperatures. Ranges of CWD in invaded ecoregions extended beyond those of sample plots for P. bulbosa and B. japonicus. Plots with T. caput-medusae did not exhibit the full range of intra-annual precipitation variation; this species was associated with both high and low values of PPTCV (Figure S1e).

FIGURE 3 Predicted habitat suitability for grass species. Probability of presence is shown in a range from yellow to red within the model training area for each species, and from blue to red outside the training area, within the Level III ecoregions where the species has been observed. Small grey points represent survey plots where each species was present. Albers equal-area conic projection. BRJA = Bromus japonicus, BRRU2 = Bromus rubens, BRTE = Bromus tectorum, SCBA = Schismus barbatus, TACA8 = Taeniatherum caput-medusae, POBU = Poa bulbosa

| DISCUSSION
Our distribution maps provide conservative predictions of invasion potential, differentiating between buffered areas around observed occurrences where models were trained and currently unoccupied areas that may be suitable for invasion. We compiled nearly 150,000 presence/absence observations to overcome the challenges of predicting invasive species distributions from patchy and expanding ranges, and systematically selected the most informative absence data to model the environmental niches of 15 common invasive plants, including relatively unstudied species.
To our knowledge, this is the first attempt to map regional invasion potential for these species at a resolution that can inform

| Predicted invasion patterns reflect species traits and highlight future vulnerabilities
Our model outputs suggested sources of ecosystem vulnerability to invasion under current and future conditions and were consistent with known species traits. Predicted probability of presence tended to follow ecoregional boundaries, indicating that models reflected established thresholds in environmental conditions. As expected, climate variables were the best predictors of most species' distributions. Some specialists were predicted by soil characteristics, while some generalists were most responsive to disturbance. These predictions can be incorporated into an adaptive management frame-

FIGURE 4 Environmental conditions of plots with species present (forbs)

... Parmesan & Yohe, 2003). The effect of minimum temperatures on the focal species' future distributions will depend on their physiological tolerances. For example, B. rubens occurs in areas with higher minimum temperatures than B. tectorum, has lower freezing tolerance (Bykova & Sage, 2012), and is less broadly distributed in warmer, southern ecoregions (Salo, 2005). Bromus tectorum's distribution is projected to expand into areas that are becoming warmer, but to contract in the hotter and drier edge of its range (Bradley et al., 2016). Bromus rubens is projected to expand into those areas becoming less suitable for B. tectorum as temperatures warm.
FIGURE 5 Predicted habitat suitability for forb species. Probability of presence is shown in a range from yellow to red within the model training area for each species, and from blue to red outside the training area, within the Level III ecoregions where the species has been observed. Small grey points represent survey plots where each species was present. Albers equal-area conic projection. CETE5 = Ceratocephala testiculata, DRVE2 = Draba verna, ERCI6 = Erodium cicutarium, HAGL = Halogeton glomeratus, LASE = Lactuca serriola, LEPE2 = Lepidium perfoliatum, SATR12 = Salsola tragus, SIAL2 = Sisymbrium altissimum, TRDU = Tragopogon dubius

Temperature interacts with precipitation to influence climatic water deficit (CWD), which was among the top five predictors of all but one species. Many focal species were associated with relatively high CWD, which is predicted to increase as greater evapotranspiration from higher temperatures more than offsets potential increases in precipitation (Cook et al., 2004; Jeong et al., 2014). Many annual or short-lived invasive plant species in these environments have highly plastic responses to resource availability and can rapidly grow and reproduce when soil water is available and temperatures are conducive (Hulbert, 1955; James et al., 2011). Regional increases in CWD may promote expansion of species that are currently associated with areas of high CWD, but differences will likely exist among species. Growth and reproduction of shallow-rooted species, including the focal grasses, occur prior to drying of shallow soils in summer across much of the study area (Ryel et al., 2010). For deeper-rooted species, like many of the forbs, growth and reproduction usually extend into the summer (Prevéy et al., 2010). Decreases in precipitation and recharge of deeper soil water may disadvantage both deeper-rooted natives and invaders in areas with low or decreasing summer precipitation (Wilcox et al., 2012).
Ecosystem conversion to shallow-rooted annual grasses and forbs and increased uptake from shallow soil layers may have a similar effect.
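The way higher evaporative demand can outweigh a modest gain in precipitation can be illustrated with a minimal sketch. It assumes the common water-balance definition of CWD as potential minus actual evapotranspiration, together with a deliberately crude rule that actual evapotranspiration in each month is the smaller of evaporative demand and water supply. The function name, the monthly values and the warming scenario are hypothetical and are not taken from the study's data or analysis code.

```python
def climatic_water_deficit(pet_mm, water_supply_mm):
    """Annual CWD (mm) from monthly PET and monthly water supply.

    Assumption: AET in each month is min(PET, supply), with no soil
    water carried over between months (a simplification for illustration).
    """
    if len(pet_mm) != len(water_supply_mm):
        raise ValueError("monthly series must have equal length")
    total = 0.0
    for pet, supply in zip(pet_mm, water_supply_mm):
        aet = min(pet, supply)  # evapotranspiration limited by demand or water
        total += pet - aet      # unmet evaporative demand for the month
    return total


# Hypothetical monthly values (mm) for a semi-arid site, Jan-Dec.
pet_now = [10, 15, 30, 55, 90, 120, 140, 125, 85, 45, 20, 10]
ppt_now = [30, 25, 30, 25, 25, 15, 10, 10, 15, 20, 30, 30]

# Hypothetical scenario: 10% more precipitation, 15% more evaporative demand.
pet_future = [p * 1.15 for p in pet_now]
ppt_future = [p * 1.10 for p in ppt_now]

cwd_now = climatic_water_deficit(pet_now, ppt_now)
cwd_future = climatic_water_deficit(pet_future, ppt_future)
print(f"CWD now: {cwd_now:.1f} mm, future: {cwd_future:.1f} mm")
```

In this hypothetical scenario, annual CWD rises from 540 mm to about 628 mm even though precipitation increases in every month, because most of the added demand falls in months when water is already limiting.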
The abundance and timing of precipitation vary across the western USA and drive species distributions (Robertson et al., 2009) (Duda et al., 2003).
Halogeton glomeratus may benefit from climate warming in some areas as higher temperatures facilitate its germination (Khan et al., 2001). Salsola tragus is salt tolerant and was associated with higher salinity, but invades diverse environments (Beckie & Francis, 2009). Climatic factors were important in determining the species' ranges and will interact with soil properties to influence future invasions.

| Disturbance may rapidly expand potential habitat for invasion
Fires are becoming larger and more frequent across the arid and semi-arid West (Abatzoglou & Kolden, 2013;Dennison et al., 2014;Fusco et al., 2019) and are rapidly changing the map of habitat suitability for post-fire invaders. Overlap with a fire footprint was the top predictor for S. altissimum and L. serriola and was within the top five for C. testiculata and T. dubius. These species have seeds that are dispersed by wind or adhesion to animals, and the strength of fire as a predictor of their presence reflects both their capacity for widespread dispersal and establishment following disturbances (Brandt & Rickard, 1994). Projected increases in burned area may therefore translate into large increases in these species' ranges.
Distance to roads was not highly predictive of any species, likely because more than 87% of sample plots occurred within 1,000 m of a road and no species was confined to roadside margins. In Wyoming,

| Data availability and species characteristics affect model outcomes
For many species, timing of sampling contributed meaningfully to the models, suggesting that imperfect detection due to phenology artificially constrained modelled suitable habitats (Jarnevich et al., 2015). Sampling date may also have acted as a proxy for climatic parameters not fully captured in the other predictors. While we expect field crews accurately identified common invasive species during the growing season, pre-emergence or post-senescence sampling of short-lived forbs like C. testiculata and D. verna and misidentification of species like T. caput-medusae may have resulted in false absence records. This indicates that monitoring of these species will need to be timed to maximize detection and more accurately model their distributions.
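How sampling outside a species' visible growing window can generate false absences can be sketched with a simple calculation. The example assumes that a species truly present on a plot is recorded with a detection probability that is high inside a phenological window and low outside it. The window dates, probabilities, visit schedules and function names are hypothetical and are not drawn from the study.

```python
def detection_probability(day_of_year, window_start, window_end,
                          p_inside=0.9, p_outside=0.1):
    """Hypothetical detection probability for a plot sampled on a given day."""
    if window_start <= day_of_year <= window_end:
        return p_inside
    return p_outside


def expected_false_absence_rate(sampling_days, window_start, window_end):
    """Expected share of truly occupied plots that are recorded as absences."""
    if not sampling_days:
        raise ValueError("need at least one sampling day")
    missed = [1.0 - detection_probability(d, window_start, window_end)
              for d in sampling_days]
    return sum(missed) / len(missed)


# Hypothetical short-lived spring forb that is visible from day 90 to day 150.
early_season_visits = [100, 110, 120, 130, 140]
late_season_visits = [120, 150, 180, 210, 240]

early_rate = expected_false_absence_rate(early_season_visits, 90, 150)
late_rate = expected_false_absence_rate(late_season_visits, 90, 150)
print(f"false-absence rate, early visits: {early_rate:.2f}")
print(f"false-absence rate, late visits: {late_rate:.2f}")
```

Under these assumed values, the expected false-absence rate rises from 0.10 when all visits fall inside the window to 0.58 when three of five visits fall after it, which is the kind of bias that a sampling-date predictor could absorb.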
Prevalence and habitat specificity also affected model performance: the distribution of species with high prevalence in a limited subset of environmental conditions, such as S. barbatus, was easiest to accurately model. Model performance was poorer for species that were widely dispersed across the study region and range of environmental conditions, particularly L. perfoliatum and S. tragus. These species may be generalists for which presences are defined more by rare long-range dispersal events than by environmental conditions (Beckie & Francis, 2009

| Species distribution models can help manage emerging invasion risks
Both species distributions and areas at risk will change as the influences of climate change and natural and anthropogenic disturbance increase. Models of species' distributions should be revised as part of an adaptive management framework that incorporates smaller scales. Predicted areas of expansion can be targeted to search for new populations and models can incorporate new records of occurrence and understanding of species' ecology (Uden et al., 2015).
The environmental conditions associated with each species that are identified in the models can be used to help target species-specific monitoring and control efforts at finer management scales. The information provided by the models can be supplemented with local knowledge of soil characteristics, microclimates and disturbance history. Our approach provides the needed distribution maps and information on environmental suitability of common invaders to help increase their detection and prevent their expansion across the arid and semi-arid western USA.

ACKNOWLEDGMENTS
This work was supported by a grant from the US Joint Fire Science Program (JFSP Project ID: 19-2-02-11). We thank John Bradford, Brice Hanberry, Francis Kilkenny and Shannon Still for their feedback on the study design.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/ddi.13232.

DATA AVAILABILITY STATEMENT
Raster files of species probability maps that were generated for this study, analysis R code and associated metadata are available on the U.S. Forest Service Research Data Archive (https://doi.org/10.2737/RDS-2020-0078).