The use of state‐and‐transition models in assessing management success

State‐and‐transition models (STM) are valuable tools that aid understanding and management of ecosystems. There also is potential for STMs to be used as a framework to assess whether management actions are achieving desired conservation objectives. However, few published examples exist where STMs have been used in this way. Using high‐quality empirical field data collected in endangered temperate woodlands across a 6‐year period, we explore whether a Box Gum Grassy Woodland STM employed in an Australian agri‐environment scheme can be used to assess whether management interventions are achieving an improvement in woodland Condition State. We found there was insufficient contextual information in the STM to facilitate its robust use as a framework for assessing whether management interventions were achieving conservation outcomes in Box Gum Grassy Woodlands. Weather was a key driver influencing management effectiveness, but its effects were absent from the STM, as were probabilities and time frames for transitions between Condition States. The deficiencies in the STM may preclude accurate conclusions about the effectiveness of the program. Given the influence of weather on the system, longer time frames are required to adequately assess the influence of management on key variables (e.g., native plant richness, native ground cover) underpinning the STM. This case study provides opportunities to understand the potential implications of using insufficiently contextualized STMs as frameworks for assessing whether management actions are achieving desired conservation objectives. It also provides opportunities to learn from what went wrong. To this end, based on our findings, we provide practical recommendations (applicable beyond our case study system) for improving the construction and implementation of STMs for the purposes of evaluating the success of management interventions in ecological restoration and conservation programs.

The development of most STMs involves: (a) describing abiotic features (e.g., climate, soils, topographic position) and identifying vegetation communities that are present in a discrete area; (b) identifying causes of transitions between Condition States and constraints to recovery of identified vegetation communities; and (c) describing characteristics of the vegetation communities in different Condition States (from pristine vegetation through to highly degraded vegetation) (Bestelmeyer et al., 2017). In defining the "reference state" of an STM, due consideration also should be given to evolutionary and land-use history of a community, as well as time scales for processes driving or affecting the community (e.g., disturbance) (Bestelmeyer et al., 2017).
STMs have been valuable tools for researchers and land managers to: (a) promote the conceptual understanding of ecosystems (Bestelmeyer et al., 2017), (b) identify gaps in knowledge about ecosystems (Knapp, Fernández-Giménez, Briske, et al., 2011), (c) facilitate communication between a variety of stakeholders about target ecosystems (Bestelmeyer et al., 2017; Knapp, Fernández-Giménez, Kachergis, & Rudeen, 2011), and (d) generate predictions about ecosystem responses to management (Scanlan, 1994). There also is potential for STMs to be used as tools to assess whether management actions are achieving desired conservation objectives as, theoretically, a shift from a lower Condition State to a higher Condition State after management interventions are implemented would suggest that those actions are having a positive influence on ecosystem condition. In addition, recent developments in STMs (e.g., quantitative STMs, Bestelmeyer et al., 2017; state and transition simulation models, Daniel, Frid, Sleeter, & Fortin, 2016) have broadened the original scope and capabilities of these models. STMs are now being used to underpin assessment frameworks for conservation (e.g., the Melbourne Strategic Assessment; Sinclair et al., 2019), as predictive tools for management (e.g., simulation models to evaluate invasive species and fire management options; Jarnevich et al., 2019), and in planning and monitoring ecological restoration actions (Fraser, Rumpff, Yen, Robinson, & Wintle, 2017; Wainwright et al., 2020).
Two key features of STMs make them suitable tools for assessing the success of management interventions. First, STMs can integrate multiple qualitative and quantitative knowledge sources into a single model (Kachergis et al., 2013; Knapp, Fernández-Giménez, Kachergis, & Rudeen, 2011), which increases the likelihood that ecosystem dynamics and drivers are represented holistically. This is important for targeting response variables and covariates for inclusion in statistical models that allow for appropriate evaluation of the effects of management on ecosystem condition. Second, STMs can quantify the likelihood of a transition between states (e.g., Halofsky et al., 2013; Rumpff et al., 2011), the time frames for such transitions (e.g., Jones & Burrows, 1994; Westoby et al., 1989), and levels of uncertainty in these elements of a model (see Kachergis et al., 2013). These features assist in contextualizing evaluation efforts; that is, they indicate whether a transition is likely to occur in the time frame of a management program or period of data collection, and the confidence around these estimates.
Limited documentation is available that uses detailed field data to explore whether STMs can be used as a framework to effectively assess whether management interventions are achieving desired conservation objectives. Here, we use empirical data to explore: (a) if a Box-Gum Grassy Woodland (BGGW) STM can be used to frame the assessment of whether management interventions resulted in improved woodland condition (i.e., a change in Condition State) over a 6-year period in an Australian agri-environment scheme; and (b) if no shift in Condition State is observed (i.e., the indicator of improvement in woodland condition), what additional information is needed in the STM framework and assessment process to better contextualize the success (or failure) of conservation actions in these systems. For this study, we define an agri-environment scheme as a conservation mechanism in which landholders are compensated for modifying their farming practices to provide environmental benefits (e.g., through less intensive management, conservation covenants, or active restoration; Kleijn & Sutherland, 2003).

The Environmental Stewardship Program
The agri-environment scheme we used as our case study to explore STMs was the Environmental Stewardship Program (henceforth "ESP"), in which data were collected between 2010 and 2016 in a broadscale monitoring program (see Lindenmayer et al., 2012). The ESP aims to protect and improve the condition of BGGW, an endangered ecological community in Australia (Department of Agriculture, Water and the Environment, 2021). BGGW is characterized by an overstory of yellow box (Eucalyptus melliodora A.Cunn. ex Schauer), white box (Eucalyptus albens Benth.), Blakely's red gum (Eucalyptus blakelyi Maiden), or grey box (Eucalyptus microcarpa Maiden) and has been extensively cleared across eastern Australia (Yates & Hobbs, 1997). BGGW was formerly extensive but only small remnant patches remain (Fischer et al., 2009), and these patches are highly degraded due to land clearing (Gibbons & Boak, 2002), livestock grazing, and weed invasion. Given this context, restoration and management interventions are needed, including grazing control, replanting, and fencing (Lindenmayer, Michael, Crane, Florance, & Burns, 2018). These types of management interventions can be effective within 5-10 years (see Lindenmayer, Blanchard, Crane, Michael, & Florance, 2018; Lindenmayer, Blanchard, Crane, Michael, & Sato, 2018).
Under the ESP, it was anticipated that payments to landholders would incentivize conservation-sympathetic management in remaining BGGW remnants on farms, leading to improved conservation outcomes and vegetation condition (Burns, Zammit, Attwood, & Lindenmayer, 2016). The combination of actions employed by a landholder was specific to the condition and land-use history of a particular farm. However, conservation-sympathetic actions across the whole program included ceasing fertilizer application and cultivation, retaining standing trees and bush rocks, targeted weed control, strategic livestock grazing to avoid important flowering or plant growth periods, or ceasing grazing altogether (Department of the Environment, Water, Heritage and the Arts (DEWHA), 2009). The ESP occurs across approximately 172,000 km² of eastern Australia (Figure 2) in landscapes with a land-use history dominated by livestock grazing and cropping that would largely have been covered by temperate woodlands prior to European settlement (Yates & Hobbs, 1997). The study area is largely characterized by a temperate climate with no dry season, and warm to hot summers (Beck et al., 2018). Over the period of study, mean maximum summer temperatures ranged between 24.7°C and 32.2°C and mean annual precipitation ranged between 436 mm and 859.5 mm.
At the beginning of the ESP, an STM developed using a combination of empirical and expert-derived data (see Appendix A in Supporting Information) was used to assess the initial condition of woodland remnants for each of the 158 farms in the scheme. This aided identification of specific management actions for each farm (Zammit, Attwood, & Burns, 2010). The STM used for the ESP defined five states (with sub-states for States 1, 2, and 3), where State 1 is representative of intact, forb-rich woodland and State 5 is a highly degraded woodland with no tree cover or any evidence of natural regeneration (see Figure 1). States 2 to 4 represent varying conditions of woodland between intact and highly degraded, as defined by native tree canopy cover, native plant species richness, proportion of the ground layer that is native, and presence of native regeneration in a given woodland patch (DEWHA, 2009). In addition, these states are defined by land-use history; for example, grazing intensity and type, and fertilizer addition to patches (Figure 1; DEWHA, 2009). The STM also illustrates the possible (or predicted) transitions that can occur for woodland remnants in different states. As part of the ESP program evaluation, it was intended that the BGGW STM would be used to help determine if management interventions led to a Condition State change (as a surrogate for vegetation condition improvement).
The development and initial application of this STM did not involve the authors of this study. We were engaged during the later phases of monitoring design and implementation of the ESP. As such, we were users of, and can be considered independent assessors for, the STM. We were interested in exploring whether the developed STM could provide a framework for the assessment of management success by not only defining the state that would be monitored but also in making predictions about transitions between states that could be tested with monitoring. The STM used for the ESP was modified from McIntyre and Lavorel (2007) who sought to summarize available knowledge to improve understanding of vegetation change and plant functional trait responses to different land uses in BGGW across temperate subhumid and temperate cool-season wet climate types. Additional information regarding the STM is provided in Appendix A and Figure S1. To refine the STM presented in McIntyre and Lavorel (2007) for the ESP, modifications were made by Australian Government Environmental Stewardship Program staff. Text descriptions of BGGW characteristics were incorporated into the STM, qualitative transition probabilities were represented through thick versus thin transition arrows, the "enriched grassland" state was replaced with revegetated areas, and transitions between State 5 and State 1 were removed (Figure 1; DEWHA, 2009). To assign initial Condition State to a BGGW patch in the ESP, the following variables were considered: understory richness (excluding grasses), number of mature trees, length of fallen logs, % native overstory cover, % native mid-story cover, % exotic under-story cover, and over-story regeneration (Whitten, Gorddard, Langston, & Reeson, 2009).
The monitoring program accompanying the ESP was established in 2010 and focused on 268 sites on 158 farms. On each farm, paired sites were established (where possible) to monitor the long-term success of the ESP. One site was in a woodland remnant associated with the program where targeted management interventions, like grazing control, occurred. A second site was subject to "business-as-usual" farm management practices (e.g., uncontrolled grazing; Lindenmayer et al., 2012). In 2010, at all 268 sites, we collected site-level vegetation variables and structural attributes at permanent locations (baseline data). We repeated surveys in 2016. Further details on field methods and measured variables are given in Sato et al. (2016).
For this case study, we used field measurements to assign a Condition State to a subset of the ESP sites in the initial (2010) and final (2016) year of survey according to the STM. We also used these data to assess whether management actions led to improvements in the condition of ESP remnant woodland patches through time. For our case study, we selected a subset of ESP sites based on grazing management practices. We included only those farms where holistic, intermittent grazing was practiced prior to stewardship management. This was done to remove variability in condition response between farms associated with differing "business-as-usual" management practices. The STM from Whitten, Doerr, Doerr, Langston, and Wood (2010) that we used in our analysis (see Figure S3) is a successor to the BGGW STM that was developed to assign classes to sites in the ESP monitoring program (Figures 1, S2; Appendix A). However, Whitten et al. (2010) provide additional detail required to assign states to sites that was not provided in the original BGGW documentation. Figure 3 is a summary of the proportion of sites in each state, in each year of survey. While several variables underpin each Condition State, a site is assigned to the lowest Condition State that any of the field variables indicate it is in (Whitten et al., 2010). Some patterns observed from the empirical field data were promising in terms of conservation (e.g., a decrease in lower Condition State 3a Stewardship sites between 2010 and 2016). However, a parallel increase in higher condition Stewardship sites (i.e., State 1a or 2a) was not observed between 2010 and 2016, meaning that the condition of State 3a Stewardship sites across the study area likely declined during this period. The same pattern also was observed in the Control sites, indicating that the decline in site condition across the study area was likely due to factors other than landholder management alone (e.g., a drought that coincided with the period of data collection). Patterns of change in site condition between 2010 and 2016 for Stewardship and Control sites revealed that a higher proportion of sites were improving under Stewardship management compared with "business as usual" management (17 vs. 9%, respectively). In addition, similar proportions of Stewardship and Control sites were declining (45 vs. 50%, respectively) or remaining stable (38 vs. 41%, respectively) (Figure 4). Most of the sites that declined between 2010 and 2016 for both Stewardship and Control sites were initially in Condition States 3a and 4. Most sites that improved for both Control and Stewardship treatments were initially in Condition State 3b and 4 (but the ratio of State 3b and 4 sites that improved in condition varied between Control and Stewardship treatments; Figure 4).
FIGURE 4 Proportion of declining, improving and stable sites under (a) business-as-usual ("control") and (b) stewardship management regimes. Within the declining, improving and stable categories, the proportion of sites in each state ("2A," "3A," "2B," "3B," "4," "5") contributing to that category is also presented.
A lack of clear patterns in overall and state-specific improvements, declines, and stability in site condition further suggests that management is unlikely to be the only factor influencing the movement of sites between states. Indeed, uncertainties surrounding the specific effectiveness of management in biodiversity conservation are well-documented issues in management-related research (e.g., see Tulloch, Hagger, & Greenville, 2020).
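The assignment rule and change categories described above can be illustrated with a minimal Python sketch. It is not the procedure or code used in the ESP evaluation: the function names are hypothetical, and the ranking of states from best to worst (in particular, where the "a" and "b" sub-states sit relative to each other) is an assumption made only for illustration.

```python
from collections import Counter

# Hypothetical ranking of Condition States from best (0) to worst (6).
# The relative order of the "a" and "b" sub-states is an assumption.
STATE_RANK = {"1a": 0, "2a": 1, "3a": 2, "2b": 3, "3b": 4, "4": 5, "5": 6}


def assign_state(indicated_states):
    """Return the lowest (worst-ranked) state indicated by any field variable."""
    return max(indicated_states, key=STATE_RANK.__getitem__)


def classify_change(state_start, state_end):
    """Label a site as improving, declining, or stable between two surveys."""
    delta = STATE_RANK[state_end] - STATE_RANK[state_start]
    if delta < 0:
        return "improving"
    if delta > 0:
        return "declining"
    return "stable"


def change_proportions(site_pairs):
    """Proportion of sites per change category, from (start, end) state pairs."""
    counts = Counter(classify_change(start, end) for start, end in site_pairs)
    total = sum(counts.values())
    return {label: counts[label] / total
            for label in ("improving", "declining", "stable")}


if __name__ == "__main__":
    # Made-up example data, not values from the ESP monitoring program.
    print(assign_state(["2a", "3a", "2a"]))
    print(change_proportions([("3a", "4"), ("3b", "2b"), ("4", "4"), ("3a", "5")]))
```

Running the same calculation separately for Stewardship and Control sites gives the kind of side-by-side comparison summarized in Figure 4.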
We quantified the influence of management and weather on key variables contributing to the STM (native ground cover, native plant species richness), using hierarchical generalized linear mixed models (HGLMs; Lee, Nelder, & Pawitan, 2018). For the HGLMs, we included the interaction between year (2010, 2016) and management type (Control, Stewardship), and year and weather (mean maximum temperature and cumulative rainfall in the 12 months preceding survey) as fixed effects, and farm and survey as random effects. For native ground cover, we assumed a quasibinomial distribution with a logit-link function for the response, and a beta distribution with a logit-link function for the random effects. For native plant species richness, we assumed a quasi-Poisson distribution with a log-link function for the response, and a gamma distribution with a log-link function for the random effects. We conducted all statistical analyses in GenStat 18.2 (VSN International Ltd). Detailed results for each HGLM are provided in Table 1. In brief, we found that:
1. Native ground cover declined between 2010 and 2016, while native plant species richness was relatively stable (Figure 5a,b).
2. There was no significant interaction between time and treatment for any variable, meaning that neither Stewardship nor business-as-usual management significantly altered patterns of native ground cover or native plant species richness over the 6-year study period (Figure 5a,b).
3. Increased mean annual temperature was associated with significant increases in native ground cover across the study (Figure 5c).
4. For native plant species richness, both temperature and rainfall showed a significant interaction with time. Increases in temperature and rainfall were associated with increased native plant species richness, but only in 2016 (Figure 5d,e).
This was potentially due to milder conditions (cooler maximum temperatures and high summer rainfall) prevailing across the study region at the beginning of the study period, advantaging plant growth. However, by the end of the study period, hotter and drier conditions returned (see supplementary information in Lindenmayer et al., 2019), potentially limiting growing conditions in some areas. Hence, weather conditions may have played a greater role in plant growth toward the end of the study period than at the beginning.
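The structure of the HGLMs described above can be written out as a sketch. The notation is illustrative and is not taken from the original analysis: the subscripts (farm i, site j, survey s), the coefficient labels, and the assumption that the weather variables enter as main effects plus interactions with year are all ours. The equations require the amsmath package.

```latex
% Shared linear predictor: Y = year, M = management type,
% T = mean maximum temperature, R = cumulative rainfall (preceding 12 months),
% v_i = farm random effect, w_s = survey random effect.
\[
\eta_{ijs} = \beta_0 + \beta_1 Y_s + \beta_2 M_{ij} + \beta_3 (Y_s \times M_{ij})
  + \beta_4 T_{ijs} + \beta_5 R_{ijs} + \beta_6 (Y_s \times T_{ijs})
  + \beta_7 (Y_s \times R_{ijs}) + v_i + w_s
\]
% Native ground cover: quasi-binomial response, beta-distributed random effects.
\[
\operatorname{logit}(\mu_{ijs}) = \eta_{ijs}, \qquad v = \operatorname{logit}(u), \quad u \sim \text{Beta}
\]
% Native plant species richness: quasi-Poisson response, gamma-distributed random effects.
\[
\log(\mu_{ijs}) = \eta_{ijs}, \qquad v = \log(u), \quad u \sim \text{Gamma}
\]
```

In this notation, the test of management effectiveness corresponds to the year-by-management coefficient, and the year-by-weather coefficients capture whether the influence of temperature and rainfall differed between 2010 and 2016.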

DISCUSSION
STMs are used widely in management to understand site potential and change over time (Bestelmeyer et al., 2017) and also are valuable as testable, conceptual models of an ecosystem that can inform the implementation of conservation actions (e.g., Knapp, Fernández-Giménez, Briske, et al., 2011; Miller et al., 2015; Shlisky & Vandendriesche, 2012). However, practitioners should carefully consider what STMs should (and should not) be used to do. It is not advisable to use STMs as a framework to assess whether management actions are achieving desired conservation objectives where the STM provides limited or no information about transition probabilities or time frames, or incomplete information about ecosystem drivers. This is particularly the case if control sites are not included in monitoring programs to disentangle changes that are due to implemented management actions from those due to other factors (such as weather).
There are several ways in which the formulation of the STM in our case study limited its capacity to act as a framework for assessing whether management interventions were achieving desired conservation objectives.
First, not all key drivers influencing transition between states were represented in the STM. Second, transition time frames and probabilities were not included in the STM. Finally, validation of the STM using field data was not done prior to using it to assess whether management interventions in the ESP led to improvements in Condition State in BGGW. Without additional context, it could be concluded, using the STM, that the ESP was not effective in achieving conservation outcomes for BGGW. However, this conclusion may not be an accurate reflection of the effectiveness of management actions, as weather played an influential role in patterns of variables underpinning the STM over the short term. To account for the influence of weather, it is likely that much longer time frames are required to appropriately evaluate the success of the ESP. As such, we see our case study as an example of where good insights about the potential use of STMs as a framework for evaluation come from learning about what did not work.
Addressing the limitations identified in our case study STM (but that are apparent beyond our case study, for example, no quantification of transition probabilities or time frames; Petersen, Stringham, & Roundy, 2009; Young, Perotto-Baldivieso, Brewer, Homer, & Santos, 2014; Ravolainen et al., 2020) is critical in scenarios where evaluation and success of conservation actions is tied to incentive payments and contractual obligations of landholders, as is the case in the ESP. Inappropriate evaluation may lead to unnecessary penalization of landholders and/or alterations to management programs that may actually have been having a positive effect on biodiversity, or may have achieved intended conservation outcomes but over longer time frames than the evaluation window. To assist in avoiding these scenarios and to improve evaluation of ecological restoration or management programs using STMs, we suggest the following recommendations based on our findings:
1. Include Control sites in monitoring programs, particularly in situations where understanding of the ecosystem or of the effectiveness of proposed management actions is partial or uncertain. Our data show that if the STM formulated for the ESP was used as a framework to evaluate the effectiveness of Stewardship actions without reference to Control sites, it would be feasible to conclude that Stewardship management is inappropriate for achieving desired conservation objectives. This is because Stewardship sites infrequently exhibited "ideal" condition behaviors (e.g., an increase in condition through time). Control sites helped determine that management was not necessarily a key factor influencing broadscale changes in remnant vegetation condition between 2010 and 2016.
2. Consider the full range of key drivers that can influence ecosystem condition in STMs. Achieving a holistic representation of the drivers that influence ecosystems is challenging. Knowledge of ecosystem dynamics and mechanisms driving these dynamics is often only partial (Bestelmeyer et al., 2017). Therefore, diverse sources of information need to be synthesized from published experiments, monitoring data, historical records, local and expert knowledge (as highlighted by Bestelmeyer et al., 2017) to best represent the current state of knowledge of an ecosystem. This information search and synthesis should be as exhaustive as possible where STMs underpin expectations of conservation contracts (as with the ESP). This is because factors outside the control of a landholder may be influencing their ability to achieve agreed conservation outcomes. In our case study, weather was important in explaining patterns in native ground cover and native plant species richness (variables that inform the STM) across the study area (Figure 5), and has been identified as an influential driver in other systems beyond Australia (e.g., the Sonoran Desert; Bagchi et al., 2012). The influence of weather on BGGW vegetation is well documented in the scientific literature (see Clarke, 2003; Clarke & Davison, 2004; Gibbons et al., 2008) and, as such, its omission is an oversight: weather should have been included in the STM from the outset. Incorporating understanding of weather into the BGGW STM will contextualize the response of the ecosystem in future. It also will inform whether management actions should be implemented in sub-optimal weather conditions, as weather events become more extreme under changing climate regimes (e.g., precipitation; Evans, Argueso, Olson, & Di Luca, 2017 Bagchi et al., 2012).
Restoration time frames were originally included in STMs (e.g., Jones & Burrows, 1994; Westoby et al., 1989) and, with methodological developments for using expert knowledge to provide quantitative estimates where long-term monitoring data are not available (e.g., see Kachergis et al., 2013), we suggest that these restoration time frames be included as a core element of an STM. Ideally, long-term monitoring data will be used to quantify transition probabilities and time frames in an STM (as in Bagchi et al., 2012). Where these long-term data are not available, expert knowledge (Rumpff et al., 2011) and/or simulation modeling (Provencher, Frid, Czembor, & Morisette, 2016) can be used to provide informed initial estimates that can be updated as data come to hand (as in Rumpff et al., 2011). Incorporating transition time frames and probabilities in a rigorous manner across different soil types or sites is challenging. However, using monitoring data, or expert elicitation methods where data are lacking, will assist in setting realistic expectations about restoration time frames, and the certainty of anticipated restoration outcomes (e.g., see Kachergis et al., 2013). This information can subsequently inform resource allocation and/or contractual obligations of land managers.
4. Identify uncertainties in STMs and validate models as data become available to refine the model. The formulation and use of STMs is an iterative process (Bestelmeyer et al., 2017). Identifying areas of uncertainty in particular features that underpin an STM will assist with targeting research questions to improve management outcomes (by improving understanding of ecosystem dynamics). Information from this research, as well as monitoring data, should then be used to periodically test and update STMs (e.g., Rumpff et al., 2011).
This process will assist in identifying deficiencies that require attention (e.g., including weather as a driver in the STM as in our case study), and reducing uncertainties and ambiguity in the model (Bagchi et al., 2012). Updating STMs with new data also may identify additional areas of uncertainty in ecosystem understanding associated with ecosystem stochasticity (Provencher et al., 2016) or predicted change under altered climate regimes (Halofsky et al., 2013). Each of these refinements will clarify expectations around what management actions can achieve and in what time frame. These uncertainties also can be fed into planning phases for large-scale restoration programs (such as the ESP) to maximize opportunities for positive and cost-effective conservation outcomes. This will be particularly important for long-term programs under changing short-term weather (Hardegree et al., 2018) and long-term climate regimes (e.g., Leroux & Whitten, 2014).
5. Evaluate the influence of management on individual variables comprising the STM, rather than composite indicators (like Condition States). Composite indicators such as Condition States can correspond to general patterns in condition, but they may mask the individual variable(s) driving change in condition. The responses of variables in our case study to temperature and rainfall were not necessarily consistent and sometimes temporally variable (see Figure 5c-e). This can make it challenging to draw meaningful conclusions from changes in STM Condition States without illustrating specific relationships between these variables and weather parameters. If management aims to address undesirable changes or to maintain desirable changes in condition, then it is important to present the trajectories of individual variables to land managers and policymakers so effective management actions can be targeted to specific variables.
Presenting the trajectories of individual variables requires a well-designed, objective-driven monitoring program (see Lindenmayer & Likens, 2018) to complement the use of an STM. Such a program also will provide the data necessary to test and refine the STM (e.g., estimating transition time frames, clarifying drivers of ecosystem change, and reducing uncertainties identified in the model).
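As one way of using monitoring data to quantify transitions, as the recommendations above suggest, empirical transition proportions can be tabulated from repeat surveys of the same sites. The Python sketch below is illustrative only: the function names are hypothetical, the example data are invented, and the conversion to an annual probability assumes a constant and independent chance of transition each year, which is unlikely to hold in systems strongly influenced by weather.

```python
from collections import defaultdict


def transition_matrix(site_pairs):
    """Estimate empirical transition proportions from (start, end) state pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for start, end in site_pairs:
        counts[start][end] += 1
    matrix = {}
    for start, ends in counts.items():
        total = sum(ends.values())
        matrix[start] = {end: n / total for end, n in ends.items()}
    return matrix


def annualized_probability(p_interval, years):
    """Convert a probability over a multi-year interval to a per-year value.

    Assumes a constant, independent chance of transition in each year.
    """
    return 1 - (1 - p_interval) ** (1 / years)


if __name__ == "__main__":
    # Made-up (2010 state, 2016 state) pairs, not ESP monitoring data.
    pairs = [("3a", "3a"), ("3a", "4"), ("3a", "4"), ("4", "4")]
    matrix = transition_matrix(pairs)
    print(matrix)
    print(annualized_probability(matrix["3a"]["4"], years=6))
```

Estimates of this kind are only as reliable as the number of sites observed in each starting state, so they are best treated as initial values to be updated as further survey data accumulate.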

CONCLUSIONS
STMs are valuable tools for conceptualizing ecosystems, developing management actions to address drivers of ecosystem degradation, and assessing the potential consequences of those implemented actions. However, using incompletely formulated STMs as a framework for assessment may preclude rigorous assessment of whether management actions are achieving desired conservation objectives in ecological restoration and management programs. This problem can be mitigated by providing sufficient context in STMs for evaluation purposes, including identifying all drivers that influence key variables in the STM and estimating transition time frames between states (using empirical data and/or expert elicitation estimates). Where uncertainty or partial knowledge is apparent in an STM, control sites will be critical in determining whether change (or lack thereof) is due to management (or other factors) and will assist in refining and validating STMs.

ACKNOWLEDGMENTS
This study was supported by the Australian Government under the Environmental Stewardship Program (ESP) and the Ian Potter Foundation. The authors thank the highly enthusiastic landholders in the ESP who allowed us repeated access to their land, and three anonymous reviewers for their helpful feedback on earlier versions of the manuscript.