Selecting Indicators and Optimizing Decision Rules for Long‐Term Water Resources Planning

Decision rules provide an intuitive framework for water resources planning. Having adopted a rule‐based plan, decision makers can monitor critical variables to trigger timely adaptation actions when the variables pass their predetermined thresholds. However, establishing a strategy that is comprised of a set of decision rules raises methodological challenges: (i) to identify observable indicators that provide reliable information about current and future change, (ii) to choose suitable statistics to characterize nonstationary time series that are germane to system performance, and (iii) to optimize threshold levels that trigger interventions. We propose a methodology that addresses these methodological challenges whilst explicitly balancing expected risks of water shortages with the costs of intervention in the water supply system. The four‐step framework uses a multiobjective evolutionary algorithm to search for and to identify the combinations of indicator‐informed decision rules that govern if, when, and what supply options should be included in the water resource system. The rule‐based strategies are dynamically tested against an extensive ensemble of future climate and demand scenarios to examine the trade‐offs between strategy cost and level of service. The framework is applied to the London water system (England) using regional climate simulations to identify strategic rules for a 60‐year planning period. The results demonstrate the utility of the framework, identifying observable indicators and decision thresholds that are used in optimal rule‐based planning strategies. In key areas of the solution space, rule‐based strategies reduce expected restriction costs on average by 13.1%, and as much as 24.1%, for a given intervention cost.

unnecessary and potentially irreversible investment decisions by linking adaptation actions to exogenous uncertainties, such as climate change (Fletcher et al., 2019) or changing patterns of water consumption (Di Baldassarre et al., 2018), as well as uncertainties surrounding investment options, such as infrastructure construction time (Trindade et al., 2019). A monitoring system of observable indicators and triggers ensures that adaptation actions are only taken if and when necessary, with actions being triggered when indicators of change detect the need for adaptation. However, selecting the most suitable indicators and implementing them in a rule-based capacity expansion project remains a difficult and often neglected task.
Many studies have demonstrated the value of using indicators of change to inform water resource planning and operations. In their study of risk-based adaptation and planning under deep uncertainty, Haasnoot et al. (2013) propose a dynamic adaptive policy pathway framework that considers uncertainties in future political, social, technological, economic, and climate states. A monitoring system with signposts and triggers is used to detect changes to the water supply system. When the triggers are activated, the actions are deferred, expanded, abandoned, or modified. Likewise, Fletcher et al. (2019) incorporate learning about future climate observations into a reservoir capacity expansion plan in Mombasa, Kenya, with results illustrating the importance of developing adaptive policies that consider conditions as they change over time. Zeff et al. (2016) propose a coupled framework for adapting long-term infrastructure sequencing and short-term drought management, using an aggregate risk-of-failure indicator to trigger adaptation actions. In their case study example, the indicator is defined by the probability of total reservoir storage dropping below 20% of total capacity in 78 weeks. Trindade et al. (2019) use the same risk-of-failure indicator to identify cost-effective infrastructure options and short-term water management policies that are robust to endogenous uncertainties in the Research Triangle of North Carolina, USA. Herman and Giuliani (2018) also use reservoir storage, alongside inflow measurements, as indicators for threshold-based operating policies for the Folsom Reservoir in California, USA. The framework is designed to optimize the conditions that determine whether management actions are taken, rather than optimizing the sequence and timing of actions. The simulation-optimization approach successfully identifies optimal policies for individual future scenarios, and also robust policies that perform well across a range of future scenarios.
Additional examples of indicator-informed planning are presented in Kirsch et al. (2013), Steinschneider and Brown (2012), and Woodward et al. (2014).
Indicator-based planning frameworks face several methodological challenges. Principally, the planner must identify: (i) the optimal number of indicators to include in the monitoring system; (ii) the windows of observation for each indicator; and (iii) the thresholds for individual indicators which define when an adaptation action is triggered. For some systems, an aggregate indicator such as reservoir storage (Fletcher et al., 2017; Paton et al., 2014; Trindade et al., 2017; Zeff et al., 2016) or the supply-demand gap (Erfani et al., 2018) may be used to inform planning. Yet aggregate indicators are influenced by system inputs (flow) and outputs (water use), which will likely change on different timescales and have different degrees of predictability. Separating the effects of these variables could lead to improved prediction of future water system performance when multiple uncertainties exist. Similarly, plans using only one indicator of change (Hui et al., 2018; Steinschneider & Brown, 2012) may underperform compared to plans with multiple indicators that target different sources of uncertainty. This effect could be especially important for spatially heterogeneous water systems with regions that are influenced by more than one source of uncertainty.
In spite of the evidence in favor of indicator-based approaches to water planning, many of the studies discussed above do not offer a clear process for indicator selection. Instead, indicator choice is judgment-based and may therefore not optimally select the indicators that are most likely to achieve the desired outcomes in the long run. Several studies have shown that indicators must be chosen systematically and scientifically, and evaluated a priori against selection criteria. Galelli et al. (2014) argue that variables should be selected based on their usefulness for predicting a system outcome of interest, proposing the input variable selection (IVS) framework to identify candidate observable variables that characterize the relationship between input-output variables in predictive models. Raso et al. (2019) propose that indicators should be selected based on four measures: relevance, observability, completeness, and parsimony. Indicator relevance features heavily in indicator selection frameworks, referring to the influence, or relationship, of the indicator on adaptation policy success. A relevant indicator should represent emerging trends of uncertain drivers in the underlying system, whilst remaining closely related to system performance and plan success. Observability measures the amount of data required to calculate the indicator, with high observability meaning that a good estimate of the indicator can be obtained with relatively little data. Completeness represents the combined ability of the selected indicator variables to represent critical uncertainties in the system being modeled, with a greater number of indicators being more likely to achieve high completeness. The importance of completeness is demonstrated by Giuliani et al. (2015), who use simulation and an evolutionary algorithm to identify variables best suited to inform reservoir operations of the Hoa Binh Reservoir in Vietnam.
In their investigation, historical and real-time observations of rainfall, river flow, and water levels are used to design reservoir operating policies. The performance of the indicator-based policies is compared to perfect operating policies (POPs) designed using perfect foresight, with results showing that the policies with multiple observable variables achieve the best performance relative to the POPs. Finally, Raso et al. (2019) state that the selected set of indicators should be parsimonious, with an ability to detect important ongoing changes without increasing the dimensionality of the planning process. To achieve parsimony, no two indicators should describe similar or identical features of the system being modeled (Raso et al., 2019). As Herman et al. (2020) report, analyzing and optimizing flexible planning systems can be conceptually challenging and computationally expensive. Thus, selecting a parsimonious monitoring system helps to avoid unnecessary complexity which may render the computational problem intractable.
Scenario discovery has also been used as a mechanism to identify informative indicators for adaptive management actions, with studies adopting discovery methods to find important sub-spaces within the problem uncertainty space that influence policy performance (Kwakkel, 2019). Groves et al. (2015) use a robust decision-making framework to identify early warning indicators emerging from scenarios of future conditions in the Metropolitan District of Southern California, with an aim to use indicators to prevent the current planning strategy from failing to meet its goals. In contrast, Trindade et al. (2019) adopt scenario discovery to detect key drivers of system performance and robustness under indicator-informed water resources planning. Finally, Fletcher et al. (2017) combine simulation and scenario analysis to identify key uncertainties that will inform learning over time. The planning framework is used to develop historic planning portfolios for Melbourne, Australia, considering uncertainties in reservoir inflows, population growth rate, electricity prices, water shortage penalty values, and demand when implementing real adaptation options.
Evidently, a range of methodologies exists to identify the best indicators of change for water resources planning. Yet few indicator selection methodologies go on to define the threshold an indicator must exceed for an adaptation action to be triggered. Within water resources, triggers are commonly adopted in short-term feedback control loops, such as those defining reservoir operating policies (Giuliani et al., 2015), flood control and conservation actions (Herman & Giuliani, 2018), and water use restrictions (Mortazavi-Naeini et al., 2014). Other studies focus on defining thresholds to trigger long-term permanent capacity expansion infrastructure, with many planning frameworks using multiobjective evolutionary algorithms (MOEAs), or genetic algorithms (MOGAs), to optimize threshold values for specific adaptation actions. For example, Mortazavi-Naeini et al. (2015) and Zeff et al. (2016) examine the success of algorithm-generated trigger-based adaptation policies against ensembles of synthetic streamflow scenarios. Likewise, Trindade et al. (2017) use an MOEA and many-objective robust decision-making framework to identify risk-of-failure indicator thresholds for planning portfolios in the Research Triangle region of the US, demonstrating the importance of evaluating adaptation action triggers against a wide range of future hydrologic, demand, financial, and infrastructural uncertainties. Robinson and Herman (2019) present an alternative approach to defining adaptation triggers that does not rely on MOEAs, developing a data-driven quantitative framework to identify planning thresholds based on vulnerable scenarios within ensemble projections. As the authors acknowledge, however, the study fails to consider more than one type of indicator and does not apply the methodology to a real-world planning problem.
The discussion above reveals a scarcity of methodologies that combine indicator selection and threshold definition for long-term adaptation planning under 21st-century uncertainties. This paper presents a step-by-step framework to overcome this gap, developing indicator-informed adaptive water resources planning strategies that balance the cost of adaptation, aggregated over time, against the benefits of adaptation, defined by the reduction in expected risk of water shortages. We first use simulation and linear regression to identify a set of candidate indicators from a larger pool of indicators, based on each indicator's ability to predict future risk of failure in the water supply system. Next, multiobjective optimization is used to define water resource planning strategies. Each strategy contains a set of decision rules, with each rule containing one or more candidate indicator(s), optimized indicator threshold value(s), and a corresponding adaptation action that is triggered if the indicator(s) surpass the optimized threshold(s). The strategies generated in the multiobjective optimization search are Pareto-optimal against a range of future uncertainties. The position of Pareto-optimal solutions in the objective space, relative to baseline "fixed" plans, is used as a measure of the indicator's contribution to improving system performance. The best indicators of change are selected based on their ability to produce water resource strategies and adaptation actions that minimize (in a Pareto-optimal sense): (i) financial costs of operating and adapting the water system; and (ii) the expected risk of water restrictions, across a range of future conditions. Further validation of indicator selection is provided in Step 4 through out-of-sample simulation, which tests selected strategies against a wider range of future conditions. A summary of the framework to identify decision rule-based water planning strategies is presented in Figure 1.
The framework is applied to a water planning problem in the London water system (Thames Basin, England), using indicators of climate and demand uncertainties to inform long-term adaptation actions. The rule-based strategies are optimized against risk-based performance metrics that consider adaptation cost and observable water use shortages, having been tested with a large ensemble of climate-change-driven hydrological flows and population-driven demand projections. The strategies are categorized as risk-based because they are evaluated against their ability to reduce the frequency, severity, and duration of water restrictions and the associated impacts, which in this case are valued as financial costs. The results show that user-oriented decision rules, consisting of observable indicators and triggers, provide practical guidance to decision-makers who are faced with uncertainties concerning future states of the world.

Stakeholder Dialogue
Planning for water systems requires well-defined aims, objectives, and clear communication between stakeholders (Poff et al., 2010; Trindade et al., 2017). Precise communication is especially important for optimization-based planning approaches which are influenced by objective functions (Quinn et al., 2017), so this stage requires stakeholders to clearly define the objectives for their system and outline the planning problem to solve. Common objectives in water resources planning under uncertainty include: (i) minimizing financial costs of the water system, including capital and operating costs; (ii) minimizing environmental impacts (including greenhouse gas emissions) and where possible enhancing the aquatic environment; and (iii) minimizing the risk of an event occurring that may negatively impact water users. The proposed framework considers a set of $z$ planning objectives, $W = \{w_1, \ldots, w_z\}$. The first planning objective aims to minimize the financial costs associated with the construction and operation of the water resource system. Subsequent objectives are at the discretion of the water planner, but should seek to minimize one or more risk metrics inherent to the planning problem. The risk metrics may relate to the expected risk of water shortages, the economic risk of flood damages, or water quality risks. This study uses the risk metric developed by Borgomeo et al. (2018), which estimates the expected cost of water use restrictions as a function of the frequency, duration, and severity of restrictions imposed throughout the planning period. The metric is classified as risk-based according to criteria by Hashimoto et al. (1982), and explicitly considers the economic consequences of restrictions of differing severities.
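To make the idea of an expected restriction cost concrete, the following minimal sketch averages a per-scenario restriction cost over an ensemble. It is an illustration under stated assumptions, not the metric of Borgomeo et al. (2018): the restriction levels, the `DAILY_COST_BY_LEVEL` values, and the function name are hypothetical. Frequency and duration enter through the number of time steps spent under restriction, and severity enters through the cost assigned to each level.

```python
from statistics import mean

# Hypothetical cost per time step for each restriction level (level 0 = none).
# These values are illustrative assumptions, not figures from the study.
DAILY_COST_BY_LEVEL = [0.0, 1.0, 5.0, 20.0, 100.0]


def expected_restriction_cost(levels_by_scenario):
    """Average, over scenarios, of the total restriction cost in each scenario.

    levels_by_scenario: one list per scenario holding the restriction level
    (an index into DAILY_COST_BY_LEVEL) in force at every time step.
    """
    totals = [
        sum(DAILY_COST_BY_LEVEL[level] for level in series)
        for series in levels_by_scenario
    ]
    return mean(totals)


# Three synthetic scenarios of six time steps each.
levels = [
    [0, 0, 1, 1, 2, 0],  # short, mild restriction episode
    [0, 0, 0, 0, 0, 0],  # no restrictions
    [1, 2, 3, 3, 1, 0],  # longer, more severe episode
]
print(expected_restriction_cost(levels))  # 18.0
```

Because more severe levels carry disproportionately higher costs in this sketch, a few severe steps can dominate the expectation.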
The planning problem is framed as multiobjective because water managers frequently face planning decisions that trade-off reductions in risk achieved by water system adaptation against the cost of adaptation investments (Garrick & Hall, 2014).
In addition to the planning objectives, stakeholder discussions should outline operational constraints in the water supply system, such as environmental flow limits, water withdrawal licenses, minimum reservoir releases, and target pumping levels.
MURGATROYD AND HALL 10.1029/2020WR028117
The question of which options to include in the search for optimal planning strategies requires applying experience and creativity to particular contexts. Consultation with stakeholders and review of analogous cases in different river basins will help to identify a wide range of candidate options. The set of adaptation options is denoted as $A = \{a_1, \ldots, a_g\}$, where $g$ is the total number of options to consider. This framework focuses specifically on capacity expansion options and demand saving schemes, but could also be used to define indicator-informed operational rules that work on shorter planning timeframes.
Dialogue between water planners and stakeholders can also inform the identification of indicator variables to be used in the adaptation monitoring system, thanks to local knowledge about the system vulnerabilities. These discussions could point toward using statistics of river flow, precipitation, temperature, population change, elapsed time, and/or water use as decision-relevant indicators.

System Model
A system model is used to simulate the impact of water planning strategies and corresponding adaptation actions under changing future conditions. Key features of the water system should be represented within the system model, including inputs (surface and groundwater withdrawal points), outputs (demand centers), system operation, planning options, and capital and operating costs of adaptation actions. Our framework requires that the system model can represent outcomes of interest to decision makers, such as water shortages of varying frequency, severity and duration (Hall & Borgomeo, 2013), or indicators of water quality. Whilst less computationally intensive models could be used for initial assessment of planning options and adaptation policies (Haasnoot et al., 2014; Hall et al., 2016), a model which fully represents operational complexities of the water system is necessary to examine the performance of candidate water planning strategies (Borgomeo et al., 2018).
Consistent with several other water planning studies, this framework uses a discrete event simulation model to represent the water supply system (Mortazavi-Naeini et al., 2015). In simulation, decision rules are consulted at discrete time steps to inform adaptation actions. These actions are represented by discrete values, which pertain to individual planning options identified in stakeholder discussions, their associated yields, and infrastructure construction lead times. The use of discrete values reflects common water utility planning approaches, which preselect features of candidate adaptation actions based on economic, social, and geographic constraints (Herman et al., 2020).

Scenarios of Uncertainty
Extensive simulations of the system model are required to evaluate the performance of long-term planning strategies against future uncertainties. Strategies are simulated against an ensemble of $k$ scenarios denoted by $X = \{x_1, \ldots, x_k\}$. A scenario $x_i$ is composed of time series of future conditions that are input into the system model, and is denoted by $x_i = \{x_{i,t} : t = 1, \ldots, q\}$, where $t$ represents the time step (e.g., day, month, or year) in a time series of length $q$. With some abuse of notation, we herein refer to scenario $x_i$ as scenario $i$. The system model described in Section 2.1.2 uses the ensemble of future scenarios as inputs, simulating state variables in the system, such as river flows, diversion volumes, reservoir levels, and volumes of water delivered to users.
The framework presented here targets exogenous uncertainties in the water supply system arising from climate change-driven hydrological projections and population-driven water demand forecasts. However, the framework could consider other uncertainties that can be defined using an a priori ensemble, such as hydrological modeling uncertainties (Ajami et al., 2008), uncertainties concerning infrastructure construction cost and lead times (Trindade et al., 2019), and land use change uncertainties (Kwakkel et al., 2015).

Quantifying Expected Risk in a No Adaptation Future
We require the set of indicators used in the rule-based strategies to be parsimonious, as this helps to reduce monitoring costs and reduces the complexity of the planning problem. At the same time, the indicators need to efficiently represent the factors that determine future system performance (Raso et al., 2019). A detailed examination is necessary to identify candidate indicators from the larger pool of stakeholder suggested indicators, and to identify the most appropriate statistics of candidate indicator variables. For example, low flow statistics may feature in discussions as an ideal indicator variable for the target water system, but the severity, type of statistic, and time-window over which to observe the statistic will be difficult to define through intuition alone. Given that some candidate indicators will have high variability, we wish to select statistics (e.g., moving averages) that are sensitive to changes and salient to system performance whilst being relatively stable, that is, having low sampling variance. Ultimately, the indicators should not be unduly influenced by random fluctuations and at risk of triggering false-positive adaptation (Robinson & Herman, 2019).
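The trade-off between sensitivity and stability can be illustrated with a trailing moving average. The sketch below is a minimal example on a made-up flow series, not the study's data; the function name `trailing_mean` and the window lengths are assumptions chosen for illustration.

```python
from statistics import mean


def trailing_mean(series, window):
    """Moving average over `window` consecutive values.

    Entry j of the result averages series[j : j + window], so the output is
    shorter than the input by window - 1 values.
    """
    return [mean(series[j:j + window]) for j in range(len(series) - window + 1)]


def step_changes(series):
    """Differences between consecutive values of a series."""
    return [b - a for a, b in zip(series, series[1:])]


# Synthetic low-flow statistic: a slow decline hidden under an oscillation.
flows = [10, 14, 9, 13, 8, 12, 7, 11, 6, 10]

print(step_changes(flows))                    # swings of +4 and -5
print(step_changes(trailing_mean(flows, 2)))  # steady -0.5 per step
print(trailing_mean(flows, 6))                # [11, 10.5, 10, 9.5, 9]
```

In this toy series the raw indicator swings by several units per step and could cross a threshold because of the oscillation alone, whereas the averaged statistic exposes the underlying decline. A longer window is smoother still, but it yields fewer values and responds later to a change in trend.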
The process presented in Step 3 adopts multiobjective optimization to identify the indicators that produce the greatest improvement in objective performance, relative to the performance of "fixed" plans (described in Section 2.3.3). However, this entails an expensive two-stage optimization, requiring optimization of decision rules and strategies in a large ensemble of scenarios, for every candidate indicator. Given the complexity of the system and the size of the option space for both indicators and strategies, this is not computationally feasible. Therefore, an empirical analysis is adopted in Step 2 to narrow down the set of indicators to take forward into Step 3. In this step, we identify indicators that enable the water planner to make better adaptation decisions in the future, where "better" is measured in terms of the objectives outlined in Section 2.1.1. Because this framework assumes the cost of adaptation is known without uncertainty, indicator selection is based on the planning objectives that minimize future risks in the water system. If the world was stationary and we had extensive measurements of historical system performance, we could accurately estimate present-day risk using past observations. We could also conclude that, without adaptation, this estimation of risk would remain the same in the future. As this is not the case, we use simulation to predict future risk in a "no adaptation" future by using observable indicators of change.
First, the system model developed in Section 2.1.2 is run under the ensemble of scenarios, $X$, to estimate performance metrics directly related to the risk-based objective function(s). For each simulation run under scenario $i$, a performance metric, $p_t^i$, is estimated at each time step $t$. An example performance metric could estimate the daily expected cost of water-use restrictions, or the probability of flooding, at time $t$. No adaptation actions are implemented in these simulations; instead, water demand is met by supply from existing features of the water system. It is for this reason that a performance metric related to the cost-based objective function need not be considered.
Candidate indicators are also calculated for each time step t in simulation of scenario i. Indicators could use statistics of river flows and demand, or simulated measurements of existing water supply infrastructure, such as reservoir levels.

Select Subset of Indicator Variables
Ordinary least-squares (OLS) regression is used to explore the strength of the relationship between an observable indicator, $o_t^i$, and the risk metric, $h_t^i$. Linear regression is chosen for its ease of implementation and its robustness relative to nonlinear methods. The risk metric is calculated as $h_t^i = \sum_{\tau = t}^{q} p_\tau^i$, where $h_t^i$ represents the aggregated risk at time $t$ remaining in scenario $i$. Regression models are considered in the form $h_t^i = \beta_0 + \beta_1 o_t^i + \varepsilon$, where $\beta_0$ and $\beta_1$ are the regression coefficients and $\varepsilon$ is the error term. Models can be built with multiple indicators to investigate the combined ability of indicators in predicting future risk. To achieve a parsimonious monitoring system and avoid selecting redundant variables, indicators should be considered together only if they represent different sources of uncertainty. Indicators with different observation periods should also be considered. The desired indicator should have a long enough observation window to screen natural variability, but not so long that it fails to identify nonstationarity in indicator trends. Models using indicators with different time windows should only be compared if they use observations and simulation output from identical time periods. Therefore, two regression models using indicators with observation periods spanning 5 and 10 time steps should only be compared if they consider identical time series for $o_t^i$ and $h_t^i$. Several methods are available to identify appropriate combinations of predictor variables (indicators) for the regression models, such as forward selection, stepwise selection, backward elimination, and principal component analysis (Haque et al., 2018). This study uses a simplified forward selection process, iteratively increasing the number of predictor variables in the model. Models are rejected if the p-values for the coefficient estimates of any model term are not significant at the 5% significance level. The p-values are derived from t-tests of the hypothesis that the coefficient is equal to zero.
The regression models with significant predictor variables are ranked according to root mean squared error (RMSE), which represents the standard deviation of the error distribution. Models with the smallest RMSE are assumed to have the greatest predictive ability. The coefficient of determination ($R^2$), Akaike information criterion, or residual mean square error could also be used to rank the regression models.
The model and candidate indicators (or combination of indicators) that perform best according to the selection criteria outlined above are used in Step 3 to generate rule-based adaptation strategies. If multiple indicators exhibit similar predictive abilities, all of them can be taken forward into Step 3 for further screening.
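The screening logic of this step can be sketched for the single-indicator case. The example below is a simplified illustration and not the study's implementation. It assumes that the aggregated remaining risk is the sum of the per-step performance metric from the current step to the end of the horizon, it fits one predictor at a time, and it approximates the t-test p-value with a standard normal distribution, which is only reasonable for large samples. All names and data (`remaining_risk`, `fit_simple_ols`, `rank_indicators`, `indicator_a`, `indicator_b`) are hypothetical.

```python
import math


def remaining_risk(p):
    """Suffix sums: h[t] = p[t] + p[t+1] + ... + p[-1] (an assumed aggregation)."""
    h, running = [], 0.0
    for value in reversed(p):
        running += value
        h.append(running)
    return h[::-1]


def fit_simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; report RMSE and a slope p-value."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / n)                # root of the mean squared residual
    se_b1 = math.sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    t_stat = b1 / se_b1 if se_b1 > 0 else math.copysign(math.inf, b1)
    # Two-sided p-value using a normal approximation to the t distribution.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return {"b0": b0, "b1": b1, "rmse": rmse, "p_value": p_value}


def rank_indicators(candidates, h, alpha=0.05):
    """Drop indicators with a non-significant slope, then sort by RMSE."""
    fits = {name: fit_simple_ols(series, h) for name, series in candidates.items()}
    significant = [(n, f) for n, f in fits.items() if f["p_value"] < alpha]
    return sorted(significant, key=lambda item: item[1]["rmse"])


# Synthetic per-step risk for one scenario and two made-up indicators.
p = [0.9, 1.2, 0.7, 1.4, 0.7, 0.9, 1.3, 0.8, 1.3, 0.8]
h = remaining_risk(p)
candidates = {
    "indicator_a": [20, 18, 16, 14, 12, 10, 8, 6, 4, 2],  # tracks the risk
    "indicator_b": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],        # mostly noise
}
for name, fit in rank_indicators(candidates, h):
    print(name, round(fit["rmse"], 3), fit["p_value"])
```

With these synthetic data only `indicator_a` survives the significance filter. Extending the sketch to several predictors would require solving the multivariate normal equations and, for small samples, an exact t distribution.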

Step 3-Strategy Generation
This step serves two purposes. First, it defines thresholds for the indicator variables identified in Step 2. Each threshold is individual to an adaptation option identified in Step 1. Second, it provides a mechanism to identify the best indicator variables to inform adaptation under future uncertainty.

Strategies and Decision Rules
As established previously, observable indicators are monitored over time and used to inform adaptation decisions in the future. To trigger an action, an indicator must cross a predetermined threshold. When considered together, an observable indicator, threshold, and corresponding adaptation action describe a decision rule. A planning strategy contains one or more decision rules, with each rule characterizing the implementation of individual adaptation actions ( Figure 2). A planner follows a planning strategy when adapting their water system over the planning horizon.
Strategy $n$ is defined as $s_n = \{r_n^a : a \in A\}$, where $r_n^a$ represents the decision rule that describes the criterion for when infrastructure option $a$ should be triggered. Each infrastructure option has a construction lead time ($m_a$) and a capacity that the option is built to ($c_a$). In this study, sources of financial uncertainty (i.e., bond terms and interest rates) and logistical uncertainty (i.e., construction disruptions) that may impact option lead times and yield are not modeled. We recognize this as a limitation of the proposed framework; future studies could explore the effect of these uncertainties on strategy success.
We consider decision rules in the conditional-go form (Garstka & Wets, 1974), where a rule with one indicator may be of the form: IF "observed indicator exceeds threshold value" THEN "build infrastructure option $a$ with capacity $c_a$, with the option operating functionally after the lead time $m_a$ has elapsed" ELSE "do nothing." If indicators are expected to display oscillatory behavior, it may be necessary to use multi-indicator decision rules in planning strategies. For many cases, an indicator based on elapsed time will help to avoid adaptation actions being triggered as a consequence of an early random shock that is not representative of the long-term trend.
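A conditional-go rule of this kind can be written down compactly. The sketch below is a minimal illustration, not the study's simulation model: the class `DecisionRule`, the field `min_elapsed` (an elapsed-time condition acting as the second indicator), the option name, and the indicator values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DecisionRule:
    option: str           # adaptation option a
    capacity: float       # capacity c_a the option is built to
    lead_time: int        # construction lead time m_a, in time steps
    threshold: float      # indicator threshold
    min_elapsed: int = 0  # optional elapsed-time condition (second indicator)


def operational_step(rule: DecisionRule, indicator: Sequence[float]) -> Optional[int]:
    """Return the step at which the option starts operating, or None.

    The rule fires at the first step where the elapsed-time condition holds
    and the indicator exceeds the threshold; the option becomes operational
    after the lead time has elapsed.
    """
    for t, value in enumerate(indicator):
        if t >= rule.min_elapsed and value > rule.threshold:
            return t + rule.lead_time
    return None  # ELSE "do nothing" for the whole horizon


# Synthetic indicator with an early shock at step 1.
indicator = [5, 9, 6, 7, 8, 9, 10]
single = DecisionRule("reservoir", capacity=100.0, lead_time=3, threshold=8)
guarded = DecisionRule("reservoir", capacity=100.0, lead_time=3, threshold=8,
                       min_elapsed=3)

print(operational_step(single, indicator))   # 4: triggered by the early shock
print(operational_step(guarded, indicator))  # 8: waits for the sustained rise
```

The guarded rule ignores the early shock and triggers only once the indicator rises again later in the horizon, which is the behavior the elapsed-time indicator is intended to provide.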

Threshold Definition and Optimization Procedure
The thresholds, $u$, assigned to individual decision rules in a strategy $s$ are identified using simulation and multiobjective optimization tools. Here, the system model uses an optimization module that calls simulations in parallel. The optimization procedure uses an ɛ-multiobjective genetic algorithm (ɛMOGA) to search for the set of Pareto-optimal strategies $S^*$ from $S$, defined according to the objective functions $W = \{w_1, \ldots, w_z\}$. ɛMOGA follows the ɛ-dominance concept, which allows one nondominated solution (strategy, $s$) to occupy a hyperbox of size ɛ in the objective space (Deb et al., 2006; Laumanns et al., 2002). A solution ɛ-dominates another solution if, at the resolution of the ɛ-hyperboxes, it performs at least as well in every dimension of the objective space and better in at least one. MOGAs and MOEAs are ideal for this application because the objectives do not need to be weighted (a priori), as solutions are selected after the trade-offs are realized (a posteriori) (Hurford et al., 2014).
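One common way to operationalize ɛ-dominance is to map each solution to the index of the hyperbox it occupies and compare those indices. The sketch below follows that reading for minimization objectives. It is an assumption about the mechanics rather than a description of the ɛMOGA code used in the study, and the objective values and ɛ sizes are invented.

```python
import math


def box_index(objectives, eps):
    """Index of the hyperbox that a solution occupies (minimization)."""
    return tuple(math.floor(f / e) for f, e in zip(objectives, eps))


def dominates(a, b):
    """Pareto dominance for minimization: no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def eps_dominates(a, b, eps):
    """True if the hyperbox of `a` dominates the hyperbox of `b`."""
    return dominates(box_index(a, eps), box_index(b, eps))


# Objectives: (intervention cost, expected restriction cost); values are made up.
eps = (10.0, 1.0)
s1, s2, s3 = (105.0, 3.2), (118.0, 3.9), (101.0, 3.7)

print(eps_dominates(s1, s2, eps))  # True: boxes (10, 3) and (11, 3)
print(eps_dominates(s1, s3, eps))  # False: both fall in box (10, 3)
```

Solutions that fall in the same hyperbox, such as `s1` and `s3`, are treated as equivalent at this resolution, so only one of them would be retained. This is how the ɛ values bound the size of the archive.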
The multiobjective optimization search starts with an initial population of solutions. Each solution contains candidate threshold values that govern when an option is triggered in simulation. In the first iteration of the optimization procedure, simulation is used to estimate the performance of solutions in the initial population when simulated against the set of uncertainty scenarios, $X$. For each solution, the expectation of objective values across ensemble $X$ is estimated. In subsequent iterations, an offspring solution is accepted into the archive population if: (i) the offspring ɛ-dominates another solution in the archived population; or (ii) the offspring occupies an empty hyperbox along the frontier of solutions. In the first case, the new solution replaces the dominated solution and the size of the archive population stays the same. In the second case, the archive population increases in size. If an offspring is not accepted into the archive population, it is added to the current population if it dominates one or more current solutions in the objective space. The offspring is rejected if it is dominated by any solution in the current population.
The water planner must stipulate the criteria for terminating the solution search. Typically, the stopping criteria specify the maximum number of iterations permitted in the optimization search, or define the minimum number of iterations that must pass without improvement in the archived ɛ-nondominated solutions.
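The archive acceptance test can be sketched as follows for minimization objectives. This is a simplified illustration, not the algorithm as implemented in the study: it covers only the archive (not the current population), it compares hyperbox indices, and its handling of an offspring that lands in an occupied hyperbox (kept only if it Pareto-dominates the occupant) is an assumption. The function names and objective values are hypothetical.

```python
import math


def box_index(objectives, eps):
    """Index of the hyperbox that a solution occupies (minimization)."""
    return tuple(math.floor(f / e) for f, e in zip(objectives, eps))


def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, child, eps):
    """Return (new_archive, outcome) after offering `child` to the archive."""
    child_box = box_index(child, eps)
    boxes = [box_index(member, eps) for member in archive]
    # Rejected if any archived hyperbox dominates the offspring's hyperbox.
    if any(dominates(box, child_box) for box in boxes):
        return archive, "rejected"
    # Case (i): the offspring replaces the members it epsilon-dominates.
    beaten = [m for m, box in zip(archive, boxes) if dominates(child_box, box)]
    if beaten:
        return [m for m in archive if m not in beaten] + [child], "replaced"
    # Occupied hyperbox: assumed tie-break using ordinary Pareto dominance.
    occupants = [m for m, box in zip(archive, boxes) if box == child_box]
    if occupants:
        if dominates(child, occupants[0]):
            kept = [m for m in archive if m != occupants[0]]
            return kept + [child], "replaced"
        return archive, "rejected"
    # Case (ii): empty, nondominated hyperbox, so the archive grows.
    return archive + [child], "added"


archive = [(100.0, 5.0), (150.0, 3.0)]
eps = (10.0, 1.0)
for child in [(120.0, 4.2), (95.0, 4.5), (160.0, 3.5)]:
    print(child, update_archive(archive, child, eps)[1])
```

Each offspring here is tested against the same starting archive. In a full search the archive returned by one call would be passed to the next, and an offspring that dominates several members would shrink the archive rather than leave its size unchanged.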

Baseline "Fixed" Plans
As indicated in Section 1, multiobjective optimization is used to identify the indicator variables that produce the greatest improvement in adaptation portfolio performance relative to "fixed" plans. The "fixed" plans are generated following the optimization process outlined in Section 2.3.2, using objective functions W, uncertainty scenarios X, and the set of infrastructure options A. The "fixed" plans specify the precise times at which options are implemented, whereby an option is triggered if a rule of the following form is satisfied: IF "current time step exceeds time threshold" THEN "build infrastructure option a with capacity c_a, with the option operating functionally after the lead time m_a has elapsed" ELSE "do nothing." The "fixed" plans are akin to traditional water resources plans (Beh et al., 2014; Borgomeo et al., 2016, 2018; Jeuland & Whittington, 2014), which identify the specific time of implementation (i.e., year) for individual planning options. The "fixed" plan is an appropriate counterfactual to the rule-based strategies because it offers no flexibility under future uncertainties. Unlike the strategies, "fixed" plans build an option in the same time step for all scenarios, regardless of realized conditions.
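The time-only rule of a "fixed" plan can be written as a short sketch. The class and field names, the annual time step, and the example values are hypothetical; the option is assumed to become operational exactly one lead time after the year in which it is triggered.

```python
from dataclasses import dataclass


@dataclass
class FixedRule:
    option: str            # label of infrastructure option a (hypothetical)
    capacity_ml_d: float   # capacity c_a
    lead_time_years: int   # lead time m_a
    time_threshold: int    # year that must be exceeded to trigger the option


def build_schedule(rule, years):
    """Return (trigger year, operational year), or None if never triggered."""
    for year in years:
        # IF "current time step exceeds time threshold" THEN build
        if year > rule.time_threshold:
            return year, year + rule.lead_time_years
    # ELSE "do nothing"
    return None
```

Because the rule depends on time alone, `build_schedule` returns the same result for every scenario, which is what makes the plan "fixed".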

Indicator Selection
Indicator selection is conducted by comparing the Pareto-optimal frontier of the "fixed" plans against the frontiers of the Pareto-optimal planning strategies. The frontier that produces the largest hypervolume improvement in performance across the objective space, relative to the "fixed" frontier, signposts the most suitable indicator (or combination of indicators) for the planning problem.

Step 4-Strategy Testing
Upon selecting the indicators and corresponding Pareto-optimal frontier of rule-based strategies, the planner must deliberate upon which strategy to choose on the Pareto frontier (Deb, 2001).
Step 4 uses an out-of-sample ensemble of scenarios X* to stress test candidate strategies from the selected Pareto-optimal frontier. The ensemble of scenarios X* contains a wider range of the same uncertainty conditions represented in ensemble X. Out-of-sample simulation is important as the ɛMOGA solutions may be over-fit to the scenarios used in the optimization procedure (Herman et al., 2020), and therefore risk being vulnerable to a wider range of uncertainty not necessarily captured in the ensemble of uncertainty scenarios X (Kasprzyk et al., 2013). We stress that this step is designed to provide additional information on candidate planning solutions to the water planner. The out-of-sample simulation can be used to test candidate strategies along the frontier, but final selection should be based on the water planner's cost and risk preferences.

Background
The Thames Basin case study has been widely applied in the decision-making literature (Erfani et al., 2018;Huskova et al., 2016;Kingsborough et al., 2016;Murgatroyd & Hall, 2020). The case study is used here to illustrate the benefits of incorporating indicator-informed decision rules in long-term planning for regions subject to significant supply-side and demand-side uncertainties.
In England and Wales, water companies' water resource management plans are published every 5 years and outline a "secure and sustainable set of options to supply [your] customers with water over the long-term" (Environment Agency, 2018, p. 1). We apply our method to the London water resource zone (WRZ), operated by Thames Water Utilities Ltd. Thames Water's (2019) water resource management plan revealed that increases in demand from a growing population and increased drought risk threaten the supply-demand balance; however, how these threats will grow in the future remains uncertain. This region is chosen because we believe the uncertain features in the Thames Water supply system can be monitored using the proposed rule-based framework to provide key information for the water planner and improve long-term decision-making under uncertainty.
The London WRZ covers an area of 9,948 km² and supplies 3.7 million households (Thames Water, 2019). Water is principally supplied from direct abstractions from the River Thames to impounded reservoirs, with additional water supplied from the underlying chalk aquifer. The Lower Thames Operating Agreement regulates the operation of the London water system, with abstraction limits and environmental flows restricting the amount of water that can be withdrawn from rivers and reservoirs in the system. The Lower Thames Operating Diagram (supporting information S1) defines the levels below which total London reservoir storage must fall to justify restrictions on water users. The severity of restrictions increases from Levels 1 to 4 and the water company has targets for the frequency with which such restrictions may be imposed.

Scenarios of Future Conditions
Uncertainty is incorporated into the planning framework by using an ensemble of scenarios of possible futures. Following consultation with the water company, and drawing on results from previous research in the Thames study region (Borgomeo et al., 2018), we focus upon uncertainties in water demand and water availability, which are the main determinants of the reliability of water supplies.

Climate
Advances in climate modeling have led to large ensembles of climatic conditions, each projecting a different pathway along which the climate may evolve throughout the 21st century. Individual ensemble members typically represent a realization of uncertainty in climate model projections and human forcing factors, and so naturally fit within the scenario framework introduced here.
Simulations of time series of weather variables for possible future climate scenarios were obtained from the Weather@Home (W@H) modeling framework (Guillod et al., 2017; Massey et al., 2015), which consists of a global climate model (HadAM3P) with a nested regional climate model (HadRM3P), and is driven with historic (HadISST) and projected (CMIP5) sea surface temperatures (SST) and sea ice. W@H is a "citizen science" project that benefits from the unused computer power of thousands of participants to generate large ensembles of climate model runs. Previous work has used W@H to investigate sea surface temperature driven extreme weather events (Haustein et al., 2016), heat-related mortality (Mitchell et al., 2016), flood damage, and national-scale drought (Rudd et al., 2019). The W@H climate sequences have also been used to showcase the risk-based planning framework developed by Borgomeo et al. (2018).
The Weather@Home data set contains precipitation (P) and potential evapotranspiration (PET) time series at a 25 km resolution, which are then downscaled to 5 km resolution. In the downscaling process, P is bias-corrected using a linear approach with monthly bias correction factors. A complete description of the bias-correction methodology, validation process, and results is presented in Guillod et al. (2017, 2018). The full data set of downscaled P and PET is grouped into three ensembles:
• 100 realizations of the Baseline period, generated using different initial atmospheric conditions, and historic SST and sea ice records from HadISST (Rayner et al., 2003; Titchner & Rayner, 2014)
Because RCP8.5 represents the upper bound of projected global emissions scenarios, the future W@H data sets contain synthetic drought events that are more severe than have been observed in the historical record. This makes the weather sequences ideal for water resources impact assessments and portfolio planning that consider the risks of system failure under extreme events.

Hydrology
Flows in the Thames Basin were simulated using the rainfall-runoff modeling framework, DECIPHeR (Dynamic fluxEs and ConnectIvity for Predictions of HydRology), developed by Coxon et al. (2019). DECIPHeR is well suited to this study as it is able to simulate flows across multiple spatial scales efficiently and quickly, and has been shown to perform well against multiple evaluation metrics when used to simulate historical flows at 1,366 river flow gauges across Great Britain (Coxon et al., 2019).
DECIPHeR has previously been used to generate ensembles of historical and future naturalized flows for 338 catchments across England and Wales. In Dobson et al.'s study, historical flows were first simulated using daily observed P (Tanguy et al., 2019), PET (Robinson et al., 2016), and 10,000 different model parameter sets (Coxon et al., 2019). The ensuing flow ensemble was evaluated against daily naturalized flows supplied by England's Environment Agency, and the best parameter set for each catchment was identified according to NSE and logNSE scores. Future flows for each of the 338 study catchments were simulated using the best parameter set for each catchment and downscaled W@H P and PET weather sequences. This study follows the same hydrological modeling framework outlined by Dobson et al., using the best performing DECIPHeR parameter set for catchments in the Thames Basin to simulate future flows. For the purpose of this study, the 30-year Near and Far Future flow scenarios are transformed into 100 longer 80-year transient time series, as explained in supporting information S2. Further information about the behavior of the transient DECIPHeR flow ensemble is provided in supporting information S3.

Demand
Uncertainties in demand forecasts are a consequence of uncertain changes in population (Erfani et al., 2018; Jeuland & Whittington, 2014), economic growth (Vörösmarty et al., 2000), and trends in water consumption (Fletcher et al., 2017;Trindade et al., 2017). In catchments where domestic water supply is the dominant water use, like the one considered in this study, demand can be parsimoniously represented through a product of population and per capita consumption, both of which are uncertain.
One hundred demand scenarios are generated for the London WRZ, which is represented by a demand node within the system model. Baseline forecasts of water demand (Ml/d) from Thames Water's WRMP19 (Thames Water, 2019), containing data on per capita consumption projections (Environment Agency, 2019), were scaled using 10 population forecasts for the London area from 2014 to 2039 obtained from the Office for National Statistics (Office for National Statistics, 2016). The forecasts are lengthened using linear extrapolation. A further 90 demand scenarios were generated by scaling with a sample from a uniform distribution on the range [0.9, 1.1]. For the remaining demand nodes in the model, 10 scenarios of municipal water demand were estimated using the dry year annual average distribution input at WRZ level and scaled according to Thames Water water resource planning tables and demand profiles (Thames Water, 2019). Additional information on the London WRZ demand ensemble is presented in supporting information S4.
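The expansion from 10 population-scaled forecasts to 100 demand scenarios can be sketched as follows. The sketch assumes that each additional scenario is obtained by drawing one base forecast at random and multiplying the whole series by a single factor sampled uniformly from [0.9, 1.1]; how the 90 samples were assigned to base forecasts in the study is not stated, so this allocation, the function name, and the seed are illustrative.

```python
import random


def expand_demand_scenarios(base_scenarios, n_extra, low=0.9, high=1.1, seed=0):
    """Return the base demand series followed by n_extra scaled copies.

    base_scenarios  list of demand time series (Ml/d), one per population forecast
    n_extra         number of additional scenarios to generate
    low, high       bounds of the uniform scaling factor
    """
    rng = random.Random(seed)
    scenarios = [list(series) for series in base_scenarios]
    for _ in range(n_extra):
        base = rng.choice(base_scenarios)   # assumption: base picked at random
        factor = rng.uniform(low, high)     # one factor per scenario
        scenarios.append([value * factor for value in base])
    return scenarios
```

Calling `expand_demand_scenarios(base, 90)` with 10 base series returns 100 series, of which the first 10 are the unscaled forecasts.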

Ensemble
Each demand scenario is randomly coupled with one of the 100 DECIPHeR transient flow scenarios, forming the ensemble X. The out-of-sample ensemble, X * , is constructed in the same way, but the ensemble size is increased to 1,000. Supporting information S5 illustrates the distribution of flow and demand conditions in ensemble X and X * .
In this application, the demand scenarios are assumed independent of climatic conditions; an assumption which is warranted by an earlier study of the impact of climate change on water consumption in the Thames water supply area (HR Wallingford, 2012). The authors recognize that this assumption may not be true for other regions, and stress that water planners should consider the impacts of climate change on water demand in their own water supply systems when constructing scenarios of uncertainty.

Planning Options
Five infrastructure planning options are considered in this study (Table 1). The planning options are akin to the supply side options developed by water managers in Thames Water (Thames Water, 2017). As some options have multiple possible capacities, the optimization search is constrained to prevent the implementation of the same option twice in one strategy. Each option has a construction lead time which determines the elapsed time between triggering an option and an option becoming operational. The planning options are assigned two types of financial cost: the net present value of capital expenditures (capex, CP) denotes the cost incurred when an option is first triggered and is specific to the option capacity; operational expenditure (opex, OP) represents the cost of operation for each day the option is utilized (after the construction lead time has elapsed).
Four water demand management policies, "reduced," "maintain," "effective," and "enhanced," are also included in the search for optimal strategies. The schemes are fixed in time and represent increasing levels of demand reduction and their associated costs. Additional information on the demand management policies is provided in supporting information S6. We do not apply flexible rules to demand management because the schemes are initiated on day one of the strategy, regardless of scenario. This modeling choice is consistent with the expectation that water companies should take actions to reduce water demands and improve water use efficiency both now and into the future (HM Government, 2018).

Planning Objectives
We consider two planning objectives for the case study. Objective function f(1) minimizes the expected total present cost of adaptation, averaged across the scenarios of uncertainty in ensemble X. In f(1), k represents the total number of scenarios, q is the total number of time steps in a scenario, CP_t and OP_t are the capital and operating expenditures incurred at time step t, and r is the discount rate, set at 4.5% in this case study. This rate is consistent with the HM Treasury Green Book discount rates (Thames Water, 2019). Objective function f(2) minimizes the expected restriction cost of Levels 3 and 4 water use restrictions, averaged across the scenarios of uncertainty in ensemble X.

Table 1 Supply Side Options Included in the Search of Optimal Strategies and Fixed Plans for the Thames Water System
In objective function f(2), (E_l)_t represents the expected cost incurred from a water use restriction of level l at time t. We use the economic estimates of the cost of water restrictions defined by Borgomeo et al. (2018), who count the total days of restrictions imposed in a scenario and weight them according to their severity using a relative weight derived from surveys of customer willingness to pay to avoid restrictions. The expected daily economic losses (E) of Level 4 (l_4) events are roughly 40 times those of Level 3 (l_3) events, costed at £282 M and £6.8 M per day, respectively (Lambert, 2015). Economic estimates for Levels 1 and 2 restrictions are orders of magnitude lower than those for Levels 3 and 4 restrictions and are thus excluded from this study. As stated previously, f(2) is considered a risk-based metric as it considers the frequency, severity, and economic consequences of an event of interest to water managers.
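One plausible way to write the two objectives from the symbol definitions given in this section is sketched below. The scenario index i, the function y(t) that maps a time step to its year for discounting, and the restriction to levels 3 and 4 written as a set are notational assumptions introduced for illustration; the exact form used in the study may differ.

```latex
% Expected total present cost of adaptation, averaged over the k scenarios in X
f_1(s) = \frac{1}{k} \sum_{i=1}^{k} \sum_{t=1}^{q}
         \frac{CP_{i,t} + OP_{i,t}}{(1 + r)^{y(t)}}

% Expected cost of Level 3 and Level 4 restrictions, averaged over the k scenarios in X
f_2(s) = \frac{1}{k} \sum_{i=1}^{k} \sum_{t=1}^{q} \sum_{l \in \{3, 4\}} (E_l)_{i,t}
```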

System Model and Optimization Procedure
This study uses WATHNET-5, a simulation model of water withdrawals, storage, and releases with an inbuilt multiobjective optimizer (Kuczera, 1992). The model represents a water resource system as a network of arcs (conduit and stream) and nodes (reservoir, demand, river inflow, groundwater, junction, and waste). The Thames model used here contains 9 demand nodes, 12 surface water nodes, and 7 groundwater nodes. In this study, groundwater inflows are set at the license abstraction limit, defined by the Environment Agency (Environment Agency, 2013). The scenarios of flow and demand are input into the model and a minimum cost flow problem is solved at every time step, t, in the simulation. Each arc has a transfer cost; the higher the transfer cost the less likely the arc will be chosen to transport flow. Carryover arcs ensure flow is not wasted. The performance of solution s relative to the two objective functions is evaluated after all scenarios (X) have been simulated. The performance criteria inform the optimization module.
An εMOGA (Laumanns et al., 2002) solves the optimization problem, using the ε-dominance concept to identify nondominated solutions in the objective space. The algorithm is set to binary, one-point crossover, with bitwise mutation to prevent converging to a local optimum and ensure the solutions identified are diverse and well distributed (Mortazavi-Naeini et al., 2012). The probability of crossover is set to 1.00, diagonal swap in one-point crossover to 0.20, mutation to 0.05, and inversion to 0.05. The multiobjective optimization was performed on high-performance computers with Intel cores; each run used 20 nodes with 16 individual cores.
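The two basic variation operators named above can be sketched on binary strings as follows. This is a generic textbook illustration, not the WATHNET-5 implementation: the diagonal swap and inversion operators are omitted, and the function names and the encoding of thresholds as bit lists are assumptions.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit lists at a random cut point."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def bitwise_mutation(bits, p_mutation, rng):
    """Flip each bit independently with probability p_mutation."""
    return [bit ^ 1 if rng.random() < p_mutation else bit for bit in bits]
```

A typical call would be `bitwise_mutation(child, 0.05, rng)` applied to each child returned by `one_point_crossover`, with `rng = random.Random(seed)` for reproducibility.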

Observable Indicators
As established in Section 2.2, the observable indicators selected for Step 3 should be indicative of emerging trends in the water resource system (Haasnoot et al., 2018). River flow and water demand were identified in discussions with stakeholders in Thames Water and the Environment Agency as being most relevant to their adaptation planning in the London region. Nominated indicators included the number of days a flow statistic is less than the equivalent historical statistic, moving averages of a flow statistic, average population growth, and average demand. Ultimately, five different flow statistics (Q80, Q85, Q90, Q95, and Q98) with five distinct observation windows (1, 5, 10, 20, and 30 years), and three demand measures (1, 5, and 10 years total annual demand) were shortlisted. Here, a flow statistic of Q80 represents the flow that is equalled or exceeded for 80% of the observation window. We also consider five reservoir statistics (1, 5, 10, 20, and 30 years average simulated reservoir level), as reservoir levels respond directly to changes in river flow and demand. Finally, we include an indicator related to time (year in simulation) to account for nonstationarity in the observable indicators.
First, we quantify the expected risk of water shortages, To ensure parsimony in the monitoring system, we do not fit models with a reservoir indicator and a flow and/or demand indicator. Models are eliminated if the p-value for any coefficient estimate is greater than 0.05. If a flow, demand, or reservoir indicator is not significant when we include the time indicator, the decision rules using that indicator are assumed to be no better than rules that use only time. Candidate indicators and/or combinations of indicators are selected based on their ability to predict the risk metric, as indicated by a model's ordinary R² value and RMSE.
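The flow exceedance statistics can be computed over a moving observation window as in the sketch below. It uses a simple empirical definition (the largest observed flow that is equalled or exceeded in at least p% of the window) with no interpolation; the function names and this particular quantile convention are assumptions rather than the method used in the study.

```python
def flow_exceedance(flows, p):
    """Flow equalled or exceeded in at least p% of `flows` (p an integer percentage)."""
    ranked = sorted(flows, reverse=True)
    rank = (p * len(ranked) + 99) // 100  # integer form of ceil(p * n / 100)
    return ranked[max(rank, 1) - 1]


def rolling_exceedance(daily_flows, p, window_days):
    """Exceedance statistic for every complete trailing window of window_days."""
    return [
        flow_exceedance(daily_flows[end - window_days:end], p)
        for end in range(window_days, len(daily_flows) + 1)
    ]
```

For example, for flows of 1 to 100 units, `flow_exceedance(flows, 95)` returns 6, because 95 of the 100 values are greater than or equal to 6.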

Strategy Generation and Testing
WATHNET-5 is run under the 100 transient scenarios (X) at a daily time step. The optimization process is conducted for each candidate indicator or combination of indicators. As the optimizer module runs, it performs a comprehensive search for the best combination of option capacities and decision rule thresholds for the candidate indicators. The Pareto-optimal strategies generated in each search contain the rule-based strategies that determine if and when an option is implemented in the Thames Water system. The range of solutions illustrates the trade-off between adaptation cost (objective function f(1)) and expected restriction cost (objective function f(2)). In this step, we also generate the Pareto-frontier of "fixed" plans, which is used as the counterfactual to the rule-based strategies.
We calculate the hypervolume of each Pareto-frontier using the Hypervolume Computation script for Matlab by Johannes (2020). The upper bound reference point for the hypervolume calculation is set at the maximum objective function values achieved on the fixed plan frontier. This ensures that the hypervolume estimates for the rule-based frontiers are comparable. For each frontier, the hypervolume script computes the Lebesgue measure of the objective function values by means of Monte Carlo approximation, with a sample size of 100,000. We repeat the hypervolume script 100 times for each frontier and average the resulting values to obtain the final hypervolume estimates.
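A Monte Carlo hypervolume estimate of this kind can be sketched in a few lines. This is an independent illustration and not the Matlab script cited above: it assumes minimized objectives, a frontier that lies within the reference point, and a sampling box bounded below by the best value of each objective on the frontier. The function name and default seed are hypothetical.

```python
import random


def mc_hypervolume(front, ref, n_samples=100_000, seed=0):
    """Estimate the hypervolume dominated by `front` and bounded by `ref`."""
    rng = random.Random(seed)
    dims = range(len(ref))
    lower = [min(point[d] for point in front) for d in dims]
    box_volume = 1.0
    for lo, hi in zip(lower, ref):
        box_volume *= hi - lo
    hits = 0
    for _ in range(n_samples):
        sample = [rng.uniform(lo, hi) for lo, hi in zip(lower, ref)]
        # A sample counts if at least one frontier point is no worse in every objective.
        if any(all(p <= x for p, x in zip(point, sample)) for point in front):
            hits += 1
    return box_volume * hits / n_samples
```

Repeating the estimate with different seeds and averaging, as described above, reduces the sampling error of the final value.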

Indicator Screening
This section presents results from the observable indicator screening conducted in Step 2 of the planning framework. Table 2 presents the highest performing linear regression models from the screening; Data Set S1 presents the complete results from the indicator screening, including results for the wider set of models evaluated, details about model coefficients, and p-values. Table 2 is divided into subsections corresponding to indicator type ("reservoir level" or "demand and flow"), and by the window of observation for the flow statistics (1, 5, 10, 20, and 30 years).
MURGATROYD AND HALL
Several conclusions can be drawn from the results. First, predictive ability increases when time is included as an indicator in the linear model. This is unsurprising as the linear model response variable (risk metric) is a calculation of the aggregate restriction cost over the remainder of the time series. The results strengthen the argument that using time as an indicator helps to avoid issues of nonstationarity in the other observable indicators. Second, 1-year annual demand outperforms demand indicators with a longer observation window. For this reason, we only present results from three-term models that use a 1-year annual demand indicator. Third, predictive ability is highest for linear models that use a flow, demand, and time indicator. Models in this form outperform models that use average reservoir levels (models 1-5). The three-term models exhibit distinct RMSE scores. In general, the highest performing three-term models contain extreme low flow statistics (Q95 and Q98). This suggests that the performance of the Thames supply system, as indicated by the prediction of restriction cost, is particularly sensitive to drought events. Finally, for the three-term models using time, flow, and 1-year annual demand, RMSE improves as the window of observation for the flow statistic increases from 1-, to 5-, to 10-, to 20-year periods. No improvement is observed between 20- and 30-year flow statistics. The highest performing model uses a 20-year Q95 flow statistic, 1-year annual demand, and time (model 24). This combination of indicators is used in Step 3 to create decision rules of the form: IF "20yrQ95 is less than threshold value 1" AND "1yrDem is greater than threshold value 2" AND "current year is greater than threshold value 3" THEN "build infrastructure option a with capacity c_a, with the option operating functionally after the lead time m_a has elapsed" ELSE "do nothing."
For illustrative purposes, we choose two additional combinations of indicators to use in Step 3 (Table 2). This choice is based on the performance of the models in each subgroup defined by flow observation window. Models with 1-year observation windows (models 6-10) are excluded because one or more coefficients are not significant (p-value > 0.05). Models with 30-year windows (models 26-30) are excluded because they exhibit low observability (Raso et al., 2019) compared to the shorter 20-year flow indicators. Consequently, models 15 and 20 are selected as they are the highest-ranked models within their subgroup.
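The three-condition decision rule can be expressed as a short sketch. The dictionary keys, the annual observation step, and all numerical values in the usage note are hypothetical and chosen only to show how the time threshold can delay a trigger.

```python
def rule_satisfied(obs, thresholds):
    """IF 20yrQ95 < threshold 1 AND 1yrDem > threshold 2 AND year > threshold 3."""
    return (
        obs["q95_20yr"] < thresholds["flow"]
        and obs["demand_1yr"] > thresholds["demand"]
        and obs["year"] > thresholds["year"]
    )


def first_trigger_year(observations, thresholds):
    """Return the first year in which all three conditions hold, else None."""
    for obs in observations:
        if rule_satisfied(obs, thresholds):
            return obs["year"]  # THEN build option a (operational after its lead time)
    return None                 # ELSE do nothing
```

If the flow and demand conditions are already met in 2035 but the time threshold is 2050, `first_trigger_year` does not return a year until an observation after 2050 also satisfies the flow and demand conditions.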

Strategy Generation
This section presents the results from the four multiobjective searches conducted in Step 3 of the planning framework. Figure 3 shows the Pareto-optimal solutions for each optimization run and illustrates their relative performance in the objective space. Each solution outlines a 60-year plan or rule-based strategy for a planning period from 2030 to the end of 2089. The fixed plans specify the year (elapsed time) in which an option is triggered; the rule-based strategies outline the time, flow, and demand conditions that must be exceeded for an option to be triggered. The three combinations of indicators identified in Step 2 (Section 4.1) are herein referred to as Indicator Group A (model 15), Indicator Group B (model 20), and Indicator Group C (model 24).
In all four sets of solutions, the most expensive rule-based strategies and fixed plans contain multiple options, each with conditions that trigger early option implementation. The least expensive solutions avoid costly options and delay system modification until later in the planning period. These low-cost solutions result in a high frequency of severe restrictions. Moreover, all of the fixed plans and rule-based strategies presented in Figure 3 choose the "reduced" demand management policy, which is the cheapest of the four available. The most frequently selected options (as identified by a positive option capacity) in the fixed plans and rule-based strategies are broadly similar, with over 77% of solutions in each search selecting Option 1, and over 99% picking Option 2. Option 3 is selected in 22% of fixed plans, and 28% and 29% of strategies with indicator Groups A and B, respectively. Only 4% of strategies with indicator Group C assign a positive capacity for Option 3. Option 4 is rarely chosen in the fixed plans, or in strategies with Groups A and B indicators (<7%), but is selected in 28.4% of strategies in Group C. Finally, Option 5 is more popular in the rule-based strategies (>42%) than fixed plans (33.6%).
Figure 3 indicates that rule-based strategies are most likely to outperform fixed plans in areas of the objective space where expected total costs are between £2.3 M × 10² and £2.8 M × 10², and expected restriction costs are between £2.5 M × 10⁴ and £12.5 M × 10⁴. In this region, the average decrease in expected total cost, relative to fixed plans with an equivalent restriction cost, is 0.91% in Group A strategies, 0.86% in Group B, and 1.35% in Group C. In the same region, the average decrease in expected restriction cost, relative to fixed plans with an equivalent expected total cost, is 8.33% in Group A, 8.04% in Group B, and 13.1% in Group C.
The flexible solutions perform better on average than their fixed counterparts because options are only included in the system when the observational trends in demand and flow indicate so, and therefore the costs associated with construction and infrastructure operation occur only in years when the options are needed and not sooner (Steinschneider & Brown, 2012). The inclusion of a time indicator in the rule-based strategy search also helps to avoid options being implemented as a result of random shocks to the system which do not necessarily represent the long-term trends emerging from the climate and demand conditions. For example, a rare extreme event at the beginning of a scenario may satisfy the conditions required to trigger an option, but this option may not be needed for the remainder of the scenario if the long-term trend is milder than the early extreme event signposted. The time indicator therefore ensures that the options are not implemented earlier than is essential in each scenario. This effect is less important for an indicator with a long observation window, as the increased number of observations has a smoothing effect on the indicator trend.
flow threshold, low demand threshold, and time threshold of 2050. In this case, the flow and demand rules are satisfied early in the simulation, but the option is not triggered until the time threshold is surpassed. In this format, the rule-based strategies can only equal or improve the fixed plan performance. This explains the convergence of solutions in the tails of the Pareto-frontiers, along which the rule-based strategies reproduce the adaptation actions enacted in the fixed plans. Overall, the results indicate that this approach is most appropriate for agencies with a balanced preference over the planning objectives. Agencies that value one objective over another may not benefit from the rule-based strategies, as the solutions in the tails of the Pareto curves offer no improvement on their fixed counterparts.

presented in Step 2 (Section 4.1), which indicates that this combination of indicators was the best predictor of future risk. For this reason, we select strategies using indicators from Group C for out-of-sample testing in Step 4. Data Sets S2 and S3 present the full set of Pareto-optimal fixed plans and Group C strategies from the optimization experiment. The remainder of this section describes features of the strategies in Group C. Together, the plots in Figure 4 indicate that Option 2 is the most likely option to be triggered in the Group C strategies, followed by Options 1 and 5. Options 2 and 5 are likely to be triggered early in the simulation, whilst Option 1 is more likely to be triggered later. Options 3 and 4 are least likely to be triggered, requiring observations of severe low flow events and/or high annual demand.

Strategy Testing
A selection of rule-based strategies from Step 3 is resimulated against the new scenarios of randomly paired flow and demand time series in ensemble X * . This out-of-sample simulation allows comparison of strategy performance under future scenarios not yet tested against. The rule-based strategies selected for resimulation are marked S1-S4 in Figure 5 and described in Table 3. The four strategies were selected from the key objective area highlighted in Figure 3, and are equally spaced along the Pareto-curve (as measured by expected restriction cost). The objective performance of the resimulated solutions is represented by the dark blue diamonds in Figure 5a. Figures 5b-5e present the empirical cumulative distribution of estimated total costs and restriction costs across the scenarios in ensemble X and X * .
Each resimulated solution experiences a shift in objective performance. Small changes are observed in expected total cost, with the largest average increase in expected cost from the resimulated strategies observed under S2 (+0.4%). Larger differences emerge between the estimations of expected restriction cost, with average increases of +5.2%, +5.3%, +4.0%, and +13% under strategies S1, S2, S3, and S4, respectively. Variance in total cost increases from left to right along the frontier, whilst variance in restriction cost decreases. The high variance in capital and operating expenditure from the rightmost strategies is attributed to more variable option implementation across the ensembles. The same pattern in variance exists along the frontier of fixed plans presented in Figure 3. In fixed plans, however, variance in total cost is driven solely by differences in operating expenditures between scenarios, rather than capital expenditure. This is because fixed plans implement an option in the same year of each simulation, regardless of the conditions observed.
In spite of the small differences in performance, all four resimulated rule-based strategies perform better than the original fixed plan frontier. The out-of-sample simulation suggests that: (i) the selected strategies are not over-fit to the scenarios in ensemble X used in the optimization search, and; (ii) the rule-based strategies out-perform the fixed plans across a wide range of future uncertainties. Figure 6 illustrates the frequency and timing of option implementation of rule-based strategy S3 (b), fixed plan F1 (c), and fixed plan F2 (d) when resimulated against the 1,000 new scenarios in ensemble X * . Fixed plans F1 and F2, as identified in Figure 6a, are chosen for their comparability in objective performance to strategy S3 when simulated against the original scenarios in ensemble X. All three solutions feature rules that trigger Option 1 with capacity 600 Ml/d and Option 2 with capacity 300 Ml/d. Plans F1 and F2 implement Option 1 in 2063 and 2062, and Option 2 in 2031 and 2039, respectively.
The crosses in Figures 6b-6d represent the observable conditions (20-year Q95 and annual demand) when an option is implemented in each scenario. The colored boxes represent the decision boundaries (thresholds) of the flow and demand decision rules in strategy S3. Figure 6b indicates that under strategy S3, Option 2 is likely to be implemented early in the simulation, whilst Options 1 and 3 will be triggered later. The plot demonstrates when and in how many scenarios an option in S3 is triggered, and how effectively the decision space within the box is utilized under the future scenarios in X*. Note that in Figures (c) and (d)  . Performance of resimulated strategies. Panel (a) plots the expected total cost and expected restriction costs for resimulated rule-based strategies (S1-S4). Panels (b-e) plot the empirical cumulative distribution of total cost and restriction cost from strategies S1 to S4 when simulated against ensemble X and X * .   than 560,000 Ml. In spite of this early adaptation, both resimulated fixed plans produce a higher expected restriction cost (£5.3 × 10 4 and £5.8 × 10 4 , respectively) in the out-of-sample simulation compared to the resimulated strategy S3 (£5.2 × 10 4 ). This is despite plan F1 exhibiting a similar expected restriction cost to S3 in the original simulation under ensemble X.
Overall, the resimulation experiment suggests that the fixed plans carry a higher risk of water use restrictions than the equivalent rule-based strategy, and may incur unnecessary financial costs when the need for adaptation is low. The water planner should use this resimulation exercise to ensure that the strategies most suited to their cost and risk preferences are robust to a wider ensemble of future conditions than those used in the initial strategy generation stage.

Conclusions
The benefits of flexibility in water resources planning have long been recognized (Fletcher et al., 2019), but water resources planners have lacked practical tools to interpret system changes and implement adaptive strategies. In this study, we have proposed and demonstrated a pragmatic approach in which strategies are articulated in terms of a set of options, a set of indicators that need to be monitored, and decision rules that implement the options conditionally upon observations of the indicators. Within our decision rule framework, infrastructure options are triggered only when the observable indicators detect emerging trends in the water system (Raso et al., 2019; Robinson & Herman, 2019). The objective functions selected for the multiobjective optimization allow the water planner to trade off the performance of strategies composed of supply-side and demand-side options against the total discounted cost of implementation. Performance is measured in terms of observable outcome variables of relevance to decision makers and water users: the frequency, severity, and duration of water shortages, and the associated economic costs based on estimates of willingness to pay to avoid shortages. Visualization of performance with respect to multiple objectives aids the evaluation of the optimal strategies when simulated against an extensive library of future flow and demand scenarios (Kasprzyk et al., 2013). Given their tolerance to risk, economic budget, and minimum performance requirements (Borgomeo et al., 2018; Stakhiv, 2011), decision makers can select the most appropriate strategy and corresponding decision rules for their water system.
The observable indicators and decision rule thresholds are selected quantitatively, through regression analysis, simulation, and multiobjective optimization. This is more rigorous than previous research, in which the choice of adaptation indicators and triggers is judgment-based and informed by stakeholder dialogue (Haasnoot et al., 2013; Zeff et al., 2016). Nonetheless, stakeholder engagement is central to our proposed management process, informing the goals of system operation and navigating trade-offs between those goals. However, the choice of the best indicators should not be a matter of stakeholder judgment: it is a technical question given the overall goals and the properties of a system's observable behavior (Herman et al., 2020).
To avoid the intractable computational expense of "brute force" indicator optimization under uncertainty, we have proposed a two-step approach. The first step screens possible indicators according to their relevance to system performance, to obtain a parsimonious subset of candidates. In the second step, indicators are chosen by selecting those that yield Pareto-optimal system performance when incorporated in a rule-based strategy. Unlike previous studies that use one indicator of change (Kirsch et al., 2013; Steinschneider & Brown, 2012; Zeff et al., 2016), this study monitors and responds to multiple indicators, so the adaptive strategy is tailored to different drivers of change.
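The notion of Pareto-optimal performance used in the second step can be made concrete with a short non-dominance filter. This is a sketch only: the study searches for strategies with a multiobjective evolutionary algorithm, which is not reproduced here, and the function name and the assumption that every objective is minimized (for example, intervention cost and restriction cost) are ours.

```python
def pareto_front(points):
    """Return the indices of non-dominated points.

    Each point is a tuple of objective values, all to be minimized.
    A point is dominated if another point is no worse in every
    objective and strictly better in at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

Applied to hypothetical (intervention cost, restriction cost) pairs, the filter keeps only those strategies for which no alternative is cheaper on one objective without being more expensive on the other.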
This framework uses OLS regression to identify the indicators best correlated with the metric of system performance, a methodological decision that aligns with the aim of developing a practitioner-oriented decision-making framework. Furthermore, the regression experiment was used to identify indicators in a water supply system with a well-documented relationship between water availability, public demand, and the frequency of water use restrictions. Yet, other water systems may exhibit more complex relationships between indicators and performance metrics. For this reason, adoption of a more sophisticated feature selection method (e.g., Raso et al., 2019; Robinson & Herman, 2019) that accounts for nonlinearities in water system performance would merit further investigation.
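As an illustration of regression-based screening, the sketch below fits one univariate OLS model per candidate indicator and ranks the candidates by the coefficient of determination. The univariate form, the use of R² as the ranking score, and all names are assumptions made for illustration; the regression specification used in the study may differ.

```python
import numpy as np

def r_squared(x, y):
    """R-squared of an OLS fit of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def rank_indicators(candidates, performance):
    """Rank candidate indicators by how well each explains performance.

    `candidates` maps a (hypothetical) indicator name to its series;
    `performance` is the system performance metric over the same periods.
    Returns (name, R-squared) pairs, best first.
    """
    y = np.asarray(performance, dtype=float)
    scores = {
        name: r_squared(np.asarray(series, dtype=float), y)
        for name, series in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The highest-ranked candidates would then form the parsimonious subset passed to the optimization stage. A linear score of this kind will understate indicators whose relationship with performance is strongly nonlinear.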
In the application to the Thames Basin in England, it was found that statistics of low river flows and water demand, as well as time, represented the most informative set of indicators. The Thames case study illustrates the benefits of this approach, when compared with fixed plans that were optimized by their performance on average over the whole ensemble. These fixed plans have high potential for regret in scenarios in which they impose an unjustified financial burden by investing in new supply capacity in futures where the increase in demand or reduction of supply is less than anticipated. Implementing a fixed plan instead of a rule-based strategy requires the water planner to commit to long-term large infrastructure projects without knowing exactly how future conditions may evolve. We expect that a rule-based approach that uses a common-sense indicator like reservoir storage could also perform better than a fixed plan. We reason that it would not perform as well as our optimally selected set of indicators because: (i) our indicators are chosen based on a quantified assessment of their expected performance; and (ii) by using disaggregated observables we can better diagnose the factors that contribute to change. In practice, we would expect optimal rule-based strategies to perform better than arbitrary rule-based strategies, which would on the whole perform better than a fixed strategy. The arbitrary strategies could be tested using the proposed framework, but the performance would depend on the indicator one happened to choose. The fixed comparison does not share the same problem of ambiguity in the indicator selection, and so is an appropriate counterfactual to the rule-based strategy.
In the example presented here, the rule-based strategies in a key region of the objective space reduce expected capital and operating cost on average by 1.35% for a given level of restriction risk, and reduce expected restriction cost on average by 13.1% for a given intervention cost. Out-of-sample resimulation demonstrated how rule-based strategies also perform notably better under unforeseen conditions. Results showed that expected restriction costs under resimulated fixed plans were up to 12.4% more than the expected restriction costs of the comparative resimulated strategy. Whilst we do not explicitly include robustness as a planning objective in this framework, we believe that the decision rules may be considered to be robust in that: (i) they have been optimized under a very wide range of possible future conditions, with a focus upon extremely undesirable performance (i.e., the most severe water shortages); and, (ii) they have been further tested in unforeseen conditions not used in the optimization and have proved to perform acceptably well and better than the baseline alternative.
Though we believe the proposed approach represents a significant step both in rigor (for the reasons outlined above) and practicality (because decision rules are inherently intuitive), we recognize that there are still limitations in our method that require further research and development, besides dealing with the computational expense. The objective of preserving and enhancing the environment, which is central to sustainable water resources management, has been implemented as a constraint on the abstraction rules governing surface water withdrawals. Though it is unlikely that there would be any willingness to relax this constraint, water resources managers should be looking for ways to enhance the aquatic environment and reduce the impact of water withdrawals, which should be included as a further objective in the optimization method, combined with more explicit environmental indicators (Murgatroyd & Hall, 2021; Poff et al., 2016). Furthermore, this analysis deals only with water quantity, yet water quality is crucial for the aquatic environment and for the reliability of potable water supplies. Mortazavi-Naeini et al. (2019) examined trade-offs with river and reservoir water quality and quantity for public water supplies in the Thames Basin, but using their coupled water quality modeling system within the optimization framework presented here may be computationally intractable. Moreover, roughly a fifth of the water supply in the Thames Basin is withdrawn from groundwater sources, which interact with surface flows and are sensitive to climatic conditions, so a complete analysis would better account for groundwater and its system interactions. Finally, limitations on new water sources in the Thames Basin mean that large inter-basin transfers may be necessary to improve water system resilience to 21st-century drought events (Murgatroyd & Hall, 2020).
To account for this, future studies should extend the number of objectives and observable indicators presented in this planning framework to also include those that are germane to neighboring basins. Though the complete system for adaptive management of water resource systems to achieve sustainability in challenging future conditions has yet to be fully integrated, we believe that the significant building blocks are now in place.