A Flexible Framework for Simulating the Water Balance of Lakes and Reservoirs From Local to Global Scales: mizuRoute‐Lake

Lakes and reservoirs are an integral part of the terrestrial water cycle. In this work, we present the implementation of different water balance models of lakes and reservoirs into mizuRoute, a vector‐based routing model, termed mizuRoute‐Lake. As the main advantage of mizuRoute‐Lake, users can choose between various parametric models implemented in mizuRoute‐Lake. So far, three parametric models of lake and reservoir water balance, namely Hanasaki, HYPE, and Döll are implemented in mizuRoute‐Lake. In general, the parametric models relate the outflow from lakes or reservoirs to the storage and various parameters including inflow, demand, volume of storage, etc. Additionally, this flexibility allows users to easily evaluate and compare the effect of various water balance models for a lake without needing to reconfigure the routing model or change the parameters of other lakes or reservoirs in the modeling domain. Users can also use existing data such as historical observations or water management models to specify the behavior of a selected number of lakes and reservoirs within the modeling domain using the data‐driven capability of mizuRoute‐Lake. We demonstrate the flexibility of mizuRoute‐Lake by presenting global, regional, and local scale applications. The development of mizuRoute‐Lake paves the way for better integration of water management models, locally measured, and remotely sensed data sets in the context of Earth system modeling.


Introduction
Lakes store a large fraction of terrestrial water and have considerable impacts on the terrestrial water cycle, global and local climate, and ecosystems (Biemans et al., 2011;Samuelsson et al., 2010;Shugar et al., 2020;Thiery et al., 2015;Xu et al., 2018).Additionally, reservoirs that are constructed to store water for human activity strongly alter natural river systems, and enable irrigation in drier seasons, stable water supply for urban or industrial sectors, and hydropower production.From the year 1950-2000, the total volume of water stored by large dams increased from 1,000 to 11,000 km 3 (Chao et al., 2008).
Since lakes and reservoirs are an integral part of the Earth system, accurate representation of these water bodies in Earth system models is essential to simulate land-atmosphere fluxes (Pokhrel et al., 2016;Vanderkelen et al., 2020).However, the representation of lakes and reservoirs in Earth system models is a challenging task.Lakes have an impact on three major conservation laws: (a) conservation of mass, including water and sedimentnutrients (Wisser et al., 2013;Yang et al., 2007); (b) conservation of energy, including energy fluxes from and to the water bodies, evaporation, and phase change such as ice cover (Abbasi et al., 2016;Balsamo et al., 2012;Bonan, 1995;Croley & Assel, 1994;Subin et al., 2012;Vanderkelen et al., 2020Vanderkelen et al., , 2021)); and (c) conservation of momentum, including wave propagation, circulation of water, and events such as dam failure (Xiong, 2011).Among these conservation laws, water balance has understandably attracted substantial attention because the inflow and outflow fluxes to and from lakes and reservoirs are directly linked to human activity for irrigation, food production, flood prevention, and risk mitigation efforts (Pokhrel et al., 2016;Postel et al., 1996;Wada et al., 2014).
River, lake, and reservoir routing schemes are often embedded within existing hydrological, land surface, or water management models (Guinaldo et al., 2021;Hanasaki et al., 2008;Yassin et al., 2019;Zajac et al., 2017;Zhou et al., 2020).Additionally, terrestrial hydrological processes are often simplistically implemented in water management schemes used for water accounting.The land modeling community has a mandate to explore a myriad of terrestrial system processes at continental or global scales rather than the implementation of often complex operational rules for each reservoir.Given the importance of lakes and reservoirs in water management and Earth system modeling, there is a pressing need for frameworks that can simulate different types of lake and reservoir models within a defined system of rivers, lakes, and reservoirs network.
In this study, we introduce mizuRoute-Lake to simulate lakes and reservoirs within the routing model mizu-Route (Mizukami et al., 2016(Mizukami et al., , 2021)).The advantage of mizuRoute-Lake can be summarized as the flexibility in seamless alternative lake and reservoir models.Despite the development of the standalone lake and reservoir water balance models, frameworks that integrate multiple lake and reservoir models, with different levels of complexity and hence data availability, are needed.mizuRoute-Lake implements various lakes and reservoir models in the fabric of a river-lake network topology.As an example, users can adjust the complexity of lake and reservoir models based on the available data for each lake or reservoir in a river basin.Additionally, switching between the different options for lake and reservoir models is seamless, and therefore the impact of different lake water balance models can be evaluated in isolation from changes in river network topology.The runoff field that drives mizuRoute-Lake can be the output of any hydrological or land surface model with various spatial configurations (hydrological response units or grids).This flexibility provides the user with a routing model that is agnostic to the land surface or water management host models (Nazemi & Wheater, 2015).mizuRoute-Lake also has the capability of data-driven simulations of lake and reservoir volume.In this case, users can provide target storage/volume or abstraction and injection to/from a lake or reservoir or any river segments.These values can be provided from various sources such as water management models (e.g., Water Evaluation And Planning System, WEAP, Yates et al., 2005) or Artificial Neural Network or Agent-Based Models (Ehsani et al., 2016;Giuliani & Castelletti, 2013) or historical observations or future scenarios.
The vector-based nature of mizuRoute-Lake allows the use of various river network topologies, subbasins, or grids.Additionally, the vector-based nature of mizuRoute-Lake makes it easier to use readily available river network topologies and information such as the NHDPlus data set which is the underlying river fabric of the U.S National Water Model (Maidment, 2017;Salas et al., 2018).This flexibility allows for better representation of the lakes and reservoirs, without the need to rasterize water bodies.Recent studies include lakes and reservoirs in vector-based models for large-scale applications in Earth system models (in the RAPID model by Tavakoly et al., 2021), though these capabilities remain in their infancy in comparison to grid-based routing models (e.g., MOSART, and LISFLOOD;Li et al., 2013;Burek et al., 2013;Thurber et al., 2021).Figure 1 provides an example of the vector-based setups for gridded or subbasin network topology with resolvable and unresolvable lakes and reservoirs.mizuRoute-Lake has previously been applied at global and regional scales, for example, to evaluate the effect of parametric lakes and reservoirs, while considering predicted irrigation demands as simulated by the Community Land Model version 5 (Vanderkelen et al., 2022).
In the following sections, we describe the implementation and applications of lake and reservoir water balance models in mizuRoute-Lake.Section 2 outlines the advantages and possibilities that mizuRoute-Lake offers for lake and reservoir modeling at local to continental domains.Section 3 describes the global, regional, and local case study applications of mizuRoute-Lake.Concluding remarks are provided in Section 4.

mizuRoute
mizuRoute is a vector-based routing model (Mizukami et al., 2016).It is open source and is available within the Earth System Community Modeling Portal GitHub repository (https://github.com/ESCOMP/mizuRoute).The mizuRoute model aims to be implemented as a river model component in the Community Earth System Model (CESM) for large-scale water cycle studies.The runoff generated in the land model component, Community Terrestrial System Model (CTSM) in CESM, is directly transferred to the mizuRoute as well as the feedback from mizuRoute to land model (e.g., river volume impacting irrigation demand).For stand-alone use, mizuRoute ingests runoff input from an external source, which can be from any hydrological or land surface, and if needed, mizuRoute remaps runoff to the routing units and sub-basins.mizuRoute then performs two stages of runoff routing.First, in-basin or hillslope routing is performed to ensure the transfer of runoff from the basin area to the river network.Currently, this is performed using a gamma-distribution-based unit-hydrograph (Bhunya et al., 2007;Nadarajah, 2007).Next, the in-basin routed water is passed to the river segment where in-channel routing is performed.The current version of mizuRoute includes four options for in-channel routing schemes: (a) impulse response function, IRF, (b) kinematic wave tracking, KWT, (c) Muskingum-Cunge, and (d) Diffusive Wave.mizuRoute can internally decompose the routing domain into regions and pass calculations to multiple processors for faster simulation (for further reading refer to Mizukami et al., 2021).This parallelization is also used in river-lake combined networks which makes global simulations with millions of river segments and lakes feasible.

Lake Water Balance
When lakes and reservoirs are activated in mizuRoute, additional variables for lake evaporation and precipitation onto lakes must be provided alongside the runoff variable, and mizuRoute remaps those variables to the lake area smaller lakes that are resolved on the river network, (c) a vector-based configuration with lakes, using a low-density river network with unresolved smaller lakes, and (d) a vector-based configuration with higher density river network resolving smaller lakes on the river network.The blue and gray color indicates resolved and unresolved lakes and reservoirs on the river network topology.Resolvable lakes and reservoirs are water bodies that can be simulated within the river network topology with inflow and outflow points, while unresolvable lakes and reservoirs are often referred to as in-basin or in-grid lakes and reservoirs and are simulated in isolation or with indirect links to river network.

10.1029/2022WR032400
if remapping is needed.In addition, mizuRoute provides the capability to use externally calculated or prepared abstraction and/or injection of water to and from river segments, lakes, or reservoirs.
The water balance of lakes and reservoirs in mizuRoute-Lake can be written as: in which S [m 3 ] is the lake or reservoir storage, I and O [m 3 s 1 ] are the inflow and outflow flow of the lake or reservoir, P and E [m s 1 ] are the lake precipitation and evaporation, G [m s 1 ] is the interaction of groundwater with lakes and reservoirs, and A [m 2 ] defines the lake area.F a,i [m 3 s 1 ] is the abstraction or injection flux that is provided as times series; if positive it is an abstraction, given that there is enough water available in the river segment, lake or reservoir, and if negative it is treated as an injection.The time series of abstractions and injections to lakes and reservoirs can come from other models, for example, water management models, or directly from observations or existing regional or global water demand data sets (similar to Shin et al., 2019;Droppers et al., 2020).In addition, mizuRoute-Lake can scale the input variables (runoff, evaporation, and precipitation) using a scale factor and offset (in the form of linear transformation AX + B with A as the scaling factor and B as the offset, and X is the time series).This capacity allows the users to easily evaluate the impact of changes in model forcings without changing the model setup or inputs.In the case of lake modeling, for example, it allows users to suppress the precipitation and evaporation to a minimum, or zero, to evaluate their impacts on downstream streamflow and lake volumes.

Including Various Lake and Reservoir Water Balance Models
The lake and reservoir water balance models that are used in Earth system models are often taken from water resources management and global hydrological modeling communities.As an example, the Hanasaki formulation has been used in routing models in the context of Earth system modeling (Biemans et al., 2011;Droppers et al., 2020;Hanasaki et al., 2006).Recent studies have provided insight into the sensitivity of parameter values in various lake models (Gutenson et al., 2020;Sadki et al., 2023;Shin et al., 2019;Tavakoly et al., 2023) and the impact of lake and reservoir models on inferred parameters of hydrological or land surface models (Dang et al., 2020).Encompassing various water balance models, mizuRoute-Lake further assists both sensitivity and uncertainty analysis for lakes and reservoirs.In addition, existing river network topology (or grid-based) can be used to set up mizuRoute-Lake.The lake models implemented in mizuRoute encompass both parametric and data-driven methods to enable flexibility in modeling lakes and reservoirs.The user can invoke different lake and reservoir water balance models within the same routing configuration.For example, smaller upstream lakes can be modeled using a simpler parametric model due to lack of information, while the larger downstream reservoirs, where observed data availability is more likely, can be modeled using more complex models.

Lake and Reservoir Models
Lakes can be generally classified as endorheic or endorheic.Exorheic lakes are lakes that have at least one outlet.
In the current implementation, mizuRoute-Lake assumes one outlet for the exorheic lakes and reservoirs.Endorheic lakes have no outlet, meaning that the water that enters these lakes is depleted by other means, such as infiltration to the lake bed, abstraction, or evaporation from the lake surface.Endorheic lakes are treated as water bodies in which outflows from the outlet, O from Equation-1, are assumed to be zero.

Parametric Lake and Reservoir Models Available in mizuRoute
Parametric lake models link outflow, O, to inflow, I, and storage, S, of a lake or reservoir by a set of functions and parameters.Parametric lake models can be further categorized into time-invariant and time-varying (or hyperparametric) models.

Time-Invariant Parametric Models
The simplest outflow models for lakes and reservoirs are time-invariant parametric models.Examples of these models are Döll (Döll et al., 2003), Wada (Wada et al., 2014), HYPE (Arheimer et al., 2019), and the LISFLOOD Water Resources Research 10.1029/2022WR032400 (Burek et al., 2013) lakes and reservoir formulations (for a more extensive list of models refer to Gutenson et al., 2020).Currently, there are two time-invariant parametric models implemented in mizuRoute.
• Döll: The simplest lake model implemented in mizuRoute-Lake is the formulation of Döll et al. (2003) (based on Meigh et al., 1999).This model relates the outflow from a lake to the current volume of water stored in the lake, its maximum active capacity, and an empirical power relationship identified by a coefficient and power.The Döll formulation is often used for natural lakes (no regulation) or when there is limited knowledge on how reservoirs are operated.Appendix A describes the implementation of the Döll model.• HYPE: The second time-invariant parametric model in mizuRoute is the HYPE formulation of lakes and reservoirs (Arheimer et al., 2019).The implemented HYPE model in mizuRoute-Lake aims at representing hydropower production.The operation rules for the HYPE reservoir model depend on four elevation zones described by four parameters with respect to spillway crest and turbine(s) intake levels.These elevation parameters from largest to smallest are (a) the elevation of the emergency spillway, E emg ; (b) the elevation under which the turbine(s) inflow and therefore hydropower becomes restricted, E lim ; (c) the elevation turbine(s) intake below which outflow is zero E turbine ; and (d) an elevation which defines the volume of the so-called inactive storage of a lake or reservoir, E min .When the lake water level elevation is between the minimum elevation, E min , and the elevation of turbine(s) intake, E turbine , the outflow is effectively zero and the reservoir accumulates (and evaporates) water.For lake water level elevations higher than primary turbine(s), E turbine , and lower than the limiting elevation, E lim , the hydropower production and outflow are limited.The detailed elaboration of HYPE hydropower formulation for each condition explained above, alongside parameters, is described in Appendix B. It is noteworthy to mention that the HYPE formulation of mizuRoute-Lake can be expanded to include all the HYPE formulations for irrigation and water supply and flood control reservoirs in addition to the already implemented hydropower reservoir.

Time-Varying Parametric Models
Time-varying model parameters are often used to capture time-dependent changes in reservoir operations over months, seasons, years, and decades, accounting for wetting or drying periods and accommodating the complexity of reservoir operation rules and decisions.To address this, we have implemented the "Hanasaki with memory" parameterization as follows.
• Hanasaki with memory: The Hanasaki formulation is among the most well-known formulations that are used to inform the water balance model of a reservoir based on time-varying, often monthly, inflow and demand terms (Hanasaki et al., 2006).The model scales the demand term based on the state of the reservoir (the amount of water stored).In our implementation, we have made the monthly inflow and demand parameters variable over time by allowing the model to adjust the inflow and demand parameters based on the memory of the system (e.g., the past 5 years).This enables adjustment of time-varying parameters by considering how long-term changes in environmental conditions, such as climate change, affect inflow or by how demand changes over time due to changes in irrigation technology, irrigated area, or irrigation intensity (Appendix C).
The performance of the Hanasaki formulation for reservoirs compared to the natural lake model of Döll in mizuRoute is globally evaluated by Vanderkelen et al. (2022).

Data-Driven Lake and Reservoir Modeling
Additional time-varying parameter formulations have been proposed.In these models, often assumed timeinvariant parameters are varied in time at a given temporal scale such as monthly (Tefs et al., 2021;Yassin et al., 2019).The monthly parameters can reflect the resolution of water management models or available data and, in principle, these parameters can be changed even per simulation time step.Parametric models have rigid formulations and assumptions that may limit their applicability.mizuRoute-Lake allows the user to provide a time series of abstraction and injection fluxes (m 3 s 1 ) for each river segment, natural lake, or reservoir in the river-lake network topology.Also, for reservoirs in the river-lake network, a user can identify target volumes (at the time step of the model simulation, e.g., hours or days).mizuRoute-Lake then adjusts the outflow.If the current target volume is greater than the current reservoir volume, the water is stored without discharging downstream.On the other hand, the model releases water if more water is stored in the reservoir than the target volume.

Case Studies
In this section, four mizuRoute-Lake case studies are presented.The first case study focuses on differences in streamflow simulations using river network topology with and without resolvable lakes.The second case study focuses on the interconnection and impact of hydraulic parameters and climatic forcing on Lake Victoria's water level and outflow.The third case study evaluates the impact of different lake and reservoir water balance models given different numbers of resolvable lakes and reservoirs based on two river network topologies for the South Saskatchewan River.A final case study focuses on the simulation of various scenarios for Lake Diefenbaker in Saskatchewan, Canada, to evaluate peak flow reduction in the city of Saskatoon during the flood of June 2013.All the case studies are carried out with simulations and observations at daily time resolution with the in-channel routing scheme of IRF.

Global Simulation of Lakes and Reservoirs Using Döll Parametric Model
In our first case study, we evaluate the difference in streamflow simulation in river segments globally with a network topology without lakes versus one including lakes.The network topology is based on the Hydrologic Derivatives for Modeling and Applications with approximately 300,000 river segments worldwide (HDMA; Verdin, 2017).The river-lake network topology also utilizes the HDMA river network and adds approximately 4,200 resolvable lakes and reservoirs globally from the 1.43 million lakes in the HydroLAKES data set (Messager et al., 2016).The resolvable lakes are the lakes that can be captured by the length and coarseness of the selected river network topology (refer to Figures 1c and 1d).For the river network with lakes and reservoirs, the river segment length and sub-basin areas are corrected for the portions that are under lakes and reservoirs.This exercise can be seen as a vector-based analogy to recent advances with grid-based routing models that consider lakes and reservoirs globally (Zajac et al., 2017).For the Döll formulation parameters, see Appendix A, the inactive storage of each lake is set to the total volume provided by the HydroLAKES data set, and the maximum active storage is considered to be 5 m of water volume over the area of lake more than the inactive storage.The power and coefficient parameters are set to suggested values of 1.5 [-] and 0.01 [day 1 ] respectively (Hanasaki et al., 2006).The simulated runoff is from the Variable Infiltration Capacity, VIC, with a resolution of 0.25°forced with MSWEP precipitation data set (Beck et al., 2019;Lin et al., 2019).The advantage of this product is its biascorrected runoff fields, resulting in a closer match to observed streamflow when routed.Additionally, precipitation and open water evaporation are passed to mizuRoute-Lake for accounting in the water balance Equation-1.Simulated open water evaporation, based on Penman formulation, is used from the VIC model.
Figure 2 indicates changes in maximum discharge values per river segment with and without lakes for the period of 1985-2010.As expected, the larger lakes have a higher impact on streamflow downstream.Also, the impact of lakes on streamflow is diminishing further downstream of lakes and reservoirs.This can be the basis of exploring the earlier assumptions on the distance of influence from a lake and reservoir in the downstream (e.g., 600 km by Wada et al., 2014).We also investigate the simulation performance with and without lakes compared to observation stations from the Global Runoff Data Center, GRDC (see the inset figure).75% of stations from the total of 1600 GRDC stations that are downstream of lakes and reservoirs within the network of HDMA and Hydro-LAKES exhibit improved performance in Nash Sutcliffe Efficiency, E NSE , in comparison to the case without lakes and reservoirs.This demonstrates the added value of the inclusion of lakes and reservoirs in streamflow simulations, even when the runoff field is bias-corrected based on observations.

Reconstruction of Level and Outflow From Lake Victoria
In our second case study, we focus on the reconstruction of the lake level and outflow from Lake Victoria in East Africa.It is the second largest freshwater lake in the world with an average area of 6 10 4 km 2 and a volume of approximately 2.4 10 3 km 3 , with active storage of approximately 2.4 10 2 km 3 (considering Lake Victoria's average area and assuming 4 m of water level fluctuation over the historical period by Nicholson & Yin, 2001).The outflow of the lake is the source of the White Nile and is governed by the Nalubaale dam complex for hydropower production located in Jinja, Uganda.Many studies have attempted to explain lake level fluctuations through detailed evaluations of water balance components of Lake Victoria (refer to Yin & Nicholson, 1998;Sene, 2000;Vanderkelen et al., 2018;Pietroiusti et al., 2024).Modeling the Lake Victoria water balance is challenging due to (a) a lack of comprehensive rainfall data on the lake (estimated to contribute up to 85% of lake tributaries to Lake Victoria to estimate lake inflows, (c) a lack of data to validate and estimate evaporation from the lake surface, and (d) human activities to regulate lake outflow at the dam, which transformed Lake Victoria from a natural lake to a very large reservoir.
According to the HydroLAKES data set, 207 lakes and reservoirs exist in the Lake Victoria basin.From this number, only six lakes and reservoirs, including Lake Victoria itself, are resolvable given the HDMA river network (Figure 3).The upstream lakes and reservoirs are treated as natural lakes with the Döll formulation using default parameters.The outflow from Lake Victoria is based on an Agreed Curve that relates release to the lake level as quantified by Kull (2006): in which O t and h t are the outflow in cubic meters per second and the lake level in meters, referenced to 1122.86 m above mean sea level, for a given time.This formulation can be encapsulated by the emergency spillway formulation, Equation-B4, from HYPE implementation.Therefore, the default parameters of the Agreed Curve are 66.3, 7.96, and 2.01 for parameters Q emg, rate , E emg , and P emg , respectively.For the years 2004 and 2005, the lake level dropped to historically low values.This lake level drop is attributed equally to reduced precipitation over the lake and excess outflow at the dam, potentially driven by increased hydropower demand (Kipyegon Bosuben et al., 2022;Vanderkelen et al., 2018).There are no lake outflow observations publicly available, which makes it challenging to estimate the relationship between lake level and outflow from the lake post-2000.
The case study is designed to illustrate the capacity of mizuRoute-Lake in exploring the interaction between hydraulic parameters and climate variables on lake storage and outflow simulations.It is often the case that lake parameters are inferred, with some forms of calibration techniques, embedded in the hydrological or land surface model given an input forcing of precipitation, potential evaporation, and observed streamflow at hydrometric gauge locations (similar to the study by Zajac et al., 2017).In the case of Lake Victoria, the simulation was performed from 1979 to 2013.The pre-2000 period is simulated using the default values for the Agreed Curve while the post-2000 is simulated with perturbations of hydraulic and climatic variables to evaluate the effects of both.Two scenarios are considered.potential evaporation are also scaled, alongside parameters P emg and Q emg, rate , to explore the impact of meteorological and hydrological variability on the lake-level behavior simultaneously.The two newly introduced parameters for scaling of precipitation and potential evaporation are A scale, Prec and A scale,Ep .Their ranges are set between 0.6 and 1.4, deflating or inflating the input values for precipitation and potential evaporation over Lake Victoria.Given the dominant role that precipitation directly onto Lake Victoria has in lake input fluxes, runoff values for upstream sub-basins are not perturbed.
1000 simulations are performed for the post-2000 period for each scenario with randomly selected parameters from the predefined parameter space (uniform across each parameter).The goodness of fit is evaluated, using E NSE , comparing simulated lake level time series to the remotely sensed DAHITI data set (Schwatke et al., 2015).
To understand the sensitivity of parameters for each scenario, a global sensitivity analysis is performed on the parameters described above.The global sensitivity analysis is carried out using the VISCOUS method that can estimate the order of sensitivity for parameters given any strategy of sampling from parameter space (Liu et al., 2024;Sheikholeslami et al., 2021).In addition to sensitivity analysis, uncertainty analysis for lake volume and outflow are evaluated for behavioral simulations.Simulations and their associated parameters with E NSE values higher than 0.5 are considered behavioral (best performance in the scale of E NSE is 1).
Figures 4a and 4b depicts lake level and outflow simulations of behavioral parameter sets respectively.The simulation based on the Agreed Curve results in a higher lake level than the DAHITI observed data.This clearly shows the difficulty of representing changes in operational human decisions, as the Agreed Curve and its parameters (Equation-2) are no longer representative of the lake level and lake outflow relationships after the year 2000.Figure 4b depicts the outflow from Lake Victoria.The Agreed Curve default parameters and scenario-A result in an overall higher outflow from Lake Victoria while scenario-B, due to its degree of freedom, results in much larger outflow simulation ranges.This is expected as the outflow is not used to constrain behavioral parameters in this case study.In addition, perturbation of precipitation and evaporation resulted in a much larger interplay and therefore increased the range of uncertainty for the outflow from Lake Victoria.Figures 4c and 4d indicate the behavioral parameter sets for P emg and Q emg, rate of the HYPE formulation for scenario-A and -B.The best-performing simulation for scenario-A and -B are 0.61 and 0.86 respectively, in scale of E NSE .The comparison between behavioral parameters of scenario-A and -B indicates that often inferred or calibrated hydraulic parameters are intertwined with the nature of forcing, in this case, precipitation and evaporation.This increasingly makes it harder to infer the effective or behavioral lake parameters only based on one deterministic precipitation and evaporation data set.Table 1 indicates, for scenario-A and -B, the parameter P emg is the most sensitive in its first and total effects.The difference between the first order and total order sensitivity of parameter Q emg, rate for scenario-B hints at the interaction of the lake water balance model to climate forcing.Note.The sensitivity analysis assesses the impact of each parameter on the variance of model simulation which in this study is represented by evaluating lake level simulation against the DAHITI data set using the Nash-Sutcliffe Efficiency (E NSE ).Sensitivity is quantified in terms of first-order and total-order sensitivity.First-order sensitivity measures the direct impact of an input variable (in this case, parameters) on model output, while total-order sensitivity evaluates both the direct and interaction effects of the input variable.
This case study highlights the challenges associated with estimating the water balance and outflow from Lake Victoria, stemming from uncertainties in climatic variables and changes in operational rules at hydropower stations near Jinja.mizuRoute-Lake serves as a platform to explore this complexity.Expanding upon these findings could involve incorporating more complex precipitation or evaporation data sets (e.g., EM-Earth by Tang et al., 2022) or exploring various formulations of lake outflow and its parameters while encompassing larger parameter ranges.

Evaluation of the Impact of River Network Density, the Number of Resolvable Lakes and Reservoirs, and Water Balance Models on Streamflow Simulations
Our third case study focuses on the South Saskatchewan River with a total drainage area of 141,000 km 2 .The South Saskatchewan River headwaters are in the Canadian Rockies in the province of Alberta.Bow River passes through the city of Calgary.Two major rivers, Bow and Oldman merge before the city of Medicine Hat in Alberta to form the South Saskatchewan River.The South Saskatchewan River, after the confluence with the Red Deer River, flows to the province of Saskatchewan.It flows to Lake Diefenbaker, with a total volume of 9.4 10 9 m 3 , created by Gardiner Dam.South Saskatchewan River then passes through the city of Saskatoon after which it joins with the North Saskatchewan River and flows into a system of lakes and reservoirs in the province of Manitoba, draining into Hudson Bay.
For this regional application, we evaluate the effect of river network topology density and the number of resolved lakes and reservoirs on streamflow simulation.Two river network topologies are used; HDMA, and MERIT-Basins (Lin et al., 2019).MERIT-Basins is 10 times denser than the HDMA data set and resolves 70 water bodies while the HDMA resolves only 7 water bodies in this basin.Figure 5 illustrates the differences in the river network topology and resolvable lakes and reservoirs.
The river and lake network topologies, based on the two different network topologies, can be loaded with various lake and reservoir information.Given suggested parameters from past studies, four scenarios are defined.
1. Category 1 -No lake: Lakes are not resolved on the network topology.2. Category 2 -Döll: All resolvable water bodies are considered to be natural lakes and simulated using the Döll formulation with prescribed parameters described in the global case study.3. Category 3 -Döll + HYPE: 11 water bodies are treated as reservoirs.These are the reservoirs for which the HYPE parameters are suggested from earlier studies (Andersson et al., 2015;Stadnyk et al., 2020;Tefs et al., 2021).The rest of the lakes or reservoirs are considered natural with Döll formulation.4. Category 4 -Döll + HYPE + Hanasaki: One reservoir, Lake Diefenbaker, is parameterized using the Hanasaki scheme (based on suggestion global parameters from earlier studies; for more detailed information refer to Vanderkelen et al., 2022).The rest of the reservoirs are either parameterized using HYPE formulation, if the parameters exist for water bodies, or Döll formulation in case no information is available.
The four categories and two river network topologies result in 8 combinations of model configurations and simulations.The simulations for these combinations are visualized for the cities mentioned in Figure 5, namely Banff, Calgary, Medicine Hat, and Saskatoon.The simulations of streamflow at these cities are depicted in Figure 6.For the most upstream locations, such as the town of Banff, streamflow remains similar across categories, indicating a limited impact of lakes and reservoirs.However, for more downstream locations, the difference between the simulation of various categories grows larger.This is especially clear for river flow at Saskatoon where the different categories exhibit significant differences in streamflow (Figures 6m-6p).It is noteworthy that the impact of river network topology is minimal as the coarser HDMA river network topology resolves the larger and more disruptive lakes and reservoirs.The most significant differences at regional scales result from the choice of lake and reservoir models and their associated parameters.In addition, and due to the lack of parameter calibration in this case study, the inclusion of more complex lake and reservoir models may not necessarily result in improved hydrograph simulations compared to observations.This in turn questions the applicability of using suggested off-the-shelf lake and reservoir parameters from earlier studies.

Simulations of Lake Diefenbaker for the Flood of 2013
To illustrate the relevance of the mizuRoute-Lake and reservoir scheme for local scale applications, we provide an example of Gardiner Dam and Lake Diefenbaker on the South Saskatchewan River.We specifically focus on the flood of June 2013 in which intense rainfall and rapid snowmelt in the Canadian Rockies caused flooding (Vionnet et al., 2020).The streamflow at Saskatoon during this flood was as high as 2300 m 3 s 1 (normal June streamflow is approximately 100 m 3 s 1 ).We utilize mizuRoute-Lake to address the question: "How would alternative operations of Gardiner Dam have affected peak flow at Saskatoon during the flood of June 2013?"An understanding of how different dam operations could have altered flood probabilities may be helpful for water managers in terms of planning to mitigate against future floods.
Here, the data-driven capacity of the mizuRoute-Lake is used.mizuRoute-Lake is forced with discharge estimation at hydrometric stations upstream of Lake Diefenbaker and the parametric lake models are replaced with the target lake volume, which mizuRoute-Lake replicates by adjusting the reservoir outflows.The Water Survey of Canada measures the upstream streamflow at two locations, South Saskatchewan at Medicine Hat and Red Deer at Bindloss (station IDs of 05AJ001 and 05CK004 respectively).The Red Deer River drains into the South Saskatchewan River and the South Saskatchewan River flows into Lake Diefenbaker.Many other local tributaries directly flow into Lake Diefenbaker.Among these tributaries, the Swift Current River is the biggest contributor (station ID of 05HD039).Lake Diefenbaker has two outlets, one outlet is the main outlet on the natural outflow path to the south Saskatchewan River, which includes two sets of spillways from Gardiner Dam, large emergency spillways for flood mitigation, and primary spillways that are used for hydropower generation and regulating flow for agricultural use.The streamflow from Gardiner Dam is measured at Saskatoon (station ID 05HG001).The secondary outlet from Lake Diefenbaker is a canal that drains from the Qu'Appelle Dam with limited capacity, on the order of 10 m 3 s 1 , in comparison to the main reservoir outlet at Gardiner Dam which can flow at rates of 1000s m 3 s 1 .The outflow from this secondary outlet is measured at station 05JG006.In addition to the inflows to Lake Diefenbaker, Lake Diefenbaker storage can be approximated using elevation-storage relationships from the elevation measured at Gardiner Dam (station ID 05HF003).The information on the network topology, along with the location of stations, is provided in Figure 7.Note that the network topology presented in Figure 7 resembles the topology of water management models (such as WEAP); this example is used to illustrate the potential to couple mizuRoute with existing water management models (online or offline).
In the first step, the river routing parameters of the impulse response function, velocity, and diffusivity, are calibrated to match the historical streamflow observations.The observed streamflow at upstream stations is injected into the river segments at the upstream station location.The abstraction from Qu'Appelle Dam is extracted as a demand term from Lake Diefenbaker.The historic volume of Lake Diefenbaker is provided as the target volume to emulate the historical conditions.Calibration of velocity and diffusivity parameters is performed using the pygmo Python package and its parallel computing capacity (Biscani & Izzo, 2020).The algorithm used for this optimization is Particle Swarm Optimizer (PSO) from the pygmo library of algorithms.The interaction between the diffusivity and velocity reveals a significant trade-off, where various combinations of velocity and diffusivity result in similar E NSE scores for Saskatoon streamflow (equifinality or non-uniqueness).Based on the field notes and observation during the flood event of June 2013 by the Water Survey of Canada, the velocity is recorded at 2.4 m s 1 during the Saskatoon flood.The corresponding diffusivity is approximately 13,000 m 2 s 1 .
The second step involves the creation of scenarios for reservoir management during the 2013 flood.Scenarios are built based on four criteria.Water Resources Research 10.1029/2022WR032400 1.The magnitude of the temporary drawdown of Lake Diefenbaker: During the flood event, the reservoir operator decreased the reservoir level prior to the flood wave arrival.We explore alternative scenarios in which the temporary drawdown of the lake level is up to 1.5 m, in increments of 0.25 m, lower than during the 2013 flood event.2. The date of maximum drawdown of Lake Diefenbaker: The lowest water level due to the temporary drawdown was on the 24th of June 2013, however, this is changed between the 22nd to 25th of June to explore its effect on streamflow at Saskatoon.3. How early the drawdown starts: This includes how early the scenario departs from historical values to the minimum water level during the temporary drawdown.It is assumed that this departure happens as early as the fifteenth of June (forecasting lead time).

How fast the operation goes back to historical observation after drawdown:
The faster the water level of Lake Diefenbaker is raised after the drawdown, the more water is stored and therefore the peak flow at Saskatoon is reduced.It is assumed that the water level of Lake Diefenbaker follows historical records on a date between the 22nd and the 30th of June.
Figure 8 depicts the water level scenarios, associated volume or storage scenarios, and their impact on peak flow at Saskatoon.For example, scenario 18-25-28-1.0 (green line) defines the pathway in which the temporary drawdown departs from historic water level observation on the eighteenth of June, the maximum temporary drawdown happens on the 25th of June with 1 m lower water level than historical recourse.The water level goes back to historical observation on the 28th of June.
The highest reduction in the peak flow at Saskatoon is scenario 15-23-30-0.75 with the maximum temporary drawdown of 0.75 m compared to what historically happened.This scenario decreases the peak flow estimate at Saskatoon by about 500 m 3 s 1 .Note that the scenarios constructed here are ad hoc and are meant to illustrate the capabilities of mizuRoute-Lake datadriven methods to provide potential flood control guidance and may not be operationally feasible.More elaborate and carefully designed scenarios, including accounting for uncertainties of streamflow measurement during the flood event or with higher degrees of freedom such as initial water level, can be exploited given the existing setup of this case study.mizuRoute-Lake can be used while utilizing more realistic reservoir operation scenarios based on expert knowledge to comprehensively evaluate and optimize management strategies for future flood events.

Discussion and Concluding Remarks
We have presented the implementation of lake and reservoir water balance models in a vector-based, routing scheme, mizuRoute-Lake.Three parametric water balance models are implemented in mizuRoute-Lake (Döll, HYPE, and Hanasaki) as well as a data-driven capability to simulate lake and reservoir volume from external sources (observations, remote sensing water management models, or Earth system models).mizuRoute-Lake therefore can be seen as a platform where various water balance models for lakes and reservoirs can be formulated and compared within a unified framework.
A main feature of mizuRoute-Lake is the seamless alternation between various water balance models for lakes and reservoirs which can help evaluate the river routing and lake water balance parameters when driven by runoff fields from various hydrological and land models.Additionally, the computational efficiency of mizuRoute as a result of its domain decomposition (Mizukami et al., 2021), allows the simulation of global, regional, and local streamflow in a computationally efficient manner.This computational efficiency facilitates the transfer or exploration of river and lake parameters across various scales and for different levels of detail in terms of the representation of river-lake network topology and runoff fields.mizuRoute-Lake can serve as an infrastructural Water Resources Research 10.1029/2022WR032400 connection between hydrological or land model simulations and water management demands, including irrigation requirements, enabling further exploration of their interaction (Vanderkelen et al., 2022).
Our findings from the case studies can be summarized as.
• At the global scale, the inclusion of lakes and reservoirs, improves the agreement between the streamflow simulations and observations.• At the regional scale, mizuRoute-Lake capabilities are utilized to examine the effects of uncertainty in climatic variables and operational changes on lake volume and outflow in the Lake Victoria case study.• The South Saskatchewan case study indicates the importance of the inclusion of larger lakes and reservoirs in river network topologies and that the choice of lake and reservoir water balance schemes in simulated streamflow is important.• The local case study of Lake Diefenbaker highlights the potential of mizuRoute-Lake in simulating lake outflow based on multiple scenarios in data-driven mode.It also highlights the difficulties in closing the water balance of large lakes even when information on various water balance components, inflow, outflow, and storage changes is available.
More work is needed to evaluate and scrutinize the adequacy of the water balance models presented in this work within the context of Earth system modeling.The capability of modeling the water balance of lakes and reservoirs in mizuRoute-Lake will allow for detailed future studies of the interplay between water availability, climate change, and human water management.

Appendix A: Döll
The least complex lake model in mizuRoute-Lake is the Döll formulation (based on Döll et al., 2003;Hanasaki et al., 2006).The Döll formulation is based on linking the outflow from the lake to the ratio of active storage to its maximum active storage through a power function.Parameters, states, and fluxes are presented in Table A1.
For the case when the storage is larger than inactive storage, S 0 < S, the outflow is calculated as: Outflow is set to zero, O = 0, when the storage is equal or smaller than inactive storage, S ≤ S 0 .

Appendix B: HYPE -Hydropower
The following formulations describe HYPE's representation of a hydropower reservoir.Parameters, states, and fluxes are presented in Table B1.
For hydropower reservoirs, a sinusoidal function defines the target hydropower production outflow.This function is shifted in the time given a day of the year, B phase , as flow: Next is to define the limiting factor when the lake elevation is between E prim and E lim .The following equation provides the linear scaling when hydropower production is restricted.Also when the water level, E, is lower than E prim , F lim becomes zero and when E is higher than E lim , F lim becomes one.
The production outflow for hydropower can be calculated as: If reservoir elevation, E, if larger than E emg , the emergency spillway is activated: and the outflow is either the maximum of the Q emg and Q main in mizuRoute or their summation:  Water Resources Research 10.1029/2022WR032400 Note that in the above implementation, which is the hydropower formulation of HYPE lake and reservoir, the primary spillway is assumed to be the same as turbine intake.If additional formulations from HYPE are presented then turbine(s) intakes and primary spillway together with their associated parameters can be differentiated.More elaborated HYPE lake and reservoir formulations for irrigation, water supply, and flood control can be implemented in mizuRoute-Lake if desired.

Appendix C: Hanasaki With Memory
The following presents Hanasaki's formulation as described by Hanasaki et al. (2006).All hard-coded values in the Hanasaki formulation are defined as parameters.This allows users to either utilize suggested or default values, or test other values for sensitivity and uncertainty analysis.Additionally, in the mizuRoute-Lake representation of the Hanasaki formulation, a memory for lake or reservoir inflow and demand is included.Originally, these fluxes are defined by parameters that remain constant over time.However, due to changes in hydrological and meteorological conditions upstream or water abstraction, users may want these parameters to be adjusted during longer simulation periods in mizuRoute.To address this, we have enabled memory in the Hanasaki formulation, allowing the inflow and demand parameters to be adjusted over a past window of simulation, for example, every 5 years.Parameters, states, and fluxes are presented in Table C1.
The first step is to initialize the memory and demand matrices if the memory flag for one or both inflow and demand is activated.The size of the memory matrices is 12 rows (representing the number of months) and a variable number of columns determined by the simulation time step and the specified number of years of memory  S init Auxiliary Parameter For the current simulation, it is possible that the simulation starts from a month which is different from the first month of Hanasaki formulation, an initial storage parameter that represents the past year storage at the first Hanasaki month is needed.This is different from initial storage for the restart of the model.In Hanasaki, the first month is defined as the month that monthly inflow surpasses mean yearly inflow in which i is the month of the year (January to December or 1 to 12).In case any of the memory flags, F i or F d , are activated at each simulation time step the irrigation demand (given as a time series to the model) and inflow which is simulated internally by mizuRoute from the upstream contributing area, the last column of the matrix is removed, columns are shifted for one time step and new simulation, from the current simulating time, are added.This way we keep track of past inflow and demand for each reservoir in case if deemed necessary.If the memory is activated by setting the flags for inflow, demand, or both of them, the inflow and demand parameters of Hanasaki are updated every time step averaging the past record in the memory matrix depending on the length of the months and simulation temporal resolution (some months are shorter than others).This allows the Hanasaki inflow and demand parameters to be variable in time reflecting the change in the amount of runoff from the basin and also the demand for irrigation for example, The window of memory remains to be explored and can vary for situation, models, and purpose of study while Vanderkelen et al. (2022) used a 5-year memory for both inflow and demand.
In the following step, the yearly average from the inflow and demand parameters are calculated: Target discharge is determined under various conditions.When the reservoir does not serve an irrigation purpose (flag F p set to zero or false), the target discharge (Q target ) is simply the annual inflow (I y ).
For irrigation reservoirs (flag F p set to one or true), if the annual demand (D y ) exceeds a certain fraction (β) of the annual inflow (I y ), the target discharge is calculated as: However, if the annual demand is less than the specified fraction of the annual inflow, the target discharge is determined as: The ratio of storage to the active portion of maximum storage, E r , is calculated as: Water Resources Research 10.1029/2022WR032400 The ratio of total storage to inflow to define the within-a-year or multi-year nature of a reservoir is calculated based on: Finally, the reservoir outflow for a multi-year reservoir (0.5 < c) can be calculated as: while for a within-a-year reservoir (c ≤ 0.5) the outflow can be calculated by:

Figure 1 .
Figure1.Illustrative examples of various river and lake/reservoir network topology for (a) a coarse resolution grid-based configuration where smaller lakes are unresolved in the river network, (b) a finer resolution grid-based configuration with smaller lakes that are resolved on the river network, (c) a vector-based configuration with lakes, using a low-density river network with unresolved smaller lakes, and (d) a vector-based configuration with higher density river network resolving smaller lakes on the river network.The blue and gray color indicates resolved and unresolved lakes and reservoirs on the river network topology.Resolvable lakes and reservoirs are water bodies that can be simulated within the river network topology with inflow and outflow points, while unresolvable lakes and reservoirs are often referred to as in-basin or in-grid lakes and reservoirs and are simulated in isolation or with indirect links to river network.
input fluxes Kipyegon Bosuben et al., 2022), (b) a lack of upstream discharge measurements from the many Water Resources Research 10.1029/2022WR032400 GHARARI ET AL.

Figure 2 .
Figure 2. The impact of lakes and reservoirs on river segment streamflow is assessed using the HDMA river network topology and includes 4,236 resolvable lakes and reservoirs from the global HydroLAKES data set.The impacts of lakes and reservoirs, simulated using Döll formulation, are depicted as the absolute difference in maximum discharge values with and without lakes for the period of 1985-2010 (ΔQ in m 3 s 1 ).The figure inset illustrates the improved performance of streamflow simulation compared to the Global Runoff Data Center (GRDC).Approximately 75% of the 1600 GRDC gauges downstream of resolvable lakes and reservoirs exhibit improved performance based on E NSE for the period 1985 to 2010 (with a perfect E NSE being one).Lakes and reservoirs are identified by blue.

Figure 3 .
Figure3.The river and lake network topology for Lake Victoria.The blue and gray water bodies indicate the resolved and unresolved lakes and reservoirs upstream of Lake Victoria respectively.From 207 lakes and reservoirs based on the HydroLAKES data set in the Lake Victoria basin, only 6 are resolvable over the HDMA river network topology.The red river segment indicates the Lake Victoria outlet where the hydropower plants are located near Jinja.

Figure 4 .
Figure 4. (a) Simulation of Lake Victoria level fluctuation; using default values for agreed curve, perturbation of hydraulic parameters, scenario-A, perturbation of climatic and hydraulic variables, and scenario-B in comparison to remote sense DAHITI data set.(b) Simulation of Lake Victora outflow; using default values for agreed curve, perturbation of hydraulic parameters, scenario-A, perturbation of climatic and hydraulic variables, scenario-B.Comparison of the sampled parameter sets and their performance in the scale of Nash Sutcliffe, E NSE , above 0.5 for (c) scenario-A and (d) scenario-B.AMSL is an abbreviation for "above mean sea level".

Figure 5 .
Figure 5. (a) Representation of resolvable lakes and reservoirs in the South Saskatchewan River upstream of Saskatoon for (a) HDMA and (b) MERIT-Basins river network topologies.A zoom-in to the Lake Diefenbaker area is provided for (c) HDMA and (d) MERIT-Basins river network topologies respectively based on panels (a) and (b).

Figure 6 .
Figure 6.The comparison of simulations from HDMA and MERIT-Basins and the four categories of lake and reservoir parameterizations.HDMA and MERIT-Basins simulations are presented in blue and red respectively in comparison to the observed streamflow in black for (a-d) Bow at Banff (e-h) Bow at Calgary (i-l) South Saskatchewan River at Medicine Hat, and (m-p) South Saskatchewan River at Saskatoon for the four categories of lakes and reservoirs simulations.

Figure 7 .
Figure 7. Illustration of Lake Diefenbaker and configuration of river-lake network topology and water level and streamflow measurement stations.

Figure 8 .
Figure 8. Historical observation and constructed scenarios for (a) Lake Diefenbaker water level, (b) Lake Diefenbaker storage, and (c) streamflow at Saskatoon.Historical observations alongside three scenarios are highlighted in color in contrast to all the other scenarios.Observational time series are depicted in blue while the examples of three scenarios are shown in other colors (in contrast to all the scenarios which are depicted in gray).AMSL is an abbreviation for "above mean sea level".
the memory in years for inflow (should be an integer) the memory in years for demand (should be an integer)

[
or L dm ).At each model time step, the memory is shifted forward by one time step, and new inflow or demand values are added:M inflow [i,2 : end] = M inflow [i,1 : end 1] (C1) M inflow [i,1] = I (C2) M demand [i,2 : end] = M demand [i,1 : end 1] starting month of simulation is different from Hanasaki's defined first month S should be replaced by S init ; In Hanasaki formulation, the first month is defined as the month that monthly inflow surpasses mean yearly inflow (I y ≤ I m ) [m 3 ].

Table 1
Parameters and Their Respective Ranges, Alongside Their Behavioral Ranges and Sensitivities for Scenario-A and Scenario-B