Use of Abstraction and Discharge Data to Improve the Performance of a National‐Scale Hydrological Model

Across the UK, water abstracted from ground, surface, and tidal stores is regulated through a system of licenses to protect both the sources and the environment. Similar permits are required for discharging wastewater to rivers or onto the ground. These abstractions and discharges can have a significant impact on UK rivers, but measurements are not readily available, which discourages their use in hydrological models of river flows. However, these unique data sets provide a means to improve the performance of spatially distributed hydrological models, particularly during periods when abstraction regulations change and at ungauged river locations. To demonstrate this, point source abstraction and discharge measurements across England have been transformed into 1 × 1 km resolution gridded data and used with an enhanced formulation of the Grid‐to‐Grid (G2G) hydrological model in which these processes are mathematically represented. A comparison of G2G‐simulated and gauged river flows for 605 catchments across England between 1999 and 2014 indicates that model simulation of river flows is generally improved at gauged locations downstream of abstraction/discharge sites. The main improvement is in the simulation of low flows, for which the median performance is improved by 10.7%; the impact on simulation of high river flows is more modest (1.5% improvement). These results demonstrate the potential gains available to the international hydrological and land‐surface modeling community from using records of actual water use (where available) in models, in place of more widely used national statistics.

During dry summer periods, anthropogenic discharge can contribute more than 60% of the flow in some rivers, with significant implications for freshwater ecosystems and the potential for downstream abstractions (DEFRA, 2014). Despite licenses and regulations, there are indications that almost 20% of the surface water sources are overabstracted and at risk of not meeting good ecological status (DEFRA, 2019a). Maintenance of environmental flows is essential for healthy freshwater ecosystems (Castella et al., 1995; Dunbar et al., 2004; Klaar et al., 2014; Petts, 1996) and to conserve water quality (Hutchins & Bowes, 2018).
RAMESHWARAN ET AL. 10.1029/2021WR029787
Limited but good quality data are currently available for thousands of individual locations across England, consisting of monthly actual (and licensed) abstraction data, Hands off Flow (HOF) river conditions, and discharge data. These provide a baseline to understand how river flows have been affected by licensed abstraction and discharges, and enable estimates to be made of both observed and natural river flows across England. In order to meet the needs of a changing environment, the current system of regulation is under review (Abstraction Reform Report: DEFRA, 2019a) to provide efficient and secure water supply for people, businesses, and agriculture while protecting the environment. In other words, the reform is looking to facilitate sustainable use of water resources with a strong catchment focus that is responsive and flexible enough to meet future challenges. The current system of regulation has no dynamic link between the availability of water for abstraction upstream of a HOF condition and the availability of river flow in the whole catchment (EA-OFWAT, 2011). The proposed changes to managing abstraction licenses will affect both the volume of water abstracted and downstream river flows within catchments.
It is vital to incorporate anthropogenic processes such as abstraction and discharge within hydrological models to study present day and projected future impacts of water demands and climate change on river flows (Wada et al., 2013). Liu et al. (2017) note that there is an urgent need for good quality data on human water use in order to understand the impact on the hydrological cycle, and also a need for improved mathematical representation of such processes within models. Water managers would also benefit from improved management techniques and modeling tools to balance the often-competing demands of different users and the natural environment.

Methodological Overview
Across the UK, abstraction data are available for thousands of individual locations, each with a license for use, an amount, an indication of when abstraction can take place, and the actual amount of water abstracted (generally less than the licensed amount). Discharge data are available for permitted or consented locations, and include flow limits. Management of these licenses and permits is undertaken by the four UK regions (England, Wales, Scotland, and Northern Ireland), and each region is able to make a subset of these data available for research use under a license agreement which safeguards the confidentiality of the license holder and location. Some regions provide information on the licensed amounts, while others also provide a record of the actual amount of water abstracted each year. The work presented here focuses on England, which abstracts the greatest volume of water of the four regions (on average 16.8 billion cubic meters per year between 1999 and 2014 from all three sources: surface water, groundwater, and tidal water) and for which records are available of the actual volumes abstracted.
Conversion of thousands of individual abstraction amounts and associated locations into a single spatially consistent data set is a nontrivial task. One issue is that for many licensed uses (e.g., Fish Pass/Canoe Pass, River Recirculation, and Hydroelectric Power Generation), abstracted water is returned immediately to a river after use and thus has little overall impact on downstream river flows. A second data set provides estimates of individual discharges to rivers from sewage treatment works (STW) and other large sources. In order to estimate consistent data sets for the net abstraction and discharge of water in each 1 × 1 km grid cell in England from both surface water (rivers) and groundwater, pragmatic decisions have been made (Section 3). Figure 1 presents a schematic of the methodology used, highlighting the three main stages of the work: conversion of point abstraction and discharge data into monthly 1 × 1 km grids across England (Section 3), use of the gridded abstraction data in an enhanced G2G hydrological model formulation (Section 4), and finally, an analysis showing how use of the new data and model formulation has led to improved model simulations of gauged river flows.

Abstractions, Hands off Flow (HOF) Conditions, and Discharges
This section describes the steps used to provide a consistent set of surface water, groundwater, and tidal water abstraction data, HOF condition information, and anthropogenic discharge data on a 1 × 1 km spatial grid. The accompanying maps show the locations of abstraction licenses (surface water, groundwater, and tidal water) active at any time between 1999 and 2014 and of discharge permits during 2017, together with the locations of HOF conditions. The Environment Agency (EA) regions are also shown for reference. The maps highlight the wide spatial distribution of abstraction license and discharge permit locations across England (about 68,000 abstraction locations in total, of which 45.0% are surface water, 54.5% are groundwater, and 0.5% are tidal water; and there are about 7,200 discharge locations). It is worth noting that not all licenses permitting abstraction are used in practice. For example, in 2014, water was abstracted under only about 50% of the 34,000 licenses active in that year. Although the tidal water abstraction data are not currently used with the G2G hydrological model (Figure 1), they are included here to provide a more complete understanding of abstraction levels across England.

Abstractions
Every year, abstraction license holders in England are required to submit the recorded quantity of water they have abstracted (in most cases measured using a water meter), known as "returns", to the EA on 28th April and 28th November, depending on the terms of their license. Here, a subset of these data, consisting of the water abstraction licenses operational from 1999 to 2014 and associated monthly returns (in some cases the total within the abstraction period), was provided by the EA (under license for research purposes) based on information in the National Abstraction Licensing Database (NALD).
Each abstraction license includes information on:
• abstraction location
• authorized period of abstraction (three categories: Summer, 1 April to 31 October; Winter, 1 November to 31 March; and All year)
• source type (groundwater, surface water, or tidal water)
• licensed maximum annual and daily quantities (m³)
• primary use description (e.g., Agriculture, Energy Production, Water Supply, and Industry)
• point-purpose descriptor (number of points and uses per license).
Currently, there are 57 primary use descriptions with an associated use code and loss factor category (Table 1). The loss factor provides a high-level indication of the percentage of the abstracted water that is "lost" to water resources, that is, how much is not returned to the river/landscape by the license holder or subsequent use. Any one abstraction license does not necessarily correspond to one location or one use, so the "point-purpose descriptor" indicates the number of abstraction point locations and purposes per license. The options are (a) single point, single purpose; (b) single point, multi-purpose; (c) multi-point, single purpose; and (d) multi-point, multi-purpose. This variety makes the production of simple summary gridded data sets more challenging.
The returns data set provides very similar information to the abstraction license, but without abstraction locations. Importantly, it includes a record of actual abstraction values for each month or, in some cases, the total within the abstraction period. Yearly variations in actual abstractions are due to a variety of reasons including the weather, the issue of new licenses or changes to existing licenses, changes in the abstraction amount for different sectors, and improvements in the efficient use of water. Note that NALD does not hold returns for abstractions less than 20 m³ day⁻¹, and from 2008, abstractors with a licensed amount less than 100 m³ day⁻¹ were no longer required to submit records of abstraction.
The location-based abstraction returns data were converted to gridded data sets of monthly abstraction totals for each of the 57 primary uses (Table 1) as follows:
• Most abstraction return values were provided monthly; for licenses where only total annual volumes were available, these were divided equally over the abstraction period (start to end months).
• Any apparent inconsistencies in license returns were resolved using simple assumptions. For example, license returns submitted for multiple locations but for the same purpose were summed. Multiple repeated license returns (identical total submitted for multiple locations and/or uses) were ignored (only one value was kept). For a few licenses where the returns data contained uses not listed in the license information file, the additional uses were retained.
• For more complex licenses, which return one abstraction total for multiple points and/or uses, pragmatic assumptions have been made to ensure every abstraction point is associated with an abstraction value and one or more uses. For example, for those licenses where a single abstraction return value (and use) is associated with multiple points, the return value is divided equally between the points. Similarly, for licenses which only provide a single total return value for multiple points and multiple purposes, the return value is divided equally between the points and uses.
• The licensed abstraction return values within each 1 × 1 km grid cell are added together to derive 1 × 1 km resolution monthly grids of total actual abstracted water (m³ month⁻¹) for each of the 57 different uses for both surface water and groundwater abstraction.
• These derived monthly grids represent total water abstracted for the 57 different uses from surface water, groundwater, and tidal water sources and do not take account of water immediately returned to source by the license holder.
A further step (Section 3.4) applies to certain types of abstractions to convert the total volume of water abstracted for each use into a net volume abstracted.
Despite the pragmatic assumptions above, this new data set provides a valuable, spatially consistent record of monthly water abstractions for 57 primary uses across England between 1999 and 2014. The data set could be extended to include more recent data and/or other UK regions (Wales, Scotland, and Northern Ireland) as data become available.
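The gridding steps described above can be illustrated with a minimal Python sketch. It is not the processing code used for the study: the record layout (`volume_m3`, `points`, `uses`), the function name, and the use of easting/northing coordinates in meters are all assumptions made for illustration. It shows only the two core rules, namely equal division of a single return between points and uses, and summation within each 1 × 1 km cell.

```python
from collections import defaultdict


def grid_returns(returns, cell_size_m=1000):
    """Sum monthly return volumes onto a regular grid, separately per use.

    `returns` is a list of dicts with hypothetical keys:
      volume_m3 : one return value for the month (m3)
      points    : list of (easting_m, northing_m) abstraction points
      uses      : list of primary-use codes sharing that return value
    """
    grid = defaultdict(float)  # (column, row, use) -> m3 per month
    for rec in returns:
        # Divide the single return equally between all points and uses.
        share = rec["volume_m3"] / (len(rec["points"]) * len(rec["uses"]))
        for easting, northing in rec["points"]:
            cell = (int(easting // cell_size_m), int(northing // cell_size_m))
            for use in rec["uses"]:
                # Add together everything that falls in the same grid cell.
                grid[cell + (use,)] += share
    return dict(grid)


# Invented example: a multi-point license and a multi-purpose license.
example = [
    {"volume_m3": 1200.0,
     "points": [(450200, 210900), (451700, 210100)], "uses": ["A"]},
    {"volume_m3": 300.0, "points": [(450800, 210300)], "uses": ["A", "B"]},
]
monthly_grid = grid_returns(example)
```

Keying the grid by use as well as by cell keeps the 57 uses separable, which matters later when only some use categories are down-weighted.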

Hands off Flow (HOF) Conditions
Surface water abstraction is constrained by a HOF value (m³ day⁻¹), requiring abstraction to cease (or reduce) if the river flow falls below this threshold. This requirement is designed to prevent detrimental impact of excessive abstraction on the environment, and to protect river ecosystems during periods of low flows, particularly during drier years. This means that during drought periods when the river flow is below the local HOF threshold, the license holder will be temporarily unable to abstract their full licensed amount. For this study, HOF conditions for surface water abstractions were obtained from the EA Water Resources Geographic Information System (WRGIS; provided under license in 2017). The license number and point location in the HOF data set enabled the HOF values to be linked with the surface water license returns data obtained from the NALD. Each HOF value is associated with a location and use. For 1 × 1 km grid cells containing multiple abstraction points, uses, and HOF values, the highest HOF value within each 1 × 1 km grid cell was used. The final output data set consists of a 1 × 1 km resolution grid of HOF values (m³ day⁻¹).
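The rule of keeping the highest HOF value per grid cell can be sketched as follows. The tuple layout and function name are hypothetical and the values are invented; the sketch only illustrates the per-cell maximum.

```python
def grid_hof(hof_points, cell_size_m=1000):
    """Return {(column, row): highest HOF value in that cell}.

    `hof_points` is an iterable of (easting_m, northing_m, hof_m3_per_day)
    tuples; this layout is an assumption made for illustration.
    """
    grid = {}
    for easting, northing, hof in hof_points:
        cell = (int(easting // cell_size_m), int(northing // cell_size_m))
        # Keep the most restrictive (highest) threshold seen so far.
        grid[cell] = max(hof, grid.get(cell, hof))
    return grid


hof_grid = grid_hof([
    (450200, 210900, 5000.0),
    (450700, 210100, 8000.0),  # same cell as above, higher threshold
    (451300, 210500, 2000.0),
])
```

Taking the maximum is the cautious choice: abstraction in the cell stops as soon as flow falls below the strictest condition that applies there.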

Discharges
The anthropogenic discharge data used alongside the abstraction data consist of discharge consent information for England obtained from the WRGIS July 2017 version (provided under license in 2017). This data set represents effluents from sewage treatment works (STW) and other large (>20 m³ day⁻¹) direct discharges to the river system. It excludes natural river flows arising from rainfall-runoff processes across the upstream catchment. The data provide: discharge consent number, site name, location, purpose, consented maximum flow rate, consented dry weather flow, and recent actual discharge rate. The recent actual daily discharge rate is based on observed summer or consented dry weather flow discharge. Where recent time series of anthropogenic discharge data are available (e.g., from STW), the data provided by the EA consist of the Q95 of the time series, which is typical of a dry summer outflow to the water source. If a time series is not available, the consented dry weather flow or values derived from EA internal models are used instead in the WRGIS. The final output consists of a single annual 1 × 1 km resolution discharge data set (m³ day⁻¹) for England, derived by summing all recent actual discharge values within each 1 × 1 km grid cell.

Summary Grids and Statistics: Abstractions and Discharges
The previous three subsections provide 1 × 1 km gridded estimates of abstraction totals for surface water, groundwater, and tidal water sources, surface water abstraction constraints (HOF), and anthropogenic discharges. In practice, a proportion of the total water abstracted is returned directly to the source by the license holder (see loss-factor column in Table 1 and Section 3.1) and another proportion is included in the anthropogenic discharge data (Section 3.3). However, although the abstractions and discharges data sets are related, it has not been possible to link individual abstraction licenses and discharge permits. In other words, there is no linkage between abstraction license numbers/use descriptions and discharge consent numbers/purposes. Often, the discharge associated with an abstraction occurs at a different location or river.
The loss factor for each of the 57 primary uses is listed in Table 1 and indicates the percentage of the abstracted water that is not returned, that is, the net loss to water resources. The four loss factor categories (EA, 2020) are High (100%), Medium (60%), Low (3%), and Very Low (0.3%). For three of these categories (high, medium, and low abstraction losses) an assumption is made here that any water returned to the river is accounted for in the discharge data set. The advantage of using separate abstraction and discharge data is that discharges can take place at a different location or river to the original abstraction. This spatial discrepancy can be accounted for in the spatially distributed discharge data set used here. However, for the abstractions associated with "Very Low" losses (termed "Through Flows," e.g., Fish Pass/Canoe Pass, River Recirculation, and Hydroelectric Power Generation) the returns are so high and localized that an assumption that returns are included in discharges cannot be made. For these "Very Low" loss abstractions this uncertainty has led to the creation of two sets of gridded monthly abstraction data to represent the envelope of uncertainty. The first data set assumes all abstracted water is removed from source ("100% Abstraction"). The second, "Weighted Abstraction", also assumes 100% abstraction except for the "Very Low" loss category, where only 0.3% of the water volume is removed, mainly due to conveyance losses. When each abstraction data set is used in hydrological modeling (Section 4), returns are provided by the discharge data set (Section 3.3). The "100% Abstraction" data set represents a scenario of higher volumes of water abstracted and assumes that the statutory discharges and through flow returns are exactly as provided in the discharge data set. By contrast, the second (weighted) abstraction data set reflects a lower net volume of water abstracted and would be expected to produce a more modest reduction in downstream flows.
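The difference between the two abstraction data sets can be made concrete with a short sketch. The loss factor values are those quoted above; the function name and arguments are hypothetical, and the example volume is invented.

```python
# Loss factor categories as fractions (High 100%, Medium 60%, Low 3%,
# Very Low 0.3%), as quoted in the text.
LOSS_FACTOR = {"High": 1.0, "Medium": 0.6, "Low": 0.03, "Very Low": 0.003}


def net_abstraction(volume_m3, loss_category, weighted):
    """Volume removed from source under the two scenarios.

    weighted=False -> "100% Abstraction": the full volume is removed.
    weighted=True  -> "Weighted Abstraction": the full volume is removed,
                      except for "Very Low" loss uses, where only 0.3% is.
    """
    if weighted and loss_category == "Very Low":
        return volume_m3 * LOSS_FACTOR["Very Low"]
    return volume_m3


# A hydropower-type through flow of 1,000 m3 under each scenario.
full = net_abstraction(1000.0, "Very Low", weighted=False)
reduced = net_abstraction(1000.0, "Very Low", weighted=True)
```

Note that the Medium and Low factors are deliberately not applied in either scenario, because returns for those categories are assumed to appear in the discharge data set.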
To ensure that surface water abstraction and discharges are applied to rivers, values for surface water abstraction and discharge that are located on land are moved downstream until they are located on a river grid cell (defined in Section 4.2). This ensures that surface water abstractions are removed from channels with sufficient water to enable abstraction to take place. In practice, most surface water abstractions and discharges were already located on river grid cells and only a minority of abstraction locations were moved downstream, generally by no more than 5 km.
Figure 3 summarizes abstractions from all three sources (surface water, groundwater, and tidal water) with associated processes including discharges for primary use sectors. The average (1999-2014) abstraction amount for each source and sector is also provided. On average, tidal waters are the dominant source (47.9%) of total abstracted water, followed by surface waters (39.5%) and groundwater (12.6%), though the total water abstracted will vary from year to year. The dominant use sectors for each source (with abstractions greater than a billion cubic meters per year) are Water Supply, Energy Production, and Agriculture for surface water abstraction, Water Supply for groundwater abstraction, and Industry for tidal water abstraction.
The discharge total (for 2017) shows that about 40% of the total mean abstraction (1999-2014) from all three sources (surface water, groundwater, and tidal water) is returned to surface water or tidal water. The rest is either lost in the system (consumption, evaporation, transpiration, and conveyance loss) or reaches the system by other means (through flows and infiltration).
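The relocation of off-river abstraction and discharge points described earlier in this section can be sketched as a walk along the flow-direction network. The dictionary of downstream neighbors, the set of river cells, and the step limit are all hypothetical stand-ins for the model's own drainage data.

```python
def snap_to_river(cell, downstream, river_cells, max_steps=50):
    """Follow flow directions from `cell` until a river cell is reached.

    downstream  : dict mapping each cell to its downstream neighbor
    river_cells : set of cells classed as river
    Returns (river cell, number of cells moved).
    """
    steps = 0
    while cell not in river_cells:
        if cell not in downstream or steps >= max_steps:
            raise ValueError("no river cell found downstream")
        cell = downstream[cell]
        steps += 1
    return cell, steps


# Invented 1 km cells: two land cells draining to a river cell.
flow_dir = {(3, 7): (3, 6), (3, 6): (4, 5)}
rivers = {(4, 5)}
snapped, moved = snap_to_river((3, 7), flow_dir, rivers)
```

A point already on a river cell is returned unchanged with zero steps, which matches the observation that most locations did not need to be moved.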

Background: The Grid-To-Grid (G2G) Hydrological Model
The Grid-to-Grid (G2G) is a national-scale hydrological model that provides estimates of river flows, runoff, and soil moisture on a 1 × 1 km grid across Great Britain (Bell et al., 2009; Moore et al., 2006). The G2G model formulation represents the processes of runoff-production and flow routing over a wide area, and across Great Britain it is typically run with a time-step of 15 min. The G2G is used operationally for countrywide forecasting over England and Wales by the Flood Forecasting Centre (Price et al., 2012), and over Scotland by the Scottish Flood Forecasting Service (Maxey et al., 2012). Limited information on abstractions/discharges was used in the configuration of G2G across England and Wales, but none was used in its configuration for the Scottish Flood Forecasting Service. The G2G is also used to assess the impact of climate change on floods (Bell et al., 2012, 2016), low flow frequency (Kay et al., 2018), and droughts (Rudd et al., 2017, 2019). A particular advantage of G2G is that it has one spatially consistent configuration for the whole model domain and is able to represent a wide range of hydrological regimes due to the use of spatial data sets of terrain, soil/geology, and land cover in its construction. G2G estimates a value of river flow for every 1 × 1 km grid cell across Great Britain and does not require at-site calibration using flow observations. Until now, the river flow estimates produced by G2G have more readily represented natural (rather than observed) flows since they do not take detailed account of abstractions and discharges. The following section describes the set of model enhancements now implemented to accommodate the use of spatially distributed abstraction and discharge data.

Modifications to the Hydrological Model
The G2G model is modular in form and distinguishes between runoff-production and lateral routing of runoff to form river flow. The runoff-production scheme divides the landscape into square grid cells of vertical soil columns that are subject to precipitation and evaporation gains and losses. Some of the rainwater entering the soil column drains laterally to adjacent grid cells, while saturation-excess flow contributes to surface runoff. Water also moves downwards via percolation and drainage, thereby contributing to groundwater (sub-surface) flow. The G2G is often run using an optional snowmelt component (Bell et al., 2016), but that option has not been used here.
In G2G, the lateral flow routing along surface and sub-surface pathways employs kinematic wave equations, applied in one dimension over a two-dimensional river network. Two formulations are used and both belong to the class of models based on the Horton-Izzard equation or nonlinear storage model. This family of equations assumes that outflow, $q$, is related in a nonlinear way to the volume of water in store, $S$, such that $q = kS^{m}$ with parameters $k > 0$ and $m > 0$. The routing schemes used in the G2G take the following general form
$$\frac{dq}{dt} = a\,(u - q)\,q^{b}, \qquad (1)$$
where $u$ is the lateral input of water to the store, and $a = mk^{1/m}$ and $b = 1 - 1/m$ are model parameters (Dooge, 1973; Moore, 1999; Moore & Bell, 2002).
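The general form can be recovered from the storage relation in a few lines. The sketch below assumes that the store obeys the continuity equation $dS/dt = u - q$, which is not written out in the text above but is the usual starting point for a nonlinear storage model.

```latex
\begin{aligned}
q &= kS^{m} \;\Rightarrow\; S = (q/k)^{1/m}, \\
\frac{dq}{dt} &= mkS^{m-1}\,\frac{dS}{dt}
              = mk\,(q/k)^{(m-1)/m}\,(u - q) \\
              &= mk^{1/m}\,q^{\,1-1/m}\,(u - q)
              = a\,(u - q)\,q^{b},
\qquad a = mk^{1/m},\quad b = 1 - \tfrac{1}{m}.
\end{aligned}
```

Under that assumption, a linear store ($m = 1$) gives $b = 0$ and $a = k$, while a quadratic store ($m = 2$) gives $b = 1/2$ and $a = 2k^{1/2}$.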
In the G2G model, flow routing of surface and sub-surface runoff takes place in surface and sub-surface pathways, over land and river cells, linked by a return flow term representing water transfer between sub-surface and surface pathways (Bell et al., 2007, 2009; Moore et al., 2006). River cells are identified in terms of a critical drainage area or river network density beyond which flow in a 1 km grid cell is assumed to be via a river. The flow routing scheme can take a range of options. Here, for surface river channels a quadratic storage model (m = 2) is employed, which is applied to a varying width channel network (Ciarapica & Todini, 2002; Moore, 2007), and elsewhere (subsurface land and river cells, and surface land cells) a linear storage model (m = 1) is used (Bell et al., 2007).
Data sets for gridded monthly abstraction and daily rate of discharge water volumes (Section 3) are used to include abstraction/discharge in the lateral input of water, u, which was previously confined to surface and subsurface runoff and the return flow. The model schematic in Figure 1 illustrates how abstraction and discharge data are used within the G2G hydrological model. The monthly abstraction (m³ month⁻¹) and daily discharge rate (m³ day⁻¹) are divided equally between the 15 min G2G routing model time-steps.
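The equal division of monthly and daily volumes over 15 min time-steps amounts to simple arithmetic, sketched below. Using the calendar length of each month to count time-steps is an assumption made here; the text does not state how month length is handled. Function names are hypothetical.

```python
import calendar

STEPS_PER_DAY = 24 * 60 // 15  # 96 fifteen-minute time-steps per day


def abstraction_per_step(monthly_m3, year, month):
    """Monthly abstraction volume (m3) shared equally over all time-steps.

    Assumes the number of days is taken from the calendar month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_m3 / (days_in_month * STEPS_PER_DAY)


def discharge_per_step(daily_m3):
    """Daily discharge volume (m3) shared equally over one day's time-steps."""
    return daily_m3 / STEPS_PER_DAY


# January 2014 has 31 days, i.e. 31 * 96 = 2,976 time-steps.
jan_step = abstraction_per_step(2976.0, 2014, 1)
day_step = discharge_per_step(96.0)
```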

For the routing scheme presented in Equation 1, surface flow for river grid cells $q_r$ has lateral inputs $u$ increased through the addition of anthropogenic discharge, $Q_s$, and decreased by surface water abstraction, $A_s$, thus taking the form
$$u = u_r + R_r + Q_s - A_s,$$
but for surface flow for land grid cells $q_l$, the lateral inputs $u$ remain the same:
$$u = u_l + R_l.$$
Here, $R_l$ and $R_r$ denote land and river return flow respectively, and $u_l$ and $u_r$ are inflows for land and river grid cells, respectively (which include the runoff generated by a runoff-production scheme). The surface water abstraction amount $A_s$ is only subtracted if the local river flow $q_r$ is above the HOF threshold.
For the subsurface, where the subscript $b$ denotes subsurface (baseflow) pathways, the lateral inputs $u$ to subsurface river grid cells take the form
$$u = u_{rb} - R_r - A_g.$$
Here, flow inputs are reduced by the groundwater abstraction $A_g$ if there is sufficient flow available; otherwise the excess abstraction volume is subtracted from the baseflow stores of neighboring grid cells. Similarly, groundwater abstraction $A_g$ is subtracted from the lateral inputs to subsurface land grid cells, $u$, using
$$u = u_{lb} - R_l - A_g$$
if there is sufficient water available; otherwise the excess abstraction volume is subtracted from the baseflow stores of neighboring grid cells.
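The conditional logic in the two paragraphs above can be sketched in Python. This is an illustration rather than the G2G implementation: the function names are hypothetical, all quantities are assumed to share the same units over one time-step, and the subsurface function takes an inflow that is assumed to be already net of return flow, so that no sign convention for the return flow needs to be chosen here. How the excess is shared among neighboring cells is not specified in the text, so the sketch simply reports it.

```python
def surface_river_input(u_r, R_r, Q_s, A_s, q_r, hof):
    """Lateral input to a surface river cell.

    Discharge Q_s is always added; abstraction A_s is subtracted only
    when the local river flow q_r is above the HOF threshold.
    """
    abstraction = A_s if q_r > hof else 0.0
    return u_r + R_r + Q_s - abstraction


def apply_groundwater_abstraction(inflow, A_g):
    """Reduce a subsurface inflow by groundwater abstraction A_g.

    Returns (reduced inflow, excess abstraction). The excess is the part
    that could not be met locally and would be taken from the baseflow
    stores of neighboring cells.
    """
    taken = min(A_g, max(inflow, 0.0))
    return inflow - taken, A_g - taken


above_hof = surface_river_input(5.0, 1.0, 2.0, 3.0, q_r=10.0, hof=4.0)
below_hof = surface_river_input(5.0, 1.0, 2.0, 3.0, q_r=3.0, hof=4.0)
met = apply_groundwater_abstraction(4.0, 1.0)
unmet = apply_groundwater_abstraction(4.0, 6.0)
```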

Meteorological Data, Driving Data, and Hydrological Model Simulations
The G2G hydrological model (Bell et al., 2009) requires driving data consisting of spatially distributed (gridded) time series of precipitation and potential evaporation (PE). The driving data used here consist of 1 × 1 km daily precipitation from CEH-GEAR (Gridded Estimates of Areal Rainfall; Keller et al., 2015; Tanguy et al., 2016) and 40 × 40 km monthly short grass PE from the Met Office Rainfall and Evaporation Calculation System (MORECS; Hough & Jones, 1997). For use in G2G, the 40 × 40 km PE data were copied to each of the corresponding 1,600 1 × 1 km grid cells of the hydrological model grid and then divided equally over each time step. The 1 × 1 km resolution daily precipitation was divided equally over each 15-min model time step.
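Copying each 40 × 40 km PE value to its 1,600 constituent 1 × 1 km cells is a block replication, and the daily precipitation is split equally over the 96 time-steps in a day. A minimal pure-Python sketch, with hypothetical function names and invented values:

```python
def replicate_coarse_grid(coarse, factor=40):
    """Copy each coarse cell value to a factor x factor block of fine cells.

    `coarse` is a list of rows; the result has `factor` times as many
    rows and columns.
    """
    fine = []
    for row in coarse:
        # Repeat every value `factor` times along the row ...
        fine_row = [value for value in row for _ in range(factor)]
        # ... then repeat the widened row `factor` times.
        for _ in range(factor):
            fine.append(list(fine_row))
    return fine


def per_time_step(daily_total, steps_per_day=96):
    """Share a daily total equally over 15-min time-steps."""
    return daily_total / steps_per_day


# Two neighboring 40 km PE cells become an 80-column, 40-row 1 km grid.
pe_fine = replicate_coarse_grid([[1.0, 2.0]])
rain_step = per_time_step(9.6)
```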
To evaluate the impact of including abstractions and discharges in G2G model simulations of river flows, a number of model simulations were undertaken for the period 1 January 1999 to 31 December 2014. A model initialization or "spin-up" period of 2 years from 1 January 1997 to 31 December 1998 was used. Three G2G model simulations were evaluated:
• Natural: the standard G2G formulation (with no abstractions or discharges).
• 100% Abstraction: the enhanced G2G formulation, with time series of 100% abstractions, plus discharges.
• Weighted Abstraction: the enhanced G2G formulation, using time series of 100% abstractions except for the "Very Low" loss category, where only 0.3% of the abstraction is accounted for (as described in Section 3.4), plus discharges.

Hydrological Model Performance Assessment
Four performance scores were used to quantify different aspects of the agreement between modeled and gauged flows; two based on the daily time series, one based on the magnitude of flow errors, and one based on the flow duration curve (FDC).
Two of the time series performance scores are based on the model efficiency criterion of Nash and Sutcliffe (1970), defined as:
$$\mathrm{NS} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{o,i} - Q_{m,i}\right)^{2}}{\sum_{i=1}^{n} \left(Q_{o,i} - \bar{Q}_{o}\right)^{2}},$$
where $Q_{o,i}$ is the gauged flow for time step $i$, $Q_{m,i}$ is the modeled flow for time step $i$, $\bar{Q}_{o}$ is the mean of observed data, and $n$ is the number of time steps. The NS can range between −∞ and 1, where NS = 1 means a perfect match between modeled and observed data, NS = 0 indicates that the modeled data are as accurate as the mean of the observed data, and NS < 0 indicates that the mean of the observed data is a better predictor of the flow than the model. The NS is more suitable for assessing model performance at high flows, so it is adapted for assessing low flows by taking the natural logarithm of the flow data, to increase sensitivity to low and mid-range flows:
$$\mathrm{NS}_{\log} = 1 - \frac{\sum_{i=1}^{n} \left[\ln\left(Q_{o,i} + \varepsilon\right) - \ln\left(Q_{m,i} + \varepsilon\right)\right]^{2}}{\sum_{i=1}^{n} \left[\ln\left(Q_{o,i} + \varepsilon\right) - \overline{\ln\left(Q_{o} + \varepsilon\right)}\right]^{2}},$$
where $\varepsilon$ is a small number usually defined as $\varepsilon = \bar{Q}_{o}/100$. The NS log can range between −∞ and 1, which is interpreted in the same way as for NS.
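The two efficiency scores can be computed in a few lines of Python. The sketch below uses the standard Nash and Sutcliffe (1970) efficiency and applies it to log-transformed flows with ε set to one hundredth of the mean observed flow, as described above. Function names are hypothetical, the flow values are invented, and handling of missing data is omitted.

```python
import math


def nash_sutcliffe(obs, mod):
    """Nash-Sutcliffe efficiency of modeled flows against observed flows."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - m) ** 2 for o, m in zip(obs, mod))
    obs_variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance


def nash_sutcliffe_log(obs, mod):
    """Efficiency on log flows, with eps = mean observed flow / 100."""
    eps = (sum(obs) / len(obs)) / 100.0
    log_obs = [math.log(o + eps) for o in obs]
    log_mod = [math.log(m + eps) for m in mod]
    return nash_sutcliffe(log_obs, log_mod)


gauged = [1.0, 2.0, 3.0]
perfect = nash_sutcliffe(gauged, gauged)
mean_only = nash_sutcliffe(gauged, [2.0, 2.0, 2.0])
perfect_log = nash_sutcliffe_log(gauged, gauged)
```

The small offset ε keeps the logarithm defined on days with zero flow, which is why it is scaled to the observed mean rather than fixed.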
The BIAS indicates the magnitude of errors in modeled daily flows relative to gauged daily flows. The BIAS can range from −∞ to +∞ with a value of 0 indicating no model bias, BIAS > 0 indicating model overestimation, and BIAS < 0 indicating model underestimation.
The FDC performance score, the percentage bias in low flow volume (lfv), compares the statistical characteristics of the flows rather than the time-step equivalence. It is calculated from the low flow end of the FDC, which is obtained by ranking the flows from a (daily) time series and selecting the flow corresponding to the percentile point p (between 1 and 100); Q m,p and Q o,p are thus the modeled and gauged flows equaled or exceeded p% of the time. The score follows Kay et al. (2015) and is computed from a function f of these percentile flows, where f is taken as the square root. The lfv only compares flows from the 70th up to the 95th percentile so as not to include extreme low flow values, which can be more severely affected by errors in flow measurements due to instrument inaccuracies in shallow flows or low velocities, changes in channel shape, and/or weed growth and sedimentation (Petersen-Øverleir et al., 2009). A positive lfv value indicates that the modeled flow is generally greater than the gauged flow.
The performance of the G2G simulations of daily mean river flow is assessed by comparing with gauged daily river flow data for 605 catchments (Figure 4a) using data from the National River Flow Archive (NRFA: nrfa.ceh.ac.uk). Flow data for as many catchments as possible were used in the performance assessment, and catchments in England were only excluded from the analysis if no observations were available for the assessment period (1999-2014). The large number of catchments provides good spatial coverage across England, but many smaller catchments are nested within larger catchments so there is some overlap.
The performance assessment of the new G2G flow simulations against observations is presented for three sets of catchments: all catchments, abstraction-dominated (red catchments in Figure 4b), and discharge-dominated (blue catchments in Figure 4c). The abstraction- or discharge-dominated catchments are identified by taking the difference between the annual mean surface water abstraction and the annual mean discharge for each catchment. Discharge-dominated catchments are those for which this difference is negative. Of the 605 study catchments, 348 were discharge-dominated and 253 were abstraction-dominated (the latter includes 18 with abstraction from groundwater but not surface water, and without any discharges). A further 4 catchments had no groundwater or surface water abstractions or discharges and were excluded from the analysis (yellow catchments in Figure 4d).
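The classification rule described above can be written as a small function. The function name and labels are hypothetical. Treating a zero difference as abstraction-dominated is an assumption, made so that catchments with groundwater abstraction only and no discharges fall in the abstraction-dominated set, as the counts above imply.

```python
def classify_catchment(sw_abstraction, gw_abstraction, discharge):
    """Classify a catchment from its annual mean volumes.

    The difference (surface water abstraction - discharge) decides the
    class; catchments with no abstraction or discharge are excluded.
    """
    if sw_abstraction == 0 and gw_abstraction == 0 and discharge == 0:
        return "excluded"
    if sw_abstraction - discharge < 0:
        return "discharge-dominated"
    # Assumption: a zero difference counts as abstraction-dominated.
    return "abstraction-dominated"


labels = [
    classify_catchment(10.0, 2.0, 4.0),
    classify_catchment(3.0, 0.0, 9.0),
    classify_catchment(0.0, 1.5, 0.0),  # groundwater abstraction only
    classify_catchment(0.0, 0.0, 0.0),
]
```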

How do Abstraction and Discharge Impact on River Flows?
Across the 605 study catchments, the balance between the volumes of water abstracted and discharged varies depending on the catchment, as shown in Figure 4b, which highlights catchments (in red) for which mean abstraction volumes (1999-2014) are typically larger than discharges. However, even for catchments where mean annual abstractions and discharges balance, there will be some months, particularly during dry summers, when the volume of water abstracted exceeds the discharge, resulting in river flows below the natural value.
On average, three times as much water is abstracted from surface waters as from groundwater sources, as indicated in Figure 3. Model-simulated hydrographs reflect this, and for most catchments, groundwater abstraction has a lesser impact on downstream river flows than surface water abstraction or catchment discharges.
By way of example, Figure 5 shows the influence of introducing each anthropogenic intervention (abstraction or discharge) in turn on the G2G model simulated daily river flows for one of the largest catchments, the Thames at Kingston (catchment area ∼9,948 km²). Observed (gauged) river flows are shown with a gray line and the standard G2G-estimated natural flows are shown in green. The other hydrographs show the impact of introducing groundwater abstraction only (brown dashed line), surface water and groundwater abstractions only (yellow dashed line), and all abstractions and discharges (pink line). Note that in this example, the abstraction totals assume 100% abstraction. The simulated hydrograph that includes all abstractions and discharges in the G2G (pink line) is closest to the observed (gauged) flow. In this catchment, the discharges are ∼30% of the total abstraction volume.
Figures 6a-6c compare gauged and G2G-simulated daily river flows for three catchments from 1st January 2014 to 31st December 2014. For each catchment, the hydrographs show G2G natural flows (green), the impact of "Weighted Abstraction" (blue dashed line), and the greater impact of "100% Abstraction" (pink).
For the heavily abstracted Thames at Kingston catchment, the G2G model reproduces daily gauged river flows reasonably well, particularly in the case of the "100% Abstraction" run, which is lower than the "Natural" run as expected (Figure 6a). The simulation with "Weighted Abstraction" slightly overestimates the river flow. On the other hand, in the discharge-dominated Trent at Drakelow Park, where there are no "Very Low" loss abstractions, the influence of discharges is clear. In this catchment, the flows from the "Natural" G2G simulation are too low but both the "100% Abstraction" and "Weighted Abstraction" simulations reproduce the gauged daily river flows fairly well (Figure 6b). In the Darwen at Blue Bridge, the "Weighted Abstraction" flow simulation is closest to the gauged river flows because this catchment is mainly dominated by the "Very Low" loss abstractions and the "Weighted Abstraction" volume is only 30% of the total ("100% Abstraction") abstraction (Figure 6c). In some catchments, the "100% Abstraction" G2G model simulations are closest to gauged river flows, but in other catchments, the "Weighted Abstraction" simulations perform better.
As stated before, it is not possible to link the abstraction licenses with discharge permits, but these comparisons suggest that some "Very Low" loss abstraction licenses might not have associated discharge permits due to the nature of the abstraction. Hence, some of these abstractions should be excluded from the total abstraction volumes used for model simulations. It is also worth noting that in the heavily abstracted Thames catchment, an estimated 20% of the water in the abstraction data sets is not removed from the G2G model stores (and water balance) because in dry summer months, simulated river flows fall below the HOF condition and prevent abstraction from taking place. In reality, this water may well be abstracted by the license holder if they do not have an accurate estimate of gauged river flows at the point of abstraction. This issue does not occur in the Trent and Darwen catchments, where the abstractions are almost fully accounted for.

Figure 7 presents boxplots of the G2G model performance when compared to gauged daily flows for three sets of catchments: all 605 catchments, 253 abstraction-dominated, and 348 discharge-dominated. The boxplots compare the skill scores (NS, NS log , BIAS, and lfv) from the standard G2G model for "Natural" flows (green), G2G with "Weighted Abstraction" and discharges (blue), and G2G with "100% Abstraction" and discharges (pink). The boxplots of NS values show slight improvement across all the model simulations and all three sets of catchments (1.5% improvement). The boxplots of NS log values indicate an overall improvement in model performance for low flows for all catchments when abstraction and discharge data are included, but particularly for the "Weighted
Based on the four assessment criteria, the median performance across all 605 study catchments is improved through the use of abstraction and discharge data in the G2G hydrological model. The improvements in model performance are most apparent in discharge-dominated catchments, for which NS log is improved by 22.6% and BIAS is improved by 2% or above (2.0% for "Weighted Abstraction" and 2.6% for "100% Abstraction"); however, lfv estimates are 2% to 3% worse (by 2.7% for "Weighted Abstraction" and by 2.2% for "100% Abstraction"). For discharge-dominated catchments where the "Natural" G2G simulations tend to underestimate flows (negative BIAS and lfv), use of abstractions and discharges in the model results in more positive BIAS and lfv performance measures, indicating a greater tendency to overestimation of river flows.
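For readers unfamiliar with the skill scores, the sketch below gives common textbook forms of three of them: the Nash-Sutcliffe efficiency (NS), the same score computed on log-transformed flows (NS log, which weights low flows more heavily), and a percentage bias. The exact formulations used in the paper are defined in an earlier section and may differ in detail; the small offset `eps` and the sign convention for bias (negative meaning underestimation) are assumptions, and lfv is omitted because its definition is not restated here.

```python
import math


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def nash_sutcliffe_log(obs, sim, eps=1e-6):
    """NS on log flows; eps (an assumption) guards against log(0)."""
    log_obs = [math.log(o + eps) for o in obs]
    log_sim = [math.log(s + eps) for s in sim]
    return nash_sutcliffe(log_obs, log_sim)


def percent_bias(obs, sim):
    """Relative volume error in percent; negative means flows are underestimated."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)
```

With these forms, an error of one flow unit on a very low flow leaves NS no worse than the same error on a high flow, but lowers NS log much more, which is why NS log is read as the low-flow measure.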
In abstraction-dominated catchments, the median improvements in model performance are more modest and the simulations using "Weighted Abstraction" were improved by 1.4% and 1.7% in NS and NS log , respectively, and by 0.2% in BIAS. Use of "100% Abstraction" led to no improvement in any of the median performance criteria. Both the "Weighted Abstraction" and "100% Abstraction" simulations that use abstractions/discharges have negative or lower negative median values of BIAS and lfv compared with the "Natural" G2G simulations. This shift in lfv toward negative values (underestimation of river flows) indicates that in some catchments more water has been abstracted in the model than in reality, and this particularly affects the "100% Abstraction" simulation. The geographical locations of gauging stations for which the upstream catchments are most heavily impacted by artificial influences are shown in Figure 8a. Dark brown-shaded catchments are more significantly discharge-dominated and dark blue/green catchments are more abstraction-dominated. The occurrence of relatively high levels of abstraction in the chalk aquifers of South East England is as expected, but the map also highlights that many of the most heavily influenced catchments lie in the North and Central (historically industrial) regions of England.
The accompanying maps of the G2G model performance scores (NS log and BIAS) shown in Figures 8b and 8c highlight the differences in performance between "Natural" G2G simulations and those that take account of abstractions and discharges. These maps confirm that the most significant improvements in model performance (green dots in Figures 8b and 8c) tend to occur in catchments that are most heavily impacted by artificial discharges (brown dots in Figure 8a), though decreases in model performance can also occur in some catchments. Differences in model performance between the "Weighted Abstraction" and "100% Abstraction" simulations are mainly apparent in the South West, particularly Wessex, and Devon and Cornwall.

Is Temporal Variation in Abstraction Data a Requirement for Good Hydrological Model Performance?
Figure 8. Maps of the ratios of "net abstraction/mean flow" for 605 catchments (net abstraction is the difference between annual mean surface water abstraction and the annual mean discharge; mean flow data are obtained from the National River Flow Archive (NRFA); a negative ratio indicates discharge-dominated and a positive ratio indicates abstraction-dominated) and performance score (NS log and BIAS) differences between two cases ("Weighted Abstraction" minus "Natural"; top) and ("100% Abstraction" minus "Natural"; bottom).

The work presented here shows how records of monthly abstraction and discharge data can be included in process-based hydrological models, leading to improvements in model performance in anthropogenically influenced catchments. However, regular monthly or annual data on human water use are often not available for historical model simulations and are definitely not available for projected future scenarios of water availability. Long-term mean values of the monthly actual abstractions (and daily rate of discharges) could instead be used to support these analyses. To understand the feasibility of such an approach and the likely errors in flows incurred through the use of long-term mean monthly abstraction values in place of temporally varying "actual" data, the G2G flow simulations for 605 catchments (Sections 5.1 and 5.2) were repeated using mean monthly abstraction data values for the period 1999 to 2014. Figure 9 presents scatterplots showing how the use of mean monthly abstraction volumes in place of actual values affects G2G model performance. In general, model performance is very similar and there are only a handful of catchments where the NS, NS log , BIAS, and lfv values are substantially different between the runs with actual or mean monthly abstractions. For example, in the "mean" runs, the NS log differences are within 5% of those from the actual runs in 572 catchments (95%) for the "Weighted Abstraction" case and 523 catchments (87%) for the "100% Abstraction" case.
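The substitution of long-term mean monthly values for actual monthly records can be illustrated as follows. This is a minimal sketch under stated assumptions: the input layout (a dictionary keyed by `(year, month)` for a single site or grid cell) and the function names are hypothetical and are not taken from the G2G code.

```python
from collections import defaultdict


def mean_monthly(records):
    """Long-term mean for each calendar month.

    records: dict mapping (year, month) -> abstraction volume (hypothetical layout).
    Returns a dict mapping month -> mean volume over all years present.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (_year, month), volume in records.items():
        totals[month] += volume
        counts[month] += 1
    return {month: totals[month] / counts[month] for month in totals}


def climatological_series(records):
    """Replace each actual monthly value with the long-term mean for that month."""
    clim = mean_monthly(records)
    return {(year, month): clim[month] for (year, month) in records}
```

When every calendar month has the same number of years of data, the total volume over the period is unchanged by the substitution; only the year-to-year variation is removed, which is the effect examined in Figure 9.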
The differences are most apparent for the "100% Abstraction" runs and are mainly due to large changes in abstraction volumes occurring between 1999 and 2014, for which associated annual discharge data are not available (the discharge data used here are for a single year, 2017). For example, in the three catchments circled in green in Fig

Discussion
In this study, a grid-based hydrological model (G2G) has been modified to realistically account for surface water and groundwater abstractions and discharges as a result of human activities. This paves the way for simulating projected future changes in river flow incorporating changes in water use as well as climate change.
Although substantial progress has been made in this study, the challenges of incorporating future infrastructure improvements such as a national water transfer network, alongside abstraction reform and other water resource management efficiencies are extensive.
The relative success with which actual abstraction values can be replaced by long-term mean values indicates that mean monthly abstraction values could potentially be used as the basis for a "business as usual" scenario of near-future water use and could also be perturbed to reflect the impact of future drought-occurrence and projected population change on water use.
There are inherent uncertainties in hydrological modeling and the process of gauging flows (Beven, 1993; Butts et al., 2004; Coxon et al., 2015), and it is important to recognize the degree of uncertainty associated with the research presented here. This analysis into the value of using abstraction and discharge data to improve a distributed hydrological model has focused on just one hydrological model, the Grid-to-Grid (G2G), and the impact on the performance of other models, including calibrated conceptual catchment models, has not been evaluated. Previous assessments showed that G2G performed well in simulating both high (Formetta et al., 2017) and low flows (Rudd et al., 2017), but can perform less well in catchments impacted by human influences, such as abstractions, discharges, and managed reservoirs (Bell et al., 2009, 2012). This study has shown how spatiotemporal data on abstractions and discharges can be used to improve the model across England, but other human influences such as reservoirs, channel management, and canal infrastructure have not been considered. In order to provide G2G (and other distributed models) with the spatiotemporal abstraction/discharge data required, license data for individual sites have been discretized spatially (to a 1 × 1 km resolution square grid) and temporally (monthly). Uncertainties will arise from discretizing these data to a 1 × 1 km representation of the drainage network, and potentially, some abstractions and discharges for upstream sites could be erroneously included in the wrong catchment. Similarly, the locations of discharges, HOF conditions, and surface water abstractions were assumed to be rivers, so any such licenses located in non-river G2G cells were moved to the nearest river cell downstream. Another potential source of uncertainty is the HOF condition, which was implemented in terms of the model-simulated flow at each time-step, rather than daily mean. This approach may have limited surface water abstraction more or less strictly than in reality, and further information about how the HOF condition is applied in practice would be beneficial.
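The difference between checking the HOF condition at each model time-step and checking it against the daily mean flow can be seen in a toy calculation. The sketch below is not the G2G implementation: the all-or-nothing rule (no abstraction whenever flow is below the HOF threshold), the cap at the available flow, and the function names are assumptions chosen to keep the example short.

```python
def abstract_step(flow, requested, hof):
    """Volume abstracted in one time-step under an assumed all-or-nothing HOF rule."""
    if flow < hof:
        return 0.0  # flow below the hands-off threshold: no abstraction
    return min(requested, flow)  # cannot abstract more than is in the river


def total_per_step(flows, requested, hof):
    """HOF checked against the simulated flow at every time-step."""
    return sum(abstract_step(q, requested, hof) for q in flows)


def total_daily_mean(flows, requested, hof):
    """HOF checked once against the mean flow over the day."""
    daily_mean = sum(flows) / len(flows)
    if daily_mean < hof:
        return 0.0
    return sum(min(requested, q) for q in flows)


# Four sub-daily flows that dip below a threshold of 1.0 for part of the day.
flows = [1.2, 0.9, 0.8, 1.3]
per_step = total_per_step(flows, requested=0.1, hof=1.0)   # two steps permitted
daily = total_daily_mean(flows, requested=0.1, hof=1.0)    # mean 1.05: all four permitted
```

In this example the time-step check is the stricter of the two; on a day whose mean falls below the threshold but with brief higher flows, the reverse would hold.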
One of the greatest potential sources of uncertainty in including both abstraction and discharge data in a hydrological model is that some abstraction use-types return a high proportion of the water back to the river. At present, there is no direct link between individual abstraction licenses and discharge permits, and thus there is some uncertainty about how many of those returns are already included in the discharge data. To address this uncertainty, two sets of potential abstraction returns have been evaluated (Section 3.4) to provide an envelope of uncertainty. The difference in G2G performance using the two abstraction estimates ("Weighted Abstraction" and "100% Abstraction") indicates that the uncertainty associated with how discharges and abstraction returns are linked has only a modest impact on river flows (less than 3.6% difference in impact on NS and NS log and less than 1.4% on lfv and BIAS). It is hoped that further work will reduce many of the uncertainties identified in combining spatiotemporal data of abstractions and discharges in hydrological models.
The modeling of human impacts on river flows using recorded spatiotemporal water abstractions and discharges presented in this paper provides a first step to advance the hydrological model capabilities and offers a valuable tool for simulating human-modified and natural river flows in both spatially distributed hydrological and land-surface models (Boone et al., 2009; Sood & Smakhtin, 2015). The representation of the impacts of human-induced changes on water resources in global- or national-scale hydrological models is an important but challenging issue (Sutanudjaja et al., 2018), and in many areas of the world anthropogenic impacts can no longer be neglected (Haddeland et al., 2014). Wada et al. (2017) discussed the need for better representation of the human-water interface in hydrological models and highlighted a range of current challenges in human-water interactions in hydrological modeling, which included a lack of human water management information. Ongoing national improvements in monitoring and publication of anthropogenic water demand data (e.g., USGS Water-Use Data and Research program: www.usgs.gov/mission-areas/water-resources) now pave the way to greater use of such data in hydrological and water resource models. The grid-based approach presented here shows how this can be achieved. The performance results from including such data in the area-wide G2G hydrological model (median performance 10.7% improvement in low flows) demonstrate the potential gains from using records of actual water use in place of near-static national statistics. These model enhancements were made possible through the reconfiguration of thousands of individual licensed actual abstraction and discharge records into spatiotemporal grids, a process which not only simplifies their use in hydrological models, but in the long-term may help to overcome data security issues associated with the potential release of raw license information for scientific research.
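The reconfiguration of individual site records into spatiotemporal grids can be pictured with a simple aggregation. The sketch below is illustrative only and rests on several assumptions: site coordinates are taken to be eastings and northings in metres on a projected grid, the record layout is hypothetical, and the relocation of licenses from non-river cells to the nearest downstream river cell described in the text is not attempted.

```python
from collections import defaultdict

CELL_SIZE_M = 1000  # 1 x 1 km grid cells


def grid_records(records):
    """Sum point-source volumes into 1 km cells for each year and month.

    records: iterable of (easting_m, northing_m, year, month, volume) tuples
             (hypothetical layout).
    Returns a dict mapping ((col, row), year, month) -> total volume.
    """
    grid = defaultdict(float)
    for easting, northing, year, month, volume in records:
        cell = (int(easting // CELL_SIZE_M), int(northing // CELL_SIZE_M))
        grid[(cell, year, month)] += volume
    return dict(grid)
```

Because the output holds only cell totals, several licenses falling in the same cell and month are no longer individually identifiable, which is one way gridded products can ease the release of license-restricted data.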
In England, surface water and groundwater abstractions totaled ∼10.4 billion cubic meters in 2017 and increased by 3.0% and 7.5%, respectively, since 2014 (DEFRA, 2019b). Across England, ongoing reform of water abstraction management aims to ensure resilience of future water supplies while protecting the environment (DEFRA, 2019a). These reforms also highlight a future requirement for "dynamic catchment management," so it is vital to develop effective tools to support this goal. The new G2G model development presented here can be used as a potential tool for water resources management and environmental assessment under climatic and anthropogenic changes. There is also potential for this new modeling capability to support determination of the EFI and HOF conditions for catchments for present day and projected future changes in demand and climate.
A recent report by the National Infrastructure Commission (NIC, 2018) indicates that over the coming decades England risks water deficiency due to pressures from climate change impacts, an increasing population, and the need to protect the environment. In order to improve the effective management of water and secure a long-term water supply the report recommended some actions which included (a) improve infrastructure (through a national transfer network in England and new infrastructure) and (b) reduce demand (from 141 L per person per day to 118 L). Along with the hydrological model enhancements outlined here, the new gridded spatiotemporal water abstractions and discharges with the sectorial information provide key data required for developing future water demand in order to enhance our understanding of climate impacts and human influences within catchments. Ongoing work by the authors seeks to provide these (currently license-restricted) gridded abstraction and discharge data sets in some form to other researchers, to support water resource analyses for both present day and projected future periods. However, the grid-based methodology presented here could be applied to any region for which recorded abstractions and discharges are available, and the results presented here demonstrate that the use of such data can improve the performance of hydrological models.

Conclusions
In the future, the growing demand for water due to rising population, compounded by adverse climate change impacts, will lead to detrimental effects on socio-economic developments and environments. It is essential to enhance national-scale hydrological modeling by integrating anthropogenic influences as an important driver of the environment. This research provides a methodology to satisfy this requirement, through the development of high-resolution monthly gridded data sets of actual abstractions and anthropogenic discharges. There are ongoing uncertainties in the proportions of abstracted water that are immediately returned to the river or to another river in statutory discharges. These uncertainties have been accommodated through the development of upper ("100% Abstraction") and lower but realistic ("Weighted Abstraction") estimates of abstraction totals for England. The model simulations presented here for catchments across England demonstrate how these (or similar) data can be included in a process-based gridded hydrological model and lead to improvements in model performance.
Despite the various challenges and data limitations identified, the G2G hydrological model has been enhanced significantly from the previous version (Bell et al., 2009, 2012), which typically provided simulations of area-wide natural river flows (as opposed to gauged river flows). The new model formulation accounts for the influence of actual volumes of abstracted and discharged water on downstream river flows. Both surface water and groundwater sources of abstraction are included, but tidal abstraction sources have not been considered because the G2G is not currently used to simulate flows in tidal rivers. Visual inspection of the impact of abstractions and discharges on G2G-simulated flow hydrographs indicates that the approach of abstracting and discharging water to flow routing stores in the model is effective.
The performance of the G2G model has been assessed by comparison of simulated daily river flows with gauged observations for 605 English catchments for 16 years (1999 to 2014). Results show that model performance is generally improved through the use of monthly actual abstraction and daily discharge data, particularly in catchments for which anthropogenic influences are discharge-dominated. Model performance has been assessed using both the upper ("100% Abstraction") and lower ("Weighted Abstraction") estimates of abstraction totals for England; however, it has not been possible to identify a preferred option that provides a representative estimate of net water abstractions across all catchments. Further investigation into the links between abstractions and discharges should help to reduce this uncertainty, and where available, use of annual discharge data in place of data for a single year (2017) would be highly beneficial. The results of this study for England have shown that anthropogenic interventions (alteration from natural system) are particularly significant in North and Central (historically industrial) regions, and in the chalk aquifers of South East England and South West (Wessex and Devon and Cornwall).
An improved ability to incorporate anthropogenic water use in hydrological and land-surface models is vital for understanding how human behavior, water management policies, and projected climate change will impact on future water resources. Population growth, demands for increased food production, and economic growth until 2050 are expected to be heterogeneous across the world, resulting in competition between water demand, available water resources, and water pollution, which could lead to water scarcity (Boretti & Rosa, 2019). In England, the population is projected to increase from 56 to 62 million between 2018 and 2043 (ONS, 2018), potentially increasing the demand for water. Concurrently, climate change is expected to modify precipitation and temperature extremes, which in turn could alter the hydrological response within catchments. Unless managed well, the impacts of climate change, alongside increasing demands from an increasing population, could lead to adverse effects on society and the environment. Ongoing work based on the unique data sets and model developments presented here is investigating the impacts of future scenarios of anthropogenic water use and climate change, and should shed some light on these issues.
Despite the critical importance of human interventions on current and future water scarcity (Haddeland et al., 2014), to the authors' knowledge there are no previously published examples of spatially distributed hydrological models configured to use actual (recorded) distributed water use data. This paper presents a novel approach to the use of spatiotemporal records of water abstraction and discharge data in hydrological models, by configuring individual licensed abstraction values, hands-off flow conditions, and discharges onto a 1 × 1 km national-scale grid, then adapting a grid-based hydrological model to use them. The results presented here demonstrate the potential gains available to the international hydrological and land-surface modeling community from using records of actual water use (where available), in place of more widely used national statistics.

their staff Richard Davis, Richard Thornton, Beverly Atkinson, and Jenny Yarr for supplying the abstraction, hands-off flow condition, and discharge data, and for providing support. Thanks to Bob Moore for advice on international hydrological literature; thanks also to Helen Gavin (University of Oxford) and Matt Fry (UK-CEH) for helping with licensing of the abstraction, hands-off flow condition, and discharge data. The authors are grateful to the Editor and two reviewers (Dr Gemma R Coxon and anonymous reviewer) for their constructive comments.