Use of point scale models to improve conceptual understanding in complex aquifers: An example from a sandstone aquifer in the Eden valley, Cumbria, UK

Understanding catchment functioning is increasingly important to enable water resources to be quantified and used sustainably, flood risk to be minimized, as well as to protect the system from degradation by pollution. Developing conceptual understanding of groundwater systems and their encapsulation in models is an important part of this understanding, but they are resource intensive to create and calibrate. The relative lack of data or the particular complexity of a groundwater system can prevent the development of a satisfactory conceptual understanding of the hydrological behaviour, which can be used to construct an adequate distributed model. A time series of daily groundwater levels from the Permo‐Triassic sandstones situated in the River Eden Valley, Cumbria, UK have been analysed. These hydrographs show a range of behaviours and therefore have previously been studied using statistical and time series analysis techniques. This paper describes the application of AquiMOD, impulse response function (IRF) and combined AquiMOD‐IRF methods to characterize the daily groundwater hydrographs. The best approach for each characteristic type of response has been determined and related to the geological and hydrogeological framework found at each borehole location. It is clear that AquiMOD, IRF and a combination of AquiMOD with IRF can be deployed to reproduce hydrograph responses in a range of hydrogeological settings. Importantly the choice of different techniques demonstrates the influence of differing processes and hydrogeological settings. Further they can distinguish the influences of differing hydrogeological environments and the impacts these have on the groundwater flow processes. They can be used, as shown in this paper, in a staged approach to help develop reliable and comprehensive conceptual models of groundwater flow. This can then be used as a solid basis for the development of distributed models, particularly as the latter are resource expensive to build and to calibrate effectively. This approach of using simple models and techniques first identifies specific aspects of catchment functioning, for example influence of the river, that can be later tested in a distributed model.

of data or the particular complexity of a groundwater system can prevent the development of a satisfactory conceptual understanding of the hydrological behaviour, which can be used to construct an adequate distributed model. A time series of daily groundwater levels from the Permo-Triassic sandstones situated in the River Eden Valley, Cumbria, UK have been analysed. These hydrographs show a range of behaviours and therefore have previously been studied using statistical and time series analysis techniques. This paper describes the application of AquiMOD, impulse response function (IRF) and combined AquiMOD-IRF methods to characterize the daily groundwater hydrographs. The best approach for each characteristic type of response has been determined and related to the geological and hydrogeological framework found at each borehole location. It is clear that AquiMOD, IRF and a combination of AquiMOD with IRF can be deployed to reproduce hydrograph responses in a range of hydrogeological settings. Importantly the choice of different techniques demonstrates the influence of differing processes and hydrogeological settings. Further they can distinguish the influences of differing hydrogeological environments and the impacts these have on the groundwater flow processes. They can be used, as shown in this paper, in a staged approach to help develop reliable and comprehensive conceptual models of groundwater flow. This can then be used as a solid basis for the development of distributed models, particularly as the latter are resource expensive to build and to calibrate effectively. This approach of using simple models and techniques first identifies specific aspects of catchment functioning, for example influence of the river, that can be later tested in a distributed model.

| INTRODUCTION
Understanding catchment functioning is increasingly important to enable water resources to be quantified and used sustainably, for example Hutchins et al. (2018), flood risk to be minimized, for example Rogger et al. (2017), as well as to protect the system from degradation by pollution, for example Wang and Burke (2017). However, catchments often have a high degree of heterogeneity and complexity in their geology and topography, for example Di Lazzaro et al. (2015), which affects their surface and groundwater hydrology, for example Oudin et al. (2010). To manage such catchments well a good understanding of their functioning is required. This can be achieved with a variety of data analysis approaches. Typically these involve a series of activities: a combination of data gathering and monitoring, the interpretation of these data to develop a conceptual understanding of surface and groundwater flow regimes and the subsequent testing of that understanding using mathematical modelling, for example Abesser et al. (2017). The refinement of the conceptual model and its encapsulation in the numerical model is, by its nature iterative and benefits from the use of different data types to develop understanding and to compare model outputs (Schilling et al., 2019). This sequence of activities or workflow leads, ideally, to the creation of a model with the ability to be used to make forecasts or predictions of future behaviour under differing conditions. However, these activities are time consuming and require significant resources (i.e. person years of effort), they require access to suitable datasets as well having the appropriate techniques that can be used to successfully analyse these data. The interpretation of time series of observed data from catchments, such as groundwater levels or stream flows, are essential for the understanding the response of catchment hydrological system to the inputs of rainfall and groundwater recharge. To be reliable this analysis and interpretation should be carried out in a systematic and reproducible way and could take the form of statistical analysis, transfer function approaches such as impulse response function (IRF) or relatively simple lumped parameter models. Application of these techniques means that the data can be interrogated efficiently and the conceptual understanding developed and subsequently tested without the need to develop a fully distributed numerical model.
Observed hydrological catchment variables are usually measured sequentially in time, and when collected over a predominantly fixed sampling interval they form a historical time series (Cowpertwait & Metcalfe, 2009), allowing the investigation of temporal behaviour.
From a statistical point of view such historical time series can be treated as different sequences of random variables that can be interpreted by time series analysis. Groundwater systems can be conceptualized as filters transforming an input signal (e.g. rainfall) into an output signal (e.g. flow or groundwater level) by application of a transfer function (Delbart et al., 2016;Mangin, 1984). Once defined, these mathematical relationships can be used to help determine the functioning and structure of aquifers. There are number of time series techniques that have been routinely used to characterize groundwater systems, with a range of functions (extensively described by Box et al., 2015) that can offer a systemic and quantified approach to analysing these systems.
An IRF is a way of converting a series of input time series into an output variable. Typically, this consists of a relationship between response factor and time (see Figure 1 for an example) that can take various shapes according to the behaviour of the modelled system.
Applied successfully an IRF can define the temporal relationship of the particular input variable, for example rainfall, and the output variable, such as groundwater head. It can help characterize the system response and be related to the physical processes operating in the catchment. Recently, the IRF methodology has been used to analyse groundwater flooding in the United Kingdom to offer explanations for differing observation borehole groundwater level responses to a flooding event in winter 2013/2014 (Ascott et al., 2017). IRFs have also been used to understand the impact of rainfall/drought events and changing groundwater abstraction on groundwater levels in the United States (Russo & Lall, 2017) and to estimate recharge rates to Australian aquifers (Hocking & Kelly, 2016 Deterministic models provide a well-established way of simulating groundwater systems and addressing particular questions such as data gaps, for example in the Dumfries Basin; (Jackson et al., 2005) and the impacts of climate change on water resources, for example in the Berkshire Downs (Jackson et al., 2011). However, a fully distributed, time variant model can often be resource-intensive to develop and relies on a mature understanding of groundwater flow in the aquifer and a well-developed conceptual model for its ability to reproduce observed behaviour. Where resources, data or understanding are less available lumped parameter or point models offer a parsimonious method by which the conceptual understanding of the observed groundwater level response at a borehole can be tested. These are relatively simple models, which represent parts of the hydrological system as a series of interconnected single stores which can represent different volumes in the system. They are often applied in situations where there is insufficient data to adequately parameterise a distributed model or where the hydrogeology appears so complicated or uncertain that this is the only way that it can reasonably be mathematically represented. Recent examples of the application of lumped parameter models are found in Mackay et al. (2014aMackay et al. ( , 2014b) who demonstrate the application of different model structures using the AquiMOD code (Mackay et al., 2014a(Mackay et al., , 2014b to help improve conceptual understanding and simulate groundwater hydrograph responses in a range of different UK aquifers. Lumped models have been used to simulate discharge to rivers from aquifers (Anaya & Wanakule, 1993;Rozos et al., 2004), and water levels in a representative observation well within the aquifer (Barrett & Charbeneau, 1997;Scanlon et al., 2003) or head gradient towards a river (Hughes, 2004).
To explore how simpler approaches, which stop short of a fully distributed model, can be applied to develop conceptual understanding a case study in the Permo-Triassic sandstone in the Eden Valley, Cumbria has been used. A systematic study of the available groundwater level time series data was, therefore, undertaken by the application of AquiMOD, IRF and combined AquiMOD-IRF methods to daily hydrographs. The geological and hydrogeological framework of the study area, the groundwater level data set and AquiMOD and the IRF methods (basic and stress related implemented in AquiMOD) are outlined. The AquiMOD model code applied to study the variation between the different borehole groundwater responses is described and the subsequently obtained results presented. Following on from this, the IRF methods, both basic and stress-related, are presented and described. Finally, the implications for understanding the recharge and groundwater flow processes operating in the Eden Valley Permo-Triassic sandstone aquifers are discussed, with reference to the results from the application of both AquiMOD, IRF and the combined AquiMOD-IRF approach to analyse the groundwater level time series.

| Study area
The River Eden Valley (Cumbria, UK) is a large rural area with a relatively low population density, in which agriculture and tourism provide the main sources of income. Permo-Triassic sandstones form the major aquifers in the region and have the potential to provide significant groundwater resources (Butcher et al., 2006). A number of water management issues have been identified, which operate over a wide range of scales for this area, including flooding (Leedal et al., 2013;Mayes et al., 2006), pollutant transport, particularly nitrates (Wang & Burke, 2017) and ecology (Hulme et al., 2012).
However, there is no detailed conceptual model and associated numerical groundwater model for the Permo-Triassic sandstone aquifers in the Eden Valley, which could be used to address these issues. Moreover, any investigation of the impact of climate change on the groundwater flow in the River Eden catchment would require a reliable understanding of the aquifers' responses to recharge at different time scales.

| Geological and hydrogeological setting
The Permo-Triassic rocks of the Eden Valley lie in a fault-bounded basin (Stone et al., 2010) (approximately 50 km long and 5-15 km wide) that is straddled to the southwest by the high ground of Lake District and to the northeast by the hills of the North Pennines ( Figure 2). This basin contains Permian and Triassic strata (see Table 1) which dip gently to the north east ( Figure 2). The Pennine Fault and associated North Pennine escarpment form the eastern boundary of what has been interpreted as a half-graben, resulting in Permo-Triassic rocks outcropping against Carboniferous or Lower Palaeozoic rocks. To the west, the Permo-Triassic succession wedges out against the underlying Carboniferous strata (Allen et al., 1997), which consist of a sequence of layers of limestones, sandstones, mudstones and coals (Table 1; Figure 2).
The Penrith Sandstone formation, which lies unconformably over the Carboniferous, was deposited in a structurally-controlled intermontane basin orientated along the present Eden Valley. These sandstones, mostly of aeolian origin, reach a thickness of about 900 m in the centre of the basin (see cross-section; Figure 3). The basal breccias locally known as Brockram become progressively more dominant southwards. This is composed of angular fragments of dolomitised limestone embedded in a strongly cemented calcareous sandstone matrix (Millward, 2004). The Penrith sandstone (Hughes, 2003) itself consists of well-rounded and well-sorted, medium to coarse grains.
Less well-sorted finer grained sandstone beds with thin mudstone intercalations are common, mainly at the top of the sequence and at the margins of the basin, indicating episodes of fluvial deposition. In the northern part of the basin, parts of the top 100 m of the formation have been secondarily cemented by silica. Where such cement is abundant, the relief is stronger (Hughes, 2003;Millward, 2004;Stone et al., 2010). These cemented sandstones are much indurated and exhibit a very low hydraulic conductivity (Butcher et al., 2006;Waugh, 1970) Over three quarters of the Eden catchment bedrock geology is covered by Quaternary superficial deposits ( Figure 3). Extensive areas of exposed bedrock are mainly restricted to the Lake District, the Northern Pennine escarpment and the outcrop of the Great Scar Limestone Group. Nevertheless, exposed areas of sandstone ('drift windows') are present, mainly in the southern part of the catchment.
The stratigraphy of the superficial deposits is complex, with interdigitations of sand, gravel, silt and clay that may each develop their own piezometric levels, resulting in complex perched water tables above the bedrock formations (Allen et al., 1997).
The Penrith and St Bees Sandstones are considered as the major aquifers in the Eden Valley (Table 1). They are characterized by moderate-high permeability and porosity. The Penrith Sandstone displays both vertical and horizontal heterogeneity (in terms of cementation and grain size), however the St Bees Sandstone tends to be more homogeneous and likely to behave as one aquifer. The hydrodynamic properties of both sandstones are summarized in Table 1 of Lafare et al. (2016), which uses core data from Allen et al. (1997). Generally, the regional groundwater flow is dominated by intergranular flow whilst flow into boreholes is predominantly contributed through fractures, which are only locally inter-connected (Allen et al., 1997 (Allen et al., 1997). The sandstone aquifers are covered by superficial deposits of variable lithology and thickness (up to 30 m in the northern part of the Eden catchment). It F I G U R E 2 Geology of the Eden catchment seems likely that these deposits will have a significant impact on recharge and its distribution (Butcher et al., 2006).
The Permo-Triassic sandstones lie in a shallow synclinal structure aligned along the Eden valley. Where exposed they have been found to be heavily faulted though the superficial deposits probably limit the observation of much of the faulting. However, its impact on the groundwater flow system may be important especially in areas where silicification is found.
The principal aquifer types within the Eden catchment are fourfold: 1. Unconfined sandstone with little or no superficial deposit cover.

Unconfined sandstone covered by superficial deposits (>5 m thick)
and an unsaturated zone within the sandstone.  (Lafare et al., 2014). It can be seen that the hydrographs from the different observation boreholes provide a variety of groundwater level responses in terms of shape and amplitude. Geographical, geological and hydrogeological information was collated and used to provide a qualitative description of the setting for each borehole. This information is summarized in Table 2 along with metrics and parameters related to the subsequently described modelling approaches.  (Lafare et al., 2016). Several characteristic behaviours have been identified and related to the geological and hydrogeological setting of each borehole. In this study, a range of lumped groundwater level time series modelling techniques are implemented and tested using the same data set, with the aim of assessing which modelling approach, which structure and which parameterization is the most suitable for simulating each characteristic behaviour. To achieve this, AquiMOD models have been calibrated to the groundwater time series, the IRF approach has been applied and finally, a combined IRF-AquiMOD approach is implemented which allows input stresses (rainfall, potential evaporation [PE] and river stage) to be used to define the IRF. These approaches are described in turn below.

| AquiMOD lumped groundwater level time series modelling
The AquiMOD model code has been developed and applied to various sites in the United Kingdom (Mackay et al., 2014a(Mackay et al., , 2014b. A summary is provided below for completeness, but further details are available in Mackay et al. (2014aMackay et al. ( , 2014b) and its application for seasonal forecasting is found in Mackay et al. (2015). AquiMOD consists of three basic elements: soil, unsaturated zone and the saturated zone, as described below: 1. Soil processesrecharge: The modified FAO method (Griffiths et al., 2008) is used to calculate recharge from the soil zone on a daily basis.

| Impulse response functions: Basic application
In order to assess and/or refine the previously proposed classification, an IRF methodology was applied, to obtain a fit to the The recharge calculation for this method is described in Long and Mahler (2013). The convolution is a time-series operation that is commonly used in non-distributed hydrologic models to simulate streamflow, spring flow, or groundwater level in response to recharge.
The use of convolution in modelling also has been described as a linear reservoir model and a transfer-function model. The discrete form of the convolution integral for uniform time steps used here is: where h i À j is the IRF, u j is the input or forcing/stress function; j and i are time-step indices corresponding to system input and output, respectively; N is the number of time steps in the output record; β i is an optional time-varying IRF scaling coefficient; φ i represents the errors resulting from measurement inaccuracy, sampling interval, or simplifying model assumptions; and d 0 is a hydraulic-head datum, namely the level to which hydraulic head would converge if the local recharge was eliminated (Long, 2015). The difference between i (output time step index) and j (input time step index), that is i À j, represents the delay time from impulse to response, and the IRF represents a distribution of these delay times. The input function is, in this case, the recharge, and the system response the groundwater level fluctuations. The IRF of the hydrologic system is approximated by a parametric function: the gamma function, equivalent to the Pearson type III function: where λ and η are non-dimensional shape parameters. The equation is approximated in RRAWFLOW by the discrete form: where t is time centred on each discrete time step, t 0 and N are time centred on the initial and final time steps, respectively, and Δt is the time step duration (Long, 2015). An additional scaling coefficient ε is introduced for increasing the range of hydrological applications: where ε (non-dimensional) compensates for hydrological systems that do not have a one-to-one relation between system input and output  Using a combination of manual trial-and-error and automatic optimisation based on the Monte Carlo approach (Hill & Tiedeman, 2006), the IRF parameters were fitted in order to produce a good fit to the borehole hydrographs. A single IRF was first considered, and its parameters fitted. This allowed the range of parameters used as input for the optimisation process to be reduced, and to include a second IRF. The resulting IRF for each borehole was then extracted, plotted, and described using a number of metrics. These metrics were chosen to quantify a number of characteristics of the IRF shapes. To define these metrics, the IRF was assumed to be a frequency distribution of the transit times of the response (hydraulic head). The selected metrics quantify the IRF shape independently from the scale so that comparisons are not weighted by IRF amplitude (which can vary from one location to another). Therefore, a number of ratios are calculated. The metrics included two scale-independent moments, skewness and kurtosis, and five ratios: standard-deviation/mean, standard-deviation/ memory, mean/memory, mode/memory, peak-height/area. These seven metrics were quantified for wet and dry periods separately, resulting in 14 metrics. These metrics are tabulated along with the general shapes of the optimized IRF, allowing comparison of the results for each borehole, and a new classification to be proposed.
2.6 | AQUIMOD-IRF: Impulse response functions using different stresses The modelling approach described above allows the identification of whether a particular groundwater level fluctuation behaviour requires the combination of two (or more) IRFs to obtain a reasonably good fit. This methodology was extended to use IRF to simulate groundwater level fluctuation, using rainfall, potential evapotranspiration (PE) and surface water level (river stage) as driving variables.
Indeed, such a modelling approach has been shown to be able to To calibrate the model, Monte Carlo simulation was performed using an initial range of parameters, for all the boreholes. An initial fit, using only the climatic stresses (rainfall and evaporation) was obtained. This allowed a reduction in the range of parameters of the climatic IRF θ p (see Equation (A4)) and the parameters produced which scale the evaporation (i.e. f in Equation (A3)). Using the reduced initial ranges, a second Monte Carlo run was carried out, this time including the surface water level stress and the associated IRF. In this way, the contribution of the surface water level stress to the improvement of the fit to the groundwater level fluctuations could be assessed. The boreholes for which the surface water level has a significant influence were then identified.

| RESULTS
The results are described and discussed first separately for each method used, and then summarized and discussed as a whole. The first application of AquiMOD to simulate the borehole response is described, then the single and double responses and finally the double response function with stresses are presented.

| Application of AquiMOD
A number of model structures have been tested; these include varying the number of layers for the saturated zone from one to three, as well as the inclusion or absence of an Unsaturated Zone. The results are summarized in Table 2 for the best-fit model (highest NSE) for each AquiMOD model. Some of this poor performance can be explained by the short data periods, for example Coupland and Renwick. It is also possible to improve the fit by removing the Unsaturated Zone components for the shallow boreholes situated on the banks of the River Leith, that is numbers 17/18 and numbers 20/21. However, in general it is more difficult to identify a model structure that reproduces effectively the hydrographs for the boreholes drilled in the Penrith sandstone. It is, therefore, necessary to understand what processes can be contributing to this failure to reproduce the hydrographs with the model.
Given that the best fit, consistently good for the calibration and the validation period, was obtained for the boreholes drilled into the St Bees sandstone formation, an example of the match obtained between modelled and observed is presented for the Hilton OBH (No. 9) which has a NSE of 0.94 for both calibration and validation ( Figure 5). To illustrate the identifiability or otherwise of the parameters obtained during calibration, an example of a sensitivity analysis is presented ( Figure 6).Specific yield shows identifiability and these are consistent between the fitted parameters (and model structures) for most of the St Bees boreholes. The exception is Scaleby, No. 6, which is situated in the north, on the outcrop of the Kirklinton sandstone, which is more silicified than the St. Bees sandstone, so represents a significantly different lithology.
It appears that to simulate the borehole response for the Penrith sandstone a different approach is required to that of just using AquiMOD as a lumped parameter groundwater model. The methods used to investigate the flow processes that could be operating, but which may not be able to be simulated by a lumped model are, therefore, described in the following section.  Table 3).

| Single, double impulse response function
F I G U R E 5 Comparison of observed and hydrograph and AquiMOD model output for best Nash Sutcliffe Efficiency (NSE) for calibration and validation at Hilton (No. 9; St. Bees) 4. A single IRF with an exponentially decreasing shape is obtained for the two boreholes situated in the Brockram at the base of the Penrith sandstone, for example Coupland (No. 14).
As mentioned earlier this work is an extension of that presented by Lafare et al. (2016) where three types of aquifer response were identi- To investigate how surface water flows and levels can influence groundwater response more fully the stress related IRF was implemented.   Figure 12) response function.

| AQUIMOD-IRF: Stress related double impulse response functions
The stress-related IRF provides another way of decomposing a groundwater level fluctuation time series, and provides reliable information on the degree of connection between aquifers and surface water in this groundwater system. undertaken by producing the IRF curves would then be able to identify the likelihood for a successful model calibration.

|
The shortcomings of the AquiMOD approach can be addressed using the time series IRF incorporated into the AquiMOD code structure (AquiMOD-IRF). In particular, the influence by other factors such as river stage on groundwater hydrographs can be explored by adding this as a stress within the IRF approach. It can be seen (e.g. Figure 10) that including this stress improves the simulation of groundwater hydrograph behaviour. This is particularly the case for the shallow boreholes which show a distinct influence from the river. Indeed proximity to rivers becomes an important factor in successfully deploying the combined AquiMOD-IRF approach. This work has clearly shown that the use of the lumped parameter models and IRFs in the analysis of groundwater level time series can distinguish the influences of differing hydrogeological environments and the impacts these have on the groundwater flow processes. They can be used, as shown in this paper, in a staged approach to help develop reliable and comprehensive conceptual models of groundwater flow. This can then be used as a solid basis for the development of distributed models, particularly as the latter are resource expensive to build and to calibrate effectively. This approach of using simple models and techniques first identifies specific aspects of catchment functioning, for example influence of the river, that can be later tested in a distributed model. Further as demonstrated here, these techniques allow the identification of particular processes to be included in the distributed model, for example two timescales of T A B L E 3 IRF summary statistics response (fast and slow). For distributed models, the zonation of hydrograph response identified using these techniques can be used directly in providing an initial basis for the distribution of parameters.

| SUMMARY AND CONCLUSIONS
It has been demonstrated that these techniques can help in the definition of areas of significant heterogeneity, the influence of various lithological horizons and of surface water/groundwater interaction on the groundwater piezometric behaviour.

ACKNOWLEDGEMENTS
Hughes publishes with the permission of the Executive Director, British Geological Survey (UKRI). The work was funded by a grant (NE/I006591/1) under the NERC Changing Water Cycle programme.
Dr Chris Jackson is gratefully acknowledged for his support in extending AquiMOD to include IRF. Dr Majdi Mansour is acknowledged for providing an internal BGS review. Two anonymous reviewers are thanked for their comments which helped improve the paper.

DATA AVAILABILITY STATEMENT
Data used in this paper are available from the Environment Agency and UK Met Office via the following APIs:

ORCID
Andrew G. Hughes https://orcid.org/0000-0001-9940-1813 same as that of precipitation p, but it is negative and is modelled as