Compartmentalisation and groundwater–surface water interactions in a prospective shale gas basin: Assessment using variance analysis and multivariate statistics on water quality data

An environmental concern with hydraulic fracturing for shale gas is the risk of groundwater and surface water contamination. Assessing this risk partly involves the identification and understanding of groundwater–surface water interactions because potentially contaminating fluids could move from one water body to the other along hydraulic pathways. In this study, we use water quality data from a prospective shale gas basin to determine: if surface water sampling could identify groundwater compartmentalisation by low‐permeability faults; and if surface waters interact with groundwater in underlying bedrock formations, thereby indicating hydraulic pathways. Variance analysis showed that bedrock geology was a significant factor influencing surface water quality, indicating regional‐scale groundwater–surface water interactions despite the presence of an overlying region‐wide layer of superficial deposits averaging 30–40 m thickness. We propose that surface waters interact with a weathered bedrock layer through the complex distribution of glaciofluvial sands and gravels. Principal component analysis showed that surface water compositions were constrained within groundwater end‐member compositions. Surface water quality data showed no relationship with groundwater compartmentalisation known to be caused by a major basin fault. Therefore, there was no chemical evidence to suggest that deeper groundwater in this particular area of the prospective basin was reaching the surface in response to compartmentalisation. Consequently, in this case compartmentalisation does not appear to increase the risk of fracking‐related contaminants reaching surface waters, although this may differ under different hydrogeological scenarios.


| INTRODUCTION
The rapid expansion of hydraulic fracturing (fracking) to exploit unconventional shale gas reservoirs in the United States has led to a range of environmental concerns: induced seismicity (Davies, Foulger, Bindley, & Styles, 2013); water usage and contamination (Kondash, Lauer, & Vengosh, 2018; Vengosh, Jackson, Warner, Darrah, & Kondash, 2014); fugitive methane (CH₄) emissions (Boothroyd, Almond, Qassim, Worrall, & Davies, 2016; Boothroyd, Almond, Worrall, Davies, & Davies, 2018); human health effects (Currie, Greenstone, & Meckel, 2017); air quality and noise (Goodman et al., 2016); and surface footprint (Clancy, Worrall, Davies, & Gluyas, 2018). Potential contamination of surface waters and groundwater from spills or subsurface contaminant migration has been a particularly common concern (Vidic, Brantley, Vandenbossche, Yoxtheimer, & Abad, 2013). As surface waters and groundwater can be hydraulically connected by pathways, contamination of either water body could result from surface activities, for example, spills and surface water discharge (Gross et al., 2013; Olmstead, Muehlenbachs, Shih, Chu, & Krupnick, 2013), or from the potential subsurface upward migration of formation fluids, stray gas or injected fluids (usually predominantly water, but chemicals can be added to reduce friction, help carry proppants, prevent biological growth and metal corrosion, and remove drilling mud damage) (Myers, 2012; Osborn, Vengosh, Warner, & Jackson, 2011; Warner et al., 2012). Consequently, the vulnerability of surface waters and shallow groundwater resources (<400 m deep as defined by UKTAG, 2011) must now also be considered from a bottom-up perspective (e.g. Loveless et al., 2019) in addition to the classic top-down approach for groundwater vulnerability from surface sources (e.g. Palmer & Lewis, 1998; Worrall & Kolpin, 2004).
In both cases an essential part of understanding the vulnerability of surface waters or groundwater is identifying groundwater-surface water interactions, which are indicative of potential pathways contaminants may follow.
Literature reports of proposed water contamination from fracking operations are relatively rare compared to the number of stimulated boreholes and are often disputed. In Weld County, CO, Gross et al. (2013) reported that 77 surface spills (0.5% of active wells) between July 2010 and July 2011 contaminated groundwater with benzene, toluene, ethylbenzene and xylene (BTEX) components of crude oil. In northeastern Pennsylvania and southeastern New York, Darrah, Vengosh, Jackson, Warner, and Poreda (2014) reported seven discrete clusters of fugitive gas contamination from 114 groundwater samples, and in central Texas one discrete cluster from 20 groundwater samples. Well integrity failure was hypothesized as the most likely contamination pathway and has also been proposed by others (e.g. Llewellyn et al., 2015). In Susquehanna County, PA, Jackson et al. (2013) and Osborn et al. (2011) found that shallow groundwater CH₄ concentrations increased with proximity to the nearest shale gas well. Conversely, it was argued that CH₄ is naturally ubiquitous in groundwater and that elevated CH₄ concentrations relate to topography and groundwater geochemistry (Molofsky et al., 2016). For other nations considering or in the early stages of shale exploitation, it is therefore important that the risk of water contamination is assessed, particularly where surface waters and groundwater form important natural resources.
Surface waters and groundwater in England provide on average 70 and 30% of public water supply, respectively (BGS, 2019a).
Water resources in England are managed under the Water Resources Act 1991 (UKPGA, 1991a) and the Water Industry Act 1991 (UKPGA, 1991b), as well as their subsequent revisions. Furthermore, the European Union Water Framework Directive requires EU member states to achieve good chemical and quantitative status of all water bodies (EU, 2000). Site-based environmental regulation in England, including at shale gas sites, is carried out by the Environment Agency (EA). Activities related to the onshore oil and gas industry require a range of environmental permits, for example mining waste permits, and authorisations under the Environmental Permitting Regulations 2016 (UKSI, 2016). These permits control discharges and any other relevant risks to the water environment. The EA also determine and publish water protection zones (e.g. Groundwater Source Protection Zones and Drinking Water Protected Areas Safeguard Zones) to protect water resources, as well as publishing River Basin Management Plans every 6 years which consider the water environment in each river basin. To date, two fracking operations (Preese Hall and Preston New Road), both located in the Bowland Basin, northwest England, have taken place (Figure 1). These operations targeted the Bowland Shale, which is considered to be England's largest prospective shale gas resource (Andrews, 2013).
The slow development of shale gas resources compared to that in the United States has provided the opportunity to undertake environmental baseline assessments of surface waters and groundwater (e.g. Ward et al., 2018), and further understand the water contamination risk posed by any fluids moving from the deep to shallow subsurface. Historic water quality monitoring, along with focused sampling, can be used for determining baseline conditions and understanding controls on risks to water quality. For example, such data have been used to assess the influence of underlying bedrock geology on surface water quality, and therefore groundwater interaction and the presence of hydraulic pathways, in specific river catchments (Jarvie, Oguchi, & Neal, 2002; Neal et al., 2011; Oguchi, Jarvie, & Neal, 2000) or geographic regions (Rothwell et al., 2010a, 2010b; Thornton & Dise, 1998). Statistical analyses of groundwater and surface water quality data are often employed to infer interaction (similarities indicating hydraulic pathways and vice versa) (e.g. Guggenmos, Daughney, Jackson, & Morgenstern, 2011).
However, geological information is not always included as an objective parameter. Likewise, groundwater-surface water interactions can be interpreted and quantified using the baseflow index method, but the inclusion of geological parameters is also subjective because it requires an initial 'expert judgement' (Bloomfield, Allen, & Griffiths, 2009). Additionally, the baseflow index method requires hydrograph data. In the United Kingdom, hydrograph data are generally only available on major rivers and tributaries, and are therefore not usually available in the same spatial density as water quality monitoring datasets.
FIGURE 1 Map of the study region (red box on inset map) showing surface water and groundwater sampling locations, and the shale gas sites of Preese Hall (PH) and Preston New Road (PNR). Source: Prospective area of the Bowland Shale from Andrews (2013). Contains OS data © Crown copyright and database right (2018)
This study includes geological bedrock formations in the statistical analysis of surface water and groundwater quality data from a prospective shale gas basin to investigate groundwater-surface water interactions, and thus potential contaminant pathways. Furthermore, Wilson, Worrall, Davies, and Hart (2017) showed that groundwater compartmentalisation by low-permeability faults can restrict regional horizontal groundwater flow and encourage upward flow, thereby increasing the vulnerability of shallow groundwater to contamination from the upward migration of fracking-related fluids. However, as yet no study has demonstrated if compartmentalisation increases the risk to surface waters with respect to contamination from below or whether compartmentalisation can be identified from surface water quality data alone. Although compartmentalisation can be effectively identified using subsurface data, for example, water levels, chemistry and pressure (Hamaker & Harris, 2007; Hortle, Xu, & Dance, 2009; Mohamed & Worden, 2006), the drilling of new groundwater monitoring boreholes can be expensive and time-consuming. For example, when monitoring for groundwater contamination at Pavillion, WY, the expense of drilling boreholes was the main limiting factor in the number of monitoring boreholes installed by the Environmental Protection Agency (DiGiulio, Wilkin, Miller, & Oberley, 2011).
In some prospective basins surface water quality data may provide an alternative, cost-effective and higher spatial resolution method for identifying compartmentalisation and assessing groundwater-surface water interactions as a means of evaluating the vulnerability of water resources to contamination from shale gas operations. Therefore, the aims of this study were to determine: if groundwater in underlying bedrock formations influences surface water quality, thereby indicating potential contaminant pathways; if groundwater compartmentalisation could be identified from surface water quality data; and if groundwater compartmentalisation increases the risk to surface waters.

| APPROACH AND METHODOLOGY
The main approach taken was a factorially designed survey of a newly collected surface water quality dataset. It was not possible to use existing EA surface water quality data because of the sparser sampling density and the inconsistent sampling frequency since the establishment of the publicly available EA dataset in the year 2000. However, publicly available EA groundwater quality data since the year 2000, compiled by Wilson, Worrall, Davies, and Hart (2019), were analysed with the new surface water quality data to further the interpretation.

| Study region
This study considered the rivers and aquifers that cross the Bowland Basin in northwest England. The basin contains the Bowland Shales, which may be the United Kingdom's largest prospective shale gas resource (Andrews, 2013). Bedrock geology across the study region ranges in age from Carboniferous to Triassic. In the low-lying west of the basin (the Fylde) bedrock consists of the Triassic Mercia Mudstone and Sherwood Sandstone Groups (Figure 2). In the northern
Precipitation over the study region can be split into two zones which correspond to both elevation and bedrock geology. The low-lying Fylde, which is predominantly arable land, has average precipitation <1,000 mm/year. Across the higher elevations of the Forest of Bowland, which is made up of moorland and rough pastures, average precipitation is 1,800 mm/year (Mott MacDonald, 1997, 2010). The study region is also split in two by the two major river catchments present: the River Wyre and the River Ribble catchments. The River Wyre,
The Sherwood Sandstone Group forms the principal aquifer in the eastern Fylde and is the focus of groundwater abstractions in the study region. Recharge of the Sherwood Sandstone Group is considered to occur by two mechanisms. In the northern Fylde most recharge is considered to occur as vertical leakage through the overlying superficial deposits where low-permeability glacial till is absent (Mott MacDonald, 1997; Sage & Lloyd, 1978). In the southern Fylde it is thought that lateral inflow from the adjacent Carboniferous strata, driven by the topographic difference, helps recharge the Sherwood
Unfiltered water samples (25.0 mL) were acidified in the field (1.0 mL of 30% nitric acid) to fix metal ions prior to laboratory analysis. Water temperature, electrical conductivity, pH and redox potential were measured in the field using electrode methods. Samples were refrigerated on the same day as returning from the field. A further 111 locations, including 5 locations common to the first campaign, were sampled from 10 to 14 September 2018 in the River Ribble and Hodder catchments. In total 239 surface water samples were collected from 234 unique locations (Figure 1).
FIGURE 2 Map of the study region showing surface water sampling locations with respect to underlying bedrock geology and faults mapped by the BGS. PH and PNR are the shale gas sites of Preese Hall and Preston New Road, respectively. Source: Contains BGS data © Crown copyright and database right (2019). A BGS/EDINA supplied service

| Temperature correction for electrical conductivity
To compare electrical conductivity between sampling locations it was necessary to normalize field conductivity measurements to a standard temperature. In line with environmental water quality monitoring undertaken by the EA, field conductivity was normalized to 25 °C (specific conductance) using the linear equation of Sorensen and Glass (1987):
$$EC_{25} = \frac{EC_t}{1 + a(t - 25)}$$
where $EC_t$ is electrical conductivity measured in the field at temperature $t$ (°C), $EC_{25}$ is electrical conductivity at 25 °C, and $a$ is a temperature compensation factor. A standard value of $a = 0.02$ was used (Hem, 1985; Matthess, 1982).
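As a rough illustration of this normalization, the sketch below applies a linear temperature compensation in Python. It assumes the commonly used form EC25 = ECt / (1 + a(t − 25)) with a = 0.02; the function name `specific_conductance` and the example reading are hypothetical and are not taken from the study data.

```python
def specific_conductance(ec_t, temp_c, a=0.02):
    """Normalize a field conductivity reading to 25 degrees C.

    Assumes the linear compensation EC25 = ECt / (1 + a * (t - 25)),
    where `a` is the temperature compensation factor.
    """
    denominator = 1.0 + a * (temp_c - 25.0)
    if denominator <= 0:
        raise ValueError("temperature too low for the linear model")
    return ec_t / denominator


# Hypothetical reading: 400 uS/cm measured at 15 degrees C.
ec25 = specific_conductance(400.0, 15.0)
print(round(ec25, 1))  # 500.0
```

With a = 0.02, each degree below 25 °C raises the corrected value by roughly 2% of the reading, so samples collected in cold upland streams and warmer lowland rivers become directly comparable.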

| Tidally influenced sampling locations
Due to the low-lying nature of the Fylde and proximity to the Irish Sea, some sampling locations were tidally influenced. Surface water samples from these locations could be some mixture of sea and fresh water, depending on the tidal direction and river discharge. This study focussed on fresh water and so samples considered to be dominated or strongly influenced by sea water were removed from the dataset (Figure 1). Locations to be removed were identified by abnormally elevated conductivity measurements, and in some cases elevated pH and reduced redox potential compared to non-tidal water samples.

| Duplicate sampling locations
To allow the datasets from the two fieldwork campaigns to be combined, five locations were sampled in both campaigns. The five duplicate locations were chosen to cover the geographical extent and varying elevation of the study area. The two sets of results for specific conductance, pH and redox potential from these five locations were compared using one-way analysis of variance (ANOVA). The one factor considered ('Campaign') had two levels ('First' or 'Second' campaign) and the ANOVA was run with and without elevation as a covariate ('Elevation'). Sample location elevations were extracted from Ordnance Survey (OS) Terrain 50 data (OS, 2019) using Esri ArcGIS 10.3. Statistical significance was judged at the 95% probability of the factor not having zero effect. Prior to ANOVA, measurements were tested for normality using the Anderson-Darling test (Anderson & Darling, 1952) and were transformed if necessary.
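The normality check and transformation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it computes the Anderson-Darling A² statistic against a normal distribution fitted to the sample and keeps whichever of the raw or log-transformed data gives the smaller A². The function names, the choice of log10 and the example values are all assumptions made for illustration.

```python
import math


def anderson_darling_a2(values):
    """Anderson-Darling A^2 statistic against a fitted normal distribution."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("values are constant")
    z = sorted((v - mean) / sd for v in values)

    def cdf(x):
        # Standard normal CDF, clamped away from 0 and 1 so log() is safe.
        p = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        return min(max(p, 1e-300), 1.0 - 1e-16)

    total = 0.0
    for i in range(n):
        # (2i + 1) with zero-based i equals (2i - 1) in the textbook formula.
        total += (2 * i + 1) * (
            math.log(cdf(z[i])) + math.log(1.0 - cdf(z[n - 1 - i]))
        )
    return -n - total / n


def pick_transform(values):
    """Return ('log', A2) if log10 data have the smaller A^2, else ('raw', A2).

    Values must be strictly positive for the log transform.
    """
    a2_raw = anderson_darling_a2(values)
    a2_log = anderson_darling_a2([math.log10(v) for v in values])
    return ("log", a2_log) if a2_log < a2_raw else ("raw", a2_raw)


# Hypothetical right-skewed concentrations (mg/L), not study data.
sample = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
print(pick_transform(sample)[0])  # log
```

A smaller A² indicates a closer fit to normality, so comparing the statistic before and after transformation gives a simple, reproducible rule for deciding whether a determinand should be transformed.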

| Ion concentration analysis
Of

| Factorial survey design
The study was designed to answer two questions using ANOVA: does bedrock geology influence surface water quality and can groundwater compartmentalisation affect surface water quality data?
To assess the former question three ANOVAs were run. Firstly, a one-way ANOVA was run on the surface water field measurements and ion concentrations (collectively referred to as 'determinands'), with and without elevation included as a covariate ('Elevation').
Underlying bedrock geology at sampling locations, determined using British Geological Survey (BGS) 1:625,000 Bedrock Geology data, was
To assess the potential effect of groundwater compartmentalisation on surface water quality data a three-way ANOVA was con-
Prior to any ANOVA, data were tested for normality using the Anderson-Darling test (Anderson & Darling, 1952) and transformed if necessary. Statistical significance was judged at the 95% probability of the factor or interaction not having zero effect. Results are presented as least squares means (otherwise known as marginal means). The proportion of the variance explained by significant factors, interactions and covariates was calculated using the generalized ω² method (Olejnik & Algina, 2003). Where factors had more than two levels, post hoc Tukey tests were carried out to assess where significance lay within factors.
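To make the variance analysis concrete, the sketch below runs a one-way ANOVA in plain Python and reports the F statistic, its p-value and the proportion of variance explained. It is a simplified stand-in under stated assumptions: it uses the classical ω² for a one-way design rather than the generalized ω² of Olejnik and Algina (2003), it omits the Elevation covariate and the post hoc Tukey tests, and the group values are hypothetical. The helper names `_betacf`, `_betainc` and `one_way_anova` are illustrative only.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd terms of the continued fraction for this iteration.
        for aa in (
            m * (b - m) * x / ((a - 1.0 + m2) * (a + m2)),
            -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log(1.0 - x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def one_way_anova(groups):
    """Return F, p-value and classical omega-squared for a one-way design."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_within = ss_total - ss_between
    df_b, df_w = k - 1, n - k
    ms_within = ss_within / df_w
    f_stat = (ss_between / df_b) / ms_within
    # Upper tail of the F distribution via the incomplete beta function.
    p_value = _betainc(df_w / 2.0, df_b / 2.0, df_w / (df_w + df_b * f_stat))
    omega_sq = (ss_between - df_b * ms_within) / (ss_total + ms_within)
    return f_stat, p_value, omega_sq


# Hypothetical specific conductance (uS/cm) by bedrock group; not study data.
groups = {
    "Mercia Mudstone": [620.0, 580.0, 700.0, 655.0],
    "Sherwood Sandstone": [410.0, 450.0, 390.0, 430.0],
    "Millstone Grit": [150.0, 180.0, 210.0, 165.0],
}
f_stat, p_value, omega_sq = one_way_anova(list(groups.values()))
print(f"F = {f_stat:.1f}, p = {p_value:.2g}, omega^2 = {omega_sq:.2f}")
print("significant at 95%" if p_value < 0.05 else "not significant")
```

In practice a statistics package would normally be used, particularly once covariates and factor interactions are included; the point here is only to show how the F ratio and the variance proportion arise from the between-group and within-group sums of squares.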
Power analysis was also performed post hoc to estimate what effect size could have been detected given the sampling design used to investigate the impact of compartmentalisation, that is, the effective detection limit for differences between bedrock formations and across the Woodsfold fault. Power analysis was performed using G*Power 3.1 software (Faul, Erdfelder, Lang, & Buchner, 2007); a priori the acceptable power was set at 0.95 (a false negative probability β = .05). The G*Power software measures effect size ($f$) using the measured value of ω² as derived above from the method of Olejnik and Algina (2003):
$$f = \sqrt{\frac{\omega^2}{1 - \omega^2}}$$
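The conversion between a proportion of variance explained and the effect size used in power analysis can be sketched as below. This assumes the standard Cohen relation f = sqrt(ω² / (1 − ω²)), with ω² substituted for the η² that G*Power's own documentation uses; the function names are hypothetical.

```python
import math


def cohens_f_from_omega_sq(omega_sq):
    """Convert a proportion of variance explained to Cohen's effect size f."""
    if not 0.0 <= omega_sq < 1.0:
        raise ValueError("omega_sq must be in [0, 1)")
    return math.sqrt(omega_sq / (1.0 - omega_sq))


def omega_sq_from_cohens_f(f):
    """Inverse conversion: effect size f back to proportion of variance."""
    return f * f / (1.0 + f * f)


# An effect explaining 10.7% of the variance, the detectable effect
# reported for the compartmentalisation design in this study.
f = cohens_f_from_omega_sq(0.107)
print(round(f, 3))  # 0.346
```

The inverse function is useful when a power calculation returns the smallest detectable f for a given design and that value needs to be expressed as a share of the variance.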

| RESULTS
Four surface water samples were identified as being dominated or strongly influenced by sea water. These samples were removed prior to further analysis, leaving a total of 235 surface water samples from 231 unique sampling locations. All field measurements (235 samples) and ion concentrations (170 samples) are provided in Tables S1 and S2, respectively.

| Duplicate sampling locations
For field measurements at the duplicate sampling locations (Table S3), the Anderson-Darling test indicated no transformations were required prior to ANOVA. ANOVA showed that differences in specific conductance, pH and redox potential between the fieldwork campaigns were not significant (Table 1). Elevation was not a significant covariate for pH and redox potential but was significant for specific conductance; however, the Campaign factor remained non-significant.
with respect to the difference between the campaigns, further assuring that the two campaigns could be directly compared. Given these results the data from the two campaigns were combined for further analysis without any corrections required.

| Surface water quality and bedrock geology
Anderson-Darling tests indicated that field measurements required no transformation prior to ANOVA. ANOVA showed that Geology was a significant factor controlling the specific conductance, pH and redox potential of surface water samples across the basin (Table 2). Elevation was a significant covariate for all ions, except Fe, and increased the overall fit of the models (i.e. R² values increased), but in all cases did not alter the significance of the Geology factor, that is, there was a geological control on surface water quality over and above that due to elevation. The proportion of variance explained by the Geology factor was greater than that explained by the Elevation covariate for all ions except Mg (Table 2).
Post hoc Tukey tests showed that significant differences lay between the Mercia Mudstone Group, the Sherwood Sandstone Group, and the Millstone Grit and Bowland High and Craven Groups for Ca, Mg, Na and SO₄ (Figure 5 and Table 2). For Fe and Mn, the Mercia Mudstone Group was the only bedrock formation that was significantly different from all other formations, excluding the Lower Coal Measures (Figure 5 and Table 2). For K, the Mercia Mudstone and Sherwood Sandstone Groups were grouped together and were significantly different from the Millstone Grit and Bowland High and Craven Groups (Figure 5 and

| Groundwater quality and aquifers
All groundwater determinands except pH were log-transformed prior to ANOVA because of improved Anderson-Darling test values compared to the raw data. ANOVA showed that Aquifer was a significant factor in explaining differences in all determinands, bar pH, across the aquifer formations (Figure 5 and Table 3).

| Combined surface water and groundwater quality ANOVA
Prior to ANOVA all determinands, except pH, were log-transformed.
ANOVA results showed that Geology was a significant factor for all determinands (Figure 6). No significant differences were observed for any determinands between groundwater and surface waters of the Coal Measures, and only K showed a significant difference between groundwater and surface waters of the Sherwood Sandstone Group (Figure 6). For the Carboniferous, significant differences between groundwater and surface waters occurred for Ca and Na, but not for
The power analysis performed for the experimental design was irrelevant for the Geology factor because all determinands proved to be significant at the 95% probability. For the Water body factor the detectable difference was greater for each determinand than for the Geology factor (Table 4); that is, it was easier to detect differences between the geological formations than between the water bodies. For example, for specific conductance the detectable difference between surface waters and groundwater was 142 μS/cm whereas the detectable difference between the different geological formations was lower at 125 μS/cm. However, it should be noted that while the detectable difference for Na concentration was 1850 mg/L, the detectable difference for pH was only 0.15 pH units, which may not be physically meaningful given the typical accuracy of field measurements for this determinand.

| Water quality trends and end-members
The PCA reduced the eight surface water determinands from the 170 surface water samples to two PCs, which explained 79.5% of the variance in the data (Table 5). All determinands had positive loadings in PC1, suggesting that PC1 was a general concentration component.
In PC2 the strongest positive loadings were for Fe and Mn, and the strongest negative loading was for SO₄ (Table 5).
(Table S2). SW-B was located in the northwest
PCA on the combined surface water and groundwater data also reduced the eight determinands to two PCs, which explained 75.3% of the data variance (Table 5). In PC1 all determinands had positive loadings, again suggesting PC1 was a general concentration component. In PC2 the strongest loadings were for Fe and Mn, but with reversed polarity to the PCA on just the surface water data (
TABLE 2 ANOVA and post hoc Tukey test results for the field measurements and ion concentrations

| Identifying compartmentalisation
The compartmentalisation of groundwater in the central and southern Fylde by the Woodsfold fault is readily identifiable from groundwater quality data. However, the potential effect on surface water quality is unknown. Therefore, a three-way ANOVA was conducted on the surface water quality data to investigate if groundwater compartmentalisation affects surface water quality and thereby, if surface water quality data might be used to identify groundwater compartmentalisation.
Prior to the ANOVA all determinands, except Ca, pH and redox potential, were log-transformed because of improved normality.
When single factors were considered the only significant differences observed were for SO₄ between the Wyre and Ribble catchments and Mn across the Woodsfold fault (Table 6). When two-way and three-way factor interactions were considered no significant differences were observed (Table 6). Critically, the interaction between the Catchment factor (with levels Wyre or Ribble, that is, north or south in the Fylde) and the Fault factor (with levels East or West of the Woodsfold fault) was not significant, indicating that compartmentalisation across the Woodsfold fault did not significantly affect the surface water quality data. This result could have been due to a lack of effect of groundwater compartmentalisation on surface waters, or that the sampling design was not sufficient to detect the difference, that is, the power of the design was not sufficient and a false negative exists.
However, the power analysis showed that it would have been possible at 95% significance and 95% power to detect an effect which explained 10.7% of the original variance. For specific conductance this corresponded to a detectable difference of 157 μS/cm across the Woodsfold fault (Table 6), which is smaller than the difference in mean specific conductance observed between the Mercia Mudstone and Sherwood Sandstone Group surface water samples (Figure 9). Therefore, the survey design was capable of detecting a compartmentalising effect with a difference smaller than that observed between the underlying bedrock formations.
(Neal et al., 2011; Rothwell et al., 2010a, 2010b; Soulsby et al., 2007; Thornton & Dise, 1998). Fylde (Wilson, 1990; Wilson & Evans, 1990). Considering SW-C is (Tellam, 1995).

| Does compartmentalisation affect surface water quality?
The Figure 4) is based on BGS (1990) and Ove Arup and Partners Ltd. (2014c)

| Implications for shale gas exploitation
(Sophocleous, 2002). As a result, the receptors of potential contaminants would also change in response to directional flow changes in the pathway.
For the study region surface water quality data indicated regional-scale groundwater-surface water interaction through the superficial deposits, highlighting the importance of localized investigations to identify specific pathways and receptors in shale gas environ-

| Study limitations
The chemical analyses in this study were limited to field data and ion concentrations due to financial constraints. The additional analysis of other ion concentrations and isotopes (radiogenic and stable) may improve understanding of groundwater-surface water interactions.
Additional non-chemical analyses that could be undertaken to investigate groundwater-surface water interactions include the use of hydrograph and temperature data. Hydrograph data can be used to estimate the contribution of groundwater to river flow (or vice versa) through the calculation of baseflow indices (Eckhardt, 2008) and flow accretion values and indices (Grapes, Bradley, & Petts, 2005). However, the use of hydrographs in basins such as the Bowland Basin may be limited by the spatial density of gauging stations, which are often only located on major rivers and tributaries. Conversely, water temperature data can often be collected easily from a greater number of locations. Given that groundwater can be warmer or cooler than surface waters, temperature anomalies in surface waters could be used to infer groundwater-surface water interaction (Briggs, Hare, Boutt, Davenport, & Lane, 2016).

| CONCLUSIONS
Understanding how groundwater and surface water systems interact is essential for the management and protection of water resources. A variety of data types and methods exist to investigate groundwater-surface water interactions but these can be costly, and thus spatially limited at a basin scale. However, surface water quality data can often be collected and analysed at lower costs, resulting in datasets to which rigorous statistical analyses can be applied rather than subjective interpretations of small datasets. In prospective shale gas basins identifying groundwater compartmentalisation and groundwater-surface water interactions is particularly important for understanding potential contaminant pathways. Using surface water quality data from a prospective basin we showed that bedrock geology was a significant factor influencing surface water quality across the prospective basin, implying regional-scale groundwater-surface water interactions despite the near-ubiquitous presence of superficial deposits with an average thickness of 30-40 m. Principal component analysis supported this conclusion by showing that surface water compositions were constrained within groundwater end-member compositions. Surface water quality data showed no relationship with previously identified groundwater compartmentalisation, even though the statistical analysis was of sufficient power to identify the impact of different aquifer geology on the same surface waters. We propose that although surface waters appear to interact with a weathered bedrock layer via shallow circulation through glaciofluvial sands and gravels of the superficial deposits, there is no chemical evidence to suggest that deeper groundwater in this area of the prospective basin was reaching the surface in response to compartmentalisation-enhanced flow. Consequently, compartmentalisation in this area of the prospective basin does not appear to increase the risk of fracking-related contaminants reaching surface waters.