Baseflow Age Distributions and Depth of Active Groundwater Flow in a Snow‐Dominated Mountain Headwater Basin

Deeper flows through bedrock in mountain watersheds could be important, but lack of data to characterize bedrock properties limits understanding. To address data scarcity, we combine a previously published integrated hydrologic model of a snow‐dominated, headwater basin of the Colorado River with a new method for dating baseflow age using dissolved gas tracers SF6, CFC‐113, N2, and Ar. The original flow model predicts the majority of groundwater flow through shallow alluvium (<8 m) sitting on top of less permeable bedrock. The water moves too quickly and is unable to reproduce observed SF6 concentrations. To match gas data, bedrock permeability is increased to allow a larger fraction of deeper and older groundwater flow (median 112 m). The updated hydrologic model indicates interannual variability in baseflow age (3–12 years) is controlled by the volume of seasonal interflow and tightly coupled to snow accumulation and monsoon rain. Deeper groundwater flow remains stable (11.7 ± 0.7 years) as a function mean historical recharge to bedrock hydraulic conductivity (R/K). A sensitivity analysis suggests that increasing bedrock K effectively moves this alpine basin away from its original conceptualization of hyperlocalized groundwater flow (high R/K) with groundwater age insensitive to changes in water inputs. Instead, this basin is situated close to the precipitation threshold defining recharge controlled groundwater flow conditions (low R/K) in which groundwater age increases with small reductions in precipitation. Work stresses the need to explore alternative methods characterizing bedrock properties in mountain basins to better quantify deeper groundwater flow and predict their hydrologic response to change.


Introduction
Baseflow represents stream water derived from both shallow and deep subsurface flow paths that sustain late summer streamflow after precipitation or snowmelt events cease. Baseflow is recognized as an important source to stream water in mountainous watersheds (Gabrielli et al., 2012;Hale & McDonnell, 2016;Miller et al., 2016;Rumsey et al., 2015). It reflects the integrated effects of surface processes controlling hydrologic partitioning of rain and snowmelt to evapotranspiration (ET) and subsurface flow, as well as the distribution of subsurface permeability and the relative importance of groundwater circulating to different depths. Interflow or shallow, ephemeral flow through the soil (and saprolite) occurs through either strong permeability contrasts or a seasonally rising water table. Interflow reaches a stream network on the order of days to weeks. In contrast, saturated groundwater moving through alluvial and bedrock units can have a wide range of flow paths and travel times reflect lithology, fracture networks, and geologic structure (Heidbüchel et al., 2012). Groundwater contributions to streams are important for the biologic integrity of the river network (Meyer et al., 2007;Missik et al., 2019), while the time water spends in the subsurface interacting with host material directly influences biogeochemical processes that control mineral weathering (Winnick et al., 2017) and carbon dynamics Perdrial et al., 2018). Subsurface residence time also indicates the degree of catchment memory of past inputs to reflect hydrologic sensitivity to land use and climate change (McGuire et al., 2005) and potential persistence of contamination (Mahlknecht et al., 2017).
There is a growing recognition that deeper parts of bedrock aquifers in mountain watersheds could be an important component of the hydrologic system (Condon et al., 2020), transmitting and storing a larger amount of water and having a larger influence on stream biogeochemical processes than previously appreciated. However, the codependant relationships between climate, topography, and vegetation in mountain watersheds and their interactions with subsurface geology to affect depth of subsurface flow paths and resultant stream water age distributions remains poorly understood. This is largely due to a lack of data characterizing watershed-scale heterogeneity of snow distribution and melt dynamics (Deems et al., 2006;Harpold et al., 2012), soil water storage and losses to ET (Allen et al., 2013;Wang & Dickinson, 2012), subsurface hydraulic properties (Meixner et al., 2016), and active circulation depths (Frisbee et al., 2016).
Lumped parameter approaches have been applied to numerous catchments lacking detailed hydrologic characterization (McGuire & McDonnell, 2006). The age of streamflow (or catchment transit time) is estimated by the convolution of time-varying inputs of an environmental tracer (e.g., 3 H, δ 2 H, δ 18 O, and Cl) applied uniformly across a watershed and lagged through the subsurface by assuming a travel time distribution (e.g., piston flow [PF], exponential, gamma, Weibull, and dispersion) that is adjusted to match observed tracer concentrations in streamflow. Recently, lumped-parameter approaches have been developed that include variance in travel time distributions to address seasonal changes in flow pathways and mobilization of stored water of varying age (Botter et al., 2011;Harman, 2015;van der Velde et al., 2012). These analytical solutions have been applied to a variety of scenarios, including the influence of snow processes on streamflow (Fang et al., 2019) and selective vegetation uptake (Smith et al., 2018). In contrast to these low-order modeling strategies, numerical mechanistic models and particle tracking can include complex boundary conditions and directly represent physical and hydrological characteristics dictating catchment flow pathways that determine the travel time distribution (Danesh-Yazdi et al., 2018;Engdahl et al., 2016;Maxwell et al., 2016). These models provide a powerful platform to study basin sensitivity to changing climate and other conditions, but their application in steep, mountainous basins is still limited by data scarcity, with bedrock hydrologic data on permeability, porosity, and flow rates extremely rare. Data that do exist are complicated by the difficulty in characterizing fractured bedrock (Cesano et al., 2003) that are typically dominant in mountain basins. Lack of subsurface data in mountain systems makes quantifying groundwater flow at depth and its relative importance in mountain hydrology uncertain. This knowledge gap remains a major impediment to properly incorporating the deeper subsurface flow system into hydrologic models of mountainous watersheds and assessing the relative importance of groundwater on stream water exports.
To address data scarcity inherent to mountain systems, we apply a method for dating baseflow presented by Sanford et al. (2015) using stream water N 2 , Ar, SF 6 , and CFC-113 observations collected over a 12-hr period in Copper Creek, Colorado (24 km 2 ), a snow-dominated, headwater basin of the Colorado River. The method's effectiveness for describing a unique streamflow age distribution has been demonstrated in low-to-moderate-gradient streams but has not been applied in a high-gradient system where it could be hampered by fast gas exchange velocities to the atmosphere (Gleeson et al., 2018;Solomon et al., 1998). This study seeks to validate the gas tracer methodology in a high gradient stream and use the baseflow age information as an indirect observation of subsurface properties within a previously published hydrologic model of the basin . The second objective is to use the gas tracer informed hydrologic model to explore the relative importance of shallow versus deeper subsurface flow contributions to stream water as a function of baseflow age. Lastly, we explore sensitivity of groundwater flow paths to changes in climate and land use.

Site Description
The East River (85 km 2 ) is a seasonally snow covered, mountainous watershed in the headwaters of the Upper Colorado River (Figure 1) located near Crested Butte, Colorado. A comprehensive overview of the site is provided by Carroll et al. (2018) and Hubbard et al. (2018). Climate is continental subarctic, and stream discharge is dominated by snowmelt with peak flows occurring in early June and receding through the summer and fall. Monsoon rains occur in the summer months. This study is focused on Copper Creek (24 km 2 ), the largest tributary of the East River. Land cover is predominantly barren alpine (50%) and conifer forests (36%) with smaller representations of meadows, shrubs, and aspen. Elevations range from 2,880 to 4,128 m. The upper portion of the watershed and adjacent peaks is underlain by Tertiary granodiorite. This younger, intrusive rock has upturned Paleozoic and Mesozoic sedimentary strata into steeply dipping hydrostratigraphic units that underlay the lower portion of Copper Creek (Figure 1b). Talus, rock glaciers, and alluvial fans dominate surficial deposits within Copper Creek's glacially sculpted valleys (Gaskill et al., 1991). Descriptions of individual geologic units are provided in Table 1.  (Gaskill et al., 1991) with geologic units described in Table 1. Original hydrological modeling of Copper Creek for years 1987-2018 captures a wide range of snow accumulation and snowmelt scenarios and estimates annual precipitation equal to 1.39 ± 0.27 m/year, of which 78 ± 7% is snow (mean ± sd; see section 3.2 for a description of method) . Annually, the basin is simulated as energy limited (potential ET < precipitation) with average ET losses equal to 36% of annual precipitation. The model indicates that the bulk of ET is lost from the soil zone (67% ET), and lesser amounts are lost from sublimation (13% ET), canopy evaporation (9% ET), and groundwater ET (11% ET). On average, stream water is estimated 65% interflow and 35% groundwater, with groundwater volumetric contributions to streamflow relatively stable across the historical period.

Atmospheric Gas Tracers 3.1.1. Sample Collection and Analysis
Stream water dissolved gas samples were collected following the approach presented by Sanford et al. (2015). Samples were collected hourly for two relatively inert atmospheric gases (N 2 and Ar) and two gas age tracers (SF 6 and CFC-113) over a 12-hr period. The stream sampling site, CC03 (Figure 1), was chosen because it is located low in the watershed, allowing an integrated baseflow age estimate representing most of Copper Creek but is above a steep canyon containing large waterfalls to minimize degassing. The stream section immediately upstream of the site was free of deep pools and zones of anomalously high turbulence. The location is 2 km above the confluence of Copper Creek to the East River and resides just below the geologic interface between the upper basin bedrock of granodiorite and the lower basin sedimentary strata of higher permeability. The experiment was conducted on 27 August 2017, which is late enough in the year to potentially avoid monsoon rains and possible surface runoff contributions to the stream yet early enough to still have large diurnal stream temperature fluctuations characteristic of summer months. Stream water temperature was measured every 5 min (Solinst Level Logger Edge M3001 LT F6/M2) beginning the day Table 1 Geologic Units in Copper Creek (Modified From Gaskill et al., 1991)

10.1029/2020WR028161
Water Resources Research before and extending through the gas sampling period. Local barometric pressure was obtained from a Rocky Mountain Biological Laboratory weather station located 1.9 km from CC03 ( Figure 1) and adjusted for the elevational difference of 70 m. Water samples for N 2 , Ar, SF 6 , and CFC-113 were collected in duplicate every hour from 8:30 a.m. to 8:30 p.m. (supporting information Tables S1 and S2), to capture minimum and maximum diurnal stream fluctuation. Samples were collected using a peristaltic pump and polyethylene tubing placed several centimeters off the stream bed (stream depth 0.3 m). Sample containers were filled using techniques described at the U.S. Geological Survey's (USGS) Reston Groundwater Dating Laboratory website (USGS, 2017), stored on ice, and shipped the next day to the USGS Dating Laboratory for analysis. Samples for CFC-113 and SF 6 were analyzed using purge and trap gas chromatography with an electron capture detector (GC-ECD), while samples for N 2 and Ar were analyzed using gas chromatography with a thermal conductivity detector (GC-TCD, USGS, 2017). Measurement errors based on the sample duplicates were 1.7% for N 2 , 1.4% for Ar, and 1% for SF 6 and CFC-113, these being consistent with labreported errors (USGS, 2017).
Dissolved SF 6 concentrations were measured in three perennial springs located in the vicinity of CC03 to provide independent groundwater age information ( Figure 1). Samples were collected in July and October 2017. Argon and N 2 were measured in two of these springs. Spring CCS is located higher in the Copper Creek watershed 1.7 km from CC03, and springs RCS and RGS are located within adjacent tributaries of the East River 5.4 and 6.7 km from CC03, respectively. All three springs are at elevations between 3,200 to 3,500 m. Dissolved gases were collected using a peristaltic pump with the intake attached to a PushPoint pore water sampler (PPX36, https://www.mheproducts.com/) inserted into the shallow sediment at the bottom of each spring pool. Sample collection followed the same protocols indicated above (USGS, 2017), and all samples were analyzed by the USGS Dating Lab.

Tracer Data Interpretation
Following Sanford et al. (2015), measured stream dissolved gas concentrations were used with stream temperature and local atmospheric pressure measurements to simultaneously solve for the rates of gas and water exchange into and out of the stream, as well as the concentration of the gases in groundwater discharging into the stream. A control volume approach accounts for all inputs and outputs of water and gas along the length of the stream and assumes gases are nonreactive. The change in gas concentration in the stream (C s [M/L 3 ]) with time [t] was described by the following mass balance equation (Equation 11 in Sanford et al., 2015): where C gw (M/L 3 ) is the gas concentration in groundwater discharging into the stream; C e , (M/L 3 ) is the atmospheric equilibrium gas concentration; τ w (t) is the water residence time in the stream, equal to the stream depth divided by the rate of upward groundwater seepage through the streambed; and τ g (t) is the gas residence time in the stream, equal to the stream water depth divided by the gas transfer velocity (v g [L/t]). The gas transfer velocity governs the rate of gas exchange between the stream and the atmosphere. Stream concentrations are an intermediate value between C gw and C e ; if τ w << τ g , then C s approaches C gw , and if τ w >> τ g , then C s approaches C e and it is difficult to discern C gw .
The equilibrium gas concentration fluctuates due to diurnal stream temperature oscillations and can be computed using Henry's Law with the form: where P a is the dry atmospheric pressure (atm), T is the stream temperature (Kelvin), and a, b, c, and d are gas-specific coefficients. The method takes advantage of the oscillatory variation of C e with time to simultaneously solve for C gw , τ w , and τ g for each gas using an explicit finite difference representation of Equation 1. Because τ g is one of the computed parameters, a key advantage of this method, over many other techniques using dissolved gas tracers in streams to characterize groundwater inputs, is that it does not require an independent determination of v g . The purpose of measuring concentrations of N 2 and Ar, in addition to the age tracer gases, is that they can estimate the recharge temperature (T r ) and excess air concentration (A e , [M/L 3 ]) for groundwater. The recharge temperature is the temperature at the water table at the recharge location, and A e is an excess component of air dissolved in groundwater due to the dissolution of air bubbles trapped when the water table rises during recharge events (Stute & Schlosser, 2000). The unfractionated air model of excess air formation (Aeschbach-Hertig et al., 2000) was assumed in this study. When T r and A e were known, estimated values of C gw for SF 6 and CFC-113 were used to calculate their atmospheric concentrations at the time of recharge, which in turn provides a mean age for baseflow through use of an assumed travel time distribution (Busenberg & Plummer, 2002). Sanford et al. (2015) assumed a Weibull distribution in which the cumulative distribution function (CDF) is defined as follows: in which the scale parameter (k) and shape parameter (n) are adjusted to match the computed atmospheric concentration to the historical record (USGS, 2017). The Weibull distribution is equivalent to the exponential mixing model when n = 1.
Values of C gw for each gas were estimated using a modified version of the Excel spreadsheet calculator developed by Sanford et al. (2015), which minimizes the misfit between measured and modeled values of C s employing the automated General Reduced Gradient solver tool (provided as supporting documentation).
Modifications included a reduced time step from 0.25 to 0.1 hr and expressing the misfit between measured and modeled values with the sum chi-square (χ 2 ), rather the sum squared error, to assess statistical significance of predicted water column concentrations. The sum χ 2 is the square of the difference in measured and modeled values divided by the square of the measurement error. Following Sanford et al. (2015), up to nine parameters can be adjusted to match stream gas concentrations: T r , A e , groundwater excess N 2 (potentially present from denitrification), τ w , a single gas residence time for N 2 and Ar, gas residence times for SF 6 and CFC-113, and C gw values for SF 6 and CFC-113.
For springs CCS and RCS, T r and A e values were derived from measured N 2 and Ar concentrations and used to compute a PF groundwater age from the measured SF 6 concentration using standard methods (USGS, 2017). The PF model (uniform sample age) was assumed instead of a more realistic age distribution (mixed-age sample) because the purpose of the spring samples was to provide a reasonable estimate of expected groundwater ages for the watershed, and the PF age likely differs little from the actual sample mean age for samples as young as these springs (e.g., <3 years difference assuming an exponential model for samples <15 years old). The recharge elevation was assumed to be the approximate mean elevation of the portion of the watershed directly upslope of the sampled spring. For spring RGS, Ar and N 2 were not collected and the mean T r and A e from the other two springs was used in the calculation of the PF age.

Integrated Hydrologic Model
Simulated energy and water budget components in Copper Creek rely on the USGS Groundwater and Surface water Flow model (GSFLOW, Markstrom et al., 2008). GSFLOW dynamically couples the USGS Precipitation-Runoff Modeling System (PRMS, Markstrom et al., 2015) and the Newton formulation of the USGS 3-D Modular Groundwater Flow model (Harbaugh, 2005;Niswonger et al., 2011) and 1-D simplification of Richard's equation for the unsaturated zone (Niswonger et al., 2006). The model describes daily surface and groundwater interactions related to ET including soil evaporation and plant transpiration, canopy interception, snow sublimation, and groundwater ET. The hydrologic model also estimates interflow, groundwater recharge, change in groundwater storage, and groundwater-surface water exchanges derived from differential gradients between groundwater and stream water elevations (Huntington & Niswonger, 2012) and deeper groundwater flow based on lithology and geologic structure.
Hydrologic model parameterization was based on the approach of Carroll et al. (2019). The finite difference grid resolution was 100 m with elevations resampled from the USGS National Elevation Dataset. Landfire (2015) was used to derive parameters of dominant cover type ( Figure S5b), summer and winter cover density, canopy interception characteristics for snow and rain, and transmission coefficients for shortwave solar radiation. Climate forcing for water years 1987-2018 uses minimum and maximum daily temperature lapse rates defined by the two proximal Snow Telemetry (SNOTEL, Figure 1) stations adjusted for aspect. Scofield SNOTEL precipitation was spatially distributed using LiDAR-derived snow depth 10.1029/2020WR028161

Water Resources Research
observations from the Airborne Snow Observatory (ASO, Painter et al., 2016) during peak snow accumulation on an average water year (4 April 2016, Figure S5a) and adjusted for simulated loses associated with sublimation, canopy interception, and early melt prior to the flight.
Maximum soil water storage was conceptualized as a field capacity threshold above which water is partitioned to either lateral interflow or allowed to percolate downward via gravity drainage into the unsaturated zone (recharge). The spatial distribution of maximum soil storage was the product of rooting depth obtained from Landfire (2015) and available water content as a function of soil type (NRCS, 1991). The Copper Creek geologic model ( Figure S5c) contains nine hydrostratigraphic units with 12 layers ranging in thickness from 8 to 120 m for a total thickness of 400 m. Fracture networks were not simulated. Instead, we used effective hydraulic conductivity that decreased exponentially with depth. The decay coefficient was established as a compromise between isotropic conditions and a fully localized flow system (Jiang et al., 2009), while the surface hydraulic conductivity was optimized to match average observed baseflow at the stream gauge located at the terminus of the basin.

Simulated Baseflow Age Distribution
Baseflow was simulated as the sum of saturated groundwater flow through alluvial and bedrock units and seasonal, shallow interflow. These two components were handled separately because GSFLOW only produces a velocity field for the groundwater system. To accommodate this limitation, GSFLOW was run fully coupled to generate interflow and all fluxes into and out of the groundwater system as well as the resulting water table elevations. Groundwater fluxes included recharge from water seepage below the soil zone plus ephemeral stream leakage, and groundwater ET losses back to the atmosphere. The groundwater model was then decoupled from GSFLOW, and particles (Mpath7; Pollock, 2016) were draped onto the water table surface and run forward in time using GSFLOW generated fluxes. Following Gusyev et al. (2014), individual flow path ages were weighted by calculated recharge but were also corrected by removing the volumeweighted age distribution of groundwater ET determined with backward particle tracking. Particle tracking established the groundwater age CDF delivered to the stream. Lastly, the volume of late August interflow was assumed to be less than 1 year old and added to groundwater age distribution to establish the baseflow age CDF.
Simulated baseflow age distributions test the hydrologic model's ability to reproduce the observed gas tracer water column concentrations at the end of August 2017. Initial water table elevations and water fluxes for water year 2017 initiated the particle tracking simulation followed by historical climate conditions for 1987 to 2018 repeated until all particles exit the watershed. A Weibull distribution (k and n, Equation 3) was adjusted to the flow model baseflow age distribution to solve for stream water SF 6 water column concentrations using Equation 1. The approach was Sanford et al. (2015) in reverse. The calculation of SF 6 water column concentrations required tracer-based estimates of A e and T r , an assumed recharge elevation of 3,400 m (the approximate mean elevation of the watershed above CC03) and the historical atmospheric concentration record (USGS, 2017). If the age distribution was unable to replicate observed stream concentrations based on the sum χ 2 , then subsurface porosity was adjusted to modify water velocities while avoiding any change to the original hydraulic gradients. If adjusting porosity was insufficient to create a statistically significant reproduction of observed gas tracers, then recalibration of the hydrologic model was considered. Recalibration focuses on alluvial and bedrock parameterization based on groundwater age sensitivities (refer to supporting information S2) and not the partitioning between interflow and recharge. The amount of interflow was largely dictated by the maximum soil storage, and it is constrained by observed streamflow during spring snowmelt. However, there is a feedback between groundwater parametrization and interflow such that if water table elevations rise into the soil zone then interflow was generated. If water table elevations are changed through reparameterization of the groundwater system, then some adjustment to the soil system may need to occur to match observed spring runoff. Using the recalibrated model, additional transient simulations explored interannual variability in baseflow age distributions across a range in historical climate conditions. These years include an extremely wet year (1995), a dry year at the end of a multiyear drought (2002), a dry year with a high-precipitation monsoon (2012), a dry year with a low-precipitation monsoon (2018), and the median water year (1998).
Steady-state particle tracking provided information on how deeper groundwater flow paths may change by shifting the historical mean groundwater condition as a function of altering precipitation, temperature, and forest presence. The historical median snow accumulation water year (1998) was used as a baseline 10.1029/2020WR028161

Water Resources Research
condition from which precipitation was incrementally adjusted by 0.4 to 1.8 of the historical daily value with no warming (+0°C). These conditions were repeated for +4°C and +10°C warming applied to both minimum and maximum daily temperatures. The +4°C condition falls in the range of expected end-of-century temperature increase in the East River (Hay et al., 2011), while +10°C forced all snow to fall as rain.
Historical years were also run to steady state to explore variable precipitation type and timing on groundwater age distributions, while the effects of the spatial distribution of precipitation and temperature were tested by assigning uniform daily values. This removed gradients associated with elevation, storm tracking, and snow redistribution. Lastly, the relative importance of forest influences on energy and water budget partitioning and baseflow age was explored by removing deciduous and conifer forests from the basin. Forests were replaced by a barren cover type and the reparameterization of transmission coefficients for shortwave radiation, interception of precipitation, and rooting depths. Maximum soil storage and its conductance were not altered. Hypothetical scenarios were not representative of possible future conditions but a means to explore thresholds in groundwater flow paths as a function of climate and land use. For each scenario, daily climate for a complete water year was repeated until quasi-steady-state conditions occurred. Quasi-steady state was complete when combined changes in saturated and unsaturated groundwater storage were less than 1% the annual water budget. Cell-specific groundwater fluxes (recharge, ephemeral stream losses, and groundwater ET) were aggregated to an annual sum and applied to the water table surface of the steady state, decoupled groundwater model for particle tracking.

Atmospheric Gas Tracers
Measured N 2 , Ar, and SF 6 concentrations along with computed T r , A e , and PF ages for the three sampled springs are shown in Tables S3 and S4. PF ages ranged from 3 to 14 years. The samples from CCS were collected in June after large snowpack accumulation and had the youngest ages (3 to 6 years). The younger ages and warm mean T r of 9.2°C (near the top of the expected range of 0-10°C; see supporting information S1) suggest that CCS either contained a substantial fraction of very young water recharged only weeks prior to sampling during the spring or the sampled water partially reequilibrated with the atmosphere. The other springs were sampled in October with PF ages 7 to 14 years. Mean T r and A e values for the spring samples were 5.5°C and 0.0011 cm 3 STP/g, respectively.
Measured concentrations of N 2 and Ar in Copper Creek were very close to computed equilibrium concentrations for the stream water (Table S2 and Figures S2a and S2b), indicating a relatively small τ g (relatively large v g ) for N 2 and Ar. Because v g is generally positively correlated with stream gradient (Gleeson et al., 2018), a relatively large v g is consistent with Copper Creek's steep gradient of~0.09 above the sample location. In contrast, measured concentrations of SF 6 and CFC-113 were below equilibrium concentrations for the stream water (Table S2 and Figures S2c and S2d). This suggests that (a) the age of groundwater discharging into the stream was sufficiently large that the difference between C gw and C e was nontrivial (on the order of years rather than weeks/months) and (b) though τ g values were relatively small for Copper Creek, they were still large enough so that the age tracer C gw signal was maintained in the stream. Therefore, despite Copper Creek's high gradient, v g for the age tracer gases were sufficiently slow to permit application of the method proposed by Sanford et al. (2015). Although SF 6 and CFC-113 were generally well mixed in the atmosphere, local atmospheric concentrations on the day of sampling could be slightly below the Northern Hemisphere 6-month average values used to compute C e (USGS, 2017), such that C s was actually equal to C e to invalidate the method. A comparison of the atmospheric concentrations required to produce the observed stream concentrations with multiple North American atmospheric monitoring sites indicates that this scenario was unlikely ( Figure S1).
The measured gas concentrations in the stream did not provide a unique solution for C gw for either SF 6 or CFC-113, allowing a broad range of possible ages for baseflow (supporting information S1). However, the range of allowable age tracer C gw values was well constrained on the high end because atmospheric concentrations of these age tracers have generally increased since their introduction in the middle twentieth century. This high-end constraint on C gw can potentially provide a reliable minimum age constraint for baseflow. Note that allowable CFC-113 C gw values were sufficiently low to indicate recharge predominantly before the mid-1990s when atmospheric concentrations started decreasing. The CFC-113 concentrations 10.1029/2020WR028161

Water Resources Research
were discarded from the analysis because, for a given value of τ w , the estimated CFC-113 C gw consistently produced a substantially older mean age (generally by >10 years) than produced by the simultaneously estimated SF 6 C gw regardless of the assumed form of the travel time distribution. This age discrepancy was likely due to either SF 6 contamination from terrigenic production in the subsurface (e.g., Friedrich et al., 2013), CFC-113 degradation occurring under low-oxygen conditions in parts of the aquifer (e.g., Bockgard et al., 2004), or both. A similar discrepancy was observed by Sanford et al. (2015) at some sites in northern Virginia, USA, and they attributed this to terrigenic SF 6 contributions based on evidence of terrigenic SF 6 in groundwater samples from local springs and wells in which concentrations reflected impossibly high atmospheric concentrations. For the Copper Creek samples, we believe that CFC-113 degradation is a more likely explanation because the spring samples in the East River displayed no clear evidence of terrigenic SF 6 contributions. Furthermore, the range of allowable CFC-113 PF and exponential mean ages for baseflow (>30 years) was substantially older than most reported groundwater ages for other mountain watersheds underlain by predominantly crystalline rock (generally <20 years; Manning, 2009;Manning et al., 2019;Plummer et al., 2001;Visser et al., 2019). As such, using the SF 6 measurements to determine a maximum C gw and minimum baseflow age was assumed a more appropriate and conservative approach because any terrigenic additions would increase the estimated C gw , whereas CFC-113 degradation would decrease the estimated C gw .
The maximum SF 6 C gw was estimated through a series of best fit model solutions in which CFC-113 was excluded. A range of SF 6 C gw values were specified, and resulting model fits were evaluated (Table 2 and Figure 2a). Model fits to observed stream water SF 6 concentrations, defined by the sum χ 2 metric, were similar and acceptable (p > 0.1) for SF 6 C gw values up to 2.2 fmol/L. For C gw > 2.3 fmol/L, model fits declined rapidly and became unacceptable. For the purposes of numerical modeling, we moved forward with the requirement that any flow model-generated baseflow age distribution must produce an SF 6 C gw concentration <2.3 fmol/L to be consistent with the stream age tracer measurements. As noted, in section 3.1.2, values of T r and A e must be defined to compute C gw for a given age distribution. Since C s and C e were essentially equal for N 2 and Ar, the stream N 2 and Ar concentrations provided no meaningful constraints on T r and A e ( Figure S3). Therefore, the mean T r and A e values derived from the spring samples were assumed in the computation of C gw for the flow model-generated age distributions (see supporting information S1 for additional discussion).

Integrated Hydrologic Model 4.2.1. Baseflow Age Calibration
Simulated streamflow and the fraction of streamflow that is interflow for a range of historically variable water years are provided in Figure S6. Interflow dominates stream water source during snow melt (April-July) and can be bolstered in the summer and fall by monsoon rains. Interflow fractional contributions to Copper Creek in late August 2017 during the gas tracer experiment are 22% and align with the same fraction of interflow in the median water year (1998) and a dry water year with near-normal monsoon rains (2012). Figure 3 shows the resulting baseflow age distribution at the sampling location CC03 using the originally published flow model. Median baseflow age is 1.5 years, and water table elevations are shallow with 64% of recharged water moving through the top model layer which is alluvium (<8 m). Figure 4a illustrates the hyperlocalized, topographically controlled and very young groundwater flow paths above the sampling location where granodiorite is the dominant bedrock. Using a Weibull distribution fit to the GSFLOW baseflow age output (k = 0.28, n = 0.44), the SF 6 groundwater concentration is calculated at 2.74 fmol/L and the sum χ 2 is 147 indicating a statistically insignificant replication of SF 6 stream concentrations (Figure 2).
Geologic parameter adjustments based on a sensitivity analysis (refer to supporting information S2 Figures S6 and S7) to promote older baseflow, with the goal of lowering estimated SF 6 groundwater concentrations, include increasing the granodiorite surface hydraulic conductivity fourfold over the original value of 1.16 × 10 −7 to 4.66 × 10 −7 m/s and lowering the ratio of horizontal to vertical hydraulic conductivity, or VKA, from 10 to 3. Accompanying changes in bedrock properties was a slight increase in soil storage to replicate observed stream discharge (Nash Sutcliffe Efficiency Log discharge = 0.78). Bedrock reparameterization lowers predicted water table elevations (median depth of groundwater flow equal to 112 m), reduces groundwater flow through the alluvium from 64% to 22% (Figure 3b), and produces a baseflow age distribution (Weibull k = 0.79, n = 0.66, Figure 3a) capable of reproducing groundwater SF 6 concentration of 10.1029/2020WR028161

Water Resources Research
2.2 fmol/L to statistically replicate SF 6 stream water concentrations (sum χ 2 = 12.9, Figure 2). The hydrologic model produces a median baseflow age at CC03 of 7.5 years, which falls within the range of October ages for perennial springs. Figure 4b shows older flow path are now generated in the upper portions of Copper Creek.

Baseflow Age Sensitivity
Using the calibrated model to assess a variety of historical water years indicates that groundwater age contributions are relatively stable 11.8 ± 0.7 years, while late summer baseflow ages range between 3 and 12 years as a function of contributing interflow (Table 3 and Figure S7). A sensitivity analysis perturbing long-term climate from its median condition by incremental changes in the historical median water year daily precipitation shows that groundwater age distributions shift progressively toward older water with STP/g = cubic centimeters at standard temperature and pressure per gram of water; underlined parameter values were specified not estimated; -= not estimated or computed; NA = not applicable.

10.1029/2020WR028161
Water Resources Research decreased precipitation ( Figure S11). Warming by +4°C decreases groundwater ages slightly for very wet conditions. Warming during a drought (e.g., 0.8P) removes relatively younger water, while warming during a more intensive drought (e.g., 0.4P) affects all flow paths and the entire distribution shifts older. Warming the basin until snow is converted to rain (+10°C) increases groundwater ages for all conditions, but increases are most dramatic under a drier climate. Removing the forest from Copper Creek increases recharge and shifts groundwater toward younger ages with decreased ages most notable for dry conditions. The median groundwater ages from all scenarios increase with decreasing precipitation and collapse about a single exponential function defined by average annual net recharge, with net recharge defined as recharge minus groundwater ET ( Figure 5). Steady-state historical simulations capture spatial and temporal variability of climate variables experienced in the basin over the last 32 years. Resulting median groundwater ages fall within the range of simplistic scenarios developed from water year 1998. Historic conditions similarly plot along the net recharge exponential function. Assuming spatially uniform climate effectively decreases net recharge for the same amount of annual precipitation. The resulting increase in median age is similar to decreasing precipitation by 20% using the spatially distributed climate inputs.

Discussion
Stream water source in the late summer is composed of both shallow epemeral flow through the soil and saturated groudnwater flow through alluvial and bedrock units. Deeper groundwater flow through unweathered bedrock is often treated as negligible in catchment studies (Kirchner, 2009). However, there is a growing awareness that deeper groundwater flow may be an important component in mountain hydrology (Gabrielli et al., 2012;Hale & McDonnell, 2016), and there is a general call to include the bottom of the groundwater system in conceptual models (Brantley et al., 2007;Brooks et al., 2015;Manning & Caine, 2007). However, a fundemental challenge in hydrology is how to define the bottom of the watershed and assess its importance (Condon et al., 2020). The challenge is amplified in steep, snow-dominated, mountain watersheds. These watersheds provide 60-90% of the freshwater world wide (Viviroli & Weingartner, 2008) and are especially vulnerable to climate change (IPCC, 2019). Data describing bedrock properties and deep subsurface flow are scarse in these systems, and little is known how projected warming or reduced snowpack will affect bedrock flow paths and associated streamflow response.

Age of Baseflow and Depth of Active Groundwater Flow
To help constrain groundwater flow paths in a mountain watershed, we present a novel approach that combines a sophisticated numerical hydrologic model and a new method for dating baseflow using dissolved N 2 , Ar, CFC-113, and SF 6 . The gas tracer experimental procedure is relatively convenient and cost effective as it takes a single day to perform and does not require expensive drilling. Drilling is often impossible in mountainous watersheds due to logistical challenges related to steep topography, deep snowpack, and (in our case) wilderness designation. The stream tracer experiment was not designed for high-gradient rivers with fast gas

10.1029/2020WR028161
Water Resources Research exchange velocities to the atmosphere, and its use in Copper Creek is, in of itself, a methodological question on its effectiveness in alpine environments. Because of fast gas exchange velocities, the approach could not provide a unique baseflow age determination in Copper Creek. However, it did provide a relatively robust upper limit on the groundwater SF 6 concentration and an indirect method for assessing the minimum subsurface travel time in the basin.
The originally published GSFLOW model generates shallow groundwater flow moving predominantly through the alluvium situated on much less conductive bedrock. For late summer conditions in 2017, when the tracer experiment was conducted, baseflow median age is estimated very young at 1.5 years. This produces a statistically insignificant replication of observed SF 6 stream water concentrations. To match stream water SF 6 concentrations, it is necessary to deepen groundwater flow through the upper portions of Copper Creek. Deeper flows increase the age of groundwater (12.2 years) and, with inclusion of interflow, produce a baseflow median age of 7.5 years. In addition to reproducing observed SF 6 stream concentrations, the median age is consistent with PF ages in perennial spring samples collected in October 2017 (10.2 ± 3.4 years). Earlier studies on tranist time modeling in mountainous catchments have tended toward younger ages of 1-5 years (see review by McGuire & McDonnell, 2006) with estimated travel time distributions reliant upon stable isotopes. Yet these tracers cannot inform transport times longer than 4 years, and their exclusive use can bias age distributions and understanding of how catchments store and transmit water (Stewart et al., 2010). Techniques for establishing longer travel times include use of tracers sensitive to older waters (e.g., 3 H, CFCs, and SF 6 ), and a growing number of recent studies suggest that baseflow mean ages in headwater streams may be older than previously thought (>10 years) (Cartwright et al., , 2020. For example, 3 H/ 3 He groundwater ages from 17 streamside piezometers along a 3.5-km reach of Handcart Gulch in the Colorado Front Range were all between 8.9 and 19.1 years . The updated Copper Creek hydrologic model follows this trend in acknowledging older groundwater contributions in mountainous waterersheds. Reconceptualization of the Copper Creek groundwater flow system to produce older water discharged to the stream was largely accomplished by increasing the hydraulic conductivity of the dominant bedrock (granodiorite). This lowered water table elevations and forced more groundwater to travel deeper through the subsurface. We did not explore variations in the hydraulic conductivity depth decay coefficient even though the development of local versus regional groundwater flow patterns is sensitive to this term. Instead, we maintained an average decay coefficient (Jiang et al., 2009) and adjusted the surface conductivity value to limit the solution space. One can speculate that high-relief watersheds like Copper Creek contain older than expected stream flow related to greater permeability and porosity at greater depths due to more intense recent uplift and tectonism associated with mountain building (Jasechko et al., 2016). Final surface pemeability for the granodiorite (4.6 × 10 −7 m/s) falls within the typical range of 10 −8 to 10 −6 m/s for zones of active flow in fractured crystaline bedrock in mountain settings (Katsura et al., 2009;Welch & Allen, 2014). The recalibrated model lowers water table depths such that only 22% of recharged water moves through the alluvium and produces a median depth of groundwaterflow equal to 112 m, with 30% of recharged water reaching depths greater than 200 m. This is somewhat deeper than the maximum depth of active groundwater circulation in crystaline rocks of 100-200 m based on a limited number of prior studies (Markovich et al., 2019;Welch & Allen, 2014). However, Frisbee et al. (2017) estimated active circulation upward to 1,000 m in the crystalline metamorphic rocks of the Sangre de Cristo Mountains, New Mexico. Active circulation depths are a function of tectonic history, lithology, structure, and climate (weathering), and characteristic active flow depths for different bedrock geologic conditions in mountain settings remains largely unknown (Markovich et al., 2019).
We acknowledge that subsurface routing is inherently nonunique in groundwater modeling, and the use of a single hydraulic conductivity value for a given depth for each hydrostratigraphic unit is simplistic and likely overestimates deep flow paths along the highest elevations in Copper Creek in response to low water table elevations. Fractured rock effective porosity is also highly uncertain and complicated by diffusive exchange between mobile water in the fractures and immobile water in the matrix. Effective porosity for age estimates is likely between matrix and fracture porosity. Fractored porosity in unweathered crystaline rocks <1% is common (Tullborg & Larson, 2006), and based on an 80-m borehole drill core within contact-metamorphosed interbedded shale and sandstone in a proximal basin to the East River, we have observed a matrix porosity of 1-10%, centering around 3%. For comparison, modeled effective porosity 10.1029/2020WR028161

Water Resources Research of the granodiorite in Copper
Creek is equal to 2.5% (depth ≤18 m) and 1.25% (depth >18 m) and deemed appropriate. In light of uncertainties in hydraulic conductivity and porosity, we emphasize that our model is constrained using the minimum tracer-based age for baseflow, and solutions containing older baseflow ages associated with deeper groundwater flow and storage remain entirely plausible for Copper Creek. The current rendition of the Copper Creek hydrologic model suggests that 10% of groundwater is >50 years. However, these ages are beyond the ability of SF 6 to estimate. Future work would benefit by incorporating tracers such as 39 Ar and 14 C to better constrain older groundwater (IAEA, 2013).

Controls on Baseflow Age
The physical reality of mountainous watersheds is that heterogenities in the system in combination with spatiovariable water inputs create highly diverse flow paths through the basin. As a result, the age distribution of stream water is not time invariant but responds dynamically as the nature of overland flow and hydrologic connectivity change and flow paths and velocities vary with water storage (Botter et al., 2011;Engdahl et al., 2016;Van Der Velde et al., 2012). Transient analysis of individual water years spans the full range of historical climate conditions and associated variability in predicted late summer baseflow median age (3-12 years). The range in baseflow age is due to simulated interflow which is tightly coupled to interannual climate variability with deep and persistent snowpack and/or a wet summer monsoon driving younger baseflow. Our results agree with other studies showing the mobilization and mixing of younger shallow water with older water following wet periods (Howcroft et al., 2018;Hrachowitz et al., 2013). Specifically, Copper Creek baseflow ages align with recent work in Providence Creek, a granodiorite headwater basin in the southern Sierra Nevada mountains of California, with stream ages ranging from 3.3 to 10.3 years with wetter years releasing younger water, as determined using ranked storage functions constrained by radioactive isotopes (Visser et al., 2019). Our results are also consistent with spring mean ages in Sagehen, California, with ages varying from 3-7 years and controlled by the magnitude of the new fraction (<1 year) that correlated positively to annual maximum snow water equivalent (Manning et al., 2012). Neither of these two studies included tracers capable of dating premodern water (recharged before 1950), meaning that, as with our study, reported ages and interannual age variations could be greater. However, Manning et al. (2012) reported that most samples contained little premodern water based on multiyear comparisons between different tracers.
In contrast to variable baseflow age controlled by interflow contributions, Copper Creek simulated groundwater flow paths and associated ages showed low variance about the mean (11.8 years) deviating by only 0.7 years despite drastically different snow accumulation and stream dynamics for the years assessed. Interannual stability in groundwater flow represents a basin in equilibrium with its historical climate and watershed structure (i.e., topography, land use, and geology). Topographic controls have also long been recognized as controlling local, intermediate, and regional groundwater flow systems (Toth, 1963;Winter et al., 2001) and have been identified as the single most important control on catchment-scale transport (McGuire et al., 2005). Additionally, the influence of precipitation magnitude and type , vegetation (Rukundo & Doğan, 2019), bedrock lithology (Onda et al., 2006), and subsurface connectivity (Tetzlaff et al., 2009) have been found to influence recharge and subsequent groundwater flow to streams. The spatial distribution of water inputs may also matter. Recharge in mountain watersheds preferentially occurs in the upper subalpine as a function of large and persistent snowpack (Broxton et al., 2015;Musselman et al., 2008). Badger et al. (2019) using a distributed hydrologic model and remotely sensed estimates of snowpack found more uniform snow distributions melted out 5 weeks earlier and produced up to 9.5% less stream flow than allowing for spatial variability in snowpack. Likewise, we show that assigning a uniform distribution of climate variables effectively reduced recharge to produce older groundwater ages more closely aligned to a basin with 20% less precipitation. In contrast, the sensitivity of median groundwater age to different timings of climate inputs, based on historical variability, is not large. Instead, computed median ages track those determined by the timing of 1998 inputs and scaling the precipitation volume up/down. The exception was during the exceptionally dry years of 2012 and 2018. These years had temperature anomalies of 1.3°C and 2.1°C, respectively, but median ages are larger than the precipitation-scaled 1998 given +4°C. This suggests that under water-limited conditions, the timing of climate variables becomes an important control on groundwater flow paths and residence time.

Water Resources Research
Forest removal has the potential to increase recharge (and streamflow) by decreasing interception and tree water use, but the effect is muted by corresponding increases in soil evaporation and snowpack sublimation (Biederman et al., 2014;Pugh & Small, 2012). Literature suggests at least a 20% reduction in tree stand is needed to detect a change in water export (Brown et al., 2005;Stednick, 1996), and as such, we removed all the forest as an extreme case. Results indicate that total tree removal changes the timing and quantities of Copper Creek's water balance ( Figure S10) but for wetter conditions does not drastically alter groundwater recharge values nor the median groundwater age. Only when the basin is simulated as very dry does increased recharge reduce subsurface residence times with the effect more prominent with warming.
The influence of basin structure on groundwater flow paths is highlighted in Figure 5b, in which travel times collapse toward a single exponential function across a range in simulated recharge conditions. Given the relative insensitivity of vegetation changes to shift the shape and position of the exponential function, it is hypothesized that basin groundwater storage is responsible for travel time sensitivity to recharge. Storage is determined largely by permeability (i.e., lithology) and the ratio of circulation depth to topographic length scale. For example, three-dimensional numerical models have provided guiding principles on recharge controlled versus topographically controlled groundwater systems that in turn drive the length scale of flow paths and associated ages in mountain systems (Gleeson & Manning, 2008;Markovich et al., 2019). To illustrate, Figure 6 depicts the conceptual model of flow path end-members dictated by the ratio of net recharge to hydraulic conductivity (R/K). High R/K produces higher water table elevations and increases the influence of topographically controlled, local flow paths on streamflow generation. If water table elevations are high enough to support perennial streams, then the system is likely permeability limited such that increases in recharge have little effect on changing groundwater flow paths and the median age of groundwater is stable. Conversely, watersheds with lower R/K have deeper water table elevations controlled mainly by the recharge rate. Under these conditions, ephemeral streams emerge, flow paths are less constrained by local topography, and groundwater flow conditions become increasingly sensitive to changes in recharge.
The original Copper Creek model established a very high R/K in the upper portions of the basin to produce a permeability-limited groundwater flow system with shallow water table elevations in which baseflow ages are buffered from possible decreases in recharge. With model reevaluation, the R/K ratio in Copper Creek is lowered. The water table is still topographically controlled, but the newly calibrated model suggests that Copper Creek is much closer to the recharge controlled condition in which groundwater flow paths deepen and groundwater ages increase with relatively small reductions in precipitation. The larger the deviation from the historical median precipitation toward a drier state, the greater the sensitivity of groundwater age is to either recharge decreases (warming) or increases (forest removal). Ameli et al. (2018), using a combination of tritium tracer and a semianalytical flow and transport model in a New Zealand headwater basin, also found groundwater flow paths that lengthen and water ages contributing to streams increase indirectly to recharge rate. This rain-dominated, New Zealand, catchment underlain by early Pleistocene conglomerate reflects a recharged controlled basin that was not readily apparent in Copper Creek without use of the gas tracer observations. Our results suggest that characterization of the bedrock groundwater flow system is fundamentally important to establishing R/K and determining where a basin resides on the topographic and recharge controlled continuum in order to better quantify how groundwater flow may respond to changing climate and land use. 10.1029/2020WR028161

Conclusions
There is growing awareness that deeper parts of bedrock aquifers in steep, mountain watersheds could be an important part of a watershed's hydrologic system by storing and transmitting larger amounts of water and having a greater influence on stream source than previously indicated. However, deeper parts of mountain aquifers are very difficult to characterize and information on hydraulic conductivity, porosity, and flow rates at depth remain scarce and the true importance of deeper groundwater flow across different geologic settings remains largely unknown. This knowledge gap is a major impediment to our ability to predict how surface water flow may respond to changes in precipitation, temperature, or land use. Here we present a proof-of-concept for a new and efficient approach for characterizing deeper groundwater flow a in mountain watershed using stream water concentrations of N 2 , Ar, CFC-113, and SF 6 . While interflow produces considerable variability in baseflow age as a function of interannual variability in climate, the deeper groundwater flow comontent of baseflow was found more stable (age~12 years). Using gas tracer observations in streamflow, we provide solid evidence of nontrivial groundwater flow to streams that occurs at considerable depth in a mountain watershed of the Upper Colorado River underlain by fractured crystalline rock. The implication for the conceptual model of groundwater flow in this mountain watershed is substantial. Using age tracers to inform an integrated hydrologic model, we move Copper Creek from a topographiclally controlled basin with groundwater flow paths (age) insensitive to changes in precipitation (and recharge) toward a boarderline recharge controlled groundwater basin in which groundwater flow paths are sensitive to increased aridity. This study clarifies, through a case example, the importance of characterizing the bedrock groundwater system in mountain watersheds to determine where it resides on the R/K spectrum to better predict how groundwater and surface water interactions in steep, mountain basins may respond to future changes in climate or land use.