Transport and Water Age Dynamics in Soils: A Comparative Study of Spatially Integrated and Spatially Explicit Models

We address transport by transit times (i.e., the age of water parcels leaving a storage as discharge, deep loss, or evapotranspiration) in subvertical soil systems, key to our understanding of water quality in catchments and streams. While the use of field and lysimeter observations to constrain and validate modeling approaches is generally accepted, different views exist on the relative ranges of applicability of spatially integrated or spatially explicit approaches. This study specifically illustrates how one class of spatially integrated models of transport, based on StorAge Selection (SAS) functions, fares with respect to spatially explicit hydrologic models in a case where detailed tracer data from experimental lysimeters exist and for which both approaches are viable. Data from two lysimeters experiments that differ in atmospheric conditions, size of the installation, tracer type, soil texture, and vegetation are used to contrast results from two transport models: tran-SAS (space implicit) and HYDRUS-1D (space explicit). Results suggest that although the two lysimeters are characterized by different transit time distributions, their underlying transport mechanisms are similar and represented well by both models. The comparison between the two models results in robust estimates of the transport timescales and clearly shows that percolation fluxes at the bottom of a lysimeter tend to drain the relatively old components of the soil storage. We conclude that the convergence of the approaches in a geomorphologically simple and data-rich problem supports extensive uses of the spatially integrated approach in cases where the scale of the problem, or subgrid parameterization needs, may limit the applicability of detailed modeling approaches.


Introduction
The fate of solutes in the storage and in the release(s) within a catchment has long been studied in the framework of the dynamics of water ages in hydrologic control volumes. This is primarily due to the fact that the age of a water parcel renders information on how long it has been in contact with the immobile water, on its flow path, and on the hydrologic processes it has undergone (e.g., Botter, 2012;Destouni & Graham, 1995;Kirchner et al., 2000;McDonnell et al., 2010;Rinaldo & Marani, 1987;Rinaldo et al., 2006;Sprenger et al., 2019;Stelzer & Likens, 2006;Wachniew et al., 2016). Transit time distributions (TTDs), defined as the distribution of the age of water parcels at their exit through drainage/streamflow or evapotranspiration, are widely accepted as the tool to quantify storage or release of solutes (Botter et al., 2005;Kirchner et al., 2000;McGuire & McDonnell, 2006;Rinaldo et al., 2015;Sprenger et al., 2019). As factors like urbanization and climate change urge us to be proactive in our assessment of the possible impacts on water quality (Danesh-Yazdi et al., 2016;Sprenger et al., 2018), the critical zone represents a starting point to estimate the long-term impact of such changes (Cullen et al., 1992;Grant & Dietrich, 2017;Wilson, 2018).
Experiments on large weighing lysimeters (e.g., Gebler et al., 2015;Groh et al., 2018; are a powerful tool to investigate hydrologic transport in the critical zone. Although lysimeters may have drawbacks due to unrealistic profiles of water pressure head when used to estimate groundwater recharge (see Healy, 2010), three advantages are particularly relevant: Boundary conditions are known (Groh et al., 2016); changes in storage are measured, and thus, evapotranspiration rates can be calculated at high resolution (Hannes et al., 2015;Peters et al., 2017;Schrader et al., 2013); the bottom drainage integrates all Recently, there has been a growing interest in understanding the relations between these two model classes, recognizing that simple transport models can effectively and efficiently subsume the complexity of a hydrologic system and that the description of the main physical processes can aid parameterization of conceptual models. Model intercomparisons have been carried out to determine how different model types are able to match a measured tracer signal Stumpp, Nützmann, et al., 2009), assess the effect of spatial heterogeneity on the shape of TTDs (Ali et al., 2014;Danesh-Yazdi et al., 2018;, and compute lumped TTDs and SAS functions starting from simplified 1-D (Benettin et al., 2013;Kirchner et al., 2001) or detailed 2-D (Pangle et al., 2017) and 3-D (Jing et al., 2019;Kaandorp et al., 2018;Maxwell et al., 2018;Remondi et al., 2018;Yang et al., 2018) process-based models.
However, little research has been carried out in soils to assess which features of the physical system control the parameters of spatially integrated formulations. This is particularly important in the context of large-scale models (e.g., Bierkens et al., 2014) whose spatial resolution is often the size of entire catchments, and simplified transport models are needed as subgrid components (Heße et al., 2017;Jing et al., 2019).
Model intercomparisons are also useful to test the robustness and consistency of TTD estimates. Several problems related to water age in soils have recently emerged in the hydrologic debate. A commentary by Berghuijs and Allen (2019) suggested that waters flowing out of a system, such as streamflow draining a catchment or percolation down a soil column, tend to be younger than the waters stored in those same systems. While several modeling studies conducted in relatively humid climates (e.g., Benettin et al., 2017;van der Velde et al., 2012;Wilusz et al., 2017) confirm this general tendency in the case of catchment's streamflow, there is no research supporting or rejecting this assumption for soil percolation. As estimating the age of soil water is key to inferring water age for the hydrologic compartments that are connected to the soil (Sprenger et al., 2019), the answer to this problem should not be influenced by the choice of a spatially implicit or spatially explicit model.
Here, we use data from two contrasting lysimeter experiments to calibrate a process-based model (HYDRUS-1D) and a conceptual model based on SAS functions (tran-SAS). We evaluate results in terms of tracer transport and TTDs in the drainage flux. We do not aim to tell which model is "better." Rather, we look at similarities and differences among the results to sort out if and how the conceptual representations of transport processes in soils by the SAS-based model can be directly associated with driving physical processes. Analysis of the experimental and model results allows us to (i) test whether fluxes at the bottom of a soil column tend to drain the younger or older soil water, (ii) illustrate how a simple lumped model based on SAS functions can capture the interplay between matrix and preferential flows occurring within a lysimeter, and (iii) discuss the potential of this type of model comparison and its implications for upscaling local transport processes to larger scales.

Lysimeter Experimental Data
We use data from two lysimeter experiments that differ in most features like size, location, soil type, experiment duration, tracer type, and application design.
The first experiment took place between 1998 and 2003 at the Helmholtz Zentrum München, Neuherberg, Germany (from hereon referred to as NH) and has been described by ), Stumpp, Nützmann, et al. (2009), and Stumpp and Maloszewski (2010. In brief, the NH lysimeter consists of an undisturbed sandy soil of Humic Cambisol with 1-m 2 surface and 2-m depth. No suction is applied at the lower boundary and water drains when the bottom becomes saturated (seepage face boundary condition). Weight variations are measured through high-precision load cells. The site is at an elevation of 484 m above sea level, and climate is humid with 891 (±102) mm of precipitation and 431 (±57) mm of evapotranspiration during the study period. The soil is mostly sandy (>87%) with some silt (<13%) mostly at lower horizons and some gravel (2%) at the top horizon. The lysimeter is situated in a 1-ha agricultural field and was planted with crops that rotated through the years. The lysimeter has a pore space volume of about 0.78 m 3 (considering a porosity of 39%), and considering a hypothetical saturation in the range 0.2-0.8, the water storage would range from 156 to 624 mm. Weekly bulk samples of precipitation and bottom drainage were collected and analyzed for water stable isotopes 18 O and 2 H (Figure 1). 18 O in precipitation follows a yearly cycle ranging from −20%0 to 0%0. Drainage samples have a less evident yearly cycle and range from −11%0 to −7%0. As reported by , water samples do not show evidence of evaporative fractionation and suggest that water draining to the bottom of the lysimeter is subject to negligible evaporation. Additional experimental details are reported in the supporting information (section S1).
The second experiment took place between March and August 2016 at the EPFL campus, Lausanne, Switzerland (from hereon referred to as EPFL). The EPFL lysimeter shares the same infrastructure as the one described by , but it was modified by means of a lifted funnel (Figure 1) to accommodate a smaller soil volume and allow for shorter transit times through the soil block. The lysimeter was freely draining; its surface was ∼ 1.1 m 2 , and in the modified configuration it had a depth of roughly 1-m. The soil was a repacked and uniform sandy clay loam. On top of the soil, a natural grass cover developed 1 year prior to the experiment start. The lysimeter was subjected to natural climatic conditions, but additional irrigation was provided to ensure wet experimental conditions, especially during the end of the experiment when evapotranspiration fluxes dominated the water balance. Given an estimated porosity of 0.38, the total pore volume was roughly 0.42 m 3 and the water storage would range from 84 to 336 mm for saturation in the range 0.2-0.8. The tracer experiment consisted in the simultaneous application of five fluorobenzoic tracers through a single 1-hr irrigation of 20 L. The five tracers were previously tested in a different experiment by  and were shown to undergo some degradation under unsaturated conditions. For this study, we focus on Tracer 4 (2,6-DFBA) because its mass recovery was almost complete (∼95%), indicating that degradation and plant uptake were minimal. The tracer breakthrough curve of Tracer 4 is illustrated in Figure 1. The tracer started to peak after 15 days from the injection, although it was already detected in the drainage after 3 days. The peak of the curve remained roughly constant for about 1 month and then slowly decreased to almost zero over about 2 months. All data from the EPFL experiment are unpublished and they can be found online (http://doi.org/10.5281/zenodo.3457960).

Transport Models
We investigate transport processes through the two lysimeters by means of a conceptual (lumped) model based on SAS functions and a process-based (1-D) model based on the Richards equation.
The selected lumped model is based on SAS functions . As such, it requires specification of the SAS function for each outflow. We used the software tran-SAS v0.1 (Benettin & Bertuzzo, 2018) and defined the SAS functions of drainage and ET fluxes as power functions with a single parameter k (analytical formula in section S2). The exponent k indicates whether an outflow is dominantly made of young (k < 1) or old (k > 1) water parcels available in the water storage. For the NH lysimeter, the simulation time step was set to 24 hr; the model was spun-up using available hydrologic and isotopic data over 590 days before the start of the simulation, and 18 O was treated as a conservative tracer because no isotope fractionation was observed in the bottom drainage . For the EPFL lysimeter, the simulation time step was 8 hr (which had improved numerical accuracy and yet low computational times due to the short experiment duration) and no spin-up was used because no tracer was present in storage prior to the tracer application. To take into account the selective uptake of tracer by plants (i.e., the capability to uptake the tracer at a lower concentration than in the soil water, which results in an increased concentration of the tracer left behind), we modeled evapoconcentration as in Benettin et al. (2017) by means of an evapoconcentration factor ET . In both simulations, the initial storage was left as a calibration parameter and the prior distribution was set equal to the range of plausible storage values given by the pore space volume (see section 2.1). In both case studies, the model was calibrated using the DREAM ZS algorithm (ter Braak & Vrugt, 2008;Vrugt et al., 2009) against the measured tracer data at the drainage. Additional details on the calibration procedure are included in section S2.
We selected HYDRUS-1D (Šimůnek et al., 2016) as spatially explicit (1-D) model to solve transport at both lysimeters. For comparison with results by , we used version 4.16.0110. In HYDRUS-1D, the fluid flow is based on the solution of the Richards equation, while solute transport is based on the advection-dispersion equation. Plant transpiration and solute uptake are considered as a local sink term in both equations. For the NH experiment, we took the same setup as . The 2-m-deep soil column was discretized into 201 equidistant nodes (node distance 1 cm). Although the soil is organized into three soil horizons, the measured hydraulic parameters are similar and thus a single set of hydraulic parameters (K S , s , r , , n) and longitudinal dispersivity D L was used for the entire soil column. The parameters of the Genuchten-Mualam hydraulic model (K S , s , , n) and the longitudinal dispersivity D L were found by running the inverse modeling routine embedded in HYDRUS-1D using weekly drainage and isotopic ratios measured at the bottom of the lysimeter and daily volumetric soil water content (total 2,163 data points).
HYDRUS-1D does not allow any amount of tracer to move with water in the evaporation flux, causing a potentially large increase in the tracer that is left behind. To avoid this problem and to simplify the computation of TTDs in HYDRUS-1D (see section 2.3), we set the evaporation flux to zero and allocated the entire 10.1029/2019WR025539 (measured) ET flux to plant transpiration. This assumption is consistent with the lack of evaporative fractionation in the drainage samples, and its effects are discussed with the results. At each time step, root water uptake was estimated from the bulk ET flux multiplied by the Feddes linear stress response function (Feddes et al., 1978). This bulk root water uptake was distributed along the root length according to a normalized root density function which declines linearly from 1 at the surface to 0 at the rooting depth. An annual cycle of root growth from 10 cm to a maximum of 50 cm at the end of each year was considered. For water transport, the upper boundary condition is given by the time variables weekly precipitation fluxes and the lower boundary condition is seepage face of zero pressure head, which implies that drainage occurs only when the bottom of the lysimeter is fully saturated. For solute transport, the upper boundary condition is the weekly measured 18 O flux, while the lower boundary condition is set as zero concentration gradient (i.e., the liquid phase concentration of the drainage water is the same as the lowest element of the soil column). Time steps in HYDRUS-1D were not fixed and ranged from 0.001 to 1 day. Further details are in section S3. The model was spun-up using available hydrologic and isotopic data over 590 days prior to the beginning of the experiment. As the system needs to be in equilibrium at the start of the simulation, the initial condition for the flow equation is a linear pressure decline from a value of −200 cm at the top to a value of 0 at the bottom.
For the EPFL experiment, the 1-m-deep soil block was discretized into 200 grid points of 0.5-cm width. As for this experiment the soil was mechanically mixed before filling the lysimeter; the properties of the soil block are assumed to be homogeneous, and they are described through the van Genuchten-Mualem model. The parameters of the soil hydraulic model (K S , s , r , , n) and the longitudinal dispersivity D L were estimated using the embedded inverse modeling tool against measured water flux and tracer concentration at the bottom of the lysimeter (total of 294 measurements). The inverse modeling was rerun with different initial values to verify the convergence of parameters to a global minimum in the objective function. As the lysimeter surface was fully covered with grass, we expect evaporation to be a minor component of the total ET flux. For this reason and for consistency with computations for the NH case study, we set evaporation equal to zero and assigned the total (measured) ET flux to transpiration. The impact of this assumption is discussed with the results. The parameters of the Feddes linear stress response function for grass are taken from HYDRUS-1D library. In this case, we assumed a constant root length of 15 cm and no root growth, but we also tested the sensitivity of model results to root lengths of 7 and 30 cm (see section S4). Similar to the NH model setup, we used measured fluxes (daily in this case) of precipitation and irrigation as upper boundary condition and seepage face of zero pressure head at the lower boundary. The FBA tracer was considered as nonreactive (absence of chemical or physical nonequilibrium), but maximum plant uptake concentration was limited to 2 mg/L. This parameter controls the mass recovery at the bottom of the lysimeter, and it can be estimated by matching the measured total exported mass. The maximum plant uptake concentration is a parameter similar to ET in tran-SAS which accounts for selective uptake of plants. As for the NH case study, the prescribed boundary condition at the top and bottom is the Cauchy (liquid phase concentration of the infiltrating water) and zero concentration gradient, respectively. The time step in the EPFL case study varied between 10 −5 and 1 days. The initial condition for the flow equation is linear pressure decline from a value of −100 cm at the top to a value of 0 at the bottom. For solute transport it is zero concentration throughout the domain as the injected tracer is artificial and absent to the medium prior to the experiment. Further details are in section S3.

Computing TTDs From Tracer Breakthrough Curves
Transit (or travel) time distributions can be seen in a "forward" (fTTD) or "backward" (bTTD) mode: In the former, one considers when the different water molecules that enter a catchment through precipitation will reach an output like streamflow or evapotranspiration; in the latter, one considers when the different water parcels that form an output had entered the catchment through previous precipitations. Similarly, one can define (e.g., Benettin et al., 2015) forward or backward residence time distributions RTDs (i.e., the age distributions of the water stored within the system), by focusing either on when the current water storage will be released to an output in the future (fRTD) or on when water parcels forming the current water storage had entered through precipitation in the past (bRTD). Forward and backward TTDs are related by continuity and only coincide at steady state. For a system with one input I(t) and multiple outputs O k (t), fTTD and bTTD are related (Botter et al., 2011;Niemi, 1977) by (1) where O k indicates the fraction of input I that eventually goes to output O k and ∑ k O k = 1. Equation (1) states that the amount of input entering at a certain time t i and leaving as output O k at a later time t e is equal to the amount of output that has age t e − t i at time t e . While the relationship between forward and backward residence time distribution is complex and it is not discussed here, the relationship between bRTDs and fTTDs is relevant to this study. bRTDs can be obtained at any time t by keeping track of the amount of precipitation I(t i ) that has not been yet removed by the outflows O k . This translates into Models based on the use of SAS function use catchment-scale water age distributions as part of the simulation so they naturally output time-variant TTDs and RTDs. Spatially explicit models, instead, often need to be coupled to particle tracking algorithms (e.g., Maxwell et al., 2018;Yang et al., 2018). In case the focus is the catchment scale, one can use a spatially explicit transport model to compute virtual tracer breakthroughs and use these to compute all the age distributions. This approach was already used by Sprenger, Seeger, et al. (2016), Pangle et al. (2017), and Sprenger et al. (2018), and it is based on the following steps: (i) run the model to obtain virtual tracer's mass breakthrough curves . m k (t i , t e ) in one or more outputs k for each precipitation event, (ii) normalize by the mass of applied virtual tracer to obtain normalized mass breakthrough curves . M k , and (iii) compute the fraction O k of input that is recovered in each output after the breakthrough is over (i.e., compute the integral of . M k ). After those steps are taken, fTTDs are readily obtained as fTTD(t Then, bTTDs and bRTDs are computed through equations (1) and (2), respectively. Finally, SAS functions are obtained as the ratio between bTTD and bRTD as originally defined by Botter et al. (2011) and they can be readily redefined over a transformed age domain as done by van der Velde et al. (2012) and Harman (2015). Overall, this approach is not efficient as the model has to be run as many times as the number of precipitation events that are to be tracked, but it is extremely simple and can work for any transport model regardless of its formulation and computing language.
The use of virtual tracer breakthroughs to compute water age distributions requires that the virtual solute be treated as completely passive by the model. This means that the virtual tracer concentration should not increase or decrease as it moves through the medium (nor if it is uptaken by plants); that is, it should be removed from the system in the same portion as the water parcel that it is tracking. In traditional HYDRUS-1D versions, root uptake can be set as passive but evaporation can only remove water while leaving the solute behind. To allow fully conservative behavior for the virtual solute, we set evaporation (which is expected to be low at both sites) equal to zero and considered a transpiration flux equal to the total evapotranspiration flux.
For both tracer experiments, we computed forward age distributions for drainage (fTTD Q ) and evapotranspiration (fTTD ET ) with both models for all precipitation events larger than 1 mm/day. Computations were run at daily resolution. For each event we also computed the partitioning coefficients Q that quantify for each precipitation the fraction that is released as drainage (the fraction released as ET is obtained as ET = 1− Q ).

Models Performance 3.1.1. NH Lysimeter
Measured and simulated 18 O in the bottom drainage for the NH experiment are shown in Figure 2. Both models simulate efficiently the general pattern of the measured signal and reproduce the correct timing and magnitude of the seasonal oscillations. The main differences between the two models occur at the beginning and the end of the simulation period and also coincide with the largest deviations from the measurements. The inaccuracy during the first year is likely related to the uncertainty in estimating the initial tracer distribution in the soil column and the legacy of this uncertainty during the early part of the simulation. The residual error of the models (defined as measurement-simulation) is shown in Figure 2b. Both residual   Figure 3, SAS parameters k indicate that the drainage fluxes have high affinity for releasing the older stored water (k Q > 1) while transpiration mainly removes the younger stored water (k ET < 1). The total initial storage S 0 is estimated around 200 mm, while the evapotranspiration factor ET ranges between 0 and 0.4. Orange crosses indicate best parameter values. time series are normally distributed with almost identical standard deviations (0.36%0 for HYDRUS-1D and 0.37%0 for tran-SAS), but the performance of HYDRUS-1D is less biased as the mean residual is 0.01%0 versus a mean residual of −0.18%0 by tran-SAS. Overall, Nash-Sutcliffe (NS) coefficient is 0.85 for HYDRUS-1D and 0.81 for tran-SAS.
For HYDRUS-1D, parameters estimated by  are saturated and residual water contents ( s and r ) equal to 0.39 and 0 cm 3 /cm 3 , soil retention parameters = 0.078 cm −1 and n = 1.49, saturated hydraulic conductivity K S = 119 cm/day, and longitudinal dispersivity D L = 20.8 cm. This combination of parameters results in an initial water storage of 333 mm. For tran-SAS, the posterior parameter distributions are shown in Figure 3. All three parameters are well constrained and show that the drainage fluxes have high affinity for releasing the older stored water (k Q > 1) while transpiration mainly removes the younger stored water (k ET < 1). The best parameter set has k Q = 4.0 and k ET = 0.68. The initial water storage is estimated between 490 and 550 mm, and the best parameter set has S 0 = 518 mm. This value is higher than the one estimated by HYDRUS-1D and results in a higher estimate of the average lysimeter storage over the simulation period (502 mm for tran-SAS and 415 mm for HYDRUS-1D).

EPFL Lysimeter
Results of the EPFL tracer breakthrough experiment are shown in Figure 4. Both models reproduce the timing and shape of the measured breakthrough curve. The residuals time series are characterized by oscillations of small magnitude (less than 20%) that are due to a slight difference in the shape of the breakthrough peak (rounder in the models and flatter in the data). The standard deviations of the residual time series are similar (1.3 mg/L for HYDRUS-1D and 1.4 mg/L for tran-SAS), but HYDRUS-1D has a slightly higher bias (mean residual is −0.29 mg/L versus a mean residual of 0.08 mg/L by tran-SAS). The Nash-Sutcliffe coefficient for both models is 0.96. The normalized tracer recovery (inset Figure 4) shows that total recovery from measurements, tran-SAS, and HYDRUS-1D is 95%, 96% and 97%, respectively.
The estimated parameters in HYDRUS-1D are saturated and residual water contents ( s and r ) equal to 0.28 and 0.05 cm 3 /cm 3 , soil retention parameters = 0.03 cm −1 and n = 1.69, saturated hydraulic conductivity K S = 31.2 cm/day, and longitudinal dispersivity D L = 6.8 cm. The resulting initial total storage is 199 mm. The posterior distributions of calibration parameters for the SAS model are shown in Figure 5. Optimal values for the SAS function parameters are k Q = 3.8 and k ET = 0.2. This indicates that, similar to the NH lysimeter, the drainage flux appears mainly composed of older stored water, while evapotranspiration is mainly sourced by the younger stored water. The initial storage is constrained between 196 and 218 mm with an optimal value equal to 199 mm, which is identical to the HYDRUS-1D estimate. The parameter ET for evapoconcentration ranges between 0 and 0.4 with optimal value equal to 0.25. The optimal value is larger than 0 and reflects that the total mass recovery is smaller than 100%.

Water Age Dynamics
Forward TTDs for drainage and transpiration were computed for each experiment using both models, following the approach described in section 2.3. As fTTDs need to be truncated on the last day of available measurements, they cannot be reliably computed for precipitation (or irrigation) events occurring toward the end of the data series. We only retained TTDs whose truncation did not exceed 5% of the distribution. This corresponds to the first 70 days for the EPFL lysimeter and to the first 1,080 days for the NH lysimeter.  ). Yet the seasonal variability of MTT ET is similar for the two models and suggests that winter precipitation takes a longer time to leave as transpiration (see the three peaks corresponding to January-February in the time series) compared to summer precipitation. For the smaller EPFL lysimeter (Figure 7), MTT timescales are much shorter. Again, the two models predict almost identical MTT Q that are now in the range 40-60 days. In this case, MTT ET are only slightly higher for tran-SAS (average value of 14 days) than for HYDRUS-1D (average value of 6 days).

Figure 9.
Time series with the partitioning of precipitation between drainage ( Q ) and evapotranspiration ( ET ), estimated for the two lysimeters using both models. Estimates are rather different for the NH experiment (left panels), while they are similar for the EPFL experiment (right panels).

10.1029/2019WR025539
An example of forward TTDs for drainage and evapotranspiration for individual precipitation events is shown in Figure 8, where both TTD Q (blue curves) and TTD ET (green curves) are illustrated over the same travel time axis. The shape of the distributions is rather irregular because it is controlled by the hydrologic forcing (drainage and evapotranspiration) after the precipitation event. Small differences in the shapes of TTD Q (more visible at NH) are due to the fact that the SAS model makes use of measured data while HYDRUS-1D estimates the drainage flux. For both the precipitation events, water leaves relatively quickly as ET and it takes longer to cross the soil column and exit through the bottom drainage. Overall, TTD Q and TTD ET occupy different portions of the age domain with rather limited overlapping. The picture provided by the two models is broadly consistent, although, as shown previously, the two models give different estimates of the TTD ET dynamics at NH, with shorter transit times to ET estimated by HYDRUS-1D.
Age results can be used to investigate the partitioning of precipitation into "blue" fluxes (i.e., the bottom drainage in this case) and "green" fluxes (i.e., evapotranspiration). Figure 9 illustrates the time series of partitioning coefficients estimated from the two models. At both lysimeters, the coefficients fluctuate around the value of 0.5, which is basically the long-term partitioning from the long-term hydrologic fluxes. Seasonal variability in the partitioning is visible at both lysimeters: larger fractions of winter precipitations are eventually released as drainage ( Q > 0.5), while larger fractions of summer precipitation usually leave through evapotranspiration ( Q < 0.5). Note that this does not mean that precipitation is released during the same season, as TTDs were shown to span months to years (Figure 8). At the NH site, this pattern results in seasonal fluctuations of Q , with larger variability predicted by HYDRUS-1D due to the faster ET age dynamics and smaller total storage. At the EPFL lysimeter, a stable trend from Q = 0.85 in March to Q = 0.25 in May reasonably follows the activation of plant transpiration.

Differences and Similarities Among Model Results
For both lysimeters, the calibrated models give similar results in terms of tracer concentration (Figures 2  and 4) and mean transit time for the drainage flux (Figures 6a and 7a). However, significantly different estimates of the total water storage are found at the NH site. This might seem surprising as long-term mean transit time (also indicated as hydraulic turnover time) is usually computed as the ratio between the total storage and the mean outflow (see Sprenger et al., 2019), so they are not expected to be similar if the storage is different. Key to understanding this result is the fact that the bottom drainage is not the only outflow, and part of the total storage does not drain to the bottom because it is removed by ET fluxes. To overcome this issue, we propose the use of the storage volume that is ultimately released to drainage (S Q ) or to evapotranspiration S ET rather than the total storage S. This partitioning is somewhat similar to the coefficient that was used to partition precipitation. Back-of-the-envelope calculations allow estimating the amount of stor-ageS F that produces a given average fluxF asS F = MTT F * F. For the NH site, this results in an average of roughly 350 mm of storage (according to both models) that drains to the bottom of the column. Conversely, only 150 mm (tran-SAS estimate) or just 40 mm (HYDRUS-1D estimate) of storage leave the control volume as ET. The larger ET storage estimated by the conceptual model for the NH lysimeter is associated with longer transit times to ET (Figures 6b, 8, and 9). Clearly, storage estimates may be affected by factors like the implementation of the initial conditions in the two models. However, the key result is that the two models make use of the same volume of "drainage-producing" storageS Q , even if the total storage is different. This suggests that the total storage itself is not a reliable estimator of the mean transit time to drainage if information on the fraction that drains as percolation is not available. Despite the uncertainty on the estimate of ET transit times, our results suggest that while precipitation is evenly partitioned between drainage and ET, the storage is exceedingly made up of water that will eventually drain to the lysimeter base.
The two models were here compared starting from their ability to reproduce the observed tracer time series. While tran-SAS is a purely transport model that takes flows as input data, HYDRUS-1D (as typical of spatially explicit models) simulates flow as well and it needs to be calibrated on both flow and transport. In this study, the selected objective function for HYDRUS-1D calibration gave equivalent weight to the two data types and resulted in a model which is reliable both on flow (see Figure S3 and   Figure 7) and transport (Figures 2 and 4). This is deemed important to guarantee a fair model comparison.
There exist many reasons why the two models predict different transport dynamics for the ET flux. For example, grass roots in HYDRUS-1D only access the water and age pools that are at the top of the soil, while Figure 10. (a, b) Random selection of daily SAS functions for bottom drainage ( Q , blue curves) and for evapotranspiration ( ET , green curves) obtained from the two models. Although the two lysimeters differ in size, soil texture, and vegetation type, their Q are very similar and consistent across the two models. The functions ET are instead difficult to constrain using drainage data only and their estimate is rather uncertain.
ET in tran-SAS is free to sample potentially any age in storage. We carried out a sensitivity analysis on rooting depth for the HYDRUS-1D model at the EPFL case study (testing rooting depths of 7, 15, and 30 cm for grass) to understand the impact of this parameter on the model simulations (see section S4). Results show that rooting depth has a negligible influence on the simulation of bottom drainage, drainage breakthrough curve, and on the related water age statistics, and it only affects the ET age dynamics. Thus, it is not surprising that calibration against drainage tracer data does not provide a constraint on ET age distributions, and this prevents us from assessing the ET age dynamics. In HYDRUS-1D, incorporating evaporation into transpiration possibly causes small modifications to the ET age distributions. But as the detailed ET age dynamics cannot be validated and they have poor impact on the simulations, the separation between evaporation and transpiration has little relevance for the results. The lack of tracer data for the ET flux calls for other experimental and numerical investigations in this area. Recent work by Evaristo et al. (2019) provided tracer breakthrough curves in the xylem water for different plant species and showed in their experiment that the transit times for plant transpiration were higher than for groundwater recharge. As systematic tracer data collection from xylem samples is becoming increasingly available (see Allen et al., 2019;Brinkmann et al., 2018;Penna et al., 2018), the direct experimental evidence to solve, at least locally, the ET age dynamics is now at reach and it supports model intercomparisons for the ET transport dynamics.

Conceptualization of Transport Processes Within a Soil Column
Models based on the use of SAS functions rely on their ability to effectively describe the ensemble of transport processes that occur within the subsurface. A random selection of 30 daily SAS functions computed from the 1-D model is shown in Figure 10 together with the calibrated SAS functions from the lumped model. The subsample of 30 SAS functions was selected to be representative of the SAS variability observed over the entire simulation period. The simplified 1-D domain produces SAS functions that are smoother and less time variant than in recent 2-D and 3-D applications by Pangle et al. (2017), Remondi et al. (2018), Yang et al. (2018), and Kaandorp et al. (2018). For the NH case study this smoothness is likely enhanced by the use of weekly averaged drainage samples and a higher degree of time variance can be expected from higher-frequency data. Figure 10 indicates that drainage and evapotranspiration appear to source from opposite components of the water storage. This is consistent with previous applications of the SAS approach to lysimeter data . SAS functions for ET (green curves in Figure 10) are difficult to constrain and are thus more uncertain (see also section 4.1), but in both cases they suggest that evapotranspiration Figure 11. Similarity between the SAS functions Q obtained at both lysimeters with several methods: calibrated power function with parameter k ≈ 4 (blue continuous line), HYDRUS-1D simulations (gray shaded areas), theoretical SAS functions (Benettin et al., 2013) resulting from a 1-D advection-dispersion model with Péclet numbers P e = 10 and P e = 15 (black dashed lines). mainly abstracts the younger storage elements. SAS functions for the drainage flux are quite similar for the two models and indicate that drainage mainly removes the older stored water.
A major question is why these lysimeter experiments, which are very different in size, environmental forcings, and vegetation state, support similar SAS functions for drainage. To understand this result, we resort to approximating transport within the column as the interplay between advective fluxes (mostly representative of vertical matrix flow) and dispersive fluxes (that account for the presence of macropore flow), whose ratio defines the Péclet (P e ) number. We evaluate P e = L · V ∕D where L is lysimeter total depth, V is Darcy velocity (estimated from the average MTT as 0.74 cm/day for the NH experiment and 1.96 cm/day for the EPFL experiment), and D is the dispersion coefficient (estimated from dispersivity and Darcy velocity as 15.4 cm 2 /day for NH and 13.3 cm 2 /day for EPFL). We obtain comparable P e numbers for the NH (P e = 10) and the EPFL (P e = 15) sites, indicating that advection has similar prevalence over dispersion in both experiments. Then, we compare the SAS functions obtained in this study with the theoretical SAS function of a homogeneous 1-D advection-dispersion system (Benettin et al., 2013) for P e = 10 and P e = 15. The result is illustrated in Figure 11. HYDRUS-1D solves the advection-dispersion equation on a nonstationary flow field which is influenced by plant uptake. Yet its SAS functions only display minor variability around the analytical solution for the stationary case (see gray areas and black dashed curves in Figure 11). The calibrated power function is also similar to the theoretical SAS functions, indicating that it embeds the proper interplay between advection and dispersion (i.e., between matrix and preferential flow) that is needed to explain the observed tracer transport. Given that the drainage flux at the two sites follows similar transport patterns (similar P e numbers), it is not surprising that the calibrated SAS functions are also very similar (best parameter k Q is 3.8 and 4.0 at the two sites).
The consistency shown by the calibrated SAS function and the advection-dispersion process has an additional important consequence: the main component of the lumped model (the SAS function) can now be related to physical properties of the system (i.e., the Péclet number) rather than being a pure calibration parameter. Thus, we can directly transfer assumptions about the physical system into a lumped model parameterization. As we have a general idea about the order of magnitude of P e numbers in soil systems, we can start from these broad estimates and get reasonable first guesses of the relevant SAS functions even in the absence of tracer data. Further, we can run scenarios or run sensitivity analyses corresponding to different assumptions on the Péclet number.
This may bear important consequences for cases where processes are too small scale or complex to be physically represented in the spatial domain. Spatially explicit models usually need to integrate physical processes at some grid scale. When the grid surface is relatively large (as in the case of large-scale models, whose grid typically exceed 1 km 2 ), an effective description of processes occurring at the subgrid scale is needed. Spatially explicit transport models typically solve this problem by using series of subsurface storage volumes that are assumed to be completely mixed (e.g., Ala-aho et al., 2017;Kuppel et al., 2018;Remondi et al., 2018). For example, the mesoscale hydrological model (Heße et al., 2017) can run with grid sizes of more than 10 km 2 and it makes use of three completely mixed (or randomly sampled) soil layers. Such a model can be conveniently used to investigate the vulnerability of regional aquifers to pollution. Our results suggest that using a single soil compartment with a SAS function controlled by the Péclet number can be a valid alternative to the use of multiple well-mixed layers. Clearly, providing a single parameter for areas as large as entire catchments is challenging, but this is a general problem of models with large grid cells.
An additional remark pertains the age selection patterns revealed by the SAS function. At catchment scales, the shape of the SAS functions has often been found to reflect marked preference for younger stored water, at least in humid climates (Benettin et  Here, we find that, for dominantly advective flows like vertical percolation in soils, recharge fluxes mostly drain the older storage components. Thus, the idea that outflows tend to be younger than the storage that they drain (Berghuijs & Allen, 2019) may not apply to water percolation in soils, at least in agricultural or reconstructed soil columns like the ones investigated in this study. This opens the question of how integration from plot scale to catchment scale may lead to predictable transitions in the shape of the SAS function.

What Is Next
SAS-based models do not explicitly discriminate between matrix and preferential flow, and they simulate an effective flow path distribution and the ET abstractions. Therefore, they effectively reproduce abstraction processes even without detailed knowledge of the physical mechanisms responsible for generating them. Comparisons with spatially explicit process-based models may highlight what these processes are. However, in many practical cases, the architecture of the networks of macropores and preferential flow that make up for the leading characters of the hydrologic response may be beyond the reach of a description suited to a physically based model, thus forcing the use of conceptual schemes of the abstraction mechanisms. This is also true when integrating the effects of heterogeneous properties, inevitable at the catchment scale.
May we note that HYDRUS-1D was selected here as a popular and simple, physically based model. As any model, it necessarily includes simplifications (e.g., it neglects the occurrence of macropore flow owing to the limited insight we ordinarily have about its pathway structures). Yet it is a good starting point to compute, through a simple procedure, time-variant TTDs, and SAS functions (section 2.3). The next step would be to compute SAS functions directly from more advanced distributed models applied to arbitrarily large catchment control volumes (e.g., Maxwell et al., 2018;Yang et al., 2018) and, of course, to understand what physical processes control their shape in the general case.
Here it suffices to note that the very same approach used in this paper may be employed to compare conceptual and physically based models at larger scales and in higher dimensions. To that end, the reduced model parameter structure of the SAS model and the inherently more predictive formulation of transport by TTDs with respect to Eulerian of Lagrangian ones may prove a definite advantage. Key to such feat-a true catchment theory encompassing both flow and transport-will hinge on the experimental closure of mass balance at catchment scales, that is, one where the ages of any abstractions can be directly measured. While this is standard in isotope hydrology for streamflow, it remains to date an insurmountable challenge for large and diverse assemblages of vegetation.

Conclusions
In this study, we compared two transport models in terms of tracer transport and TTDs. We used a spatially integrated model based on SAS functions and a 1-D physically based model and calibrated them to drainage data from two contrasting lysimeter experiments to explore transport processes in soils. Results show the following: • The two models provide a consistent picture of transport processes through the soil columns and similar estimates of the mean transit times to drainage (MTT Q ). The larger lysimeter (2-m depth) had MTT Q of about 9 months, while the smaller lysimeter (1-m depth) had MTT Q of about 1.5 months during the study period. • Tracer data and SAS functions computed from both models consistently indicate that drainage water was composed of the older storage water in both case studies. • SAS functions for the ET flux indicate that ET generally tended to abstract the younger storage water. However, the detailed ET age dynamics had little influence on the simulated transport dynamics and thus remain unconstrained. • The SAS-based model uses a lumped formulation of transport, yet it offers the opportunity to explore the partitioning of precipitation-and of the water storage-between discharge and evapotranspiration. • The two lysimeters had a similar ratio of advective to dispersive transport (P e = 10-15). The conceptual model was able to simulate the tracer movements because it captured the correct balance between these two mechanisms. • Work needs to be done to extend this framework to larger scales. Conceptual models are a valuable interpretation tool that comes at low computational costs, and coupling them to physically based models allows deeper understanding of the underlying transport processes.