Using observed soil moisture to constrain the uncertainty of simulated hydrological fluxes

Using data from five long‐term field sites measuring soil moisture, we show the limitations of using soil moisture observations alone to constrain modelled hydrological fluxes. We test a land surface model, Modélisation Environnementale communautaire‐Surface Hydrology/Canadian Land Surface Scheme, with two configurations: one where the soil hydraulic properties are determined using a pedotransfer function (the texture‐based calibration) and one where they are assigned directly (the hydraulic properties‐based calibration). The hydraulic properties‐based calibration outperforms the texture‐based calibration in terms of reproducing changes in soil moisture storage within a 1.6 m deep profile at each site, but both perform reasonably well, especially in the summer months. When the models are constrained using observations of changes in soil moisture, the predicted hydrological fluxes are subject to very large uncertainties associated with equifinality. The uncertainty is larger for the hydraulic properties‐based calibration, even though the performance was better. We argue that since the pedotransfer functions constrain the model parameters in the texture‐based calibrations in an unrealistic way, the texture‐based calibration underestimates the uncertainty in the fluxes. We recommend that reproducing observed cumulative changes in soil moisture storage should be considered a necessary but insufficient criterion of model success. Additional sources of information are needed to reduce uncertainties, and these could include improved estimation of the soil hydraulic properties and direct observations of fluxes, particularly evapotranspiration.

the texture-based calibration in terms of reproducing changes in soil moisture storage within a 1.6 m deep profile at each site, but both perform reasonably well, especially in the summer months. When the models are constrained using observations of changes in soil moisture, the predicted hydrological fluxes are subject to very large uncertainties associated with equifinality. The uncertainty is larger for the hydraulic properties-based calibration, even though the performance was better. We argue that since the pedotransfer functions constrain the model parameters in the texture-based calibrations in an unrealistic way, the texture-based calibration underestimates the uncertainty in the fluxes. We recommend that reproducing observed cumulative changes in soil moisture storage should be considered a necessary but insufficient criterion of model success. Additional sources of information are needed to reduce uncertainties, and these could include improved estimation of the soil hydraulic properties and direct observations of fluxes, particularly evapotranspiration.

K E Y W O R D S
land surface models, pedotransfer functions, soil moisture, uncertainty

| INTRODUCTION
Land surface models simulate the vertical exchanges of water and energy between the land and the atmosphere. These models integrate many different interacting processes and are capable of outputting a large number of state variables and fluxes. Soil moisture is the largest store of water on the land surface (aside from lakes) and is probably the easiest hydrological variable to measure. That is not to say that obtaining representative measurements of soil moisture is straightforward-particular challenges are to measure soil moisture at the appropriate scale and to capture the spatial variability (Pan et al., 2017;Peterson et al., 2016Peterson et al., , 2019Vereecken et al., 2008). On the other hand, Mälicke et al. (2020) looked at networks of soil moisture sensors and showed considerable organization in the spatial response of soil moisture changes, concluding that it is the temporal information from a few sensors, rather than the spatial variation from a large number of sensors, that is most valuable for capturing catchment scale moisture dynamics. Often the most useful metric of soil moisture is field scale (10 4 -10 6 m 2 area) root zone (extending from the surface to a depth of 0.5-2 m) storage, S (mm), where S ¼ 10 3 V w =A and V w (m 3 ) is the volume of water and A (m 2 ) is the land surface area. Such metrics can be useful for irrigation planning, crop yield assessment, and other vegetation productivity measures (Vereecken et al., 2008).
Consider the simple soil water balance equation where I, T, E and D are infiltration, transpiration, evaporation and drainage, all with units (mm/d). Changes in soil moisture over some time increment are obtained by Subsequent values of storage are given by The water balance error, ε (mm), over some time increment is given by Equation (4) may be evaluated over long-term intervals (monthly, annually or multi-year) for observations and field estimates of fluxes to assess consistency of those data, and large errors (e.g. >50 mm/ year) can be expected (e.g. Pan et al., 2017). It may also be solved for models over the entire simulation period, to assess model errors in the simulated states and fluxes, and in this case the errors should be very small (e.g., <10 À3 mm/year).
Each of the fluxes in the above equations are simulated by the modelthat is to say, they are not boundary conditions (most cannot be directly measured, Vereecken et al., 2008, Li et al., 2019 We have available field measurements of volumetric liquid water content, θ (m 3 m À3 ), at a number of discrete depth intervals, in the profile, z j (m), and recording at discrete intervals in time, t i (d). The profile storage, S (mm), is given by A more useful metric of storage for use with calibrating models is the cumulative change in storage, Ω (mm). This is defined: We can also define the analogous cumulative change in volumetric water content, Θ (m 3 m À3 ) as The benefits of calibrating and validating models against Ω or Θ rather than S or θ are twofold. Firstly, observations of soil moisture are made indirectly, and an instrument calibration relationship is required for the probe that relates some measured variable (typically the dielectric constant) to the volumetric (liquid) water content (Gardner et al., 2003). These calibrations can introduce errors, but the errors in the change in water content, measured between two points in time, are smaller than the errors in the absolute water contents. Hence, we have more reliable observations of the changes in water content, and hence changes in storage, than we do of the actual water contents.
The second reason is that in most soil moisture models, hydraulic properties are determined using a normalized measure of water content, the effective saturation, given by where θ r and θ p are the residual and saturated water content. The simulated cumulative change in θ (i.e. Θ SIM ) is the same as the simulated cumulative change in θ À θ r (i.e. from Equation 6 we have . Therefore, any arbitrary value for θ r can be used to simulate Θ SIM (θ r ¼ 0 is a sensible choice) and the number of free calibration parameters can be reduced by one. Moreover, after the model has been calibrated to reproduce Θ OBS , if desired, the actual values of θ SIM can be obtained from θ SIM ¼ Θ SIM À Θ SIM þ θ OBS , (and, if desired, the parameters θ r and θ p can be rescaled in the same way).

| Field sites
We collected field data at five instrumented sites aligned from south to north in central Saskatchewan, Canada, Figure 1. Saskatchewan has a continental and seasonally frozen climate and a general trend of increasing mean annual precipitation and decreasing mean annual temperature from south to north. Soil type and vegetation cover at each site are summarized in Table 1.  (Miller et al., 1985).
Vegetation at the site is mainly native and non-native grasses, and riparian vegetation or "willow-rings" surrounding numerous ponds. T A B L E 1 Field site characteristics (Bam et al., 2019;Barr et al., 2012;Pan et al., 2017)  The soil moisture site is in a lowland between two ephemeral ponds.
Here, the water table has been observed within 1 m of the ground surface. Annual precipitation is approximately 360 mm, of which 1/3 is snowfall, but is highly variable (Bam et al., 2019). Annual potential evaporation exceeds precipitation in the region, and annual open water evaporation is approximately 700 mm (Parsons et al., 2004).
Mean January and July temperatures

| Instrumentation and data
In this study, meteorological data were used to drive the model and soil moisture data were used to assess the model performance and constrain the simulated fluxes. The specific instruments and measurement heights/depths are summarized in Table 2.
The driving data are plotted in Figure 2, which shows the interannual, seasonal and diurnal variation in these datasets. Precipitation is measured at each site with a weighing precipitation gauge that captures both rainfall and snowfall. It is notable that the two prairie sites, Kenaston and SDNWA, have higher windspeeds, which is because the forest sites are more sheltered. As a result, the prairie sites are subject to significant redistribution of snow laterally by the wind, and the snowfall observations from the gauges are not representative of the snow on the ground, which could be higher or lower than the snowfall (Pan et al., 2017). To correct for this, snow survey data from the prairie sites, which measure the peak snow on the ground prior to melt, were used to adjust the snowfall data such that the snow on the ground simulated by MESH would match the snow survey observations. For each winter (typically November-February), every snowfall event was scaled by a constant correction factor that was determined for each year by trial and error. The precipitation data in Figure 2 have been corrected in this way for the prairie sites. Precipitation data at the forest sites were not modified.
The calibration/validation soil moisture data are shown in  (Table 2). We used these data to estimate the average volumetric water content in each of the soil depth increments that correspond to the three layers in the model (i.e. 0-0.1, 0.1-0.35 and 0.35-1.6 m depths), as shown in Figure 3. The data are considered high quality, with minimal gaps or data quality problems. Soil freezing is evident at the prairie sites (Kenaston and SDNWA), where the water content in the first two layers drops rapidly at the start of the winter, and rises rapidly in the springthis is because the water turns to ice, which is not detected by the dielectric probes. The water content in layer 3 at SDNWA is much higher than at Kenaston, which is because the water  (Bartlett et al., 2003;Verseghy, 2017) with hydrological routing scheme WATFLOOD (Kouwen, 1988, recently described in Pomeroy et al., 2016). MESH relies on a mosaic of Group Response Unit's (GRUs) to represent the heterogeneity and hydrological processes of the landscape. A GRU is a grouping of hydrological response units with similar soil and/or vegetation attributes (Xu et al., 2017). In this study, we used a single grid cell with a single GRU to represent the point scale vertical processes. MESH employs CLASS to simulate vertical water fluxes and energy balances for each GRU (Verseghy, 2017). CLASS divides the soil column into three layers and the vertical movement of water between each soil layer is governed by a finite difference solution of one-dimensional Richards' equation for unsaturated flow in porous media (Soulis et al., 2000). The hydraulic properties in CLASS adopt the Clapp and Hornberger (1978) model to determine the relationship between hydraulic conductivity, K (m/d), matric potential, ψ (m), and soil moisture, where where ψ s (m) is the saturated matric potential, θ p is the saturated water content, K s (m/d) is the saturated hydraulic conductivity, and b (À) is a shape parameter. Note, temperature corrections are applied to K s to account for changes in the viscosity of water (Verseghy, 2017, p. 122), and in frozen conditions an additional impedance factor is applied to reduce K s to account for the ice blockages in the pore space (Verseghy, 2017, p. 146). The parameters in Equations (9) and (10) are normally determined using the empirical pedotransfer functions of Cosby et al. (1984), whereby θ p ¼ À0:126X S þ 48:9 100:0 , ψ s ¼ 0:01 e À0:0302XSþ4:33 where X S (%) and X C (%) are the percentage sand and clay, respectively, of the soil in a particular layer. Note, in Equation (14) K s is given The way that the fluxes in Equation (1) are calculated in CLASS is briefly described in Table 3.

| Monte Carlo simulations
To investigate how information from soil moisture observations constrains simulated fluxes, we apply a simple Monte Carlo approach. We allow for uncertainty in the model parameters by randomly sampling parameter values from a uniform (or log-uniform) distribution. We generate 10 000 parameter combinations and run the model with each parameter set over a two-year calibration period. The performance of each realization is determined by calculating ϵ j , the root mean squared error (RMSE) of the cumulative change in liquid water content (Equation 7), for each depth, j, as in Equation (15), and then averaging these over the three layers, as in Equation (16) where Θ o i,j ð Þ and Θ s i,j ð Þ are the cumulative change in volumetric liquid water content from the observations and simulations, respectively, at time index i and depth index j, n is the number of points in time, and ϵ T is the overall performance metric, with units of volumetric water content (m 3 /m 3 ). The layer average RMSE was used, so that the performance in each layer would have an equal weighting-if the total F I G U R E 3 Soil volumetric liquid water content observations from the five field sites, averaged over the three model layer depth increments. The shaded cyan areas represent times of the year when the observed soil temperature in layer 1 was below 0 C storage was used the performance would be biased towards the 3rd soil layer, as this is much thicker (1.25 m) than the 1st (0.1 m) and 2nd (0.25 m) layers.
As described in Section 2.3, the CLASS model is typically run using the Cosby et al. (1984) pedotransfer function, which determines the soil hydraulic properties from soil texture-specifically sand percentage, X S and clay percentage X C . We perform two separate calibration experiments here: the first using these pedotransfer functions samples the parameters X S and X C for each soil layer-that is, six free parameters; the second does not use the pedotransfer functions, and samples the four hydraulic properties (θ p , b, ψ s , K s ) directly for each soil layer-that is, 12 free parameters. Hereafter we describe these two experiments as the texture-based calibration and hydraulic properties-based calibration. In addition to the 6/12 soil parameters, we also sample four additional parameters that are understood to have a strong control on the simulated fluxes (Nazarbakhsh et al., 2020): the minimum and maximum annual leaf area index, L min and L max (À), the minimum stomatal conductance, r s,min (s/m) and a drainage index, ϕ (À) ( Table 4). To prevent the random parameter sampling procedure from generating L min > L max we instead sample L max and the factor f L such that L min ¼ min L max , L max f L ð Þ . f L is randomly sampled from a uniform distribution between 0.5-1.25, which ensures that around one-third of combinations will have L min ¼ L max . For the texture sampling, we need to specify ranges in sand (X S Þ, clay (X C ) and silt (X L ) and ensure that the sum of the three equals 100%, which is done by randomly sampling X S , X C and X L from uniform distributions (see ranges in Table 4), and then rescaling each by the same scale factor such that X S þ X C þ X L ¼ 100. The texture-based and hydraulic properties-based calibrations sample 10 and 16 free parameters, respectively. The parameter ranges considered for each site were based on knowledge of the soils and vegetation characteristics at each site, and are shown in Table 4.
The models were all initialized on 1st August 2013, and the calibration period ends on 30th September 2015 (two complete hydrological years) while the validation period ends on 30th September 2017 (two additional complete hydrological years). Initializing the models in August eliminates the need to specify initial soil ice content or the initial snowpack. The initial water content of the model was based on observed estimates of the initial saturation, combined with the current realization value of θ p , that is, where θ O is the observed liquid volumetric water content, and subscripts j is the depth index and ini is the initial time. This approach F I G U R E 4 The pedotransfer functions that define soil hydraulic properties as a function of sand and clay content, and θ ψ ð Þ and K θ ð Þ relationships used in CLASS for three example soil textures ensures that the initial relative saturation is always the same, regardless of the porosity value that was sampled randomly in the Monte Carlo simulation.
For the validation run, the 30 best ranked parameter sets were considered. We seek to validate the model validation performance with the Ω observations, and to explore the uncertainty in the model fluxes associated with these runs. The model performance is variable between the different sites, and between the texture based and hydraulic properties-based calibration.

| RESULTS AND DISCUSSION
The rise in storage in the spring (March-April) is characteristic of seasonally frozen soils, and is a complex combination of snowmelt infiltration and soil thaw (whereby ice becomes liquid water, leading to an apparent observed increase in liquid water content, which may or may not be associated with an actual increase in total, that is ice plus liquid, water content). The timing of this rise is delayed in all models, suggesting there may be some limitations with either the snowmelt, the infiltration flux or the soil thawing. However, the magnitude of the rise, and the overall seasonal changes in storage are generally well captured in the models. The worst performance is at the OJP field site, with the texture-based calibration. The reason this is poor is T A B L E 3 Description of soil flux calculations in Canadian Land Surface Scheme

Flux
Calculation description

Infiltration
Rainfall and through fall on the ground are combined with snowmelt to form the potential infiltration flux; the infiltration capacity is calculated by a Green-Ampt model and depends on the soil hydraulic properties; water that cannot infiltrate forms ponding on the ground-surface, where it may later infiltrate unless the surface ponding capacity is exceeded, in which case the excess water forms overland runoff.

Drainage
The bottom of the soil profile, that is, the base of soil layer 3, has a free drainage boundary condition, (Soulis et al., 2000), where D ¼ ϕK 3 θ 3 ð Þ, where ϕ is a drainage parameter that restricts free drainage (0-1).
Soil evaporation Soil evaporation is the sum of evaporation from bare soil and evaporation from soil below the canopy, both of which are driven by a humidity gradient. The humidity at the soil surface, and hence the soil evaporation flux, is reduced as the water content in layer 1 drops below field capacity, and soil evaporation is limited by the availability of water in the top soil layer and surface ponding (Sun & Verseghy, 2019).

Transpiration
Transpiration is extracted from all soil layers, weighted by the root density in each layer, as long as the liquid water content is greater than 0.04. The flux rate depends on the leaf-to-air humidity gradient, the boundary layer resistance and the canopy resistance, which in turn is related to leaf stomatal resistance and leaf area index. The stomatal resistance has a reference value, r s,min , and is modified as a function of incoming solar radiation, vapour pressure deficit, soil moisture in the wettest layer, and air temperature. Through the stomatal conductance term transpiration rates are reduced exponentially as the soil suction in the wettest layer reduces below ψ s , that is, as the soil moisture reduces (Sun & Verseghy, 2019, Verseghy, 2017. T A B L E 4 Parameter ranges considered. All parameters are sampled from a uniform distribution except K s which is sampled from a log-uniform distribution K s (m/s) 10 À7 -10 À4 10 À7 -10 À4 10 À7 -10 À4 10 À7 -10 À4 10 À7 -10 À4

Additional parameters
L max (À) 0.5-3 0.5-3 2 -4 2 -4 1 -4 f L (À) 0.5-1.25 0.5-1.25 0.5-1.25 0.5-1.25 0.5-1.25 r s,min (s/m) 50-300 50-300 50-300 50-300 50-300 because the sandy soils at OJP have a very small range of moisture variation, never rising as high as 0.2 (Figure 3). The texture-based properties only allow for a variation in θ p , the saturated water content, of between 0.36 and 0.49, as shown in Figure 4 (top left), and therefore the model always over-estimates the range of variation of water content. This restriction is removed using the hydraulic properties-based calibration. This improvement in performance using the hydraulic properties-based calibration is true for all field sites. This is because the pedotransfer functions used in the texture-based calibration artificially constrain the agility of the model, as discussed by Mendoza et al. (2015). To elaborate, when using pedotransfer functions, for a given parameter value of θ p there will always be a unique parameter value of K s , and there is no way to explore deviations in this combination. With the hydraulic properties-based calibration applied here, this constraint is removed, and θ p and K s are treated as completely independent of one another. These two cases therefore represent two extreme possibilities in terms of the possible parameter space that is explored by the model.  and drainage is relatively small but highly uncertain. At OJP we see the largest uncertainties, particularly in the drainage fluxes that could be anywhere from zero to >200 mm/year. At OBS, where again the uncertainties are lowest overall, evaporation is relatively well constrained, but drainage and runoff are still quite uncertain, especially in 2017. At OAS, the evaporation is somewhat well constrained, runoff is small, but drainage is highly uncertain.

| CONCLUSIONS
We have shown that the MESH/CLASS model is capable of simulating the changes in soil moisture within a 1.6 m deep profile at five diverse prairie/forest field sites relatively well, albeit with some limitations associated with the timing of the rise in liquid water content during the spring melt period. However, this relatively good performance at simulating water content did not result in well constrained predictions F I G U R E 7 Simulated cumulative fluxes from the unconstrained (all 10 000) and constrained (best 30) model runs for the calibration and validation periods of hydrological fluxes. From a simple and qualitative assessment of Figures 6 and 7 we conclude that the information content in soil moisture data is relatively low. The texture-based calibration is apparently slightly better at constraining the fluxes than the hydraulic propertiesbased calibration, which demonstrates that uncertainties in models can be reduced by embedding assumptions within the models. However, this also requires that these embedded assumptions are reasonable. In this case, the embedded assumption is that the hydraulic properties can be determined from pedotransfer functions. Since we see in Figure 5 that doing this degrades the performance of the model in reproducing observed changes in soil moisture storage, this is not deemed a reasonable assumption in this case; the reduction in uncertainty is considered misleading. We therefore conclude that the very wide uncertainty bounds predicted by the hydraulic properties-based calibration are in fact a more accurate reflection of the true uncertainty  (Equations 15 and 16). For all sites, we see that the parameter identifiability is better for the texture-based calibration than for the hydraulic properties-based calibration. For the hydraulic properties-based calibration, the parameters are often completely unidentifiable. One notable exception is at OJP ( Figure A6) where the porosity is identifiable and low values are clearly preferred. This poor identifiability is associated with the equifinality in the model, and further shows that additional observations are needed to constrain the parameters.
F I G U R E A 1 Dotty plots for the Monte Carlo runs at Kenaston using the texture based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 2 Dotty plots for the Monte Carlo runs at Kenaston using the hydraulic properties based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 3 Dotty plots for the Monte Carlo runs at St Denis using the texture based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 4 Dotty plots for the Monte Carlo runs at St Denis using the hydraulic properties based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 5 Dotty plots for the Monte Carlo runs at OJP using the texture based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 6 Dotty plots for the Monte Carlo runs at OJP using the hydraulic properties based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 7 Dotty plots for the Monte Carlo runs at OBS using the texture based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 8 Dotty plots for the Monte Carlo runs at OBS using the hydraulic properties based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 9 Dotty plots for the Monte Carlo runs at OAS using the texture based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content F I G U R E A 1 0 Dotty plots for the Monte Carlo runs at OAS using the hydraulic properties based soil parameterisation. RMSE is the root mean squared error between the observed and simulated cumulative change in water content for the unfrozen period, averaged between the three layers, in units of volumetric water content