## 1. Introduction

[2] Available data on soil hydraulic parameters (e.g., saturated hydraulic conductivity and water retention parameters) are often inadequate for an accurate simulation of unsaturated flow and contaminant transport. Statistical methods have been developed to estimate the parameters from data that can be obtained easily, such as moisture content, soil texture, bulk density, and geophysical data. Pedotransfer functions (PTFs) are statistical methods that estimate the soil hydraulic parameters from pedotransfer input variables such as bulk density, soil texture, and organic carbon content [e.g., *Rawls et al.*, 1991; *Tamari et al.*, 1996; *Schaap and Bouten*, 1996; *Schaap et al.*, 1998; *Pachepsky et al.*, 1996, 1999; *Minasny et al.*, 1999; *Wosten et al.*, 2001; *Pachepsky and Rawls*, 2004]. Typically, the PTF-estimated parameters are used directly in numerical modeling, while the uncertainty of the parameter estimates and its effect on modeling results are often not investigated or are simply ignored. The parameter estimation uncertainty can be significant, especially when multiple kinds of data are used for the estimation. We hypothesize that quantifying the uncertainty can improve the parameter estimation and subsequent modeling predictions and that the uncertainty quantification can help evaluate the usefulness of PTF-based parameter estimation methods for unsaturated flow modeling.

[3] The objective of this study is to investigate the uncertainty of the PTF-based parameter estimates and its effect on moisture flow simulation. The research is conducted for a field site where a cokriging method is used to generate pedotransfer variables and an artificial neural network (ANN) based PTF is used to estimate soil hydraulic parameters for numerical modeling [*Ye et al.*, 2007a]. Uncertainty of the estimated soil hydraulic parameters is attributed to two sources: the PTF intrinsic uncertainty, due to the limited data used to train the PTF, and the PTF input uncertainty in the pedotransfer variables. The relative effect of the two kinds of uncertainty on the parameter estimation uncertainty is investigated. Monte Carlo simulation is used to propagate the parameter estimation uncertainty through the modeling of flow in unsaturated media. Two sets of Monte Carlo simulations are conducted: the first addresses only the PTF intrinsic uncertainty, while the second considers both PTF intrinsic and input uncertainty. The relative contribution of the two kinds of uncertainty to the predictive uncertainty is also investigated. Exploring the relative contribution is important for uncertainty reduction, because limited resources can then be directed to gathering information on the most important uncertainty source. This study appears to provide the first comprehensive investigation of the relative contribution of PTF intrinsic and input uncertainties and their effects on parameter estimation uncertainty for unsaturated flow modeling.

[4] The uncertainty assessment is conducted for an ANN-based PTF developed by *Ye et al.* [2007a] for the Sisson and Lu (SL) field injection site [*Sisson and Lu*, 1984] located within the U.S. Department of Energy Hanford Site in southeastern Washington. The SL site was used for a field infiltration experiment from June to July in 2000 [*Gee and Ward*, 2001; *Ward et al.*, 2000, 2006a]. The initial moisture content distribution was measured on 5 May 2000 at 32 radially and symmetrically arranged cased boreholes (Figure 1). Injections began on 1 June, when 4000 L of water were metered into an injection point 5 m below the land surface over 6 h. Similarly, 4000 L of water were injected in each subsequent injection on 8, 15, 22, and 28 June. During the injection period, neutron logging of the moisture content (θ) in the 32 wells took place within a day (i.e., on 2, 9, 16, and 23 June) following each of the first four injections. A wildfire burned close to the field site, preventing immediate logging of the θ distribution after the fifth injection on 28 June. Three additional readings of the 32 wells were subsequently completed on 7, 17, and 31 July. During each neutron logging, moisture contents were monitored in each well at a depth interval of 0.3048 m (1 foot), starting from a depth of 3.9624 m (13 feet) and continuing to a depth of 16.764 m (55 feet), resulting in a total of 1376 measurements on each of the eight observation days over a 2-month period. The moisture content measurements, especially those of initial moisture contents, reflect soil heterogeneity at the site, which is the basis for developing the ANN-based PTF for estimating heterogeneous soil hydraulic parameters.
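The measurement count quoted above follows directly from the logging geometry. A small check, assuming one reading at every whole foot from 13 to 55 feet inclusive in each of the 32 wells (the variable names are illustrative only):

```python
FT_TO_M = 0.3048  # exact definition of the international foot in metres

# One reading per foot from 13 ft to 55 ft inclusive.
depths_ft = list(range(13, 56))
depths_m = [round(d * FT_TO_M, 4) for d in depths_ft]

n_wells = 32
# Observation days: 5 May; 2, 9, 16, 23 June; 7, 17, 31 July.
n_days = 8

per_day = n_wells * len(depths_ft)  # readings per observation day
total = per_day * n_days            # readings over the whole campaign

print(len(depths_ft), per_day, total)
print(depths_m[0], depths_m[-1])
```

The 43 depths per well times 32 wells reproduce the 1376 readings per observation day.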

[5] The ANN-based PTF is able to estimate the three-dimensional (3-D) distribution of soil hydraulic parameters from 3-D distributions of PTF input variables (i.e., bulk density and soil texture) obtained using the cokriging method. While the cokriged PTF input variables are uncertain (the uncertainty being measured by cokriging variance), the PTF input uncertainty was not considered by *Ye et al.* [2007a], and the cokriged PTF inputs were treated as being deterministic in the PTF. In addition, *Ye et al.* [2007a] ignored the PTF intrinsic uncertainty in the estimated soil hydraulic parameters, and used only mean parameter estimates to simulate the SL site field injection experiment. Although the simulated moisture contents agreed reasonably well with the corresponding field measurements and the agreement was comparable with that of previous modeling studies of the same experiment [e.g., *Zhang et al.*, 2004; *Yeh et al.*, 2005; *Kowalsky et al.*, 2005; *Ward et al.*, 2006b], a mismatch was observed at each borehole [*Ye et al.*, 2007a]. This raises several questions. Does the mismatch indicate that the PTF parameter estimates are inadequate to simulate the observed moisture content variability? Will consideration of the parameter estimation uncertainty improve the simulation in the sense that the observed moisture content variability can be sufficiently captured? The answers to these questions are explored in this study.

[6] According to *Chirico et al.* [2007], PTF parameter estimation uncertainty is attributed to three major sources: (1) PTF model parameters, (2) PTF input variables, and (3) PTF model structures. The PTF model parameter uncertainty is referred to in this paper as the PTF intrinsic uncertainty to distinguish it from PTF parameter estimation uncertainty of the soil hydraulic parameters. Exploring the intrinsic uncertainty is not a common practice in PTF development. *Schaap and Leij* [1998] estimated such uncertainty using the nonparametric bootstrap method [*Efron*, 1992]. In the work by *Schaap and Leij* [1998], the bootstrap method randomly resamples, with replacement, an original data set of size *N* into a bootstrap realization also of size *N*. To simulate the repeated collection of data, the selection procedure is repeated *M* times for the original data set, with each bootstrap realization containing approximately 63% of the original data. An ANN-based PTF is calibrated for each of these *M* realizations and validated on the approximately 37% of the data not contained in the specific bootstrap realization. The multiple realizations of the calibration and validation data sets represent the PTF intrinsic uncertainty; the uncertainty is quantified by assigning a probability distribution to the *M* realizations of parameter estimates. Although *Schaap and Leij* [1998] developed the theoretical basis for assessing the PTF intrinsic uncertainty, typically only the mean values of the parameter estimates are used in numerical modeling. This ignores the PTF intrinsic uncertainty; the mean estimates, however, may not be able to explain the observed physical behavior. For a hydrologic model, *Srivastav et al.* [2007] reported that use of the bootstrap mean failed to capture the hydrograph peak flow characteristics. This was also observed by *Ye et al.* [2007a], wherein the bootstrap mean parameter estimates could not adequately capture the observed moisture content variability. To our knowledge, investigating the effect of bootstrap parameter estimation uncertainty on unsaturated flow modeling has not yet been reported.
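The bootstrap procedure described above can be illustrated with a minimal sketch. It is not the implementation of *Schaap and Leij* [1998]: a one-variable linear fit stands in for the ANN-based PTF, the data are synthetic, and all names (`bootstrap_realizations`, `x`, `y`, `x_new`) are hypothetical. The roughly 63% figure arises because the chance that a given sample is drawn at least once in *N* draws with replacement is 1 - (1 - 1/*N*)^*N*, which approaches 1 - 1/e ≈ 0.632 for large *N*.

```python
import numpy as np

def bootstrap_realizations(n_samples, n_realizations, rng):
    """Yield (calibration_idx, validation_idx) for each bootstrap realization."""
    for _ in range(n_realizations):
        # Draw N indices with replacement: the calibration set, also of size N.
        calib = rng.integers(0, n_samples, size=n_samples)
        # Samples that were never drawn form the validation set.
        held_out = np.setdiff1d(np.arange(n_samples), calib)
        yield calib, held_out

rng = np.random.default_rng(0)
N, M = 500, 100

# Synthetic data for illustration only: x plays the role of a PTF input,
# y the role of a soil hydraulic parameter.
x = rng.uniform(1.3, 1.9, N)
y = 2.0 - 1.5 * x + rng.normal(0.0, 0.2, N)

x_new = 1.6  # hypothetical input value at which the parameter is estimated
estimates, unique_fracs = [], []
for calib, held_out in bootstrap_realizations(N, M, rng):
    # Linear stand-in for calibrating one PTF on this realization.
    slope, intercept = np.polyfit(x[calib], y[calib], deg=1)
    estimates.append(intercept + slope * x_new)
    unique_fracs.append(1.0 - held_out.size / N)

estimates = np.array(estimates)
mean_est = estimates.mean()        # the bootstrap mean estimate
std_est = estimates.std(ddof=1)    # spread across the M realizations
mean_unique = float(np.mean(unique_fracs))
print(mean_est, std_est, mean_unique)
```

Using only `mean_est` in a flow model discards the spread `std_est`; retaining all *M* entries of `estimates` is what allows the intrinsic uncertainty to be propagated.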

[7] Another source of PTF parameter estimation uncertainty is the PTF input uncertainty, which can be due to data error and spatial or temporal variability of PTF inputs. This uncertainty is usually ignored in PTF applications, because it is either unknown or difficult to propagate through the modeling. This study presents a unique opportunity for assessing the PTF input uncertainty caused by spatial variability, because the PTF input variables are estimated using the cokriging method and the input uncertainty is measured by the cokriging variance. In addition to the factors contributing to (co)kriging variance discussed by *Isaaks and Srivastava* [1989], the cokriging variance can also be due to the error in estimating the cross variogram [*Lark*, 2003] and the error in primary and secondary data [*Abbaspour et al.*, 1998]. As shown below, these PTF inputs, estimated largely from the initial moisture content (θᵢ, cm³/cm³, %), are subject to significant uncertainty; this uncertainty dominates the total PTF parameter estimation uncertainty, and increases the predictive uncertainty of simulated moisture content.
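The distinction between propagating intrinsic uncertainty alone and propagating intrinsic plus input uncertainty can be sketched at a single grid cell. The example below is a toy under stated assumptions, not the procedure of this study: the PTF is a hypothetical linear relation `log10(Ks) = a + b * bulk_density`, the coefficient realizations are synthetic, the input is assumed Gaussian with variance equal to the cokriging variance, and every numerical value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for M bootstrap PTF realizations: each row holds (a, b)
# of a toy linear PTF  log10(Ks) = a + b * bulk_density.
M = 200
ptf_coeffs = np.column_stack([
    rng.normal(2.0, 0.05, M),
    rng.normal(-1.5, 0.05, M),
])

def toy_ptf(coeffs, bulk_density):
    """Evaluate the toy PTF for an array of (a, b) rows."""
    return coeffs[..., 0] + coeffs[..., 1] * bulk_density

# Hypothetical cokriged estimate at one grid cell (values are arbitrary).
rho_mean = 1.6  # cokriged bulk density, g/cm^3
rho_var = 0.01  # cokriging variance, (g/cm^3)^2

n_draws = 5000
picked = ptf_coeffs[rng.integers(0, M, size=n_draws)]

# Set 1: intrinsic uncertainty only; the input is fixed at its cokriged mean.
set1 = toy_ptf(picked, rho_mean)

# Set 2: intrinsic plus input uncertainty; the input is sampled
# (Gaussian assumption) using the cokriging variance.
rho_draws = rng.normal(rho_mean, np.sqrt(rho_var), size=n_draws)
set2 = toy_ptf(picked, rho_draws)

var_intrinsic = set1.var()
var_total = set2.var()
input_share = 1.0 - var_intrinsic / var_total  # rough share due to the input
print(var_intrinsic, var_total, input_share)
```

Because the numbers are invented, the resulting `input_share` says nothing about the SL site; the sketch only shows how the two sets of realizations differ in construction and how their variances can be compared.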

[8] PTF structure is another source of uncertainty [*Chirico et al.*, 2007]; that is, different PTFs implement different conceptualizations of physical or statistical realities and thus yield different estimates. One way of addressing this uncertainty is to compare the parameter estimates with corresponding measurements and to select the optimal model that gives the minimum difference [*Williams et al.*, 1992; *Tietje and Tapkenhinrichs*, 1993; *Kern*, 1995; *Minasny et al.*, 1999; *Cornelis et al.*, 2001]. This method, however, does not guarantee that the selected model will give satisfactory predictions for the variables of interest. An alternative approach is to conduct numerical modeling using parameter estimates from various PTFs, and to select the PTF model that gives the minimum difference between the simulated and observed state variables [*Espino et al.*, 1996; *Finke et al.*, 1996; *Minasny and Field*, 2005; *Mermoud and Xu*, 2006; *Dai et al.*, 2008]. Because both methods ultimately rely on a single selected model, neither method completely addresses the PTF model uncertainty. *Chirico et al.* [2007] found that the PTF model structure uncertainty can contribute more to PTF parameter estimation uncertainty than the PTF input uncertainty. *Pachepsky et al.* [2006] suggested that the recently developed method of Bayesian model averaging [*Neuman*, 2003; *Ye et al.*, 2004, 2005a, 2008] is a promising tool for addressing the PTF model uncertainty. However, addressing the model uncertainty is beyond the scope of this study.