## 1. Introduction

[2] Modeling of water fluxes in the unsaturated zone is important for quantifying soil moisture movement between the surface and groundwater. This modeling is intrinsically difficult because the processes are highly nonlinear and because soil structures vary in size from millimeters to kilometers and can rarely be fully resolved. Further sources of error abound: animals and plants have a large impact on the topsoil, processes such as hysteresis and macropore transport may not be included in the model, measurement devices have errors, and there are typically discrepancies between the observation scale and the modeling scale. Despite all these potential sources of error, today we have advanced models that are assumed to adequately represent water flow in the unsaturated zone.

[3] A crucial point in modeling is to decide to what level the details of the system need to be resolved. A high level of detail may provide a better representation of reality, but it requires more data, system knowledge, and computational power. Simpler models, on the other hand, require less detail, are faster to run, and are easier to understand, but may not accurately reproduce the system of interest. A decision on the appropriate level of detail depends on the modeling goal and the available data. Ideally, data and model simulations should be on the same scale, and the model should be able to represent the relevant processes. In modeling of the unsaturated zone for large-scale systems, however, this is rarely the case, and the scale differences between observations and models can be orders of magnitude, as demonstrated, for example, by *Vereecken et al.* [2007]. To resolve differences in scale and to accurately describe spatially distributed processes of soil moisture flow and states, models need to be upscaled.

[4] In their review of upscaling methods, *Vereecken et al.* [2007] distinguished two ways of upscaling. The first way uses small-scale spatial information to derive effective equations and/or effective parameters for the large-scale model. Examples of such approaches are the use of stochastic theory [see, for example, *Vereecken et al.*, 2007; *Zhang*, 2002] and the scaleway approach of *Vogel and Roth* [2003]. The second way is to assume that the model equations can represent the effective behavior of the system and to estimate effective parameters using inverse modeling. Standard models for water flow in the unsaturated zone are often based on the Richards equation, even at larger scales. Although inverse methods to estimate effective parameters are well accepted, the assumption that such models can represent the effective behavior is not always physically justified [e.g., *Vereecken et al.*, 2007]. For this and several other reasons, the estimation of effective parameters is notoriously difficult in the unsaturated zone, as evidenced by the studies of *Papafotiou et al.* [2008], *Mertens et al.* [2005], and *Kumar et al.* [2010], all of which showed differences between parameters estimated for the same system at different scales.

[5] A common approach to estimate hydraulic parameters of small soil columns is to use one- or multistep outflow experiments, in which a saturated soil sample is drained by (stepwise) lowering of the water pressure at the bottom of the sample [e.g., *Bayer et al.*, 2005; *Laloy et al.*, 2010a; *Schelle et al.*, 2010; *Valiantzas and Londra*, 2008; *Vasin et al.*, 2008; *Zurmühl and Durner*, 1998]. The ability to obtain representative hydraulic parameters for such an experiment depends, naturally, on the level of complexity of the system [*Laloy et al.*, 2010a; *Vasin et al.*, 2008] and on the type of model used [*Laloy et al.*, 2010a; *Zurmühl and Durner*, 1998], but also, to a large extent, on which system states the model seeks to reproduce. For example, *Bayer et al.* [2005], *Durner et al.* [2008], and *Laloy et al.* [2010a] all found poor agreement between measured and modeled internal water distribution, even though the averaged state, as determined by the flow in and out of the system, was well described. The difficulties of estimating effective hydraulic parameters have also been pointed out by *Vereecken et al.* [2008], who reviewed problems with estimating hydraulic properties from soil moisture data, and by *Schelle et al.* [2010], who showed that it is particularly difficult to predict the soil moisture regime outside the calibration period used to estimate the effective parameters.

[6] All papers discussed in the previous paragraph ask whether, or to what extent, it is possible to find effective model parameters that reproduce the observations well. Even if one believes that representative effective parameters exist, finding them remains a challenging task. Typically, an optimization algorithm is used to search the parameter space for the parameter combination that minimizes the difference between observations and model predictions. A major problem in inverse modeling applications that estimate hydraulic parameters of the unsaturated zone is the long run time of a single flow model evaluation, which restricts our ability to adequately explore the parameter space. This, in turn, can lead to additional uncertainty in the resulting parameter estimates. With increasing computational power, methods for automatic parameter estimation that work within an acceptable number of model evaluations have become increasingly popular. Apart from estimating the best set of parameters for a particular problem, some methods also assess model parameter uncertainty. Examples of such methods are Markov chain Monte Carlo (MCMC) methods [e.g., *Gelman et al.*, 2004; *Vrugt et al.*, 2008], informal Bayesian approaches using generalized likelihood functions [e.g., *Beven and Freer*, 2001], and multiobjective parameter estimation approaches that search for an entire set of solutions that are all optimal in the sense that an improvement in one objective results in a deterioration of another objective [e.g., *Vrugt et al.*, 2003].
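
To make the idea of sampling-based parameter estimation concrete, the following is a minimal sketch of a random-walk Metropolis sampler, the simplest member of the MCMC family mentioned above. It is not the algorithm used in any of the cited studies. The function `toy_model` is a hypothetical, cheap stand-in for a flow model, and the parameter names, bounds, and step sizes are assumptions chosen purely for illustration.

```python
import math
import random


def toy_model(params, times):
    """Hypothetical stand-in for a flow model: a saturating outflow curve."""
    rate, plateau = params
    return [plateau * (1.0 - math.exp(-rate * t)) for t in times]


def log_likelihood(params, times, observations, sigma):
    """Gaussian white-noise log-likelihood, up to an additive constant."""
    predictions = toy_model(params, times)
    sse = sum((o - p) ** 2 for o, p in zip(observations, predictions))
    return -0.5 * sse / sigma ** 2


def in_bounds(params, bounds):
    """True if every parameter lies inside its (low, high) interval."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(params, bounds))


def metropolis(times, observations, sigma, start, bounds, step, n_iter, seed=0):
    """Random-walk Metropolis with a uniform prior inside `bounds`."""
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current, times, observations, sigma)
    chain = []
    for _ in range(n_iter):
        # Propose a Gaussian perturbation of the current parameter vector.
        proposal = [p + rng.gauss(0.0, s) for p, s in zip(current, step)]
        if in_bounds(proposal, bounds):
            proposal_ll = log_likelihood(proposal, times, observations, sigma)
            # Accept with probability min(1, likelihood ratio).
            accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
            if rng.random() < accept_prob:
                current, current_ll = proposal, proposal_ll
        # A rejected proposal repeats the current state in the chain.
        chain.append(list(current))
    return chain


if __name__ == "__main__":
    # Synthetic observations generated from assumed "true" parameters.
    data_rng = random.Random(1)
    times = [0.5 * i for i in range(1, 41)]
    sigma = 0.02
    obs = [y + data_rng.gauss(0.0, sigma) for y in toy_model((0.3, 1.0), times)]
    chain = metropolis(times, obs, sigma, start=[0.1, 0.5],
                       bounds=[(0.01, 2.0), (0.1, 2.0)],
                       step=[0.01, 0.01], n_iter=6000)
    kept = chain[3000:]  # discard the first half as burn-in
    print("posterior mean rate:", sum(c[0] for c in kept) / len(kept))
    print("posterior mean plateau:", sum(c[1] for c in kept) / len(kept))
```

Every iteration requires one model evaluation, which is why the run time of a single flow simulation limits how thoroughly the parameter space can be explored.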

[7] An important decision in setting up an inverse modeling problem is the definition of a likelihood function because this automatically entails assumptions about the underlying causes of the difference between observations and model. In their theoretical treatment of model use, *Kennedy and O'Hagan* [2001] discussed six groups of errors that might cause deviations between models and observations: parameter uncertainty, model inadequacy, residual variability, parametric variability, measurement error, and code uncertainty. Of these errors, the measurement error is the most commonly treated. It is often assumed that it can be treated as uncorrelated noise that follows a Gaussian distribution with zero expectation (i.e., white noise), which allows treatment with well-established statistical methods. The uncertainty resulting from a possibly incomplete search of the parameter space, and hence the risk of finding only a local optimum, is in this view a code uncertainty. In unsaturated zone modeling, it is common that certain structures of the soil or certain processes are not well represented, which makes the model an imperfect representation of the real world. This is referred to as model inadequacy. Because errors from model inadequacy and code uncertainty are often strongly correlated in time and space, it is inappropriate to describe them as uncorrelated Gaussian noise. It is therefore common practice to ignore them, despite the fact that they can be orders of magnitude larger than measurement errors [*Doherty and Welter*, 2010].
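
The white-noise assumption described above corresponds to a familiar form of the likelihood. As a sketch, with notation that is assumed here rather than taken from the paper ($y_i$ for the $n$ observations, $f_i(\theta)$ for the corresponding model predictions under parameters $\theta$, and $\sigma$ for a constant measurement-error standard deviation), independent zero-mean Gaussian errors give:

$$
L(\theta \mid y) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\!\left(-\frac{\bigl(y_i - f_i(\theta)\bigr)^{2}}{2\sigma^{2}}\right),
\qquad
\ln L(\theta \mid y) = -\frac{n}{2}\ln\!\left(2\pi\sigma^{2}\right) - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} \bigl(y_i - f_i(\theta)\bigr)^{2}.
$$

For fixed $\sigma$, maximizing this likelihood is equivalent to minimizing the sum of squared residuals. Correlated errors, such as those arising from model inadequacy, violate the independence assumption behind the product.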

[8] An alternative approach to deal with errors in modeling is to include external error models that correct for discrepancies between observations and model predictions. Examples of such approaches are the use of autoregressive models and external adjustments of model forcing terms used by *Kavetski et al.* [2006], *Vrugt et al.* [2008], and *Reichert and Mieleitner* [2009]. Recently, a formal Bayesian approach was proposed that uses a likelihood function that can take into account skewness, heteroscedasticity, and correlation of the residuals [*Schoups and Vrugt*, 2010]. This method was successfully tested for a hydrological test case. Depending on the complexity of a problem and the availability of data, different problems might require different error treatments. In this context, *Doherty and Welter* [2010] pointed out that a universal procedure to deal with modeling errors associated with imperfect models does not exist.
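
As an illustration of what an autoregressive error model does, the sketch below fits a first-order autoregressive, AR(1), coefficient to a residual series and removes the correlated part, leaving approximately uncorrelated innovations. This is a generic textbook construction, not the specific formulation of any of the studies cited above. All function names are hypothetical, and the residuals are synthetic.

```python
import random


def ar1_coefficient(residuals):
    """Least-squares estimate of phi in r[t] = phi * r[t-1] + e[t]."""
    num = sum(a * b for a, b in zip(residuals[:-1], residuals[1:]))
    den = sum(a * a for a in residuals[:-1])
    return num / den


def ar1_innovations(residuals, phi):
    """Remove the AR(1) part of the residuals: e[t] = r[t] - phi * r[t-1]."""
    return [b - phi * a for a, b in zip(residuals[:-1], residuals[1:])]


def simulate_ar1(phi, sigma, n, seed=0):
    """Generate synthetic correlated residuals for demonstration purposes."""
    rng = random.Random(seed)
    series = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, sigma))
    return series


if __name__ == "__main__":
    residuals = simulate_ar1(phi=0.8, sigma=0.1, n=5000)
    phi_hat = ar1_coefficient(residuals)
    innovations = ar1_innovations(residuals, phi_hat)
    print("estimated phi:", phi_hat)
    print("lag-1 coefficient of innovations:", ar1_coefficient(innovations))
```

If the innovations are close to white noise, standard Gaussian likelihoods can be applied to them instead of to the raw residuals. Whether a first-order model is adequate depends on the problem at hand.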

[9] In this paper we aim to investigate how parameters can be estimated for a model that is known to be imperfect because it does not fully resolve soil structure. The motivation to address this question is that measurements are regularly made on a much smaller scale than the modeling scale. For example, effective hydraulic parameters for large-scale models describing water flow in the unsaturated zone are commonly estimated from water content observations made with TDR probes on a centimeter scale. We follow the general idea of introducing an error model into the parameter estimation process, as described by *Carter* [2004]. We test the approach by using spatially averaged saturation measurements taken during multistep outflow experiments in lab-scale heterogeneous sand samples by *Vasin et al.* [2008]. The available data were obtained using neutron radiography, neutron tomography, and outflow measurements. We would like to stress that these data are used for illustrative purposes and that the presented ideas are by no means limited to this type of experiment. Our results are not meant to improve multistep outflow experiments; instead, they are intended for use with a wide range of hydrological flow problems, where typical applications would be very different from this small scale.

[10] The remainder of the paper is structured as follows. First, the setup of the experiment of *Vasin et al.* [2008] and a virtual reality experiment used for initial tests are explained in section 2. Section 3 presents a selection of parameter estimation results using MCMC simulation to illustrate the problem of estimating effective parameters for heterogeneous soils and the need for further investigation. In section 4, three external error models are introduced and tested as a way to improve model performance by accounting for soil structure outside of the flow model. Finally, conclusions are drawn in section 5.