## 1. Introduction

[2] Rainfall-runoff modeling is one of the central and classic problems in hydrology. By definition, every model is a simplification of a more complex system. Describing natural processes with mathematical equations whose parameters are derived from observations and experience inevitably introduces uncertainty. The main sources of uncertainty fall into the following five areas: inputs, state specification, process definition, model structure, and output.

### 1.1. Input Uncertainty

[3] The meteorological input is based on point observations, which are themselves uncertain and are sometimes combined with indirect measurements such as radar or satellite information. Because the exact precipitation, temperature, and other input variables are not known at every point of the catchment, uncertainty due to measurement errors, interpolation errors, and spatial variability must be taken into account.

### 1.2. State Uncertainty

[4] The actual state of the catchment (e.g., moisture conditions, snow cover) is usually not observed directly but calculated using model equations. Because both the input and the model abstraction are simplifications, the state itself becomes uncertain. Additionally, continuous simulations inherit state uncertainty from preceding time steps.

### 1.3. Process Abstraction-Related Uncertainty

[5] The main hydrological processes are described by equations that can capture only part of the complex natural processes. The parameters of these equations correspond only partly to sets of discrete measurements or must be estimated via calibration. This automatically leads to uncertainty in the corresponding model output.

### 1.4. Model Structure Uncertainty

[6] The model structure itself leads to uncertainty due to the inherent simplification of the more complex real system. The discretization of the landscape into polygons or rasters produces additional errors as the real processes occur on much smaller scales.

### 1.5. Output Uncertainty

[7] Discharge, groundwater level, soil moisture, conductivity, and other observations are themselves based on rating curves, point measurements, or remote sensing and can be corrupted by measurement errors and neglected spatial variability.

[8] The quantification of these uncertainties is important both for practical decision making and for theoretical modeling. Unfortunately, this is neither a straightforward nor a simple task. *Kavetski et al.* [2003], *Gupta et al.* [2005], *Beven* [2006], *Schaefli et al.* [2007], and many others state that despite the considerable attention given to uncertainty estimation in recent years, no satisfactory approach that separates all sources of error and quantifies the total uncertainty has been proposed to date. *Singh and Woolhiser* [2002] describe this as one of the major limitations of current watershed models. Therefore, the purpose of this study was to develop a methodology for the quantification of total model uncertainty, considering the above list of relevant error sources both in turn and in combination.

[9] Even physically based hydrological models require parameter calibration because subgrid processes can only be parameterized in a lumped way. Effective parameters are required at the model grid scale, which can be quite different from the scale of field or laboratory measurements [*Beven*, 1989]. This calibration is more difficult than might be expected because of problems associated with the choice of objective function, parameter interaction, input uncertainty, and the implicitly assumed error model. *Kavetski et al.* [2003] give a comprehensive overview of these problems and show that objective functions based on least squares, or derivatives thereof, yield biased parameter estimates if the input and output data are corrupted.
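The bias that corrupted input induces in least squares calibration can be illustrated with a deliberately simple errors-in-variables sketch. The linear model, parameter values, and noise levels below are all invented for illustration; they are not taken from the studies cited above.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy linear "model" y = a * x with a known true parameter.
a_true = 2.0
x = rng.normal(0.0, 1.0, size=10000)   # true (unobserved) input
y = a_true * x                         # noise-free response

# The calibrator only sees a corrupted input series.
x_obs = x + rng.normal(0.0, 1.0, size=x.size)

# The least squares slope fitted against the corrupted input is attenuated:
# E[a_hat] = a_true * var(x) / (var(x) + var(noise)), i.e., about 1.0 here.
a_hat = np.sum(x_obs * y) / np.sum(x_obs * x_obs)
print(f"a_hat = {a_hat:.2f}, a_true = {a_true}")
```

Even with abundant data and a perfect model structure, the estimate converges to roughly half the true value; more data does not remove the bias, only an error model that acknowledges the input uncertainty does.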

[10] It has long been understood that the choice of a single objective function leads to biased calibration, as each performance criterion is sensitive only to certain characteristics of the hydrograph [*Krause et al.*, 2005]. Multiobjective calibration has been proposed to counteract this effect [*Yapo et al.*, 1998; *Gupta et al.*, 2003], and the additional information may well reduce the uncertainty of model predictions. However, increasing the dimensionality of the optimization can also increase uncertainty, and the approach still suffers from the main shortcomings of standard single-objective calibration. The core problem is that most calibration methodologies assume and require that the model errors are Gaussian and that their variance is constant in space and time (homoskedastic), which is rarely verified.
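The sensitivity of each criterion to different parts of the hydrograph can be shown with a tiny constructed example. The six-step "hydrograph" and the two candidate simulations below are invented; the Nash-Sutcliffe efficiency (NSE) on raw and on log-transformed flows serve as two representative criteria.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations."""
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Tiny hypothetical hydrograph: four low-flow and two peak time steps.
obs   = np.array([1.0, 1.0, 1.0, 1.0, 100.0, 100.0])
sim_a = np.array([2.0, 2.0, 2.0, 2.0, 100.0, 100.0])  # good peaks, poor low flows
sim_b = np.array([1.0, 1.0, 1.0, 1.0, 110.0, 110.0])  # good low flows, poor peaks

# Raw NSE is dominated by squared peak errors, so it prefers sim_a;
# NSE on log flows weights low flows heavily, so it prefers sim_b.
print(nse(sim_a, obs), nse(sim_b, obs))
print(nse(np.log(sim_a), np.log(obs)), nse(np.log(sim_b), np.log(obs)))
```

The two criteria rank the candidate simulations in opposite order, which is exactly why a single objective function biases calibration toward the hydrograph features that criterion happens to emphasize.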

[11] Markov chain Monte Carlo methods are among the most popular approaches to uncertainty estimation. The Shuffled Complex Evolution Metropolis algorithm (SCEM-UA) of *Vrugt et al.* [2003] and the generalized likelihood uncertainty estimation (GLUE) of *Beven and Binley* [1992] have been used in numerous studies. The latter has also been criticized for its adoption of “less formal likelihoods”, the subjective choice of “behavioral” parameter sets, and the lumping of all sources of uncertainty into a single parameter uncertainty, which leads to very wide confidence bounds [*Mantovan and Todini*, 2006; *Kavetski et al.*, 2003]. Perhaps the major concern with both methods is the lack of an explicit error model structure acknowledging the properties of input and parameter uncertainties.

[12] *Montanari and Brath* [2004] propose to use the normal quantile transform in order to make the input and output time series Gaussian and to derive a linear regression relationship between the model residuals and the simulated river flow. The major drawback of this method is the assumption that the model performance and errors are homoskedastic. *Wagener et al.* [2003] tackle this commonly ignored problem with a dynamic identifiability analysis (DYNIA). It allows for the evaluation of simulated and observed time series with respect to information content for specific model parameters. This analysis can be used to indicate areas of structural failure and potential improvement to the model.
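The normal quantile transform used by *Montanari and Brath* [2004] is a rank-based mapping of a sample onto standard-normal scores. The following is a minimal sketch of that transform, not the authors' implementation; the lognormal test series and the Weibull plotting position r/(n+1) are choices made here for illustration.

```python
import numpy as np
from statistics import NormalDist

def normal_quantile_transform(x):
    """Map a sample to standard-normal scores via its empirical ranks.
    The plotting position r/(n+1) keeps probabilities strictly in (0, 1),
    avoiding infinite quantiles at the extremes."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1       # ranks 1..n (distinct values)
    p = ranks / (n + 1)                         # empirical probabilities
    std_normal = NormalDist()
    return np.array([std_normal.inv_cdf(pi) for pi in p])

# Hypothetical strongly skewed discharge series.
rng = np.random.default_rng(0)
q = rng.lognormal(mean=2.0, sigma=1.0, size=1000)
z = normal_quantile_transform(q)
print(round(z.mean(), 3), round(z.std(), 3))    # close to 0 and 1
```

The transformed series is Gaussian by construction regardless of the original distribution, which is what allows a regression-based error model to be fitted in the transformed space; the drawback noted above, the homoskedasticity assumption, is untouched by the transform itself.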

[13] *Kavetski et al.* [2003] introduce a strict inference scheme called BATEA (Bayesian Total Error Analysis) that analyzes the posterior distributions of the model parameters, conditioned on model error, input error, and output error, using Markov chain Monte Carlo. Although it considers input and output uncertainty explicitly, the method still requires error models of low dimensionality for numerical reasons. Unfortunately, most environmental observation time series show significant complexity, prohibiting the use of simple multiplicative error models.

[14] One of the first and most robust approaches to deal with heteroskedasticity was presented by *Sorooshian and Dracup* [1980]. They employed a power transformation and maximum likelihood theory to estimate the weights of a weighted least squares approach in a two-step procedure, optimizing the parameters of a simple two-parameter model. *Schaefli et al.* [2007] use a mixture of two normal distributions to mimic the heteroskedasticity of the total modeling errors of a conceptual rainfall-runoff model applied to a highly glacierized alpine catchment. The two normal distributions represent the error populations of the two very distinct high- and low-flow regimes. Unfortunately, the approach still assumes normal, homoskedastic, lag-one autocorrelated errors within each flow regime and lumps all error sources into the parameter uncertainty. As these assumptions were not fully supported by the data, the problem was broken down into two similarly ill-posed cases rather than actually being solved.
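The variance-stabilizing effect of a power transformation can be demonstrated numerically. This is only a sketch of the underlying idea, not the two-step maximum likelihood procedure of *Sorooshian and Dracup* [1980]: the flows and the multiplicative error are synthetic, and the limiting case of the power transformation (lambda approaching zero, i.e., the log transform) is used.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical flows with a multiplicative, hence heteroskedastic, error.
q_sim = rng.gamma(2.0, 10.0, size=4000)
q_obs = q_sim * np.exp(rng.normal(0.0, 0.15, size=q_sim.size))

# Raw residuals grow with flow; log residuals have constant variance.
res_raw = q_obs - q_sim
res_log = np.log(q_obs) - np.log(q_sim)

def tercile_var_ratio(res, q):
    """Residual variance in the highest flow tercile over the lowest."""
    lo = res[q < np.quantile(q, 1 / 3)]
    hi = res[q > np.quantile(q, 2 / 3)]
    return hi.var() / lo.var()

print(f"raw: {tercile_var_ratio(res_raw, q_sim):.1f}")
print(f"log: {tercile_var_ratio(res_log, q_sim):.2f}")
```

After the transform the high-flow and low-flow residual variances are comparable, so the constant-variance assumption behind least squares becomes tenable in the transformed space.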

[15] *Gallagher and Doherty* [2007] demonstrate the estimation of model predictive uncertainty for a water resource management model consisting of a soil water balance and a groundwater model. The chief disadvantage of the method, the assumption that the model is linear, prevents the exact determination of highly nonlinear model errors. Nevertheless, useful approximations of the individual contributions to the overall predictive uncertainty can be obtained, provided that plausible estimates of the individual uncertainty sources, such as input data or model parameters, are available. An issue neglected in most approaches is that conditions change over time, e.g., because of improved observation networks, climate change scenarios, or meteorological forecasts. This time-varying uncertainty due to changing input should be acknowledged in uncertainty estimation methods.

[16] *Gupta et al.* [2005] identify the typical assumptions of normality, constant variance, and a simple correlation structure in the underlying error model as the major drawbacks of current uncertainty estimation schemes. The methodology presented here therefore explicitly addresses these properties: it produces error series that are normally distributed and that reproduce the time-variant contributions of different processes to the total uncertainty. It is based on a scaled decomposition of plausible error contributions from the different uncertainty sources, representing the time-variant importance of the corresponding processes. The hydrological model and the corresponding error model are calibrated simultaneously. The uncertainty time series is used as a weighting factor to normalize the model residuals during calibration so that the assumptions of least squares optimization are fulfilled. The methodology is demonstrated with an application of the distributed Hydrological Bureau Waterbalance (HBV) model to three watersheds in the Neckar basin.
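The core mechanism, scaling residuals by a composite uncertainty time series, can be sketched schematically. All error magnitudes below (flow-proportional input and process terms, a constant output-error floor) are invented placeholders, not the calibrated values of this study; the point is only that dividing residuals by the modeled total standard deviation yields approximately i.i.d. standard-normal errors.

```python
import numpy as np

rng = np.random.default_rng(3)

q_sim = rng.gamma(2.0, 10.0, size=3000)   # hypothetical simulated flows

# Assumed decomposition: input and process errors scale with flow,
# output (rating curve) error has a constant floor. Variances add.
sigma_input   = 0.08 * q_sim
sigma_process = 0.05 * q_sim
sigma_output  = 1.0
sigma_total = np.sqrt(sigma_input**2 + sigma_process**2 + sigma_output**2)

# Synthetic residuals consistent with that time-variant uncertainty.
residuals = rng.normal(0.0, sigma_total)

# Normalizing by the uncertainty series gives roughly N(0, 1) residuals,
# so the assumptions of least squares calibration are fulfilled.
z = residuals / sigma_total
print(round(z.mean(), 2), round(z.std(), 2))
```

In the actual methodology the weights enter the objective function during the simultaneous calibration of the hydrological and error models; this sketch only verifies that the scaling itself removes the heteroskedasticity.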