## 1. Introduction

[2] It is the nature of conceptual rainfall-runoff models that at least some of the model parameters must be calibrated against observations. Values of these parameters cannot be measured in the field because they aggregate properties of the whole catchment, which is described as a lumped system by the model. For several decades, automatic calibration schemes for estimating these parameters from historical data, and the related problems and challenges, have received much attention [e.g., *Sorooshian and Dracup*, 1980; *Duan et al.*, 1992]. More recently, the focus has turned to the use of multiple sources and measures of information in model calibration [e.g., *Gupta et al.*, 1998; *Seibert and McDonnell*, 2002; *Gupta et al.*, 2008; *Efstratiadis and Koutsoyiannis*, 2010] and to a better understanding of the uncertainties affecting the calibration and model simulations in a predictive mode [e.g., *Beven and Binley*, 1992; *Vrugt et al.*, 2009a]. Some recent contributions attempt to account in the calibration procedure for all the factors that may contribute to uncertainty in the parameter estimates [*Kavetski et al.*, 2002; *Ajami et al.*, 2007; *Renard et al.*, 2010; *Renard et al.*, 2011].

[3] In rainfall-runoff modeling there are four major sources of uncertainty that can be linked to the total uncertainty of model parameters and the ensuing predictions/simulations. These are well recognized as the uncertainties in inputs, model structure, parameters, and output calibration data. Several approaches have been devised to approximate the total uncertainty of rainfall-runoff simulations. Very popular among these approaches is the Generalized Likelihood Uncertainty Estimation (GLUE) procedure [*Beven and Binley*, 1992; *Beven and Freer*, 2001; *Beven*, 2006]. To improve model performance, however, it would be essential to know how much each source contributes to the total uncertainty. Such information can guide research efforts, and measurement activity can then be allocated to the most influential factors. The contributions of the different sources of uncertainty cannot be distinguished by using GLUE because total uncertainty is aggregated into parameter uncertainty within the equifinality concept. Furthermore, the subjectively chosen thresholds between behavioral and nonbehavioral models and the common use of informal likelihood functions in GLUE have led to an active debate about whether it gives trustworthy estimates of uncertainties in rainfall-runoff modeling [e.g., *Mantovan and Todini*, 2006; *Beven et al.*, 2007, 2008; *Stedinger et al.*, 2008].

[4] In boreal regions snow processes are an important part of the hydrological cycle. To our knowledge, snow modeling research that focuses on uncertainty, and in particular on separating the different sources of uncertainty, is scarce. *Rutter et al.* [2009] compared 33 snowpack models in different hydrometeorological and forest canopy conditions. The comparison indicates that model structure uncertainty plays an important part in the total uncertainty of snow modeling. Furthermore, their research reveals that it is more difficult and uncertain to model snow processes at forested sites than at open sites. *Franz et al.* [2010] used Bayesian model averaging (BMA) to combine the results of 12 snow models to assess uncertainty in hydrologic prediction. They found that different snow models perform best at different locations and time periods, and they suggest that consideration of multiple models would provide useful information for probabilistic hydrologic prediction. *He et al.* [2011] studied the parameter uncertainty of the SNOW17 model using Generalized Sensitivity Analysis and Differential Evolution Adaptive Metropolis (DREAM) at 12 contrasting study sites in the U.S. They showed that the parameter uncertainty of the model depends on the study site and that the uncertainty ranges of some model parameters are correlated with the forcing data (precipitation and temperature). Uncertainty in input variables, model structure, and snow observations was not explicitly addressed in their research. Although other studies that apply Bayesian inference specifically to snow modeling appear to be lacking, some published uncertainty studies include a snow process scheme in the hydrological model. *Kavetski et al.* [2006b] applied the Bayesian Total Error Analysis (BATEA) framework in two case studies in the U.S. (French Broad River and South Branch Potomac River) using rainfall and runoff measurements (without snow observations) as the calibration data. Snow process parameters identified with the standard least squares calibration were affected when the input uncertainty model was included and the hydrological model was recalibrated. *Clark and Vrugt* [2006] also pointed out that input uncertainty may seriously bias the parameter estimates in snow modeling. These studies highlight the need for addressing input uncertainty in the calibration of a snow model.

[5] It is well established that the estimates of the model parameters and the confidence limits of streamflow simulation are affected when data uncertainty is taken into account in the calibration of conceptual rainfall-runoff models [e.g., *Huard and Mailhot*, 2008; *Croke*, 2009; *Thyer et al.*, 2009]. *Kavetski et al.* [2002, 2006a] pointed out three possible outcomes when errors in the input variables are neglected in the calibration of hydrological models:

[6] 1. Parameter bias caused by possible input errors is likely to be different from catchment to catchment and thus regionalization of parameters to ungauged catchments may be significantly confounded.

[7] 2. Biased parameters may yield biased predictions and, because input uncertainty affects the structure of parameter uncertainty, the confidence limits of the parameters are likely to be erroneous if input uncertainty is neglected in calibration.

[8] 3. Ignorance about different types of errors in hydrological models prevents proper analysis of model error and model adequacy that may help to improve the model.

[9] *Kavetski et al.* [2002, 2006a, 2006b] and *Kuczera et al.* [2006] proposed the BATEA framework to separate and take into account all the uncertainty sources in model calibration. In BATEA, input (precipitation) uncertainty is included through additional latent variables that act as multipliers for individual storm events and are calibrated simultaneously with the model parameters. *Vrugt et al.* [2008] extended the same input uncertainty model by introducing vaguer (uniform) priors for the multipliers and by using DREAM for parameter sampling. One of the problems in the application of latent variables is the number of additional parameters, which may lead to computational problems in calibration when the number of precipitation events (and thus latent variables) increases considerably. *Ajami et al.* [2007] tried to decrease the number of parameters in the input uncertainty model of BATEA by using multipliers sampled from the same distribution for each rain observation in their Integrated Bayesian Uncertainty Estimator (IBUNE) framework (see the discussion by *Renard et al.* [2009] and *Ajami et al.* [2009]). *Reichert and Mieleitner* [2009] presented a method to account for input uncertainty that can be seen as a generalization of the storm-dependent latent variables. Their approach is based on stochastic time-dependent variables.
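The storm-multiplier idea described above can be illustrated with a minimal sketch. It is not the BATEA implementation: the function name, the storm labeling, and all numerical values are hypothetical and chosen only for illustration. In an actual inference the multipliers would be sampled jointly with the hydrological model parameters rather than fixed by hand.

```python
def apply_storm_multipliers(precip, storm_ids, multipliers):
    """Scale each observed precipitation value by the multiplier of its storm.

    precip      : observed precipitation per time step (e.g., mm/day)
    storm_ids   : storm index per time step, or None for dry time steps
    multipliers : one latent multiplier per storm event
    """
    corrected = []
    for p, sid in zip(precip, storm_ids):
        if sid is None:
            corrected.append(p)  # dry step: no multiplier is attached
        else:
            corrected.append(p * multipliers[sid])
    return corrected


# Hypothetical six-step record containing two storm events.
observed = [0.0, 4.0, 6.0, 0.0, 10.0, 0.0]
storm_ids = [None, 0, 0, None, 1, None]
multipliers = [1.5, 0.5]  # assumed values; normally inferred, not prescribed

corrected = apply_storm_multipliers(observed, storm_ids, multipliers)
print(corrected)  # [0.0, 6.0, 9.0, 0.0, 5.0, 0.0]
```

Each storm adds one latent variable, so the dimension of the inference problem grows with the number of events in the record. A zero observation also stays zero under any multiplier.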

[10] None of the multiplicative precipitation error models is able to “correct” rainfall observations if rainfall occurred but was not recorded by the gauges or, more generally, if events were very poorly sampled. Using a case study in New Zealand, *McMillan et al.* [2011] concluded that a multiplicative error model is consistent with observations and can be used in hydrological modeling where an appropriate minimum rainfall intensity threshold is respected. They also showed that the rainfall error structure depends on the data time step and pointed out that the common assumption of independence between consecutive rainfall multipliers may be flawed for models operating with short (subdaily) time steps. In addition, separation of storm events in the precipitation series is a practical problem when latent variables are used to account for rainfall uncertainty. *Renard et al.* [2010] presented a recent example of how to overcome this issue by assigning an individual multiplier (latent variable) to each rainy day. The use of daily multipliers, however, still does not solve the problems connected to nondetection of rainfall or to separating the influence of model structure from that of missing information (e.g., precipitation intensity during a time step used in the model). Furthermore, latent variables will most likely be affected by model structure error unless it is specified separately (parameter values may also be biased with respect to their physical interpretation by model structure error), and this leads to the question of how to represent model structure error.

[11] There are several methods for approximating structural uncertainties in rainfall-runoff modeling:

[12] 1. It is possible to use multiple models and approximate model uncertainty based on the range of simulation results [e.g., *Refsgaard et al.*, 2006]. One such technique is BMA, which has been used in different disciplines to combine the results of different models. *Franz et al.* [2010], for example, used BMA in studying snow model uncertainty.

[13] 2. It is possible to account for the bias in the model with a method based on stochastic, time-dependent parameters [*Kuczera et al.*, 2006; *Reichert and Mieleitner*, 2009].

[14] 3. Model structure uncertainty can also be taken into account by fitting an ARMA model to the model residuals [e.g., *Schaefli et al.*, 2007; *Vrugt et al.*, 2009a; *Laloy et al.*, 2010].

[15] The generalized likelihood function of *Schoups and Vrugt* [2010] adopted in this study applies an autoregressive model to account for model structural uncertainty. Autocorrelation in the residuals of model output variables is a typical indication of model structural problems. Autocorrelation is caused, e.g., by the storages in conceptual models, which may unrealistically smooth the estimated runoff series. However, systematic errors in precipitation and streamflow observations can lead to similar problems and to autocorrelated errors in the model simulation results. We assume in this study that the ARMA model accounts solely for model structural errors, because the other sources of error are handled separately. This study is motivated by the need to assess how valid this assumption is in a snow-affected catchment and whether different sources of error can be separated in the inference.
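How an autoregressive term removes serial dependence from residuals can be sketched in a few lines. The example below is not the generalized likelihood function itself: it assumes a first-order model, Gaussian innovations with constant variance, and a likelihood conditioned on the first residual, all purely for illustration. The function names and the residual values are hypothetical.

```python
import math


def ar1_innovations(residuals, phi):
    """Return a_t = e_t - phi * e_{t-1} for t = 2..n (conditions on e_1)."""
    return [cur - phi * prev for prev, cur in zip(residuals, residuals[1:])]


def gaussian_log_likelihood(innovations, sigma):
    """Log-likelihood of independent N(0, sigma^2) innovations."""
    n = len(innovations)
    sse = sum(a * a for a in innovations)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sse / (2.0 * sigma ** 2)


# Hypothetical residual series that decays exactly as an AR(1) process with phi = 0.5.
residuals = [1.0, 0.5, 0.25, 0.125]

innov_ar = ar1_innovations(residuals, phi=0.5)   # serial dependence removed
innov_iid = ar1_innovations(residuals, phi=0.0)  # residuals treated as independent

ll_ar = gaussian_log_likelihood(innov_ar, sigma=1.0)
ll_iid = gaussian_log_likelihood(innov_iid, sigma=1.0)
print(innov_ar)         # [0.0, 0.0, 0.0]
print(ll_ar > ll_iid)   # True
```

In this toy case the autoregressive coefficient absorbs all of the residual structure. The coefficient alone cannot indicate whether that structure originated in the model or in the data.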

[16] This study adopts the input uncertainty model presented by *Kavetski et al.* [2002] in a snow-fed basin in conjunction with Bayesian inference. It extends the studies of *Kavetski et al.* [2006a, 2006b], *Vrugt et al.* [2008], and *Schoups and Vrugt* [2010] by utilizing snow observations in addition to streamflow records in calibration. To our knowledge, this is the first time that the generalized likelihood function of *Schoups and Vrugt* [2010], which uses an autoregressive part to account for model structural uncertainties, has been applied with an input uncertainty model. *Renard et al.* [2010] concluded that, given rainfall-runoff time series alone, input and structural errors cannot be separated unless one has good prior knowledge about the magnitude of the errors. Based on a case study, *Renard et al.* [2011] concluded that inclusion of independent estimates of rainfall and runoff uncertainties could overcome the ill-posedness of the inference. A dense rain gauge network enabled geostatistical conditional simulation for independent estimation of rainfall uncertainties prior to Bayesian inference. Compared to *Renard et al.* [2010, 2011], this study adopts a different strategy for describing model structure uncertainty and extends their work by introducing snow water equivalent (SWE) observations as another state variable in the inference. The hypothesis is that the use of informative priors for precipitation multipliers, e.g., through geostatistical analysis of rain gauges, is not necessary when rainfall-runoff data are complemented with additional hydrological information such as the SWE data used in this case study.

[17] The conceptual rainfall-runoff model IHACRES combined with a degree-day snow model is used to simulate SWE and streamflow in a small boreal catchment in southern Finland. The aim is to determine unbiased parameter estimates for the combined degree-day snow model and IHACRES by taking into account both input uncertainty and model structure uncertainty in the model calibration. At the same time our goal is to reveal precipitation uncertainty from the streamflow and SWE data. Thus we are doing “hydrology backward” [*Kirchner*, 2009; *Vrugt et al.*, 2008]. Bayesian multiresponse calibration is adopted to utilize both streamflow and SWE observations. Precipitation uncertainty is separately handled in the calibration of the model, whereas input air temperature, streamflow and SWE measurements are considered to be free of error.
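For readers unfamiliar with degree-day snow models, the following minimal sketch shows a generic textbook form and is not necessarily the formulation used in this study. The parameter names (`ddf`, `t_snow`, `t_melt`), their default values, and the forcing data are assumptions made only for illustration. Processes such as refreezing and liquid water retention are ignored.

```python
def degree_day_swe(precip, temp, ddf=3.0, t_snow=0.0, t_melt=0.0):
    """Simulate snow water equivalent (SWE) with a simple degree-day scheme.

    precip : precipitation per day (mm)
    temp   : mean air temperature per day (deg C)
    ddf    : degree-day factor (mm per deg C per day), assumed value
    t_snow : temperature at or below which precipitation falls as snow
    t_melt : temperature above which the snowpack melts
    Returns (SWE at the end of each day, rain plus melt released each day).
    """
    swe = 0.0
    swe_series, water_out = [], []
    for p, t in zip(precip, temp):
        if t <= t_snow:
            swe += p      # precipitation accumulates as snow
            rain = 0.0
        else:
            rain = p      # precipitation passes through as rain
        # Potential melt is proportional to degrees above t_melt,
        # limited by the snow that is actually available.
        melt = min(swe, ddf * max(t - t_melt, 0.0))
        swe -= melt
        swe_series.append(swe)
        water_out.append(rain + melt)
    return swe_series, water_out


# Hypothetical five-day forcing: two cold days with snowfall, then a warm spell.
precip = [10.0, 5.0, 0.0, 0.0, 2.0]
temp = [-5.0, -2.0, 1.0, 3.0, 4.0]

swe_series, water_out = degree_day_swe(precip, temp)
print(swe_series)  # [10.0, 15.0, 12.0, 3.0, 0.0]
print(water_out)   # [0.0, 0.0, 3.0, 9.0, 5.0]
```

In this sketch a multiplicative error in the snowfall on the cold days changes the simulated SWE directly, before any runoff is generated.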