Dynamic modeling of predictive uncertainty by regression on absolute errors

Abstract

[1] Uncertainty of hydrological forecasts represents valuable information for water managers and hydrologists. This explains the popularity of probabilistic models, which provide the entire distribution of the hydrological forecast. Nevertheless, many existing hydrological models are deterministic and provide point estimates of the variable of interest. Often, the model residual error is assumed to be homoscedastic; however, practical evidence shows that the hypothesis usually does not hold. In this paper we propose a simple and effective method to quantify predictive uncertainty of deterministic hydrological models affected by heteroscedastic residual errors. It considers the error variance as a hydrological process separate from that of the hydrological forecast and therefore predictable by an independent model. The variance model is built up using time series of model residuals, and under some conditions on the same residuals, it is applicable to any deterministic model. Tools for regression analysis applied to the time series of residual errors, or better their absolute values, combined with physical considerations of the hydrological features of the system can help to identify the most suitable input to the variance model and the most parsimonious model structure, including dynamic structure if needed. The approach has been called dynamic uncertainty modeling by regression on absolute errors and is demonstrated by application to two test cases, both affected by heteroscedasticity but with very different dynamics of uncertainty. Modeling results and comparison with other approaches, i.e., a constant, a cyclostationary, and a static model of the variance, confirm the validity of the proposed method.

1. Introduction

[2] Deterministic models have been widely used in hydrology for both forecasting and simulation purposes. The assessment of their uncertainty is a major research issue. For managers and decision makers, quantification of the uncertainty associated with model estimates is valuable information for both alarm and operational purposes. From the modeler's point of view, it provides indications for model diagnosis and improvement [Gupta et al., 2008; Reichert and Mieleitner, 2009] and for targeted data collection. Theoretically, since the value forecast by a deterministic model will never be exact, associating it with some kind of characterization of its error is the only way to assess its quality [Weijs et al., 2010]. The lower the error, the better the model: this is what we implicitly do in practice every time we use a deterministic model, since we regard the model prediction as the expected, or most probable, value of the estimated variable. Quantification of model uncertainty makes this assumption explicit and provides the model user with a more formal and accurate evaluation of the residual error.

[3] Sources of uncertainty in deterministic models are often classified into measurement errors, both in the input and output; uncertainty in the parameters; and uncertainty in the model structure, including selection of the input and the mathematical relation between input, state, and output variables. Many approaches to uncertainty assessment rely on such a decomposition. One or more sources of uncertainty are given a statistical description, and uncertainty is propagated through the model via random sampling and simulation to obtain a sample or distribution of model predictions in place of a single value [see, e.g., Thyer et al., 2009; Kuczera and Parent, 1998; Kavetski et al., 2006]. The application of these methods may require collecting several pieces of information for the statistical characterization of the different uncertainty sources (e.g., the accuracy of the measurement devices, the analysis of the error induced by data preprocessing) and can be limited by the computational cost of model simulation.

[4] On the other hand, “model residual” approaches skip any distinction of uncertainty sources and directly analyze the time series of model residuals to build a model of the global predictive uncertainty, which is often sufficient for practical purposes. The drawback of model residual approaches is that, while they do not require assumptions about the different sources of uncertainty, they usually do require assumptions for the characterization of the model residual. Historically, the most common approach is to assume that the residual is an independent, identically distributed process, usually zero mean and Gaussian. The approach has been widely criticized because most of these assumptions are violated in hydrological applications, especially independence (the residuals are typically autocorrelated) and homoscedasticity [Sorooshian and Dracup, 1980]. Many methods have been proposed either to manipulate model residuals so that they satisfy such assumptions (e.g., using Box-Cox transformations [Box and Tiao, 1973; see Kuczera and Parent, 1998; Bates and Campbell, 2001] or the normal quantile transform [e.g., Montanari and Brath, 2004]) or to relax some of these assumptions [e.g., Romanowicz et al., 2006; Schaefli et al., 2007; Schoups and Vrugt, 2010].

[5] In this paper we present a novel “model residual” approach for estimating the predictive uncertainty of deterministic hydrological models. In our approach, we assume that the residual error of the model is uncorrelated in time or that it can be described by an autoregressive model with uncorrelated residual. The assumption is very useful because the statistical description of the residual process reduces to providing a sequence of marginal probability distribution functions (pdfs). We use the same distribution type at all time steps while allowing the residual variance to change in time and reproduce the heteroscedastic behavior that is often observed in hydrological time series. For several practical reasons that will be clarified throughout the paper, we will use Gaussian distributions, although the approach can be extended to other distributions if the Gaussian proves unsatisfactory, provided that they are symmetric. Under these hypotheses, the identification of the residual error pdf is reduced to the estimation of the error variance (or standard deviation). The latter can be a function of time (the season) or other hydrometeorological inputs, depending on the case study under examination. The novelty of our approach is that we do not assume a priori the input variables of the variance model nor the type of relation between these inputs and the variance, but rather infer such information from data analysis and consideration of the features of the hydrological system under examination. This is possible if one regards the variance model identification as a regression analysis over the time series of the model residual errors. Further, we will show that under the Gaussian assumption it is possible (and numerically more efficient) to identify a model of the error standard deviation from the time series of absolute errors, rather than a model of the variance from the time series of squared errors. Another contribution of our paper is that we show the effectiveness of introducing past absolute errors among the inputs of the standard deviation model, which means that the error standard deviation is modeled as a dynamic process. For all these reasons, we named the proposed approach dynamic uncertainty model by regression on absolute error (DUMBRAE).

[6] The paper is organized as follows. In section 2, the DUMBRAE approach is fully described from the methodological standpoint. Then the issue of how to evaluate the quality of an uncertainty model is discussed, from visual inspection of the inferred confidence bounds to formal methods. Relying on these results, we can demonstrate the effectiveness of the proposed approach through the application to two case studies.

2. Methodology

[7] We consider a deterministic model

$$\hat{y}_t = f(x_t, \theta) \qquad (1)$$

that provides the flow forecast $\hat{y}_t$ as a function of several inputs collected in the vector $x_t$ and a parameter set $\theta$. Since the forecast is affected by multiple sources of error, including measurement error in the input $x_t$, error in the model parameters $\theta$ and structure $f(\cdot)$, the actual flow will be given by

$$y_t = \hat{y}_t + r_t \qquad (2)$$

where $r_t$ is the model residual error. Equation (2) implicitly assumes that the model output can be univocally split into two mutually exclusive components, deterministic and random. This dichotomy is questionable (see discussion by Koutsoyiannis [2010]), but it is very useful from the operational standpoint because it allows one to separate the identification of the hydrological model (1) from that of the uncertainty model, and to define one common strategy for uncertainty modeling that can be applied to any precalibrated hydrological model independently of its structure.

[8] The residual error $r_t$ can be described by its probability distribution function (pdf), and the predictive uncertainty of model (1) is derived from such pdf. For instance, confidence bounds can be obtained by adding the quantiles of the error pdf to the flow forecast $\hat{y}_t$. The most common approach is to introduce some hypothesis about the pdf shape and identify the pdf parameters from the residual time series. Alternatively, Solomatine and Shrestha [2009] propose a method to derive the error quantiles for a given degree of confidence (and ideally the entire error pdf) without making any a priori assumption.

[9] Traditionally, the residual error is assumed to be independent, identically distributed, zero mean, and Gaussian; that is, it is assumed that the error pdf is $N(0, \sigma^2)$ for all t. The approach is often unsatisfactory since in hydrological time series these assumptions are rarely all satisfied, and many works in the literature aim at relaxing some of them.

[10] Violation of the assumption of independence means that the deterministic model (1) produces systematic errors, for instance because it neglects some of the processes (snowmelt, evapotranspiration, etc.) occurring in the system. However, when model (1) is used in prediction mode, the problem can be easily overcome by modeling the residual $r_t$ as an autoregressive process of order q,

$$r_t = \sum_{i=1}^{q} \alpha_i r_{t-i} + e_t \qquad (3)$$

whose residual $e_t$ satisfies the independence assumption provided that a sufficiently large value of q is used. Identification of model (3) is straightforward, as ordinary least squares can be used to estimate the parameters $\alpha_1, \ldots, \alpha_q$. Once model (3) is available, the flow forecast is corrected as

$$\tilde{y}_t = \hat{y}_t + \sum_{i=1}^{q} \alpha_i r_{t-i} \qquad (4)$$

and its predictive uncertainty is given by projecting the pdf of $e_t$, which is independent by construction.
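The residual correction described above can be sketched for the first-order case (q = 1). This is a minimal illustration on synthetic data, not the authors' code: the function names (`fit_ar1`, `whiten`, `lag1_autocorr`) and the simulated residual series are assumptions introduced here. It estimates the autoregressive coefficient by ordinary least squares and checks that the remaining residual is approximately uncorrelated.

```python
import random

def fit_ar1(r):
    """OLS estimate of alpha in r[t] = alpha * r[t-1] + e[t] (no intercept)."""
    num = sum(r[t] * r[t - 1] for t in range(1, len(r)))
    den = sum(r[t - 1] ** 2 for t in range(1, len(r)))
    return num / den

def whiten(r, alpha):
    """Residual of the AR(1) model: e[t] = r[t] - alpha * r[t-1]."""
    return [r[t] - alpha * r[t - 1] for t in range(1, len(r))]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation coefficient."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den

# Synthetic, autocorrelated model residual (illustrative data only).
rng = random.Random(0)
r = [0.0]
for _ in range(2000):
    r.append(0.7 * r[-1] + rng.gauss(0.0, 1.0))

alpha_hat = fit_ar1(r)      # estimated autoregressive coefficient
e = whiten(r, alpha_hat)    # residual left after the AR(1) correction
print(round(alpha_hat, 2), round(lag1_autocorr(r), 2), round(lag1_autocorr(e), 2))
```

With a larger order q the same idea applies, with the single ratio replaced by a multivariate least squares solution.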

[11] Independence is a valuable property because it allows describing et by a marginal distribution inline image, independently of the error distribution at previous or following time step, i.e.,

display math

where inline image is the parameter vector for the distribution family inline image. However, this means that the distribution inline image must be identified and evaluated on the basis of one data only, since we have one observation per time step. Sometimes, the error histogram is used to infer the shape of the error pdf. However, this is not justified if we assume that the pdf be different at each time step, because each data point should be considered as an extraction from a different distribution. The QQ plot (or probability plot, see section 2.2) is a tool to assess the overall fit of a sample of data inline image against a sample of different distributions inline image; however, it can be used only after all the distributions have been identified. Therefore, in our analysis we will assume a given distribution family for each time t, identify all the distributions (i.e., estimate the parameters inline image for all t), and finally we will test our choice a posteriori by means of the probability plot. With this approach, the most flexible distribution possible should be assumed so that the shape of the pdf is not constrained a priori.

[12] In the absence of other information, the error is assumed to be zero mean and to follow the same distribution at all time steps, while its standard deviation $\sigma_t$ is allowed to vary in time, thus accounting for the heteroscedasticity observed in the error time series,

display math

In general, a sequence of random variables is said to be heteroscedastic if the random variables have different variances. Regression analysis in the presence of heteroscedasticity has been widely studied in econometrics [Engle, 1982]. In the hydrological context, variability of the standard deviation $\sigma_t$ is usually related to flow conditions: the higher the flow, the higher the variance. For instance, Schaefli et al. [2007] assume that the error distribution is Gaussian with zero mean and that the standard deviation takes one of two possible values corresponding to two different hydrological conditions, low or high flow. Schoups and Vrugt [2010] use a skew exponential power (SEP) density as the error pdf and assume that the error standard deviation $\sigma_t$ is linearly related to the flow forecast $\hat{y}_t$, on the basis of evidence that predictive uncertainty increases at higher flow.

[13] In our approach we will demonstrate that the variability of the standard deviation $\sigma_t$ can be effectively related also to other hydrological inputs, not necessarily flow, and we will provide a general method to infer which variables are most significant and how to estimate their relation to $\sigma_t$. By application of our method to the proposed case studies we will show that improving the model of the standard deviation can significantly enhance the uncertainty model, even under simple assumptions about the error pdf (e.g., Gaussian). In general, we will assume that $\sigma_t$ is given by some relation of the form

display math

where inline image is a vector of suitable input variables, not necessarily flow but possibly also other variables like, for instance, precipitation; and inline image is a parameter vector.

[14] The model identification thus encompasses modeling the variance, i.e., identifying relation (6), and estimating the (stationary) pdf inline image appearing in (5). The model identification criterion will be the maximization of the likelihood function L which, under the independence assumption and equations (5) and (6), takes up the form

display math

Notice that when a Gaussian distribution is used, the error pdf is fully described by its mean and standard deviation, and the pdf parameter vector disappears from equation (7). On the other hand, the problem can be enlarged to encompass also the calibration of the original hydrological model (1), or (4). In this case, the likelihood function becomes

display math

In the remainder of the paper we will focus on the case when the hydrological model has already been calibrated, and use a Gaussian distribution as the error pdf. Under this assumption, the likelihood function is (7), where the pdf parameter vector is dropped. Then the problem boils down to identifying the standard deviation model (6), that is, selecting the input variables, choosing the function class, and estimating the parameters. These topics will be discussed in section 2.1.

2.1. Identification of the Standard Deviation Model

[15] In our approach we propose not to fix a priori the relation between the error standard deviation and the hydrological inputs, but rather to infer it from data analysis and consideration of the specific features of the case study under examination. This is possible because the dynamics of $\sigma_t$ is revealed by the time series of the residual error. In fact, by definition the error variance $\sigma_t^2$ is given by $E[(e_t - E[e_t])^2]$ and thus, if the error is zero mean, $\sigma_t^2 = E[e_t^2]$. Identifying the model of the variance $\sigma_t^2$ can be viewed as a regression analysis problem where the time series to be modeled is the time series of squared errors.

[16] However, modeling the time series of squared errors can be difficult because squaring emphasizes high values and increases the distance between high and low errors. If the error pdf is Gaussian, it is possible to directly relate the standard deviation $\sigma_t$ to the absolute error $|e_t|$, which is a much smoother time series. In fact, under the assumption of a symmetric distribution, it can be demonstrated that the standard deviation is linearly proportional to the expected absolute error

$$\sigma_t \propto E[|e_t|] \qquad (9)$$

where the proportionality coefficient equals $\sqrt{\pi/2}$ in the case of a Gaussian distribution (see Appendix A for proof). Therefore we can directly identify model (6) by regression analysis over the time series of absolute errors. Selection of the model input can be based on analysis of the cross-correlation function between the absolute error and candidate input variables such as the flow forecast, past observations of flow, or meteorological variables (precipitation, temperature, etc.). Other data analysis tools that may prove useful for input selection include scatterplots, cluster analysis, mutual information, etc. For a review of available methods for input variable selection and their application in the hydrological context, see Bowden et al. [2005]. As will be shown in the application case studies, data analysis is also guided and supported by knowledge of the system characteristics and the model features.
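The Gaussian proportionality between the standard deviation and the mean absolute error can be checked numerically. The snippet below is only an illustrative sketch on simulated data (the sample size, seed, and variable names are assumptions made here): it draws zero-mean Gaussian errors with a known standard deviation and recovers it from the absolute values, scaled by the square root of pi/2, and from the squared values.

```python
import math
import random

rng = random.Random(1)
sigma_true = 2.5  # known standard deviation of the simulated error
e = [rng.gauss(0.0, sigma_true) for _ in range(20000)]

# Estimate from absolute errors: sigma = sqrt(pi / 2) * E[|e|] for a zero-mean Gaussian.
mean_abs = sum(abs(v) for v in e) / len(e)
sigma_from_abs = math.sqrt(math.pi / 2.0) * mean_abs

# Estimate from squared errors: sigma^2 = E[e^2] for a zero-mean error.
sigma_from_sq = math.sqrt(sum(v * v for v in e) / len(e))

print(round(sigma_from_abs, 3), round(sigma_from_sq, 3))
```

Both estimates should land close to the true value; the point of the comparison is that the absolute-error route needs no squaring of the series.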

[17] Once the model inputs have been chosen, model identification requires choosing the function class of relation (6) and estimating its parameters. In this work, we will follow a parsimonious approach and start with simple relations, e.g., linear, and move to more complex ones only if the modeling results are not satisfactory.

[18] As for parameter estimation, the problem is naturally formulated as maximizing the likelihood function (7). Notice that even if the error pdf is Gaussian and the standard deviation model (6) is linear, both the likelihood function and its logarithm are nonlinear in the model parameters, which can make the maximum likelihood approach computationally demanding. The problem is further complicated by the fact that parameter values should be constrained to guarantee that the resulting standard deviation is positive for all possible values of the input vector.

[19] Alternatively, a more straightforward approach to parameter estimation is to minimize the mismatch between the output of model (6) and the observed absolute error, e.g., using Euclidean distance

display math

The advantage is that when the error pdf is Gaussian and the standard deviation model (6) is linear, the fast and efficient linear least squares solution can be used. However, the criterion underlying the solution (10) does not reflect the true modeling scope, which is to identify the error pdf and not to interpolate the absolute error time series. Nonetheless, the solution (10) can be effectively used as the initialization of the iterative nonlinear optimization approach used to solve the maximum likelihood problem.
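As a rough illustration of the least squares route, the sketch below fits a static, two-parameter linear standard deviation model to absolute errors. Everything in it is an assumption made for illustration: the synthetic forecast series, the true parameter values, the helper name `fit_line`, and in particular the choice to regress the absolute error scaled by the Gaussian coefficient (square root of pi/2) so that the fitted line estimates the standard deviation directly.

```python
import math
import random

def fit_line(x, z):
    """Closed-form ordinary least squares for z ~ a + c * x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    c = sxz / sxx
    return mz - c * mx, c

# Synthetic heteroscedastic errors: sigma grows linearly with the forecast.
rng = random.Random(2)
a_true, c_true = 0.5, 0.1
y_hat = [rng.uniform(1.0, 50.0) for _ in range(5000)]
e = [rng.gauss(0.0, a_true + c_true * f) for f in y_hat]

# Regression target: scaled absolute error (assumed Gaussian coefficient).
k = math.sqrt(math.pi / 2.0)
a_hat, c_hat = fit_line(y_hat, [k * abs(v) for v in e])
sigma_ls = [a_hat + c_hat * f for f in y_hat]
print(round(a_hat, 2), round(c_hat, 3))
```

The resulting parameters could then serve as the starting point of a maximum likelihood search, which is the role the text assigns to the least squares solution.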

2.2. Model Evaluation

[20] The identified model of the standard deviation will be evaluated in terms of the likelihood value L or, equivalently, the negative log likelihood $-\log(L)$, computed over a validation data set different from the one used for model calibration. The use of L as an evaluation indicator is theoretically justified by the likelihood principle, which states that, for a given model structure, all the information that the data contain about the model parameters is in the likelihood function [Jaynes and Bretthorst, 2003]. Also, the likelihood is the only scoring rule that is unambiguous, local, and proper [Weijs et al., 2010]. Other evaluation indicators that will be used are the Akaike information criterion (AIC), which balances the model log likelihood and the model complexity, as measured by the number $n_\phi$ of model parameters; and the proportion of observations within the confidence limits (or prediction interval coverage probability, PICP [Solomatine and Shrestha, 2009]).
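The three indicators can be written down in a few lines. The following is a hedged sketch under stated assumptions: errors are independent and Gaussian with zero mean, the AIC uses the standard definition (twice the number of parameters plus twice the negative log likelihood), and the PICP is computed for a two-sided 95% interval. The function names and the four-point example series are hypothetical.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval

def neg_log_likelihood(e, sigma):
    """Negative log likelihood of independent errors e[t] ~ N(0, sigma[t]^2)."""
    return sum(0.5 * math.log(2.0 * math.pi * s * s) + 0.5 * (v / s) ** 2
               for v, s in zip(e, sigma))

def aic(nll, n_params):
    """Standard Akaike information criterion: 2 * n_params - 2 * log(L)."""
    return 2.0 * n_params + 2.0 * nll

def picp(e, sigma, z=Z95):
    """Fraction of errors falling inside the interval [-z * sigma, +z * sigma]."""
    inside = sum(1 for v, s in zip(e, sigma) if abs(v) <= z * s)
    return inside / len(e)

# Tiny illustrative example: four errors, constant unit standard deviation.
e = [0.5, -1.0, 3.0, 0.2]
sigma = [1.0, 1.0, 1.0, 1.0]
nll = neg_log_likelihood(e, sigma)
print(round(nll, 3), round(aic(nll, 1), 3), picp(e, sigma))
```

In this toy series one error out of four lies outside the 95% band, so the coverage is 0.75.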

[21] A closer insight into the behavior of the standard deviation model will be given by visual inspection of the associated confidence bounds, as well as the QQ plot. The QQ plot, or probability plot [Laio and Tamea, 2007; Thyer et al., 2009], is a graphical tool for assessing the goodness of fit of a sample of probability distributions against a sample of data. In our application, since the error is assumed additive and Gaussian, the conditional probability distribution of the flow, given the forecast $\hat{y}_t$, is Gaussian with standard deviation equal to the error standard deviation computed by model (6). The goodness of fit of such distribution cannot be assessed by conventional statistical tests, because only one extraction from that distribution is available, i.e., the measurement $y_t$. However, from the probability integral transform it follows that if the estimated cumulative distribution function (cdf) $\hat{F}_t$ of $y_t$ coincides with the true cdf $F_t$, then the value $u_t = \hat{F}_t(y_t)$ is an extraction from a uniform distribution over $[0, 1]$. Since this is true for any $t = 1, \ldots, N$, N being the number of flow measurements, one can evaluate the goodness of fit of the N cdfs $\hat{F}_t$ by checking if $U = \{u_1, \ldots, u_N\}$ is a sample of mutually independent, uniformly distributed observations. Independence of the sample can be checked by looking at the autocorrelation function. As for the uniformity hypothesis, we compute the value of the empirical cdf of $u_t$ as $R_t/N$, where $R_t$ is the number of elements in U lower than $u_t$, and compare it with the value of the uniform cdf, $u_t$.
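The probability plot construction can be sketched as follows, assuming a Gaussian predictive distribution centered on the forecast. The data are simulated and all names (`pit_values`, `qq_points`, `max_deviation`) are hypothetical. The example compares a correctly specified standard deviation with one that is deliberately doubled, to show how a misspecified variance moves the points away from the bisector.

```python
import math
import random

def normal_cdf(x, mean, sd):
    """Gaussian cdf evaluated through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def pit_values(y, y_hat, sigma):
    """Probability integral transform: estimated cdf evaluated at each observation."""
    return [normal_cdf(yt, ft, st) for yt, ft, st in zip(y, y_hat, sigma)]

def qq_points(u):
    """Pairs (u_t, R_t / N), with R_t the number of elements lower than u_t."""
    n = len(u)
    return [(ut, rank / n) for rank, ut in enumerate(sorted(u))]

def max_deviation(u):
    """Largest distance between the empirical cdf and the uniform cdf (the bisector)."""
    return max(abs(emp - ut) for ut, emp in qq_points(u))

# Synthetic observations with a forecast-dependent standard deviation.
rng = random.Random(3)
y_hat = [rng.uniform(5.0, 50.0) for _ in range(2000)]
sigma_true = [0.5 + 0.1 * f for f in y_hat]
y = [f + rng.gauss(0.0, s) for f, s in zip(y_hat, sigma_true)]

u_good = pit_values(y, y_hat, sigma_true)
dev_good = max_deviation(u_good)
dev_bad = max_deviation(pit_values(y, y_hat, [2.0 * s for s in sigma_true]))
print(round(dev_good, 3), round(dev_bad, 3))
```

An overestimated standard deviation concentrates the transformed values around 0.5, which produces the S-shaped curve in the plot.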

3. Application to the Simulation Model of the Rhone River, Switzerland

[22] The Rhone River is located in a high mountainous catchment, where the hydrological regime is strongly affected by glacier and snowmelt, with peak flows in summer (July–August) and a low-flow period in winter (February–March). The simulated discharge is generated using a semilumped conceptual glaciohydrological model, described in detail by Schaefli et al. [2005]. The model has two levels of discretization. The first one distinguishes between catchment areas covered and not covered by ice. Accumulation and melting of snow and ice in the ice-covered area are modeled by two parallel linear reservoirs. The noncovered area is modeled by a linear reservoir for the slow contribution of the soil underground water and a nonlinear one for the direct runoff. The second level of discretization is among a set of elevation bands: runoff discharge is computed separately for each band and then aggregated. No routing among the components of the model is considered because the runoff delay is much smaller than the modeling time step, which is 24 h. Time series for model identification (precipitation, temperature, and potential evapotranspiration) cover the period from 1981 to 1984. Data from 1990 to 1994 are used for validation.

[23] The error of this conceptual model is correlated in time, as already discussed by Schaefli et al. [2007], and it can be described by a first-order autoregressive model. The residual error after adding the autoregressive component is

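$$e_t = y_t - \hat{y}_t - \alpha\, r_{t-1} \qquad (11)$$

where $y_t$ is the measured runoff, $\hat{y}_t$ is the hydrological model prediction at time t, $\alpha$ is the autoregressive coefficient, and $r_{t-1}$ is the residual of the hydrological model at the previous time step, $r_{t-1} = y_{t-1} - \hat{y}_{t-1}$.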

[24] Figure 1 shows that although the residual $e_t$ is independent of its previous values, its squared value $e_t^2$ has a significant autocorrelation. This is evidence of heteroscedasticity [Engle, 1982]. The absolute error $|e_t|$ shows an even higher autocorrelation. As anticipated, squaring the error reduces the smoothness and thus the autocorrelation of the time series. For this reason we will focus on the absolute error time series and use it to derive a model of the error standard deviation.

Figure 1.

Autocorrelation of residual error $e_t$ (open circles), squared error $e_t^2$ (squares), and absolute error $|e_t|$ (solid circles) for the Rhone River case study (calibration data set).

3.1. Model Identification

[25] Potential inputs of the standard deviation model are all the inputs and the output of the hydrological model, i.e., precipitation $p_t$, temperature $T_t$, and runoff forecast $\hat{y}_t$, as well as the absolute value of the last observed residual $|e_{t-1}|$ and, since temperature mainly affects the flow through snowmelt, the temperature value when positive, $T_t \cdot H(T_t)$, where $H(\cdot)$ is the unit step function.

[26] Table 1 reports the cross-correlation value between different candidate input variables and the absolute error. It shows that precipitation is weakly related to the absolute value of the residual, whereas all the other inputs have a significant correlation. The weak correlation between precipitation and error can be explained by the fact that the precipitation input does not distinguish between snow and rain. Predicted runoff has a slightly higher correlation than the other variables.

Table 1. Cross-Correlation Value Between Different Candidate Input Variables and the Absolute Error for the Rhone River, Calibration Period 1981–1984

Input                   $p_t$    $T_t$    $T_t \cdot H(T_t)$    $\hat{y}_t$    $|e_{t-1}|$
corr(input, $|e_t|$)    0.116    0.456    0.511                 0.588          0.530

[27] Following this data analysis, we selected as input variables the lagged absolute error, which accounts for the slow dynamics of the standard deviation; an exogenous component identified in the predicted runoff $\hat{y}_t$; and a constant term, to avoid too small variance values during low-flow periods. According to the parsimonious modeling approach, we started from a linear relation between these variables, while relaxing the lag 1 assumption and considering also absolute error and flow forecast values in previous time intervals. The model is thus an autoregressive exogenous input (ARX) model and takes the following form:

display math

The model orders n and m are chosen by trial and error. Parameters a, $b_i$, and $c_i$ are estimated using criterion (10) because the model is linear in the parameters, so that (10) can be solved efficiently. However, the optimization should be constrained such that $\sigma_t$ can take positive values only. If the domain of all inputs is real positive, as in this case, positivity of all the parameters is a sufficient condition for the positivity of the model output. The calibration procedure thus starts by applying unconstrained linear least squares and, if the solution does not satisfy the positivity constraint, uses such solution as the starting point for an iterative constrained least squares procedure (lsqnonlin function in the Matlab Optimization Toolbox).
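The first, unconstrained step of this calibration procedure can be sketched for the simplest case n = 1 and m = 1. The sketch assumes, for illustration only, that the standard deviation is a constant plus a term in the previous absolute error plus a term in the current forecast, and that the regression target is the absolute error scaled by the Gaussian coefficient; the synthetic data, the true parameter values, and the helper names are all hypothetical. The constrained refinement is not implemented here; the code only flags when it would be needed.

```python
import math
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def least_squares(X, z):
    """Unconstrained linear least squares through the normal equations."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xtz = [sum(row[i] * zi for row, zi in zip(X, z)) for i in range(p)]
    return solve_linear(XtX, Xtz)

# Simulate errors whose standard deviation follows the assumed ARX relation.
rng = random.Random(4)
a_true, b_true, c_true = 0.2, 0.3, 0.05
y_hat = [rng.uniform(1.0, 50.0) for _ in range(5000)]
e = [0.0]
for t in range(1, len(y_hat)):
    sigma_t = a_true + b_true * abs(e[t - 1]) + c_true * y_hat[t]
    e.append(rng.gauss(0.0, sigma_t))

# Regressors: constant, lagged absolute error, current forecast.
K = math.sqrt(math.pi / 2.0)
X = [[1.0, abs(e[t - 1]), y_hat[t]] for t in range(1, len(e))]
z = [K * abs(e[t]) for t in range(1, len(e))]
a_hat, b_hat, c_hat = least_squares(X, z)

# With positive inputs, non-negative parameters are sufficient for a positive output.
needs_constrained_fit = min(a_hat, b_hat, c_hat) < 0.0
sigma_hat = [a_hat + b_hat * row[1] + c_hat * row[2] for row in X]
print(round(a_hat, 2), round(b_hat, 2), round(c_hat, 3), needs_constrained_fit)
```

When the flag is raised, a bounded solver would take the unconstrained estimate as its starting point, as the text describes.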

3.2. Model Evaluation

[28] The dynamic uncertainty model by regression on absolute error (DUMBRAE) of equation (12) is here compared with a constant model, a mixture of normal distributions (here called two mixed), and a periodic model. The constant model assumes that the standard deviation of the prediction error is constant in time and equal to the sample standard deviation. For this test case, this model was already recognized as unsatisfactory by Schaefli et al. [2007], who proposed a mixture of two normal distributions to treat heteroscedasticity using two different values of the variance, a smaller variance for the more predictable low flow and a higher variance for the high flow. The rule for switching from one to the other is based on some conditions on the predicted runoff. For a more complete benchmark we also developed a simple periodic model that estimates the standard deviation as a function of time and is identified using a Fourier series truncated at a (small) finite number of harmonics. The model is described in detail in Appendix B.

[29] Table 2 reports the values of the evaluation indicators over the calibration and validation periods for the DUMBRAE model (12) with n = 1 and m = 1 and for the three benchmarks. It shows that DUMBRAE is the best of the evaluated set, since it has the minimum value of $-\log(L)$ and AIC and a PICP close to 0.95 (the closest of the four models over the validation data set). As expected, the constant variance model is extremely poor. The periodic model proves better over the calibration data set, but in validation the negative log likelihood goes to infinity. A closer look at the data shows that the periodic model performs well in most cases (this is also shown later in the QQ plot) except for two events that fall in the tail of the estimated pdf and thus are associated with an extremely small probability value. This is because the periodic model follows the seasonal trend but does not use real-time information. DUMBRAE instead, even though it is described by a simple ARX relation, can dynamically adjust to the increase and decrease of the variance.

Table 2. Comparison of Different Standard Deviation Models for the Rhone River (a)

Model        $n_\phi$    $-\log(L)$                       AIC                              PICP
                         Calibration    Validation        Calibration    Validation        Calibration    Validation
Constant     1           6.02 × 10^3    3.34 × 10^3       10.9 × 10^3    6.02 × 10^3       0.946          0.940
Periodic     7           2.26 × 10^3    Infinity          4.54 × 10^3    Infinity          0.929          0.919
Two mixed    3           4.68 × 10^3    2.05 × 10^3       8.42 × 10^3    4.25 × 10^3       0.952          0.962
DUMBRAE      3           1.71 × 10^3    1.36 × 10^3       3.60 × 10^3    2.57 × 10^3       0.953          0.958

(a) Here $n_\phi$ is the number of model parameters, $-\log(L)$ is the negative log likelihood, AIC is the Akaike information criterion, and PICP is the prediction interval coverage probability at the 95% level of confidence.

[30] Figure 2 shows the 95% confidence interval in time of the four models for the first year of the calibration period (1981). It can be seen that the periodic and the DUMBRAE models have more or less the same trend, which follows relatively well the observed values of the residuals (black dots). The two-mixed and the constant models provide an unnecessarily large confidence band. The two-mixed model, having only a binary configuration (high-low uncertainty), can only partially modulate the variance. The constant variance model is completely inflexible, and the estimation of its average value is dominated by a few large error values. Notice that because the error distribution is assumed to be Gaussian and thus has infinite support, negative flows are given a nonzero probability and the lower quantile of the flow can be negative. In principle this is not acceptable and may be a reason for abandoning the Gaussian assumption. In practice, however, the problem may be overcome by simply setting the negative quantiles to zero. In this application, the DUMBRAE model did not produce any negative flow quantile over the calibration horizon or the validation horizon, while all the benchmark models did, especially the constant model. The reason for this good property of DUMBRAE is that it reduces the standard deviation in correspondence to small forecast values (see equation (12)), thus keeping the lower quantile very close to the forecast when the latter is close to zero.
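The construction of the confidence bounds, including the practical fix of setting negative lower quantiles to zero, can be sketched as below. This is an illustration under the Gaussian assumption only; the forecast values, the two standard deviation series, and the function name `confidence_bounds` are made up for the example and do not come from the case study.

```python
Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval

def confidence_bounds(y_hat, sigma, z=Z95, clip_at_zero=True):
    """Gaussian confidence bounds around the forecast; negative lower bounds
    are optionally set to zero, since flow cannot be negative."""
    bounds = []
    for f, s in zip(y_hat, sigma):
        lower, upper = f - z * s, f + z * s
        if clip_at_zero:
            lower = max(lower, 0.0)
        bounds.append((lower, upper))
    return bounds

# Illustrative forecasts: low, medium, and high flow.
y_hat = [1.0, 20.0, 80.0]
sigma_const = [10.0, 10.0, 10.0]             # constant standard deviation
sigma_dyn = [0.1 + 0.1 * f for f in y_hat]   # standard deviation shrinking at low flow

raw_const = confidence_bounds(y_hat, sigma_const, clip_at_zero=False)
clipped_const = confidence_bounds(y_hat, sigma_const)
dyn = confidence_bounds(y_hat, sigma_dyn)
print(raw_const[0], clipped_const[0], dyn[0])
```

At low flow the constant model needs the clipping step, whereas a standard deviation that decreases with the forecast keeps the lower bound positive without it.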

Figure 2.

(top) Rhone River discharge. (bottom) Residual error (black dots) and 95% confidence bounds based on the constant variance model (blue lines), two-mixed model (red lines), periodic model (green lines), and DUMBRAE (magenta). First year (1981) of the calibration data set.

[31] Finally, Figure 3 shows the QQ plots corresponding to the four standard deviation models over the calibration and validation data set.

Figure 3.

QQ plot of the constant variance model (blue line), two-mixed model (red line), periodic model (green line), and DUMBRAE (magenta line) for the Rhone River case study over (a) the calibration data set and (b) the validation data set.

[32] Figure 3 shows that none of the empirical cdfs, as estimated by DUMBRAE and the benchmark models, lies exactly on the bisector, i.e., none of them coincides with the theoretical cdf; however, the empirical cdfs corresponding to the periodic and DUMBRAE models are the closest. Notice that these two models exhibit very similar results in the QQ plot while having very different log likelihood values. Specifically, the periodic model outperforms DUMBRAE in the QQ plot over the validation data set, although it is clearly outperformed in terms of log likelihood (see Table 2). This is because the periodic model strongly underestimates the variance in a small number of events, which are not visible in the QQ plot but are heavily penalized in the likelihood score.
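The different sensitivity of the two diagnostics can be sketched numerically. The example below assumes zero-mean Gaussian errors and takes the QQ plot points to be z_t = Φ(e_t/σ_t); the data are hypothetical, and the function names are introduced here for illustration only.

```python
import math
from statistics import NormalDist

def neg_log_likelihood(errors, sigmas):
    """Gaussian negative log likelihood of zero-mean errors."""
    return sum(0.5 * math.log(2.0 * math.pi * s * s) + e * e / (2.0 * s * s)
               for e, s in zip(errors, sigmas))

def pit_values(errors, sigmas):
    """Sorted z_t = Phi(e_t / sigma_t), assumed to be the QQ plot points."""
    std_normal = NormalDist()
    return sorted(std_normal.cdf(e / s) for e, s in zip(errors, sigmas))

# Hypothetical data: 99 small errors with a well-sized sigma, plus one large
# error for which sigma is either adequate or strongly underestimated.
errors = [0.5 * (-1) ** i for i in range(99)] + [20.0]
sigmas_good = [1.0] * 99 + [10.0]
sigmas_bad = [1.0] * 99 + [1.0]

nll_good = neg_log_likelihood(errors, sigmas_good)
nll_bad = neg_log_likelihood(errors, sigmas_bad)
z_good = pit_values(errors, sigmas_good)
z_bad = pit_values(errors, sigmas_bad)
```

In this sketch a single underestimated event moves only one of the 100 QQ points, and by a small amount, while it adds a large penalty to the negative log likelihood.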

[33] In conclusion, the analysis indicates that the DUMBRAE model (12), although simple, provides an effective tool for estimating the error standard deviation. It outperforms the other three benchmark models in terms of the log likelihood, AIC, and PICP indicators; it provides an error cdf closer to the theoretical one than those of the constant and two-mixed models, although the S-shaped QQ plot indicates that the standard deviation is sometimes overestimated; and it produces a confidence interval that adjusts more closely to the variation in the model (absolute) error.

4. Application to Inflow Forecasting in the Lake Maggiore Catchment

[34] Lake Maggiore is a regulated lake on the border between Italy and Switzerland. The lake catchment covers about 6600 km², with 17% of the watershed area lying more than 2000 m above sea level. Climate conditions are extremely variable, with higher precipitation in autumn and spring and a significant contribution from snowmelt in late spring and summer. A flow forecasting model was developed to support real-time operation of the lake [Pianosi and Soncini-Sessa, 2009]. It is a data-driven, lumped model that provides the total inflow to the lake in the next 24 h as a function of the data available at the time of forecast: measured precipitation in the catchment (spatial average), the observed inflow, and the forecasting error in previous time intervals. The model is composed of (1) an autoregressive component employing the logarithm of the flow, so that the recession curve follows a more than exponential decay, (2) an exogenous component that weights the precipitation input by a periodic function estimated from data, and (3) a moving average component based on previous forecasting errors. Although it is a data-driven model, its parameters can be given a physically sound interpretation.

[35] The model was calibrated using time series of inflow and precipitation over the period 1993–1997. The time series of residual error over the same period is used to identify the error variance model, whereas data from 1998 to 2000 are used for model validation.

[36] Figure 4 shows the autocorrelation function of the model residuals. It can be seen that while the residual error is almost uncorrelated, the squared error and the absolute error are significantly autocorrelated, which is evidence of heteroscedasticity of the error process [Engle, 1982].

Figure 4.

Autocorrelation of residual error $e_t$ (open circles), squared error $e_t^2$ (squares), and absolute error $|e_t|$ (solid circles) for the Lake Maggiore catchment case study (calibration data set).
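The diagnostic used above can be sketched on synthetic data. The series below is hypothetical (independent Gaussian errors whose standard deviation alternates between calm and turbulent blocks) and only illustrates why an uncorrelated error with autocorrelated squared and absolute values points to heteroscedasticity.

```python
import random

def autocorr(series, lag):
    """Sample autocorrelation of a series at a given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return cov / var

# Synthetic heteroscedastic errors: independent draws, but a standard
# deviation that switches between 1 and 5 every 50 time steps.
rng = random.Random(0)
sigma = [1.0 if (t // 50) % 2 == 0 else 5.0 for t in range(2000)]
e = [rng.gauss(0.0, s) for s in sigma]

r_err = autocorr(e, 1)                      # expected to be close to zero
r_abs = autocorr([abs(x) for x in e], 1)    # expected to be clearly positive
r_sq = autocorr([x * x for x in e], 1)      # expected to be positive
```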

4.1. Model Identification

[37] Potential inputs of the standard deviation model are first searched among the hydrological model inputs, that is, precipitation, residual error, and flow observation in previous time intervals. Figure 5 shows the correlation between past precipitation $p_{t-k}$ and the error $e_t$ (open circles), the squared error $e_t^2$ (squares), and the absolute error $|e_t|$ (solid circles). The error is highly correlated with precipitation in the same time interval (k = 0), because the forecasting error is often due to a simultaneous and thus unpredictable precipitation event, while the correlation with past precipitation records ($k \geq 1$) is negligible, which confirms that all the information available at the time of forecast is correctly exploited by the hydrological model. However, the correlation between precipitation and squared error is also high for $k \geq 1$. This means that while the error value does not depend on past precipitation, the error variance does; or, in physically sound terms, uncertainty in the inflow forecast increases after rainfall events. Finally, the correlation between precipitation and the absolute error is even stronger than that between precipitation and the squared error. Therefore, it can be expected that identifying the standard deviation model from time series of the absolute error will be easier than identifying the variance model from squared errors.

Figure 5.

Cross correlation between past precipitation $p_{t-k}$ and error $e_t$ (open circles), squared error $e_t^2$ (squares), and absolute error $|e_t|$ (solid circles) for the Lake Maggiore catchment case study (calibration data set).
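A cross-correlation analysis of this kind can be sketched as follows. The rainfall and error series are synthetic and hypothetical: the error is zero mean, with a spread that grows with the rainfall of the previous time step, so that past precipitation informs the error magnitude but not its value.

```python
import random

def lagged_corr(x, y, lag):
    """Pearson correlation between x[t - lag] and y[t]."""
    xs = x[:len(x) - lag]
    ys = y[lag:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
N = 5000
# Hypothetical rainfall: dry on most days, uniform depth on wet days.
p = [rng.uniform(0.0, 20.0) if rng.random() < 0.3 else 0.0 for _ in range(N)]
# Error at time t: zero mean, spread driven by rainfall one step earlier.
e = [0.0] + [rng.gauss(0.0, 1.0 + 0.5 * p[t - 1]) for t in range(1, N)]

c_err = lagged_corr(p, e, 1)                      # error value vs. past rain
c_abs = lagged_corr(p, [abs(v) for v in e], 1)    # absolute error vs. past rain
c_sq = lagged_corr(p, [v * v for v in e], 1)      # squared error vs. past rain
```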

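[38] Since the correlations of the absolute error $|e_t|$ with the predicted flow (0.44) and with the observed flow in the previous time interval (0.35) are both lower than the correlation with the observed precipitation, the latter will be used as the exogenous input of the standard deviation model, which takes the form

$$\sigma_t = \alpha_0 + \sum_{i=1}^{n} \alpha_i\,|e_{t-i}| + \sum_{j=1}^{m} \beta_j\,p_{t-j} \tag{13}$$

where $\sigma_t$ is the error standard deviation, $p_{t-j}$ is the precipitation observed j time intervals earlier, $\alpha_0$, $\alpha_i$, and $\beta_j$ are the model parameters, and n and m are the orders of the autoregressive and exogenous components.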

[39] Following the uncertainty decomposition based on hydrological causes and presented by Götzinger and Bardossy [2008], the first two components (constant and autoregressive) in (13) can be interpreted as uncertainty due to the process description, whereas the third component is due to the precipitation input. Model calibration follows the same approach as described in the Rhone River application.
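As an illustration of regression on absolute errors, the sketch below fits a standard deviation model with the three components described for (13): a constant, a lagged absolute error, and lagged precipitation. It is a simplified stand-in, not the calibration procedure of the paper: it uses ordinary least squares on the absolute errors scaled by sqrt(π/2) (the factor that relates the mean absolute value of a zero-mean Gaussian variable to its standard deviation), a single lag per component, and synthetic data. All names and numerical values are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def least_squares(rows, targets):
    """Ordinary least squares through the normal equations."""
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(A, b)

# Synthetic data (hypothetical): sigma_t = 1 + 0.3 |e_{t-1}| + 0.5 p_{t-1}.
rng = random.Random(7)
N = 20000
p = [rng.uniform(0.0, 20.0) if rng.random() < 0.3 else 0.0 for _ in range(N)]
e = [0.0]
for t in range(1, N):
    sigma_t = 1.0 + 0.3 * abs(e[t - 1]) + 0.5 * p[t - 1]
    e.append(rng.gauss(0.0, sigma_t))

# Regress sqrt(pi/2) |e_t| on the regressors [1, |e_{t-1}|, p_{t-1}].
scale = math.sqrt(math.pi / 2.0)
rows = [[1.0, abs(e[t - 1]), p[t - 1]] for t in range(1, N)]
targets = [scale * abs(e[t]) for t in range(1, N)]
const, ar_coef, rain_coef = least_squares(rows, targets)
```

On this synthetic series the three estimates should land near the generating values (1, 0.3, and 0.5). A maximum likelihood fit could then be started from these least squares estimates, if desired.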

4.2. Model Evaluation

[40] Just as in the Rhone River application, a constant and a periodic model of the standard deviation will be used as benchmarks. Furthermore, a state-dependent model, as used, for instance, by Thyer et al. [2009] and Schoups and Vrugt [2010], will be assessed. It assumes that the standard deviation is a linear function of the inflow forecast, i.e.,

$$\sigma_t = c_0 + c_1\,\hat{q}_t \tag{14}$$

where $\hat{q}_t$ is the inflow forecast and $c_0$ and $c_1$ are the model parameters.

[41] Table 3 reports the values of the evaluation indicators for these three models and the DUMBRAE model (13) with n = 2 and m = 2. It shows that DUMBRAE has the lowest negative log likelihood and thus the highest skill in reproducing the error probability. Its AIC value is also the lowest, and its PICP is quite close to the theoretical value of 0.95 (even if the constant and state-dependent models are slightly better over the validation data set). Contrary to the Rhone River application, the lower quantile of the flow estimated by DUMBRAE is negative in correspondence to some heavy rainfall events that produce very high standard deviation values (see equation (13)), even if the frequency of such negative values is lower than with the constant and periodic models. As discussed for the Rhone River application, this is not a problem for operational purposes (it is sufficient to replace negative flow quantiles by zero), but it is a conceptual weakness of the proposed approach. Nonetheless, our opinion is that for application-oriented purposes this weakness is acceptable, since the Gaussian assumption provides several important computational advantages (the variance model can be identified separately from the mean; time series of absolute errors can be used in place of squared errors thanks to equation (9)).

Table 3. Comparison of Different Standard Deviation Models for Lake Maggiore Catchment (a)

                         −log(L)                     AIC                         PICP
Model             nφ     Calibration   Validation    Calibration   Validation    Calibration   Validation
Constant          1      12.1 × 10³    7.32 × 10³    24.1 × 10³    14.6 × 10³    0.965         0.959
Periodic          5      11.7 × 10³    7.20 × 10³    24.0 × 10³    14.3 × 10³    0.921         0.911
State dependent   2      11.8 × 10³    7.18 × 10³    23.6 × 10³    14.4 × 10³    0.942         0.940
DUMBRAE           5      10.8 × 10³    6.60 × 10³    21.8 × 10³    13.4 × 10³    0.945         0.923

(a) Here nφ is the number of model parameters, −log(L) is the negative log likelihood, AIC is the Akaike information criterion, and PICP is the prediction interval coverage probability at level of confidence 0.95.
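For readers who wish to compute indicators of this kind on their own residuals, a minimal sketch follows. It uses the standard definition of the AIC (twice the number of parameters plus twice the negative log likelihood) and takes the PICP as the fraction of errors falling inside a symmetric Gaussian interval; the exact conventions used for Table 3 are not restated here, so the code is an assumption-based illustration with hypothetical data.

```python
from statistics import NormalDist

def aic(neg_log_lik, n_params):
    """Akaike information criterion from a negative log likelihood."""
    return 2.0 * n_params + 2.0 * neg_log_lik

def picp(errors, sigmas, level=0.95):
    """Fraction of errors that fall inside the Gaussian interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    inside = sum(1 for e, s in zip(errors, sigmas) if abs(e) <= z * s)
    return inside / len(errors)

# Hypothetical check: errors placed at evenly spaced standard normal
# quantiles, evaluated against the correct unit standard deviation.
std_normal = NormalDist()
errors = [std_normal.inv_cdf((i + 0.5) / 1000) for i in range(1000)]
coverage = picp(errors, [1.0] * 1000)
example_aic = aic(neg_log_lik=100.0, n_params=5)
```

A PICP below the nominal level suggests intervals that are too narrow, and a PICP above it suggests intervals that are too wide.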

[42] Figure 6 compares the 95% confidence intervals of the flow based on the DUMBRAE model and the three benchmark standard deviation models. It shows that if a constant standard deviation is used, the confidence interval (blue lines) is too large for low-flow events and too narrow for flood events. Since floods generally occur in autumn, the periodic model properly produces a larger standard deviation value in that period, and thus a wider confidence interval (green lines). Still, the periodic model is not completely satisfactory, since it provides an average assessment of the seasonal uncertainty but cannot distinguish high uncertainty periods caused by precipitation events. The confidence intervals based on DUMBRAE and the state-dependent model (magenta and red lines, respectively) are narrow in correspondence to low-flow values and wider in correspondence to floods, when unpredictability increases. However, an important difference in behavior between those two models can be detected by observing their trends during the recession process, i.e., from day 290 to day 300 in Figure 6. Forecasting errors in this phase are small because the hydrological model can accurately reproduce the flow dynamics in the recession phase. The DUMBRAE model can quickly adapt to this situation and properly reduce the confidence intervals because it is updated by information on absolute errors in previous time intervals (see equation (13)). The state-dependent model, instead, by simply relating the standard deviation to the flow forecast, cannot distinguish whether the high flow is due to an ongoing precipitation event (high uncertainty) or to a recession process (low uncertainty), and in the latter case it overestimates the confidence interval.
From a model diagnostics perspective, DUMBRAE indicates that the forecast error is mainly related to the precipitation input, suggesting either that the precipitation observations (or their spatial aggregation) are a major source of uncertainty or that the hydrological model is not adequate in reproducing the fast response of the catchment to rainfall events (while properly reproducing the recession curve).

Figure 6.

(top) Daily inflow to Lake Maggiore. (bottom) Residual error (black dots) and 95% confidence bounds based on the constant variance model (blue lines), state-dependent model (dashed red lines), periodic model (green lines), and DUMBRAE (magenta lines). Days 200–360 in the calibration data set.

[43] Finally, Figure 7 reports the QQ plots generated by the four standard deviation models over the calibration and validation data sets. The line generated by the DUMBRAE model (magenta) is very close to the bisector (black line), while the curves of the other standard deviation models are S shaped, which means that the $z_t$ points are concentrated toward the center of the interval $[0, 1]$. This indicates that the confidence interval is frequently wider than needed or, in other terms, that the standard deviation is often overestimated.

Figure 7.

QQ plot of the constant variance model (blue line), state-dependent model (red line), periodic model (green line), and DUMBRAE (magenta line) for the Lake Maggiore catchment over (a) the calibration data set and (b) the validation data set.

5. Conclusion

[44] This paper presents a method for identifying a simple but effective uncertainty model to be associated with the predictions of a hydrological model with heteroscedastic errors. The method is very general and only requires the time series of residual errors over a historical period as an input, regardless of the structure of the hydrological model. However, it relies on the assumption that such errors are independent and zero mean, and thus an appropriate transformation of the hydrological model output may be required to meet this condition. Also, we show that under the further assumption that the residual error is Gaussian, the error standard deviation is linearly proportional to the expected absolute value of the error, which means that modeling the error standard deviation reduces to a regression analysis of the absolute error time series. On the basis of these considerations, we show that a proper data analysis combined with physical considerations on the case study at hand can help identify the most suitable input to the standard deviation model and the most parsimonious model structure. We named the approach dynamic uncertainty modeling by regression on absolute errors (DUMBRAE). We also provide some theoretical background and practical approaches for model identification on the basis of the maximum likelihood principle. The effectiveness of the method was demonstrated by application to two case studies and comparison with other approaches presented in the literature for modeling the error variance, as well as simple constant and periodic models. Modeling results are assessed in terms of the formal likelihood measure, plus other evaluation indicators and a graphical tool, the QQ plot, which provides more insight into the estimated probability distribution. The analysis shows that the proposed method can effectively reproduce the heteroscedasticity of the residual errors.
A suitable choice of the inputs to the standard deviation model can improve the model accuracy and compensate for the simplifying assumption of Gaussian errors. Using precipitation measurements and past observations of the error itself, and not only the flow forecast as is usually done in the literature, significantly enhances the uncertainty description, especially in the recession phase of the hydrograph. Finally, although DUMBRAE is an application-oriented approach and was originally conceived for the practical goal of effectively associating prediction bounds with flow forecasts, it can also contribute to model diagnostics. In fact, as demonstrated in the proposed case studies (especially the Lake Maggiore catchment), the analysis of the time series of residual errors helps to identify which variables and processes contribute most to the forecasting error, and thus which components of the hydrological model should be improved. One limitation of the proposed approach is that it depends on the a priori assumption of an adequate function family for the error standard deviation model. Further research will also concentrate on extending the approach to other, more flexible probability distributions, for instance the Gamma distribution.

Appendix A: Expected Value of the Absolute Value of a Normal Variable

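[45] First, it can be proved that if $X \sim N(0, \sigma^2)$ and $Y = |X|$, then

$$f_Y(y) = 2 f_X(y), \qquad y \geq 0$$

where $f_X$ and $f_Y$ are the pdfs of X and Y, respectively. In fact, from the definition of Y, it follows that $F_Y(y) = P(-y \leq X \leq y) = F_X(y) - F_X(-y)$ for $y \geq 0$, where $F_X$ and $F_Y$ are the corresponding cdfs, and thus

$$f_Y(y) = \frac{dF_Y(y)}{dy} = f_X(y) + f_X(-y)$$

Since $f_X$ is an even function,

$$f_X(-y) = f_X(y)$$

from which the thesis follows.

[46] Therefore the expected value of Y is given by

$$E[Y] = \int_0^{\infty} y\, f_Y(y)\, dy = \frac{2}{\sigma\sqrt{2\pi}} \int_0^{\infty} y\, e^{-y^2/(2\sigma^2)}\, dy = \sigma \sqrt{\frac{2}{\pi}}$$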
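The result of this appendix, namely that the mean absolute value of a zero-mean normal variable equals σ·sqrt(2/π), can be checked numerically. The sketch below is a simple Monte Carlo verification with a hypothetical value of σ; it also shows the inverse use of the relation, i.e., recovering σ from the mean absolute value.

```python
import math
import random

rng = random.Random(42)
sigma = 3.0  # hypothetical standard deviation

# Draw a large sample of |X| with X ~ N(0, sigma^2).
sample = [abs(rng.gauss(0.0, sigma)) for _ in range(200_000)]

mean_abs = sum(sample) / len(sample)                    # empirical E|X|
theory = sigma * math.sqrt(2.0 / math.pi)               # sigma * sqrt(2/pi)
sigma_from_abs = math.sqrt(math.pi / 2.0) * mean_abs    # inverted relation
```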

Appendix B: Periodic Model of the Standard Deviation Based on Fourier Decomposition

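[47] A periodic model of the error standard deviation is

$$\sigma_t = a_0 + \sum_{k=1}^{n} \left[ a_k \sin\left(\frac{2\pi k t}{T}\right) + b_k \cos\left(\frac{2\pi k t}{T}\right) \right] \tag{B1}$$

where T is the time period (for instance, if the modeling time step is 24 h, T = 365 days) and t is the current time step counter. Identification of model (B1) is straightforward: for a given order n of the Fourier decomposition, the coefficients $\theta = [a_0, a_1, b_1, \ldots, a_n, b_n]^T$ can be estimated by least squares according to the well-known formula

$$\hat{\theta} = \left( \sum_{t=1}^{N} \varphi_t \varphi_t^T \right)^{-1} \sum_{t=1}^{N} \varphi_t\, y_t$$

where N is the number of data, the regressor vector $\varphi_t$ is given by

$$\varphi_t = \left[ 1, \; \sin\left(\frac{2\pi t}{T}\right), \; \cos\left(\frac{2\pi t}{T}\right), \; \ldots, \; \sin\left(\frac{2\pi n t}{T}\right), \; \cos\left(\frac{2\pi n t}{T}\right) \right]^T$$

and the observation $y_t$ is simply given by $\sqrt{\pi/2}\,|e_t|$, according to equation (9).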
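The least squares identification of a periodic model such as (B1) can be sketched as follows. The example assumes a second-order Fourier decomposition, a daily time step with T = 365, and observations obtained by scaling the absolute errors by sqrt(π/2), consistent with the result of Appendix A. The error series is synthetic, and all names and values are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fourier_regressor(t, T, order):
    """Regressor [1, sin, cos, ..., sin, cos] at time step t."""
    row = [1.0]
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * t / T
        row += [math.sin(angle), math.cos(angle)]
    return row

# Synthetic errors (hypothetical): sigma_t = 2 + sin(2 pi t / T), 20 years.
rng = random.Random(3)
T, order, N = 365, 2, 365 * 20
e = [rng.gauss(0.0, 2.0 + math.sin(2.0 * math.pi * t / T)) for t in range(N)]

# Normal equations: (sum phi phi^T) theta = sum phi y.
rows = [fourier_regressor(t, T, order) for t in range(N)]
y = [math.sqrt(math.pi / 2.0) * abs(v) for v in e]
k = len(rows[0])
A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
b = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
theta = solve(A, b)  # ordered as [a0, a1, b1, a2, b2]
```

On this synthetic series the constant and the first sine coefficient should be close to the generating values (2 and 1), with the remaining coefficients near zero.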

Acknowledgments

[48] The authors are grateful to the anonymous reviewers and Associate Editor D. Kavetski for their careful revision that significantly improved the paper. The authors wish to thank B. Schaefli for kindly providing the model output data set and information about the Rhone River model, the Swiss Federal Office for Water and Geology for the discharge data, and the national weather service MeteoSwiss for the meteorological data series. The data used for Lake Maggiore catchment were provided by Consorzio Ticino, ARPA Piemonte (Italy), and MeteoSwiss (Switzerland). The authors also wish to thank G. Schoups for useful comments on the paper.