On hydrological model complexity, its geometrical interpretations and prediction uncertainty



[1] Knowledge of hydrological model complexity can aid selection of an optimal prediction model from a set of available models. Optimal model selection is formalized as selecting the least complex model from the subset of models that have lower empirical risk. This may be considered equivalent to minimizing an upper bound on prediction error, defined here as the mathematical expectation of empirical risk. In this paper, we derive an upper bound that is free of assumptions on the data and the underlying process distribution, as well as on the independence of model predictions over time. We demonstrate that hydrological model complexity, as defined in the presented theoretical framework, plays an important role in determining the upper bound. Model complexity also acts as a stabilizer for a hydrological model selection problem that is deemed ill-posed. We provide an algorithm for computing the complexity of an arbitrary hydrological model. We also demonstrate that hydrological model complexity has a geometric interpretation as the size of the model output space. The presented theory is applied to quantify the complexities of two hydrological model structures, SAC-SMA and SIXPAR, and finds that SAC-SMA is indeed more complex than SIXPAR. We also develop an algorithm to estimate the upper bound on prediction error and apply it to five rainfall-runoff model structures that vary in complexity. We show that a model selection problem is stabilized by regularizing it with model complexity. Complexity-regularized model selection yields models that are robust in predicting future, as yet unseen, data.
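The two selection rules described above can be illustrated with a minimal sketch. It is not the algorithm developed in the paper: the names `Candidate`, `select_least_complex`, and `select_regularized`, the `risk_tolerance` and `weight` parameters, and the additive form of the penalty are all assumptions made for illustration, and the numbers are invented placeholders rather than results.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    """A candidate model summarized by two scalars (hypothetical structure)."""
    name: str
    empirical_risk: float  # e.g. mean squared error on the calibration data
    complexity: float      # any scalar complexity measure, assumed given


def select_least_complex(candidates: Sequence[Candidate],
                         risk_tolerance: float) -> Candidate:
    """Pick the least complex model among those with low empirical risk.

    "Low" is taken here to mean within `risk_tolerance` of the best
    empirical risk; this threshold is an assumption of the sketch.
    """
    if not candidates:
        raise ValueError("no candidate models supplied")
    best_risk = min(c.empirical_risk for c in candidates)
    admissible = [c for c in candidates
                  if c.empirical_risk <= best_risk + risk_tolerance]
    return min(admissible, key=lambda c: c.complexity)


def select_regularized(candidates: Sequence[Candidate],
                       weight: float) -> Candidate:
    """Pick the model minimizing empirical risk plus a complexity penalty.

    The additive penalty `weight * complexity` is an assumed, generic
    form of regularization, not the bound derived in the paper.
    """
    if not candidates:
        raise ValueError("no candidate models supplied")
    return min(candidates,
               key=lambda c: c.empirical_risk + weight * c.complexity)


# Invented placeholder values, for illustration only.
models = [
    Candidate("model_A", empirical_risk=0.30, complexity=1.0),
    Candidate("model_B", empirical_risk=0.22, complexity=2.0),
    Candidate("model_C", empirical_risk=0.20, complexity=5.0),
]

print(select_least_complex(models, risk_tolerance=0.05).name)
print(select_regularized(models, weight=0.05).name)
```

With a zero tolerance or zero weight, both rules reduce to plain empirical risk minimization and return the best-fitting model regardless of its complexity. Raising either parameter shifts the choice toward simpler models, which is the stabilizing effect the abstract attributes to complexity regularization.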