## 1. Introduction

[2] This paper is concerned with choosing an appropriate model complexity for predicting the streamflow responses of ungauged catchments at different timescales. At present there is inadequate guidance for choosing, a priori, a model of appropriate complexity to predict streamflow to a specified accuracy, given only the climatic, vegetative, topographic and soil characteristics of a catchment and a given timescale. Provided such a method exists, and a model of appropriate complexity can be chosen, the dominant physical controls on streamflow variability can then be properly investigated. These results can subsequently be used to concentrate modelers' efforts on the best approaches for estimating the required parameter values, with a view to improving the accuracy of predictions and reducing predictive uncertainty.

[3] The motivation for this paper is therefore not the development of a “perfect model”, but a systematic method of model selection that makes tradeoffs between model complexity, accuracy and predictive uncertainty. Although integral to modeling strategies in catchment hydrology [*Garen and Burges*, 1981], these tradeoffs have not previously been explored in a systematic manner. A plethora of models of wide-ranging complexity have been developed [*Chiew et al.*, 1993; *Singh*, 1995], but there is no consistent method available to select the most appropriate model for a given set of catchment conditions. *Chiew et al.* [1993] compared the relative accuracy of six predefined calibration-based models of increasing complexity, tested on a number of catchments with different climatic conditions, and concluded that the most accurate model was the most complex one, regardless of timescale and catchment characteristics. Although simpler models were reported to make sufficiently accurate predictions in some cases, the links between climate, timescale and catchment characteristics on the one hand, and model accuracy and required model complexity on the other, were not explored in any detail.

[4] Conceptual models that depend on calibration for their parameter values, such as the ones studied by *Chiew et al.* [1993], are used extensively in catchment hydrology. However, *Klemes* [1983] suggests that a common weakness of many calibration-based models is structural arbitrariness and overparameterization. Because such models require the estimation of many parameters with little or no physical meaning, there is often no means of estimating them all a priori. Consequently, there is a large associated parameter uncertainty which, when propagated through the model, produces output errors too large to place confidence in the predictions. Because these calibrated model parameters often cannot be linked to physically meaningful or measurable climate and landscape properties, the links between catchment characteristics, climate and timescale, and predictive uncertainty, could not be established, and hence the tradeoff between model complexity, accuracy and predictive uncertainty could not be fully investigated. This has also inhibited the identification of the dominant physical controls on streamflow variability at various time and space scales.

[5] The approach to model development adopted in this paper is inspired by the downward approach outlined by *Klemes* [1983]. Starting with the simplest model possible, model complexity is systematically increased in direct response to demonstrated deficiencies in the model predictions, and the process is continued until the required accuracy is achieved. For water balance modeling, the modeler often finds that a relatively simple model is sufficient to make predictions at large timescales, such as the annual timescale, whereas model complexity needs to increase as the timescale decreases. The starting model used in this paper is a variant of the Manabe bucket model [*Manabe*, 1969; *Milly*, 1994]. It incorporates simple but physically meaningful representations of hydrological processes, and requires parameter values and climatic inputs, most of which can realistically be derived from a priori field measurements. Only minimal calibration is required before simulations are run and predictions made: two of these parameters are estimated from selected streamflow records for a small number of storm events (<10), each of approximately 5–10 days duration.
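
The downward procedure described above can be sketched as a loop over candidate models ordered from simplest to most complex, stopping at the first one that meets a target accuracy. The sketch below is illustrative only and is not the formulation used in this paper: the function names (`bucket_model`, `select_simplest_adequate`), the linear evaporation closure, the runoff-coefficient baseline, the RMSE criterion, the tolerance and the synthetic forcing data are all assumptions made for the example.

```python
import math


def bucket_model(precip, pot_evap, capacity, storage=0.0):
    """Hypothetical single-bucket water balance; returns runoff per time step."""
    runoff = []
    for p, ep in zip(precip, pot_evap):
        # Assumed closure: evaporation scales with relative bucket wetness.
        evap = min(storage, ep * storage / capacity)
        storage = storage - evap + p
        # Saturation excess: water above the bucket capacity leaves as runoff.
        q = max(0.0, storage - capacity)
        storage -= q
        runoff.append(q)
    return runoff


def rmse(simulated, observed):
    """Root-mean-square error between two equal-length series."""
    squared = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return math.sqrt(squared / len(observed))


def select_simplest_adequate(candidates, observed, tolerance):
    """Return (name, error) of the first candidate whose RMSE <= tolerance.

    candidates: list of (name, zero-argument callable), ordered simplest first.
    Returns (None, last_error) if no candidate is accurate enough.
    """
    error = None
    for name, run in candidates:
        error = rmse(run(), observed)
        if error <= tolerance:
            return name, error
    return None, error


# Synthetic forcing (assumed values, in mm per time step).
precip = [50.0, 0.0, 80.0, 20.0]
pot_evap = [2.0, 2.0, 2.0, 2.0]

# Synthetic "observations", generated here by the bucket model itself.
observed = bucket_model(precip, pot_evap, capacity=100.0)

candidates = [
    # Simplest: runoff is a fixed fraction of precipitation (assumed baseline).
    ("runoff coefficient", lambda: [0.3 * p for p in precip]),
    # Next level of complexity: a single storage with a finite capacity.
    ("single bucket", lambda: bucket_model(precip, pot_evap, capacity=100.0)),
]

chosen, err = select_simplest_adequate(candidates, observed, tolerance=1.0)
print(chosen, err)
```

Because the synthetic observations come from the bucket model, the loop rejects the runoff-coefficient baseline and stops at the single bucket. With real streamflow records, further candidates would be appended to the list only in response to identified deficiencies in the preceding model.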

[6] The ultimate extent of the model evolution reached in the downward approach presented here depends on the tradeoff between model complexity, accuracy and predictive uncertainty, all of which are assessed using systematic sensitivity and Monte Carlo error analyses. These analyses identify the dominant physical controls on streamflow variability at each timescale, and produce estimates of predictive accuracy and uncertainty, and hence of the degree of model complexity required or available to capture observed streamflow variability. Between-catchment differences in model predictions are correlated with physically measurable catchment properties and climatic conditions. Although the model applied in this paper is specific to New Zealand catchments, the approach has also been applied with considerable success to catchments in other climatic and physiographic regions [*Jothityangkoon et al.*, 2001; *Eder et al.*, 2002; *Farmer et al.*, 2002], and to other types of models as well. Based on the correlations obtained in all of these studies (including the present one), a qualitative relationship is formulated between climatic indices, timescale and required model complexity, which can provide useful guidance for model selection, development and inter-comparison in other regions.
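
As a rough illustration of how a Monte Carlo error analysis propagates parameter uncertainty into predictive uncertainty, the sketch below repeatedly samples one uncertain parameter, reruns a simple bucket model, and summarizes the spread of the predicted total runoff. It is a minimal sketch under stated assumptions, not the analysis performed in this paper: the function names (`total_runoff`, `monte_carlo_runoff`), the Gaussian error model for the bucket capacity, the 5th to 95th percentile band and the forcing values are all hypothetical.

```python
import random
import statistics


def total_runoff(precip, pot_evap, capacity):
    """Total runoff from a hypothetical single-bucket water balance."""
    storage, total = 0.0, 0.0
    for p, ep in zip(precip, pot_evap):
        # Assumed closure: evaporation scales with relative bucket wetness.
        evap = min(storage, ep * storage / capacity)
        storage = storage - evap + p
        # Saturation excess leaves the bucket as runoff.
        q = max(0.0, storage - capacity)
        storage -= q
        total += q
    return total


def monte_carlo_runoff(precip, pot_evap, capacity_mean, capacity_sd,
                       n_samples=1000, seed=0):
    """Return (mean, 5th percentile, 95th percentile) of total runoff."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = []
    for _ in range(n_samples):
        # Assumed error model: Gaussian capacity, floored to stay positive.
        capacity = max(1.0, rng.gauss(capacity_mean, capacity_sd))
        totals.append(total_runoff(precip, pot_evap, capacity))
    totals.sort()
    lower = totals[int(0.05 * (n_samples - 1))]
    upper = totals[int(0.95 * (n_samples - 1))]
    return statistics.mean(totals), lower, upper


# Synthetic forcing (assumed values, in mm per time step).
precip = [50.0, 0.0, 80.0, 20.0]
pot_evap = [2.0, 2.0, 2.0, 2.0]

mean_q, low_q, high_q = monte_carlo_runoff(
    precip, pot_evap, capacity_mean=100.0, capacity_sd=15.0)
print(mean_q, low_q, high_q)
```

The width of the resulting band gives one simple measure of predictive uncertainty. Comparing such bands across models of different complexity, and across timescales, is one way to make the tradeoff between accuracy and uncertainty explicit.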