## 1. Introduction

[2] Conceptual hydrological models commonly operate with several connected stocks representing physical elements in a catchment. Model parameters define the behavior of the various conceptual elements and the way they relate to each other. Because conceptual elements represent averages of various subcatchment processes that contribute to the overall catchment response, model parameters are conceptual representations of abstract watershed characteristics and cannot be obtained from direct measurements. Instead, they have to be determined by calibration, the process of adjusting parameter values until a satisfactory agreement between simulated and observed catchment behavior is obtained [*Sorooshian and Gupta*, 1995].

[3] In manual calibration, parameters are adjusted by trial and error, and the simulated and observed watershed behaviors are compared using visual inspection and different measures of performance. While manual calibration can produce good results, it can be time-consuming and it involves a great deal of subjective judgment.

[4] The shortcomings of manual calibration have motivated the automation of the calibration process. This has transformed the calibration problem into an optimization problem, which consists of determining the set of model parameters that optimizes (maximizes or minimizes) one or more objective functions. Objective functions are single-valued functions of the model parameters that express the agreement between observed and simulated catchment behavior in numerical form.
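As an illustration of what an objective function looks like in practice, the following minimal sketch maps one parameter value to a single number measuring the misfit between observed and simulated flows. Everything in it is an assumption made for illustration: the one-parameter linear reservoir (`simulate_linear_reservoir`), the root-mean-square error as the measure of agreement, the synthetic rainfall series, and the grid search standing in for an automatic optimizer. None of these is taken from the models or algorithms used in this paper.

```python
import math

def simulate_linear_reservoir(rain, k, s0=0.0):
    """Toy one-parameter model (hypothetical): outflow is a fraction k of storage."""
    storage = s0
    flows = []
    for p in rain:
        storage += p          # rainfall enters the store
        q = k * storage       # outflow proportional to storage
        storage -= q
        flows.append(q)
    return flows

def rmse(observed, simulated):
    """Root-mean-square error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def objective(k, rain, observed):
    """Single-valued objective function: depends only on the parameter k."""
    return rmse(observed, simulate_linear_reservoir(rain, k))

# Synthetic data (assumption): "observations" generated with k = 0.3.
rain = [5.0, 0.0, 10.0, 0.0, 0.0, 2.0]
observed = simulate_linear_reservoir(rain, k=0.3)

# Crude automatic calibration: evaluate the objective on a grid and keep the minimizer.
candidates = [i / 100 for i in range(1, 100)]
best_k = min(candidates, key=lambda k: objective(k, rain, observed))
print(best_k)
```

Because the synthetic observations were generated by the same toy model, the search recovers the generating parameter value. With real data the minimum would be nonzero, and its location would depend on which measure of agreement is chosen.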

[5] Single-objective calibration consists of determining the set of model parameters that optimizes a single objective function. Such an approach to model calibration, however, is subject to limitations that restrict its applicability. Calibration based on a single objective function often results in hydrograph representations that are considered unrealistic from the operational hydrologist's point of view. There are two main reasons for this. First, a single objective function may emphasize the errors in simulating some aspects of the observed signal at the expense of others, thereby constraining the calibration to fit certain characteristics of the system response while neglecting others. Second, integrating the residuals into one value may hide or underestimate the information content of the available data, so that not all the information present in the data is captured and exploited. These limitations suggest the need to constrain the calibration process with a larger number of objective functions, leading to a multiobjective view of the calibration problem.

[6] In this paper we compare two multiobjective approaches, representative of different ways of interpreting the calibration process. The first approach refers to the concept of Pareto optimality [*Gupta et al.*, 1998] and consists of calibrating all model parameters simultaneously with respect to a common set of objective functions. The approach results in the determination of a set of Pareto-optimal solutions, reflecting various trade-offs between the calibration objectives. The second is a “stepped” calibration approach [*Hogue et al.*, 2000], which consists of associating model parameters with calibration objectives based on the processes that each parameter is designed to represent and on the role of each process in the overall system response. The parameter sets associated with the different objectives are calibrated in separate stages, reflecting the procedure that is followed by operational hydrologists in manual calibration. The approach provides a single solution that represents a balance between the selected calibration objectives. Our purpose is to demonstrate the principles and implications of each approach in a comparative evaluation. The two approaches are examined in a case study that considers the calibration of two models of different levels of complexity.
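The notion of Pareto optimality used by the first approach can be made concrete with a short sketch. It assumes that every objective is to be minimized and that each candidate parameter set has already been reduced to a tuple of objective values. The function names `dominates` and `pareto_front` and the example scores are hypothetical, and the brute-force filter shown here is not the search algorithm used in this study.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective values (objective 1, objective 2) for five parameter sets.
scores = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(scores)
print(front)  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

The three retained points cannot be ranked against each other without an additional preference: moving from one to another improves one objective only by worsening the other. A stepped approach, by contrast, would return a single point.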

[7] The set of objective functions, the same for the two approaches, is chosen to evaluate model performance with respect to three aspects of the stream hydrograph simulation, namely, low flows, high flows, and lag time of the system. A comparison is made between two model structures with different levels of complexity. Initially, a simple model structure is used and the calibration results are evaluated. On the basis of these results and of hydrological insight into the catchment, the initial model structure is improved by introducing additional processes and components. The calibration procedure is then repeated for the improved model structure. This comparison makes it possible not only to evaluate the performance of multiobjective calibration at different levels of model complexity but also to discuss how the results of the calibration strategies can help in understanding model deficiencies and in guiding model development.
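To give a sense of how three such aspects might be quantified, the sketch below defines one measure for each. These formulations are assumptions chosen for illustration and are not the objective functions defined in this paper: the error on untransformed flows is dominated by peaks, the error on log-transformed flows gives more weight to recessions and base flow, and the lag is taken as the time shift that maximizes an unnormalized cross-correlation. The series `obs` and `sim` are synthetic.

```python
import math

def rmse(a, b):
    """Root-mean-square difference between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def high_flow_error(obs, sim):
    """Assumed high-flow measure: RMSE on raw flows, dominated by peaks."""
    return rmse(obs, sim)

def low_flow_error(obs, sim, eps=1e-6):
    """Assumed low-flow measure: RMSE on log flows (eps avoids log(0))."""
    return rmse([math.log(o + eps) for o in obs],
                [math.log(s + eps) for s in sim])

def lag_steps(obs, sim, max_lag=5):
    """Assumed lag measure: shift of sim (in time steps) that best aligns it
    with obs; a positive value means the simulation responds late."""
    def score(lag):
        return sum(obs[t] * sim[t + lag]
                   for t in range(len(obs)) if 0 <= t + lag < len(sim))
    return max(range(-max_lag, max_lag + 1), key=score)

# Synthetic hydrographs: the simulation has the right shape but peaks one step late.
obs = [1.0, 1.0, 2.0, 8.0, 4.0, 2.0, 1.0, 1.0, 1.0, 1.0]
sim = [1.0, 1.0, 1.0, 2.0, 8.0, 4.0, 2.0, 1.0, 1.0, 1.0]

print(high_flow_error(obs, sim), low_flow_error(obs, sim), lag_steps(obs, sim))
```

In this example the timing error inflates both flow-based measures even though the simulated volumes are correct, which is one reason a separate timing measure can be informative.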