## 1. Introduction and Scope

[2] Hydrologic models often contain parameters that cannot be measured directly and can only be inferred by a trial-and-error (calibration) process that adjusts the parameter values to closely match the input-output behavior of the model to the real system it represents. Traditional calibration procedures, which involve “manual” adjustment of the parameter values, are labor-intensive, and their success is strongly dependent on the experience of the modeler. Automatic methods for model calibration, which seek to take advantage of the speed and power of computers while being objective and relatively easy to implement, have therefore become more popular [e.g., *Boyle et al.*, 2000]. Since the early work reported by *Dawdy and O'Donnell* [1965], automatic calibration procedures have evolved significantly. However, many studies using such methods have reported difficulties in finding unique (global) parameter estimates [*Johnston and Pilgrim*, 1976; *Duan et al.*, 1992; *Sorooshian et al.*, 1993; *Gan and Biftu*, 1996].

[3] Regardless of the methodology used, most hydrologic models suffer from similar difficulties, including the existence of multiple local optima in the parameter space with both small and large domains of attraction (a subregion of the parameter space surrounding a local minimum), discontinuous first derivatives, and curving multidimensional ridges. These considerations inspired *Duan et al.* [1992] to develop a powerful, robust, and efficient global optimization procedure, entitled the shuffled complex evolution (SCE-UA) global optimization algorithm. Numerous case studies have demonstrated that the SCE-UA algorithm is consistent, effective, and efficient in locating the optimal model parameters of a hydrological model [e.g., *Duan et al.*, 1992, 1993; *Sorooshian et al.*, 1993; *Luce and Cundy*, 1994; *Gan and Biftu*, 1996; *Tanakamaru*, 1995; *Kuczera*, 1997; *Hogue et al.*, 2000; *Boyle et al.*, 2000].

[4] While considerable attention has been given to the development of automatic calibration methods which aim to successfully find a single best set of parameter values, much less attention has been given to a realistic assessment of parameter uncertainty in hydrologic models. Estimates of hydrologic model parameters are generally error-prone, because the data used for calibration contain measurement errors and because the model never perfectly represents the system or exactly fits the data. Consequently, it is generally impossible to find a single point in the parameter space associated with good simulations; indeed, there may not even exist a well-defined region in the sense of a compact region interior to the prior parameter space. Although the SCE-UA global optimization algorithm can reliably find the global minimum in the parameter space, it typically remains difficult, if not impossible, to find a unique “best” parameter set whose performance measure differs significantly from other feasible parameter sets within this region. Such poor parameter identifiability may result in considerable uncertainty in the model output and, perhaps more important, make it virtually impossible to relate these parameter values to easily measurable soil or land-surface characteristics [*Schaap et al.*, 1998; *Duan et al.*, 2001; *Vrugt et al.*, 2002].

[5] Only recently have methods for realistic assessment of parameter uncertainty in hydrologic models begun to appear in the literature. These include the use of a multinormal approximation to parameter uncertainty [*Kuczera and Mroczkowski*, 1998], evaluation of likelihood ratios [*Beven and Binley*, 1992], parametric bootstrapping, and Markov Chain Monte Carlo (MCMC) methods [e.g., *Tarantola*, 1987; *Kuczera and Parent*, 1998]. Because traditional statistical theory based on first-order approximations and multinormal distributions is typically unable to cope with the nonlinearity of complex models, MCMC algorithms have become increasingly popular as a class of general purpose approximation methods for problems involving complex inference, search, and optimization [*Gilks et al.*, 1996]. An MCMC method is a stochastic simulation that successively visits solutions in the parameter space with stable frequencies stemming from a fixed probability distribution. A variety of MCMC samplers can be constructed for any given problem by varying the sampling or proposal distribution subject to conditions that ensure convergence to the posterior target distribution. These algorithms originally arose from the field of statistical physics where they were used as models of physical systems that seek a state of minimal free energy. More recently, MCMC algorithms have been used in statistical inference and artificial intelligence [*Geman and Geman*, 1984; *Neal*, 1993].

[6] Recently, *Kuczera and Parent* [1998] used the Metropolis-Hastings algorithm [*Metropolis et al.*, 1953; *Hastings*, 1970], the earliest and most general class of MCMC samplers, in a Bayesian inference framework to describe parameter uncertainty in conceptual catchment models. The Metropolis-Hastings algorithm is the basic building block of classical MCMC methods and requires the choice of a proposal distribution to generate transitions in the Markov Chain. The choice of the proposal distribution determines the explorative capabilities of the sampler and therefore the statistical properties of the Markov Chain and its rate of convergence. If the selected proposal distribution closely approximates the posterior target distribution, the Markov Chain that is sampled will rapidly explore the parameter space, and it will not take long to obtain samples that can be treated as independent realizations of the target distribution of interest. However, a poor choice of the proposal distribution will result in slow convergence of the Markov Chain and an inability to recognize when convergence to a limiting distribution has been achieved. For complex hydrologic models, there is usually very little a priori knowledge available about the location of the high-probability density region within the parameter space. The proposal distribution should therefore express a great deal of initial uncertainty, thereby resulting in slow convergence to the final posterior target distribution (for example, *Beven and Binley* [1992] suggested imposing a uniform distribution over a large rectangle of parameter values). An important challenge therefore is to design MCMC samplers that exhibit fast convergence to the global optimum in the parameter space, while maintaining adequate occupation of the lower posterior probability regions of the parameter space.
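
The sensitivity of the sampler to the proposal distribution can be illustrated with a minimal sketch. The code below is not the implementation of *Kuczera and Parent* [1998]; it is a plain random-walk Metropolis sampler with a symmetric Gaussian proposal, applied to a hypothetical one-dimensional standard normal target. The names `log_target`, `metropolis`, and `step` are assumptions introduced here for illustration only.

```python
import math
import random
import statistics


def log_target(theta):
    """Log-density (up to a constant) of a hypothetical 1-D standard normal target."""
    return -0.5 * theta * theta


def metropolis(log_density, start, step, n_samples, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal of width `step`.

    Returns the sampled chain and the fraction of accepted candidates.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    chain = []
    accepted = 0
    for _ in range(n_samples):
        # Propose a candidate by perturbing the current state.
        candidate = current + rng.gauss(0.0, step)
        candidate_logp = log_density(candidate)
        # Accept with probability min(1, p(candidate) / p(current)).
        accept_prob = math.exp(min(0.0, candidate_logp - current_logp))
        if rng.random() < accept_prob:
            current, current_logp = candidate, candidate_logp
            accepted += 1
        # A rejected candidate repeats the current state in the chain.
        chain.append(current)
    return chain, accepted / n_samples


# Compare a proposal that is too narrow, one of moderate width, and one too wide.
for step in (0.01, 2.5, 100.0):
    chain, rate = metropolis(log_target, start=0.0, step=step, n_samples=5000)
    spread = statistics.pstdev(chain)
    print(f"step={step:7.2f}  acceptance={rate:.3f}  chain std={spread:.3f}")
```

Under these assumptions, a very narrow proposal accepts almost every candidate but moves so little that the chain covers only a small part of the target, whereas a very wide proposal rejects most candidates and the chain stalls. A proposal whose width is comparable to that of the target balances the two effects.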

[7] To improve the search efficiency of MCMC samplers, it seems natural to tune the proposal distribution during the evolution to the posterior target distribution, using information inferred from the sampling history induced by the transitions of the Markov Chain. This paper describes an adaptive MCMC sampler, entitled the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA), which is an effective and efficient evolutionary MCMC sampler. The algorithm, a modification of the original SCE-UA global optimization algorithm developed by *Duan et al.* [1992], operates by merging the strengths of the Metropolis algorithm [*Metropolis et al.*, 1953], controlled random search [*Price*, 1987], competitive evolution [*Holland*, 1975], and complex shuffling [*Duan et al.*, 1992] to continuously update the proposal distribution and evolve the sampler to the posterior target distribution. The stochastic nature of the Metropolis-annealing scheme avoids the tendency of the SCE-UA algorithm to collapse to a single region of attraction (i.e., the global minimum), while information exchange (shuffling) allows biasing the search in favor of better solutions.
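
The general idea of tuning the proposal from the sampling history can be sketched as follows. This is not the SCEM-UA algorithm: it has no complexes, no shuffling, and no competitive evolution. It is a simplified single-chain adaptive Metropolis sampler in which the proposal width is reset to a multiple of the standard deviation of all states visited so far. The target (a normal distribution with mean 3 and standard deviation 2), the scaling factor 2.4, and the names `adaptive_metropolis`, `initial_step`, and `adapt_start` are assumptions made for illustration.

```python
import math
import random


def log_target(theta):
    """Log-density (up to a constant) of a hypothetical normal target, mean 3, sd 2."""
    return -0.5 * ((theta - 3.0) / 2.0) ** 2


def adaptive_metropolis(log_density, start, n_samples,
                        initial_step=0.01, adapt_start=100, seed=0):
    """Random-walk Metropolis whose proposal width is tuned from the chain history.

    Returns the sampled chain and the final proposal width.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    chain = []
    mean, m2 = 0.0, 0.0  # running mean and sum of squared deviations (Welford)
    step = initial_step  # deliberately poor initial proposal width
    for i in range(1, n_samples + 1):
        candidate = current + rng.gauss(0.0, step)
        candidate_logp = log_density(candidate)
        # Standard Metropolis acceptance rule for a symmetric proposal.
        if rng.random() < math.exp(min(0.0, candidate_logp - current_logp)):
            current, current_logp = candidate, candidate_logp
        chain.append(current)
        # Update the running statistics of the sampling history.
        delta = current - mean
        mean += delta / i
        m2 += delta * (current - mean)
        # After a short initial period, scale the proposal to the history.
        if i >= adapt_start:
            history_sd = math.sqrt(m2 / (i - 1))
            step = max(2.4 * history_sd, 1e-6)  # assumed scaling; floor avoids zero width
    return chain, step


chain, final_step = adaptive_metropolis(log_target, start=0.0, n_samples=20000)
print(f"final proposal width: {final_step:.2f}")
print(f"chain mean: {sum(chain) / len(chain):.2f}")
```

Because the width is computed from the entire history rather than a recent window, each new state changes the proposal less and less as the chain lengthens. This is a design choice of the sketch, made so that the adaptation settles down over time.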

[8] This paper is organized as follows. In section 2, we describe the Metropolis-Hastings algorithm and the new SCEM-UA algorithm for estimating the posterior probability distribution of hydrologic model parameters. In section 3, we illustrate the power of both algorithms by means of three case studies of increasing complexity; here we are especially concerned with algorithm efficiency (particularly the number of simulations needed to converge to the stationary posterior distribution). Finally, in section 4, we summarize the methodology and discuss the results.