Evolution of ensemble data assimilation for uncertainty quantification using the particle filter-Markov chain Monte Carlo method

Authors


Corresponding author: H. Moradkhani Department of Civil and Environmental Engineering, Portland State University, Portland, OR 97201, USA. (hamidm@cecs.pdx.edu)

Abstract

[1] Particle filters (PFs) have become popular for assimilation of a wide range of hydrologic variables in recent years. With this increased use, it has become necessary to increase the applicability of this technique for use in complex hydrologic/land surface models and to make these methods more viable for operational probabilistic prediction. To make the PF a more suitable option in these scenarios, it is necessary to improve the reliability of these techniques. Improved reliability in the PF is achieved in this work through an improved parameter search, with the use of variable variance multipliers and Markov Chain Monte Carlo methods. Application of these methods to the PF allows for greater search of the posterior distribution, leading to more complete characterization of the posterior distribution and reducing risk of sample impoverishment. This leads to a PF that is more efficient and provides more reliable predictions. This study introduces the theory behind the proposed algorithm, with application on a hydrologic model. Results from both real and synthetic studies suggest that the proposed filter significantly increases the effectiveness of the PF, with marginal increase in the computational demand for hydrologic prediction.

1. Introduction

1.1. Bayesian Inference

[2] Estimation of hydrologic quantities with computer simulation models has greatly advanced in recent years. Through the realization that uncertainty is persistent in all layers of hydrologic prediction, the problem of streamflow forecasting has been reevaluated by much of the scientific community, leading hydrologists to generate hydrologic forecasts within a probabilistic framework [Najafi et al., 2012; Madadgar et al., 2012]. Most often, this is performed through Bayesian inference. Bayesian methods are attractive in hydrology because they have been proven effective, not only in statistical research but also in applications to hydrologic modeling [Kuczera and Parent, 1998; Marshall et al., 2004; Moradkhani et al., 2005a, 2005b; Kavetski et al., 2006; Bulygina and Gupta, 2010; Renard et al., 2011; DeChant and Moradkhani, 2012]. Although probabilistic estimation in hydrology is commonly based on the Bayesian perspective, the specifics of individual implementations differ greatly. These differences stem from the sources of uncertainty accounted for in the analysis, assumptions about the form of the errors, and whether the estimation is performed within a batch or sequential framework. In the current study, the focus is on combining the strengths of sequential and batch Bayesian methods for improved state-parameter estimation.

[3] Sequential Bayesian estimation, often referred to as data assimilation, is a class of methods that seek to estimate the uncertainty associated with the input-state-output relationships of a given model, at every model evaluation in which an observation of the system, state or output, is available. Of these techniques, currently the ensemble Kalman filter (EnKF) is the most commonly used technique in the hydrologic community [Moradkhani et al., 2005a; Zhou et al., 2006; Hendricks Franssen and Kinzelbach, 2008; DeChant and Moradkhani, 2011a; Leisenring and Moradkhani, 2011; Montzka et al., 2011; Nie et al., 2011; Li et al., 2012; Liu et al., 2012]. The EnKF and its several variants have been widely used throughout the hydrologic literature; however, several studies have highlighted problems owing to the limiting assumptions within this technique [e.g., Moradkhani et al., 2005b; Weerts and El Serafy, 2006; Moradkhani et al., 2006; Salamon and Feyen, 2009; Matgen et al., 2010; Montzka et al., 2011; Plaza et al., 2012; DeChant and Moradkhani, 2012]. Recent research has suggested that the particle filter (PF) is a viable alternative to the EnKF in cases where the underlying assumptions are violated [Moradkhani et al., 2005b; Moradkhani and Sorooshian, 2008; Leisenring and Moradkhani, 2011; DeChant and Moradkhani, 2012; Rings et al., 2010; Plaza et al., 2012]; however, the viability of using the PF in certain applications has been questioned throughout the broader data assimilation literature. These concerns are highlighted in the following sections, and the ways to move forward in hydrologic data assimilation are proposed.

1.2. Bayesian Filtering Effectiveness and Efficiency

[4] Although the PF technique has been shown to be effective in many hydrologic modeling applications, this method has received criticism because of its large computational demand in comparison with EnKF-based approaches [Zhou et al., 2006; van Leeuwen, 2009; Snyder et al., 2008]. Often described as “the curse of dimensionality,” high-dimensional filtering requires a large ensemble size to avoid collapse of the filter, a problem that the PF is more susceptible to than the EnKF. Although the EnKF is better suited to avoid ensemble collapse at lower ensemble sizes than the PF, when the Gaussian error assumption of the EnKF is violated, the performance is suboptimal at all ensemble sizes [DeChant and Moradkhani, 2012]. As the assumption of Gaussian error structure will be violated in nearly all hydrologic applications, the PF can be an attractive alternative.

[5] All PFs are based on the Sequential Importance Sampling (SIS) algorithm [Liu et al., 2001]. Although SIS alone can be an effective PF, it is highly subject to collapse, with only a few of the samples having significant weight. This is referred to as weight degeneration. To avoid this problem, resampling methods have been suggested in the statistical literature. Resampling is the process of replicating ensemble members with significant weight, while discarding samples with insignificant weight, to maintain an effective sample that represents the system probability distribution. These techniques include residual resampling [Liu and Chen, 1998; Douc et al., 2005; Weerts and El Serafy, 2006], multinomial resampling [Douc et al., 2005], weighted random resampling [Leisenring and Moradkhani, 2011], stratified resampling [Hol et al., 2006], and systematic resampling [Moradkhani et al., 2005b]. All of these methods have been proven to be effective for building a posterior density but have small differences in their implementation.

[6] Another potential strategy for improving posterior estimation with the PF is multimodel analysis, through a combination of PF and Bayesian model averaging (BMA) [Parrish et al., 2012]. This method is particularly suited to manage errors resulting from model structural imperfections. Unlike model averaging studies, the current study focuses on posterior estimation within a single-model structure; however, advancements made in this study are compatible with PF and BMA combinations. To improve single-model analysis within filtering, it is necessary to create the most representative posterior distribution possible. This study focuses on enhancing sampling of the posterior with Markov chain Monte Carlo (MCMC) moves.

[7] MCMC refers to several techniques that estimate a posterior density through simulation. Unlike the PF, which is based on the law of large numbers, MCMC is based on ergodic theory and estimates the posterior with a single chain or multiple chains that explore the posterior distribution [Kuczera and Parent, 1998; Marshall et al., 2004; Kavetski et al., 2006; Smith and Marshall, 2008; Vrugt et al., 2009; Jeremiah et al., 2011]. This methodology has complementary benefits to PF techniques and may be used to more efficiently sample from the posterior. Several studies in the statistical literature have suggested using MCMC techniques for rejuvenating particles at each observation time step to improve the diversity of each sample, leading to a more complete characterization of the posterior distribution [Andrieu et al., 2010; Doucet and Johansen, 2009; Kantas et al., 2009]. This study expands on these ideas to make them suitable for application to hydrologic models. Recently, we noticed a parallel study [Vrugt et al., 2012] applying similar methods within the context of hydrologic modeling. That study was accepted for publication while the current study was under review. To avoid confusion with that study, we note that the current study proposes a new adaptation of MCMC to the PF for improving joint state-parameter estimation in an entirely sequential framework in the case of stationary parameters. The idea and preliminary results of the current work were presented by Moradkhani et al. [2010].

2. Theory

2.1. Posterior Inference Using Bayes Law

[8] Through Bayes law, one seeks to estimate the probability distribution of some parameters conditioned on an observation, referred to as the posterior. The posterior distribution is represented by $p(\theta \mid y)$, which is the probability of some parameter θ given an observation y. Given that some prior information is available about the parameters, $p(\theta)$, we may develop the posterior distribution through the normalized product of the prior and a sampling distribution (likelihood), $p(y \mid \theta)$.

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \tag{1}$$

In equation (1), Bayes law reduces the uncertainty about θ by conditioning it on the observation, assuming that the prior and observation data do not provide conflicting information.

2.2. Markov Chain Monte Carlo Method

[9] Assume that we want to estimate the posterior distribution conditioned on a time series of observations $y_{1:T}$, where T is the length of the observation vector. This distribution can be estimated through Monte Carlo simulations of θ and is proportional to the product of the likelihood ($p(y_{1:T} \mid \theta)$) and prior ($p(\theta)$), as shown in equation (2).

$$p(\theta \mid y_{1:T}) \propto p(y_{1:T} \mid \theta)\, p(\theta) \tag{2}$$

Although it would be advantageous to directly sample from this posterior distribution, parameters in most practical situations are too complex for this strategy. As this posterior typically cannot be sampled from directly, MCMC treats each sample as an evolving Markov chain. Each chain is successively iterated through a series of proposal sampling and acceptance/rejection steps. In the proposal-sampling step, the proposed parameters (θp) are sampled from the proposal distribution based on equation (3).

display math

After sampling, the probability of the proposed parameters may then be calculated via equation (2). Based on the probability of the proposed parameters ($p(\theta_p \mid y_{1:T})$) and the current parameter probability ($p(\theta \mid y_{1:T})$), the proposed parameters may either be accepted or rejected with a probability equal to the Metropolis ratio (equation (4)). Note that equation (4) assumes that the proposal distribution in equation (3) is symmetric ($q(\theta_p \mid \theta) = q(\theta \mid \theta_p)$).

$$\alpha = \min\left(1, \frac{p(\theta_p \mid y_{1:T})}{p(\theta \mid y_{1:T})}\right) \tag{4}$$

Acceptance with this probability yields an ergodic Markov chain, which will completely explore the posterior distribution. For the purpose of this study, this introduction to MCMC is sufficient. A complete explanation of MCMC methods can be found in Kuczera and Parent [1998].
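As a minimal illustration of the proposal-sampling and acceptance/rejection steps described above, the following Python sketch runs a random-walk Metropolis chain. The Gaussian proposal, the toy target (Gaussian likelihood with known standard deviation and a flat prior), and all function and variable names are assumptions made for this example; they are not taken from the study.

```python
import math
import random


def metropolis_chain(log_posterior, theta0, step_sd, n_iter, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal.

    Returns the chain (list of floats) and the fraction of accepted proposals.
    """
    theta = theta0
    log_p = log_posterior(theta)
    chain = [theta]
    n_accepted = 0
    for _ in range(n_iter):
        # Proposal-sampling step: symmetric Gaussian jump around the current value.
        theta_p = theta + rng.gauss(0.0, step_sd)
        log_p_prop = log_posterior(theta_p)
        # Metropolis ratio min(1, p(theta_p) / p(theta)), evaluated in log space.
        accept_prob = math.exp(min(0.0, log_p_prop - log_p))
        if rng.random() < accept_prob:
            theta, log_p = theta_p, log_p_prop
            n_accepted += 1
        chain.append(theta)
    return chain, n_accepted / n_iter


# Toy target (assumed): unknown mean of Gaussian observations, flat prior.
OBSERVATIONS = [1.8, 2.1, 2.4, 1.9, 2.3]
OBS_SD = 0.5


def toy_log_posterior(theta):
    return -0.5 * sum(((y - theta) / OBS_SD) ** 2 for y in OBSERVATIONS)


chain, acceptance_rate = metropolis_chain(
    toy_log_posterior, theta0=0.0, step_sd=0.5, n_iter=5000, rng=random.Random(1)
)
burned = chain[1000:]                      # discard the first 1000 samples as burn-in
posterior_mean = sum(burned) / len(burned)
```

For this toy target the chain should settle around the sample mean of the observations (2.1); working with log probabilities avoids numerical underflow when many observations are multiplied together.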

2.3. Sequential Bayesian Estimation of States and Parameters

[10] Assuming that an observation is available at time t, a modeler will be interested in estimating the posterior state ($x_t$) and parameter ($\theta_t$) distributions conditioned on all previous observations ($y_{1:t-1}$) and the current observation ($y_t$). Although the parameters are represented here with a time index, this is included purely for improved readability and understanding of practical implementation of the algorithms. Parameters are assumed to be constant in time, and thus estimation of dynamic parameters is not attempted in this study. The posterior can be estimated according to Bayes law, as shown in equation (5).

$$p(x_t, \theta_t \mid y_{1:t}) = \frac{p(y_t \mid x_t, \theta_t)\, p(x_t, \theta_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})} \tag{5}$$

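In equation (5), $p(x_t, \theta_t \mid y_{1:t-1})$ represents the prior information, $p(y_t \mid x_t, \theta_t)$ represents the likelihood, and $p(y_t \mid y_{1:t-1})$ represents the normalizing constant. As the model is assumed to be Markovian, Bayes law can be applied in a recursive form (equation (5)) through the estimation of the prior distribution via the Chapman-Kolmogorov equation, as shown in equation (6).

$$p(x_t, \theta_t \mid y_{1:t-1}) = \int p(x_t, \theta_t \mid x_{t-1}, \theta_{t-1})\, p(x_{t-1}, \theta_{t-1} \mid y_{1:t-1})\, dx_{t-1}\, d\theta_{t-1} \tag{6}$$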

The prior distribution is estimated through the integration of the transition probability ($p(x_t, \theta_t \mid x_{t-1}, \theta_{t-1})$) and the posterior at the previous time step ($p(x_{t-1}, \theta_{t-1} \mid y_{1:t-1})$). To complete the calculation of the numerator in equation (5), an assumption about the form of the residuals is made to calculate the likelihood. This is typically a normal likelihood function with mean zero and an assumed variance. Last, the normalizing factor must be estimated. Although this value is not readily available, it may be expanded to the integral of the numerator (total probability), according to equation (7), using the states and parameters as intermediate variables.

$$p(y_t \mid y_{1:t-1}) = \int p(y_t \mid x_t, \theta_t)\, p(x_t, \theta_t \mid y_{1:t-1})\, dx_t\, d\theta_t \tag{7}$$

By substituting equations (6) and (7) into equation (5), sequential Bayes law can be developed to compute the posterior distribution sequentially in time, as shown in equation (8).

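$$p(x_t, \theta_t \mid y_{1:t}) = \frac{p(y_t \mid x_t, \theta_t) \int p(x_t, \theta_t \mid x_{t-1}, \theta_{t-1})\, p(x_{t-1}, \theta_{t-1} \mid y_{1:t-1})\, dx_{t-1}\, d\theta_{t-1}}{\int p(y_t \mid x_t, \theta_t)\, p(x_t, \theta_t \mid y_{1:t-1})\, dx_t\, d\theta_t} \tag{8}$$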
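The recursion can be made concrete on a small discrete system, where the integrals reduce to sums. The sketch below is only an illustration: the two-state system, its transition and likelihood values, and the function names are hypothetical and are not part of the study.

```python
def predict(posterior_prev, transition):
    """Chapman-Kolmogorov step: prior[k] = sum_j transition[j][k] * posterior_prev[j]."""
    n = len(posterior_prev)
    return [
        sum(transition[j][k] * posterior_prev[j] for j in range(n))
        for k in range(n)
    ]


def update(prior, likelihood):
    """Bayes update: posterior = likelihood * prior / normalizing constant."""
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(unnormalized)  # total probability of the observation
    return [u / evidence for u in unnormalized], evidence


# Hypothetical two-state system ("dry", "wet") with made-up probabilities.
posterior_prev = [0.5, 0.5]
transition = [[0.9, 0.1],   # P(next state | current = dry)
              [0.3, 0.7]]   # P(next state | current = wet)
likelihood = [0.2, 0.8]     # p(y_t | state) for the current observation

prior = predict(posterior_prev, transition)
posterior, evidence = update(prior, likelihood)
```

Here the prior works out to (0.6, 0.4), the normalizing constant to 0.44, and the posterior to roughly (0.27, 0.73); the posterior then serves as `posterior_prev` at the next time step.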

2.4. Sequential Monte Carlo Using the Particle Filter

2.4.1. Discrete Forward Model and SIS

[11] To understand SIS, it is essential to view the hydrologic model in the state-space framework. This framework assumes that the model is an order one Markov Process. The model progresses forward at each discretized time increment through a series of differential equations represented by inline image in equation (9).

display math

In the above equation, the model is provided with the posterior states from the previous time step for ensemble member i ( inline image), the prior forcing at the current time step for ensemble member i ( inline image), and the prior parameters at the current time step for ensemble member i ( inline image). Given this information and an assumed model error ( inline image), the prior states for ensemble member i ( inline image) can be calculated. In addition to the forward model operator, an observational operator ( inline image) is necessary to translate the current states into the observation space.

display math

where the forecast for ensemble member i ( inline image) is estimated from the prior states and observational operator parameters ( inline image), with an assumed prediction error ( inline image). In equation (10), inline image represents parameters for the observational operator, which may be different from the hydrologic model parameters. The application in this study assumes that the observational operator parameters ( inline image) are contained within the hydrologic model parameter vector or are the same as forward model parameters ( inline image), which is generally the case in hydrologic models.

[12] SIS begins with a Monte Carlo experiment to develop a discrete representation of the prior distribution. During time steps that an observation is available, a posterior is developed to reduce the uncertainty in the system. The filtering posterior from equation (8) is approximated by equation (11).

display math

where inline image represents the ensemble size, inline image is the posterior weight for ensemble member i at time t, and inline image is the Dirac delta function. The first step to estimating the posterior weights is calculating the likelihood. The normalized likelihood is calculated for each ensemble member i according to equation (13), with a sampled observation, as shown in equation (12).

display math
display math

In equation (12), inline image is the observed value at time t and inline image is the perturbed observation with assumed error ( inline image). Equation (13) assumes statistical properties ( inline image) of the model prediction residuals ( inline image) to develop the probability of the observation through a likelihood function inline image. This probability may then be applied to equation (8), along with the prior density, to develop the posterior distribution. In discrete form, this calculation can be shown by equation (14) to obtain posterior weights applied to each ensemble member.

display math

In equation (14), inline image is the prior weight, which is equal to the posterior weight at the previous time step. At this point, the modeler has a weighted sample of model realizations. From this sample, information about the continuous posterior can be estimated. Of particular interest may be the expected value of a given state, which is shown in equation (15).

display math
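The weighting and expected-value steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a zero-mean Gaussian likelihood with an assumed standard deviation, the raw observation in place of a perturbed observation sample, and a hypothetical four-member ensemble; the function names are not from the study.

```python
import math


def gaussian_likelihood(observation, prediction, sigma):
    """Normal likelihood of the residual (mean zero, assumed standard deviation)."""
    z = (observation - prediction) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def update_weights(prior_weights, predictions, observation, sigma):
    """Posterior weight is proportional to prior weight times likelihood, then normalized."""
    unnormalized = [
        w * gaussian_likelihood(observation, p, sigma)
        for w, p in zip(prior_weights, predictions)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("all likelihoods underflowed; the ensemble has collapsed")
    return [u / total for u in unnormalized]


def weighted_mean(values, weights):
    """Expected value of a state from the weighted ensemble."""
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical four-member ensemble of predicted streamflow.
predictions = [8.0, 10.0, 12.0, 20.0]
prior_weights = [0.25, 0.25, 0.25, 0.25]
posterior_weights = update_weights(prior_weights, predictions, observation=10.5, sigma=1.5)
expected_flow = weighted_mean(predictions, posterior_weights)
```

In this example the member predicting 20.0 receives a negligible weight, which is the first sign of the weight degeneration discussed in the next section.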

2.4.2. Resampling Algorithms

[13] As SIS strictly weights and updates the weights of the discrete samples, the necessary sample size scales exponentially with both the degrees of freedom in the system and the length of the observation period. If the total uncertainty in the system becomes too large, there will be too few samples with meaningful weights, leading to collapse of the ensemble, or weight degeneration. Occurrence of weight degeneration can be examined by calculating the effective sample size, as shown in equation (16). Typically, a threshold for the minimum effective sample size is set, which indicates the occurrence of degeneration.

$$N_{\mathrm{eff}} = \frac{1}{\sum_{i=1}^{N} \left(w_t^{i+}\right)^2} \tag{16}$$

[14] Resampling algorithms are capable of removing problems of weight degeneration, but may have problems with insufficient representativeness. Although the ensemble members can be resampled to meaningful locations within the state-parameter space, the posterior may still incompletely represent the uncertainty in the system, leading to partial or full collapse of the ensemble, referred to as sample impoverishment. This case is a problem with basic resampling PFs, as depicted in the resampling step (step 2) of Figure 1. Note that several of the ensemble members are at the same value, leading to incomplete representation of the posterior. A number of techniques may be applied to achieve higher variability in ensemble members, including small random noise [Moradkhani et al., 2005b; Salamon and Feyen, 2009], kernel smoothing [Moradkhani et al., 2005a], and MCMC methods [Andrieu et al., 2010].
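The following sketch shows how the effective sample size flags degeneration and how resampling then produces duplicated members. Systematic resampling is used here only as one of the schemes cited earlier; the weights are hypothetical and the function names are assumptions for this example.

```python
import random


def effective_sample_size(weights):
    """Effective sample size: 1 / sum of squared normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng):
    """Return indices of the members kept by systematic resampling."""
    n = len(weights)
    offset = rng.random() / n          # single random offset in [0, 1/n)
    indices = []
    i, cumulative = 0, weights[0]
    for k in range(n):
        u = offset + k / n             # evenly spaced pointers
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


weights = [0.97, 0.01, 0.01, 0.01]     # hypothetical, nearly degenerate weights
n_eff = effective_sample_size(weights)
kept = systematic_resample(weights, random.Random(0))
n_unique = len(set(kept))              # few unique members: sample impoverishment
```

With these weights the effective sample size is close to one out of four members, and after resampling at least three of the four members are copies of the first, which is why a move step is needed to restore diversity.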

Figure 1.

Visualization of the proposed algorithm.

3. Proposed Methodology

3.1. Algorithm Description

[15] This study proposes a new approach to sequential state-parameter estimation, in the case of stationary parameters, motivated by distinctive features of the PF and MCMC. This method was developed to reduce the occurrence of sample impoverishment in Sequential Importance Resampling (SIR), while considering the multidimensional correlation structure between the parameters and state variables. The main concern with the SIR algorithm is the treatment of parameters. Although basic resampling of states appears to be sufficient because states are dynamic quantities, parameter moves must be applied after resampling to maintain diversity throughout the ensemble and to converge to the correct parameter distribution. The SIR method proposed by Moradkhani et al. [2005b] handles this problem by adding a small error term to the parameters after each resampling step, as shown in equation (17); alternatively, kernel smoothing of the parameters may be used, as described in Moradkhani et al. [2005a].

$$\theta_{t+1}^{i-} = \theta_t^{i+} + \varepsilon_t^{i}, \qquad \varepsilon_t^{i} \sim N\left(0,\; s\,\mathrm{Var}(\theta_t^{i-})\right) \tag{17}$$

where $\varepsilon_t^{i}$ represents a random sample from the Gaussian distribution with mean 0 and variance $s\,\mathrm{Var}(\theta_t^{i-})$, where $\mathrm{Var}(\theta_t^{i-})$ is the variance of the prior parameters at the current time step and s is a small tuning parameter. As this method adds noise to the resampled parameters prior to moving to the next time step, it is essential to avoid overdispersing the parameters, or significantly changing the distribution, while applying enough noise to allow for adequate diversity within the ensemble. Previous work with this algorithm has applied s values between 0.005 and 0.025 to achieve this [DeChant and Moradkhani, 2011a; Leisenring and Moradkhani, 2011; DeChant and Moradkhani, 2012]. Although some success has been found with this method, larger moves are desirable to allow for maximum search of the posterior. To achieve this in a systematic way, MCMC steps are adapted to this framework. The benefit of using MCMC moves is that larger noise values can be used, and the Metropolis acceptance ratio is applied to avoid moving outside the filtering posterior. The application of this acceptance criterion is shown schematically in steps 4 and 5 of Figure 1.
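The perturbation of equation (17) can be sketched as follows for a single parameter. The unweighted ensemble variance, the hypothetical parameter values, and the function name are assumptions of this example rather than details confirmed by the study.

```python
import math
import random


def perturb_parameters(resampled, prior_params, s, rng):
    """Add Gaussian noise with variance s * Var(prior parameters) to each resampled value."""
    n = len(prior_params)
    mean = sum(prior_params) / n
    prior_var = sum((p - mean) ** 2 for p in prior_params) / n  # unweighted variance (assumed)
    noise_sd = math.sqrt(s * prior_var)
    return [theta + rng.gauss(0.0, noise_sd) for theta in resampled]


# Hypothetical prior ensemble of one parameter and its resampled (duplicated) version.
prior_params = [0.20, 0.35, 0.50, 0.65, 0.80]
resampled = [0.50, 0.50, 0.50, 0.65, 0.65]
perturbed = perturb_parameters(resampled, prior_params, s=0.01, rng=random.Random(42))
```

After the perturbation the duplicated members are distinct again, but with a small s they stay close to their resampled values; this is the limitation that motivates the larger, accept/reject-controlled moves.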

[16] Achieving effective use of MCMC requires a few considerations. First, creation of proposal parameters must be well adapted to the problem, maximizing the efficiency of these moves. Here, we suggest that this can be achieved with equation (17), assuming that the s value is properly chosen. This technique is compared with the popular differential evolution method in a following study (C. DeChant and H. Moradkhani, manuscript in preparation). Second, we must develop a method to calculate the parameter probability that includes all prior information, which is an issue that is not explicitly managed in Vrugt et al. [2012]. In the current study, a probability distribution is fit to the filtering posterior parameters at the previous time step to allow for estimation of the full posterior distribution. A smoothing methodology may also be used to manage prior parameter probability by retaining/calculating all parameter trajectories; however, this is avoided here as the focus is improvement of sequential data assimilation and maintaining minimal computational demand. Third, the proposed algorithm only resamples at time steps in which the effective sample size drops below a given threshold, which is set to inline image in this application. By setting a resampling threshold, the algorithm becomes much more efficient. To explain the movement of data through the algorithm, a flowchart is provided in Figure 2.

Figure 2.

Flowchart of the PF-MCMC algorithm.

[17] In Figures 1 and 2, all steps in the filtering methodology with MCMC moves are illustrated. The first two steps are the basic SIS and resampling methods. After resampling, it becomes necessary to add a move step, creating a proposal distribution (step 3). An effective search allows for larger moves, but larger moves require the ability to reject parameter samples that move outside the filtering posterior distribution ( inline image), thus ensuring that parameters do not diverge. To determine whether to accept a proposed parameter, the probabilities of the resampled parameters ( inline image) and proposed parameters ( inline image) must be calculated (step 4). The probability of the proposed joint state parameters, inline image, is calculated according to equation (18), and the probability of the resampled state parameters is calculated similarly.

display math

where inline image is a sample from the proposal state distribution at the current time step, inline image is a sample from the proposal parameter distribution, inline image is the current observation sample, and inline image represents all past observations. Note that in step 4 of Figure 2, the proposal initial states are a function of the posterior states from the previous time step ( inline image), the posterior forcing ( inline image), and the proposal parameters, which is a key difference between the current work and Vrugt et al. [2012]. Adjusting states within the MCMC moves, in addition to parameters, as presented by Vrugt et al. [2012], raises the question of whether the water balance in the model is preserved. The method proposed here retains the water balance and leads to the case that inline image, thus eliminating the need to estimate the proposal state probability. In addition, inline image is calculated based on the same likelihood function used in equation (13). To calculate the proposal parameter probability, inline image, an assumption must be made about the prior parameter distribution (filtering posterior at previous time step) to estimate inline image. The prior parameters are assumed to fit marginal Gaussian distributions with mean inline image and variance inline image. Although a joint distribution would be preferred in this scenario, marginal priors are selected because the parameters have nonlinear relationships, and thus have a joint distribution that is difficult to fit. To calculate the prior probability based on the Gaussian distribution, weighted mean and variance values of the filtering posterior must be calculated. Mean and variance values are calculated as follows:

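$$\mu_{\theta,t-1} = \sum_{i=1}^{N} w_{t-1}^{i+}\, \theta_{t-1}^{i+} \tag{19}$$
$$\sigma_{\theta,t-1}^{2} = \sum_{i=1}^{N} w_{t-1}^{i+} \left(\theta_{t-1}^{i+} - \mu_{\theta,t-1}\right)^{2} \tag{20}$$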

With the mean and variance of the parameters, it is possible to calculate the prior probability of the proposal parameters, based on the filtering posterior at the previous time step, and subsequently calculate the posterior proposal parameter probability. The proposal and resampled parameters are then compared via the metropolis acceptance ratio to determine the acceptance probability in equation (21).

display math

The Metropolis Algorithm is acceptable because the proposal distribution is assumed to be symmetric, as it is sampled from a Gaussian distribution. Acceptance of the new parameters is shown in step 5 of Figure 1. Through this acceptance/rejection step, the algorithm ensures that the parameters remain in the filtering posterior density, as shown in step 6 of Figure 1. After a single iteration, the algorithm moves to the next time step. Though several iterations could be performed, one is suggested in this study to remain similar in computational demand to the method of Moradkhani et al. [2005b]. It is assumed that one iteration is sufficient for three reasons. First, the algorithm is well informed as to the correct jump distance for parameters, based on prior ensemble properties, thus moves should be very efficient. Second, a large number of chains are used, one from each ensemble member, allowing for effective characterization of the posterior after one iteration. Last, the algorithm is performed over a long data set, which allows for many resample-move steps to reach the correct posterior parameters.
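A single resample-move iteration of this kind can be sketched as below for one parameter. This is a simplified illustration, not the study's implementation: the linear-reservoir `toy_model`, the use of the raw observation rather than an observation sample, the jump variance taken as s times the fitted prior variance, and all names and numbers are assumptions.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def toy_model(storage_prev, forcing, k):
    """Hypothetical linear reservoir: new storage after inflow and outflow k * storage."""
    return storage_prev + forcing - k * storage_prev


def mcmc_move(states_prev, params, prior_mean, prior_var, forcing,
              observation, obs_sd, s, rng):
    """One Metropolis move per resampled ensemble member.

    The proposed state is obtained by rerunning the model from the previous
    posterior state with the proposed parameter, so only the likelihood and the
    Gaussian parameter prior enter the acceptance ratio.
    """
    jump_sd = math.sqrt(s * prior_var)
    new_states, new_params, n_accepted = [], [], 0
    for x_prev, theta in zip(states_prev, params):
        theta_p = theta + rng.gauss(0.0, jump_sd)            # symmetric proposal
        x_cur = toy_model(x_prev, forcing, theta)
        x_prop = toy_model(x_prev, forcing, theta_p)
        log_p_cur = (log_normal_pdf(observation, x_cur, obs_sd ** 2)
                     + log_normal_pdf(theta, prior_mean, prior_var))
        log_p_prop = (log_normal_pdf(observation, x_prop, obs_sd ** 2)
                      + log_normal_pdf(theta_p, prior_mean, prior_var))
        # Metropolis acceptance with probability min(1, ratio), in log space.
        if rng.random() < math.exp(min(0.0, log_p_prop - log_p_cur)):
            theta, x_cur = theta_p, x_prop
            n_accepted += 1
        new_params.append(theta)
        new_states.append(x_cur)
    return new_states, new_params, n_accepted


# Hypothetical resampled ensemble with duplicated members.
states_prev = [10.0, 10.0, 10.0, 12.0, 12.0]
params = [0.30, 0.30, 0.30, 0.25, 0.25]
new_states, new_params, n_accepted = mcmc_move(
    states_prev, params, prior_mean=0.28, prior_var=0.002, forcing=2.0,
    observation=9.1, obs_sd=0.5, s=0.5, rng=random.Random(7),
)
```

Because every returned state is recomputed from the previous state with the parameter that was finally kept, each member remains consistent with the model's storage balance whether or not its proposal was accepted.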

3.2. Jump Rate Tuning with Variable Variance Multiplier

[18] Effective implementation of any MCMC algorithm requires well-adapted jump rates to effectively search the posterior distribution. Optimal jump rates for a Gaussian proposal distribution were suggested by Roberts and Rosenthal [2001] as follows:

display math

In equation (22), the jump rate is a function of the number of dimensions (d), leading to an acceptance rate of about 28% for the five-dimensional hydrologic model (HyMOD) examined in this study. Such a method for jump rate estimation is adopted in Vrugt et al. [2012], even though they acknowledge that this may not be optimal in a sequential framework. It is suggested here that higher acceptance rates are beneficial in sequential estimation because high sample diversity is essential to representing the posterior with a minimal sample size. As the optimal acceptance rate for sequential estimation is unknown, it would be convenient if this value could be estimated automatically. In a parallel study, Leisenring and Moradkhani [2012] suggested a method of sequential scale factor estimation in the SIR algorithm called variable variance multipliers (VVM). VVM may be used to sequentially find the most fitting variance scaling factor in the PF-SIR and PF-MCMC algorithms. In this study, VVM are adapted to the PF-MCMC framework to tune the jump rates automatically. It is, however, noted that there are some minor differences between the VVM methodology in Leisenring and Moradkhani [2012] and the methodology here.

display math
display math
display math
display math

In the above equations, inline image is the forecast expected value, inline image is the observation, inline image and inline image correspond to the 75% and 25% forecast quantiles (interquartile range), respectively, and $s_t$ is the scale factor at the current time step. The application of VVM in this study assumes that the median ratio of the absolute error ( inline image) to one half the width of the interquartile range ( inline image), over some predefined lag time, should be close to one. Leisenring and Moradkhani [2012] found that inline image is optimally calculated from the 95% predictive bounds, which differs from the implementation proposed in this study. This adjustment to the methodology is due to differences in complexity of the modeling framework and differences in observed data between the two studies. Regardless of the predictive bounds used to calculate inline image, a inline image value larger than one indicates a need to increase the parameter spread, as the predictive distribution is too narrow, and a inline image value less than one indicates the parameter spread must be decreased, as the predictive distribution is too wide. To implement this algorithm within the HyMOD model, it is helpful to use some smoothing value (τ), which is set to 0.5 in this study, and a maximum jump between each time step, set to 0.05 in this study, to avoid overadjustment of scale factors. Each of these values was tuned to the model and data of this study, based on reliability metrics, and may need to be examined further when applied to different models/data sets. A running median of the VVM is necessary with the data in this study because outliers in the streamflow residuals inflate the mean. Here, a running median of the previous 100 time steps is used. It is important to note that a random walk is developed at each time step with the application of VVM to PF-MCMC. Although one iteration is used at each time step in this application, the estimated multiplier and variance would remain constant over subsequent iterations at a given time step, maintaining an invariant posterior distribution.
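The tuning logic described above can be illustrated with a short Python sketch. Only the diagnostic ratio (absolute error over half the interquartile range), the running median, the smoothing value, and the capped jump come from the text; the specific update rule in `update_scale` is a hypothetical form chosen for illustration and should not be read as the study's equations.

```python
import statistics


def spread_ratio(observation, forecast_mean, q25, q75):
    """Absolute forecast error divided by half the interquartile range."""
    return abs(forecast_mean - observation) / (0.5 * (q75 - q25))


def update_scale(s_prev, ratios, tau=0.5, max_jump=0.05, lag=100):
    """Illustrative scale-factor update (assumed form, not the study's equations)."""
    median_ratio = statistics.median(ratios[-lag:])   # running median over the lag window
    target = s_prev * median_ratio                    # widen if ratio > 1, narrow if < 1
    change = tau * (target - s_prev)                  # smoothed adjustment
    change = max(-max_jump, min(max_jump, change))    # cap the jump per time step
    return s_prev + change


# Hypothetical histories of the diagnostic ratio.
s_wider = update_scale(0.1, [2.0] * 5)      # forecasts too narrow: scale grows
s_narrower = update_scale(0.1, [0.5] * 5)   # forecasts too wide: scale shrinks
s_same = update_scale(0.1, [1.0] * 5)       # well calibrated: scale unchanged
```

The median keeps a few large streamflow residuals from dominating the adjustment, and the cap limits how far the scale factor can move in one time step. Note that `spread_ratio` is undefined when the two quantiles coincide, which a practical implementation would need to guard against.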

4. Case Studies

[19] In this study, both a synthetic and a real experiment were performed to compare the ability of the proposed methodology and the original SIR methodology to estimate the posterior. In addition to the original SIR algorithm, an algorithm using VVM to determine the correct scaling factor for SIR is also compared in the analysis for consistency; however, this method is implemented only when the effective sample size indicates resampling should be performed to remain computationally consistent with the original SIR algorithm. Both the synthetic and real experiments are performed on the HyMOD model. Throughout the analysis, the method of Moradkhani et al. [2005b] will be referred to as PF-SIR, SIR with VVM will be referred to as PF-SIRV, and the proposed methodology will be referred to as PF-MCMC.

[20] In these experiments, several performance measures will be used, most of which are described in DeChant and Moradkhani [2011b, 2012] (i.e., Nash-Sutcliffe efficiency (NSE), predictive quantile-quantile (QQ) plot, reliability, and sharpness). Note that in the current study, the reliability and sharpness metrics are represented with ϕ and ε, respectively; however, in DeChant and Moradkhani [2012], these measures were α and π. This change in notation is made to avoid confusion with the notation in the description of MCMC techniques. In addition, a new probabilistic verification measure is proposed here, referred to as confidence, and is shown in equations (27)-(30). In these equations, $z_t$ is the quantile of the predictive distribution in which the observation is located at time t, $P_{1,i}$ and $P_{2,i}$ represent the ith upper and lower quantiles, $W_i$ is the frequency that the observation falls between the ith predictive bounds, and C is the confidence value. A positive C value indicates overconfidence (too little spread), and a negative C value indicates underconfidence (too much spread).

display math
display math
display math
display math

4.1. Time-Lagged Replicates

[21] Robust analysis of hydrologic data assimilation techniques requires repeated experiments over multiple different flow regimes. To achieve this, DeChant and Moradkhani [2012] proposed breaking a 40 year data set from Leaf River, Mississippi, into multiple different time periods. This allows for multiple calibrations of the model in years with different streamflow characteristics. Furthermore, a validation of the posterior parameters from each calibration is performed on a separate time period. Examining the accuracy of posterior parameters over a separate validation time period allows for independent analysis of the calibrated parameters. Following the methods presented in DeChant and Moradkhani [2012], this study performs 21 calibration replicates for HyMOD, with the starting date of each replicate separated by 500 time steps. Each calibration is performed over 2000 days with the latter 1000 days used for calculation of performance measures, providing 1000 days for the parameters to converge to a reasonable distribution. After calibration with real data, validation is performed on a separate 2000 day time period, using the posterior parameters from the last time step of each calibration and state estimation via the PF. A validation experiment is not necessary on the synthetic analysis because the convergence to the true parameters can be directly evaluated.

4.2. Synthetic Study

[22] A synthetic study was chosen to illustrate the ability of the algorithms to estimate predefined parameters in the presence of known errors. This is performed according to the framework of Moradkhani [2008]. Model states and outputs are generated with predefined parameters and observed forcing data, which are assumed to be the true values. Filtering is then performed on this synthetic data set, and convergence of the parameters to the predefined values is evaluated. In this experiment, precipitation is assumed to follow a lognormal distribution with a 25% relative error, potential evapotranspiration is assumed to follow a normal distribution with a 25% relative error, and streamflow observations are assumed to follow a normal distribution with a 15% relative error. These error assumptions are chosen based on DeChant and Moradkhani [2012].
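
The error assumptions above can be illustrated with a short sampling sketch. The text does not specify how the lognormal precipitation error is parameterized; the sketch assumes a multiplicative lognormal factor with unit mean and a coefficient of variation of 0.25, and it does not truncate negative draws from the normal distributions. The function names are hypothetical and this is not the code used in the study.

```python
import numpy as np

def perturb_forcing(precip, pet, n_members, rng, rel_err_p=0.25, rel_err_pet=0.25):
    """Generate forcing ensembles of shape (T, n_members).

    Precipitation: multiplicative lognormal noise with mean 1 and CV rel_err_p
                   (this parameterization is an assumption of the sketch).
    PET: additive normal noise with standard deviation rel_err_pet * value.
    """
    precip = np.asarray(precip, dtype=float)
    pet = np.asarray(pet, dtype=float)
    # Parameters of the underlying normal so that the factor has mean 1, CV rel_err_p.
    sigma = np.sqrt(np.log(1.0 + rel_err_p ** 2))
    mu = -0.5 * sigma ** 2
    factor = rng.lognormal(mu, sigma, size=(precip.size, n_members))
    p_ens = precip[:, None] * factor
    noise = rng.standard_normal((pet.size, n_members))
    pet_ens = pet[:, None] + rel_err_pet * pet[:, None] * noise
    return p_ens, pet_ens

def synthetic_observations(true_flow, rng, rel_err=0.15):
    """Add normal noise with standard deviation rel_err * flow to the true streamflow."""
    true_flow = np.asarray(true_flow, dtype=float)
    return true_flow + rel_err * true_flow * rng.standard_normal(true_flow.size)
```

A call such as `perturb_forcing(precip, pet, 100, np.random.default_rng(0))` would produce 100-member forcing ensembles; days with zero precipitation remain zero under the multiplicative error.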

[23] This study is performed with the HyMOD model, which is a parsimonious, conceptual, lumped hydrologic model, originally developed by Boyle et al. [2000]. The model contains five state variables and five parameters: α, Bexp, Cmax, Rs, and Rq. Accurate convergence to predefined parameters is examined with the synthetic case. Note that parameters are assumed to be constant, as was stated in section 2.3. For a more detailed description of the model processes, see Moradkhani et al. [2005b].

[24] Comparison of the convergence of each parameter from the three filters is shown in Figure 3. From Figure 3, it is observed that the Rq, Bexp, and Cmax parameters are very identifiable, as was suggested by Moradkhani et al. [2005b], and all methods estimate them accurately. Unlike these three parameters, α and Rs are less identifiable. Although all methods appear to perform similarly in locating the proper parameters, small differences can be observed in the behavior of the techniques. In examining the 95% and interquartile bounds for each method, the PF-SIRV and PF-MCMC allow for more movement in the quantiles than the PF-SIR. In fact, the PF-SIRV and PF-MCMC perform much larger and time-varying adjustments, leading to a larger parameter search area. An increase in parameter search area is preferable because it will reduce the chance of sample impoverishment, particularly at lower ensemble sizes. Further evidence that PF-SIRV and PF-MCMC are more robust against sample impoverishment is suggested in the analysis of the performance measures.

Figure 3.

Streamflow prediction and convergence of the parameter distributions for the PF-SIR, PF-SIRV, and PF-MCMC for the synthetic experiment designed according to Moradkhani [2008] over 18 months. The upper panels show streamflow estimation with 95% predictive bounds for the PF-SIR, PF-SIRV, and PF-MCMC; the black line is the expected value and the dots are the observations. The lower panels show the parameter evolution for the above three methods. The light gray region is the 95% bounds, the dark gray is the interquartile range, the black line is the mean value, and the black triangle is the predefined parameter value.

[25] Comparison of the ensemble prediction from each method is provided in Figure 4. In this figure, the predictive QQ plot is generated through a combination of all 21 model runs. Lumping results from 21 model runs ensures that random fluctuations in performance are averaged out, allowing for more reliable analysis. These 21 replicates were performed over four different ensemble sizes (50, 100, 300, and 500) to highlight the difference in performance with respect to ensemble size. In general, a trend toward increasing reliability and decreasing overconfidence is observed with increasing ensemble size. In comparison of all three filters, the results suggest that the PF-MCMC and PF-SIRV are more robust in avoiding sample impoverishment than the PF-SIR, especially at low ensemble sizes. Both PF-SIRV and PF-MCMC provide an accurate representation of uncertainty when implemented with more than 100 ensemble members, whereas PF-SIR remains overconfident at 500 ensemble members. This result suggests that the proposed algorithms are providing more diversity in posterior parameters, leading to a more accurate representation of uncertainty. To support this finding, the average performance measures for all 21 replicates, at different ensemble sizes, were calculated and are presented in Figure 5. In the upper left subplot of Figure 5, the PF-MCMC and PF-SIRV approach a high NSE value at lower ensemble sizes than the PF-SIR. This indicates that the PF-MCMC and PF-SIRV produce a more accurate expected value at lower ensemble sizes. In addition, at nearly all ensemble sizes, the PF-MCMC and PF-SIRV provide a more reliable estimate of uncertainty (higher ϕ and ε values) than the PF-SIR and approach a confidence value below 0. This indicates that the PF-MCMC and PF-SIRV become underconfident at high ensemble sizes, whereas the PF-SIR remains overconfident at nearly all ensemble sizes. This provides further support that the PF-MCMC and PF-SIRV are providing increased diversity in the posterior parameters, leading to a more accurate estimation of the posterior distribution. Although PF-MCMC is expected to improve on PF-SIRV in avoiding underconfidence (likely due to overdispersion in parameters), the performance of the two is similar in this application. The benefits of using the PF-MCMC over the PF-SIRV will be examined through the analysis of results from the real data experiment.
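
The lumping of replicates into a single predictive QQ plot can be sketched as follows. The sketch pools the observation quantiles from all replicates, sorts them, and pairs them with uniform theoretical quantiles; the plotting position (i - 0.5)/n and the function name are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def pooled_qq(z_by_replicate):
    """Pool observation quantiles across replicates and return QQ coordinates.

    z_by_replicate: iterable of 1-D sequences, one per replicate, holding the
                    quantile of each observation in its predictive distribution.
    Returns (theoretical, empirical): uniform quantiles and sorted pooled values.
    """
    pooled = np.concatenate([np.asarray(z, dtype=float) for z in z_by_replicate])
    empirical = np.sort(pooled)
    n = empirical.size
    # Plotting positions (i - 0.5) / n are an assumption of this sketch.
    theoretical = (np.arange(1, n + 1) - 0.5) / n
    return theoretical, empirical
```

Points lying along the 1:1 line indicate a reliable predictive distribution, whereas an S-shaped departure that is flatter than the 1:1 line near the center indicates too little spread (overconfidence).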

Figure 4.

Predictive QQ plots of the three filters for joint state-parameter estimation in a synthetic experiment for various ensemble sizes.

Figure 5.

Performance measures of the three filters for joint state-parameter estimation in a synthetic experiment for various ensemble sizes.

4.3. Real Data Study

[26] A study using streamflow observations from the Leaf River basin is provided to examine the performance of the proposed algorithm in a real streamflow forecasting scenario. This experiment is performed over the same 21 time periods as the synthetic study and with the same error assumptions, except that model prediction error is also assumed to be present. This error is normally distributed with a standard deviation equal to 30% of the prediction value.

[27] Similar to Figure 4, Figure 6 shows that the PF-SIRV and PF-MCMC approach a more reliable distribution at a smaller ensemble size than the PF-SIR. As explained in the synthetic analysis, this is a result of improved parameter search methods, which avoid sample impoverishment. Unlike the synthetic study, the PF-SIR remains biased at all ensemble sizes; however, this is avoided in the other two filters. The PF-SIR displays an inability to predict low flows, which is seen as a high bias in Figure 6. This overprediction suggests that the PF-SIR may not be capable of fully characterizing the posterior distribution, particularly the portion of the posterior parameter distribution related to baseflow. As the PF-MCMC and PF-SIRV have a larger search path for each parameter, both accurately explore the posterior distribution, leading to an accurate estimation of all flows. Further support for the results in Figure 6 is shown in the performance measures presented in Figure 7. Figure 7 shows that PF-MCMC and PF-SIRV approach a high NSE value at very low ensemble sizes, indicating an accurate prediction. Unlike the synthetic experiment, PF-SIR is unable to create an expected value (reflected in the NSE metric) with the same accuracy as PF-MCMC or PF-SIRV at any ensemble size. In addition, probabilistic measures reveal more reliable results from the PF-MCMC and PF-SIRV than the PF-SIR. The contrast between the results from the synthetic and real experiments highlights the strong impacts model error can have on these techniques. Another important note from Figure 7 is that the PF-SIR again shows a tendency toward overconfidence in comparison with the PF-MCMC and PF-SIRV methods. The PF-MCMC and PF-SIRV methods are able to reduce this overconfidence by increasing the parameter search area, and thus creating an ensemble that is more representative of the true posterior distribution.

Figure 6.

Predictive QQ plots of the three filters for state-parameter estimation using streamflow data assimilation for various ensemble sizes.

Figure 7.

Performance measures of the three filters for joint state-parameter estimation with real streamflow for various ensemble sizes.

[28] All previous analysis showed that the PF-SIRV performs equivalently with the PF-MCMC method. This suggests that the VVM method is capable of locating the posterior parameters as well as the PF-MCMC. Up to this point, the motivation for using the Metropolis acceptance criteria is not apparent. To further the performance assessment of the PF-SIRV and PF-MCMC, a validation of the posterior parameters of each method must be examined. In this validation, a state estimation experiment is performed with the posterior parameters at the final time step of each calibration from PF-SIRV and PF-MCMC. The validation results for the PF-SIRV and PF-MCMC are shown in Figure 8. PF-SIR is excluded from the validation results as its results are not competitive with the PF-SIRV and PF-MCMC; however, full results of the PF-SIR for both HyMOD and the Sacramento soil moisture accounting model are available in DeChant and Moradkhani [2012]. During the validation, the PF-MCMC estimated parameters produce a marginally higher ϕ value than PF-SIRV at all ensemble sizes above 100, suggesting different performance between the two methods, which was not observed in the calibration. This contrast between calibration and validation highlights a benefit of using an MCMC step after the resample-move step. While the PF-MCMC may either accept or reject the adjusted parameters, ensuring an accurate sample, the PF-SIRV keeps every adjusted parameter. By only accepting the valid parameter adjustments, the PF-MCMC maintains a more meaningful parameter distribution than the PF-SIRV, leading to slightly more reliable probabilistic prediction. Although the results suggest that only minor improvements in parameter estimation can be achieved by the PF-MCMC over PF-SIRV, these results will likely be significant in problems of greater dimensions.
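
The difference between keeping every adjusted parameter and applying an acceptance test can be illustrated with a generic Metropolis step. This is a minimal sketch assuming a symmetric proposal and user-supplied log posterior densities for the current and proposed parameter sets; it is not the specific acceptance ratio used in the PF-MCMC, and the function name is hypothetical.

```python
import numpy as np

def metropolis_select(current, proposed, log_post_current, log_post_proposed, rng):
    """Accept or reject proposed parameter sets, one decision per particle.

    current, proposed : arrays of shape (N, d) with N particles and d parameters.
    log_post_*        : arrays of shape (N,) holding log posterior densities.
    A symmetric proposal distribution is assumed, so the acceptance probability
    is min(1, posterior ratio).
    """
    current = np.asarray(current, dtype=float)
    proposed = np.asarray(proposed, dtype=float)
    log_ratio = np.asarray(log_post_proposed, dtype=float) - np.asarray(log_post_current, dtype=float)
    accept_prob = np.exp(np.minimum(0.0, log_ratio))
    accept = rng.uniform(size=accept_prob.shape) < accept_prob
    # Rejected particles retain their current (resampled) parameters.
    selected = np.where(accept[:, None], proposed, current)
    return selected, accept
```

A filter that keeps every adjusted parameter corresponds to returning `proposed` unconditionally, which is the behavior attributed to the PF-SIRV above.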

Figure 8.

Performance measures of the PF-SIRV and PF-MCMC filters during the validation for various ensemble sizes.

5. Discussion and Conclusion

[29] This study proposes an improved PF algorithm for hydrologic prediction. To improve the PF-SIR algorithm, the new algorithm uses MCMC moves to increase parameter diversity within the posterior distribution. This allows for a more complete representation of the posterior distribution, reducing the chance of sample impoverishment and leading to a more accurate streamflow forecast. The algorithm proposed in this article was tested in both synthetic and real case studies, with the parsimonious HyMOD model, to examine convergence and streamflow prediction properties of PF-MCMC in comparison with PF-SIR and PF-SIRV. Results from each experiment highlight the effects of improved parameter estimation via PF-MCMC and also make a strong case for the use of VVM proposed by Leisenring and Moradkhani [2012].

[30] Synthetic analysis showed that all three filtering methods in this study are capable of locating predefined parameters. This highlights the sensitivity of each algorithm to the correct parameters and supports the conclusion that PFs can effectively locate the proper posterior parameters [DeChant and Moradkhani, 2012; Leisenring and Moradkhani, 2011; Moradkhani et al., 2005b]. Such a conclusion conflicts with the assertion in Vrugt et al. [2012] that the PF cannot properly locate the posterior parameters due to shortcomings in their application. Although all methods were capable of locating the correct parameters, the PF-MCMC and PF-SIRV appear to be capable of creating a more reliable prediction at lower ensemble sizes than the PF-SIR. This is attributed to the improved parameter search in the PF-MCMC and PF-SIRV, leading to greater parameter diversity. This greater parameter diversity reduces the potential for sample impoverishment. As the PF-MCMC and PF-SIRV are more robust against sample impoverishment, both are capable of running at lower ensemble sizes, making these filters more efficient. To demonstrate the increase in efficiency with the new algorithm, Figure 9 is presented. In this figure, the run time for the PF-SIR, PF-SIRV, and PF-MCMC at ensemble sizes from 10 to 1000 ensemble members is compared. From this figure, it is observed that PF-MCMC has only slightly greater computational demand at each ensemble size than PF-SIR and PF-SIRV. PF-MCMC requires about 14% more time to run at each ensemble size than PF-SIR. As the PF-MCMC and PF-SIRV provide reliable prediction at around 200 ensemble members, whereas PF-SIR requires around 1000 ensemble members, the PF-MCMC and PF-SIRV significantly reduce the computational demand.
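
The size of this reduction can be approximated with simple arithmetic. The sketch below assumes that run time scales linearly with ensemble size, which is not stated in the text and may not hold exactly in Figure 9; the function name is hypothetical.

```python
def relative_cost(n_new, n_ref, overhead=0.14):
    """Run time of the new filter relative to the reference filter.

    n_new    : ensemble size needed by the new filter (e.g., PF-MCMC)
    n_ref    : ensemble size needed by the reference filter (e.g., PF-SIR)
    overhead : fractional extra run time of the new filter at equal ensemble size
    Linear scaling of run time with ensemble size is an assumption.
    """
    return (1.0 + overhead) * n_new / n_ref
```

Under this assumption, `relative_cost(200, 1000)` is about 0.23, i.e., PF-MCMC with 200 members would need roughly a quarter of the run time of PF-SIR with 1000 members.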

Figure 9.

Computational demand of the three filtering methods used for state-parameter estimation of HyMOD model for the Leaf River basin. The sequential filtering was performed for 2000 time steps with different ensemble sizes.

[31] Analysis of these techniques with real data is also provided in the results section. This analysis shows how the inclusion of model error affects the estimation of parameters and streamflow. Results from real data support the finding that PF-MCMC and PF-SIRV are capable of avoiding sample impoverishment at lower ensemble sizes than PF-SIR and that the PF-MCMC and PF-SIRV avoid bias that occurs in the PF-SIR prediction. This high bias is a result of poor baseflow characterization from the PF-SIR. Similar results for the PF-SIR were found in DeChant and Moradkhani [2012]. In addition to examining the performance of these algorithms during calibration, a validation was performed to examine the accuracy of the posterior parameters. During validation, PF-MCMC showed small improvements over PF-SIRV. Although both PF-MCMC and PF-SIRV performed nearly identically during calibration, the PF-MCMC produced slightly more reliable predictions than the PF-SIRV during validation, suggesting that PF-MCMC more accurately estimated the posterior parameters. By applying the Metropolis acceptance criteria, PF-MCMC was capable of more accurately identifying the parameters at ensemble sizes greater than 100; however, this resulted in only minor benefits with the HyMOD model. Although the PF-MCMC showed only marginal improvements over PF-SIRV in this study, the benefits of PF-MCMC will likely be more apparent in models of greater complexity (C. DeChant and H. Moradkhani, manuscript in preparation). Overall, results in this study suggest that through a combination of VVM and the Metropolis algorithm, it is possible to improve the exploration of the posterior parameters in hydrologic data assimilation, leading to improved streamflow predictions.

[32] Results from both synthetic and real experiments suggest that the PF-MCMC algorithm is a more efficient and accurate filter than the PF-SIR. The PF-MCMC algorithm allows for larger, but accurate, parameter moves, leading to a more diverse ensemble than the standard PF-SIR. By creating a more diverse ensemble, PF-MCMC is more robust against problems of sample impoverishment. This allows for implementation with smaller ensemble sizes, which makes the filter more efficient than the PF-SIR. Although these results support the hypothesis that MCMC moves can improve the PF, it is necessary to provide a more robust analysis to confirm this conclusion. To add further analysis, a subsequent study (C. DeChant and H. Moradkhani, manuscript in preparation) examines the effects of algorithmic modifications, particularly the assumptions of Gaussian parameter distributions and proposal distribution generation, and of different model structures on the behavior of the PF-SIR, PF-SIRV, and PF-MCMC algorithms.

Acknowledgments

[33] The first author thanks the partial financial support provided by NOAA-MAPP grant NA110AR4310140. The authors thank the three anonymous reviewers for their constructive comments that improved the clarity of this manuscript.
