Hypothetico-inductive data-based mechanistic modeling of hydrological systems
Peter C. Young
Integrated Catchment Assessment and Management Centre (ICAM), Fenner School of Environment and Society, Australian National University College of Medicine, Biology and Environment, Canberra, ACT, Australia
Lancaster Environment Centre, Lancaster University, Lancaster, UK
Corresponding author: P. C. Young, Lancaster Environment Centre, Lancaster University, Lancaster LA1 4YQ, UK. (firstname.lastname@example.org)
 The paper introduces a logical extension to data-based mechanistic (DBM) modeling, which provides a hypothetico-inductive (HI-DBM) bridge between conceptual models, derived in a hypothetico-deductive manner, and the DBM model identified inductively from the same time-series data. The approach is illustrated by a quite detailed example of HI-DBM analysis applied to the well-known Leaf River data set and the associated HyMOD conceptual model. The HI-DBM model significantly improves the explanation of the Leaf River data and enhances the performance of the original DBM model. However, on the basis of various diagnostic tests, including recursive time-variable and state-dependent parameter estimation, it is suggested that the model should be capable of further improvement, particularly as regards the conceptual effective rainfall mechanism, which is based on the probability distributed model hypothesis. In order to verify the efficacy of the HI-DBM analysis in a situation where the actual model generating the data is completely known, the analysis is also applied to a stochastic simulation model based on a modified HyMOD model.
 Most dynamic models of hydrological systems are based on the “hypothetico-deductive” approach to scientific investigation, as advocated most powerfully by Karl Popper [Popper, 1959] and others. Within the hydrological modeling context, this is normally reflected by hypotheses in the form of “conceptual” models, normally based on scientific reasoning and past experience, which the hydrologist believes are best able to mimic the physical nature of the hydrological system. These are usually formulated in terms of dynamic equations, derived from concepts such as mass and energy conservation and estimated (calibrated or optimized) in relation to measurements of the relevant hydrological variables.
 As the present author [see e.g., Young, 2002, 2011a, 2011b] has argued, this hypothetico-deductive philosophy is largely a creature of the twentieth century and is linked strongly with laboratory science and the ability to carry out planned experimentation. Prior to this, the alternative “inductive” approach was the cornerstone of scientific investigation or “natural philosophy,” as it was referred to at the time. Here, the model or “theory of behavior” was inferred directly from observations of the system under study (often naturally occurring phenomena), without undue prejudice based on prior hypotheses. Indeed, the most famous natural philosopher of all, Isaac Newton, stated that “hypotheses non fingo” or “I frame no hypotheses” [see e.g., Cohen, 1962].
 As Young [2011a, 2011b] points out, environmental modelers do not often have the ability to carefully plan and conduct experiments on the hydrological systems that they are studying. And even when planned experimentation is possible, as in the case of tracer experiments for solute transport and dispersion modeling [e.g., Wallis et al., 1989], or instrumented catchments in the case of rainfall-flow models [e.g., McIntyre et al., 2011], the experiments are much more difficult and costly to plan and constrain than those carried out in the confines of a laboratory. Moreover, many aspects of the system are difficult to observe and so cannot easily form the basis for hypothesis formulation. With these considerations in mind, inductive modeling has attractions because it can produce a model form that efficiently parameterizes the observational data, without the constraints and possible prejudices of prior hypotheses.
 Data-based mechanistic (DBM) modeling is predominantly, but not exclusively, an inductive approach to modeling that harks back to the era of natural philosophy. It recognizes that, in contrast to most man-made dynamic systems, the nature of many natural systems, particularly at the holistic or macrolevel (global climate, river catchment, macroeconomy), is still not well understood. “Reductionist” approaches to modeling such systems, based on the aggregation of hypothetico-deductive models at the microlevel, or the application of microscale laws at the macrolevel, often result in very large simulation models that suffer from “equifinality” [von Bertalanffy, 1968; Beven, 1993] and are not fully identifiable from the available data.
 Although the term “DBM modeling” was first used in Young and Lees, the basic concepts of this approach to modeling dynamic systems have been developed over many years. For example, they were first applied seriously within a hydrological context in the early 1970s, with application to the modeling of water quality and flow in rivers [Young and Beck, 1974; Beck and Young, 1976] and set within a more general framework shortly thereafter [Young, 1978]. Since then, they have been applied to many different systems in diverse areas of application from ecology, through engineering to economics [see e.g., Young, 1998, 2006, 2011b, and the references therein].
 Interestingly, one of the first applications that led to the idea of DBM modeling was to the modeling of rainfall-flow processes [Young, 1974; Whitehead et al., 1976]; and this was generalized later in an example considered first in Young and then in Young and Beven. This led to numerous examples that have demonstrated the utility of DBM modeling applied to rainfall-flow processes [see e.g., Young, 1998; Lees, 2000; Young, 2001a, 2003; Ratto et al., 2007; Chappell et al., 2006; Ochieng and Otieno, 2009; Young, 2010a, 2010b; Beven et al., 2012; McIntyre et al., 2011, and references therein], including catchments affected by snow melt, where the nonlinearities are more complex [Young et al., 2007].
 The standard DBM modeling procedures normally produce an efficiently parameterized stochastic model (parsimonious in terms of its dynamic order) that explains the data well and can be interpreted in reasonable, physically meaningful terms. However, this physical interpretation is inferred directly from the data-based model, the structure and parameters of which are identified and estimated, respectively, using statistical methods. Consequently, the resulting model may not always be fully acceptable or credible to an audience that has been educated to believe strongly in hypothetico-deductive modeling based on conceptual, often deterministic, simulation models. Moreover, the model obtained in this completely inductive manner may be restricted to some degree: for instance, previous DBM rainfall-flow models function very well within the adaptive flow forecasting context for which they were derived, but they utilize the flow measurement as a surrogate measure of soil moisture (catchment storage) in the “effective rainfall” nonlinearity (see later in section 2). As a result, they cannot be used for stochastic simulation purposes, as this would imply a physically meaningless feedback mechanism and could make the model unstable.
 If, in any particular example, the DBM model is not considered acceptable for the above reasons, then it needs to be modified to correct any such perceived deficiencies. One obvious approach is to retain the statistically identified structure of the DBM model and base any modifications on conceptual ideas about the nature of those elements in the model that require further elucidation. For instance, various nonlinear conceptual models have been developed to synthesize the effective rainfall [see e.g., Wagener et al., 2004, chap. 3, pp. 60–72] and can provide possible replacements for the DBM effective rainfall nonlinearity; modifications that would allow it to be used for stochastic simulation. This is the stimulus for hypothetico-inductive DBM (HI-DBM) modeling: obviously, hypothetico-deductive and inductive methods of modeling are not mutually exclusive and HI-DBM modeling is an attempt to meld together the best aspects of both approaches and produce a systematic approach to hydrological model development, in general, and rainfall-flow modeling, in particular.
 The paper starts in section 2 by outlining those aspects of DBM modeling that are most relevant to subsequent sections of the paper. Since HI-DBM modeling is often going to be problem-dependent, it is best described with the help of a real example. Consequently, section 3 provides a brief introduction to HI-DBM modeling, so setting the scene for the main section 4 of the paper that describes a detailed HI-DBM modeling exercise involving the well-known Leaf River data set and the associated HyMOD (hydrologic model) conceptual model. Section 5 presents a simulation example that illustrates how HI-DBM modeling works in a situation where the model structure and noise-free output of the system are known and can be used to evaluate the results. Finally, section 6 sums up the major contributions of the paper and suggests possible topics for future research.
2. DBM Modeling
 The latest embodiment of DBM modeling is considered fully in Young and Ratto and the references therein. Consequently, the discussion here will be restricted to the form of DBM models and how it is possible to use them in an HI-DBM context. As far as the form of DBM models is concerned, the standard DBM modeling philosophy allows for any generic model type to be used in the initial data-based analysis. However, the main generic model form used so far is a multi-input, multi-output (MIMO), nonlinear differential equation or its discrete-time equivalent, characterized by state-dependent parameter (SDP) nonlinearities [see e.g., Young, 1993, 2001a]. Here, the SDP model is one in which the model parameters are not assumed constant over the observation interval but, if the observational data support it, can be a priori unknown functions of one or more “state variables” of the system; functions that are identified and estimated as part of the statistical analysis. Although it is possible to consider multivalued functions, such as hysteresis [e.g., Young et al., 2007] and multivariable dependency [Sadeghi et al., 2010; Tych et al., 2012], the SDP functions in the present paper are assumed to be single valued and dependent on only a single state variable.
 This type of generic model will be unfamiliar to some readers and may seem rather complex at first, but this is mainly because of its general nature. Fortunately, as mentioned in section 1, previous research on the DBM modeling of rainfall-flow (and flow-routing) systems has shown that the SDP nonlinearity, in this case, is a simple function of the measured flow, acting as a surrogate measure of soil moisture. Moreover, as we shall see in the later practical example of section 4 and the associated Figure 5, the linear transfer function (TF) part of the model can be reduced to a hydrologically interpretable combination of lumped, conceptual “storage tanks” (surface water stores or reservoirs) that are likely to be much more familiar to many readers (see also the discussion on this in Young [1986, 1992a, 1992b]).
2.1. DBM Model
 DBM models can be identified in either continuous-time (differential equation: see e.g., Young and Garnier; Young [2011b]) or discrete-time (difference equation) form. Considering the latter type with r inputs and s outputs, the generic DBM model can be written as follows in vector-matrix terms:

\[
\mathbf{y}(k) = \mathbf{G}(z^{-1})\,\mathbf{u}(k) + \boldsymbol{\xi}(k) \tag{1}
\]

where $\mathbf{y}(k)$ is the s-dimensional vector of output variables; $\mathbf{G}(z^{-1})$ is an $s \times r$ SDP TF matrix in the backward shift operator $z^{-1}$, i.e., $z^{-1}y(k) = y(k-1)$; $\mathbf{u}(k)$ is an r-dimensional vector of input variables; and $\boldsymbol{\xi}(k)$ is an s-dimensional vector of noise or error variables. However, the model in this general form presents a formidable identification and estimation problem, and so it is normally formulated in a decomposed multi-input, single-output (MISO) form that is much easier to handle.
 In the decomposed MIMO model, the ith MISO component, $y_i(k)$, consists of an SDP, stochastic TF model, which takes the form:

\[
y_i(k) = \sum_{j=1}^{r} \frac{B_{ij}(z^{-1}, \boldsymbol{\chi}_i(k))}{A_{ij}(z^{-1}, \boldsymbol{\chi}_i(k))}\, u_j(k) + \xi_i(k) \tag{2}
\]

where, for any i, the $B_{ij}(z^{-1}, \boldsymbol{\chi}_i(k))/A_{ij}(z^{-1}, \boldsymbol{\chi}_i(k))$ are the elements in the ith row of the TF matrix $\mathbf{G}$; $\boldsymbol{\rho}_i$ is the vector of unknown parameters in the ith MISO component and $\boldsymbol{\chi}_i(k)$ is a vector of state variables, usually in a nonminimal state space involving the present and past input and output variables [see e.g., Young et al., 1987], on which the parameters in the polynomials $A_{ij}(z^{-1}, \boldsymbol{\chi}_i(k))$ and $B_{ij}(z^{-1}, \boldsymbol{\chi}_i(k))$ are dependent. These polynomials are defined as follows in terms of the SDPs, $a_{ij,l}(\boldsymbol{\chi}_i(k))$ and $b_{ij,l}(\boldsymbol{\chi}_i(k))$:

\[
A_{ij}(z^{-1}, \boldsymbol{\chi}_i(k)) = 1 + a_{ij,1}(\boldsymbol{\chi}_i(k))\, z^{-1} + \cdots + a_{ij,n}(\boldsymbol{\chi}_i(k))\, z^{-n}
\]
\[
B_{ij}(z^{-1}, \boldsymbol{\chi}_i(k)) = b_{ij,0}(\boldsymbol{\chi}_i(k)) + b_{ij,1}(\boldsymbol{\chi}_i(k))\, z^{-1} + \cdots + b_{ij,m}(\boldsymbol{\chi}_i(k))\, z^{-m} \tag{3}
\]

Note that any numerator polynomial $B_{ij}$ can have leading zero coefficients to specify that there is a pure time delay between the input $u_j(k)$ and its first effect on the output $y_i(k)$. As we shall see in the later practical example considered in section 4, the full model parameter vector $\boldsymbol{\rho}_i$ will contain all of the parameters in these state-dependent polynomials, as well as any additional parameters that may be introduced to parameterize identified SDP nonlinearities in each component MISO model. Finally, the additive noise $\xi_i(k)$ in the above equations is introduced to explain those components of $y_i(k)$ that are not explained by the inputs $u_j(k)$. Since this SDP model is able to characterize the dynamics of many nonlinear, stochastic, dynamic systems, including those with chaotic dynamics, its application potential is considerable. Consequently, the exact manner in which this noise component is modeled will depend upon the nature of the application.
 Fortunately, the complexity of the full MIMO model (2) is not often required in most hydrological applications of DBM modeling. In the later example of section 4, for instance, the DBM rainfall-flow model only requires a special single-input, single-output (SISO) representation, with $r = 1$ in equation (2) and a zero pure time delay ($\delta = 0$) on the input, so that the indices can be omitted and $u(k)$ is replaced by the rainfall $r(k)$, which has an immediate effect on the flow. The model then takes the much simpler form:

\[
x(k) = \frac{B(z^{-1}, \boldsymbol{\chi}(k))}{A(z^{-1}, \boldsymbol{\chi}(k))}\, r(k); \qquad y(k) = x(k) + \xi(k) \tag{4a, 4b}
\]

or, in equation terms for readers not familiar with TFs:

\[
x(k) = -a_1(\boldsymbol{\chi}(k))\,x(k-1) - \cdots - a_n(\boldsymbol{\chi}(k))\,x(k-n) + b_0(\boldsymbol{\chi}(k))\,r(k) + \cdots + b_m(\boldsymbol{\chi}(k))\,r(k-m) \tag{5}
\]

where y(k) is the measured flow output and x(k) is the model-generated flow. In other words, x(k) at the kth sampling instant is a linear function of the model outputs at previous sampling instants, and the present and past values of the input rainfall r(k), modulated by their associated SDPs that encode the nonlinear behavior. Moreover, the model can be simplified even further because the denominator of the TF is identified to have constant parameters; while the SDP nonlinearity in the TF numerator is common to all terms in the polynomial and so it can be “factored out” to become an “effective rainfall” nonlinearity acting on the input rainfall. This converts the model into a “Hammerstein” nonlinear model; namely, a serial connection of the input nonlinearity and a linear TF; a model form which, as we shall see, is similar to the HyMOD model and many other conceptual rainfall-flow models.
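To make this Hammerstein structure concrete, the following sketch simulates a first-order version of the model (Python is used here purely for illustration; the paper's own analysis uses MATLAB and the CAPTAIN Toolbox). The function name, the power-law form of the effective-rainfall nonlinearity, and all parameter values are illustrative assumptions, not quantities estimated in the paper.

```python
import numpy as np

def simulate_hammerstein(rain, a1, b0, gamma):
    """First-order Hammerstein rainfall-flow sketch:
    effective rainfall u(k) = rain(k)**gamma (illustrative static
    nonlinearity), then the linear TF x(k) = -a1*x(k-1) + b0*u(k)."""
    u = rain ** gamma                       # input nonlinearity
    x = np.zeros_like(u)
    for k in range(len(u)):
        x[k] = -a1 * (x[k - 1] if k > 0 else 0.0) + b0 * u[k]
    return x

rain = np.array([0.0, 4.0, 0.0, 0.0, 1.0])
flow = simulate_hammerstein(rain, a1=-0.5, b0=0.3, gamma=1.0)
```

With a1 = -0.5 the linear TF has a single pole at z = 0.5, so each storm input decays geometrically, as in a single conceptual storage tank.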
 Finally, the noise $\xi(k)$ is approximated stochastically as a heteroscedastic autoregressive, moving average (ARMA) process of the form:

\[
\xi(k) = \frac{D(z^{-1})}{C(z^{-1})}\, e(k) \tag{6}
\]

where e(k) is assumed to be a zero mean sequence of random variables with, in the present rainfall-flow context, time-variable variance $\sigma^2(k)$ (i.e., a heteroscedastic sequence of random variables); while $C(z^{-1})$ and $D(z^{-1})$ are polynomials in the backward shift operator, defined as follows:

\[
C(z^{-1}) = 1 + c_1 z^{-1} + \cdots + c_p z^{-p}; \qquad D(z^{-1}) = 1 + d_1 z^{-1} + \cdots + d_q z^{-q} \tag{7}
\]

This model will be referred to as a heteroscedastic ARMA($p$, $q$) process.
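As a minimal sketch of how such an ARMA noise process can be generated from a white-noise sequence e(k), the following filter follows the monic-polynomial convention above; the implementation details (function name, coefficient ordering as plain lists) are assumptions for illustration only.

```python
import numpy as np

def arma_filter(e, c, d):
    """Generate xi(k) from white noise e(k) through the ARMA model
    C(z^-1) xi(k) = D(z^-1) e(k), with monic polynomials given as
    coefficient lists C = [1, c1, ..., cp] and D = [1, d1, ..., dq]."""
    xi = np.zeros_like(e)
    for k in range(len(e)):
        ar = sum(c[i] * xi[k - i] for i in range(1, len(c)) if k - i >= 0)
        ma = sum(d[j] * e[k - j] for j in range(1, len(d)) if k - j >= 0)
        xi[k] = e[k] + ma - ar              # rearranged C*xi = D*e
    return xi
```

For example, an AR(1) noise model with c = [1, -0.5] turns a unit impulse in e(k) into the geometrically decaying sequence 1, 0.5, 0.25, ….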
 In the later example, it is necessary to quantify the changing variance of the estimated noise variable $\hat{\xi}(k)$ where, from (4b),

\[
\hat{\xi}(k) = y(k) - \hat{x}(k) \tag{8}
\]

while the “caret” or “hat” is employed here and subsequently to denote the estimated value. In the example, the changing squared value of the error is identified empirically as a nonparametric SDP function of the squared flow, which can be approximated by the simple regression equation

\[
\hat{\xi}(k)^2 = \rho + \psi\, y(k)^2 \tag{9}
\]

where the constant coefficients ρ and ψ are estimated by standard linear least-squares optimization (see the supporting information for this article for more details).
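This regression reduces to ordinary linear least squares on the squared noise estimates against the squared flow. A sketch follows; `fit_noise_variance` is a hypothetical helper written for this illustration, not a CAPTAIN routine.

```python
import numpy as np

def fit_noise_variance(xi_hat, y):
    """Least-squares fit of the squared noise estimate against the
    squared flow: xi_hat(k)^2 ~ rho + psi * y(k)^2, the simple
    regression used above to quantify heteroscedasticity."""
    X = np.column_stack([np.ones_like(y), y ** 2])   # regressors [1, y^2]
    coeffs, *_ = np.linalg.lstsq(X, xi_hat ** 2, rcond=None)
    rho, psi = coeffs
    return rho, psi
```

Given the fitted coefficients, the time-variable noise variance at sample k can then be approximated by rho + psi * y(k)**2.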
 Given that state-space models are so popular in recent hydrological systems analysis, the reader might ask why a multivariable stochastic TF model is used in DBM modeling. The main reasons are the uniqueness and parametric efficiency of the stochastic TF model. First, for any input-output behavior modeled by a TF model, there are an infinite number of state-space realizations, depending on the definition of the state vector. Second, the deterministic part of the nth-order, stochastic TF model (4a) has $2n + 1$ parameters, whereas an associated state-space model, with an $n \times n$ state transition matrix and an n-dimensional input vector, has up to $n^2 + n$ parameters. Of course, if one assumes a particular state-space structure, some of these parameters might well be zeros. However, this is not possible when using DBM modeling because it starts with “black-box” model identification, with as few prior assumptions as possible and without any constraints on the state-space form. Indeed, if a state-space model is required, for example to implement a flood warning system based on a Kalman filter forecasting engine (see the previously cited references on DBM-based flow forecasting), then this model would be defined from the suitably identified and decomposed stochastic TF model, with the state variables interpreted in physically meaningful terms (e.g., as “quick” and “slow” components of the flow).
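The non-uniqueness argument can be illustrated by constructing one of the infinitely many state-space realizations of an nth-order TF. The sketch below builds the controller canonical form, one common choice; the function is written for this illustration (it is not part of the CAPTAIN Toolbox) and assumes equal numerator and denominator orders.

```python
import numpy as np

def tf_to_state_space(a, b):
    """One (non-unique) realization of the nth-order TF
    B(z^-1)/A(z^-1), A = [1, a1..an], B = [b0..bn], in controller
    canonical form: the TF has 2n+1 parameters, while the full n x n
    transition matrix plus input vector allow up to n^2 + n."""
    n = len(a) - 1
    F = np.zeros((n, n))
    F[0, :] = -np.asarray(a[1:])            # first row holds -a1..-an
    F[1:, :-1] = np.eye(n - 1)              # subdiagonal shift structure
    g = np.zeros(n)
    g[0] = 1.0
    h = np.asarray(b[1:]) - b[0] * np.asarray(a[1:])  # output map
    return F, g, h, b[0]                    # b0 is the direct feedthrough
```

Simulating x(k+1) = F x(k) + g u(k), y(k) = h x(k) + b0 u(k) reproduces the TF's impulse response exactly, while any similarity transform of F, g, h gives another realization with identical input-output behavior.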
2.2. Response Error and Prediction Error
 An important aspect of data-based modeling is the definition and quantification of the model error on which model parameter estimation (optimization) can be based. Considering the model (4a), for instance, the measured flow y(k) at the kth sampling instant is modeled in equation (4b) by the sum of the simulated, deterministic output x(k) of the model and an additive stochastic error or noise process $\xi(k)$. On this basis, there are two major types of error that have been used in the statistical literature: the deterministic “response” or “output error” $\varepsilon(k)$ and the “prediction” error $\varepsilon_p(k)$ which, for any values of the parameter vectors $\boldsymbol{\rho}$ and $\boldsymbol{\eta}$, are obtained by reference to the following equations:

\[
\varepsilon(k) = y(k) - x(k, \boldsymbol{\rho}) \tag{10a}
\]
\[
\varepsilon_p(k) = y(k) - \hat{y}(k) \tag{10b}
\]

Here, $x(k, \boldsymbol{\rho})$ is the simulated, deterministic output of the model with parameter vector $\boldsymbol{\rho}$; and $\hat{y}(k)$ is the stochastic prediction of the output based on the full, stochastic model and the assumption that $\xi(k)$ can be characterized by a stochastic model with a parameter vector $\boldsymbol{\eta}$. For example, in the case of the model (4a) to (6), the elements of $\boldsymbol{\eta}$ are the parameters that define the ARMA model polynomials in (7), and it is straightforward to show (see the supporting information) that the prediction of y(k) at the kth sampling instant can be computed as

\[
\hat{y}(k) = y(k) - \hat{e}(k) \tag{11}
\]

where $\hat{e}(k)$ is the estimate of e(k) obtained by inverting the ARMA noise model (6), i.e.,

\[
\hat{e}(k) = \frac{\hat{C}(z^{-1})}{\hat{D}(z^{-1})}\left\{ y(k) - \hat{x}(k, \hat{\boldsymbol{\rho}}) \right\} \tag{12}
\]

Here, $\hat{x}(k, \hat{\boldsymbol{\rho}})$ is the simulated deterministic output of the model with estimated parameter vector $\hat{\boldsymbol{\rho}}$, while the estimated polynomial coefficients in $\hat{C}(z^{-1})$ and $\hat{D}(z^{-1})$ are defined by the estimated noise parameter vector $\hat{\boldsymbol{\eta}}$. Note that, unlike the simulated output $\hat{x}(k, \hat{\boldsymbol{\rho}})$, the prediction $\hat{y}(k)$ in (11) depends on the output measurements y(k) up to and including the kth sample. Normally, therefore, the prediction error will be smaller than the response error, and the level of this reduction will provide a measure of how much improvement is derived from the inclusion of the heteroscedastic ARMA noise process into the stochastic model. For this reason, the prediction error is used to evaluate the performance of this stochastic model at the “model structure identification” stage of DBM modeling, as discussed later in section 2.3.
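A sketch of this noise-model inversion follows: given the measured flow, the simulated model output, and estimated monic ARMA polynomials, it recovers both the response errors and the prediction errors. The function name and coefficient conventions (plain lists, monic polynomials) are assumptions for illustration.

```python
import numpy as np

def prediction_errors(y, x_hat, c, d):
    """Invert the monic ARMA noise model to recover e_hat(k) from the
    response error xi_hat(k) = y(k) - x_hat(k), i.e. solve
    D(z^-1) e_hat(k) = C(z^-1) xi_hat(k) sample by sample."""
    xi = y - x_hat                          # response errors
    e = np.zeros_like(xi)
    for k in range(len(xi)):
        ar = sum(c[i] * xi[k - i] for i in range(1, len(c)) if k - i >= 0)
        ma = sum(d[j] * e[k - j] for j in range(1, len(d)) if k - j >= 0)
        e[k] = xi[k] + ar - ma              # prediction errors
    return xi, e
```

If the fitted ARMA model matches the true noise process, the recovered e_hat(k) should be serially uncorrelated, which is exactly the residual whiteness check used at the identification stage.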
 This concept of model prediction and the associated prediction error [see e.g., Ljung, 1999; Young, 1984, 2011b] is worth considering a little further because prediction can have various meanings in hydrology [see e.g., Kumar, 2011]. Also, there has been some confusion in the hydrological literature about the terms “simulation” and “prediction,” particularly in relation to the HyMOD rainfall-flow model considered in the later example of section 4 (see e.g., the comments by Beven on the paper by Vrugt et al. [2009a]). Referring to predictions into the future as “forecasts” in order to avoid any ambiguity, this confusion relates to the situation where the stochastic model is being used to compute the f-step-ahead forecast. In the case of the model (4) with a zero pure time delay, this requires either (i) forecasts, or (ii) prior knowledge, of the future input rainfall that gives rise to the future changes in the output y(k). In the general forecasting literature, the true ex-ante forecast is then one where these future values of the input, over the forecasting interval, are themselves forecast over this interval; while the ex-post forecast is one in which these future inputs are assumed to be known. Of course, input forecasts are not required if the pure time delay $\delta$ satisfies $\delta \geq f$, since then the required values of r(k) would be available to construct flow predictions up to f steps ahead.
 Although this distinction between ex-ante and ex-post forecasts is not always stressed in hydrology, it is essential because ex-post forecasts are sometimes presented simply as “predictions.” If the distinction is not made clear, these may be interpreted by the reader as true ex-ante forecasts, when this is not the case. Also since ex-post forecasts are normally considerably superior to ex-ante forecasts because they do not require the (normally very difficult) prediction of rainfall, this may give a misleading impression about the predictive abilities of the model.
 In order to be clear, therefore, the underlying statistical context of the present paper is time-series analysis and forecasting [e.g., Box and Jenkins, 1970; Kalman, 1960]; indeed, the reader will recognize the linear version of the stochastic model formed by equations (4a) and (6) as the model utilized in the seminal work of Box and Jenkins on time-series analysis, forecasting, and control. However, while DBM modeling exploits such time-series methodology, it is also crucially concerned with the physical interpretation of the resulting time-series models. Consequently, all the response plots in the later examples are simply based on the simulated deterministic output of the model defined by the estimated parameter vector $\hat{\boldsymbol{\rho}}$. This reflects the primary aims of HI-DBM modeling which, in the present context, are to improve the hydrological meaning of the DBM model and/or to diagnose possible inadequacies in an associated conceptual model, not to forecast the output time series. For completeness, however, multistep-ahead forecasting is discussed briefly in the supporting information and true ex-ante, 1-day-ahead forecasts, generated by an adaptive HI-DBM model of the Leaf River, are presented there.
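The ex-ante/ex-post distinction can be demonstrated with a toy first-order model: the ex-post forecast uses the actual future rainfall, while the ex-ante forecast must replace it with a rainfall forecast (here, naively, zero rainfall). All function names and numbers below are illustrative assumptions.

```python
def forecast(x_last, rain_future, a1, b0):
    """f-step-ahead flow forecast with a first-order model
    x(k) = -a1*x(k-1) + b0*r(k), iterated over the future rainfall."""
    x = x_last
    for r in rain_future:
        x = -a1 * x + b0 * r
    return x

# Ex-post: the actual future rainfall [2.0, 0.0] is assumed known.
ex_post = forecast(1.0, [2.0, 0.0], a1=-0.5, b0=0.3)
# Ex-ante: the future rainfall must itself be forecast (naively, zeros).
ex_ante = forecast(1.0, [0.0, 0.0], a1=-0.5, b0=0.3)
```

Because the ex-post forecast is fed the true storm input, it tracks the coming flow rise that the ex-ante forecast misses entirely, which is why presenting ex-post results as plain "predictions" can flatter a model.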
2.3. Model Identification, Estimation, and Validation
 The whole DBM modeling procedure is considered in traditional statistical terms as a process of model identification, estimation, and validation, as outlined below. Normally, the computational procedures used in these three stages are those available as routines in the CAPTAIN Toolbox for MATLAB (this is freely available and can be downloaded via the web site at http://captaintoolbox.co.uk/Captain_Toolbox.html).
2.3.1. Model Structure Identification
 An important distinction between inductive DBM modeling and hypothetico-deductive “conceptual” modeling is that the DBM model structure is identified directly from the data, based on various statistical criteria and with as few prior assumptions as possible. This can be contrasted with conceptual modeling, where the hypothesis step in the hypothetico-deductive process is most often the a priori assumption of the model structure, which forms a major part of the model hypothesis. However, there have been some more recent and interesting excursions into model structure identification where the analysis investigates the a priori assumed model structure in order to detect structural inadequacies: see e.g., the recent publications on this topic by Smith et al., Bulygina and Gupta, and Lin and Beck. Another recent approach in hydrology is based on the comparative evaluation and systematic selection from multiple model structures [see e.g., Beven, 2006; Fenicia et al., 2008; Clark et al., 2011].
 The DBM model structure/order identification procedure is rather simpler than the above approaches. It can be carried out in various ways but it is normally based on the application of statistical methods that help to answer questions such as: Is the model predominantly linear in dynamic terms or are there indications of significant state dependency in the parameter estimates that might suggest the presence of nonlinear dynamics? Is the model dynamic order satisfactory; i.e., is it efficiently parameterized or “parsimonious” so that overparameterization is avoided? Does the model explain the data sufficiently well? Are the final stochastic residuals purely random and devoid of components that might suggest the model has not extracted all the useful information from the data (normally, that they constitute a zero mean, serially uncorrelated series that is also uncorrelated with the input variables)? If necessary, does additional correlation analysis indicate that there are other signs of nonlinearity not accounted for by the current model structure [see e.g., Zhang and Longden, 2007]? And, finally, can the model be validated satisfactorily on data other than those used in its estimation (i.e., validation can be part of the DBM model identification procedure)? The main tools utilized for these model structure identification investigations are outlined below.
 1. Although there are many methods for identifying nonlinear stochastic systems in a TF form (see e.g., the review in Juditsky et al.), the main approach used so far in DBM modeling is nonparametric, SDP estimation, as implemented by the sdp routine in the CAPTAIN Toolbox. This helps to identify the location and graphical nature of significant nonlinearities in the model preparatory to the parameterization of these nonlinearities at the next estimation stage.
 2. The most common measure of how well the model explains the data is the simulation coefficient of determination, $R_T^2$, which is directly equivalent to the Nash-Sutcliffe efficiency [Nash and Sutcliffe, 1970] used in hydrology, and is defined as follows:

\[
R_T^2 = 1 - \frac{\hat{\sigma}^2}{\sigma_y^2} \tag{13}
\]

where $\sigma_y^2$ is the mean-square value of the flow y(k) about its mean value $\bar{y}$ (i.e., the overall or average variance in this heteroscedastic situation, where the short-term variance is changing markedly); and $\hat{\sigma}^2$ is similarly the mean-square value of the model simulation error $\varepsilon(k)$. In the present context, other useful measures applied in the later example are the standard coefficient of determination $R^2$, which is defined in the same manner as $R_T^2$ but based on the variance of the estimated prediction errors (see equations (11) and (12) in the previous section 2.2); and $R_T$, which is defined as follows:

\[
R_T = 1 - \frac{\sum_{k=1}^{N} |\varepsilon(k)|}{\sum_{k=1}^{N} |y(k) - \bar{y}|} \tag{14}
\]

where $|\cdot|$ denotes the magnitude (absolute value or modulus). This measure, based on the ratio of the mean magnitudes rather than the mean squares, pays relatively more attention to the errors associated with the low flow modeling than $R_T^2$.
 3. There are numerous model order identification criteria, but only the most common of these (the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Young information criterion (YIC): see Young [2011b, pp. 176–179]) are available in the CAPTAIN TF identification routine rivbjid. Basically, the AIC and BIC are based on a cost function that balances the variance of the residual errors with the number of parameters in the model; while the YIC is based on the properties of the instrumental product matrix [see Wellstead, 1978; Young et al., 1980], which plays an important part in instrumental variable estimation.
 4. The normal statistical measures used for evaluating serial and cross correlation are the autocorrelation and cross-correlation functions. These are used in the later example through application of the associated acf and ccf routines in CAPTAIN.
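The two efficiency measures in item 2 above are straightforward to compute from the flow record and the simulated output; the following helper is hypothetical, written to mirror the definitions in this section rather than any toolbox routine.

```python
import numpy as np

def efficiency_measures(y, x_hat):
    """Simulation R_T^2 (identical to the Nash-Sutcliffe efficiency)
    and its mean-absolute-error variant R_T, which weights errors in
    the low-flow periods relatively more heavily."""
    eps = y - x_hat
    rt2 = 1.0 - np.mean(eps ** 2) / np.mean((y - y.mean()) ** 2)
    rt = 1.0 - np.mean(np.abs(eps)) / np.mean(np.abs(y - y.mean()))
    return rt2, rt
```

A perfect simulation yields (1, 1); a model that simply outputs the mean flow yields (0, 0), the conventional "no better than the mean" baseline.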
 These standard model identification procedures can be reinforced by other a posteriori estimation procedures, such as recursive time-variable parameter (TVP) estimation. This can help in diagnosing whether there are any signs of time variation or poor identifiability (where the recursive estimates tend to “wander” around, fluctuating widely with large uncertainty bounds and no clear signs of convergence: see e.g., Young [2010b] or [2011b]); or whether there is statistically significant temporal variation that might suggest the presence of undetected state dependency. An illustration of this is given in the later example of section 4.
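For readers without access to the CAPTAIN acf routine mentioned in item 4, the sample autocorrelation function used for the residual whiteness checks reduces to the following computation (a minimal stand-in, not the toolbox implementation):

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function of a series for lags 0..max_lag,
    normalized so that the lag-0 value is 1."""
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array([np.sum(x[lag:] * x[:len(x) - lag]) / denom
                     for lag in range(max_lag + 1)])
```

For white residuals, the values at nonzero lags should lie within roughly ±2/sqrt(N) of zero; systematic excursions beyond that band suggest the model has left structure in the data unexplained.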
2.3.2. Model Parameterization and Parameter Estimation
 The first step in DBM model estimation is the parameterization of any SDP nonlinearities identified at the previous identification stage of the modeling. Naturally, this will be dependent on the identified form of the nonlinearity but it could involve the use of some general method of curve fitting based on a combination of basis functions [Beven et al., 2012]; or the selection of a specific nonlinear function that appears to be most appropriate. In the example considered later in section 4, for instance, there is only one nonlinearity, which is identified as a simple exponential-type function characterized by a single unknown parameter.
 Having fully parameterized the model structure, the parameters that characterize this structure have to be estimated in some manner. Considering the model (4a), the elements of the model parameter vector $\boldsymbol{\rho}$ need to be optimized over the estimation data set of N samples, i.e., $\{r(k), y(k);\ k = 1, 2, \ldots, N\}$, using a selected optimization cost or criterion function. The criterion function that is most often used in hydrological modeling is specified as follows in terms of the response error, as defined in equation (10a):

\[
J_R(\boldsymbol{\rho}) = \sum_{k=1}^{N} \varepsilon(k)^2 \tag{15}
\]

whilst, from a stochastic perspective, the alternative “prediction-error” cost function [see e.g., Ljung, 1999; Young, 1984, 2011b] is more appropriate and is specified as follows in terms of the prediction error defined in (10b):

\[
J_P(\boldsymbol{\rho}, \boldsymbol{\eta}) = \sum_{k=1}^{N} \varepsilon_p(k)^2 \tag{16}
\]
 Although optimization in terms of $J_P$ is considerably more complex, since it requires the joint estimation of the system and noise model parameters, it has the virtue that the optimality can be defined in fully stochastic terms, such as maximum likelihood.
 There are many methods available for optimizing criterion functions such as $J_R$ and $J_P$. For instance, in the later example of section 4, the standard MATLAB lsqnonlin optimization procedure, based on a “trust-region-reflective algorithm” [Coleman and Li, 1996], is used for initial estimation of the parameters in the HyMOD model, based on $J_R$. This provides a benchmark for the later DBM and HI-DBM modeling studies, where the DBM model estimation employs the optimal iterative or recursive-iterative refined instrumental variable (RIV) algorithms [see Young, 2011b, and the references therein], which are based on a prediction error criterion function of the form $J_P$.
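The response-error criterion can be illustrated with a deliberately crude optimizer: a grid search over the two parameters of a first-order linear model, standing in for lsqnonlin's trust-region algorithm. Everything here (function names, grids, parameter values) is an illustrative assumption, not the estimation scheme used in the paper.

```python
import numpy as np

def simulate(rain, a1, b0):
    """First-order linear TF: x(k) = -a1*x(k-1) + b0*r(k)."""
    x = np.zeros_like(rain)
    for k in range(len(rain)):
        x[k] = -a1 * (x[k - 1] if k > 0 else 0.0) + b0 * rain[k]
    return x

def fit_response_error(rain, y, a_grid, b_grid):
    """Minimize the response-error cost J_R = sum((y - x)^2) over a
    coarse parameter grid and return the best (a1, b0) pair."""
    best_cost, best_params = np.inf, None
    for a1 in a_grid:
        for b0 in b_grid:
            cost = np.sum((y - simulate(rain, a1, b0)) ** 2)
            if cost < best_cost:
                best_cost, best_params = cost, (a1, b0)
    return best_params
```

In practice a gradient-based or instrumental-variable method converges far faster, but the grid search makes the shape of the J_R surface easy to inspect and avoids any derivative bookkeeping.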
 These RIV algorithms have been used by the author and his colleagues in all previous unconstrained DBM modeling of rainfall-flow data and they are implemented in CAPTAIN by the rivbj routine for discrete-time model parameter estimation (an alternative rivcbj routine is available for continuous-time model estimation). However, the RIV algorithm is intended for unconstrained linear models and so, if the model is nonlinear or its parameters are constrained in some manner (e.g., to ensure that the model has real eigenvalues), it is necessary to develop and utilize a customized algorithm. In the case of nonlinear rainfall-flow models, for instance, the rivbj routine is embedded within a lsqnonlin routine, as discussed later in section 4.2.1.
 Of course, all estimation methods have their limitations and so it is important to be aware of such limitations and ensure that they do not prejudice the statistical inference. For instance, the recursive RIV algorithms can be usefully interpreted in Bayesian estimation terms and, indeed, they seem the very embodiment of Bayesian estimation if the stochasticity in the models can be considered in Gaussian or quasi-Gaussian terms (i.e., the stochasticity in the model can be quantified sufficiently by the first two statistical moments). Like the Kalman filter, however, they do not rely upon such Gaussian assumptions, and so provide statistically consistent parameter estimates in the face of non-Gaussian disturbances. However, if the stochastic disturbances are sufficiently non-Gaussian, then it may be advantageous to exploit alternative and more computationally expensive estimation methods that accommodate more general distributional assumptions, such as the numerical Bayesian-type methods that have received so much attention in the recent hydrological literature (e.g., the Differential Evolution Adaptive Metropolis (DREAM) procedure, based on a Markov chain Monte Carlo sampler: see Vrugt et al. [2009a] and Schoups and Vrugt).
2.3.3. Model Validation
 Statistical analysis of the generic model (2), or of HI-DBM models of the kind discussed in the next section, is normally carried out on two data sets: one used for model structure identification and parameter estimation; and one for validating the estimated model. For instance, in the example considered later in section 4, the entire set of 40 years' daily data is split into two segments: the first 30 years for identification and estimation; and the final 10 years for validation. The model is considered reasonably well validated if the explanation of the validation data is similar to that obtained on the estimation data set.
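The split-sample procedure just described can be made concrete with a minimal sketch. The Python below is illustrative only, on synthetic data with hypothetical numbers (a 75/25 split mirroring the 30/10-year Leaf River split): a simple model is estimated on the first segment and its coefficient of determination is compared across the two segments.

```python
import math

# A synthetic "flow" series driven by a deterministic input
# (all numbers are hypothetical, for illustration only)
n = 400
y = [1.0]
for k in range(1, n):
    y.append(0.9 * y[-1] + math.sin(0.31 * k))

split = int(0.75 * n)     # estimation segment, cf. 30 of the 40 years

# Estimate a simple AR(1) model y(k) = a*y(k-1) on the estimation data
num = sum(y[k] * y[k - 1] for k in range(1, split))
den = sum(y[k - 1] ** 2 for k in range(1, split))
a_hat = num / den

def r_squared(lo, hi):
    """Coefficient of determination of the one-step predictions
    over samples lo..hi-1."""
    sse = sum((y[k] - a_hat * y[k - 1]) ** 2 for k in range(lo, hi))
    ybar = sum(y[lo:hi]) / (hi - lo)
    sst = sum((v - ybar) ** 2 for v in y[lo:hi])
    return 1.0 - sse / sst

r_est = r_squared(1, split)       # performance on the estimation segment
r_val = r_squared(split, n)       # performance on the validation segment
print(round(r_est, 2), round(r_val, 2))
```

Similar values on the two segments are the "reasonably well validated" outcome described in the text; a sharp drop on the validation segment would indicate overfitting or nonstationarity.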
3. HI-DBM Modeling
 HI-DBM modeling is not only an extension of DBM modeling when elements of the DBM model are not considered fully satisfactory for some reason, it is also a way of investigating existing conceptual models to see if they can be improved using DBM models and modeling tools. A preliminary exercise of the latter type is described in Ratto et al., which compares DBM and TOPMODEL [Beven and Kirkby, 1979] models of the River Hodder catchment in Northwest England. However, the DBM analysis, coupled with additional sensitivity analysis, was only used to influence the selection of priors used in the associated generalized likelihood uncertainty estimation (GLUE) procedure [Beven and Binley, 1992], and it did not include any structural modification of either model, which is the core of the proposed HI-DBM analysis.
 In any specific application, HI-DBM modeling is applied to an existing set of input-output data, and it normally consists of three stages:
 A. One or more selected hypothetico-deductive conceptual models that have been proposed previously are investigated in detail, where necessary reoptimizing the model parameters in order to better understand the conceptual model characteristics and/or to remove any ambiguities in the previously published literature on the model.
 B. The same data used in stage A are then analyzed using standard DBM modeling procedures, and the validated DBM model is compared with the validated conceptual model in order to investigate the similarities and differences between the two model structures.
 C. Based on the results obtained in stage B, it is decided whether any elements of the existing conceptual model can be incorporated within the standard DBM model framework in order to improve its physical interpretability and/or whether the conceptual model can be modified in order to improve its structure and explanation of the data.
 It is the detailed examination and, if found necessary, structural modification of existing conceptual models that most differentiates the HI-DBM approach to modeling from standard DBM modeling practice, as used in previous references on the modeling of rainfall-flow processes, such as Young [2001a, 2003] and McIntyre et al. The way in which the HI-DBM analysis proceeds is naturally problem dependent. However, the real rainfall-flow example described in section 4 provides a good illustration of how HI-DBM modeling can be employed in practice on a set of well-known data. In particular, the DBM modeling tools are used to investigate whether the associated conceptual model has any deficiencies and, should this be the case, whether these deficiencies can be alleviated in some manner. In this sense, it is complementary to recent studies that have investigated the same data using alternative techniques [see e.g., Bulygina and Gupta, 2011].
4. An Illustrative Practical Example
 This example is based on the analysis of measured data from the humid Leaf River basin, located north of Collins, Mississippi, over the 40 water years from October 1948 to September 1988. These data consist of daily mean areal rainfall (precipitation, measured in mm/d), potential evapotranspiration (PET, mm/d), and streamflow (m3/d), which are available from the Hydrology Lab of the U.S. National Weather Service. A short section of the data is plotted in Figure 1, where the flow is shown with the same mm/d units as the other variables (obtained in the usual manner by conversion based on the area of the catchment). In the analysis below, the model structure identification and parameter estimation stage is based on the first 30 years of the data and validation is based on the last 10 years. These Leaf River data have been used in numerous studies (see references in the next section 4.1), and they have been selected for study in the present paper because of the considerable attention they have received in the past.
4.1. HyMOD Rainfall-Flow Model and Its Optimization Revisited
 The first stage of HI-DBM modeling is to investigate and fully understand the strengths and limitations of the conceptual model. The HyMOD conceptual model considered in this case is of a “Hammerstein” form, with an input nonlinearity that converts the measured rainfall into an “effective rainfall” measure that then provides the input to a fourth-order linear dynamic system. This effective rainfall nonlinearity is based on the probability distributed model (PDM) hypothesis; namely, a procedure for soil moisture accounting and determining the value of runoff production according to a probability-distributed storage capacity model that is driven by the rainfall, r(k), and PET, Ev(k), inputs [see Moore, 2007]. From here on, for convenience, this nonlinearity will be referred to as the “PDM nonlinear model.” The linear part of the model is a serial connection, or “Nash Cascade” [Nash, 1958], of nq identical, “quick-flow” storage tanks, each modeled as a first-order discrete-time system, in parallel with a single “slow-flow” tank, again modeled as a first-order discrete-time system. In previous publications on the HyMOD model, nq = 3: each of the three quick-flow tanks is conservative (unity steady-state gain) and characterized by a parameter Rq that defines a time constant (residence time) Tq; while the similarly conservative slow-flow tank has a parameter Rs that defines a time constant Ts.
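The linear routing structure just described, a Nash Cascade of identical quick-flow stores in parallel with one slow store, can be sketched as follows. This is illustrative Python, not the MATLAB code in the Appendix: the particular first-order store discretization and all parameter values are assumptions, chosen only so that each store is conservative (unity steady-state gain).

```python
def tank_step(x, u, R):
    """One step of a conservative first-order store with unity
    steady-state gain: x(k) = (1 - R)*x(k-1) + R*u(k); outflow = x(k).
    NOTE: this discretization is an illustrative assumption, not the
    exact HyMOD code (see the Appendix of the paper for that)."""
    return (1.0 - R) * x + R * u

def hymod_routing(u_eff, nq=3, Rq=0.5, Rs=0.05, alpha=0.8):
    """Route effective rainfall through a Nash Cascade of nq identical
    quick-flow tanks in series, in parallel with one slow-flow tank;
    alpha is a hypothetical quick/slow partitioning factor."""
    xq = [0.0] * nq       # quick-flow cascade states
    xs = 0.0              # slow-flow tank state
    flow = []
    for u in u_eff:
        inflow = alpha * u
        for i in range(nq):               # serial quick-flow cascade
            xq[i] = tank_step(xq[i], inflow, Rq)
            inflow = xq[i]
        xs = tank_step(xs, (1.0 - alpha) * u, Rs)
        flow.append(xq[-1] + xs)
    return flow

# Unit-pulse response: because every store is conservative, the routed
# flow integrates back to the input volume (mass balance).
h = hymod_routing([1.0] + [0.0] * 1999)
print(round(sum(h), 4))  # → 1.0
```

The serial cascade gives the quick-flow pathway the delayed, roughly symmetrical pulse response that a single store cannot reproduce, a point that becomes important in the FIR analysis of section 4.2.2.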
 Later in this study, it was discovered that two versions of HyMOD had been developed and used, although this is not clear from the published literature. The first version is an earlier MATLAB implementation of HyMOD written by Jasper Vrugt, Thorsten Wagener, and Douglas Boyle; while the second version was introduced later by Jasper Vrugt. As far as the author can establish, the precise details of this model have not been published in the open literature. In order to ensure that there is no doubt which model is being used in the present paper, therefore, the MATLAB code for the main model loop is listed in the Appendix. This shows that the only difference between the two versions is two equations that define the evapotranspiration loss and then update the internal state of the soil moisture accounting model based on this.
 Both versions of the model have a parameter vector of five elements: the maximum storage capacity of the watershed; the spatial variability of the soil moisture capacity; the partitioning factor between the quick and slow-flow tanks; the parameter Rq that defines the time constant of each quick-flow tank; and the similar parameter Rs for the slow-flow tank. In the above references, Rq and Rs are called “residence times,” but the residence times or time constants (with the normal units of time, here days) are actually obtained by converting the tank equations to continuous time, where the sampling interval is 1 day in the case of the Leaf River data (these conversions assume that the sampled variables are constant over each sampling interval: the “zero-order hold” assumption).
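The conversion itself is simple: under the zero-order hold assumption, a first-order discrete store with pole (1 − R) has time constant T = −Δt/ln(1 − R). This is the standard discrete-to-continuous conversion, assumed here to coincide with the paper's expressions; the short Python sketch below shows how a small Rs implies a long slow-flow residence time.

```python
import math

def time_constant(R, dt=1.0):
    """Residence time (days, for dt = 1 day) of a first-order discrete
    tank with pole (1 - R), under the zero-order hold assumption:
    T = -dt / ln(1 - R). Standard conversion, assumed to match the
    paper's (not reproduced in this excerpt)."""
    return -dt / math.log(1.0 - R)

# A small slow-flow parameter Rs implies a long residence time:
print(round(time_constant(0.05), 1), round(time_constant(0.5), 2))  # → 19.5 1.44
```

Note that for small R the conversion is close to T ≈ Δt/R, which is why R is sometimes loosely referred to as an inverse residence time.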
 A number of different optimization and estimation procedures have been utilized previously to estimate the parameters in the HyMOD model, for instance, the shuffled complex evolution (SCE-UA) method of Duan et al. applied to HyMOD by Misirli et al.; the related shuffled complex evolution metropolis (SCEM-UA) in Vrugt et al.; the multiobjective complex evolution (MOCOM-UA) algorithm in Wagener et al.; the Bayesian recursive estimation (BaRE) algorithm in Thiemann et al.; the dual ensemble Kalman filter (Dual EnKF) in Moradkhani et al. [2005a]; the particle filter (PF) in Moradkhani et al. [2005b] and Smith et al.; and the DREAM method mentioned previously [Vrugt et al., 2009a]. In the present paper, the parameters are optimized over the whole 30 years of the estimation data set using the standard lsqnonlin optimization procedure in MATLAB (see previous section 2.3.2). Following most of the previously published papers on the HyMOD model, this is applied using the deterministic response error criterion function defined by (15).
 The estimated version 1 HyMOD model parameters, as obtained from lsqnonlin with all the model states initialized to zero (i.e., [IC(1) IC(2) IC(3)]; see Appendix), are as follows, where the values are rounded to three significant figures and the figures in parentheses are the estimated standard errors:
These Rq and Rs estimates correspond to the associated time constants Tq and Ts, the latter of which is very large. The optimization was initialized with the following a priori parameter estimates and bounds (shown in parentheses):
The associated ARMA noise model is identified by the BIC as an ARMA (2,5) process, with the parameters estimated as:
The average variances of the simulated output error and the residual prediction error e(k) yield the associated coefficients of determination. When this estimated model is applied to the validation data set, these coefficients of determination are improved, so the validation performance is a little better than the estimation performance.
 A range of other model structures were considered for both versions of the HyMOD model using different initial conditions on the model parameters and states. A number of the results are reported in the supporting information but Tables 1 and 2 show the most important of these. In particular, Table 1 compares the parameter estimates, rounded to three significant figures, with the estimated standard errors shown in parentheses; and Table 2 compares the various coefficients of determination for the estimation and validation data sets. The above estimation results are given in the first row of each table, while the second row shows the estimation results obtained for the same version 1 HyMOD model, kindly generated by Jasper Vrugt using the DREAM optimization software tools mentioned previously in section 2.3.2. The third and fourth rows in each table provide the equivalent results obtained with version 2 of the model.
Table 1. HyMOD Model Parameter Estimation Results
Vrugt (personal communication)
Vrugt (personal communication)
Table 2. Comparison of HyMOD Model Performance
Vrugt (personal communication)
Vrugt (personal communication)
 Figure 2 compares the simulated deterministic outputs, as generated by three of the estimated models, with the measured flow, over a small, 3 month segment of the data. These are the best performing version 1 model (as noted above, this is virtually identical to the model obtained by Jasper Vrugt, whose result is therefore not shown in the plot); and the best performing version 2 models. The flow-dependent 95% confidence bounds shown in the plot (twice the standard error bounds) relate to the HyMOD model with parameter estimates (17). These are computed using the information from equation (9), with the coefficient estimates obtained for this model. This same approach is used on subsequent response plots of this type.
 The estimated standard error bounds shown in Table 1 should be considered with caution. The estimation of such bounds depends on the underlying estimation theory and some of the theoretical assumptions are contravened in this example. First, the residuals should be normally distributed, serially uncorrelated, and homoscedastic, none of which apply in this case. Second, it is assumed that the parameters are sensibly constant over the very large estimation sample set when, as we shall see, there is clear evidence that this is not the case.
 On a practical point, note that the full estimation results suggest that there are multiple minima in the criterion function-parameter hypersurface with similar minimum cost values and this is confirmed graphically by the projection of the hypersurface in three dimensions. While this did not cause any optimization problems in this example because the optimization takes only a few seconds, it would probably be advantageous, in general, to use a computationally more demanding sequential Monte Carlo-based method of optimization, such as DREAM, because this is much less prone to finding local optimal solutions, provided the global minimum lies within the specified prior bounds.
4.2. DBM Rainfall-Flow Model
 The second stage in the HI-DBM modeling procedure is standard DBM modeling based on the Leaf River data. As mentioned in section 2, the DBM model for this example is identified as one of the simplest cases of the general SDP DBM model (2). In this section, the identification and estimation of this simple model (4a) is considered in more detail, starting with the initial identification of various model structures and continuing with the more detailed estimation and examination of specific model structures that have potential in the subsequent HI-DBM modeling.
4.2.1. Initial DBM Modeling
 The first step in DBM modeling is to evaluate the data in purely data-based modeling terms. Here, initial SDP modeling tends to confirm previous DBM rainfall-flow applications and suggests a “Hammerstein” model structure with an “effective rainfall” input nonlinearity in series with a linear TF model. Not surprisingly, this identified model structure is consistent with the HyMOD model, although the component parts are not the same. In particular, as pointed out in section 2, the nonlinearity exploits the flow y(k) as a surrogate measure of soil moisture, so that the model is only suitable for forecasting applications and cannot be used for simulation. Also, it should be noted that the SDP estimation suggests there may be some additional state dependency in the parameters of the denominator polynomial of the TF model that control the recession dynamics. This indicates the possibility of additional nonlinearity or nonstationarity in the model, and this is considered in later sections of the paper.
 More specifically, the effective rainfall nonlinearity is parameterized in terms of a single constant parameter γ, and the full DBM model structure is identified in the form:
where the numerator and denominator of the TF are constant parameter polynomials, identified over a range of different orders n and m; while the additive, heteroscedastic noise is identified as an ARMA(1,5) process, i.e.,
in which e(k) is a zero-mean, serially uncorrelated sequence of random variables with changing variance. Note that the scale factor β is not optimized in the model; it is simply introduced so that, over the data set of N samples, the amount of flow matches the amount of effective rain, provided both are measured in the same units (here, mm/d). In this manner, the steady-state gain [see e.g., Young, 2011b] of the linear TF part of the model will be near to unity and, within the estimated uncertainty bounds, the model will be conservative in mass-balance terms.
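The normalization role of β can be made concrete. The sketch below is illustrative Python with wholly hypothetical numbers; the formula β = Σy/Σu is the obvious choice implied by the text, stated here as an assumption rather than quoted from the paper.

```python
def mass_balance_scale(flow, eff_rain):
    """Scale factor beta chosen so that total scaled effective rainfall
    equals total flow: beta = sum(flow) / sum(eff_rain). This formula
    is the normalization implied by the text, assumed not quoted."""
    return sum(flow) / sum(eff_rain)

y = [0.2, 1.5, 3.0, 1.0, 0.5]    # hypothetical flow series (mm/d)
u = [0.0, 4.0, 2.0, 0.4, 0.0]    # hypothetical effective rainfall (mm/d)
beta = mass_balance_scale(y, u)
u_scaled = [beta * v for v in u]
print(abs(sum(u_scaled) - sum(y)) < 1e-9)  # → True
```

With the input scaled in this way, any departure of the estimated TF steady-state gain from unity is attributable to the dynamics rather than a units mismatch.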
 The identification and estimation of the complete model (20), (21) is carried out using an optimization procedure in the form of a MATLAB m-file in which, at each update of the parameter estimates, the lsqnonlin optimization routine in the MATLAB Optimization Toolbox is used to estimate the γ parameter of the nonlinear input function and the CAPTAIN routine rivbjid, embedded within the optimization, is used to estimate the TF model parameters, given the effective rainfall defined by the latest estimate of γ. Furthermore, the rivbjid algorithm is constrained so that the denominator polynomial has physically meaningful real roots (for details, see http://captaintoolbox.co.uk/Technical_Matters/Technical_Matters.html).
 The full details of the DBM model identification for different order TF models are given in the supporting information, but the most interesting models are discussed in the next two subsections. Since all models considered in this identification analysis use the same exponential form of the effective rainfall nonlinearity, the simple triadic form defining the deterministic linear TF part of the model is used to differentiate them. Note also that the pure time delay δ in (20) is identified as zero in all cases. Indeed, a number of the estimated models have a nonzero leading numerator coefficient, so that there are “instantaneous” effects [see e.g., Young, 2011b], i.e., the flow is affected by rain falling during the same day.
4.2.2. Finite Impulse Response Model
 The first, rather unusual but very revealing model is the 100th order finite impulse response (FIR) model relating the effective rainfall to y(k). This is a special [0 100 0] TF model with the denominator polynomial set to unity: i.e.,
 It is called FIR because equation (22) is the discrete-time equivalent of the convolution integral equation for a dynamic system, with the numerator coefficients representing the finite impulse response ordinates. Indeed, because of this, it may be considered as a “nonparametric” estimate of the impulse response. This is an interesting model to consider first when, as here, we have a large amount of data that can justify the estimation of such a large number of parameters. In particular, it is easy to estimate by linear least-squares estimation, and it provides a useful benchmark against which to evaluate lower-order models.
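The least-squares estimation mentioned above can be sketched directly. The following illustrative Python recovers a short, hypothetical FIR(4) unit hydrograph from synthetic data; the paper's FIR(100) uses the same algebra at larger scale (and, in practice, more numerically careful linear algebra than this small worked example).

```python
import math

def fir_ls(u, y, m):
    """Estimate the m+1 ordinates of an FIR model
    y(k) = h0*u(k) + h1*u(k-1) + ... + hm*u(k-m)
    by ordinary least squares: normal equations solved by Gaussian
    elimination with partial pivoting."""
    n = m + 1
    S = [[0.0] * n for _ in range(n)]
    t = [0.0] * n
    for k in range(m, len(y)):
        x = [u[k - j] for j in range(n)]
        for i in range(n):
            t[i] += x[i] * y[k]
            for j in range(n):
                S[i][j] += x[i] * x[j]
    for c in range(n):                    # elimination with pivoting
        p = max(range(c, n), key=lambda r: abs(S[r][c]))
        S[c], S[p], t[c], t[p] = S[p], S[c], t[p], t[c]
        for r in range(c + 1, n):
            f = S[r][c] / S[c][c]
            for j in range(c, n):
                S[r][j] -= f * S[c][j]
            t[r] -= f * t[c]
    h = [0.0] * n
    for i in range(n - 1, -1, -1):        # back substitution
        h[i] = (t[i] - sum(S[i][j] * h[j] for j in range(i + 1, n))) / S[i][i]
    return h

true_h = [0.1, 0.4, 0.3, 0.15, 0.05]      # hypothetical unit hydrograph
u = [abs(math.sin(1.7 * k)) for k in range(300)]
y = [sum(true_h[j] * u[k - j] for j in range(5)) if k >= 4 else 0.0
     for k in range(300)]
h_hat = fir_ls(u, y, 4)
print([round(v, 3) for v in h_hat])  # → [0.1, 0.4, 0.3, 0.15, 0.05]
```

Because the regression is linear in the FIR ordinates, no iterative search is needed, which is exactly why the FIR model makes a convenient nonparametric benchmark.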
 In this case, the FIR(100) model explains the Leaf River data well when compared with the optimized HyMOD model, for both estimation and validation. Moreover, the unit impulse response (i.e., unit hydrograph: the response to an input of 1 mm of effective rainfall in one day) of this model can be explained very well by numerous lower-dimensional TF models. One of the best explanations is provided by the simple, first-order TF model, as shown in Figure 3. This is an interesting result because it shows that the fast response can be modeled very well by the seven coefficients in the TF numerator polynomial, with the very low amplitude slow response modeled by the first-order denominator polynomial, whose parameter corresponds to a decay time constant of 31.9 days, within a standard error range of between 28.4 and 36.2 days. Moreover, the response does not contain the second-order exponential recession that normally characterizes the faster response of many hydrographs and is usually modeled by a second-order process, with quick and slow dynamic modes (see any of the references concerned with past DBM rainfall-flow modeling in section 1).
 Finally, it is interesting to note that the more symmetrical fast response seen in Figure 3 is redolent of higher-order, serial connections of first-order dynamic processes and is probably the reason why the Nash Cascade of three quick-flow stores had to be used in the HyMOD model, rather than the more usual, single store. In fact, if the number of quick-flow tanks in version 1 of the HyMOD model is increased to nq = 10, the estimated model parameters obtained with the initial states again set to zero are as follows:
and it explains the flow data much better, with estimation and validation statistics not much different from those of the DBM models. Further increase in nq has little effect until, eventually, there is a rapid drop in performance. As expected in relation to the original HyMOD model, each of the 10 quick-flow tanks now has a smaller estimated time constant and the estimated time constant of the slow-flow tank is also much reduced. Furthermore, all of the estimates seem well defined, without any signs of poor identifiability, suggesting that the previously utilized HyMOD model structure for the Leaf River data might be best modified to this alternative, higher-order form. However, this modified HyMOD model is clearly now an HI-DBM model, since the modified structure has arisen directly from the DBM analysis. Consequently, further consideration of the model is delayed until section 4.3, which deals with HI-DBM modeling.
4.2.3. Other Nonlinear DBM Models
 The rather unusual response characteristics of the FIR model are confirmed by the other DBM models using different order TF models (see the supporting information). For instance, the DBM model with a TF having a seventh-order numerator and third-order denominator (i.e., [3 7 0]) is favored, with the lowest BIC of −8863, accompanied by good estimation and validation statistics, including 0.899 for validation; while a more parsimonious model has similar statistics, with BIC = −8814 and estimation and validation values of 0.871 and 0.901, respectively. In the present context, however, the conceptual HyMOD model structure, when considered in TF terms, has a fourth-order numerator and denominator and the equivalent DBM model also performs quite well, with BIC = −8704 and values for estimation and validation of 0.863 and 0.891, respectively.
 Finally, Figure 4 compares the simulated flow response of various estimated models over the same portion of the data considered in Figure 2. Except for the second flow event, all of the models explain the flow peaks quite well; and only the [2 3 0] model fails to reproduce the low, “base flow” effects satisfactorily.
4.3. HI-DBM Modeling
 The initial DBM modeling discussed in the previous section is fairly objective and makes no major prior hypotheses about the model form other than assuming that the generic, SDP, TF model class of equation (4a), with single-state and single-valued SDP nonlinearities (see section 2), can provide an appropriate characterization of rainfall-flow dynamics. Consequently, it provides the basis for flow and flood forecasting, as discussed by the references cited previously in section 1. More importantly in the present context, it has also thrown valuable light on the overall dynamic behavior of the rainfall-flow processes in the Leaf River catchment that can form the basis for the HI-DBM modeling.
 Here, we will consider the simplest and most obvious HI-DBM model, which is obtained by replacing the effective rainfall nonlinearity in the DBM model by the conceptual PDM effective rainfall mechanism in the HyMOD model. In other words, the effective rainfall is generated by this mechanism, which involves both rainfall and PET inputs, rather than the SDP relationship in equation (20). As such, this change not only introduces a conceptual nonlinear mechanism that has a clearer hydrological interpretation than the SDP nonlinearity, but it also converts the DBM model from one that is limited to forecasting applications, into a stochastic simulation model that has a much wider application potential.
 Based on these considerations and the DBM modeling results obtained in the previous sections, the general HI-DBM model has the following structure:
where the effective rainfall input is generated by the PDM nonlinear model, defined by the associated subset of HyMOD parameters. On this basis, it is possible to look at various possible HI-DBM models that vary only in terms of the linear system TF and associated ARMA noise models. Although in the DBM modeling analysis the constrained TF model does not yield quite as good estimation and validation results as the unconstrained model, it is, perhaps, more appealing to hydrologists because it has a structure that is similar to that used in the HyMOD model. For this reason, we will look in more detail at the results obtained for this model, but refer back subsequently to the unconstrained model, as well as the 10th-order HyMOD model mentioned in section 4.2.2.
 The estimated HI-DBM model with the constrained TF model takes the form:
In the case where the initial states of the version 1 PDM nonlinear model (see section 4.1) are set to zero, the optimized estimates of the parameters are as follows:
while the associated ARMA(1,5) noise model parameter estimates are:
yielding the associated coefficients of determination. In order to allow for direct comparison with the HyMOD model, the TF model can be decomposed by partial fraction expansion (the residuez routine in MATLAB) and block diagram manipulation into a parallel connection of quick and slow flows, with the output flow y(k) then generated by:
and the four-parameter FIR “prefilter.” In HyMOD terms, the associated quick and slow-flow time constants and the partitioning factor follow directly from the decomposed TF parameters. A block diagram of the model is shown in Figure 5, which has clear similarities with the HyMOD model diagram in Moradkhani et al. [2005a].
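The parallel decomposition itself is simple residue algebra when the poles are real and distinct. The illustrative Python below works through a strictly proper, hypothetical second-order case; the decomposition in the paper, via residuez, additionally yields the FIR prefilter because the numerator order exceeds the denominator order, and that part is omitted here.

```python
# Decompose b0 / ((1 - p1*z^-1)(1 - p2*z^-1)) into parallel pathways
#   g1 / (1 - p1*z^-1)  +  g2 / (1 - p2*z^-1)
# by evaluating the residue at each pole (cf. MATLAB's residuez).
def parallel_decompose(b0, p1, p2):
    g1 = b0 / (1.0 - p2 / p1)
    g2 = b0 / (1.0 - p1 / p2)
    return g1, g2

b0, p_quick, p_slow = 1.0, 0.5, 0.95      # hypothetical poles
g1, g2 = parallel_decompose(b0, p_quick, p_slow)

# Check: the parallel form reproduces the second-order impulse response
a1, a2 = -(p_quick + p_slow), p_quick * p_slow
h = []
for k in range(20):
    past1 = h[k - 1] if k >= 1 else 0.0
    past2 = h[k - 2] if k >= 2 else 0.0
    h.append(-a1 * past1 - a2 * past2 + (b0 if k == 0 else 0.0))
ok = all(abs(h[k] - (g1 * p_quick ** k + g2 * p_slow ** k)) < 1e-9
         for k in range(20))
print(round(g1, 4), round(g2, 4), ok)  # → -1.1111 2.1111 True
```

Each pathway gain and pole then maps to a HyMOD-style store via the time-constant conversion discussed in section 4.1.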
 The only difference between the structure of this TF model and the linear part of the HyMOD model is that the effective rainfall from the PDM nonlinearity is modified by the simple FIR prefilter. This operates on the PDM-generated effective rainfall input to produce a modified effective rainfall, so it is simply introducing a short-term refinement in order to better explain the flow data. The fact that this explanation is significantly better than the standard HyMOD model suggests that this is one area where the HyMOD model hypothesis might be reinvestigated. For example, Monte Carlo analysis shows that the gain of the prefilter, as defined by the sum of its parameters (1.053), is insignificantly different from unity, so the transformation is having little statistically significant effect on the magnitude of the effective rainfall. On the other hand, it introduces a significant short-term lag effect, suggesting possible limitations in the dynamic characteristics of the PDM nonlinearity.
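The gain check referred to above is simply the sum of the FIR coefficients (set z⁻¹ = 1 in the transfer function). A trivial sketch, with hypothetical coefficients chosen only so that they sum to the quoted value of 1.053:

```python
def fir_gain(coeffs):
    """Steady-state gain of an FIR filter: the sum of its coefficients
    (the transfer function evaluated at z^-1 = 1)."""
    return sum(coeffs)

b = [0.55, 0.3, 0.15, 0.053]   # hypothetical four-parameter prefilter
print(round(fir_gain(b), 3))   # → 1.053
```

A gain near unity means the prefilter redistributes the effective rainfall in time without changing its total volume, which is exactly the lag effect the text describes.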
 Tables 3 and 4 compare the HI-DBM model (28)–(29) with its version 2 equivalent, as well as the two corresponding HI-DBM models incorporating an unconstrained TF model. They also show the comparative results for the 10th-order HyMOD model introduced in section 4.2.2, which is of the standard HyMOD form but with the quick-flow pathway modeled as a Nash Cascade with 10 first-order elements, rather than the usual three. In all cases, the initial PDM states are set to zero since the estimates are not too sensitive to these initial conditions. As expected, all six models explain the flow measurements better than the standard HyMOD model, and, in general, the version 2 models perform a little better than version 1.
Table 3. Comparison of HI-DBM Model Parameter Estimation Results
Table 4. Comparison of HI-DBM Model Performance
 In this latter regard, it is interesting to note that the effective rainfall generated by version 2 of the HI-DBM model is visually quite similar to that generated by the DBM model. This is confirmed by the instantaneous cross-correlation coefficient of 0.93 between the two series. Also, the plot of the effective rainfall against y(k) for this HI-DBM model can be approximated, using least-squares optimization (see the supporting information), by the SDP relationship of the DBM model (20), with an estimated γ that is virtually the same as that obtained in the DBM model case, so it is not surprising that the performance of the two models is quite similar.
 From Table 4, we see that there is not too much to choose between the constrained and unconstrained models, but the former is marginally worse on most of the statistical measures. However, both versions of this model fit the initial peak in the segment of the data we have been considering previously better than any of the DBM models. In fact, the version 1 model is a little superior in this regard, as shown in Figure 6. Here the 95% confidence bounds are computed as described before in section 2, but now using the regression coefficients in equation (9) estimated for this model.
 The 10th-order HyMOD models in the final two rows of Tables 3 and 4 are interesting because they perform only a little worse than the other HI-DBM models and can be considered as a direct HI-DBM extension of the standard HyMOD model. The main reason for this close similarity can be seen if the impulse response characteristics of the quick-flow pathways are compared, as shown in Figure 7 for the version 1 variants, where the solid line is the average of the two impulse responses. It is clear that these plots are quite similar: in other words, the 10th-order Nash Cascade response mimics the quick-flow pathway of the decomposed TF model rather well and the two models are dynamically quite similar. Despite its marginally inferior performance, when compared with the other HI-DBM models, hydrologists might feel that this HI-DBM HyMOD model, or its version 2 equivalent, has conceptual advantages that outweigh its performance inferiority, particularly in situations where the physical interpretation of the model is considered of primary importance.
 Finally, an important step in stochastic modeling is the examination of the residual correlation statistics. In all of the HI-DBM models reported in Tables 3 and 4, the estimated heteroscedastic residual series is serially uncorrelated but, while its cross correlation with both r(k) and Ev(k) is very low, it is statistically significant in both cases, suggesting that the models could be improved in this regard.
4.3.1. Are the HI-DBM Model Parameters Changing Over Time?
 The rivbj algorithm in CAPTAIN automatically produces recursive estimates, and these can help in diagnosing whether there are any signs of time variation, poor identifiability, or aberrations in the data [see Young, 2011b]. Figure 8 shows the recursive estimates of the system model parameters obtained for the version 1 HI-DBM model, under the assumption that the parameters are constant over the sampling interval and starting with a “diffuse prior” (initial estimates set to zero and associated diagonal covariance matrix with elements of 10^5). The convergence of these estimates is quite slow but the final estimates at the end of the estimation data set are, as required, equal to the estimates obtained by nonrecursive rivbj estimation: i.e.,
However, while there are no signs of poor identifiability (see section 2.3.1), there are quite sharp “jumps” that occur at some dates and are associated with statistically significant events in the recursive “innovations” process [see e.g., Young, 2011b]. These indicate the possibility of parameter variation caused by either sharply changing dynamic characteristics or aberrations in the data, such as a changing time delay δ, or “disinformation” [see Beven et al., 2011] caused by major errors in the rainfall and flow measurements.
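The recursive estimation idea, including the diffuse prior, can be sketched with ordinary recursive least squares (RLS). This is illustrative Python on a synthetic, noise-free first-order model, not the rivbj algorithm (which adds instrumental-variable refinement and the noise model): the estimates start from zero with covariance 10⁵·I and converge to the batch least-squares values by the end of the data, just as described above.

```python
import math

# Synthetic noise-free first-order system y(k) = a*y(k-1) + b*u(k)
a_true, b_true = 0.7, 0.4
u = [abs(math.sin(1.3 * k)) for k in range(150)]
y = [0.0]
for k in range(1, 150):
    y.append(a_true * y[k - 1] + b_true * u[k])

theta = [0.0, 0.0]                   # diffuse prior: estimates start at zero...
P = [[1e5, 0.0], [0.0, 1e5]]         # ...with a large prior covariance
for k in range(1, 150):
    x = [y[k - 1], u[k]]             # regressor vector
    Px = [P[0][0] * x[0] + P[0][1] * x[1],
          P[1][0] * x[0] + P[1][1] * x[1]]
    denom = 1.0 + x[0] * Px[0] + x[1] * Px[1]
    g = [Px[0] / denom, Px[1] / denom]                 # gain vector
    err = y[k] - (theta[0] * x[0] + theta[1] * x[1])   # innovation
    theta = [theta[0] + g[0] * err, theta[1] + g[1] * err]
    P = [[P[i][j] - g[i] * Px[j] for j in range(2)] for i in range(2)]

print(round(theta[0], 3), round(theta[1], 3))  # → 0.7 0.4
```

The innovation sequence err is the quantity inspected for the statistically significant "jump" events mentioned in the text: with real data, a large innovation signals either a genuine parameter change or an aberrant measurement.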
 In order to investigate the possibility of time-variable parameters (TVPs) further, we can resort to full TVP estimation using the dtfm routine in the CAPTAIN toolbox. This recursive algorithm exploits recursive fixed interval smoothing estimation [see Young, 2011b, and the references therein] to provide “smoothed” estimates of the TVPs, in which the estimate at sample k is based on all N samples of the data. This has the advantage that the estimation variance is reduced and the estimate is not lagged by the filtering aspects of the standard recursive TVP estimation algorithm, so that we see changes in the estimates at the time that they occur. The degree to which the algorithm “tracks” the parameter variations is controlled by a vector of noise-variance ratio (NVR) parameters [Young and Pedregal, 1999]. This defines the variance of the white noise input to a simple stochastic model of the parameter variations in the form of a vector random walk or integrated random walk and is optimized by the dtfmopt routine in CAPTAIN.
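The random walk TVP mechanism can be illustrated for a single parameter. The sketch below is Python with wholly hypothetical numbers and is a filtering-only stand-in for dtfm (no fixed interval smoothing, no TF dynamics): the parameter b(k) in y(k) = b(k)u(k) is modeled as a random walk whose step variance, relative to the observation noise variance, is the NVR.

```python
import math

# Synthetic data with a time-variable parameter: b(k) steps from 0.5 to
# 1.0 halfway through the record (all values hypothetical)
n = 300
b_true = [0.5 if k < 150 else 1.0 for k in range(n)]
u = [abs(math.sin(1.1 * k)) + 0.1 for k in range(n)]
y = [b_true[k] * u[k] for k in range(n)]

def tvp_filter(nvr):
    """Scalar Kalman filter for y(k) = b(k)*u(k) with a random-walk
    parameter model b(k) = b(k-1) + w(k); nvr = var(w)/var(obs noise)."""
    b_hat, p = 0.0, 1e5              # diffuse prior
    est = []
    for k in range(n):
        p += nvr                     # random-walk prediction step
        gain = p * u[k] / (u[k] * p * u[k] + 1.0)
        b_hat += gain * (y[k] - b_hat * u[k])      # correction step
        p *= 1.0 - gain * u[k]
        est.append(b_hat)
    return est

est = tvp_filter(nvr=0.1)
print(round(est[100], 2), round(est[-1], 2))  # → 0.5 1.0
```

A larger NVR lets the estimate track faster changes at the cost of higher estimation variance; this is the trade-off that the maximum likelihood optimization in dtfmopt resolves, and fixed interval smoothing then removes the tracking lag visible in a filtered estimate like this one.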
 Consider again the HI-DBM model, where the estimated constant parameters are given in (30). The NVR vector is optimized with the first element constrained to zero, ensuring that the TF denominator polynomial coefficient is maintained at its estimated value of −0.882, while the NVR elements associated with the seven coefficients of the numerator are constrained to be equal and optimized by maximum likelihood to 0.188. In this way, the attention of the estimator is concentrated on the effect of the numerator parameters, which multiply the present and past values of the effective rainfall signal. Since this optimized NVR allows for very rapid changes in these parameters, the resulting TVP model explains the measured flow almost perfectly.
 Figure 9 (top) shows the recursive TVP estimate of the largest numerator polynomial parameter over a segment of the data, with the estimated standard error bounds shown in gray. The events marked with a red spot are those where the change in the estimated parameter from the constant parameter estimate (in this case 0.149) is greater than twice its estimated standard error bound, suggesting that a significant change in the parameter is necessary to fit the data at this location. Figure 9 (bottom) shows the location of these events on the measured flow data, again as a red dot, as well as the flow output of the constant parameter model and both the measured and effective rainfall. Each of these events (often defined by a group of the markers) can be inspected in greater detail in order to see if the change in the parameter estimates can be attributed to obvious measurement errors or is more likely to be model dependent.
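The event-marking rule just described is simple to state programmatically: flag any sample where the smoothed TVP estimate departs from the constant-parameter estimate by more than twice its estimated standard error. A minimal sketch (the numerical values are hypothetical, not the Figure 9 estimates):

```python
import numpy as np

def flag_events(b_smooth, se, b_const, n_sigma=2.0):
    """Mark samples where the smoothed TVP estimate departs from the
    constant-parameter estimate by more than n_sigma standard errors,
    mirroring the red-spot events of Figure 9."""
    return np.abs(b_smooth - b_const) > n_sigma * se

# Hypothetical smoothed estimates and standard errors for illustration
b_smooth = np.array([0.15, 0.14, 0.30, 0.16, 0.40])
se = np.full(5, 0.05)
events = flag_events(b_smooth, se, b_const=0.149)  # [F, F, T, F, T]
```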
 From the above results and others (see the supporting information) based on rectangular-weighted-past estimation [Young, 1984; Wagener et al., 2002; Young, 2011b], as well as similar results for the HI-DBM and standard HyMOD models, we can conclude that the model parameters need to vary quite substantially in order to maintain a good explanation of the measured flow over the whole estimation data set.
5. A Simulation Example
 The analysis and results discussed in the previous section all relate to the measured Leaf River data set. While this analysis shows that the HI-DBM modeling approach can yield models that are well validated and provide a reasonable explanation of the data, it does not fully confirm the efficacy of the approach because we do not know the nature of the hypothetical “true” system. The more objective evaluation of the HI-DBM approach considered in the present section is obtained from a simulation example in which data are generated by a fully specified HyMOD-like model, using the measured rainfall r(k) and evapotranspiration Ev(k) inputs from the Leaf River example and adding known stochastic noise to the output flow measurement. The HyMOD model is then modified to see if the HI-DBM modeling approach is able to discover the nature of these modifications without access to any prior information in this regard.
 The stochastic simulation model used for this evaluation exercise is the HyMOD model with the following parameters:
but where the quick-flow pathway is changed so that the three quick-flow storage tanks are replaced by a single storage tank with a parameter (time constant days); and another storage tank is added in series with the parallel connection of this quick-flow tank and the slow-flow tank (time constant days), where the parameter (i.e., a time constant of days). In addition, highly colored, heteroscedastic noise, based on the kind of noise models identified in the previous modeling exercises, is added to the flow output from this model, where the heteroscedastic ARMA (2,5) model used in this case has the parameters:
and the noise has an average variance of . A segment of the data produced in this manner is shown in Figure 10. As can be seen from Figure 10 (bottom), the additive colored noise level is quite high with a noise/signal ratio, by standard deviation, of 38% in relation to the noise-free data. Of course, because the heteroscedasticity is related to the squared flow (see earlier comments in 2), the noise is much higher around the peak flows. So it is clear that this full stochastic model presents a quite formidable estimation problem.
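Generating this kind of heavily colored, heteroscedastic noise is straightforward to sketch. The fragment below uses illustrative ARMA(2,5) coefficients and a synthetic "flow" proxy (the identified coefficients and the actual flow series are given elsewhere in the paper), scaling the noise amplitude with the squared flow as described above and computing the noise/signal ratio by standard deviation:

```python
import numpy as np

rng = np.random.default_rng(1)

def arma(ar, ma, e):
    """Simulate ARMA noise n(k) from white noise e(k):
    (1 + ar1*z^-1 + ...) n(k) = (1 + ma1*z^-1 + ...) e(k)."""
    n = np.zeros_like(e)
    for k in range(len(e)):
        n[k] = e[k]
        for j, c in enumerate(ma, 1):
            if k - j >= 0:
                n[k] += c * e[k - j]
        for j, c in enumerate(ar, 1):
            if k - j >= 0:
                n[k] -= c * n[k - j]
    return n

# Illustrative coefficients only, chosen to give stable, heavily colored noise
ar = [-1.2, 0.35]                    # AR(2): poles at 0.7 and 0.5
ma = [0.5, 0.3, 0.2, 0.1, 0.05]      # MA(5)
x = 1.0 + 5.0 * np.sin(np.linspace(0, 20, 2000))**2   # synthetic "flow" proxy
e = rng.normal(0.0, 0.05, 2000)
noise = arma(ar, ma, e) * x**2 / np.mean(x**2)   # amplitude scales with squared flow
nsr = np.std(noise) / np.std(x)      # noise/signal ratio by standard deviation
y = x + noise                        # noisy "measured" flow
```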
 The DBM identification stage of the HI-DBM analysis identifies the correct [3 2 0] TF model structure with an ARMA(2,5) noise model, which is the structure that supports the linear TF part of the simulated model:
The resulting estimation based on this model structure generates the following estimated parameters for the deterministic system part of the model:
This decomposes into the correct model structure, with estimated time constants of 90, 1.39, and 4.52 days, respectively, and a steady-state gain of 1.01. The associated ARMA(2,5) noise model parameter estimates and coefficients of determination are
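The [3 2 0] structure and its decomposition can be checked mechanically: the parallel connection of the quick- and slow-flow tanks, placed in series with the added routing tank, yields a TF with three denominator and two numerator parameters, and the time constants follow from the denominator roots via T_i = −Δt/ln(p_i). In the Python sketch below the tank gains are hypothetical (chosen to sum to unity) and the time constants are simply set to the quoted values, since the estimated coefficients themselves are not reproduced here:

```python
import numpy as np

def tank(T, gain=1.0, dt=1.0):
    """Discrete-time first-order storage tank sampled at interval dt:
    returns (num, den) polynomial coefficients in z^-1, with pole
    exp(-dt/T) and steady-state gain `gain`."""
    a = -np.exp(-dt / T)
    return np.array([gain * (1.0 + a)]), np.array([1.0, a])

def series(g1, g2):
    return np.polymul(g1[0], g2[0]), np.polymul(g1[1], g2[1])

def parallel(g1, g2):
    num = np.polyadd(np.polymul(g1[0], g2[1]), np.polymul(g2[0], g1[1]))
    return num, np.polymul(g1[1], g2[1])

# Quick- and slow-flow tanks in parallel, then a series routing tank;
# hypothetical gains, time constants set to the quoted 1.39, 90, 4.52 days
quick = tank(1.39, gain=0.6)
slow = tank(90.0, gain=0.4)
extra = tank(4.52)
num, den = series(parallel(quick, slow), extra)

# Structure check: 3 denominator and 2 numerator parameters -> [3 2 0]
ssg = np.sum(num) / np.sum(den)                 # steady-state gain
T = sorted(-1.0 / np.log(np.roots(den).real), reverse=True)
```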
 The system model parameter estimates are quite good, so it is not surprising that the deterministic model explains the simulated noisy data quite well, as indicated by the statistics in (35); and this performance extends to the full stochastic model.
 In this simulation environment, however, we are able to investigate the performance further and examine to what extent the deterministic output of the model matches the true simulated deterministic output x(k). Here, the performance is exceptional: the model is reproducing the simulated deterministic output almost exactly, even though it was estimated from flow observations that are contaminated with a quite high level of heavily colored, heteroscedastic noise. This is illustrated graphically in Figure 11, which compares the model output (full line) with x(k) (small circles) and the noisy data y(k) (black dots).
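The coefficient of determination used throughout these comparisons is easy to compute. A minimal sketch, under the usual definition of R_T^2 as one minus the ratio of residual variance to output variance (the small numerical example is purely illustrative):

```python
import numpy as np

def rt2(y, yhat):
    """Coefficient of determination R_T^2 = 1 - var(residual)/var(y),
    computed about the mean of y (equivalent to the Nash-Sutcliffe
    efficiency)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return 1.0 - np.sum((y - yhat)**2) / np.sum((y - np.mean(y))**2)

# Illustrative data: a near-perfect model output
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
yhat = np.array([1.1, 1.9, 3.0, 4.2, 4.9])
```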
 Finally, the deterministic output of the optimized standard HyMOD model, with three quick-flow storage tanks, also has a high coefficient of determination in relation to the noise-free simulated output, and is shown as a dashed line with small crosses in Figure 11. The standard model also explains the noisy measured data quite well, and this is all that would be available in the case of real data, where the noise-free data are obviously not available. Although, arguably, an experienced and diligent conceptual modeler might question the model, it is quite likely that the hypothesis of the standard HyMOD model would be accepted, based on these quite good results, even though it has a structural deficiency. This is in contrast to the HI-DBM model, where the initial inductive analysis correctly identifies the true structure, so removing most of the ambiguity in this regard.
 When evaluating the above results, one caveat is necessary: it has been assumed that there are no errors in the rainfall measurements, and this is not likely to be the case with real data. In other words, like most rainfall-flow modeling studies in hydrology, the example has not considered the input “errors-in-variables” problem (see the supporting information for further discussion on this, as well as the related problems of inverse estimation and the separation of aleatory and epistemic uncertainty). And even where this problem has been considered [see e.g., Kuczera et al., 2006; Vrugt et al., 2009b; Kirchner, 2009], the basic ambiguity that exists when there are both input and output noise effects present on the data [see e.g., Söderström, 2007, and the references therein] has not been addressed fully. So there is a clear need for more theoretical and practical research on this problem before a satisfactory solution is obtained.
 Recent developments in the DBM approach to modeling environmental systems [Young and Ratto, 2009, 2011] have considered how an “emulation modeling bridge” can be constructed between a high-order, computationally intensive, computer simulation model and a low-order DBM model. The present paper considers the development of another, hypothetico-inductive (HI-DBM) bridge, this time between a simpler, conceptual model, derived in a hypothetico-deductive manner, and a DBM model identified inductively from the same time-series data. This HI-DBM approach serves two purposes: first, it is a rather obvious extension to standard inductive DBM modeling, where the possibility of correcting any limitations of the DBM model by the introduction of conceptual elements is investigated; second, it provides a means of investigating the conceptual model, diagnosing any limitations and suggesting possible modifications that will improve its simulation modeling and predictive abilities.
 Since HI-DBM analysis will be very problem dependent, the paper is dominated by a rainfall-flow modeling example based on the extensive Leaf River data set and the well-known HyMOD conceptual model. Although these data are rather atypical, this example has the advantage that both the data and the HyMOD model have received considerable attention in the hydrological literature, so that the results obtained here can be compared with those obtained in these previous studies. The present results lead to the conclusions listed below.
 1. The initial DBM modeling, that is such an important part of HI-DBM modeling, is useful for extracting information from the data, without any major preconceptions about the model structure and with good computational efficiency when compared with computationally intensive methods. For example, the analysis can quickly reveal unusual dynamic model structures, such as the [1 7 0] HI-DBM model and the 10-tank HyMOD model in the example of section 4; structures that are capable of efficiently characterizing this information and yet would be unlikely products of conceptual model formulation. Also, the diagnostic information arising from the HI-DBM model's absorption of elements from the conceptual model can add to the diagnostics obtained from the other available methods of structural estimation and error analysis applied directly to these conceptual models, such as those mentioned in section 2.3.1.
 2. The initial DBM analysis in section 4.2.2 of the paper indicates that the explanatory ability of the standard HyMOD model of the Leaf River data can be improved, without any signs of poor identifiability, if the number of identical quick-flow tanks is increased from 3 to 10, without any increase in the number of model parameters. This arises mainly because the quick-flow dynamics of the Leaf River are rather atypical and so better characterized by a higher-order Nash Cascade.
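The effect of increasing the number of identical tanks in a Nash Cascade can be visualized with a short simulation: for a fixed total lag, a higher-order cascade produces a relatively narrower, more peaked unit hydrograph. The sketch below uses illustrative time constants (not the estimated Leaf River values) and compares the relative spread (coefficient of variation) of the 3-tank and 10-tank impulse responses:

```python
import numpy as np

def cascade_response(n, T, N=200, dt=1.0):
    """Unit-impulse response of a Nash cascade of n identical first-order
    tanks, each with time constant T, simulated in discrete time with
    unit-gain tank updates."""
    a = np.exp(-dt / T)
    u = np.zeros(N); u[0] = 1.0
    for _ in range(n):
        y = np.zeros(N)
        s = 0.0
        for k in range(N):
            s = a * s + (1 - a) * u[k]   # unit steady-state-gain tank
            y[k] = s
        u = y
    return u

# Illustrative: total lag n*T kept roughly comparable between the two cases
h3 = cascade_response(3, 10.0)
h10 = cascade_response(10, 3.0)

# Relative spread (coefficient of variation) of each unit hydrograph
k = np.arange(200)
m3, m10 = (k * h3).sum(), (k * h10).sum()
cv3 = np.sqrt(((k - m3)**2 * h3).sum()) / m3
cv10 = np.sqrt(((k - m10)**2 * h10).sum()) / m10
```

The 10-tank response has the smaller coefficient of variation, i.e., a sharper, more delayed peak for the same overall lag, which is the qualitative property exploited in the modified HyMOD model.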
 3. The DBM model outperforms the standard HyMOD model, but the input effective rainfall nonlinearity of the DBM model can be replaced by the PDM conceptual model in HyMOD without any significant degradation in performance. The two resulting HI-DBM models that are discussed in detail perform reasonably well: over the validation data set, the deterministic parts of the models explain between 87.8% and 89.7% of the flow variance; and the full stochastic model predictions provide a respectable 92.7% to 94.8% explanation of the validation flow variance. So they should have reasonable practical potential.
 4. Of the first two HI-DBM models investigated in detail, the fourth-order one is most similar in structure to the HyMOD model and it performs almost as well, overall, as the rather unusual alternative. Interestingly, the HI-DBM modified HyMOD model (see 1. above) mimics well the dynamic behavior of the other two HI-DBM models and, for some hydrologists, its somewhat poorer performance may be compensated by its closer relationship with the standard HyMOD model.
 5. The improved performance of the HI-DBM models, in comparison to that of the standard HyMOD model, arises mainly from the linear routing part of the model, where the TF model replaces the parallel connection of quick- and slow-flow storage tanks used in HyMOD. When considered in Nash Cascade terms, this TF model effectively introduces a simple prefilter that modifies the output of the PDM nonlinear model. This suggests either that it may be possible to improve the PDM nonlinear model so that it is able to absorb and reproduce the effects of this prefilter; or that the TF model is simply a better representation of the routing dynamics than the Nash Cascade. The latter supposition is supported by the good performance of the HI-DBM modified 10-tank HyMOD model which, as shown in Figure 7, mimics the dynamic behavior of the TF model.
 6. The possible limitations of the PDM nonlinear model are supported by correlation analysis of the stochastic model residuals, as well as TVP and SDP estimation. The correlation analysis shows that the residuals contain components that are significantly correlated with the PET and rainfall, suggesting that the nonlinearity may not be exploiting fully the information in these measured variables. The TVP estimation in section 4.3.1 and the supporting information points to flow events that are not explained well by the constant parameter model, and this could be due, at least partially, to limitations in the PDM generated effective rainfall. And, finally, although SDP estimation indicates that there is residual, unmodeled state dependency in the HI-DBM model, probably due to unmodeled nonlinearity in the flow routing (see the first paragraph of section 4.2.1), it could again suggest inadequacy of the effective rainfall nonlinearity. It is clear, therefore, that both the PDM nonlinear model and the input rainfall need to be investigated further in order to correct any identified deficiencies. In this regard, the modifications to the PDM suggested recently by Bulygina and Gupta may be relevant, although their modified HyMOD model does not explain the data nearly as well as the HI-DBM models (see the supporting information).
 7. Given the possible limitations of the PDM nonlinear model, the HI-DBM analysis could be repeated quite easily using other conceptual effective rainfall mechanisms, such as those discussed by Wagener et al. Bearing in mind the comment about the possibility of additional state dependency in the parameters of the denominator polynomial of the TF model (see 6.), it is probably worth evaluating also whether the introduction of nonlinear elements into the linear, flow routing part of the model is worthwhile. Consequently, research is proceeding to investigate further both the unconstrained HI-DBM and the HI-DBM modified HyMOD models, as well as applying the HI-DBM analysis to data from a more typical catchment.
 8. Finally, if an HI-DBM model is to be used for forecasting, then further analysis is required to evaluate the different models in this context. However, the stochastic HI-DBM model with an ARMA(1,5) noise model, or the similar stochastic version of the HI-DBM HyMOD model, probably provides the best basis for simple implementation. For example, if estimated with a “false” time delay of δ = 1 day, as required to give reasonable 1-day-ahead ex-ante forecasts, the former model still explains the data well. However, the conclusion in section 4.3.1 that the model parameters need to vary quite substantially over the estimation data set in order to maintain a good explanation of the measured flow data suggests that, in any forecasting applications, the HI-DBM models should be implemented in a real-time, parameter-adaptive form. An initial implementation of this kind using the former model has produced promising results, as shown in section 9 of the supporting information.
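A real-time, parameter-adaptive forecasting scheme of the general kind referred to above can be sketched with recursive least squares and a forgetting factor. This is not the paper's implementation: the model structure, forgetting factor, and data below are all hypothetical, chosen only to show how the parameters adapt to a structural change while producing ex-ante one-step-ahead forecasts:

```python
import numpy as np

def adaptive_forecast(y, u, lam=0.98):
    """One-step-ahead ex-ante forecasts from y(k) = a*y(k-1) + b*u(k-1),
    with the parameters (a, b) updated by recursive least squares with
    forgetting factor lam, so the model adapts as new data arrive."""
    theta = np.zeros(2)
    P = np.eye(2) * 1e3                        # diffuse initial covariance
    yhat = np.zeros(len(y))
    for k in range(1, len(y)):
        phi = np.array([y[k - 1], u[k - 1]])
        yhat[k] = phi @ theta                  # forecast before seeing y(k)
        e = y[k] - yhat[k]
        g = P @ phi / (lam + phi @ P @ phi)    # RLS gain
        theta = theta + g * e
        P = (P - np.outer(g, phi @ P)) / lam   # discounted covariance update
    return yhat, theta

# Synthetic first-order system whose parameter drifts mid-record
rng = np.random.default_rng(2)
N = 400
u = rng.uniform(0.0, 1.0, N)
y = np.zeros(N)
for k in range(1, N):
    a = 0.7 if k < 200 else 0.5                # structural change at k = 200
    y[k] = a * y[k - 1] + 0.3 * u[k - 1] + rng.normal(0.0, 0.01)
yhat, theta = adaptive_forecast(y, u)
```

After the change, the forgetting factor lets the estimates re-converge to the new regime, which is the behavior a parameter-adaptive HI-DBM forecaster would rely on.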
 The MATLAB code for the main loop in the HyMOD model used in the present study is presented verbatim below. This code is based mainly on a version of HyMOD attributed to Thorsten Wagener, Jasper Vrugt, and Doug Boyle, dated 2004. There are various minor editing changes introduced by Paul Smith in 2005, and the present author recently added the option for version 2 based on communications with Jasper Vrugt.
x_loss=IC(1);% Initialize moisture accounting
x_slow=IC(2);% Initialize slow tank state
x_quick(1:nq,1)=IC(3);% Initialize state(s) of quick tank(s)
SimFlow=repmat(NaN,[MaxT 1]); % set up storage matrix for the flow
vers1=1;%selects Version 1; 0 selects Version 2
t=1; %Initialize time counter
while t<MaxT+1 % Begin Model Loop, time step by time step
% Compute excess precipitation and evaporation
xn_prev=x_loss; % New state is state of last time-step
% Partition UT1 and UT2 into quick and slow flow component
UQ=alpha*ER; % Quickflow contribution
US=(1-alpha)*ER; % Slowflow contribution
% Route slow flow component with single linear reservoir
% Route quick flow component with 3 linear reservoirs in series
% Compute total flow and effective rainfall for timestep
end %End Model Loop
 I am most grateful to Jasper Vrugt, Keith Beven, and Paul Smith for their stimulating and very useful comments on the paper. This does not mean that they necessarily share all the views stated in the paper, and I am, of course, responsible for any errors or omissions. I am also grateful to the anonymous referees whose comments helped to improve the content and focus of the paper.