## 1. Introduction

[2] Most dynamic models of hydrological systems are based on the “hypothetico-deductive” approach to scientific investigation, as advocated most powerfully by Karl Popper [*Popper*, 1959] and others. Within the hydrological modeling context, this is normally reflected by hypotheses in the form of “conceptual” models, normally based on scientific reasoning and past experience, which the hydrologist believes are best able to mimic the physical nature of the hydrological system. These are usually formulated in terms of dynamic equations, derived from concepts such as mass and energy conservation and estimated (calibrated or optimized) in relation to measurements of the relevant hydrological variables.

[3] As the present author [see e.g., *Young*, 2002, 2011a, 2011b] has argued, this hypothetico-deductive philosophy is largely a creature of the twentieth century and is linked strongly with laboratory science and the ability to carry out planned experimentation. Prior to this, the alternative “inductive” approach was the cornerstone of scientific investigation or “natural philosophy,” as it was referred to at the time. Here, the model or “theory of behavior” was inferred directly from observations of the system under study (often naturally occurring phenomena), without undue prejudice based on prior hypotheses. Indeed, the most famous natural philosopher of all, Isaac Newton, stated that “*hypotheses non fingo*” or “*I frame no hypotheses*” [see e.g., *Cohen*, 1962].

[4] As *Young* [2011a, 2011b] points out, environmental modelers do not often have the ability to carefully plan and conduct experiments on the hydrological systems that they are studying. And even when planned experimentation is possible, as in the case of tracer experiments for solute transport and dispersion modeling [e.g., *Wallis et al*., 1989], or instrumented catchments in the case of rainfall-flow models [e.g., *McIntyre et al*., 2011], the experiments are much more difficult and costly to plan and constrain than those carried out in the confines of a laboratory. Moreover, many aspects of the system are difficult to observe and so provide a weak basis for hypothesis formulation. With these considerations in mind, inductive modeling has attractions because it can produce a model form that efficiently parameterizes the observational data, without the constraints and possible prejudices of prior hypotheses.

[5] Data-based mechanistic (DBM) modeling is predominantly, but not exclusively, an inductive approach to modeling that harks back to the era of natural philosophy. It recognizes that, in contrast to most man-made dynamic systems, the nature of many natural systems, particularly at the holistic or macrolevel (global climate, river catchment, macroeconomy), is still not well understood. “Reductionist” approaches to modeling such systems, based on the aggregation of hypothetico-deductive models at the microlevel, or the application of microscale laws at the macrolevel, often result in very large simulation models that suffer from “equifinality” [*von Bertalanffy*, 1968; *Beven*, 1993] and are not fully identifiable from the available data.

[6] Although the term “DBM modeling” was first used in *Young and Lees* [1993], the basic concepts of this approach to modeling dynamic systems have been developed over many years. For example, they were first applied seriously within a hydrological context in the early 1970s, with application to the modeling of water quality and flow in rivers [*Young and Beck*, 1974; *Beck and Young*, 1976] and set within a more general framework shortly thereafter [*Young*, 1978]. Since then, they have been applied to many different systems in diverse areas of application from ecology, through engineering to economics [see e.g., *Young*, 1998, 2006, 2011b, and the references therein].

[7] Interestingly, one of the first applications that led to the idea of DBM modeling was to the modeling of rainfall-flow processes [*Young*, 1974; *Whitehead et al*., 1976]; and this was generalized later in an example considered first in *Young* [1993] and then in *Young and Beven* [1994]. This led to numerous examples that have demonstrated the utility of DBM modeling applied to rainfall-flow processes [see e.g., *Young*, 1998; *Lees*, 2000; *Young*, 2001a, 2003; *Ratto et al*., 2007; *Chappell et al*., 2006; *Ochieng and Otieno*, 2009; *Young*, 2010a, 2010b; *Beven et al*., 2012; *McIntyre et al*., 2011, and references therein], including catchments affected by snow melt, where the nonlinearities are more complex [*Young et al*., 2007].

[8] The standard DBM modeling procedures normally produce an efficiently parameterized stochastic model (parsimonious in terms of its dynamic order) that explains the data well and can be interpreted in reasonable, physically meaningful terms. However, this physical interpretation is inferred directly from the data-based model, the structure and parameters of which are identified and estimated, respectively, using statistical methods. Consequently, the resulting model may not always be fully acceptable or credible to an audience that has been educated to believe strongly in hypothetico-deductive modeling based on conceptual, often deterministic, simulation models. Moreover, the model obtained in this completely inductive manner may be restricted to some degree: for instance, previous DBM rainfall-flow models function very well within the adaptive flow forecasting context for which they were derived, but they utilize the flow measurement as a surrogate measure of soil moisture (catchment storage) in the “effective rainfall” nonlinearity (see section 2.2). As a result, they cannot be used for stochastic simulation purposes, as this would imply a physically meaningless feedback mechanism and could make the model unstable.
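The surrogate mechanism and its consequence for simulation can be sketched in a few lines. This is a hedged illustration only, not the paper's exact model: it uses a power-law effective-rainfall nonlinearity of the general kind reported in the cited DBM rainfall-flow studies, with entirely arbitrary parameter values (`c`, `gamma`) and hypothetical function names.

```python
def effective_rainfall(r, y, c=0.05, gamma=0.6):
    """Illustrative power-law nonlinearity: effective rainfall
    u(k) = c * y(k)**gamma * r(k), where the flow measurement y(k)
    acts as a surrogate for catchment storage (soil moisture).
    Parameter values are arbitrary, chosen only for the sketch."""
    return c * (y ** gamma) * r

# Forecasting context: y is the *measured* flow, an exogenous input,
# so the nonlinearity introduces no feedback around the model.
u_forecast = effective_rainfall(r=4.0, y=12.0)

# Simulation context: no measured flow is available, so y would have
# to be replaced by the model's own simulated flow output. That closes
# a physically meaningless feedback loop (simulated flow -> effective
# rainfall -> simulated flow) which can destabilize the model; this is
# the restriction that motivates the HI-DBM modifications.
print(u_forecast > 0.0)
```

The design point is simply that `y` appears as an input to the nonlinearity: harmless when it is a measurement, problematic when it must be generated by the model itself.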

[9] If, in any particular example, the DBM model is not considered acceptable for the above reasons, then it needs to be modified to correct any such perceived deficiencies. One obvious approach is to retain the statistically identified structure of the DBM model and base any modifications on conceptual ideas about the nature of those elements in the model that require further elucidation. For instance, various nonlinear conceptual models have been developed to synthesize the effective rainfall [see e.g., *Wagener et al*., 2004, chap. 3, pp. 60–72] and can provide possible replacements for the DBM effective rainfall nonlinearity: modifications that would allow it to be used for stochastic simulation. This is the stimulus for hypothetico-inductive DBM (HI-DBM) modeling. Hypothetico-deductive and inductive methods of modeling are not, of course, mutually exclusive, and HI-DBM modeling is an attempt to meld together the best aspects of both approaches and so produce a systematic approach to hydrological model development in general, and rainfall-flow modeling in particular.

[10] The paper starts in section 2 by outlining those aspects of DBM modeling that are most relevant to subsequent sections. Since HI-DBM modeling is often problem-dependent, it is best described with the help of a real example. Consequently, section 3 provides a brief introduction to HI-DBM modeling, setting the scene for section 4, the main section of the paper, which describes a detailed HI-DBM modeling exercise involving the well-known Leaf River data set and the associated HyMOD (hydrologic model) conceptual model. Section 5 presents a simulation example that illustrates how HI-DBM modeling works in a situation where the model structure and noise-free output of the system are known and can be used to evaluate the results. Finally, section 6 sums up the major contributions of the paper and suggests possible topics for future research.