Typology of hydrologic predictability



[1] Prediction problems broadly deal with ascertaining the fate of fluctuations or instabilities through the dynamical system being modeled. Predictability is a measure of our ability to provide knowledge about events that have not yet transpired or phenomena that may be hitherto unobserved or unrecognized. The challenges associated with these two problems, that is, forecasting a future event and identifying a novel phenomenon, are distinctly different. Whereas the prediction of novel phenomena seeks to explore all possible logical space of a model's behavioral response, the prediction of future events seeks to constrain the model response to a specific trajectory of the known history to achieve the least uncertainty for the forecast. Predictability challenges have been categorized as initial value, boundary value, and parameter estimation problems. Here I discuss two additional types of challenges arising from the dynamic changes in the spatial complexity driven by evolving connectivity patterns during an event and cross-scale interactions in time and space. These latter two are critical elements in the context of human and climate-driven changes in the hydrologic cycle as they lead to structural change–induced new connectivity and cross-scale interaction patterns that have no historical precedent. To advance the science of prediction under environmental and human-induced changes, the critical issues lie in developing models that address these challenges and that are supported by suitable observational systems and diagnostic tools to enable adequate detection and attribution of model errors.

1. Introduction

[2] The aim of this opinion article is to develop a typology of predictability challenges in hydrology. This classification is aimed at understanding methods, tools, and problem structure to improve predictability of complex hydrologic systems under human impact and environmental changes. This is considered in the context of both abiotic and biotic pathways of flow of water in the hydrologic cycle [Kumar, 2007]. Inclusion of the biotic pathways becomes particularly relevant as significant changes are taking place in vegetation functioning because of increase in CO2 in the atmosphere and associated changes in vegetation ecophysiologic response such as decrease in stomatal conductance, increased canopy temperature, and increased biomass [Ainsworth and Long, 2005; Long et al., 2006]. In the spatial context there is increasing evidence of woody encroachment [Archer et al., 2001]. These may be resulting in subtle but large-scale changes in the hydrologic cycle [Gedney et al., 2006]. Improving hydrologic predictability in the context of subtle but perceptible and slow changes calls for the development of new approaches.

[3] While numerous articles have been written about methodological issues, such as parameter estimation, data assimilation, and ensemble forecasting, pertaining to improving hydrologic predictive capability in a variety of contexts, few attempts [National Research Council (NRC), 2002] have been made to identify and classify the essential elements that characterize the challenges that limit our ability to improve predictability. The goal of this article is to demonstrate that prediction challenges may be classified into a number of distinct categories, and addressing them requires different methodological approaches. This characterization could be useful in the context of ongoing efforts in improving large-scale predictions of continental-scale water dynamics (Consortium of Universities for the Advancement of Hydrologic Science, Hydrology of a dynamic Earth: A decadal research plan for hydrologic science, 2007, http://dx.doi.org/10.4211/sciplan.200711) in the context of environmental change, human impact [Wagener et al., 2010], or predictions in ungaged basins [Sivapalan et al., 2003], among others.

[4] The fundamental tool for predictions is a model. Paraphrasing from Iliev [1984, p. 24], the creation of models, or modeling, requires three stages. In the first stage, a mental image of the object to be modeled is formed on the basis of the understanding of the reality. This may require direct or indirect observations and analyses. While a model is a representation of reality, it is also an abstraction that sets it apart through the essential elements chosen for its characterization. For example, in a stream a hydrologist may focus on the flow properties of water, whereas an ecologist may focus on the habitat structure and a biogeochemist may be interested in the hyporheic zone (R. Hooper, personal communication, 2010). Abstraction is accomplished in the context of our existing knowledge base, or world view, and it brings into focus the components of creativity and purpose. While the purpose determines selection of the essential elements of the model, creativity is required for their appropriate configuration that maps the model to the real world. The second stage therefore involves representation of the mental image into a pragmatic form, be it descriptive, graphic, mathematical, physical, etc. This requires the use of a suitable language with its constituent vocabulary, and the representation chosen plays an important role in ascertaining the utility of the information content of the model. For example, Roman numerals do not facilitate the same ease of mathematical operations as does the Hindu-Arabic numeral representation, and a binary representation is even more suitable for digital computation. Often multiple representations may be chosen to enable effective use of the different aspects of the model, such as overlaying mathematical equations with conceptual diagrams. The third or application stage involves comparison of the model with the reality that enabled the formation of the model in the first place or prediction of aspects of reality not yet known. Inconsistency between predicted and objective reality serves as a basis for reevaluating the abstraction (or assumption) of the model, and the process is iterated. Models therefore are cognitive tools for comprehending reality, and they embody in them theories and assumptions of such understanding.

[5] Reality is hierarchical; that is, components that are lower in hierarchy organize to create the higher levels, whereas the higher levels impose constraints on the lower levels, thereby reducing the number of organizations that are possible. This evolutionary process by which higher levels of complexity and control emerge through the process of generation of a variety of distinct dynamical modes at the lower level and selective retention of a few modes, often identified as an order parameter [Haken, 1983], is termed metasystem transition [Turchin, 1977]. We comprehend reality through concepts, mathematical or otherwise, that are consonant or isomorphic with the hierarchical organization of nature, and as such, our models should reflect this hierarchy. Given the challenges imposed in including the myriad of hierarchies present in reality, our models are often abstracted to represent only a select few levels in the hierarchy, colloquially termed as the “scale of representation.” The scales lower than that represented are either parameterized or ignored, while those at the higher scale are identified as drivers or forcing or controls or constraints for the model.

[6] Prediction may be broadly defined “as a claim about matters that are not already known and whose truth or falsity has not already been independently ascertained by some more direct method than that used to make the prediction itself” [Barrett and Stanford, 2005, p. 585]. This does not necessarily imply events that have not yet transpired but includes statements about a phenomenon hitherto unknown. Indeed, discovery of the existence of novel phenomena based on the prediction from new theories (or models) that run the risk of being proven false (i.e., testable) is the central tenet of science. This property lends itself to the test of falsification and, therefore, a reliable mechanism for ascertaining the veracity and validity of the model [Popper, 2002]. The latitude of creativity provides options for the development of a variety of models for a specific purpose. Predictive success of select models and their acceptance and adoption over time therefore represent or lead to theories, notions, or paradigms of scientific thinking. For example, the unit hydrograph model [Sherman, 1932] for the prediction of streamflow has become the basis of the linear system theory of hydrologic prediction.

[7] Therefore, there are two dualities in the space of a model prediction framework. The first deals with model complexity or the level of detail represented in the model in our attempt to make it isomorphic to the reality for a specific purpose that is stated either explicitly or implicitly. A very complex model runs the risk of being just as complex and perhaps just as incomprehensible as the reality it is trying to model, while on the other hand, a simple model may not capture all the essential elements to allow for reliable predictions. The second deals with the object of prediction: the prediction of unknown phenomena, whose validation through experiments leads to new discoveries, versus the prediction of future events or forecasts, which is generally based on the current scientific understanding. We stand to make significant progress in hydrologic science by understanding prediction as a way to bridge across these dualities.

2. Model Complexity and Predictability

[8] Complexity refers to the property of a process that leads to emergent behaviors in space or time or both, usually identified through their geometric, dynamical, or statistical characteristics. These characteristics arise from the combinatorial and cumulative effect of the interaction of many components (or degrees of freedom) where each individual part by itself does not exhibit that property. This is a result of feedback and nonlinear interactions between the components comprising the process and/or the process with the environment, where a small change leads to a series of interrelated changes that are not predictable from the knowledge of the behavior of the individual components. In other words, it refers to the adage that “the whole is greater than the sum of its parts.” Surprisingly simple models, through the creative characterization of component interactions, can give rise to very complex behavior as demonstrated by the Lorenz model of chaos [Lorenz, 1963], the logistic map equation [May, 1976], or cellular automata [Wolfram, 2002]. Model complexity should therefore be characterized in terms of the “richness” of the model output space. This collection of the time trajectories of the model variables comprises the attractor, and their properties comprise the behavioral response of the model [Willems, 1991]. In the context of the modeling objective, this behavioral response may be characterized through a variety of properties that may be of interest such as joint probability distribution function, phase space representation, limit cycles, Poincaré maps, fractal dimensions, threshold behavior, and emergent properties. This characterization of model complexity is different from representational complexity, which may be defined (akin to Kolmogorov complexity) as the minimal number of words used from the vocabulary of the chosen language for the model representation to achieve the target objective. In the context of dynamical systems, this translates to the minimal number of observable and hidden (or latent) variables (degrees of freedom or dimensionality), elementary operations, and parameters used in the representation. Much as models with simple representations can lead to complex behaviors, models with many degrees of freedom may lead to simple behavior.
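The contrast between representational simplicity and behavioral richness can be illustrated with the logistic map cited above. The following Python sketch is illustrative only: the function names, the chosen values of the growth parameter r, and the use of a count of distinct long-run states as a crude proxy for the richness of the output space are assumptions made here, not measures proposed in the cited works.

```python
def logistic_trajectory(r, x0, n_steps):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the full trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def distinct_late_states(r, x0=0.2, n_steps=2000, keep=200, digits=6):
    """Count distinct (rounded) states visited after transients have decayed.

    This count is a hypothetical, crude proxy for the richness of the
    behavioral response: 1 for a fixed point, k for a period-k cycle,
    and a large number when the trajectory wanders over an attractor.
    """
    tail = logistic_trajectory(r, x0, n_steps)[-keep:]
    return len({round(x, digits) for x in tail})


# Same one-line model, one parameter changed, very different behavior.
for r in (2.8, 3.2, 3.5, 3.9):
    print(f"r = {r}: {distinct_late_states(r)} distinct long-run states")
```

The representation has a single state variable and a single parameter throughout, yet the long-run behavior moves from a fixed point to periodic cycles to an irregular trajectory as r increases.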

[9] Predictability is therefore a measure of the difference between the richness of the behavioral response of the dynamical model in comparison to that of the natural phenomena that the model attempts to capture. When a phenomenon is outside the behavioral space of a model, its predictability is zero. On the other hand, when the model comprises a suitable set of consistent and complete (or closed) representation, it leads to the predictions of novel phenomena that are a logical consequence of the interactions captured by the model but that may be hitherto unobserved or unrecognized. For example, novel modeling of streamflow using a geomorphologic approach [Rodríguez-Iturbe and Valdés, 1979] leads to the identification of geomorphologic [Rinaldo et al., 1991] and kinematic [Saco and Kumar, 2002a, 2002b] dispersion mechanisms. We may argue that the problem of the prediction of a future event is similar to that of predicting a novel phenomenon in that the future event has not been observed. However, in this case it is distinct in that the prediction is based on a constrained input space comprising observed present and (recent) past events that provides the initial and boundary conditions for the model. Assuming that the future events are within the behavioral space of the model, predictability then measures the reduction in uncertainty in the probability distribution of the future event using the model and enabled by the observed data in comparison to a distribution based on a priori knowledge such as that from the known history or expected behavior (climatology) [DelSole and Tippett, 2006]. This known history or climatology provides a characterization of the attractor or the behavioral space of the reality being modeled, and the forecast attempts to narrow down the uncertainty of the specific trajectory. Larger reduction in uncertainty corresponds to increased predictability of the model.
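The notion of predictability as a reduction in uncertainty relative to climatology can be made concrete under a simplifying assumption. The sketch below assumes, purely for illustration, that both the climatological and the forecast distributions are Gaussian, so that the uncertainty of each is summarized by its differential entropy; the function names are hypothetical, and this is one possible information-theoretic reading rather than the specific measure of the cited work.

```python
import math


def gaussian_entropy(variance):
    """Differential entropy (in nats) of a Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def uncertainty_reduction(climatology_var, forecast_var):
    """Entropy of the climatological distribution minus that of the forecast.

    Assumption: both distributions are Gaussian. A value of zero means the
    forecast is no sharper than climatology; larger values mean the model
    and data have narrowed the range of plausible trajectories.
    """
    return gaussian_entropy(climatology_var) - gaussian_entropy(forecast_var)


# Hypothetical variances: a forecast with one quarter of the climatological variance.
print(uncertainty_reduction(4.0, 1.0))
# A forecast that is as uncertain as climatology carries no predictability.
print(uncertainty_reduction(4.0, 4.0))
```

Under the Gaussian assumption the reduction depends only on the ratio of the two variances, which is why a forecast that merely reproduces climatology scores zero regardless of how variable the system is.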

[10] The goals of the two prediction problems, hereafter referred to as Type N and Type F, are therefore different. Whereas the prediction of novel phenomena (Type N) seeks to explore all possible logical space of the model characterized by the behavioral response, the prediction of future events (Type F) seeks to constrain the model response to a specific trajectory of the known history to achieve the least uncertainty for the forecast. However, the ability of a model to forecast a future event, that is, its ability to capture a known trajectory, is often used as a test of its validity before a prediction of Type N is explored. This, however, need not be the case as the goals of the Type N prediction to explore the characteristics of the space of all possible trajectories can sometimes be divergent from the goal of Type F prediction to constrain the model to a specific trajectory. Numerous factors play a role in limiting our ability in constraining the model to a specific trajectory for prediction (described in section 3). Even though a model may fail to predict a specific trajectory, it may characterize sufficiently the attractor and the behavioral response. This leads to an important question: How can we define predictable structure(s), that is, configurations and attributes of form and function, in a model's behavioral space that can serve as verification and diagnostic criteria for the model? This continues to be an open problem.

[11] The use of models in these two contexts of Type N and Type F predictions has often been characterized as “investigative” and “forecast” models, respectively. Given that their goals are somewhat at odds, the issue of how detailed a model should be has been a challenging problem. For Type F predictions, it is desirable to include only the minimal set that is relevant for the prediction of the target characteristics. This minimalist approach is necessary for effectively and efficiently constraining the model to the target trajectory. However, for Type N predictions, it is desirable to incorporate as many degrees of freedom as possible so as to capture the subtle process interactions and their manifestations. For example, the ecophysiological changes in vegetation arising because of increased CO2 that in turn affect the surface energy balance, soil moisture, and boundary layer formation require that the acclimation response of vegetation and its impact on hydrology be captured [Drewry et al., 2010a, 2010b]. The need or relevance of specific degrees of freedom should be determined by the characteristics of the resulting behavioral response to assess its relevance to the reality that it is attempting to represent.

[12] The level of detail in a model should be determined by its ability both (1) to predict a target trajectory and (2) to characterize, through its behavioral response, the emergent properties observed in nature. The emergent properties that we see in nature are a reflection of choices made by the dynamics that we are attempting to capture in our models, and our modeling effort will be better served by ensuring that our model is able to make the same choices.

[13] Using the properties of emergent characteristics to guide a model for improved forecast capability will help bridge the duality of Type N and Type F predictions and is an open frontier to be explored. For example, consider the problem of prediction of extreme events such as floods. The occurrence of floods may be a result of not only hydrometeorologic conditions but also human-induced effects, such as progressive loss of wetlands and other land management practices. Extreme floods in the current time may have no historical precedent and are therefore outside the behavioral space of historical observations or models based on those observations. Whereas a Type F prediction is likely to fail in such prediction, a combination with Type N may succeed in predicting such events as emergent characteristics of coupled human-nature interactions.

3. Prediction Typologies

[14] In general, the prediction problems deal with predicting the fate of fluctuations or instabilities through the dynamical system being modeled. The fluctuations of the dynamical system may be internally generated in the system (i.e., endogenous variability) or may arise in initial conditions and/or boundary conditions (i.e., exogenous variability) and can be characterized by a variety of measures of variability such as magnitude, duration, frequency, timing, intermittency, sequencing, and relaxation time. Design problems are concerned with ensuring the dissipation of the fluctuations to a desired outcome, whereas management problems are concerned with constraining the response of a system to fluctuations so that it may operate within desired characteristics. The propagation of fluctuations through a dynamical system and their dissipation or amplification or localization in the space-time domain imposes the difficult challenge of ascertaining the precise state of the system using observations. This leads to uncertainties of various quantities such as the initial state or boundary conditions. On the basis of nonlinear and chaotic dynamics understanding, prediction problems have been classified into two categories [Lorenz, 1975; Schneider and Griffies, 1999]. The predictability studies of the first kind address how the uncertainties in the initial state of the dynamical system affect the prediction at a later stage. Predictability studies of the second kind address how uncertainties in the boundary conditions or forcing affect future predictions. Recognizing the unique context of hydrologic predictions where models use a variety of parameters, predictability studies of a third kind have been defined [NRC, 2002] that address the impact of uncertainty in the model parameters on prediction uncertainty.

[15] The above three classifications, which target Type F predictions, however, do not account for a number of additional sources of error that limit our predictive ability in a hydrologic context. For example, connectivity patterns between processes change as the system as a whole transitions from one dynamic regime to another. These connectivity patterns are often associated with characteristic emergent responses that in turn constrain the dynamics. Another source of error is the cross-scale interactions, where the slower modes of the dynamics, while constraining the behavior of the faster dynamic modes, also evolve in response to the latter. After briefly reviewing the prediction problems of the first three kinds (sections 3.1–3.3) to provide a context, two additional typologies are discussed in sections 3.4 and 3.5.

3.1. Predictability Problem of the First Kind

[16] Dynamical systems with nonlinear interactions often show sensitivity to initial conditions. When system feedbacks amplify errors in initialization, a small error in the initial conditions can cause evolution of trajectories to diverge over time. The initialization uncertainties arise as a result of our inability to measure precisely all state variables in a system at high enough resolution and the mismatch between observational and model scale. An example is the uncertainty in the prediction of the streamflow due to lack of knowledge of the soil moisture state. While chaotic systems are prime examples of such behavior, other nonlinear systems may also show these characteristics. In chaotic systems the divergence of trajectories is measured by the Lyapunov exponent. The more divergent the trajectories (larger exponent), the more unpredictable is the system. For such systems perturbations grow to an extent that after some time, a predictability horizon, the trajectory is not distinguishable from a randomly selected trajectory [Ehrendorfer, 2005]. The predictability horizon can be increased by improving the accuracy of the initial conditions but only to a limited extent because of the exponential growth of the errors. This problem is usually addressed by developing methods to extract the most accurate estimate of the initial conditions, that is, to maximize the information content, using all available data [Daley, 1993]. However, given that data resolution is only finite in both space and time, we may not be able to eliminate this problem completely. In such events, a probabilistic prediction is attempted using ensemble forecasting (see Lewis [2005] for a historical review) to reflect the uncertainty associated with the prediction of a specific trajectory.
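The link between the Lyapunov exponent and the predictability horizon can be sketched numerically. The example below uses the logistic map as a stand-in chaotic system (an assumption made here for brevity, not a hydrologic model), estimates its Lyapunov exponent as the long-run average of the logarithm of the local stretching rate, and then applies the idealized error-growth law e(t) = e0 exp(λt) to obtain a horizon. The function names and the numerical values are hypothetical.

```python
import math


def lyapunov_logistic(r, x0=0.2, n_steps=20_000, burn_in=1_000):
    """Estimate the Lyapunov exponent of x_{t+1} = r * x_t * (1 - x_t).

    The exponent is the trajectory average of log|f'(x)|, with
    f'(x) = r * (1 - 2x). A positive value indicates diverging trajectories.
    """
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        # The floor guards against log(0) if x lands exactly on 0.5.
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
        x = r * x * (1.0 - x)
    return total / n_steps


def predictability_horizon(initial_error, tolerance, lyapunov):
    """Steps until an error growing as e0 * exp(lyapunov * t) reaches tolerance.

    Assumption: idealized exponential error growth with a positive exponent.
    """
    return math.log(tolerance / initial_error) / lyapunov


lam = lyapunov_logistic(3.9)
print("estimated exponent:", lam)
for e0 in (1e-3, 1e-4, 1e-5):
    print(e0, predictability_horizon(e0, 0.1, lam))
```

Each tenfold improvement in the accuracy of the initial condition lengthens the horizon by the same fixed increment, ln(10)/λ, which is the sense in which the horizon can be extended only to a limited extent.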

3.2. Predictability Problem of the Second Kind

[17] This may be characterized as a boundary value problem where the specification of the dynamics at the boundary of the systems or forcings is the bottleneck. As in the previous case, the uncertainties for the dynamical forcing at the boundaries arise from the same sources of data accuracy and mismatch between the model scale and data resolution. The propagation of these errors may result in the system being indistinguishable from the same dynamics resulting from internal variability of the system. The more separable these two distributions are in state space, the more predictable the system is on the basis of the information contained in the boundary condition. To achieve this separation, data assimilation techniques have been developed to prevent model drift by periodically resetting the model state variables closer to observations, whenever they are available, by taking into account the inherent uncertainty in both the prediction and observations [McLaughlin, 1995, 2002; Liu and Gupta, 2007]. Ensemble studies based on the response of the system to perturbations in the boundary conditions are also used to obtain a probabilistic characterization of the predictions [Schneider and Griffies, 1999].
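The idea of resetting a model state toward an observation while weighting the uncertainty of both can be sketched with a scalar update of the Kalman type. This is a minimal illustration chosen here for its simplicity, assuming Gaussian errors and a directly observed state; it is not the specific scheme of any of the cited works, and the function name and numbers are hypothetical.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Blend a model forecast with an observation, weighting by uncertainty.

    Assumptions: scalar state, Gaussian errors, state observed directly.
    Returns the updated (analysis) state and its variance.
    """
    # The gain approaches 1 when the forecast is far less certain than the
    # observation, and approaches 0 in the opposite case.
    gain = forecast_var / (forecast_var + observation_var)
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


# Hypothetical soil moisture forecast (percent) and a noisy observation.
state, variance = assimilate(forecast=10.0, forecast_var=4.0,
                             observation=12.0, observation_var=4.0)
print(state, variance)
```

When the two error variances are equal, the updated state lies midway between forecast and observation, and its variance is half that of either source; the update never increases the uncertainty of the state.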

3.3. Predictability Problem of the Third Kind

[18] This may be characterized as a problem of inadequate estimation of model parameters due to the presence of heterogeneity in various land attributes needed in the model [NRC, 2002]. Proper specification of heterogeneities through parameters in the hydrologic model has been one of the most difficult challenges for hydrology. The problem is exacerbated because of the unobservability of many of these attributes such as subsurface macropore organization and the resulting influence on flow properties. Furthermore, the estimation uncertainty also arises because scaling up of microscale heterogeneity does not provide appropriate estimates at the “scale of representation” of the model. As a result model calibration is used to arrive at effective parameters, making the model scale dependent with the need to recalibrate when the scale of representation changes.

[19] The three typologies capture in a nutshell the current understanding of the prediction challenges [NRC, 2002] in a broad hydrologic context. Methods for extraction of best information from finite resolution data, constraining the state space evolution using data assimilation to account for uncertainties in model and observations, sophisticated methods for estimation of parameters, often jointly with data assimilation [Liu and Gupta, 2007], and accounting for propagation of uncertainties through the model using ensemble-based probabilistic prediction are the primary tools for improving model predictability. This framework implies that predictions are best understood in a probabilistic context, and it is a combined property of the dynamical system as well as observations. An event is unpredictable if the probability of the future event conditioned on the observations is independent of the observations; that is, observations provide no reduction in prediction uncertainty [DelSole, 2004, 2005]. Measures of predictability that exploit this probabilistic structure using information theoretic metrics have been developed [DelSole and Tippett, 2006]. The emphasis clearly is on Type F prediction in attempting to constrain the model trajectory to that of a historical observation to achieve the least error variance in the probability distribution of the forecast.

[20] Prediction errors can also be attributed to inadequate model physics along with the uncertainties in initial and boundary conditions and model parameters. More recently, multimodel ensemble predictions have also been developed [Duan et al., 2007] to account for uncertainty in model physics. Ways for handling model structure errors are also emerging [Doherty and Welter, 2010; Renard et al., 2010]. The tradeoff between model errors and parameters leads to the problem of equifinality [Beven, 1993] where multiple model structures and parameters associated with each can result in similar or indistinguishable predictions. This derails the process of attributing predictive success as a measure of validating the model physics and therefore using it for Type N predictions. This problem can be partially addressed by specifying and estimating model parameters directly from observations made at the scale of representation of the model. However, it has been argued that the uncertainty ranges in both parameter estimate and prediction may allow for multiple model structures and parameter sets and therefore equifinality [Beven, 1993], but this issue remains moot. The concept of equifinality has its roots in open system dynamics where the “same final state may be reached from different initial conditions and in different ways” [Bertalanffy, 1976, p. 40]. Equifinality is therefore a dynamical attribute, and it should be expected that states and parameters identified from one or a few instances of an observed dynamical trajectory will be insufficient to solve the inverse problem of uniquely identifying the attributes (parameters and model structure) of the system. Existing limitations to represent reality and specify parameters such that an accurate forecast (Type F) is achieved should only provide the impetus to address this challenge through creativity in developing new observational technologies and conceptualizing parameterization rather than serve as a basis for skepticism of using modeling as an indispensable cognitive tool.
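A deliberately trivial sketch can show how different parameter sets produce indistinguishable predictions. The toy runoff model below is hypothetical and far simpler than any model discussed in the cited works: runoff is taken as the product of a contributing-area fraction, a runoff coefficient, and rainfall, so that only the product of the two parameters is constrained by the output. All names and values are assumptions made for illustration.

```python
def simulate_runoff(rain, contributing_fraction, runoff_coefficient):
    """Hypothetical toy model: runoff = fraction * coefficient * rainfall."""
    return [contributing_fraction * runoff_coefficient * p for p in rain]


rain = [0.0, 12.0, 30.0, 4.0]  # made-up rainfall depths

# Two physically different interpretations of the same catchment.
run_a = simulate_runoff(rain, contributing_fraction=0.8, runoff_coefficient=0.25)
run_b = simulate_runoff(rain, contributing_fraction=0.4, runoff_coefficient=0.50)

# The largest difference between the two simulated hydrographs.
max_gap = max(abs(a - b) for a, b in zip(run_a, run_b))
print(max_gap)
```

Because the observed runoff cannot separate the two parameter sets, a good fit to this single trajectory says nothing about which description of the catchment is correct; an independent observation of either parameter at the scale of the model would be needed to break the tie.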

[21] While the three kinds of prediction problems summarized here provide an important framework for addressing them in a systematic way, they do not capture all aspects. In sections 3.4 and 3.5, I discuss two additional categories. The first challenges the static specification of landscape organization, and the second brings into focus the issue of cross-scale coupling in time and space.

3.4. Predictability Problem of Dynamic Connectivity and Spatial Complexity

[22] This section deals with the issue of the evolution and complexity of spatial patterns and their role in system dynamics. While heterogeneity characterizes the organization of a landscape (understood broadly to include hydrologic, geomorphologic, edaphic, geologic, and ecological attributes) and its variability, there is now sufficient evidence that the dynamical response is significantly influenced by the connectivity patterns that emerge as a result of the interaction between the spatial structure [Schulz et al., 2006] and process dynamics. The connectivity patterns [Michaelides and Chappel, 2008] may exist persistently in the form of continuum or network organization. Alternatively, they may be ephemeral with temporal and spatial thresholds for establishment, periods, and spatial extent of persistence and rapid or gradual dissipation. For example, a fill-and-spill hypothesis has been proposed to explain the threshold flow behavior arising when subsurface stores fill to create a connected pattern of flow paths [Tromp-van Meerveld and McDonnell, 2006]. Rainfall-runoff response has been found to vary according to the connectivity patterns of soil moisture [Western et al., 2001] and to exhibit threshold response [James and Roulet, 2007; Zehe and Sivapalan, 2009]. Convergence of biogeochemical constituents through connectivity patterns gives rise to biogeochemical hot spots as patches that show disproportionately high reaction rates relative to the surrounding matrix and hot moments as short periods of time that exhibit disproportionately high reaction rates relative to longer intervening time periods [McClain et al., 2003]. Spatially localized persistence of hot moments gives rise to hot spots, and the linked transport and reaction pathways give rise to interfaces (and may be associated with state transformations) and account for the net fluxes. Implicit in this characterization is that the resolution of event-scale dynamics, which is localized in space and time, has significant bearing on prediction.
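The threshold behavior associated with fill and spill can be sketched with a chain of stores. The toy model below is a hypothetical simplification made here, not the formulation of the cited study: initially empty stores are arranged downslope, each receives the same event rainfall plus any spill from the store above, and a store passes water on only once its capacity is exceeded. The capacities and rainfall depths are made-up values.

```python
def fill_and_spill(capacities, rain):
    """Route one rainfall event through a downslope chain of empty stores.

    Returns the outflow from the last store and a list of flags marking
    which stores spilled, i.e., which links of the flow path are connected.
    """
    spill = 0.0
    connected = []
    for capacity in capacities:
        # Water available to this store: its own rain plus upslope spill.
        spill = max(0.0, spill + rain - capacity)
        connected.append(spill > 0.0)
    return spill, connected


capacities = [5.0, 10.0, 5.0]  # hypothetical storage capacities
for rain in (4.0, 6.0, 8.0):
    outflow, connected = fill_and_spill(capacities, rain)
    print(rain, outflow, connected)
```

With these values the smallest event produces no outflow, the intermediate event connects only part of the chain, and the largest event connects the full flow path, so the outflow responds to rainfall in a strongly nonlinear way even though every store obeys a simple rule.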

[23] The role of event-scale dynamics brings into question the entire notion of scaling up of model parameters or estimating “effective” parameters using calibration since what is required is scaling characteristics not of the heterogeneity but of the dynamics that “lives” on these heterogeneities. It may be envisioned that the spatial patterns of heterogeneities arise as an emergent property of the interaction between the landscape properties and the dynamics [Lehmann et al., 2007; McDonnell et al., 2007]. Spatial organization may change rapidly from highly disconnected to connected patterns, thereby enabling enhanced throughput of fluxes. These emergent patterns, therefore, serve a dynamic function [Sivapalan, 2005], and our predictive ability will be enhanced by including them explicitly in our models.

[24] Connectivity patterns may also be associated with different dynamic regimes defined as the stable basins of attraction in a state space of the component dynamics (set of all variables that are relevant for further growth of the system). New connectivity patterns established as instabilities are either enhanced or dissipated as a result of the tug of war between the positive and negative feedbacks and the possible transitions of the dynamical system from one dynamic regime to another [see Dent et al., 2002, Figures 3, 5, and 6]. It is also possible that there is a continuous tradeoff between strong and weak links [Csermeley, 2006] in the dynamics. For example, soil moisture controls on the exchange at the land-atmosphere interface have different strengths [Koster et al., 2004] that may change as the dynamic regime changes. Since the variability is an inherent property of the hydrologic cycle [Kumar, 2007], the latter continuously explores alternate dynamic regimes and therefore connectivity patterns. Indeed, one of the possible consequences of the climate- and human-induced changes is that the systems may spend different fractions of time in alternate dynamic regimes rather than creation of new dynamic regimes.

3.5. Predictability Problem of Cross-Scale Interaction in Time and Space

[25] The flow of water carves its own path. The strong nonlinear dynamic coupling between the flow of water (dynamics) and the pathway of the flow (landscape organization or structure) introduces changes in the magnitude and variability of the water cycle itself. This coevolutionary dynamics, comprising the interplay between fast and slow modes, has significant consequences because variability is an important mechanism of communication between biotic and abiotic components connected through the water cycle, and this communication leads to adaptive self-organization [Kumar, 2007]. The interaction between the flow and landscape dynamics happens at a variety of time scales ranging from near instantaneous (e.g., landslides) to geologic. Often the slower dynamics constrains the faster dynamics (what Haken [1983] describes as the “slaving principle”), but over a longer time, the faster dynamics influences the slower dynamics. This can be understood by observing that after an extreme or catastrophic event, such as a flood, the morphology of a severely altered stream channel determines the immediate flow dynamics, but over a longer period the landscape attempts to recover some, if not all, of its original morphology. Another example is where slower-mode soil moisture dynamics of deep layers constrains the faster land surface energy fluxes through hydraulic redistribution [Amenu and Kumar, 2008]. Often “fine scale processes propagate non-linearly to have broad scale effects and, conversely, there are situations when broad-scale drivers overwhelm fine-scale processes” [Peters et al., 2004, p. 15,130]. This cross-scale interaction acts in both space and time and can lead to catastrophic behavior [Peters et al., 2007].
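The two-way coupling between fast and slow modes can be sketched with a minimal linear system. The example assumes a hypothetical fast variable f (say, a flow state) that relaxes quickly toward a slow variable s (say, a channel-morphology index), while s relaxes slowly toward a rest state and is weakly nudged by f. The equations, time constants, and names are illustrative assumptions and not a model from the cited work.

```python
def simulate_fast_slow(f, s, s_rest=1.0, forcing=0.0, tau_fast=0.1,
                       tau_slow=50.0, beta=0.01, dt=0.01, t_end=1.0):
    """Forward-Euler integration of a hypothetical fast-slow pair.

    df/dt = (s + forcing - f) / tau_fast              # fast mode tracks the slow mode
    ds/dt = (s_rest - s) / tau_slow + beta * (f - s)  # slow mode recovers, nudged by f
    Returns the final (f, s).
    """
    for _ in range(int(round(t_end / dt))):
        df = (s + forcing - f) / tau_fast
        ds = (s_rest - s) / tau_slow + beta * (f - s)
        f += dt * df
        s += dt * ds
    return f, s


if __name__ == "__main__":
    # An extreme event abruptly displaces the slow variable from 1.0 to 2.0.
    print("shortly after event:", simulate_fast_slow(f=1.0, s=2.0, t_end=0.5))
    print("long after event:   ", simulate_fast_slow(f=1.0, s=2.0, t_end=500.0))
    # Sustained forcing of the fast variable slowly shifts the slow variable.
    print("sustained forcing:  ", simulate_fast_slow(f=1.0, s=1.0, forcing=1.0,
                                                     t_end=500.0))
```

In this sketch the fast variable locks onto the displaced slow variable within a few multiples of tau_fast (the slaving behavior), the slow variable recovers toward its rest state over multiples of tau_slow, and sustained forcing of the fast variable moves the slow equilibrium to s_rest + beta * tau_slow * forcing.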

[26] Connectivity and spatial complexity driven by flow dynamics play a significant role in determining the cross-scale coupling strength. Typical modeling efforts implement the slower dynamics as an imposed boundary condition or parameterized constraint with little scope for their evolution driven by the cross-scale coupling. Identification of cross-scale coupling through experiments and observations is needed; such experiments and observations are challenging because of the disparate space and time scales involved. As a result, modeling strategies that enable effective cross-scale coupling do not yet exist, resulting in severe limitations in our predictive ability in a variety of situations. Evolving emergent patterns of form and function across the landscape may serve to illustrate the cross-scale coupling and possibly a way to incorporate it into models.

4. Summary and Discussion

[27] I have elucidated that the traditional classification of predictability challenges is based on the assumption of a static landscape that does not change under the influence of the dynamics it supports. Further, it is largely targeted to address the problem of predicting an observed trajectory (Type F prediction). I have argued that characterizing the behavioral space of possible outcomes, possibly leading to the discovery of new phenomena (Type N prediction), is an important dual problem. This problem is particularly relevant for situations where landscape organization is subject to change either in response to slow modes of the dynamics or because of exogenous factors such as human impact or environmental change (Figure 1). Under these circumstances, new connectivity patterns may arise, leading to ephemeral or persistent new interactions. The connectivity patterns enslave the short-term dynamics, typically resulting in threshold-type responses and alternate dynamic regimes. The problem of representing the coevolution of systems with vastly different time scales of feedbacks is an open and challenging problem. However, this is the type of interaction that leads to nonstationary as well as catastrophic responses. Given this categorization, new challenges emerge for formulating a suitable “theory of evaluation” [Gupta et al., 2008] using model diagnostics that attempt to find those components of the model that can explain the observed discrepancy between observation and prediction. For appropriate attribution, the model diagnostics need to consider prediction errors in conjunction with the model structure error arising from the attributes for facilitating emergent connectivity patterns and coevolutionary dynamics across space and time scales involving a hierarchy of process representations. This will require novelty in defining predictable structure(s) in a model's behavioral space and developing quantitative methodologies for detection and attribution. New observational and interpretation frameworks may need to be devised to disentangle the fast and slow dynamics and their coupled behavior so that a suitable framework for model representation as well as model diagnostics may be developed.

Figure 1. Summary of concepts in the characterization of the typology of hydrologic predictability.

[28] Climate change and human impact on nearly all aspects of the hydrologic cycle are limiting the use of the past as a predictor of the future [Milly et al., 2008]. New modes of science need to be developed to predict the evolutionary dynamics of the water cycle where quantity, quality, variability, and flow paths are all subject to change, and new connectivity and cross-scale interactions are likely to emerge. In this context, rare events of extreme significance and consequence, or “black swans” [Taleb, 2007], are likely to play an important role. We may have already seen such events in the context of large floods or persistent droughts or desertification. To anticipate surprises and manage them, it is imperative that hydrologic science explore the “possible” with just as much vigor as the “probable.” The value of hydrologic science cannot be determined by the forecast needs alone or, for that matter, hindcasts, but more so through the predictions of novel phenomena so we may be better prepared to expect the possible, albeit with small probability. Bridging the gap between Type N and Type F predictions is, therefore, a fundamental imperative. While vigorous efforts during the past decades have generally addressed the Type F problem in improving predictability of the first three kinds, new paradigms of investigations, including suitably designed observational systems, are needed to address the Type N problems with a similar vigor. Investigations will need to reconcile process complexity, model complexity, computational complexity, and observational density so that one does not limit the others for improving predictability. This will require a radical departure from thinking deeply rooted in the shackles of calibration and equifinality. That is, models based on parameters estimated to reproduce past observations may not exhibit the behavioral response that encompasses future events, and there is no one-to-one mapping between initial and final states. It will also require that the debate on the dichotomy of reductionism and holism, or bottom-up and top-down [Savenije, 2009], be abandoned to develop a thinking that recognizes the cross-scale dynamic connectivity between the small-scale component behavior and the system-scale emergent patterns of form and function. This has been argued in various ways, such as bridging the Newtonian and Darwinian approaches [Harte, 2002] or Baconian and Cartesian science [Dyson, 2002]. Recent research [Schulz et al., 2006; Tromp-van Meerveld and McDonnell, 2006; Jencso et al., 2009; Lane et al., 2009; Gomi et al., 2008; Peters et al., 2004; Ruddell and Kumar, 2009a, 2009b] suggests that we may already be laying the foundations. Note that the distinction of models as deterministic, stochastic, conceptual, etc., is irrelevant to this discussion as they are all subject to the same argument. Indeed, we may need alternate or blended perspectives to capture the multitude of aspects to narrow the uncertainty range of predictions.
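The limitation of calibration noted above can be sketched with a toy example. It assumes a hypothetical threshold-response runoff model in which runoff is linear in precipitation below a threshold and gains an extra contribution (standing in for newly connected flow paths) above it. The model form, parameter values, and data are invented for illustration and are not drawn from any cited study.

```python
def runoff(precip, k, threshold, gain):
    """Hypothetical model: linear response plus an extra term above a threshold."""
    return k * precip + gain * max(0.0, precip - threshold)


def sse(params, precip_series, observed):
    """Sum of squared errors of the model against observations."""
    return sum((runoff(p, *params) - q) ** 2
               for p, q in zip(precip_series, observed))


# Assumed "true" system and a historical record that never crosses its threshold.
TRUE_PARAMS = (0.3, 60.0, 1.5)          # (k, threshold, gain)
HISTORY = [5.0, 10.0, 20.0, 30.0, 40.0]
OBSERVED = [runoff(p, *TRUE_PARAMS) for p in HISTORY]

# Two candidate calibrations that differ only in the unexercised threshold branch.
CANDIDATE_A = (0.3, 50.0, 0.5)
CANDIDATE_B = (0.3, 80.0, 2.0)

if __name__ == "__main__":
    print("historical SSE, A:", sse(CANDIDATE_A, HISTORY, OBSERVED))
    print("historical SSE, B:", sse(CANDIDATE_B, HISTORY, OBSERVED))
    extreme = 100.0  # an event with no historical precedent
    for name, params in (("truth", TRUE_PARAMS), ("A", CANDIDATE_A),
                         ("B", CANDIDATE_B)):
        print(f"runoff for P={extreme:.0f}, {name}:", runoff(extreme, *params))
```

Both candidates reproduce the historical record equally well, because the record never exercises the threshold branch, yet they disagree with each other and with the assumed truth once an unprecedented event crosses it. Goodness of fit to the past therefore does not constrain the behavioral response that matters for Type N prediction.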

[29] Hydrologic discourse is rife with notions such as “all models are wrong but some are useful” and “good models don't exist; they can be only made better” [Savenije, 2009]. These notions are a reflection of the challenge of improving predictability in the context of the complexity of the hydrologic cycle. Models are far more than tools; they are indispensable frameworks for representing the principles of nature as a cognitive system. They enable us to reduce the dimensionality of observation, provide explanation for understanding, and elucidate cause and effect. They are arrived at through observations and experimentation as empirical studies lead to theoretical synthesis and then to computational predictions. They allow us to “prophesy” [Beven, 1993] beyond the extent of the observed by allowing us to fill the gaps of the unobserved or unobservable using what is known. Models are an indispensable component of the “macroscope” [Rosnay, 1979] that enables us to understand the complexity of the coupled human and natural systems, in that they allow us to bridge the realm of the observed with the realm of the possible.

[30] With the advent of low-cost and ubiquitous sensing we may be approaching an era where the phenomenon described through observation may become more reliable than that described through our models [Hey et al., 2009] because of the sheer density of measurement of covarying variables. Mining of large multivariate databases may reveal hidden, and perhaps subtle but essential, dependencies that go beyond the “dominant” modes of exploration [White et al., 2005]. They are likely to reveal hypotheses that can then be tested through experiments, reversing the usual method of hypothesis-driven observational design. In the context of this emerging framework of intensive data-driven hydrologic science, there is a significant possibility that the primacy of mathematical model–based prediction is itself at stake, and data-driven models or some amalgamation thereof are likely to emerge as an alternative framework for predictive explorations.


[31] The research presented here has been supported by NSF grants ATM-0628687 and EAR-0636043 and NOAA grant COM NA06OAR4310053. Feedback and discussions with several colleagues, including students and postdocs over several years, are gratefully acknowledged.