Field studies in watershed hydrology continue to characterize and catalogue the enormous heterogeneity and complexity of rainfall runoff processes in more and more watersheds, in different hydroclimatic regimes, and at different scales. Nevertheless, the ability to generalize these findings to ungauged regions remains out of reach. In spite of their apparent physical basis and complexity, the current generation of detailed models is process weak. Their representations of the internal states and process dynamics are still at odds with many experimental findings. In order to make continued progress in watershed hydrology and to bring greater coherence to the science, we need to move beyond the status quo of having to explicitly characterize or prescribe landscape heterogeneity in our (highly calibrated) models and in this way reproduce process complexity and instead explore the set of organizing principles that might underlie the heterogeneity and complexity. This commentary addresses a number of related new avenues for research in watershed science, including the use of comparative analysis, classification, optimality principles, and network theory, all with the intent of defining, understanding, and predicting watershed function and enunciating important watershed functional traits.
1. Introduction
 Watersheds exhibit a wide array of heterogeneity in landscape properties and complexity of their responses to spatiotemporally varying climatic inputs. As a result, watershed hydrology lacks the compact organization of empirical data and observations of watershed responses that will facilitate extrapolation to and prediction of watershed behavior in different places and across a range of scales. Models based on our current small-scale theories emphasize the explicit mapping of more and more of the heterogeneities of landscape properties and the resulting process complexities (an impossible task at even the most intensively studied watersheds). Consequently, the models based on current theories rely on calibration to account for our lack of knowledge of the spatial heterogeneities in landscape properties and to compensate for the lack of understanding of actual processes and process interactions.
 In this commentary, we outline a new vision for watershed hydrology that seeks new understanding of how watersheds work and propose a number of new directions of where the field might head in the coming years. We focus on the incorporation of ideas from ecology and related disciplines into hydrology and seek to define new generalizable insights and understanding. We ask whether there is a simple explanation for the existence of landscape heterogeneities and process complexity and simple ways to describe organizing principles that govern their emergence, maintenance and interconnections. While not a blueprint for the way forward, we hope that these ideas will stimulate some new thinking in watershed science, from both a field and modeling perspective to move us forward.
2. Motivation and Historical Perspective
 Two decades ago, a special issue titled “Trends and Directions in Hydrology” appeared in this journal. This special issue dealt with the need to bring greater coherence to the study of hydrology. One paper in that special issue by Jim Dooge [Dooge, 1986], titled “Looking for hydrologic laws,” presented a new vision for the science of hydrology: a veritable manifesto for change. Dooge suggested a three-pronged framework for theory development that included (1) searching for new macroscale laws, (2) developing scaling relations across watershed scales, and (3) upscaling from small-scale theories. Dooge's manifesto for change was equivalent to a combination of what we might now call top-down and bottom-up investigations [Sivapalan, 2005]. His vision was a search for scale invariance and scale dependence of landscape properties and watershed responses. Some twenty years later, the critical ideas and positive vision presented in that paper remain just as fresh, relevant and, unfortunately, very much unfulfilled.
3. What's Wrong With the Status Quo?
 Recent work in watershed science has revealed new and interesting puzzles of heterogeneity and process complexity (here, we define process complexity as the degree to which a process is difficult to observe, understand or explain). Yet most field experiments and observations in watershed science to date remain largely descriptive. Many of these field studies have not set out to seek fundamental truths or understanding (nor to test any formal theory or hypothesis per se) regarding watershed behavior, and hence their results have been difficult to generalize. Experimental watershed studies generally have not been driven by hypotheses governing general behavior (across places and scales). As a result, progress in developing new methods has been driven mainly by newly available technology capable of generating more data at higher temporal and spatial resolution. As a community, and as individuals, we have progressed along a philosophical path that says that “if we characterize enough hillslopes and watersheds around the world through detailed experimentations, some new understanding is bound to emerge eventually.” What this approach to experimental design has succeeded in doing is to help characterize the idiosyncrasies of more and more watersheds, in different places and at different scales, but with little progress toward realizing the Dooge vision.
 Current models in watershed science are based on well-known small-scale theories such as Darcy's law and the Richards equation, built into coupled balance equations for mass and momentum. As Beven and many others have noted, almost all physically based distributed models base their model architecture on the blueprint presented by Freeze and Harlan. Aided by increased process understanding, digital terrain attributes and increased computing power, a large number of highly sophisticated models have been developed [Singh and Frevert, 2002]. Physically based, spatially distributed models can, in principle, produce distributed predictions of state variables and fluxes over the body of the watershed and give the appearance of great realism and theoretical rigor. In reality, however, there are many unresolved issues with the current generation of physically based distributed models, arising from the fact that their theoretical foundation is still small-scale physics or theories [Kirchner, 2006]. There have been difficulties in their application to larger watersheds due to the effects of spatial heterogeneity in landscape properties, the inherent nonlinearity of many hydrological processes and process interactions at all scales. Together these give rise to a change of dominant processes or the emergence of new processes with the increase of spatial and temporal scales [Klemes, 1983; Sivapalan, 2003], which are not yet fully understood. Consequently, the resulting models are heavily overparameterized, and many combinations of parameter values can yield the same final result, leading to a large degree of predictive uncertainty [Beven, 2000].
 In spite of their apparent physical basis and their complexity, the current generation of detailed models is process weak even at small scales. While the Darcy-Richards equation approach as a subgrid-scale parameterization is often consistent with point-scale measurements (tensiometers, TDR, etc.) in soils that are dominated by matrix flow, it often breaks down at larger scales or in soils dominated by preferential flow [Weiler and Naef, 2003]. For example, recent work at the scale of entire hillslopes shows clear network-like preferential flow structures that control the timing and spatial location of mobile water flow during rainfall-runoff events [Weiler and McDonnell, 2007]. Subsurface features such as the bedrock topographic surface have been observed in many field studies to control the lateral mobile water flux. Delivery of this water to streams is highly threshold-dependent based on precipitation amount [Buttle et al., 2004; Tromp-van Meerveld and McDonnell, 2006]. Lack of, or only intermittent, connectivity of subsurface flow systems and of the flow pathways between upslope and riparian zones of hillslopes also contributes to highly nonlinear behavior, especially in semiarid environments, invalidating the assumptions built into many of our current models [Ocampo et al., 2006].
 The old water paradox [Martinec, 1975; Kirchner, 2003] is another example of where our field evidence is at odds with the formulation of our models. Most models give us the wrong relationship between the transit time of the water and the transit time of solutes [Vaché and McDonnell, 2006], although rigorous implementation of process knowledge appears to provide improved solutions at least at some specific spatial scales [Weiler and McDonnell, 2007; McGuire et al., 2007]. Recent work suggests that the scaling relationships used in our models are also at odds with our experimental findings [McGuire et al., 2005]. Here again, we are certainly not the first to make this point. Indeed, as far back as 1983, Dunne [1983, p. 25] noted that “runoff concepts need to be refined, developed and formalized through more vigorous combination of rigorously defined field experiments and realistic physically based mathematical models.”
4. Possible Ways Forward
 How will we respond to the challenge of Dooge and discover the macroscale “laws” that govern hydrological responses of watersheds? How might we define a new theory base for watershed hydrology, and what will be its essential elements? How will the new experimental findings be accommodated within new theories of watershed hydrology and how will they lead to very different modeling approaches? We believe that advances in this direction will come from insightful analysis of landscape heterogeneity and process complexity, an understanding of watershed function, and the exploration of the underlying organizing principles that connect pattern and process with function [Sivapalan, 2005]. These directions are themselves closely related, with much overlap in what each may provide. Below, we outline some ideas pertaining to each, in the hope of stimulating debate in this regard.
4.1. Asking Why Heterogeneity Exists
 Heterogeneity and process complexity are ubiquitous at all scales: (1) pore geometry at the smallest scale, fractures, macropore networks, soil layering and other preferential flow arrangements in soils, (2) heterogeneity of vegetation canopies, root distribution in the subsurface and complex root water uptake behavior, (3) geological heterogeneity including bedrock topography and its composition, (4) complex spatial patterns of soil moisture and groundwater flow, and (5) input heterogeneity of rainfall and snowmelt. All of these are increasingly recognized as very important in governing watershed responses, but are inherently difficult to observe. Of course, as our ability to observe the world improves, we can improve our capacity to characterize and quantify, at all scales, the heterogeneity of landscape properties, as well as the complexity of resulting processes and process interactions. However, we would still argue that simply further prescribing this heterogeneity and describing the resulting process complexity in ever greater detail will not move us beyond the current theoretical bottleneck. In other words, enhanced observational capability alone will not improve our ability to extrapolate to ungauged locations. The more we explore, the more heterogeneous and complex nature appears to be, and thus model predictions will still be saddled with the problem of equifinality [Beven, 2006]. Furthermore, heterogeneity and complexity may change over time, with climate changes and landscape disturbance further exacerbating the rate and direction of these changes.
 The question therefore is whether there is a simple and theoretically more elegant alternative to describing the heterogeneity and complexity that may enable us to make predictions that are right for the right process reasons. We need to make models more realistic and useful, but we must also figure out a way to embed heterogeneity, or the consequence of the heterogeneity, into models in a manner that does not require enormous amounts of generally unavailable data. To date, watershed hydrologists have largely asked questions of “what” heterogeneity exists, rather than “why” this heterogeneity exists. We argue that rather than asking questions of “what” (What is the peak flow in a given watershed? What is the traveltime in the stream? What are the dominant flow pathways in a hillslope?), we must begin to ask questions of “why”: Why is there preferential, network-like flow at all scales? Why is water in the stream so well mixed despite the ubiquity of preferential flow? Why are hydrological connections at the hillslope and watershed scale so threshold-like when the soil, climate, vegetation and water appear so tightly coupled? We argue that addressing these “why” types of questions will lead to more useful insights, and may present a new way forward toward making realistic predictions without having to prescribe all the gory details of heterogeneity that may be present in a watershed.
4.2. Watershed Functional Traits
 One way forward is to explain the answers to our “why-type” questions and the existence of heterogeneity and complexity in the context of watershed function. At the most basic level, watershed function might be defined as collection, storage and release of water [Black, 1996], but should also include the ecological functions of providing diverse sites for biogeochemical reactions and habitat for flora and fauna. This may be a simpler way to describe their genesis, the landscape patterns that form, the process complexity and richness they produce, and/or the underlying organizing principle that underpins their emergence, maintenance and interconnection [Schulz et al., 2006]. The spatiotemporal landscape and process patterns that arise as a result of the watershed could be described as “functional traits”. The functional trait is an ecological concept that assumes that, instead of analyzing the complex history of evolution (which is usually unknown), one can examine the net result (termed the traits), which embodies all relevant historic information. For example, plant stature and seed size have evolved as a result of selective evolutionary pressure and are deemed to be a fingerprint of past climate [e.g., Adler et al., 2004]. Our contention is that if we can connect the functional traits to watershed function, then the apparent heterogeneity and process complexity will collapse into a coherent and reproducible pattern, resulting in a simpler explanation for a set of observations than is presently available.
 While seemingly straightforward, explaining functional traits will require no less than a paradigm change in the way that we measure, model and observe watersheds: a change in focus from merely gauging to diagnosis (i.e., diagnosis of patterns, including patterns of soils, vegetation, and hydrologic flows that evolve in a particular place). Currently, we examine only the net result of all the forces that formed the complex landscape in a watershed. However, examining why the watershed has evolved as it has could lead to new, relevant historical information that may guide us to describe signatures of variability of, say, runoff and evaporation, and connections between these and temporal patterns of vegetation cover change, water quality, and so on. There is abundant evidence that generic and repeatable patterns do exist. Examples include the Budyko curve [Budyko, 1974] and many others that describe the tight functional relationships between climate, soils, vegetation and topography that arise from their coevolution. Water and hydrological processes play a central role in such coevolution.
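As one illustration of such a repeatable pattern, the sketch below evaluates a commonly cited closed form of the Budyko curve, which relates the long-term evaporative fraction E/P to the aridity index (potential evaporation divided by precipitation). The function names and the example aridity values are assumptions made for illustration; they are not taken from this commentary.

```python
import math


def budyko_evaporative_fraction(aridity_index: float) -> float:
    """Long-term E/P from a commonly cited form of the Budyko curve.

    aridity_index is PET / P (dimensionless); names are illustrative.
    """
    if aridity_index <= 0:
        raise ValueError("aridity index must be positive")
    phi = aridity_index
    return math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))


def runoff_fraction(aridity_index: float) -> float:
    """Long-term Q/P, assuming negligible storage change (Q = P - E)."""
    return 1.0 - budyko_evaporative_fraction(aridity_index)


if __name__ == "__main__":
    # Hypothetical aridity values spanning humid to arid climates.
    for phi in (0.3, 1.0, 3.0):
        print(f"PET/P = {phi:.1f}: E/P = {budyko_evaporative_fraction(phi):.3f}, "
              f"Q/P = {runoff_fraction(phi):.3f}")
```

The curve respects both physical limits: E/P cannot exceed the aridity index in humid (energy-limited) settings and cannot exceed 1 in arid (water-limited) settings. A single dimensionless input thus organizes the long-term water balance of very different watersheds.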
 Recent detailed field explorations of runoff generation processes have unearthed threshold behavior, competitive feedbacks, hysteresis, saturation-depletion behavior [Freer et al., 2002], and temporal pattern dynamics [Struthers et al., 2007; Naef et al., 2002], which are characteristic descriptors of watershed drainage arising ultimately from the heterogeneity of landscape properties. These can be viewed as emergent properties, that is, properties that could not be predicted from the component parts. The promise of a new theory of watershed hydrology will provide compelling motivation to embrace these nonlinearities as larger-scale manifestations of unknown, small-scale heterogeneities that contribute to and reflect the collective watershed function. It is our contention that the search for the organizing principles and traits underpinning watershed function will help us to better understand and interpret the patterns we can easily see and explore (such as river networks, vegetation patterns in space and time, and soil catenas), and will also help us to understand and predict patterns of variability that we cannot easily observe but which are nevertheless important for predicting the overall functioning of watersheds.
4.3. Watershed Classification and Similarity Analyses
 To date, our studies in experimental watersheds have identified the idiosyncrasies of many watersheds, and produced rather complex characterizations of watershed behavior. Attempts to extrapolate or regionalize observations of watershed behavior have been of limited value because of the difficulty in producing concise, easily understood explanations of watershed behavior [McDonnell and Woods, 2004]. A time-honored method for finding connections is to develop and use a classification system, which would allow us to group watersheds into distinct groups, and try to understand the differences between these places as opposed to the similarity between places within the groups. Therefore a crucial step may be to develop a watershed classification system [McDonnell and Woods, 2004; Wagener et al., 2007] based on dimensionless similarity indices or dominant hydrological processes [e.g., Naef et al., 2002] and newly defined functional traits that can help provide some structure to our classification approach. Chemistry uses the periodic table to group together those elements which have similar chemical properties. In more complex, less well-behaved systems (biology, for example), the hierarchical Linnaean system is used to classify organisms. Despite advances in genetics, the Linnaean system remains of central importance in biology. Closer to hydrology, fluid mechanics has developed dimensionless numbers such as the Reynolds and Froude numbers to classify different flow regimes; limnologists use similar numbers to classify lakes and distinguish different turnover rates or trophic status. We argue that hydrology should likewise aim to develop a hierarchical classification system and a set of dimensionless similarity indices to compare and contrast watershed traits in different places. While each class may still contain significant internal complexity, classification may group similar watersheds together, and thus limit the variability within each class. This will lead to increased targeting of the dominant process controls, and through this, to improved predictability in the long term.
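To make the idea of classification by dimensionless numbers concrete, the following sketch computes the Froude and Reynolds numbers mentioned above and, by analogy, sorts watersheds by aridity index. The watershed classifier, its class labels, and its single cutoff at an aridity index of 1 are hypothetical choices for illustration only; the commentary does not propose a specific index or threshold.

```python
import math

G = 9.81  # gravitational acceleration, m s^-2


def froude_number(velocity: float, depth: float) -> float:
    """Fr = v / sqrt(g * h) for open-channel flow (SI units)."""
    return velocity / math.sqrt(G * depth)


def reynolds_number(velocity: float, length: float,
                    kinematic_viscosity: float = 1.0e-6) -> float:
    """Re = v * L / nu; the default nu is roughly that of water near 20 C."""
    return velocity * length / kinematic_viscosity


def flow_regime(velocity: float, depth: float) -> str:
    """Classify open-channel flow by Froude number."""
    fr = froude_number(velocity, depth)
    if math.isclose(fr, 1.0, rel_tol=1e-9):
        return "critical"
    return "subcritical" if fr < 1.0 else "supercritical"


def watershed_class(mean_pet: float, mean_precip: float) -> str:
    """Hypothetical one-index classifier based on aridity index PET / P."""
    aridity = mean_pet / mean_precip
    return "energy-limited" if aridity < 1.0 else "water-limited"


if __name__ == "__main__":
    print(flow_regime(velocity=1.0, depth=1.0))
    print(flow_regime(velocity=5.0, depth=0.1))
    print(watershed_class(mean_pet=600.0, mean_precip=1500.0))
```

A practical watershed classification would presumably combine several such indices in a hierarchy, as the text argues; the single-index version here only shows the mechanics.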
4.4. Scaling Behavior and Emergent Properties
 Heterogeneity exists at multiple spatial scales and hence the effects of that heterogeneity also manifest over a wide range of scales. One approach to dealing with heterogeneity without its detailed characterization is to focus on those properties that emerge with increasing scale, and on their resulting hydrological effects. It is possible that as we look more closely and find a way to filter out unimportant details, we might begin to see emergent features or properties that serve as a natural skeleton to connect descriptions of hydrological responses across scales [Sivapalan, 2003]. They could then form the basis of models that are inherently simpler at the macroscale, but with sufficient links to essential aspects of the detailed heterogeneity and complexity observed at the microscale.
 One notable example of an emergent property is the notion of connectivity. The effects of heterogeneity on hydrological responses manifest through the water flow pathways, and the degree of connectivity of these pathways, as well as how connectivity changes with time as a result of interactions with climate and landscape elements. An alternative to developing macroscale parameterizations of watershed response in terms of the small-scale heterogeneities is to develop these parameterizations in terms of a measure of the degree of connectivity. This inevitably will give rise to new theories and new ways of parameterizing the effects of heterogeneity. The analysis could be done in two stages: (1) exploring the connection between the small-scale heterogeneity and the measures of the degree of connectivity, and (2) expressing the macroscale watershed response in terms of this degree of connectivity. This process could lead to parsimonious descriptions of watershed responses that have a better chance at extrapolation to other ungauged watersheds. Another example of an emergent property is the notion of traveltime. When it comes to timing of watershed storm responses, recent work has shown that the hydrological responses can be described in terms of tracer-based traveltime distributions [McGuire et al., 2005] with clear scaling rules, not apparent in physical flow data. By focusing on a concept that easily connects and can also be easily scaled [Sivapalan, 2003], we define a macroscale representation that is clearly tied to process descriptions at small scales, but is not overly complex in terms of the needed characterization of landscape heterogeneity.
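The two-stage connectivity analysis described above can be sketched on a toy grid. In this sketch, stage 1 reduces a map of wet and dry cells to a single connectivity measure, and stage 2 expresses the response in terms of that measure alone. The grid layout, the choice of column 0 as the stream, the 4-neighbour rule, and the linear runoff relation are all assumptions made for illustration, not methods taken from the cited studies.

```python
from collections import deque


def connected_fraction(wet, stream_col=0):
    """Stage 1: fraction of all cells that are wet and linked to the stream
    column through an unbroken chain of wet 4-neighbours."""
    n_rows, n_cols = len(wet), len(wet[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    queue = deque()
    # Seed the search with wet cells that touch the (assumed) stream column.
    for r in range(n_rows):
        if wet[r][stream_col]:
            seen[r][stream_col] = True
            queue.append((r, stream_col))
    connected = 0
    while queue:
        r, c = queue.popleft()
        connected += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < n_rows and 0 <= nc < n_cols
                    and wet[nr][nc] and not seen[nr][nc]):
                seen[nr][nc] = True
                queue.append((nr, nc))
    return connected / (n_rows * n_cols)


def event_runoff(rainfall_mm, wet):
    """Stage 2 (hypothetical): runoff scales with the connected wet fraction."""
    return rainfall_mm * connected_fraction(wet)


if __name__ == "__main__":
    # Small-scale heterogeneity: True marks a wet (saturated) cell.
    wet_map = [
        [True,  True,  False, True],
        [False, True,  False, True],
        [True,  False, False, False],
    ]
    print(connected_fraction(wet_map))
    print(event_runoff(30.0, wet_map))
```

In this example two wet cells on the right-hand side never contribute because no wet path links them to the stream, which is the kind of behavior a connectivity-based parameterization is meant to capture without mapping every cell in a real watershed.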
4.5. Network-Like Flows and Optimality Principles
 One of the most remarkable aspects of natural heterogeneity and process complexity arises because of network-like structures and flows. Networks abound in nature. Indeed, network-like preferential flow is ubiquitous across all scales, and appears to be related to or is the inevitable consequence of the watershed's overall functioning. The balance of micropores and macropores may self-organize in all watersheds, from the smallest stream draining the smallest watershed to the largest river draining a large part of a continent. Network-like patterns can also be observed, associated with both root structure and above ground plant canopy architecture, and can be deemed as functional traits, just like the surface drainage network. One of the most significant challenges faced by hydrologists is admitting and confronting the reality that there is network-like flow at all scales, from large intact soil cores that show a duality of matrix flow and preferential flow [Clothier et al., 1998] and complex network-like infiltration behavior into natural soils [Weiler, 2005] to hillslope drainage systems where slow diffusive water and tracer movement is balanced during events by organized slope-scale network preferential flow [Sidle et al., 2000; Weiler and McDonnell, 2007]. Identifying the organizing principles behind these functional traits is the key to fundamental new understanding of hydrologic systems that will ultimately permit reliable predictions.
 Optimality principles applied to watersheds may offer a framework to begin this process of identification. In theoretical ecology, optimality theory assumes that evolution produces an optimal behavior, and ecologists attempt to determine the characteristics of the optimal behavior so it can be compared with observed behavior. Nature evolves in an interactive way under constraints: limits to the amount of energy available, water available, nutrients available for vegetation, constraints of land surface slope, etc. Much as our ecological colleagues have postulated optimality as a way to learn about plant systems and their relation to the environment, we might likewise do the same (even while acknowledging the potential dangers along the way). Whereas the objective criteria for optimality in ecology are often things like limits on water use and carbon gain, in hydrology these criteria might be minimum work and minimum (or maximum) entropy generation. Geomorphologists have long argued that the form of watersheds arises naturally from their apparent tendency to minimize flow resistance. The same may be true for the more hard-to-see patterns of internal drainage within hillslopes and soils. Our ability to understand how biology, geology, geomorphology, and climate define watershed systems and affect watershed behavior could be greatly enhanced if we begin to ask questions related to quantities that are being optimized and why, and the constraints within which any such optimization takes place.
4.6. Predictions That Include Generalities and Contingencies
 A student of the philosophy of science would argue that a new theory should be accepted if (1) it makes more precise assertions and these more precise assertions stand up to more precise tests, (2) it takes account of and explains more field observations than do current theories, (3) it passes tests that current theories fail, (4) it suggests and passes new experimental tests not previously considered, and (5) it unifies or connects various hitherto unrelated problems [Popper, 1961]. The combination of functional traits, watershed classification and similarity analysis, scaling of watershed behavior and the set of optimality principles would together constitute the elements of a new theory of hydrology at the watershed scale. The optimality principles are testable and should be tested by recourse to observations, especially of observable functional traits.
 One concern about adding optimality, or any other organizing principle, to our theoretical toolkit is the issue of uniqueness of place [Beven, 2000] and how this may override any organizing or optimality principles in a given locale. Every watershed is unique: each has evolved differently in response to its own climatic and geological features, history and initial conditions, and so the exact pattern and process that result will differ, even under similar conditions. However, the more we understand the general, the more we accept and understand the anomaly or the outlier from the mainstream. In a sense, this goes to the heart of the question of what the limits to predictability are [Blöschl, 2006]. Our efforts at generalization should not be abandoned simply because we will never be fully able to predict each individual watershed.
 Indeed, uniqueness of place is neither an unusual problem nor a problem unique to hydrology. Phillips [2004, p. 39] addressed this issue in relation to geomorphology:
Many areas of science are characterized by creative tension between a search for fundamental laws and generalities that are independent of place and time and the recognition—particularly in the earth and environmental sciences—that geography and history matter.… General laws are acknowledged and utilized, but as constraints and context to the specific events, objects, or situations that are the basis of explanation.
 Harte has also addressed this issue in relation to ecology, and called for a synthesis of the Newtonian and Darwinian approaches. The appeal to the use of optimality and other organizing principles as a way of overcoming issues of landscape heterogeneity and process complexity is not the complete abandonment of the traditional Newtonian or reductionist approaches. In watershed science, a combination of the Newtonian worldview (based on established mass and energy balances and the specific boundary and initial conditions) with a Darwinian worldview can lead to descriptions of watershed function based on the conditions which constrain the watershed throughout its long-term evolution.
 As we move through the International Hydrological Decade on Prediction in Ungauged Basins [Sivapalan et al., 2003], it is timely that we use the experience gained from numerous past process studies and model development to develop new approaches for prediction. New approaches should rely not on calibration, but rather on systematic learning from observed data, and on increased understanding and search for new hydrologic theories through embracing new organizing principles behind watershed behavior that are derived from our sister disciplines. Most of our measurement campaigns continue to be driven by a desire to increase precision in quantification rather than to develop or test theory. We should instead focus on the development of systematic measurement programs that are specifically targeted to the generation of tests of new theories. This is especially true for new ecological and hydrological observatory networks proposed in the USA and Europe. The quest for new theories that can help us explain natural variability and provide improved predictions based on understanding must be embedded into the design of new observatories. We would argue that any mapping or characterization of landscape heterogeneity and process complexity must be driven by a desire to generalize and extrapolate observations from one place to another, or across multiple scales, and must not be allowed to perpetuate the notion of characterization or mapping for its own sake. Since the principles that govern the self-organization and coevolution of landscape structure and the critical role of hydrologic processes in governing the watershed function involve many disciplines, the design of hydrologic observatories and field campaigns must also be interdisciplinary; the traditional hydrological perspective is too narrow and must be broadened to embrace these new interdisciplinary perspectives. 
This extends to the way we analyze and learn from data, as well as the way we model and predict watershed responses.
 This paper resulted from a CUAHSI vision workshop on “From new descriptions of watershed form and function to new model blueprints,” convened 14–26 June 2004 in Corvallis, Oregon, and funded by NSF grant 03-26064. This document captures much of the flavor of the discussions that occurred at that time. While the meeting did not result in an overall consensus regarding concrete steps toward new theory, all coauthors agree that new thinking and broad discussion are needed to further develop the science of hydrology. Student observers at the meeting are thanked for input and participation, including Ilja Tromp van Meerveld, Kevin McGuire, Willem van Verseveld, Derek Godwin, Cara Poor, and David Rupp. Christina Tague is thanked for her participation in part of the discussions. Postworkshop discussions with Sue Kieffer, Nick Tuffalaro, Adrian Bejan, and Chris Graham were very helpful. Adam Mazurkiewicz, Richard Keim, and Patrick Bogaart are thanked for their critical reviews. The comments of four anonymous reviewers were very helpful. Finally, TU Delft is thanked for their support of the first author during the final preparation of this paper.