## 1. Introduction

[2] The hydrologic models in use today build on a legacy of hydrologic applications in different disciplines. *Singh and Woolhiser* [2002] provide an excellent discussion of the historical context leading to the development of many of the existing hydrology computer models, which range in complexity from simple bucket models to detailed multiprocess models of water flow physics. Some of the earliest hydrologic models were forms of the rational method used to convert rainfall rates to estimates of runoff (based on the comprehensive and incisive work of *Mulvany* [1851]) for designs of urban sewers, drainage systems, or reservoir spillways [*Todini*, 1988]. Through the latter part of the 19th century and early 20th century, hydrologic “models” mostly consisted of empirical formulas or modified forms of the rational method [*Dooge*, 1957; *Todini*, 1988], with model development largely driven by the need to address particular engineering problems.

[3] With the advent of digital computers, new possibilities emerged for hydrologic modeling. The Stanford Watershed Model [*Crawford and Linsley*, 1966] was an early example of a modeling approach that took into account most of the rainfall-runoff processes that occur, creating a moisture accounting model structure that could be applied to a wide variety of catchments. When they produced this model, *Crawford and Linsley* [1966] were at the forefront of the use of digital computers for hydrologic modeling. They faced constraints of limited hydrologic data (largely daily average streamflow rates, some hourly increment precipitation, and in a few locations measured pan evaporation), and extremely limited computer power and storage, so they developed strategies for conceptualizing bulk catchment hydrologic processes within those constraints. Moore's law for computers had not been proposed, and no futurist anticipated the computer capabilities that are taken for granted in 2007. The history of the development of their model is given by *Crawford and Burges* [2004]. By the 1970s, hydrologic models were well-established tools, and the types of modeling approaches had begun to proliferate.

[4] To sort through the many existing models, numerous model classification schemes have been proposed [e.g., *Dawdy and O'Donnell*, 1965; *Clarke*, 1973; *Todini*, 1988; *Singh*, 1995; *Refsgaard*, 1996]. Of these, one of the most comprehensive classifications of mathematical models used in hydrology was introduced by *Clarke* [1973]. In this classification, models were considered either stochastic, with model variables displaying random variation, or deterministic, with model variables regarded as free from random variation. Both stochastic and deterministic models were classified by *Clarke* [1973] as conceptual or empirical, with conceptual models approximating in some way the physical processes. Within these groups, models could be either linear or nonlinear and either lumped, probability-distributed, or geometrically distributed. Lumped models are defined by *Clarke* [1973] as those that do not account for the spatial distribution of input variables or parameters. Distributed models, in contrast, account for spatial variability of input variables, with geometrically distributed models specifically expressing the geometric configuration of points within the model. As an addition to the definition of “distributed” proposed by *Clarke* [1973], *Todini* [1988] proposed dividing distributed approaches into “distributed integral” and “distributed differential.” Using those divisions, a distributed integral model is a network of connected lumped models, whereas a distributed differential model actually includes distributed flow calculations.
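The distinction between a lumped model and a distributed integral model can be illustrated with a minimal sketch. The linear-reservoir `Bucket`, its coefficient `k`, and the two driver functions below are hypothetical names chosen for illustration; they are not taken from *Clarke* [1973], *Todini* [1988], or any particular model. The sketch assumes that inputs are volumes per time step and that units are connected in a simple upstream-to-downstream chain.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical lumped unit: one linear reservoir with no internal spatial detail."""
    k: float              # assumed fraction of storage released per time step (0 < k <= 1)
    storage: float = 0.0  # water currently held, in volume units

    def step(self, inflow: float) -> float:
        """Add the inflow, release k * storage, and return the outflow."""
        self.storage += inflow
        outflow = self.k * self.storage
        self.storage -= outflow
        return outflow


def run_lumped(rain, k):
    """Lumped: the whole catchment is a single bucket driven by catchment-total rainfall."""
    bucket = Bucket(k)
    return [bucket.step(r) for r in rain]


def run_distributed_integral(rain_by_unit, ks):
    """Distributed integral: a chain of lumped buckets, ordered upstream to downstream.

    Each unit receives its own rainfall series plus the outflow of the unit above it.
    Returns the outflow of the last (outlet) unit at each time step.
    """
    buckets = [Bucket(k) for k in ks]
    outlet = []
    for t in range(len(rain_by_unit[0])):
        routed = 0.0
        for bucket, rain in zip(buckets, rain_by_unit):
            routed = bucket.step(rain[t] + routed)
        outlet.append(routed)
    return outlet
```

In this sketch the spatial pattern of rainfall affects the outlet response only in the second function, because each unit is still lumped internally. A distributed differential model would instead solve flow equations over a spatial grid or mesh, which is beyond the scope of this illustration.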

[5] The modeling approaches we discuss largely fit within the deterministic-conceptual classification of *Clarke* [1973], and they are all geometrically distributed, representing water pathways from a fixed spatial coordinate (Eulerian) perspective. In practice, the term “distributed hydrologic model” is now widely used in hillslope and catchment (surface water) hydrology to refer to a model that represents in some way the spatial variability and pathways of water through a catchment. Most numerical groundwater models have long been distributed, as they are designed to represent spatial patterns in piezometric head. In contrast, the primary objective of “surface water” hydrologic models has traditionally been to simulate water fluxes at a specified point, usually the discharge in a stream at the outlet of a catchment. For this purpose, spatially explicit models are not always needed.

[6] Distributed modeling approaches have, however, gained significant attention in surface hydrology, for they have great potential as tools for applications such as nonpoint source pollutant transport, hydrologic responses to land use or land cover changes, land-atmosphere interactions, erosion and sediment transport, and many others. Because distributed models can incorporate topographic features, the effects of shade and aspect on hydrologic response, and geologic and land cover variability, they are increasingly used for scientific research, hydrologic forecasting, and engineering design applications. The spatially explicit structure of distributed models also offers the potential to incorporate spatial data from Geographic Information Systems (GIS), remote sensing, and geophysical techniques. Distributed models are currently used to facilitate incorporation of spatially explicit radar rainfall data, snow cover extent, soil moisture, and land surface temperature data into hydrologic response simulations. Although many existing distributed models are used largely for hydrologic research, agencies and private sector groups use distributed models for applications including streamflow forecasting, water resource engineering design, management, and operations, and land use planning.

[7] Structures of existing distributed models incorporate a fusion of factors including computational capacity, data availability, and the original theoretical framework used in building the model. The initial conceptualization of a physically based surface-subsurface hydrologic model, from a reductionist perspective, is attributed to *Freeze and Harlan* [1969], who introduced a blueprint for how such a model could be configured using partial differential equations of fluid flow in three spatial dimensions and time. *Freeze* [1974] followed up on this work and showed how the various components of this approach could be modeled numerically to demonstrate conditions that produce observable hillslope flow mechanisms. The simulations he included were performed using the largest and fastest computers available within the IBM Corporation. No other organization at the time had the resources to replicate this effort. Many modern distributed models are based on the Freeze-Harlan or a similar “blueprint,” with most including simplifications of the flow representations. Other distributed models were designed with a focus on surface water pathways through space, and many of these models represent only Hortonian (infiltration excess) overland flow. Some distributed modeling structures originated from the variable source area concept [*Hewlett and Troendle*, 1975], which was developed in response to evidence that saturation excess overland flow occurs in hillslope hollows, with the area contributing to overland flow varying over time. To simulate this process, models were developed that had the capability of representing moisture contents at different depths and positions on a hillslope.
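The two overland flow mechanisms named above differ in what limits infiltration, and a minimal sketch may make the contrast concrete. The function names, arguments, and the single-store representation of the soil are assumptions made for illustration only; they do not reproduce the formulation of any model cited here.

```python
def infiltration_excess(rain_rate, infiltration_capacity):
    """Hortonian runoff rate: the part of the rainfall rate above the infiltration capacity.

    Both arguments are rates in the same units (e.g., mm/h).
    """
    return max(rain_rate - infiltration_capacity, 0.0)


def saturation_excess(rain_depth, storage, storage_capacity):
    """Saturation excess runoff for one soil column over one time step.

    Rain fills the remaining soil storage first; whatever does not fit runs off.
    All arguments are depths in the same units (e.g., mm).
    Returns a tuple (runoff_depth, new_storage).
    """
    space = max(storage_capacity - storage, 0.0)
    infiltrated = min(rain_depth, space)
    return rain_depth - infiltrated, storage + infiltrated


def contributing_fraction(storages, storage_capacity):
    """Fraction of equal-area hillslope positions that are saturated.

    This is a crude stand-in for a variable source area: the fraction changes
    as the storages change over time.
    """
    saturated = sum(1 for s in storages if s >= storage_capacity)
    return saturated / len(storages)
```

Under these assumptions, infiltration excess depends on rainfall intensity, whereas saturation excess depends on how much storage remains at each position, which is why models built on the variable source area concept track moisture at different depths and positions on a hillslope.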

[8] Many efforts have been undertaken to compare distributed model outputs with one another [e.g., *Smith et al.*, 2004; *Reed et al.*, 2004; *Yang et al.*, 2000; *Wigmosta and Lettenmaier*, 1999; *Michaud and Sorooshian*, 1994; *Chen et al.*, 1994; *Loague and VanderKwaak*, 2002; *Pebesma et al.*, 2005], but model structures often differ so widely that it is difficult to isolate reasons for differences in model performance. With all of the complexity introduced into models, *Klemeš* [1986, p. 179S] warned that “hydrologic models make … ideal tools for the preservation and spreading of hydrologic misconceptions.” When reviewing hydrologic models, *Freeze* [1978] viewed the primary limitations to successful modeling as relating to theoretical assumptions, data scarcity, inadequate computer capacity, and limitations of calibration procedures. Nearly 30 years later, we no longer face inadequate computer capacity, but the other modeling challenges remain.

[9] In an era when distributed models are often considered the hydrologic “state of the art,” it is important to examine closely the developments that led to the creation of the distributed models in use today to determine their suitability for the uses to which they are or will be applied. If these distributed models are to be used effectively to advance the science of hydrology, we need a framework to catalog their capabilities and limitations. Effective use of a distributed model for studying hydrologic processes or predicting future hydrologic responses requires an understanding of the influence of the model configuration on simulation output. Given the broad range of existing distributed models and the level of process complexity they contain, there is an increasing need for a common base from which to compare the modeling approaches, ensure selection of appropriate models for particular applications, and adequately merge the scientific interests and needs of the modeler with suitable modeling tools.

[10] In this paper, we give a comprehensive description of the types of process representations in distributed models with reference to the inherent assumptions and limitations of different modeling approaches. On the basis of these approaches, we introduce a framework with which to understand and evaluate existing distributed models and show how representative models fit within this framework.