A framework for classifying and comparing distributed hillslope and catchment hydrologic models



[1] The term distributed model is widely applied to describe hydrologic models that can simulate broad classes of pathways of water through space, e.g., overland flow, saturated groundwater flow, and/or unsaturated flow in the vadose zone. Because existing distributed modeling approaches differ substantially from one another, we present a common framework from which to compare the many existing hillslope- and catchment-scale models. To provide a context for understanding the structure of the current generation of distributed models, we briefly review the history of hydrologic modeling. We define relevant modeling terms and describe common physical, analytical, and empirical approaches for representing hydrologic processes in the subsurface, surface, atmosphere, and biosphere. We then introduce criteria for classifying existing distributed models based on the nature of their process representation, solution scheme, coupling between the surface and subsurface, and treatment of space and time. On the basis of these criteria we describe 19 representative distributed models and discuss how process, scale, solution, and logistical considerations can be incorporated into model selection and application.

1. Introduction

[2] The hydrologic models in use today build on a legacy of hydrologic applications in different disciplines. Singh and Woolhiser [2002] provide an excellent discussion of the historical context leading to the development of many of the existing hydrology computer models, which range in complexity from simple bucket models to detailed multiprocess models of water flow physics. Some of the earliest hydrologic models were forms of the rational method used to convert rainfall rates to estimates of runoff (based on the comprehensive and incisive work of Mulvany [1851]) for designs of urban sewers, drainage systems, or reservoir spillways [Todini, 1988]. Through the latter part of the 19th century and early 20th century, hydrologic “models” mostly consisted of empirical formulas or modified forms of the rational method [Dooge, 1957; Todini, 1988], with model development largely driven by the need to address particular engineering problems.

[3] With the advent of digital computers, new possibilities emerged for hydrologic modeling. The Stanford Watershed Model [Crawford and Linsley, 1966] was an early example of a modeling approach that took into account most of the rainfall-runoff processes that occur, creating a moisture accounting model structure that could be applied to a wide variety of catchments. When they produced this model, Crawford and Linsley [1966] were at the forefront of the use of digital computers for hydrologic modeling. They faced constraints of limited hydrologic data (largely daily average streamflow rates, some hourly increment precipitation, and in a few locations measured pan evaporation), and extremely limited computer power and storage, so they developed strategies for conceptualizing bulk catchment hydrologic processes within those constraints. Moore's law for computers had not been proposed, and no futurist anticipated the computer capabilities that are taken for granted in 2007. The history of the development of their model is given by Crawford and Burges [2004]. By the 1970s, hydrologic models were well-established tools, and the types of modeling approaches had begun to proliferate.

[4] To sort through the many existing models, numerous model classification schemes have been proposed [e.g., Dawdy and O'Donnell, 1965; Clarke, 1973; Todini, 1988; Singh, 1995; Refsgaard, 1996]. Of these, one of the most comprehensive classifications of mathematical models used in hydrology was introduced by Clarke [1973]. In this classification, models were considered either stochastic, with model variables displaying random variation, or deterministic, with model variables regarded as free from random variation. Both stochastic and deterministic models were classified by Clarke [1973] as conceptual or empirical, with conceptual models approximating in some way the physical processes. Within these groups, models could be either linear or nonlinear and either lumped, probability-distributed, or geometrically distributed. Lumped models are defined by Clarke [1973] as those that do not account for the spatial distribution of input variables or parameters. Distributed models, in contrast, account for spatial variability of input variables, with geometrically distributed models specifically expressing the geometric configuration of points within the model. As an addition to the definition of “distributed” proposed by Clarke [1973], Todini [1988] proposed dividing distributed approaches into “distributed integral” and “distributed differential.” Using those divisions, a distributed integral model is a network of connected lumped models, whereas a distributed differential model actually includes distributed flow calculations.

[5] The modeling approaches we discuss largely fit within the deterministic-conceptual classification of Clarke [1973], and they are all geometrically distributed, representing water pathways from a fixed spatial coordinate (Eulerian) perspective. In practice, the term, distributed hydrologic model, is now widely used in hillslope and catchment (surface water) hydrology to refer to a model that represents in some way the spatial variability and pathways of water through a catchment. Most numerical groundwater models have long been distributed, as they are designed to represent spatial patterns in piezometric head. In contrast, the primary objective of “surface water” hydrologic models has traditionally been to simulate water fluxes at a specified point, usually the discharge in a stream at the outlet of a catchment. For this purpose, spatially explicit models are not always needed.

[6] Distributed modeling approaches have, however, gained significant attention in surface hydrology, for they have great potential as tools for applications such as nonpoint source pollutant transport, hydrologic responses to land use or land cover changes, land-atmosphere interactions, erosion and sediment transport, and many others. Because distributed models can incorporate topographic features, the effects of shade and aspect on hydrologic response, and geologic and land cover variability, they are increasingly used for scientific research, hydrologic forecasting, and engineering design applications. The spatially explicit structure of distributed models also offers the potential to incorporate spatial data from Geographic Information Systems (GIS), remote sensing, and geophysical techniques. Distributed models are currently used to facilitate incorporation of spatially explicit radar rainfall data, snow cover extent, soil moisture, and land surface temperature data into hydrologic response simulations. Although many existing distributed models are used largely for hydrologic research, agencies and private sector groups use distributed models for applications including streamflow forecasting, water resource engineering design, management, and operations, and land use planning.

[7] Structures of existing distributed models incorporate a fusion of factors including computational capacity, data availability, and the original theoretical framework used in building the model. The initial conceptualization of a physically based surface-subsurface hydrologic model, from a reductionist perspective, is attributed to Freeze and Harlan [1969], who introduced a blueprint for how such a model could be configured using partial differential equations of fluid flow in three spatial dimensions and time. Freeze [1974] followed up on this work and showed how the various components of this approach could be modeled numerically to demonstrate conditions that produce observable hillslope flow mechanisms. The simulations he included were performed using the largest and fastest computers available within the IBM Corporation. No other organization at the time had the resources to replicate this effort. Many modern distributed models are based on the Freeze-Harlan or a similar “blueprint,” with most including simplifications of the flow representations. Other distributed models were designed with a focus on surface water pathways through space, and many of these models represent only Hortonian (infiltration excess) overland flow. Some distributed modeling structures originated from the variable source area concept [Hewlett and Troendle, 1975], which was developed in response to evidence that saturation excess overland flow occurs in hillslope hollows, with the area contributing to overland flow varying over time. To simulate this process, models were developed that had the capability of representing moisture contents at different depths and positions on a hillslope.

[8] Many efforts have been undertaken to compare distributed model outputs with one another [e.g., Smith et al., 2004; Reed et al., 2004; Yang et al., 2000; Wigmosta and Lettenmaier, 1999; Michaud and Sorooshian, 1994; Chen et al., 1994; Loague and VanderKwaak, 2002; Pebesma et al., 2005], but model structures often differ so widely that it is difficult to isolate reasons for differences in model performance. With all of the complexity introduced into models, Klemeš [1986, p. 179S] warned that “hydrologic models make … ideal tools for the preservation and spreading of hydrologic misconceptions.” When reviewing hydrologic models, Freeze [1978] viewed the primary limitations to successful modeling as relating to theoretical assumptions, data scarcity, inadequate computer capacity, and limitations of calibration procedures. Nearly 30 years later, we no longer face inadequate computer capacity, but the other modeling challenges remain.

[9] In an era where distributed models are often considered the hydrologic “state of the art,” it is important to examine closely the developments that led to the creation of the distributed models in use today to determine their suitability for the uses to which they are or will be applied. If these distributed models are to be used effectively to advance the science of hydrology, we need a framework to catalog their capabilities and limitations. Effective use of a distributed model for studying hydrologic processes and/or predicting future hydrologic responses requires an understanding of the influence of the model configuration on simulation output. Given the broad range of existing distributed models and the level of process complexity they contain, there is an increasing need for a common base from which to compare the modeling approaches, ensure selection of appropriate models for particular applications, and adequately merge the scientific interests and needs of the modeler with suitable modeling tools.

[10] In this paper, we give a comprehensive description of the types of process representations in distributed models with reference to the inherent assumptions and limitations of different modeling approaches. On the basis of these approaches, we introduce a framework with which to understand and evaluate existing distributed models and show how representative models fit within this framework.

2. Background

2.1. Definition of Terms

[11] Comparisons of distributed hydrologic models require careful consideration of the meaning of terms used to describe the models. For reference, consider the hypothetical catchment shown in Figure 1. This catchment has three dimensions, with the X and Y dimensions representing the land surface (map or plan view) and the Z dimension representing the depth below (or above for near-surface atmospheric boundary layer processes) the land surface. A lumped hydrologic model is effectively a zero-dimensional model in space, as it averages processes over the XYZ spatial domain of the catchment to produce an estimate of streamflow at the outlet. Distributed hydrologic models, in contrast, represent in some way the water pathways through XY or XYZ space, making them two- or three-dimensional. Definitions of distributed models vary in the literature, and Singh and Woolhiser [2002] have noted that models can only be considered truly distributed if all aspects of the model are also distributed. We follow a less rigorous definition and consider any model that simulates pathways of water through XY or XYZ surface-subsurface space to be “distributed.”

Figure 1. Examples of distributed model spatial configurations: (a) hypothetical catchment in plan (XY) view, (b) TIN discretization, (c) rectangular grid discretization, (d) planes and channel segments, (e) explicit discretization of depth (Z), and (f) separation of depth into unsaturated (above water table) and saturated (below water table) zones.

[12] For all models, we use the term, domain, to refer to the spatial area or volume represented in a simulation. Models also consider interfaces between the atmosphere, land surface, and subsurface, domains with distinct water movement processes and timescales. Each of these domains could therefore be considered separate subdomains, or they could be represented as components of a fully connected continuum. Distributed models divide the domain into separate model elements based on either a triangular irregular network (TIN, Figure 1b), rectangular grid (Figure 1c), or some other fundamental subunit (e.g., Figure 1d) defined using criteria such as topography or land surface characteristics.

[13] The movement of water through a distributed model domain is affected by both internal and external factors. Internal factors are characteristics of the material in the model domain (land surface or subsurface), whereas external factors affect the rate at which water enters or leaves the model domain. In modeling terminology, these factors are called variables or parameters, terms that are often used interchangeably. We use the term, variable, to describe factors whose values change over time but will not usually be modified to calibrate the model. These may either be simulated factors in the model (e.g., heads or fluxes) or external factors input as time series (e.g., precipitation). We use the term, parameter, for any other internal or external factor whose value is not calculated directly by the model. Parameter values must instead be input prior to a model simulation, and they could potentially be modified to calibrate the model. Parameters usually have constant values over time (e.g., soil porosity), but some may be assigned time-varying values (e.g., grass height during different seasons). If the parameter values could be measured (e.g., soil porosity), they are considered physically based; otherwise, parameters are empirical. Calibration may result in values of “physically based” parameters that are outside of a physically meaningful range, so even so-called physically based parameters may have empirically derived values.

[14] When representing water movement through space, a model may use a physical, analytical, or empirical approach. A physically based model is derived from equations describing conservation of mass, momentum, and/or energy [Kavvas et al., 2004] in and between the surface and subsurface domains. These and other conservation equations for fluids can be formulated using the Reynolds transport theorem, which relates the rate of change of an extensive property (e.g., water mass or momentum) of a system to the rate of change of the property within a control volume plus the net flux of the property through the control surface. For a distributed hydrologic model, a model grid cell (e.g., a single cell in Figure 1c) could be considered the control volume.
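
For reference, a standard textbook statement of the Reynolds transport theorem is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the original text: B is the extensive property, β is the amount of that property per unit mass, ρ is the fluid density, **u** is the fluid velocity, and **n** is the outward unit normal on the control surface (CS) that bounds the control volume (CV).

$$
\frac{dB_{sys}}{dt} = \frac{\partial}{\partial t} \int_{CV} \beta \rho \, dV + \int_{CS} \beta \rho \, (\mathbf{u} \cdot \mathbf{n}) \, dA
$$

Taking B as water mass (β = 1) and a grid cell as the control volume gives a continuity statement: the change in storage within the cell balances the net mass flux across its faces.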

[15] Conservation equations derived from the Reynolds transport theorem are partial differential equations (PDEs) and can be represented in one, two, or three spatial dimensions, and time. Solving PDEs or ordinary differential equations (ODEs) requires assignment of initial conditions and boundary conditions. Initial conditions are values of the dependent variables in the PDE at time zero. For a distributed hydrologic model, initial conditions (e.g., heads) would need to be assigned for every subunit (e.g., grid cell) in the model domain. Boundary conditions describe the value of the PDE solution or its derivatives at the edges of a simulation domain. For the hypothetical catchment in Figure 1, these edges are the outer perimeter of the catchment, the land surface (boundary with the atmosphere), and some elevation in the subsurface.

[16] A combination of a differential equation, initial conditions, and boundary conditions represents a boundary value problem. For representations of water flow as in a distributed model, no closed form solution to the governing equations (see section 2.2) exists (except for a few prismatic geometries with homogeneous isotropic domains and steady state flow), so solving the full physically based equations requires numerical techniques (see section 3.2). We refer to this type of model representation as physical. Some models instead use simplifying assumptions to derive closed form solutions of the governing conservation equations. We refer to these types of flow representations as analytical. If the representation of a water flow process is not derived from the governing physically based conservation equations, then we refer to it as empirical. Empirical approaches are based on experimentally determined relationships such as linear regressions. Approaches within all of these categories may overlap to some degree, so these definitions are intended only for differentiating broad categories of modeling approaches.

[17] The following section describes common physical, analytical, and empirical methods used in distributed hydrologic models to represent water movement over and through the landscape.

2.2. Processes to Represent in a Distributed Hydrologic Model

[18] Hydrologic processes that could be simulated in a distributed hydrologic model include unsaturated or saturated subsurface flow, surface overland flow, channel flow, and evaporation and transpiration. Here we introduce the physically based equations that are typically used to describe these processes. We focus solely on how distributed models represent bulk water movement. Quantitative process descriptions are constrained to bulk water movement by means of subsurface matrix flow, surface overland and channel flow, and evapotranspiration. Other processes that affect water movement are described qualitatively. Many distributed models are also designed to simulate the transport of other constituents such as nutrients, contaminants, or sediments, but those processes are not included in this discussion. Our discussion is also restricted to liquid precipitation input.

[19] In addition to the following brief summary, there are numerous books that cover the subject matter of hydrology and the associated environmental physics. The broad scope of the field is covered by Meinzer [1942], Stefferud [1955], Linsley et al. [1982], and Dunne and Leopold [1979]. Detailed analytical treatments are presented by Eagleson [1970] and Brutsaert [2005]. Extensive treatments of land-atmosphere interactions are given additionally in Monteith and Unsworth [1990], Brutsaert [1982], and Eagleson [2002]. Additional detailed treatments of the vadose zone are given by Hillel [1998] and Smith et al. [2002] and in a collection of papers to honor John Philip [Raats et al., 2002]. Perspectives on hydrologic processes and models are given by Beven [2001b], and a collection of papers edited by Anderson and Bates [2001] presents discussions of hydrological model testing.

2.2.1. Subsurface

[20] Water in the subsurface is typically represented in a hydrologic model as flowing through a porous matrix, which may be either unsaturated or saturated. Much has been written on this topic, and an excellent historical perspective of the development of ideas in this field is given by Narasimhan [1998]. The equation used to describe flow through a porous matrix is Darcy's empirical law, which assumes that above some representative elementary volume the separate grains of the porous matrix act as a continuum for which macroscopic parameters can be defined [Freeze and Cherry, 1979, p. 17]. To represent the physical process of porous matrix flow, Darcy's law can be substituted into a conservation of mass (continuity) equation, shown here as the Richards' equation for variably saturated flow in three spatial dimensions and time:

$$
\frac{\partial \theta}{\partial t} = \nabla \cdot \left[ K(h) \, \nabla h \right] + S \tag{1}
$$

where θ is the water content; t is time; K(h) is the unsaturated hydraulic conductivity function; h is the piezometric head, and S is a source/sink term.

[21] The unsaturated hydraulic conductivity function can be represented through empirical equations that describe how water content and hydraulic conductivity vary with pressure head for a particular soil. Numerous such equations are used, and some of the most common in hydrologic models are the Brooks and Corey [1964] equation and the van Genuchten [1980] relation. The van Genuchten [1980] functions are

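$$
\theta(h) =
\begin{cases}
\theta_r + \dfrac{\theta_s - \theta_r}{\left[ 1 + |\alpha h|^{n} \right]^{m}}, & h < h_s \\
\theta_s, & h \ge h_s
\end{cases}
\qquad
K(h) = K_s \, S_e^{\,l} \left[ 1 - \left( 1 - S_e^{1/m} \right)^{m} \right]^{2},
\quad S_e = \frac{\theta - \theta_r}{\theta_s - \theta_r} \tag{2}
$$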

where θr is the residual water content; θs is the saturated water content (porosity); hs is the air entry pressure head; Ks is the saturated hydraulic conductivity; l is the pore connectivity factor; α and n are empirical parameters, and m = 1 − 1/n. These analytical functions are fit to measured data to facilitate analytical calculation of gradients.
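
As a minimal sketch of how such retention and conductivity functions might be evaluated, the Python function below implements the van Genuchten [1980] water content relation together with the Mualem-type conductivity expression in terms of effective saturation. Several choices are assumptions made here and not statements from the text: the function name, the default l = 0.5, the default air entry pressure head of zero, the treatment of h as a pressure head (negative when unsaturated), and the roughly loam-like parameter values in the usage lines.

```python
import math


def van_genuchten(h, theta_r, theta_s, alpha, n, K_s, l=0.5, h_s=0.0):
    """Return (water content, hydraulic conductivity) at pressure head h.

    h is negative under unsaturated conditions; h and 1/alpha share units.
    """
    if h >= h_s:
        # At or above the air entry pressure head the soil is treated as saturated.
        return theta_s, K_s
    m = 1.0 - 1.0 / n
    # Effective saturation S_e = (theta - theta_r) / (theta_s - theta_r).
    S_e = (1.0 + abs(alpha * h) ** n) ** (-m)
    theta = theta_r + (theta_s - theta_r) * S_e
    K = K_s * S_e ** l * (1.0 - (1.0 - S_e ** (1.0 / m)) ** m) ** 2
    return theta, K


# Hypothetical, roughly loam-like parameters (alpha in 1/m, K_s in m/d).
soil = dict(theta_r=0.078, theta_s=0.43, alpha=3.6, n=1.56, K_s=0.25)
for h in (0.0, -0.1, -1.0, -10.0):
    theta, K = van_genuchten(h, **soil)
    print(f"h = {h:6.1f} m  theta = {theta:.3f}  K = {K:.3e} m/d")
```

The steep drop in K as h becomes more negative illustrates the strong nonlinearity that makes numerical solution of Richards' equation demanding.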

[22] The water retention functions for soils are highly nonlinear, and solving Richards' equation under unsaturated conditions requires a robust numerical solution scheme. As a result, many distributed hydrologic models either solve Richards' equation in one dimension (Z) only or use some other scheme to represent subsurface flow. Often, the model representation of the subsurface is separated into saturated and unsaturated “zones” that are each simulated with separate flow equations (e.g., Figure 1f). This strategy can save computational time, for hydraulic conductivity stays constant under fully saturated conditions.

[23] For the unsaturated zone, many analytical approximations to Richards' equation have been applied in distributed hydrologic models. The most common analytical approximation of the Richards' equation for hydrologic modeling is an infiltration equation, which represents one-dimensional (Z) flow of water into the soil. Infiltration equations can be derived from (1) by applying simplifying assumptions for nonlayered soils with uniform initial water content [see, e.g., Tindall and Kunkel, 1999, pp. 352–361]. These types of infiltration equations incorporate both physical and empirical parameters. The commonly used Green and Ampt [1911] infiltration equation, for example, uses the physical parameters, Ks, θs and θi, where θi is the water content at the beginning of the simulation. Other infiltration equations commonly used in distributed models are empirical. For example, the Soil Conservation Service [1968] curve number method estimates “infiltration losses” to the subsurface by empirically derived standard curves for different soil types and moisture conditions. For empirical infiltration equations, none of the parameters used can be physically measured, and values must be determined either through a look up table or through model calibration.
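
To illustrate how an infiltration equation of this kind is typically evaluated, the sketch below solves the standard implicit Green and Ampt relation for cumulative infiltration under continuous ponding. It is an illustration under stated assumptions, not the formulation of any particular distributed model. In addition to Ks, θs, and θi, the usual form needs a wetting-front suction head, called `psi_f` here, which is not listed in the text above; that parameter, the function name, and the example values are all introduced for illustration.

```python
import math


def green_ampt(t, K_s, psi_f, theta_s, theta_i, tol=1e-12, max_iter=10000):
    """Cumulative infiltration F and infiltration rate f at time t > 0.

    Assumes ponding from t = 0, a homogeneous soil, and uniform initial
    water content. Solves F = K_s*t + c*ln(1 + F/c) by fixed-point iteration,
    where c = psi_f * (theta_s - theta_i).
    """
    c = psi_f * (theta_s - theta_i)  # suction head times moisture deficit
    F = K_s * t                      # starting guess: gravity-driven part only
    for _ in range(max_iter):
        F_new = K_s * t + c * math.log(1.0 + F / c)
        if abs(F_new - F) < tol:
            F = F_new
            break
        F = F_new
    f = K_s * (1.0 + c / F)          # infiltration rate at time t
    return F, f


# Hypothetical values: K_s in m/h, psi_f in m, t in h.
F, f = green_ampt(t=1.0, K_s=0.01, psi_f=0.1, theta_s=0.45, theta_i=0.15)
print(f"F = {F:.4f} m, f = {f:.4f} m/h")
```

The infiltration rate starts well above Ks and decays toward it as the wetting front deepens, which is the behavior the simplifying assumptions are meant to capture.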

[24] Infiltration equations are useful for simulating the rate of water entering the subsurface, but they do not typically track the location of that water within the subsurface or the change in water content during and after water infiltration at the land surface. Therefore models that use only infiltration equations are often constrained to rainfall-runoff event simulations. As a result, several techniques have been developed for tracking the subsurface moisture state over time in models that use infiltration equations [e.g., Ogden and Saghafian, 1997; Smith et al., 1993].

[25] For saturated subsurface flow, the physical approach used in some models is a PDE for two-dimensional unconfined flow (e.g., Boussinesq equation) in which the saturated thickness can vary through time as the water table rises and falls. One approximation of saturated unconfined flow is the Dupuit-Forchheimer (D-F) theory, which assumes that flow lines are parallel to the impermeable sublayer, and the hydraulic gradient is equal to the slope of the water table [Freeze and Cherry, 1979, pp. 188–189]:

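$$
q = -K h \frac{dh}{dx} \tag{3}
$$

where q is the discharge per unit width; K is the saturated hydraulic conductivity; x is the horizontal distance, and h, in this case, is the elevation of the water table. The approach neglects vertical components of flow and works best for shallow flow fields with small water table slopes [Freeze and Cherry, 1979, p. 188].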
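
As a worked illustration of the D-F approximation, the sketch below evaluates the steady state solution obtained by integrating q = −K h dh/dx between two fixed water table elevations over a horizontal impermeable base. The setting, the function name, and the numerical values are assumptions chosen here for illustration.

```python
import math


def dupuit_steady(K, h1, h2, L, x):
    """Steady Dupuit flow between water table elevations h1 (x = 0) and h2 (x = L).

    Returns the discharge per unit width q and the water table elevation at x,
    for a horizontal impermeable base and no recharge.
    """
    q = K * (h1 ** 2 - h2 ** 2) / (2.0 * L)
    h = math.sqrt(h1 ** 2 - (h1 ** 2 - h2 ** 2) * x / L)
    return q, h


# Hypothetical values: K in m/d, elevations and lengths in m.
q, h_mid = dupuit_steady(K=2.0, h1=5.0, h2=3.0, L=100.0, x=50.0)
print(f"q = {q:.3f} m^2/d, water table at mid-length = {h_mid:.3f} m")
```

Because the saturated thickness h appears in the flux, the water table profile is parabolic and not linear, even for a homogeneous material.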

[26] For steeper slopes, where the assumption of parallel flow lines in (3) breaks down, a common approach for saturated flow simulation is a kinematic approximation, which assumes that the head gradient is approximately equal to the land surface slope [Beven, 1981]:

$$
q = K h \sin\theta \tag{4}
$$

where θ is the bed slope, and h is the depth of flow. Beven [1981] showed ranges of acceptability for the kinematic wave equation using a dimensionless parameter that incorporates slope, hydraulic conductivity, and water input rate. His results [Beven, 1981, Figure 5] show that this type of approximation works best for steep slopes with high hydraulic conductivity and a low rate of water input. He provided examples for throughflow rates of 10 and 1 mm h−1. For 10 mm h−1 throughflow and Ksat = 0.1 m h−1 (a tight soil), a slope of approximately 40° is required for the kinematic approximation to be valid, whereas if Ksat is 1 m h−1 (a porous rangeland soil) the required slope is about 15°. At a lower throughflow rate (1 mm h−1) and the same range of Ksat, the required slopes change to 15°, and 5°, respectively.

[27] To accommodate D-F or kinematic approximations, distributed models often define a fixed depth to impermeable bedrock or some other relatively impermeable material, with the bedrock slope assumed to be parallel to the ground surface. In most such models, water leaves the domain only through lateral flow paths or by evapotranspiration. Some models of this type allow for “deep infiltration” losses to account for water percolating below the maximum subsurface depth specified in the model [e.g., Wigmosta and Burges, 1997].

[28] Empirical schemes for estimating subsurface flow are also used in some distributed models. These include some combination of linear and nonlinear reservoir schemes, hydrograph separation techniques, or base flow recession curves for estimating quantities of base flow.
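
One of the simplest empirical schemes of this type is a single linear reservoir, in which outflow is proportional to storage. The sketch below is a minimal illustration under that assumption. The function name, the storage coefficient `tau`, and the example values are hypothetical, and the update is the exact solution of dS/dt = inflow − S/tau for inflow held constant over a time step.

```python
import math


def linear_reservoir_step(storage, inflow, tau, dt):
    """Advance a linear reservoir (outflow = storage / tau) by one time step.

    Uses the exact solution of dS/dt = inflow - S/tau with constant inflow.
    Returns the new storage and the outflow at the end of the step.
    """
    decay = math.exp(-dt / tau)
    new_storage = storage * decay + inflow * tau * (1.0 - decay)
    return new_storage, new_storage / tau


# Hypothetical recession: 10 mm of storage, no recharge, tau = 2 d, daily steps.
S = 10.0
for day in range(1, 6):
    S, Q = linear_reservoir_step(S, inflow=0.0, tau=2.0, dt=1.0)
    print(f"day {day}: storage = {S:.3f} mm, base flow = {Q:.3f} mm/d")
```

With zero inflow the outflow decays exponentially, which is the form of a simple base flow recession curve.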

[29] The previous descriptions of model representations of subsurface flow have considered only isothermal flow through a porous matrix. Subsurface flow may occur through large fractures, pores, or other cavities for which the matrix flow representations do not apply; however, most distributed models do not explicitly account for these nonmatrix pathways. For descriptions of a number of approaches for representing “preferential flow” processes, see Beven [1991] and Šimůnek et al. [2003]. Temperature is also known to have a significant effect on infiltration and subsurface flow [see, e.g., Musgrave, 1955; Ronan et al., 1998]. Figure 2 [from Musgrave, 1955] shows an increase in the saturated infiltration rate by a factor of about 2.5 as the temperature increases from near freezing to 21°C, demonstrating the significant influence of temperature/viscosity on soil water flow. These effects are rarely incorporated into distributed models. Other factors that influence subsurface flow processes but are not typically included in distributed models include freeze-thaw cycles and hysteretic wetting and drying in the unsaturated zone.

Figure 2. Variability in infiltration with changing soil temperature. From Musgrave [1955].

2.2.2. Surface

[30] For the ground surface, water flow occurs in stream channels or on the land surface as overland flow. The physically based equations typically used to represent surface water flow are known as the de St. Venant [1871] equations for shallow water flow, which include equations of continuity and momentum. Distributed hydrologic models typically represent surface flow in one or two (XY) dimensions because representation of three-dimensional surface flow would, for most applications, give more detail than is necessary or feasible for representing bulk surface water movement. Following the form introduced by Gottardi and Venutelli [1993], the 2-D continuity equation is

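$$
\frac{\partial h_s}{\partial t} + \frac{\partial (d \, u_x)}{\partial x} + \frac{\partial (d \, u_y)}{\partial y} = S \tag{5}
$$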

[31] The 2-D momentum equations for an assumed hydrostatic pressure flow field are

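$$
\begin{aligned}
\frac{\partial u_x}{\partial t} + u_x \frac{\partial u_x}{\partial x} + u_y \frac{\partial u_x}{\partial y} + g \frac{\partial d}{\partial x} &= g \left( S_{0x} - S_{fx} \right) \\
\frac{\partial u_y}{\partial t} + u_x \frac{\partial u_y}{\partial x} + u_y \frac{\partial u_y}{\partial y} + g \frac{\partial d}{\partial y} &= g \left( S_{0y} - S_{fy} \right)
\end{aligned} \tag{6}
$$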

where ux and uy are depth-averaged flow velocities in the x and y directions; d is the water depth; hs = z + d; z is the bed elevation; S is the source/sink term; S0 is the bed slope, Sf is the friction slope; g is the gravitational acceleration, and t is time.

[32] Several distributed models have dynamic routing schemes in which these equations are fully solved; however, some have encountered problems with numerical solutions of dynamic routing schemes [e.g., Meselhe and Holly, 1997; Downer et al., 2002]. Many distributed models instead use either diffusion or kinematic wave approximations to the de St. Venant equations.

[33] The diffusion wave approximation assumes that the inertial terms in (6) are negligibly small. There are various clever ways to arrange the resulting equations. For example, the 2-D diffusion wave approximation derivation by Gottardi and Venutelli [1993] allows the surface water flow equations to be configured similarly to the governing equations for subsurface flow, facilitating coupled surface-subsurface flow computation [see, e.g., VanderKwaak, 1999; Panday and Huyakorn, 2004]. This is accomplished by using an additional equation (Manning, Chezy, or Darcy-Weisbach) to approximate the friction slope. Terms from the governing de St. Venant equations can then be grouped into a “conductance” term, which is analogous to hydraulic conductivity. This approximation results in

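$$
\frac{\partial h_s}{\partial t} - \frac{\partial}{\partial x} \left( d \, k \frac{\partial h_s}{\partial x} \right) - \frac{\partial}{\partial y} \left( d \, k \frac{\partial h_s}{\partial y} \right) = S \tag{7}
$$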

where hs is the water surface elevation, and k is the conductance. For the Manning equation, this conductance is given by

$$
k = \frac{d^{2/3}}{n} \, \frac{1}{\left| \partial h_s / \partial s \right|^{1/2}} \tag{8}
$$

where d is the depth of flow; n is the Manning coefficient, and s is the length along the direction of the maximum local slope. Manning's representation of frictional flow resistance is valid for fully rough turbulent flow. For thin film (up to about 3–5 mm, depending on slope) overland flow, the flow domain is laminar, and a different flow law is appropriate. This concern was addressed by Crawford and Linsley [1966, Figure A6], who showed that a Manning roughness approximation to laminar plane surface flow agreed well with the pioneering experimental measurements of laminar overland flow from Izzard [1944].
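
The analogy between the conductance and hydraulic conductivity can be made concrete with a short calculation. The sketch below assumes the Manning form of the conductance, k = d^(2/3) / (n |∂hs/∂s|^(1/2)) in SI units, and checks that the velocity magnitude k |∂hs/∂s| reduces to the familiar Manning velocity. The function name, the small gradient floor used to avoid division by zero on flat water surfaces, and the example values are assumptions introduced here.

```python
def manning_conductance(d, n_manning, dhs_ds, min_gradient=1e-8):
    """Diffusion wave conductance for Manning friction (SI units).

    d: water depth (m); n_manning: Manning coefficient;
    dhs_ds: water surface gradient along the direction of maximum local slope.
    """
    gradient = max(abs(dhs_ds), min_gradient)  # floor avoids division by zero
    return d ** (2.0 / 3.0) / (n_manning * gradient ** 0.5)


# Hypothetical overland flow: 10 mm depth, n = 0.05, water surface slope of 2%.
d, n_manning, dhs_ds = 0.01, 0.05, -0.02
k = manning_conductance(d, n_manning, dhs_ds)
velocity = -k * dhs_ds  # Darcy-like flux law: velocity = -k * gradient
print(f"k = {k:.3f} m/s, velocity = {velocity:.4f} m/s")
```

Unlike a saturated hydraulic conductivity, this conductance depends on the solution itself (through the depth and the water surface gradient), so the resulting equation remains nonlinear.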

[34] The diffusion wave approximation is effective for representing surface water flow under many conditions, but it does not distribute backwater effects properly through time and is inaccurate for fast rising hydrographs [Fread, 1993]. This is of concern in flat river channels but has little effect in most hillslope surface flow situations. Strict criteria are presented by Ponce et al. [1978] for evaluating the adequacy of a diffusion wave model for open channel flow as a function of the wave period, flow depth, and bed slope.

[35] The kinematic wave approximation to the de St. Venant equations considers only the effects of gravity and friction on flow, resulting in a one-to-one relationship between water depth (stage) and discharge [Lighthill and Whitham, 1955; Singh, 1997]:

$$
q = \alpha \, d^{\,n} \tag{9}
$$

where α and n are constant coefficients. As opposed to a dynamic wave formulation, the kinematic wave approximation results in only one possible wave speed [Henderson, 1966, pp. 367–373], so when combined with the continuity equation, it has only one unknown for each model element [Qu, 2005]. As a result, kinematic approximations are computationally robust and are used in many distributed hydrologic models. Limitations of the kinematic approach include inability to predict looped rating curves [Henderson, 1966; Beven, 1985] and to account for backwater effects. Kinematic approaches are not well suited to represent flows at low slopes or in areas with high lateral inflows either to channels or in the form of intense precipitation falling on a hillslope [Freeze, 1972]; however, Eagleson [1970, p. 330] showed example calculations to demonstrate that the effect of rainfall-infiltration processes on flow dynamics is negligible compared to the effects of gravity.

[36] Because the kinematic approximation assumes that the friction slope can be approximated by the channel bed or land surface slope, it is only valid if the other components of the friction slope are negligible. For channels, Henderson [1966] indicates that this slope approximation is valid for natural floods in steep rivers with bed slopes greater than 10 ft per mile (∼0.002). For overland flow, Woolhiser and Liggett [1967, p. 763] give examples of when a kinematic wave solution is adequate. In follow-up work, Morris and Woolhiser [1980, Figure 6] provide sharp guidelines for the validity of the kinematic approximation. In most situations, the scheme is valid for land surface slopes greater than about 0.001. Morris and Woolhiser [1980, p. 360] advise “… it is necessary to use the full shallow water equations, or at least the diffusion equation, for very flat grassy slopes.”

[37] Numerical techniques are often used to solve the de St. Venant PDEs. With this approach, the numerical solution determines the direction of overland flow from cell to cell in XY space. A common alternative approach for simplifying the overland flow computations is to assign the flow direction using a topographically determined routing network. Numerous algorithms have been developed for deriving flow networks by assigning flow directions between model elements. For grid-based models (see Figure 1c), these include O'Callaghan and Mark [1984], Costa-Cabral and Burges [1994], and Tarboton [1997]. Strengths and limitations of these flow direction models have been evaluated by Endreny and Wood [2001]. For models built with a triangulated irregular network (Figure 1b), flow direction algorithms are given by Tucker et al. [2001], Palacios-Velez and Cuevas-Renaud [1986], and Jones et al. [1990]. Models with other fundamental subunits (Figure 1d) often have some other topography-based method of assigning overland flow directions.
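As an illustration of a topographically determined flow direction on a rectangular grid, the sketch below assigns each cell to its steepest downslope neighbor among the eight surrounding cells, in the spirit of the single-direction approach of O'Callaghan and Mark [1984]. It is a simplified sketch rather than a reproduction of any published algorithm: it does not treat flat areas or fill pits, and the function name and data layout are assumptions made for the example.

```python
import math

# Offsets (row, column) to the eight neighbors of a grid cell.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def steepest_descent_receivers(elevation, cell_size=1.0):
    """Map each cell (row, col) to the neighbor with the steepest downward slope.

    Cells with no lower neighbor (pits or outlets) map to None.
    """
    n_rows, n_cols = len(elevation), len(elevation[0])
    receiver = {}
    for r in range(n_rows):
        for c in range(n_cols):
            best, best_slope = None, 0.0
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    # Diagonal neighbors are farther away than cardinal ones.
                    distance = cell_size * math.hypot(dr, dc)
                    slope = (elevation[r][c] - elevation[rr][cc]) / distance
                    if slope > best_slope:
                        best, best_slope = (rr, cc), slope
            receiver[(r, c)] = best
    return receiver


if __name__ == "__main__":
    dem = [[3.0, 2.0, 1.0],
           [3.0, 2.0, 1.0],
           [3.0, 2.0, 0.0]]
    for cell, downstream in steepest_descent_receivers(dem).items():
        print(cell, "->", downstream)
```

Each cell receives exactly one downstream neighbor, so the result is a one-dimensional routing network laid over two-dimensional space. Multiple-direction methods instead split flow among several neighbors.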

[38] Flow routing networks can be either one- or two-dimensional. Most grid and TIN-based models have two-dimensional networks that partition flow from a single model element (e.g., grid cell) into one or more surrounding model elements. One-dimensional networks, in contrast, route all surface flow from one model element to a single downstream model element. Models that use a flow direction network will route surface flow through the network using either analytical or empirical schemes. Those that use analytical schemes typically incorporate an equation derived from the kinematic approximation (equation (9)). Empirical “hydrologic” routing approaches include unit hydrograph and linear reservoir methods for catchment-wide flow and Muskingum-Cunge [see, e.g., Bedient and Huber, 1992, pp. 292–296] for routing in stream channels.
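A linear reservoir is one of the simplest of the empirical routing approaches mentioned above. The sketch below assumes that storage is proportional to outflow (S = kQ) and that inflow is constant within each time step, which allows the outflow to be updated with the exact solution of the resulting first-order equation. The function name and arguments are hypothetical.

```python
import math


def route_linear_reservoir(inflows, storage_constant, dt, initial_outflow=0.0):
    """Route an inflow series through a single linear reservoir (S = k * Q).

    Assumes inflow is constant within each step of length dt, so that
    Q(t + dt) = Q(t) * exp(-dt / k) + I * (1 - exp(-dt / k)).
    storage_constant and dt must share the same time unit.
    """
    decay = math.exp(-dt / storage_constant)
    outflow = initial_outflow
    outflows = []
    for inflow in inflows:
        outflow = outflow * decay + inflow * (1.0 - decay)
        outflows.append(outflow)
    return outflows


if __name__ == "__main__":
    # Illustrative pulse of inflow followed by recession.
    hydrograph = route_linear_reservoir([0.0, 4.0, 4.0, 0.0, 0.0, 0.0],
                                        storage_constant=2.0, dt=1.0)
    print([round(q, 3) for q in hydrograph])
```

The single parameter k controls both the attenuation and the delay of the hydrograph, which is what makes the method easy to apply and also limits its physical interpretability.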

[39] In model representations of surface flow, interactions of surface water with the subsurface, the atmosphere, and the ground surface are also important. Surface runoff in the form of overland flow can result from either infiltration excess (Horton) or saturation excess (Dunne) mechanisms. Many distributed models that use infiltration equations to represent water movement into the subsurface will simulate only infiltration excess overland flow. To represent saturation excess overland flow, a model must in some way simulate the feedback between the land surface and subsurface. The same is true for model representations of channel flow. Models may be constructed such that water in channels can infiltrate into the subsurface and/or water in the subsurface can exfiltrate into the channel. Methods of representing surface-subsurface interactions are described in section 3.3.

[40] Model representations of surface flow may also incorporate additional features that affect surface runoff. For example, both the land surface microtopography and the land cover affect overland flow by changing the continuity of flow paths and the frictional resistance. Field studies of overland flow patterns [e.g., Emmett, 1978] show that overland flow rarely occurs as a continuous sheet of flow. Dunne et al. [1991] showed that surfaces with microtopography have higher hydraulic conductivities in the microtopographic highs than in the depressions. This results in spatial variability in infiltration rates, with apparent increases in infiltration rates downslope as higher fractions of the elevated microtopography are inundated. Similarly, Seyfried and Wilcox [1995] showed how shrub mounds in rangeland soils have higher infiltration capacities than interspace areas. Zhang [1990] showed that neglecting trending variations in microtopography can significantly affect overland flow simulations. He demonstrated that an equivalent plane surface kinematic wave approximation to the microtopographically varying surface can yield the correct hillslope surface outflow hydrograph, but the simulated flow pattern is incorrect at every location on the land surface. Some models incorporate microtopography by allowing surface water to be retained in depression storage or flow in rills, but few models explicitly incorporate relationships between microtopography and hydraulic conductivities.

[41] Frictional effects are often represented in models by empirical friction factors, such as Manning's n, and these friction factors are usually assumed to be constant through time. Some models have explicit representations of important small-scale land surface features such as roads, ponds, pumps, drainage ditches, or impervious surfaces [e.g., Panday and Huyakorn, 2004; Yeh et al., 2004].

2.2.3. Atmosphere

[42] The atmosphere is linked to both the land surface and subsurface through precipitation and evaporation. Because distributed models represent water movement over and through the landscape in XY or XYZ space, with Z extending from the land surface downward, atmospheric processes are typically represented through external variables. Precipitation is one of these variables, which can be incorporated as a direct input to a model domain. In most models, “throughfall” (precipitation minus water intercepted by vegetation) is applied directly to the ground surface. Because distributed models explicitly represent spatial areas, they offer a structure for spatially variable precipitation input.

[43] Land surface evaporation rates are either included in distributed models as external variables, or they are calculated within the model as a function of other external variables. Models that internally calculate evaporation rates often use the Priestley and Taylor [1972] or the Penman-Monteith equation [Monteith, 1965] to calculate potential evaporation rates, largely because the information needed for a more complete representation is rarely available. The Penman-Monteith approach is

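$$ L_v E = \frac{\Delta \left( R_n - G \right) + \rho_a c_p \left( e_s - e_a \right) / r_a}{\Delta + \gamma \left( 1 + r_s / r_a \right)} \qquad (10) $$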

where LvE is the latent heat flux (energy transfer via evaporation, with Lv the latent heat of vaporization and E the evaporation depth); Rn is the net radiation; G is the soil heat flux; ρa is the air density; cp is the specific heat of moist air at constant pressure; es is the saturated vapor pressure; ea is the actual vapor pressure; Δ is the gradient of the vapor pressure-temperature curve at the water temperature; γ is the psychrometric constant (which varies slightly with temperature and atmospheric pressure); rs is the surface resistance, and ra is the aerodynamic resistance. All of these variables can be measured or calculated from measurements using empirical equations. Some models simulate land surface radiation and energy fluxes based on latitude, season, hillslope aspect, and vegetation height and density and/or simulate ra based on turbulent transfer equations for the atmosphere [e.g., Wigmosta et al., 2002].
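The Penman-Monteith combination equation can be evaluated directly once the variables defined above are available. The following sketch uses the standard form of the equation with the same symbols. The function name, the choice of units, and the default air density and specific heat are assumptions made for illustration.

```python
def penman_monteith_latent_heat_flux(net_radiation, soil_heat_flux,
                                     vapor_pressure_deficit, delta, gamma,
                                     surface_resistance, aerodynamic_resistance,
                                     air_density=1.2, specific_heat=1013.0):
    """Latent heat flux LvE (W/m^2) from the Penman-Monteith equation.

    Assumed units: radiation and heat flux in W/m^2, vapor pressure
    deficit (es - ea) in kPa, delta and gamma in kPa/K, resistances in s/m,
    air density in kg/m^3, specific heat in J/(kg K).
    The default air density and specific heat are illustrative values.
    """
    available_energy = delta * (net_radiation - soil_heat_flux)
    aerodynamic_term = (air_density * specific_heat
                        * vapor_pressure_deficit / aerodynamic_resistance)
    denominator = delta + gamma * (1.0 + surface_resistance / aerodynamic_resistance)
    return (available_energy + aerodynamic_term) / denominator


if __name__ == "__main__":
    # Illustrative midday values, not measurements from any site.
    flux = penman_monteith_latent_heat_flux(
        net_radiation=400.0, soil_heat_flux=40.0, vapor_pressure_deficit=1.0,
        delta=0.145, gamma=0.066, surface_resistance=70.0,
        aerodynamic_resistance=50.0)
    print(f"LvE = {flux:.1f} W/m^2")
```

Raising the surface resistance lowers the computed flux, which is how models that vary rs in time represent vegetation response to environmental conditions.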

[44] Once potential evaporation rates are determined (either by the model or from off-line calculations or measurements), these rates can be incorporated directly as a sink for water in the model for any time and location where water is directly exposed to the atmosphere (e.g., in lakes, stream channels, overland flow, and leaf interception). Incorporating evaporative losses from parts of the model domain without open water surfaces, however, is less straightforward, as evaporation can result in water flux anywhere in XYZ space, with evaporation rates varying with space, time, and moisture availability. In theory, evaporation from the subsurface could be represented by governing physically based flow equations. In soils that are sufficiently wet to transmit enough water to meet surface evaporative demand, the Richards' equation (1) for a vertical soil column can be used to predict soil evaporation with the potential evaporation rate introduced as an external forcing [Hillel, 1980]. The Richards' equation, however, assumes isothermal conditions, so it does not account for the soil's energy balance or vapor transport. For drier soils, equations simultaneously describing water and heat transport may be required [Tindall and Kunkel, 1999, p. 260].

[45] Few (if any) distributed hydrologic models include such detail in representations of soil evaporation. Most models instead “link” evaporation to the soil zone by including a forcing term equal to the land surface evaporation rate. Water will then be extracted by the model from a specified zone of the subsurface. Many models also include empirical algorithms that allow the soil evaporation to vary as a function of vertical soil water content.

2.2.4. Biosphere

[46] Similar to atmospheric processes, the interactions of vegetation with water pathways are usually represented in distributed models through parameters and/or analytical or empirical equations. Many of these interactions are not fully understood and have been described as an important research frontier [Rodriguez-Iturbe, 2000]. Eagleson [2002] has identified numerous rich research opportunities in his pioneering book on ecohydrology. Here we discuss important processes to consider and how these processes are incorporated into distributed models.

[47] Plants interact with water pathways by intercepting precipitation, affecting overland flow processes, and dynamically influencing physical, chemical and biological properties of soil. The process of plant transpiration also has a direct link to water movement in the subsurface. Because transpiration constitutes some or all of the land surface latent heat flux, approaches for simulating transpiration usually are included in evaporation simulations. The rs term in the Penman-Monteith equation is the parameter used to represent plant transpiration processes. This term is often considered a constant, but some models simulate temporal variability of rs to account for vegetation response to environmental factors such as temperature, radiation, and moisture availability [e.g., Wigmosta et al., 2002]. Evaporation and transpiration may be treated as composite terms, or rates may be partitioned between different vegetation types.

[48] For the subsurface, as with evaporation, model representations of transpiration are usually simple sink functions that allow water to be extracted from the entire soil profile at a specified rate. More complex subsurface models incorporate root distribution functions that describe spatial variability in the location of plant roots [e.g., Vrugt et al., 2001b] so that water can be extracted from the subsurface at specified locations. Such models could also include an empirical root water uptake function [e.g., Feddes et al., 1978; van Genuchten, 1985] that describes the conditions under which plants are likely to transpire.
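A root water uptake sink of the kind described above can be sketched as a potential transpiration rate distributed over soil layers by root fraction and scaled by a stress factor. The piecewise-linear stress function below follows the general shape commonly associated with Feddes et al. [1978]: no uptake near saturation or below the wilting point, and full uptake in between. The threshold pressure heads and all names are hypothetical placeholders, not values from any cited model.

```python
def feddes_reduction(head, h1=-0.1, h2=-0.25, h3=-5.0, h4=-150.0):
    """Dimensionless uptake reduction factor (0 to 1) for pressure head in m.

    Thresholds satisfy h1 > h2 > h3 > h4 and are illustrative only:
    uptake is zero above h1 (too wet) and below h4 (wilting),
    and unrestricted between h2 and h3.
    """
    if head >= h1 or head <= h4:
        return 0.0
    if head > h2:
        return (h1 - head) / (h1 - h2)
    if head >= h3:
        return 1.0
    return (head - h4) / (h3 - h4)


def layer_uptake(potential_transpiration, root_fractions, heads):
    """Sink term per layer: potential rate x root fraction x stress factor."""
    return [potential_transpiration * fraction * feddes_reduction(head)
            for fraction, head in zip(root_fractions, heads)]


if __name__ == "__main__":
    # Three layers with most roots near the surface (illustrative values).
    sinks = layer_uptake(4.0, [0.6, 0.3, 0.1], [-200.0, -77.5, -1.0])
    print(sinks)
```

In this sketch the uptake lost in stressed layers is not compensated by wetter layers, so total uptake can fall below the potential rate. Some models redistribute the deficit instead.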

2.2.5. Snow

[49] An important surface process for distributed model applications is snow accumulation and melt. Distributed models that represent snow processes either use empirical heat indices to simulate snow evolution or incorporate more detailed algorithms for simulating snowpack energy and mass transfer. Because the details of snow accumulation and melt processes are beyond the scope of this work, we do not include descriptions of the algorithms here. For examples of distributed modeling approaches for incorporating snow, see Kuchment et al. [2000], Wigmosta et al. [2002], and Etchevers et al. [2004].

3. Criteria for Distributed Model Comparison

[50] From the preceding discussion of hydrologic processes that are represented in distributed hydrologic models, many model configurations are possible. Therefore, to introduce a common framework for comparing the models, we define several categories for describing the models and the processes they represent. The distributed model classification criteria we introduce are intended to provide guidance for selecting a distributed model for a particular application and understanding how the outputs and internal states of a model are affected by its configuration. Depending on the perspective and models considered, various criteria have been used to classify distributed models in the past [e.g., Singh, 1995; Singh and Woolhiser, 2002; Kavvas et al., 2004], with much of the terminology relating to the focus domain and scale considered as well as the theoretical basis of the model and the model development history.

[51] Our criteria build on those of Loague and VanderKwaak [2004] and stem from the model representations of water flow pathways in time and space. In the classification scheme, these flow pathways are considered the fundamental building blocks of the models, for all other model components are influenced by how the water moves through the domain. The criteria therefore define not only which processes are represented in the models but also differences in how those processes are represented. Model descriptions in Table 1 are restricted to the structural components of the model that influence how the model simulates water movement, for we consider these components the core building blocks of the distributed models. Most distributed models are continually evolving research tools rather than static software packages; therefore we do not include descriptions of model features that are peripheral to this core structure configuration such as code details, user interfaces, data requirements, parameterization schemes, sensitivity analysis packages, and details of the numerical or other computational schemes used to solve the equations.

Table 1. Examples of Spatially Distributed Hydrologic Models and Their Characteristics
| Model | Sources | Subsurface^a | Overland^b | Channel^b | Other Processes^c | Solution^d | Coupling (Surface-Subsurface)^d | Space^e | Time^e |
|---|---|---|---|---|---|---|---|---|---|
| Physical Models With 3-D Subsurface | | | | | | | | | |
| InHM | VanderKwaak [1999] | 3-D U/S RE | 2-D H/D DW | 2-D DW | DS ET | N/FE/FV | FO | TIN; H-C | E/C A |
| MODHMS | Panday and Huyakorn [2004] | 3-D U/S RE | 2-D H/D DW | 1-D DW, various shapes | DS IS ET | N/FD | FO, SIT, or SNIT linked through BC's | rectangular or curvilinear grid; H-C | E/C A |
| WASH123D | Yeh et al. [2004, 2006] | 3-D U/S RE | 2-D H/D DYW, DW, or KW | 1-D H/D DYW, DW, or KW | | N/FE | S | TIN; H-C | E/C |
| CATHY | Paniconi et al. [2003] | 3-D U/S RE | 1-D H/D DW | 1-D DW | DS ET | N/FE subsurface; N/FD surface | SNIT: subsurface to surface | tetrahedral grid, subsurface; network, surface; H-C scale | E/C A subsurface; F surface |
| HYDRUS 2-D/3D | Šimůnek et al. [2006] | 3-D U/S RE | not simulated | not simulated | ET | N/FE | None; surface represented through atmospheric BC | TIN; H-C | E/C A |
| FEMWATER USEPA | Yeh et al. [1992] | 3-D U/S RE | not simulated | not simulated | ET | N/FE | None; surface represented as variable BC | rectangular grid; H-C | Transient or steady state; F or V |
| Physical Models With 2-D Subsurface | | | | | | | | | |
| PIHM | Qu [2005] | 1-D U; 2-D S; two-state dynamic | 2-D H/D DYW, DW or KW | 1-D DYW, DW or KW | IS ET SN | N/FV | FO | TIN; H-C | E/C A |
| SHE | Abbott et al. [1986a, 1986b] | 1-D U RE; 2-D S B | 2-D H/D DW | 1-D DW | IS ET SN | N/FD | SIT | rectangular grid; H-C | E/C A/V for each component |
| GSSHA | Downer and Ogden [2004] | 1-D U RE, I, or IR; 2-D S | 2-D H/D DW | 1-D DW | DS IS ET SN | N/FV/FD | SNIT | rectangular grid; H-C | E/C A/V for each component |
| IHDM | Calver and Wood [1995] | 2-D (XZ) U/S RE | 1-D H/D KW | 1-D KW | | N/FE subsurface; N/FD surface | SNIT linked through BC's | H-C | E/C F |
| WEHY | Kavvas et al. [2004] | 1-D U: (Z) I, (X) K (subsurface stormflow); 2-D S B (regional) | 1-D H/D KW | 1-D DW | IS ET SN EB | N/FD | SNIT | hillslopes or first order watershed subunits with ensemble average parameters; large watershed, regional scale | E/C V for each component |
| Analytical Surface/Subsurface Models | | | | | | | | | |
| TRIBS | Ivanov et al. [2004] | 1-D U IR; 1-D S analytical | 1-D H/D analytical | 1-D KW | IS ET EB | N/FE channel; D otherwise | SNIT | TIN; H-C | E/C V for each component |
| DHSVM | Wigmosta et al. [1994, 2002] | 1-D U analytical; 2-D S analytical | 2-D H/D E or unit graph | 1-D E | IS ET EB SN | D | SNIT | rectangular grid; H-C | E/C F |
| Physical Surface Runoff Models | | | | | | | | | |
| CASC2D | Julien et al. [1995], Ogden and Saghafian [1997], Ogden [1998], Senarath et al. [2000] | 1-D U I, IR, and/or SMA | 2-D H DW | 1-D DYW or DW, various shapes | DS IS ET | N/FD | surface runoff model; not coupled | rectangular grid; H-C | E/C F |
| KINEROS | Woolhiser et al. [1990], Smith et al. [1995] | 1-D U IR | 1-D H KW | 1-D KW, various shapes | IS | N/FD | surface runoff model; not coupled | plane-channel network; H-C | E; A or F |
| Analytical/Empirical Models With 1-D Surface and Subsurface | | | | | | | | | |
| THALES | Grayson et al. [1992a, 1995] | 1-D U I; 1-D S K | 1-D H/D KW | 1-D KW | | N/FD | SNIT | topographically defined stream tube network; H-C | E/C F |
| TOPKAPI | Ciarapica and Todini [2002] | 1-D U E; 1-D S K | 1-D D KW or E | 1-D K | ET | N | SNIT | network of cells; H-C | F |
| PRMS | Leavesley et al. [1983], Leavesley and Stannard [1995] | 1-D U I; 1-D S E | 1-D H KW or E | 1-D KW or E | IS ET SN EB | D | SNIT | network for storm mode; hydrologic response units for daily mode; C | daily or storm mode (shorter time steps) |
| HEC-HMS | USACE [2001] | 1-D U I, SMA; 1-D S | 1-D H KW or E | 1-D KW or E | DS IS ET | D | SNIT | network; C | E/C F |

^a Subsurface abbreviations: U, unsaturated zone; S, saturated zone; RE, Richards' equation; B, Boussinesq equation; I, infiltration; IR, infiltration with redistribution; SMA, soil moisture accounting; D-F, Dupuit-Forchheimer; K, kinematic; E, empirical.

^b Surface abbreviations: H, Horton; D, Dunne; DYW, dynamic wave; DW, diffusion wave approximation; KW, kinematic wave approximation; E, empirical.

^c Other abbreviations: DS, depression storage; IS, interception storage; ET, evapotranspiration; SN, snow accumulation and melt; EB, energy balance.

^d Solution/coupling abbreviations: N, numerical; FD, finite difference; FE, finite element; FV, finite volume; IT, iterative; D, direct; FO, first order; S, sequential; SIT, sequential iterative; SNIT, sequential noniterative; BC, boundary condition.

^e Space/time abbreviations: H-C, hillslope-catchment; E, event; E/C, either event or continuous; A, adaptive time stepping; V, variable time steps; F, fixed time steps.

[52] Table 1 shows 19 representative distributed models and gives brief descriptions of their solution schemes and how they represent flow processes, space, time, and coupling.

3.1. Processes

[53] The first category we use to describe distributed models is process representation. The processes listed in Table 1 include subsurface flow, surface flow, and any other processes that affect water movement. For the subsurface, we specify whether the model includes unsaturated flow (U for unsaturated) and/or fully saturated flow (S for saturated). Table 1 also specifies the type of approach used (RE for Richards' equation, I for infiltration, IR for infiltration with redistribution, B for Boussinesq, D-F for Dupuit-Forchheimer approximation, and K for kinematic approximation).

[54] For the surface, we distinguish between overland and channel flow, as most models have separate schemes for simulating these two types of flow. Overland flow representations are described as either infiltration excess (H for Horton) or saturation excess (D for Dunne) or both (H/D), and the table indicates whether the model uses a dynamic wave (DYW), diffusion wave (DW), kinematic wave (KW), or empirical (E) routing scheme.

[55] The “other” category in Table 1 lists whether the model simulates depression storage (DS), interception storage (IS), evapotranspiration (ET), energy balance (EB), or snow (SN). Details of the types of process representation in the “other” category are not included. Any additional capabilities of the model, such as transport simulations, are not listed.
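The process categories used in Table 1 lend themselves to a simple record structure that can be filtered when screening models for an application. The sketch below encodes four rows of Table 1 as an example. The class, field, and function names are hypothetical, and the subset is illustrative rather than a replacement for the full table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One row of a Table 1 style classification (field names are hypothetical)."""
    name: str
    subsurface: str
    overland_mechanisms: tuple  # "H" (Horton) and/or "D" (Dunne)
    overland_routing: str
    other_processes: tuple      # e.g., "DS", "IS", "ET", "SN", "EB"
    solution: str
    coupling: str


# Values transcribed from four rows of Table 1.
TABLE_1_SUBSET = [
    ModelEntry("InHM", "3-D U/S RE", ("H", "D"), "2-D DW",
               ("DS", "ET"), "N/FE/FV", "FO"),
    ModelEntry("SHE", "1-D U RE; 2-D S B", ("H", "D"), "2-D DW",
               ("IS", "ET", "SN"), "N/FD", "SIT"),
    ModelEntry("DHSVM", "1-D U analytical; 2-D S analytical", ("H", "D"),
               "2-D E or unit graph", ("IS", "ET", "EB", "SN"), "D", "SNIT"),
    ModelEntry("KINEROS", "1-D U IR", ("H",), "1-D KW",
               ("IS",), "N/FD", "surface runoff model; not coupled"),
]


def screen_models(models, needs_saturation_excess=False, needs_snow=False):
    """Return names of models that satisfy the requested process criteria."""
    selected = []
    for model in models:
        if needs_saturation_excess and "D" not in model.overland_mechanisms:
            continue
        if needs_snow and "SN" not in model.other_processes:
            continue
        selected.append(model.name)
    return selected


if __name__ == "__main__":
    print(screen_models(TABLE_1_SUBSET, needs_saturation_excess=True, needs_snow=True))
```

Storing the overland flow mechanisms separately from the routing scheme avoids ambiguity between the abbreviation D (Dunne) and dimension labels such as 2-D.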

3.2. Solution Scheme

[56] Models that solve boundary value problems (see section 2.1) for a 2-D or 3-D domain employ numerical solution approximations (N in Table 1) to continuous functions. These solutions use either finite difference (FD), finite volume (FV), or finite element (FE) methods (or some combination). In space, finite difference methods are discrete, with the domain represented by a series of points in a structured mesh (e.g., a rectangular grid, Figure 1c). In this method, the values of functions are represented at each grid point. Finite volume methods are also discrete, but they divide the domain into volumes and calculate fluxes averaged across the surface of the volumes, which can be part of an unstructured mesh (e.g., TIN, Figure 1b). Finite element methods can also accommodate unstructured meshes with complex geometries, but in contrast to the discrete finite difference and finite volume methods, they use continuous functions to interpolate between the values at element vertices, with the simplest of these functions representing a plane.
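To make the finite volume idea concrete, the sketch below advances water depth on a one-dimensional strip of cells by balancing the fluxes across each cell's upstream and downstream faces. It assumes a kinematic depth-discharge relation, an explicit first-order upwind scheme, and no inflow at the upstream divide. These are choices made for brevity, and the names are hypothetical. None of the models in Table 1 is claimed to use this exact scheme.

```python
def upwind_kinematic_step(depth, dx, dt, rain_rate, alpha, exponent=5.0 / 3.0):
    """Advance cell depths (m) by one explicit finite volume step.

    Each cell's storage changes by rainfall excess plus the difference
    between the flux entering its upstream face and leaving its downstream
    face. Returns the new depths and the unit-width outflow (m^2/s)
    leaving the last cell during the step.
    """
    # Flux across the downstream face of each cell, from its own depth (upwind).
    flux = [alpha * d ** exponent for d in depth]
    new_depth = []
    for i, d in enumerate(depth):
        inflow = flux[i - 1] if i > 0 else 0.0  # no flow across the divide
        new_depth.append(d + dt * (rain_rate - (flux[i] - inflow) / dx))
    return new_depth, flux[-1]


if __name__ == "__main__":
    depths = [0.001, 0.002, 0.003]
    depths, outflow = upwind_kinematic_step(depths, dx=1.0, dt=0.1,
                                            rain_rate=1e-5, alpha=2.0)
    print(depths, outflow)
```

Because every interior face flux is subtracted from one cell and added to its neighbor, the scheme conserves mass by construction. The time step must still be small enough for stability, since an explicit step that is too long can produce negative depths.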

[57] Models that are not structured to solve boundary value problems directly use a range of different solution schemes, often representing a combination of approaches. If a component of the model has a continuous function with one or more unknown variables in time or space, the model may use a numerical solution for that component. If several components of a model depend on values of an unknown variable, the model may have an iterative (IT) solution in which different values of the unknown variable are iteratively tried until all subcomponents converge on a single solution. Finally, we refer to any nonnumerical, noniterative model solution as direct (D). Typically, a direct approach is used in models with analytical or empirical functions. The model will calculate a solution by proceeding through the simulation subdomains and model elements in a specified order, using only input variables, stored values from a previous time step, and/or values from a subdomain already solved.
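The direct approach described above depends on visiting model elements in an order that guarantees every upstream element has been solved before the element it drains into. A minimal sketch of such an ordering, applied here to accumulating runoff through a one-dimensional routing network, is given below. The network representation and names are assumptions made for the example.

```python
from collections import deque


def accumulate_downstream(receiver, local_runoff):
    """Accumulate runoff through a routing network in upstream-to-downstream order.

    receiver maps each element to its single downstream element (None at an
    outlet); every downstream element must itself be a key of receiver.
    Returns accumulated totals per element and the processing order used.
    """
    # Count how many elements drain directly into each element.
    unsolved_donors = {node: 0 for node in receiver}
    for downstream in receiver.values():
        if downstream is not None:
            unsolved_donors[downstream] += 1

    total = dict(local_runoff)
    ready = deque(node for node, count in unsolved_donors.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        downstream = receiver[node]
        if downstream is not None:
            # The downstream element only uses values already solved upstream.
            total[downstream] += total[node]
            unsolved_donors[downstream] -= 1
            if unsolved_donors[downstream] == 0:
                ready.append(downstream)
    return total, order


if __name__ == "__main__":
    network = {"a": "c", "b": "c", "c": "d", "d": None}
    runoff = {name: 1.0 for name in network}
    print(accumulate_downstream(network, runoff))
```

No iteration is needed because each element is visited once, after all of its contributors.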

3.3. Coupling

[58] Any distributed model that simulates water movement through both the land surface and the subsurface is a coupled model. If the model accounts for the surface or the subsurface only as a source or sink for water, then it is not coupled. Coupled models require some scheme to account for the interactions between the land surface and subsurface. Holzbecher et al. [2005] describe coupling processes in hydrologic models by means of links and feedbacks between compartments (subdomains) in a model (e.g., surface, subsurface). Links may represent either one-way or two-way interactions between the subdomains. For example, an infiltration equation is a one-way link from the surface to the subsurface.

[59] For a numerical model, two-way links with feedback between the land surface and subsurface can be represented through a first-order approach in which governing equations for the subsurface and surface subdomains are solved simultaneously. In models with first-order coupling (FO), flux terms that represent interactions between the surface and subsurface domains are included directly in the governing equations (e.g., (1) and (7)). As a result, the solution of the governing equations considers the surface and subsurface as a continuum, with interdependent flow processes. Two-way links with feedbacks could also be represented through a sequential iterative approach (SIT) [Holzbecher and Vasiliev, 2005]. In this approach, for example, the surface and subsurface flow equations are solved separately, and the model iterates through different surface and subsurface solutions until it converges on approximately the same piezometric head distribution at the land surface [see, e.g., Panday and Huyakorn, 2004].

[60] One-way links between the subdomains can be represented through a sequential noniterative approach (SNIT) [Holzbecher and Vasiliev, 2005]. For example, in a sequential noniterative approach for a numerical model, flow in one domain is solved first; then the solution for that domain at the surface-subsurface interface is used as a boundary condition for the adjacent domain. For analytical or empirical models, one-way sequential links are typically represented by fluxes from one subdomain into the adjacent subdomain. For some models listed in Table 1, the model references do not clearly describe whether the surface-subsurface coupling is iterative or noniterative; for these models, Table 1 describes the coupling as S, sequential.
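The difference between sequential noniterative and sequential iterative coupling can be illustrated with a deliberately simple toy system: one surface store and one subsurface store that exchange water in proportion to their head difference. This is not the formulation of any model in Table 1. The exchange law, the parameter names, and the fixed-point iteration are all assumptions chosen to keep the example short.

```python
def coupled_step(h_surface, h_subsurface, rain, dt, conductance, porosity,
                 iterate=True, tolerance=1e-10, max_iterations=100):
    """Advance a toy two-store system by one time step.

    iterate=False: noniterative, exchange flux from start-of-step heads only.
    iterate=True: iterative, the two stores are re-solved in turn until the
    end-of-step heads stop changing. Returns the new surface head, the new
    subsurface head, and the number of iterations performed.
    """
    exchange = conductance * (h_surface - h_subsurface)
    new_surface = h_surface + dt * (rain - exchange)
    new_subsurface = h_subsurface + dt * exchange / porosity
    if not iterate:
        return new_surface, new_subsurface, 0

    for iteration in range(1, max_iterations + 1):
        # Recompute the interface flux from the latest end-of-step heads.
        exchange = conductance * (new_surface - new_subsurface)
        next_surface = h_surface + dt * (rain - exchange)
        next_subsurface = h_subsurface + dt * exchange / porosity
        change = max(abs(next_surface - new_surface),
                     abs(next_subsurface - new_subsurface))
        new_surface, new_subsurface = next_surface, next_subsurface
        if change < tolerance:
            return new_surface, new_subsurface, iteration
    raise RuntimeError("coupling iteration did not converge; reduce dt")


if __name__ == "__main__":
    args = dict(h_surface=0.05, h_subsurface=0.01, rain=0.001, dt=0.1,
                conductance=0.5, porosity=0.4)
    print("noniterative:", coupled_step(iterate=False, **args))
    print("iterative:   ", coupled_step(iterate=True, **args))
```

Both variants conserve mass in this toy system, but only the iterative one yields end-of-step heads that are consistent with the exchange flux used to compute them. That consistency is the feedback the noniterative approach omits.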

[61] Some models do not represent the subsurface as a continuum and therefore require a coupling scheme for the interface between the unsaturated and saturated zones. Coupling between these two zones can follow the same approaches as those described for surface-subsurface coupling. In many distributed hydrologic models, solutions and coupling techniques for the various domains include a fusion of multiple approaches.

3.4. Space

[62] To describe model representations of space, we consider the intended spatial scale of the model and its elements, model discretization, and the spatial dimensions represented. Model discretization schemes are listed in Table 1 as either grid, TIN, network, or other. The intended spatial scales of distributed models range from single hillslopes to continental-scale river basins. Most models could be applied at hillslope to catchment scale (H-C) with variable model element sizes. In practice, computational and data limitations result in more detailed models being used at smaller spatial scales, and simpler models used at larger scales.

[63] When explicit spatial representation is included, it may be in one, two, or three dimensions. For example, detailed coupled surface-groundwater models may simulate the subsurface in three dimensions, surface overland flow in two dimensions, and channel flow in one dimension. Models listed in Table 1 as 2-D in a particular subdomain must have the capability of representing flow paths from each model element to more than one adjacent model element. For surface overland flow, for example, rectangular grids are usually associated with fully 2-D flow representations (see section 2.2.2). In contrast, if each model element is connected to only one downstream element via a routing network, then the model represents 1-D flow routed through 2-D space. This occurs in models that divide the land surface into a network of connected planes in XY space. For the subsurface, many models have simplified representations of the Z dimension, considering it only in terms of a length scale such as a “saturated depth” or as a series of predetermined linked layers.

3.5. Time

[64] In describing model representations of time, we consider the intended time period for the model simulation and the time intervals used in model runs. Some distributed models are intended only for rainfall-runoff event simulations (E), whereas others can simulate both rainfall-runoff events and longer (continuous) time periods (C). The interval of time used in a model is a reflection of both the solution scheme and the types of processes represented. Subsurface, surface, and atmospheric processes occur at different timescales, and reconciling these timescales within a single model is challenging. Distributed models that use numerical solution schemes may not proceed at a regular time interval but rather change time steps adaptively (A) as needed for stability of the numerical solution. Analytical or empirical models more often proceed at a regular, fixed time interval (F). Many models have separate modules for simulating surface and subsurface processes, and these modules may be implemented at different time steps (V, variable). If the available model documentation does not describe the nature of the time steps (A or F), time step descriptions are excluded from Table 1.
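One common way to adapt the time step for stability in an explicit surface flow scheme is to limit the Courant number, so that a wave cannot cross more than a fraction of a cell in one step. The sketch below applies this idea with a kinematic wave speed. The Courant limit, the maximum step, and the function name are assumptions, and models with implicit solvers typically adapt their steps using other criteria, such as convergence behavior.

```python
def stable_time_step(depths, dx, alpha, exponent=5.0 / 3.0,
                     courant=0.5, dt_max=60.0):
    """Choose a time step (s) so that celerity * dt / dx <= courant.

    Wave speed is taken as the kinematic celerity
    c = alpha * exponent * d**(exponent - 1) of the deepest wet cell.
    Returns dt_max when every cell is dry.
    """
    celerity = max((alpha * exponent * d ** (exponent - 1.0)
                    for d in depths if d > 0.0), default=0.0)
    if celerity == 0.0:
        return dt_max
    return min(dt_max, courant * dx / celerity)


if __name__ == "__main__":
    # The step shortens as flow deepens and waves travel faster.
    for depth in (0.0, 0.01, 0.1):
        print(depth, stable_time_step([depth], dx=10.0, alpha=2.0))
```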

4. Existing Distributed Models

[65] In this section, we describe several characteristic or well-known distributed models (Table 1). The list of existing distributed models was chosen to be representative and to highlight the range of possible modeling approaches. The models in Table 1 are organized into groups with similar features. Most subdivisions are based on model representations of the subsurface, for this is the subdomain in which distributed model characteristics vary most significantly. The subsurface is also the most difficult to model and the most computationally intense when unsaturated flow is represented completely.

[66] For representing bulk water flow processes, an end-member of distributed hydrologic models fully represents the governing conservation equations in three dimensions with first-order coupling between the land surface and subsurface. Examples of models of this type are the integrated hydrology model (InHM) (see VanderKwaak [1999], with key details also given by VanderKwaak and Loague [2001]) and MODHMS [Panday and Huyakorn, 2004]. These models are numerical with implicit solutions to the Richards' equation in three dimensions for the subsurface and two-dimensional overland flow represented by a diffusion wave approximation of the de St. Venant equations. Because the models have first-order coupling between the surface and subsurface, they are capable of simulating feedbacks between these domains including all forms of overland flow generation and reinfiltration. With three-dimensional subsurface representations, they are capable of tracking movement of wetting fronts and capturing characteristics of subsurface flow through hillslopes of any shape (concave, convex, convergent, and divergent) and with any configuration of subsurface stratigraphy. Another surface-subsurface model, WASH123D [Yeh et al., 2004, 2006], also simulates the subsurface in three dimensions using the Richards' equation and can represent surface flow with either a full dynamic wave, diffusion wave, or kinematic wave approximation. Equations in the different domains may be solved at different time steps, with different numerical solution schemes for each of the model components. To accommodate the variable time intervals, the model is sequentially coupled, meaning the surface-subsurface interaction terms are not included in the governing equations.

[67] Three other models shown in Table 1 solve the three-dimensional Richards' equation for the subsurface but have simplified representations of surface water flow and surface-subsurface interactions. The catchment hydrological (CATHY) model [Paniconi et al., 2003] links a one-dimensional surface flow network to the 3-D subsurface subdomain. The USEPA FEMWATER 1,2,3 model [Yeh et al., 1992] does not explicitly simulate surface water flow and instead uses variable surface boundary conditions to represent interaction of the subsurface with the surface. These surface boundary conditions can be either specified as flux for time periods with precipitation or specified as pressure head gradient for time periods with no precipitation.

[68] The HYDRUS model [Šimůnek et al., 1999, 2006] is designed to represent variably saturated subsurface flow (and transport) and is now available in 2-D and 3-D as well as a separate 1-D (vertical) version [Šimůnek et al., 2005]. In HYDRUS 2D/3D [Šimůnek et al., 2006], interactions with the surface are represented through an atmospheric boundary condition, which is either a specified head or a specified flux. Of the models discussed here, HYDRUS has the most extensive capabilities for representing details of subsurface flow. The model can represent soil hydraulic properties using the Brooks and Corey [1964], van Genuchten [1980], Vogel and Císlerová [1988], Kosugi [1996], or Durner [1994] equations. The model also includes capabilities for representing hysteresis, scaling, and temperature dependence in soil hydraulic properties. It incorporates root water uptake [Feddes et al., 1978; van Genuchten, 1985] with spatial root distribution functions [Vrugt et al., 2001a, 2001b] and can represent preferential flow in fractures or macropores [Šimůnek et al., 2003].

[69] Although not included in Table 1 due to the variability in model features between versions of the code, several versions of the U.S. Geological Survey MODFLOW code for 3-D groundwater simulation also fit within the scope of distributed hydrologic models described here. MODFLOW (for the latest version, see Harbaugh [2005]) is a 3-D finite difference numerical model designed to simulate saturated subsurface flow. Packages of the MODFLOW code can include 1-D stream flow, either connected directly to saturated subsurface flow [Prudic et al., 2004] or including variably saturated flow below channels using a kinematic approximation to the Richards' equation [Niswonger and Prudic, 2005]. Two separate versions of the code, MODBRNCH [Swain and Wexler, 1996] and MODFLOW/DAFLOW [Jobson and Harbaugh, 1999], connect the saturated subsurface simulation to 1-D channel networks. Another package of MODFLOW simulates variably saturated flow for locations other than the channel network [Niswonger et al., 2006] using a kinematic approximation to the Richards' equation. A separate version of MODFLOW, VSF [Thoms et al., 2006], is designed to simulate 3-D variably saturated subsurface flow.

[70] Other distributed models listed in Table 1 represent subsurface flow in one or two dimensions. Some are numerical models with varying approaches for coupling. The Penn State Integrated Hydrology Model (PIHM) [Qu, 2005] has first-order coupling between the surface and subsurface. The flexible structure of the model allows surface flow simulations using either the full de St. Venant equations or the diffusion or kinematic approximations. Rather than having a number of discrete elements representing the Z dimension in the subsurface, the model uses finite volumes that represent the entire surface-subsurface depth. Within those volumes, subsurface flow is simulated in a two-state (saturated-unsaturated) dynamic mode [Duffy, 1996] in which unsaturated flow occurs in the Z (vertical) direction, and saturated flow occurs in the XY (lateral) directions.

[71] The Système Hydrologique Européen (SHE) model [Abbott et al., 1986a, 1986b] and models developed with the SHE structure (e.g., SHESED [Bathurst et al., 1995] and MIKE SHE [Refsgaard and Storm, 1995]) also solve the Richards' equation in one dimension (Z) for the unsaturated zone and then simulate saturated subsurface flow in two dimensions using a 2-D (XY) Boussinesq equation. Unlike PIHM, which simulates a dynamic two-state (unsaturated-saturated) subsurface, SHE considers the water table to be the boundary between the unsaturated and saturated zones. This boundary is determined through a sequential iterative approach. This sequential iterative coupling is also used to connect the surface and subsurface flow in the SHE models. Another detailed distributed model, the gridded surface/subsurface hydrologic analysis (GSSHA) model [Downer and Ogden, 2004], is similar in that it solves the vertical Richards' equation and 2-D lateral transient saturated groundwater flow.

[72] Rather than simplifying computations by decreasing model detail with depth in the subsurface, the Institute of Hydrology Distributed Model (IHDM) [Calver and Wood, 1995] subdivides catchments into hillslope segments along lines of greatest slope. Each hillslope segment is simulated separately, and the outflows of the segments cascade into a channel network. The Richards' equation can then be solved in two dimensions (XZ) for a specified width hillslope segment.

[73] Many other distributed models do not solve Richards' equation for the subsurface and instead represent flow processes analytically. For example, the TIN-based real-time integrated basin simulator model (TRIBS) [Ivanov et al., 2004] approximates unsaturated zone flow by simulating infiltration using a kinematic approximation [Cabral et al., 1992] that allows tracking of multiple wetting fronts. The model routes surface flow based on a topographically determined routing network that follows the TIN edges with the steepest descent from each model node. This network represents 1-D flow through 2-D space, for each node is connected to only one downstream node. The model routes saturated subsurface flow based on gradients in the saturated water levels calculated at each node from the infiltration scheme. Unlike the numerical approaches for representing subsurface flow, this analytical scheme requires assuming quasi-steady state groundwater flow.
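The idea of a steepest-descent routing network, in which each node drains to exactly one downstream node, can be sketched as follows. This is a generic illustration and not the TRIBS implementation; the node coordinates, edge list, and function name are hypothetical.

```python
import math


def steepest_descent_network(nodes, edges):
    """Map each node to the single neighbor reached by the steepest downhill edge.

    nodes: {node_id: (x, y, z)}; edges: iterable of (node_a, node_b) pairs.
    Returns {node_id: downstream_id}, with None for pits and outlets.
    """
    neighbors = {node_id: set() for node_id in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    downstream = {}
    for node_id, (x, y, z) in nodes.items():
        best, best_slope = None, 0.0
        for other in sorted(neighbors[node_id]):
            xo, yo, zo = nodes[other]
            slope = (z - zo) / math.hypot(x - xo, y - yo)  # positive = downhill
            if slope > best_slope:
                best, best_slope = other, slope
        downstream[node_id] = best
    return downstream


# Hypothetical four-node mesh; D is the lowest node and acts as the outlet.
nodes = {"A": (0, 0, 10.0), "B": (1, 0, 8.0), "C": (0, 1, 9.0), "D": (1, 1, 5.0)}
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
network = steepest_descent_network(nodes, edges)
```

Because every node has at most one downstream neighbor, the result is a tree, which is why flow along it is one-dimensional even though the mesh covers two-dimensional space.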

[74] The distributed hydrology soil vegetation model (DHSVM) [Wigmosta et al., 1994, 2002] uses a slightly different analytical approach for representing the subsurface. The model simulates vertical water movement in the unsaturated zone using an assumed unit gradient for downward movement and an analytical (sorptivity) approximation to the Richards' equation for evaporation (upward movement). Similar to TRIBS, saturated flow is routed based on the gradient in water levels determined from the unsaturated zone calculations for each model grid cell. Alternatively, saturated flow can be routed using the ground surface slope as the assumed gradient. The saturated flow routing is described in Table 1 as 2-D because water in one grid cell can move downslope to one or more adjacent grid cells. The analytical subsurface schemes for both TRIBS and DHSVM are solved directly with no iteration. Therefore, although they are simplified approximations of the subsurface flow behavior, their solutions require much less computational capacity than numerical approaches. These models are both generally intended to be applied at larger scales than many of the numerical surface-subsurface models. Model development focused extensively on land-atmosphere interactions, and as such both models include extensive computations of land surface-atmosphere energy transfer.

[75] Another class of models is designed to represent Hortonian (infiltration excess) runoff only (e.g., CASC2D [Julien et al., 1995] and KINEROS [Woolhiser et al., 1990]). KINEROS was designed specifically for simulations of bulk hillside erosion and sediment transport. These models represent infiltration as a sink for surface water; they do not explicitly model subsurface water movement. Both models are capable of representing redistribution of infiltrated moisture during breaks in precipitation [Ogden and Saghafian, 1997], and CASC2D also has an option for soil moisture accounting [Senarath et al., 2000], which allows the model to be run in continuous mode. A new version of KINEROS (KINER-OPUS2) will also have capabilities for continuous simulation [Goodrich et al., 2006]. Otherwise, the Hortonian models are rainfall-runoff event models, and for each event, the initial soil water content distributions must be assigned.

[76] For surface overland flow, only WASH123D, PIHM, and CASC2D have options for solving the full dynamic de St. Venant equations for shallow water flow; however, these models and several others (InHM, MODHMS, SHE, GSSHA) are typically run using a diffusion wave approximation in 2-D for overland flow and 1-D for channel flow. All of these models numerically solve the diffusion wave approximation, so overland flow directions in XY are determined as an integral part of the numerical solution. Other models such as DHSVM also simulate 2-D overland flow, but the flow direction is determined based on a topographically driven algorithm (see section 2.2.2). Another common approach for simulating surface flow is through an interconnected network of planes and channel segments. Overland flow can be routed through such a network in one dimension, with a specified width or area accounting for the second dimension. The CATHY model uses a numerical diffusion wave approximation in 1-D to represent overland flow through this type of network. IHDM and KINEROS both solve a kinematic wave approximation to route flow in 1-D through the hillslope-channel network. Each model defines this network slightly differently. IHDM separates the domain into hillslope segments of specified width, whereas KINEROS employs user-defined overland flow planes.
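To make the kinematic wave approach concrete, the sketch below routes rainfall excess down a single overland flow plane using Manning's equation and an explicit upwind finite difference scheme. It is a simplified illustration under assumed parameter values, not the numerical scheme used in IHDM or KINEROS, and the function name is hypothetical.

```python
def kinematic_wave_plane(rain_excess, length, slope, manning_n,
                         n_cells, dt, n_steps):
    """Explicit upwind solution of dh/dt + dq/dx = r with q = alpha * h**(5/3).

    Units: m and s. Returns (depths, outlet discharge per unit width in m^2/s).
    The time step must satisfy the Courant condition for stability.
    """
    dx = length / n_cells
    alpha = slope ** 0.5 / manning_n
    h = [0.0] * n_cells  # start from a dry plane
    for _ in range(n_steps):
        q = [alpha * depth ** (5.0 / 3.0) for depth in h]
        q_upstream = 0.0  # no inflow at the top of the plane
        new_h = []
        for i in range(n_cells):
            depth = h[i] + dt * (rain_excess - (q[i] - q_upstream) / dx)
            new_h.append(max(depth, 0.0))
            q_upstream = q[i]
        h = new_h
    return h, alpha * h[-1] ** (5.0 / 3.0)


# Assumed example: 100 m plane, 1% slope, n = 0.02, 36 mm/h rainfall excess.
depths, q_out = kinematic_wave_plane(
    rain_excess=1.0e-5, length=100.0, slope=0.01, manning_n=0.02,
    n_cells=20, dt=1.0, n_steps=3000)
```

Once the plane reaches equilibrium, the outlet discharge per unit width should approach the rainfall excess multiplied by the plane length, which provides a simple mass balance check on the scheme.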

[77] For overland flow simulations, models represent microtopographic influences with varying degrees of complexity. The InHM model [VanderKwaak, 1999] keeps track of the surface water depth relative to the specified height of microtopography. When water is infiltrating, the model allows the relative permeability of the land surface to increase as the surface water depth approaches the height of the microtopography. As such, the model can account for the types of microtopographic influences on infiltration rates described by Dunne et al. [1991]. KINEROS also allows direct interaction of flow with the microtopography in that the ground surface area covered by water varies linearly up to a maximum specified microtopographic elevation. The research version of the model (KINEROS2 [Goodrich et al., 2002]) can incorporate small-scale infiltration variability by including a lognormal variation in saturated hydraulic conductivity for each model element [Woolhiser and Goodrich, 1988]. The model incorporates the infiltrability equation developed by Smith and Goodrich [2000], which accounts for spatial heterogeneity in saturated hydraulic conductivity. Most other models allow “depression storage” of surface water but do not explicitly represent variability in infiltration rates associated with microtopographic relief. THALES [Grayson et al., 1992a, 1995] explicitly simulates overland flow in rills, but most other models that include treatment of rill flow (e.g., KINEROS2, WEHY) group rills together with overall overland flow computations.

[78] Channel flow in most distributed models is simulated in one dimension, as stream channel widths are typically smaller than the size of plan view distributed model elements or computational units. Most physically based models simulate channel flow using the one-dimensional version of the equations used for surface overland flow. Channels are usually represented as linear features that can be placed anywhere in the model domain. Models based on a TIN structure often assign channel locations before building up the TIN mesh (e.g., PIHM, TRIBS), so in the process of mesh generation, the channel features end up forming edges of mesh triangles. Because channel flow is affected by channel shape, some models (e.g., MODHMS and CASC2D) have several channel cross-section options that can be assigned, including rectangular, trapezoidal, circular, or specified measured cross sections. The standard shapes are useful if the model is to incorporate engineered channels such as drainage ditches or pipes. For natural channels, some models allow assigned stage-discharge relationships (e.g., MODHMS) so that field rating curves can be incorporated into the flow simulation.

[79] Models that simulate 1-D channel networks have various methods for representing interactions of channels, overland flow, and subsurface flow. Models with first-order coupling can simulate interactions between these domains in multiple directions. For example, MODHMS and PIHM represent the interaction between overland flow and channel flow using equations of flow over a wide broad-crested weir [Panday and Huyakorn, 2004]. Models that have fully coupled surface-subsurface representation introduce flux terms in the governing equations to represent the movement of water between channels and the subsurface; however, because 1-D channels are subgrid-scale features for most distributed model applications, representing the dynamics of channel-groundwater interactions in a model is a challenge. MODHMS determines the direction of channel-subsurface flow based on the calculated heads in each domain. The larger head dictates the assumed wetted perimeter of the channel. This type of approach yields a fully coupled channel-subsurface model but does not explicitly represent all possible flow scenarios such as development of seepage faces above the water level height in the channel. Channel flow in models without first-order coupling is typically represented as flow into the channel from the surface or the subsurface but not the reverse direction (channel to surface or channel to subsurface). KINEROS, however, has an option to compute seepage loss from channels.
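A weir-type exchange between overland flow and a channel can be sketched with the textbook free-flow relation for an ideal broad-crested weir, Q = C_d (2/3)^(3/2) g^(1/2) b H^(3/2), with the flow direction set by whichever domain has the higher head. This is an assumed generic form, not the formulation documented for MODHMS or PIHM; it omits submergence corrections, and the function and argument names are hypothetical.

```python
import math

GRAVITY = 9.81  # m/s^2


def weir_exchange(head_overland, head_channel, bank_elev, bank_length,
                  discharge_coeff=1.0):
    """Free-flow broad-crested weir exchange (m^3/s) across a channel bank.

    Positive values mean flow from the overland domain into the channel.
    Heads and bank elevation share one datum (m); bank_length is in m.
    """
    if head_overland == head_channel:
        return 0.0
    driving_head = max(head_overland, head_channel) - bank_elev
    if driving_head <= 0.0:
        return 0.0  # neither water surface is above the bank
    q = (discharge_coeff * (2.0 / 3.0) ** 1.5 * math.sqrt(GRAVITY)
         * bank_length * driving_head ** 1.5)
    return q if head_overland > head_channel else -q


q_into_channel = weir_exchange(1.0, 0.0, 0.0, 1.0)
q_overbank = weir_exchange(0.2, 0.5, 0.1, 10.0)
```

Letting the sign follow the head difference is what allows a coupled model to represent both lateral inflow to the channel and overbank flow back onto the land surface.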

[80] Several of the analytical models in Table 1 have unique structures for representing both subsurface and surface flow. THALES [Grayson et al., 1992a, 1995] uses an algorithm called TAPES-C to derive flow stream tubes based on topographic contours. These stream tubes are then used to route both surface and subsurface flow using kinematic approximations. Because the tubes dictate the flow direction, all water movement is simulated in 1-D.

[81] Other 1-D network models build on geomorphic conceptualizations of catchment runoff processes. For example, one such approach was presented by Szilagyi and Parlange [1999], who used a geomorphic nonlinear cascade based on Horton-Strahler stream network ordering. Under this approach, catchment runoff processes are represented through a cascade of nonlinear storage elements. Similarly, the topographic kinematic approximation and integration (TOPKAPI) model [Ciarapica and Todini, 2002] generates a tree-shaped network of connected cells and routes surface and subsurface flow through that network using kinematic approximations. TOPKAPI builds in part on some of the assumptions of two widely used conceptual models, TOPMODEL [Beven and Kirkby, 1979; Beven et al., 1995] and ARNO [Todini, 1996], which merit some explanation here due to their influence in the realm of distributed modeling.

[82] TOPMODEL is a lumped conceptual model that uses a topographic index of hydrologic similarity to estimate the characteristics and states in a catchment. Stemming from the variable source area concept for overland flow generation, the model was designed to represent spatial variability in a catchment without explicitly simulating flow through a distributed domain. Computing capabilities were extremely limited when this model was first developed, and it was infeasible to simulate the spatially variable flow. In this approach, the hydraulic gradient is assumed to be equal to the topographic slope (i.e., kinematic approximation), and subsurface flow is approximated by a succession of steady state flow rates. The ARNO model also has a conceptual representation of catchment moisture state but without the topographic and steady state constraints of TOPMODEL. The model simulates dynamic contributing areas to stream flow, with the size of the areas driven by total catchment soil moisture and storage [Todini, 1996]. The TOPKAPI model is designed to incorporate the transient phase neglected in TOPMODEL but maintain a simple parameterization that enables application of the model at increasing spatial scales.
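The topographic index at the core of TOPMODEL is commonly written as ln(a / tan β), where a is the upslope contributing area per unit contour width and tan β is the local slope. In the classic exponential-transmissivity form, the local saturation deficit is then D_i = D_mean + m (λ_mean − λ_i), where λ is the index and m is a recession parameter. The sketch below applies these two relations to three hypothetical equal-area cells; the function names and all numerical values are assumptions for illustration.

```python
import math


def topographic_index(area_per_width, tan_beta):
    """Topographic index ln(a / tan(beta)); a in m, slope as a tangent."""
    return math.log(area_per_width / tan_beta)


def local_deficits(indices, mean_deficit, m_param):
    """Local saturation deficits (m), assuming cells of equal area."""
    mean_index = sum(indices) / len(indices)
    return [mean_deficit + m_param * (mean_index - ti) for ti in indices]


# Hypothetical cells: ridge, midslope, and convergent valley bottom.
cells = [(10.0, 0.20), (50.0, 0.10), (500.0, 0.02)]
indices = [topographic_index(a, s) for a, s in cells]
deficits = local_deficits(indices, mean_deficit=0.05, m_param=0.02)
saturated = [d <= 0.0 for d in deficits]
```

Cells with a large index (large contributing area, gentle slope) are predicted to saturate first, which is how the model represents variable source areas without routing water from cell to cell.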

[83] We also include several commonly used analytical/empirical distributed parameter models in Table 1. The U.S. Geological Survey Precipitation-Runoff Modeling System (PRMS) [Leavesley et al., 1983; Leavesley and Stannard, 1995] performs runoff calculations for separate defined subunits (hydrologic response units (HRUs)) that are based on landscape characteristics and location in the catchment. The model usually runs in daily time increment mode, in which case the HRUs are the model elements. For storm periods, the model can be run at shorter time intervals, with flow routed through a series of interconnected flow planes and channel segments. The HRUs could be single flow planes, or they could be divided up into several flow planes. The model has a flexible structure, which allows various combinations of analytical and empirical representations of surface and subsurface flow in the HRUs or through the plane and channel segment network.

[84] The U.S. Army Corps of Engineers (USACE) HEC-HMS model [USACE, 2000] also allows for distributed simulation by dividing basins into subunits, each of which can be assigned separate properties. The model structure includes a range of simulation options for infiltration, surface overland flow, and channel flow. HEC-HMS does not directly simulate distributed subsurface flow, but it does incorporate options for empirical estimations of “base flow” contributions to stream discharge. Both PRMS and HEC-HMS have lumped representations of the model subunits, but the subunits are connected to each other to enable 1-D routing of water through 2-D space.

[85] In a separate category, some distributed models are designed for simulations at large scale (kilometer-scale model elements), potentially allowing interaction between a hydrology model and a climate model. The large-scale watershed environment hydrology (WEHY) model [Kavvas et al., 2004] is based on upscaled conservation equations to account for the heterogeneity of watershed characteristics at large spatial scales. Model subunits in WEHY are hillslopes or first-order watersheds, which are assigned ensemble average parameter values from a probability density function. The model is a physically based numerical model with a 2-D (XY) kinematic wave approximation for overland flow, 1-D diffusion approximation for channel flow, Green-Ampt infiltration, kinematic hillslope subsurface stormflow, and 2-D (XY) saturated regional groundwater flow. Each component of the model operates at a different time step, and the components are sequentially linked. The variable time steps and the ensemble averaging of parameters and state variables permit computationally feasible solutions at the large scale.
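Green-Ampt infiltration, mentioned above as one component of WEHY, is an example of an analytical process representation. Under continuously ponded conditions, cumulative infiltration F satisfies F − ψ_f Δθ ln(1 + F / (ψ_f Δθ)) = K t, which is implicit in F. The following minimal sketch solves it by bisection; it is a textbook form with assumed parameter values, not WEHY's upscaled implementation, and the function names are hypothetical.

```python
import math


def green_ampt_cumulative(t, k_sat, psi_f, delta_theta, tol=1e-12):
    """Cumulative infiltration F (m) at time t under continuous ponding."""
    if t <= 0.0:
        return 0.0
    s = psi_f * delta_theta  # wetting front suction times moisture deficit

    def residual(f):
        return f - s * math.log(1.0 + f / s) - k_sat * t

    lo, hi = 0.0, k_sat * t + s
    while residual(hi) < 0.0:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def green_ampt_rate(f_cum, k_sat, psi_f, delta_theta):
    """Infiltration capacity (same units as k_sat) for cumulative depth f_cum > 0."""
    return k_sat * (1.0 + psi_f * delta_theta / f_cum)


# Assumed values: K = 0.01 m/h, suction head 0.1 m, moisture deficit 0.3.
f_1h = green_ampt_cumulative(1.0, 0.01, 0.1, 0.3)
f_4h = green_ampt_cumulative(4.0, 0.01, 0.1, 0.3)
rate_1h = green_ampt_rate(f_1h, 0.01, 0.1, 0.3)
rate_4h = green_ampt_rate(f_4h, 0.01, 0.1, 0.3)
```

The infiltration capacity declines toward the saturated hydraulic conductivity as the wetting front deepens, so rainfall excess tends to grow during a storm of constant intensity.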

[86] Other large-scale models use approaches that are distinct from those introduced here, due to differences in process behavior at large scale and computational challenges in representing detailed water flow processes for large domains. Similar to the ensemble averaging approach in WEHY, the variable infiltration capacity (VIC) model [Liang et al., 1994; Cherkauer et al., 2003], for example, incorporates the representative elementary area (REA) concept [Wood et al., 1988], which assumes that at a large scale (>1 km² for land surface hydrology), an average hydrologic response can be defined, and explicit representation of smaller-scale parameter distributions is unnecessary. As such, VIC assigns probability distributions of soil and vegetation characteristics to represent the effective hydrologic behavior within large-scale grid cells.
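One way such a subgrid distribution can be expressed is the variable infiltration capacity curve, commonly written as i = i_m [1 − (1 − A)^(1/b)], where A is the fraction of the grid cell with infiltration capacity less than i, i_m is the maximum capacity, and b is a shape parameter. The sketch below evaluates this curve and its inverse; the function names and parameter values are assumptions, and the actual VIC code should be consulted for its full formulation.

```python
def infiltration_capacity(area_fraction, i_max, b_shape):
    """Capacity i at which a fraction A of the cell has capacity below i."""
    return i_max * (1.0 - (1.0 - area_fraction) ** (1.0 / b_shape))


def saturated_fraction(i_level, i_max, b_shape):
    """Fraction of the cell whose infiltration capacity is at most i_level."""
    i_level = min(max(i_level, 0.0), i_max)  # clamp to the valid range
    return 1.0 - (1.0 - i_level / i_max) ** b_shape


# Assumed example: maximum capacity 0.2 m and shape parameter 0.3.
frac_half_full = saturated_fraction(0.1, 0.2, 0.3)
capacity_back = infiltration_capacity(frac_half_full, 0.2, 0.3)
```

A single curve with two parameters thus stands in for the explicit spatial pattern of soil storage that a finer-scale distributed model would resolve cell by cell.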

5. Model Selection Considerations

[87] Many of the fundamental processes and equations used in distributed models have been known for many decades (some for more than a century), yet our understanding of how best to represent spatially distributed hydrologic processes is still quite limited. Before plunging into a complex distributed modeling exercise, it is important to consider the broader picture of what a model is and is not. All hydrologic models, however complex, are inherently approximations and simplifications of reality [see, e.g., Oreskes et al., 1994]. A model can only be proven to replicate reality completely if reality is completely known, and this is elusive for a natural or engineered system. Indeed, catchment hydrology has been referred to as a transscientific discipline [e.g., Philip, 1975; Beven, 1996], or one with “questions that can be asked of science and yet cannot be answered by science” [Weinberg, 1972]. Oreskes and Belitz [2001] argue that models are always approximations and should be viewed as tools to be modified in response to increasing knowledge about a natural system.

[88] Getting lost in the details of a particular model, one can easily forget (or be completely unaware of) the legacy of theory, assumptions, and approximations that led to the current model structure and assume that the model can “tell us” what happens in a hydrologic system. Much of our understanding of hydrologic processes evolved before catchment hydrologic models were developed, and one might argue that hydrologic models are frequently used as tools for “justifying” preconceived notions of how a catchment behaves. Dooge [1986, p. 49S] commented that “It is to be feared that a number of hydrologists fall in love with the models they create. In hydrology, as in many other fields, the proliferation of models has not been matched by the development of criteria for the evaluation of their effectiveness in reproducing the relevant properties of the prototype.” On the groundwater side, Theis [1967, p. 138] commented: “The models which we choose and find satisfactory in some types of study of an aquifer have an insidious way of dominating our thought when we want to make other types of study of an aquifer. No one is quite immune to this domination,” yet Anderson and Bates [2001, p. 8] point out that “models can never be conclusively validated, only falsified.”

[89] Given the transscientific nature of the hydrologic problem, how can we hope to evaluate which distributed modeling approaches are appropriate to use for particular applications? Philip [1975, p. 29] suggested: “Let us at least work towards a situation where the trans-scientific judgments which practical hydrologists are forced to make are informed by a truly scientific hydrology: a skeptical science with a coherent intellectual content firmly based on the real phenomena…. The most science can do is inject some intellectual discipline into the republic of trans-science.”

[90] With these philosophical limitations in mind, the following discussion is intended to follow Philip's suggestion and offer some “discipline” for selecting and developing distributed models. We do not endeavor to suggest which of the distributed models described in this paper should be chosen for particular applications, for only a modeler experienced with the nuances of each of the individual codes could be qualified to make such a judgment. Most of the models we have described are research models with continually evolving codes, interfaces, supporting software, parameterization schemes, and parameter optimization packages. Therefore, rather than state which models are most suitable for particular application in 2007, we raise important issues to consider in model selection and leave to the motivated reader the decision of which modeling approach to use.

5.1. Process Considerations

I have been depressed to find people cheerfully applying theory developed for a uniform soil profile to patently non-uniform situations, and others putting into a grand hydrological model a time-dependence of infiltration capacity relevant to ponded-water and not to the rainstorms they had in mind. Philip [1975, p. 25]

[91] Selected distributed models should, at a minimum, represent the major processes that govern water pathways. When deciding which processes to include in a model, a first step is to determine the relative importance of flow in different domains to the modeling objective [Kirkby, 1988]. For example, a common objective of a distributed model simulation is to predict stream flow. In areas where all or nearly all of the stream flow may originate from surface runoff (e.g., arid regions with deep vadose zones or urban watersheds), it may be unnecessary to represent water flow pathways through the subsurface; a Hortonian model with an infiltration equation may be adequate to account for movement of water from the surface to the subsurface. Downer et al. [2002] have noted that Hortonian runoff representation is appropriate in watersheds with low infiltration capacity or where rainfall rates greatly exceed infiltration rates. If a stream is fed by a substantial portion of base flow (groundwater), then a Hortonian model cannot be expected to reproduce the observed hydrographs [see, e.g., Gan and Burges, 1990, Figure 2b and Figure 3, case 2]. If simulations of base flow and/or of subsurface water pathways are important to the model objective, then subsurface flow processes must be included.

[92] For cases in which both subsurface and surface flow processes must be included in a simulation, the nature of the interaction between these processes is an important consideration. Interactions between subdomains occur at the land surface and at the water table (if the model is divided into saturated and unsaturated zones). If these interfaces are dynamic (change frequently through time), then a very detailed representation of the subsurface and robust coupling schemes may be required. Such dynamic interactions may occur, for example, in hillslopes with shallow soils where the water table frequently rises to the ground surface. Figure 3, taken from Langbein and Wells [1955], gives an excellent visual demonstration of the differences in hydrologic processes that occur in two different geologic settings separated by only a short distance; different types of model capabilities are needed to simulate the flashy rainfall-runoff response in Wildcat Creek as opposed to the attenuated runoff response in the Tippecanoe River.

Figure 3. Variations in hydrologic response from Langbein and Wells [1955].

[93] For any flow process included in a simulation, simplifications of the process dynamics and spatial dimensions are common. Any simplification is based on assumptions, so it is critical to ensure that those assumptions are met. For example, the unsaturated zone is often simulated in the vertical dimension only. This assumption may be adequate in flat terrain, but it will miss important lateral flow paths that result in subsurface stormflow through steeper slopes. Kinematic approximations of subsurface flow, on the other hand, are appropriate only for steep terrain, where water flow directions are largely governed by topography (see section 2.2.1). Simplifications of subsurface characteristics (such as the common assumption of layers of uniform depth, homogeneous soils) may result in a model entirely missing the subsurface stratigraphic features that most influence water pathways. In defining the important processes, however, we always face the challenge of quantifying the fluxes of water into and out of spatial domains that are much larger than we can measure. Beven [2006a, p. 772] refers to this challenge as the “closure problem,” which he cites as being “effectively the ‘Holy Grail’ of scientific hydrology.”

[94] Additional processes such as evapotranspiration, snow accumulation and melt, preferential flow, and soil freezing and thawing can all greatly influence water movement in a catchment. As with surface and subsurface flow processes, in some cases, these processes have a minimal contribution to catchment-wide flow behavior, whereas in other cases, they exert strong influences on catchment hydrologic response. If these processes are included in a model, however, the approach is usually empirical, adding to the number of unknown model parameters.

[95] In many models, there is a tradeoff between comprehensive process representation and overparameterization. This is particularly true for any component of a model that includes an empirical process representation. Betson and Ardis [1978] suggested that models should be no more complicated than necessary, and interactions among parameters or coefficients should be minimized. Citing Dawdy [1969], they note (p. 315) “the optimal rule to follow in developing a model would be that of Occam's razor, which states that if a simple model will suffice, none more complex is necessary.” In the case of distributed models, however, pure model simplicity is not achievable, for the models are specifically intended to represent water movement through a space divided into numerous subunits. Beven [1993, 1996, 2006b] has suggested that hydrologic modeling problems always face the dilemma of “equifinality” (often called “nonuniqueness” by others), in which many possible parameter sets can yield the same model outcome. This problem becomes particularly pronounced in models that attempt to incorporate all potential processes using many separate but potentially interrelated parameters.
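Equifinality can be demonstrated with a deliberately trivial example. In the hypothetical toy model below, which is not taken from any of the models discussed, simulated runoff depends only on the product of two parameters, so distinct parameter sets reproduce an observed series equally well and calibration against runoff alone cannot separate them.

```python
def toy_runoff(rain, rain_multiplier, runoff_coeff):
    """Toy model: runoff is a fixed fraction of bias-corrected rainfall."""
    return [runoff_coeff * rain_multiplier * p for p in rain]


def sum_squared_error(simulated, observed):
    """Sum of squared differences between two equal-length series."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


rain = [0.0, 4.0, 10.0, 2.0]       # hypothetical rainfall depths (mm)
observed = [0.0, 2.1, 4.8, 1.0]    # hypothetical observed runoff (mm)

# Two different parameter sets with the same product (0.5).
sim_a = toy_runoff(rain, rain_multiplier=2.0, runoff_coeff=0.25)
sim_b = toy_runoff(rain, rain_multiplier=1.0, runoff_coeff=0.5)
error_a = sum_squared_error(sim_a, observed)
error_b = sum_squared_error(sim_b, observed)
```

In real distributed models the interactions are less transparent than a simple product, but the consequence is the same: additional independent observations or physical constraints are needed to distinguish among parameter sets that fit equally well.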

[96] Therefore, in considering which processes should be represented in a model, it is important to strive for balance between a comprehensive representation of dominant flow processes and a minimum number of unknown parameters, with attention given to process importance, dynamics of interactions with other flow simulations, and relevance of process simplifications [see, e.g., James and Burges, 1982]. Barnes [1995] suggests that models should have parameters that can be estimated from available data at appropriate scales, minimal parameter interactions, adaptability, and transferability. If key processes are neglected, the model will never accurately represent water pathways, whereas if too many processes are included, challenges of overparameterization become compounded. In the case of distributed modeling, one strategy for striking this balance is to begin with an adequate model structure for representing bulk water movement in the surface and subsurface. Of the distributed models listed in this paper, those with the most comprehensive representations of flow physics often have fewer unknown parameters than analytical and empirical models. Recent studies with integrated surface-subsurface models such as InHM, for example, have shown that the ability to simulate all physical mechanisms of water movement is a powerful tool for constraining possible values of unknown parameters [e.g., Ebel and Loague, 2006].

[97] The theoretical advantage of a physically based over an empirical model is that it could be used to predict hydrologic response in areas without “empirical” measurements. In a seminal lecture on physically based modeling, however, Woolhiser [1996] pointed out that one of the major criticisms of physically based models is that too many parameters make them no better than black boxes, likely to be misused and inaccurate. Indeed, many have questioned the ability to represent physical processes accurately [e.g., Grayson et al., 1992b; Woolhiser, 1996; Beven, 2001a]. Loague and VanderKwaak [2004] summarize many of the cited challenges with distributed models but point out that physics-based distributed simulations are of great use in such applications as concept development, hypothesis testing, and field experiment design. In carefully designed modeling problems, physically based approaches have the potential to enhance our conceptual understanding of processes and potentially lead us to new insights [Bredehoeft, 2005; Loague et al., 2006]. The advantage of using models that represent physical mechanisms as completely as possible is that such models allow us to test hypotheses and avoid shortcomings of trying to understand processes without a model structure that is capable of representing them. O'Connell and Todini [1996, p. 6] pointed out that “it is always possible to criticize process representation, but much more difficult to come up with a better alternative.”

5.2. Scale Considerations

For catchments of any size and heterogeneity, the problems of sampling (including data collection and system characterization) and integration become formidable; and the beautiful economy of analytical scientific methods is soon lost in the magnitude, complexity, and imprecision of the task of synthesis. Philip [1975, p. 24]

[98] Consideration of scale is critical in any distributed modeling application. Most field measurements used as model variables are collected at different scales than those represented in the models. With the exception of flow in rivers, almost all distributed models represent large-scale processes with equations that were developed to describe fluxes and states in small-scale experiments. Dooge [1986] posed this problem from a model application perspective and noted that for entirely linear systems, equations for conditions at a microscale can be spatially integrated over a specified domain to give a relationship for the macroscale; however, parameterization at the macroscale becomes much more difficult for nonlinear processes. Most of the processes we wish to represent in hydrology are nonlinear, so the issue of scale is inescapable.

[99] An important ongoing dialogue in hydrology concerns the scale at which the governing conservation equations used in the models described here are appropriate [e.g., Beven, 1989, 1996, 2002; Kavvas, 1999; Singh and Woolhiser, 2002]. As an alternative to the Freeze and Harlan [1969] physically based distributed model blueprint, Reggiani and Schellekens [2003] introduce a large-scale modeling approach based on a representative elementary watershed (REW) over which the governing mass and momentum conservation equations can be integrated. This and other new ideas for the next generation of distributed models will help guide our conceptual representation of hydrologic processes.

[100] The dialogue about scale in distributed models should also incorporate lessons from different subdisciplines. Surface water, groundwater, and unsaturated zone communities have all developed model strategies that are well suited to address particular types of problems at different space-time scales. Surface water models bring the constraint of integrated fluxes (stream flow), whereas distributed parameterization of models has been more extensively addressed in groundwater modeling. Soil physics/unsaturated zone models have given significant attention to parameter scaling and small-scale process details such as root water uptake.

[101] When selecting a distributed modeling approach based on existing models, scale can be considered in terms of a model's ability to represent processes at the spatial and temporal scale of interest in a particular application. For example, some physically based numerical models require high spatial resolution to resolve flow fields, particularly near the ground surface or in areas of converging flow. For large-scale modeling applications, setting up a fine enough model element scale to resolve these flow features may be computationally infeasible or unwarranted by the available data. Conversely, if a model is to represent explicitly the XY distribution of soil moisture in the upper few centimeters of soil, then a model that represents bulk average water content for the entire unsaturated zone will be inadequate for the task.
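
A rough element count shows how quickly the resolution requirement can become computationally demanding. The sketch below assumes a regular grid of square cells with a uniform number of subsurface layers; the catchment area, cell sizes, and layer count are hypothetical.

```python
import math

def element_count(area_km2, cell_size_m, n_layers=1):
    """Number of elements for an assumed regular grid of square cells."""
    cells_per_layer = math.ceil(area_km2 * 1.0e6 / cell_size_m ** 2)
    return cells_per_layer * n_layers

# Hypothetical 100 km^2 catchment with 10 subsurface layers
coarse = element_count(100.0, 100.0, n_layers=10)  # 100 m cells
fine = element_count(100.0, 10.0, n_layers=10)     # 10 m cells

print(coarse, fine, fine // coarse)
```

Refining the horizontal cell size by a factor of 10 multiplies the number of elements by 100, before any change in vertical discretization or time step is considered.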

5.3. Solution Considerations

[102] Model solution schemes directly affect simulation output and are critical factors to consider in simulation configuration. The solution's procedures for representing coupled processes, determining time steps, tracking mass balance, and evaluating convergence are all important considerations in evaluating simulation output. For numerical models, solution schemes can differ substantially from one another depending on the constraints built into the computer code and may cause problems such as residual mass balance errors. Celia et al. [1990] showed that certain types of numerical approximations to the Richards' equation could lead to significant mass balance errors, sometimes higher than 10%, and they introduced a mass conservative method for solving the Richards' equation. van Dam and Feddes [2000] further demonstrated differences in numerical solutions of the Richards' equation due to varying nodal spacing and averaging of hydraulic conductivity. Applications of numerical models should include checks to ensure the solution is adequate for conserving mass and capturing the relevant flow processes.
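
A mass balance check of the kind recommended above can be sketched with a toy storage model. This is not the Richards' equation and not the method of Celia et al. [1990]; it is a hypothetical explicit linear store, dS/dt = P - kS, in which storage is clipped at zero. The function names, rate constant, and rainfall series are assumptions chosen for illustration.

```python
def simulate_linear_store(rain_rates, k, dt, storage=0.0):
    """Explicit-Euler linear store; returns (outflow volume per step, final storage)."""
    outflows = []
    for p in rain_rates:
        q = k * storage * dt                      # outflow from start-of-step storage
        storage = max(storage + p * dt - q, 0.0)  # clipping can create or destroy mass
        outflows.append(q)
    return outflows, storage

def relative_mass_balance_error(total_in, total_out, storage_start, storage_end):
    """Residual (in - out - change in storage) as a fraction of total inflow."""
    residual = total_in - total_out - (storage_end - storage_start)
    return residual / total_in

k = 0.8  # assumed rate constant, 1/h

# Same 10 mm of rain applied with a fine (0.5 h) and a coarse (2 h) time step
fine_rain, fine_dt = [10.0, 10.0] + [0.0] * 6, 0.5
coarse_rain, coarse_dt = [5.0, 0.0], 2.0

q_fine, s_fine = simulate_linear_store(fine_rain, k, fine_dt)
q_coarse, s_coarse = simulate_linear_store(coarse_rain, k, coarse_dt)

err_fine = relative_mass_balance_error(sum(fine_rain) * fine_dt, sum(q_fine), 0.0, s_fine)
err_coarse = relative_mass_balance_error(sum(coarse_rain) * coarse_dt, sum(q_coarse), 0.0, s_coarse)

print(err_fine, err_coarse)
```

With the fine time step the residual is at the level of floating-point round-off, whereas with the coarse time step the scheme reports more outflow than water available, which the check exposes as a large relative error.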

[103] One critical consideration relating to solution schemes is correct representation of boundary conditions. Water flow paths are influenced by boundaries, so analytical or empirical models that do not solve boundary value problems may miss key features of flow processes in space and time. Numerical models that do solve boundary value problems, however, face the challenge of assigning accurate boundary conditions, often a nontrivial task. Definable boundaries such as watershed divides and impermeable bedrock (when located properly) can reasonably be assigned no-flux conditions, and many numerical models can accommodate variable boundary conditions at the land surface. Other hydrodynamic boundaries, such as a catchment outlet, can be much more difficult to define. Because solutions are driven by assigned boundary values, effects of a boundary condition can extend well beyond the boundary location into the domain of interest; the accuracy of a boundary value problem solution is constrained by the ability to assign correct boundary conditions.

[104] Finally, a model's solution scheme will affect the computational resources required to operate the model. For example, models that have numerical solutions to Richards' equation will often require multiple iterations for each time step to converge on a solution, whereas models that solve only equations for fully saturated flow may require fewer iterations. The actual computational resources required for a model will relate to the solution scheme, number of model elements, number of subroutines, characteristics of the flow processes simulated, time stepping, length of simulation time period, and any other feature involved in the simulation calculations. Freeze [1974] reported that 98% of computational effort in his pioneering model simulations was consumed by subsurface calculations.
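
The link between nonlinearity and iteration count can be illustrated with a single implicit time step of a toy storage equation, dS/dt = -aS^b, solved by Newton iteration. This is only a sketch under stated assumptions: the equation, the function name, and the parameter values are hypothetical and stand in for the far larger systems that arise from discretizing Richards' equation.

```python
def implicit_step_newton(s_old, dt, a, b, tol=1e-10, max_iter=50):
    """One backward-Euler step of dS/dt = -a*S**b; returns (new storage, iterations)."""
    s = s_old
    for iteration in range(1, max_iter + 1):
        residual = s - s_old + dt * a * s ** b
        slope = 1.0 + dt * a * b * s ** (b - 1.0)
        s = s - residual / slope                      # Newton update
        if abs(s - s_old + dt * a * s ** b) < tol:    # converged for this time step
            return s, iteration
    raise RuntimeError("Newton iteration did not converge")

# Linear storage-discharge relation (b = 1): one update solves the step
s_lin, linear_iters = implicit_step_newton(1.0, dt=1.0, a=1.0, b=1.0)

# Nonlinear relation (b = 3): several updates are needed for the same step
s_nl, nonlinear_iters = implicit_step_newton(1.0, dt=1.0, a=1.0, b=3.0)

print(linear_iters, nonlinear_iters)
```

Each additional iteration in a distributed model requires re-evaluating the nonlinear terms for every element, so the cost per time step scales with both the iteration count and the number of elements.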

5.4. Logistical Considerations

At the present stage of hydrologic science, hydrologic modeling is most credible when it does not pretend to be too sophisticated and all inclusive, and remains confined to those simple situations whose physics is relatively well understood and for which the modeler has developed a good “common sense” within his primary discipline. Klemeš [1986, p. 181S]

[105] Ultimately, practical considerations play a significant role in selecting a distributed modeling technique, and James and Burges [1982] give a comprehensive overview of these types of considerations for model selection. However elegant theoretically, for a model to be useful, it must have a robust code that ensures that simulations behave as intended in the model's conceptual framework. Amorocho and Hart [1964] and Betson and Ardis [1978] place hydrologic modeling efforts in the realm of the arts, and Barnes [1995] added that creative vision and subjectivity are an essential part of the modeling process.

[106] With complex scientific models, the process of “modeling” involves both the code of the model to be used and the skill of the person using it. Accessibility is a key consideration, as some models are appropriate only for use in research groups, whereas others have been designed for more widespread use. Unlike readily available, comprehensively tested software packages from major software vendors, many distributed hydrologic models require extensive training and expertise. Many of the models described in this paper have steep learning curves and can be implemented most effectively by those involved in model development. Extensive experience with a particular model, especially if the modeler understands the strengths and limitations of the model structure, is invaluable.

[107] When selecting a model, a user should consider not only the structural details of a particular code but also the requirements for supporting software for input data preparation, model configuration, and/or visualization of model results. Some models designed for widespread application have accessible user interfaces and structures that link directly with common tools such as GIS; many research codes, in contrast, do not have user interfaces, so modelers must be capable of converting input and output files to the desired format.

[108] Given these challenges in implementing many distributed hydrologic models, before selecting a model, a prospective modeler should find out how accessible the model is to a new user and what resources are available for learning how to use the model. For users acquiring an existing research model, an important consideration is whether the model code is open source and freely available to the user; a modeler concerned about the details of a simulation algorithm should have the capability of evaluating the suitability of the model code and potentially modifying the code. Users should apply research codes cautiously, however, for they often contain pieced-together fragments written by many different researchers. Some codes are much more streamlined, well tested, and robust than others. As highlighted by Sasowsky [2006], model documentation and testing are crucial to ensure that results are reproducible and scientifically sound. It is not uncommon for codes to have unintentional errors, so only comprehensive testing can ensure that a model performs calculations as intended. It is important to understand the limitations of numerical or other solution schemes, particularly software and hardware numerical precision and mass balance conservation, when implementing a model. Users particularly concerned about model uncertainty should also consider whether the model architecture can adequately accommodate schemes for sensitivity analysis or propagation of model, data, and parameter uncertainty.

[109] Another important practical consideration is the type, quantity, and quality of available data. Field measurements provide time series of external variables that affect simulations, information on material characteristics that can be used to parameterize simulations, and data for model calibration and performance evaluation. In an ideal case, measurements would be available to support the level of detail incorporated in a model, though in reality, this rarely (if ever) happens. If detailed measurements for characterizing a model domain are not available, then detailed representations of flow processes in the domain can only be considered hypothetical.

[110] A modeler must also consider how the available data correspond with the architecture of the model. Models all have different structures for incorporating input meteorological (“forcing”) data. Some are designed to incorporate single values of distributed parameters and state variables, whereas others are intended to accommodate ranges or distributions of these model inputs. Some distributed modeling applications incorporate spatial or other data into simulations using data assimilation techniques; if a user wishes to incorporate these types of tools, then it is important to consider how well the model structure can accommodate a data assimilation scheme. In most circumstances, a model code can be modified to incorporate additional types of data into the model forcing and/or evaluation; however, many users may wish to select a code that has an existing configuration capable of merging modeled output with available data.

[111] Relative spatial and temporal scales of measurements and models are also important. If small length- or volume-scale measurements (e.g., soil water content measurements on the order of 10⁻⁶ m³) are to be used to calibrate a model, detailed discretizations of the subsurface may be necessary to ensure that model elements are comparable in scale to point measurements. Comparing point measurements, such as soil water contents, to simulation output for larger model elements may provide little to no insight into model performance. Similarly, if a model process representation is greatly simplified, then the simulated variables may behave in the model only as “indices” with little relation to physically measurable variables.

[112] Some distributed models are designed to enable comparison with spatial data such as remotely sensed images of the land surface or geophysical images of subsurface stratigraphy. If these images are to be used in parameterization or evaluation of a model, then the model should explicitly represent the dimensions of the image (XY for remote sensing and XYZ for geophysical), with consideration given to relative scales of image pixels and model elements. If the spatial data are to be used for model evaluation (e.g., remotely derived surface soil moisture), then it is important to ensure that the model representations of “soil moisture” at the land surface are, at least theoretically, analogous to the remotely derived measurements.

[113] Finally, for a practical application of a distributed model, the simulation run time, model output files, calibration objectives, and analysis goals should all fit within a manageable scope. With so many processes, model elements, time steps, and parameter values in a distributed model, the modeling task can easily grow to a daunting size. One strategy for keeping simulations manageable is to simulate detail in space and time only when necessary (e.g., higher spatial resolution at flow convergence zones or higher temporal resolution during storms). For model calibration, the number of parameters to be modified could be kept small both by minimizing the parameterization of the model and by identifying simulation sensitivity to different model parameters.
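
One simple way to identify which parameters merit calibration is a one-at-a-time sensitivity screen, sketched below. The runoff function is a hypothetical stand-in for a full model run, and the parameter names and values are assumptions made only for illustration; more thorough global sensitivity methods exist and may be preferable when parameters interact.

```python
def toy_runoff(params, rain=20.0):
    """Hypothetical stand-in for a model run; returns runoff depth (mm)."""
    infiltration = min(rain, params["infil_capacity"])
    excess = rain - infiltration
    return excess * params["runoff_fraction"] + params["baseflow"]

def oat_sensitivity(model, base_params, perturbation=0.10):
    """Relative output change per relative parameter change, one parameter at a time."""
    base_out = model(base_params)
    result = {}
    for name, value in base_params.items():
        perturbed = dict(base_params)            # copy so the base set is unchanged
        perturbed[name] = value * (1.0 + perturbation)
        result[name] = ((model(perturbed) - base_out) / base_out) / perturbation
    return result

base = {"infil_capacity": 12.0, "runoff_fraction": 0.8, "baseflow": 0.1}
sens = oat_sensitivity(toy_runoff, base)

# Rank parameters from most to least influential on the simulated output
ranked = sorted(sens, key=lambda name: abs(sens[name]), reverse=True)
print(ranked)
```

In this toy case the output is least sensitive to the baseflow term, so that parameter could be fixed at a reasonable value and excluded from calibration.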

6. Concluding Remarks

[114] We have introduced a framework for describing and comparing distributed hydrologic models based on the processes represented, the nature of the flow equations (physical, analytical, empirical, or conceptual), coupling, solution techniques, and resolution in space and time. We describe and compare nineteen distributed models that are either commonly used or have unique, or characteristic, approaches for representing spatial flow processes, and we suggest some general guidelines for distributed model selection.

[115] Given the complexity of the models and the processes they represent, the comparison framework is not intended to make judgments about the capabilities of different models but rather to provide a starting point from which to interpret and compare model results. If distributed models are to be used effectively as tools for prediction or for enhancing our understanding of hydrologic processes, it is imperative that we pay attention to the details of the model computational structure. The schemes used for the flow calculations in a model dictate the model output. For any modeling application, a modeler must understand the underlying assumptions of flow physics representations to ensure selection of a model appropriate to the task at hand. After critical consideration of the details of model structure and computation methods, we can use the many different distributed models that exist as important resources that should lead us to more effective methods for simulating water flow pathways at scales ranging from hillslopes to large watersheds.

[116] As we continue the dialogue and quest for effective distributed simulation methods, we should both find ways to learn as much as possible from existing approaches and be open to new ideas. The convergence of data, technology, and ideas gives us an opportunity to incorporate distributed simulations into new directions of investigation. One of the reviewers for this paper asked us to extend a challenge to the hydrologic community to discuss “simulation science,” our future computational needs, and how best to merge simulations with data collection and the scientific questions we wish to explore. We hope the community will welcome opportunities to engage in this discussion.

[117] Finally, for colleagues seeking key references for further reading, the six integrative works that follow give useful perspectives about what has been done and the major issues to consider when building distributed models. The paper by Freeze [1974] is as fresh today as when it was first published. The integrative book Hillslope Hydrology edited by Kirkby [1978] remains essential reading. James and Burges [1982] addressed specifically the topics of selection, calibration, and testing of hydrologic models. Spatial variability has been of concern for at least a century, and spatial variability of landscape hydrologic properties received significant attention starting in the early 1970s. A special issue of Agricultural Water Management, edited by Bouma and Bell [1983], contains 17 papers devoted to this topic. A special issue of Water Resources Research, edited by Burges [1986], provides historical context and challenges for the future. Singh and Woolhiser [2002] provide a comprehensive list of models (including some distributed models) and the historical context from which they were developed.


[118] We appreciate many helpful comments on modeling approaches from David Goodrich, Fred Ogden, Keith Loague, Andrew Western, and Sorab Panday. We have benefited considerably from the thoughtful and helpful comments from the reviewers and from Editor in Chief, Marc Parlange. This work has been funded in part by NSF EAR-0537410, the Osberg Fellowship, and the PEO scholar award.