
Keywords:

  • rainfall-runoff modeling;
  • sensitivity analysis;
  • hydroclimatic gradient

Abstract


[1] Lumped rainfall-runoff models are widely used for flow prediction, but a long-recognized need exists for diagnostic tools to determine whether the process-level behavior of a model aligns with the expectations inherent in its formulation. To this end, we develop a comprehensive exploration of dominant parameters in the Hymod, HBV, and Sacramento Soil Moisture Accounting (SAC-SMA) model structures. Model controls are isolated using time-varying Sobol sensitivity analysis for twelve MOPEX watersheds in the eastern United States over a 10 year period. Sensitivity indices are visualized along gradients of observed precipitation and streamflow to identify key behavioral differences between the three models and to connect these back to the models' underlying assumptions. Results indicate that the models' dominant parameters strongly depend on time-varying hydroclimatic conditions. Parameters associated with surface processes such as evapotranspiration and runoff generally dominate under dry conditions, when high evaporative fluxes are required for accurate simulation. Parameters associated with routing processes typically dominate under high-flow conditions, when performance depends on the timing of flow events. The results highlight significant inter-model differences in performance controls, even in cases where the models share similar process formulations. The dominant parameters identified can be counterintuitive; even these simple models represent complex, nonlinear systems, and the links between formulation and behavior are difficult to discern a priori as complexity increases. Scrutinizing the links between model formulation and behavior becomes an important diagnostic approach, particularly in applications such as predictions under change where dominant model controls will shift under hydrologic extremes.

1. Introduction


[2] Watershed models are valuable tools for predicting streamflow, particularly where historical observations are available for calibration. However, selecting an appropriate model structure for a given application is a challenging task. In practice, hydrologic models are often selected according to perceptions of the system, data availability, and modeling objectives [Mroczkowski et al., 1997; Uhlenbrook et al., 1999; Wagener and McIntyre, 2005; Sivakumar, 2008; Clark et al., 2011a], in addition to less objective criteria such as prior experience and ease-of-use. There remains a need for diagnostic methods to link model formulation to its consequent impacts on process-level behavior to inform model selection, calibration, and interpretation [Gupta et al., 2008]. This task is particularly vital for Predictions in Ungauged Basins (PUB) and Predictions Under Change (PUC) applications, where the absence of measured system behavior requires the inference that dominant model processes across a range of hydrologic conditions match the actual dominant processes [Ewen and Parkin, 1996; van Werkhoven et al., 2008a; Wagener et al., 2010]. This study compares the time-varying dominant parameters within three widely used lumped watershed models across a gradient of hydrologic conditions. By diagnosing multiple models across a hydroclimatic gradient, we can identify the critical differences in dominant parameters between the models under transitioning hydrologic extremes and trace these differences back to their underlying mathematical formulations. Even relatively simple hydrologic models can exhibit counterintuitive behaviors due to their nonlinearity.

[3] Many of the diagnostic methods applied to hydrologic models pursue the same goal: to evaluate the dynamic behavior of a model and its constituent processes with respect to the physical system. Nearly 40 years ago, McCuen [1973, p. 37] noted that “the time-dependent nature of sensitivity should be considered in the formulation of hydrologic models”. The popularity of model diagnostics in water resources applications has been motivated by the idea that the consistency of modeled and observed behavior and process controls, rather than model optimality with respect to some measures of performance, should be the primary objective of environmental model identification [Wagener and Gupta, 2005].

[4] Figure 1 shows an example of dynamic system behavior using hydrographs at the monthly scale for the Guadalupe (Texas) and Bluestone (West Virginia) Rivers, with eight colors superimposed to indicate whether streamflow, precipitation, and potential evapotranspiration fall above or below their respective medians in a given month. The blue, green, red, and yellow quadrants signify the relationship between streamflow and precipitation, while the dark and light shades of each color represent high and low potential evapotranspiration, respectively. Each colored quadrant suggests different dominant processes; for example, a month with low precipitation, high streamflow, and low PE (denoted by the light green color) likely indicates a release from storage, i.e., a baseflow-dominated regime. If a model structure accurately represents real-world processes, we would expect its dominant processes to change accordingly in time.
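The eight-way regime classification described above is straightforward to compute from monthly series. The following sketch uses illustrative function and label names of our own; it is not code from the study:

```python
# Classify each month into one of eight hydroclimatic regimes by comparing
# streamflow (Q), precipitation (P), and potential evapotranspiration (PE)
# to their respective medians, as in Figure 1.

def median(xs):
    """Median of a list of numbers (average of middle two for even length)."""
    s = sorted(xs)
    n = len(s)
    return (s[n // 2 - 1] + s[n // 2]) / 2 if n % 2 == 0 else s[n // 2]

def classify(q, p, pe):
    """Return a regime label per month, e.g. 'loQ-hiP-loPE'."""
    mq, mp, mpe = median(q), median(p), median(pe)
    labels = []
    for qi, pi, pei in zip(q, p, pe):
        labels.append("{}Q-{}P-{}PE".format(
            "hi" if qi > mq else "lo",
            "hi" if pi > mp else "lo",
            "hi" if pei > mpe else "lo"))
    return labels
```

A month labeled 'hiQ-loP-loPE' would correspond to the light green quadrant in Figure 1, the baseflow-dominated case discussed in the text.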


Figure 1. Monthly hydrographs for the Guadalupe River (Texas) and Bluestone River (West Virginia). The eight superimposed colors indicate whether streamflow, precipitation, and potential evapotranspiration fall above or below their respective medians in a given month, as shown in the quadrants to the left. The respective median values of streamflow, precipitation, and potential evapotranspiration for each watershed are as follows (in mm/month): Guadalupe (2.9, 50.7, 84.0); Bluestone (18.9, 75.5, 54.9). The blue, green, red, and yellow quadrants signify the relationship between streamflow and precipitation, while the dark and light shades of each color represent high and low potential evapotranspiration, respectively. Each color suggests a different dominant process in the physical system which a model should reproduce. For example, a month with low precipitation, high streamflow, and low PE (denoted by the light green color) likely indicates a release from storage, i.e., a baseflow-dominated regime. The frequent changes in these regimes highlight the need for time-varying, rather than static, sensitivity analysis.


[5] Using Figure 1 as motivation, a rigorous diagnostic method should explore process-level model behavior while accounting for spatial and temporal variability in hydrologic conditions. The recent history of hydrologic model diagnostics encompasses a variety of methods to address this need, including performance-based, top-down, and sensitivity analysis approaches.

[6] Performance-based diagnostic methods evaluate the suitability of a model for a particular application by comparing outputs of interest (typically, streamflow) to observations. Examples of such diagnostic methods include Chiew and McMahon [1994]; Perrin et al. [2001]; Merz and Blöschl [2004]; McIntyre et al. [2005]; Laaha and Blöschl [2006]; Martinez and Gupta [2010], and Kollat et al. [2011], in which rainfall-runoff models are evaluated based on their ability to reproduce streamflow observations across many watersheds. Such multicatchment approaches identify geographic regions of poor performance and thus potentially point to structural inadequacies in the model. Performance-based diagnostic methods have been extended to include comparisons across models and multimodel frameworks [e.g., Loague and Freeze, 1985; Franchini and Pacciani, 1991; Refsgaard and Knudsen, 1996; Gan et al., 1997; Wagener et al., 2001; Fenicia et al., 2006, 2007; Clark et al. 2008; deVos et al., 2010; Krueger et al., 2010; Staudinger et al., 2011], where optimal performance is used to choose between competing model structures on a per-watershed basis. In this regard, multimodel frameworks explore the effects of model structure on error and uncertainty. Performance-based approaches benefit from their practical focus on streamflow prediction, a common application area for lumped watershed models due to the widespread availability of flow data. One weakness of this approach is that optimal performance does not necessarily signify the proper representation of underlying system processes [Beven, 2001; Wagener, 2003; Clark et al., 2008; Yilmaz et al., 2008]. 
This issue can be mitigated by including multiple hydrologic measures of performance in addition to the typical statistical measures [e.g., Martinec and Rango, 1989; Michaud and Sorooshian, 1994; Yapo et al., 1998; Gupta et al., 1998; van Werkhoven et al., 2009], but in general it is very difficult to infer process-level behavior from statistical metrics alone. Opportunities remain for novel diagnostic methods to evaluate models according to process-level behavior in addition to output performance [Kuczera and Franks, 2002; Wagener et al., 2003; Gupta et al., 2008; Clark et al., 2011a]. These process-level shortcomings of performance-based diagnostic methods can be addressed with sensitivity analysis, which is often applied to quantitatively attribute the variability in performance to individual parameters.

[7] Sensitivity-based methods reveal the practical importance of structural assumptions without relying on optimization, which can produce good statistical performance regardless of model structural errors when sufficient degrees of freedom are present. Sensitivity analysis has a long history of application in hydrological modeling, particularly for exploring identifiability and uncertainty within complex parameter spaces and interpreting model behavior in the context of the system being modeled [Hornberger and Spear, 1981; Franchini et al., 1996; Freer et al., 1996; Wagener et al., 2001; Hall et al., 2005; Muleta and Nicklow, 2005; Sieber and Uhlenbrook, 2005; Bastidas et al., 2006; Demaria et al., 2007; Tang et al., 2007a, 2007b]. In particular, the need to identify the effects of parameter interactions on model behavior is a long-standing issue in the field [Clarke, 1973]. Two recent ideas based on sensitivity analysis will serve as the foundation for this study: first, that the controls on model performance will change across a hydroclimatic gradient [van Werkhoven et al., 2008a]; and second, that parameter sensitivities will change in time as real-world dominant processes change [Wagener et al., 2001, 2003; Sieber and Uhlenbrook, 2005; Cloke et al., 2008; Wagener et al., 2009; Reusser et al., 2011; Reusser and Zehe, 2011; O'Loughlin et al., 2012].

[8] Our study contributes a comprehensive diagnostic analysis of the Hydrologic Model (Hymod), the Hydrologiska Byråns Vattenbalansavdelning (HBV), and the Sacramento Soil Moisture Accounting (SAC-SMA) lumped watershed models for 12 watersheds in the eastern United States, combining the strengths of many of the diagnostic methods discussed above. Rather than measuring performance alone, we use Sobol sensitivity analysis to identify which model components control performance under different conditions. The sensitivity analysis is temporally discretized and incorporates watersheds from multiple hydroclimates to visualize model behavior across a gradient of hydrologic variability. This experiment leads to an intermodel comparison of dominant parameters, an analysis which is especially critical to understanding the complex relationships between model structure and behavior. Our study builds on the recent work of van Werkhoven et al. [2008a], Reusser et al. [2011], and Reusser and Zehe [2011], but our motivating questions differ significantly from those of these prior studies. In particular, we compare dominant parameters across multiple models under the exact same hydrologic conditions in order to better understand the process-level implications of model selection. Although it may be possible to anticipate a priori that the dominant parameters will differ across models, our analysis explores the many ways in which they differ, and how these differences change across a hydroclimatic gradient. Thus, we can understand the connections between model formulation, exogenous forcing variables, and the consequent dominant parameters for models of varying complexity. To our knowledge, this type of rigorous cross-model comparison of controls has not been published to date and thus constitutes a significant step forward from the methods proposed in prior work. 
Moreover, performing time-varying sensitivity analysis at the monthly scale enables us to investigate the effects of seasonal forcing fluctuations on dominant model controls.

2. Data and Models


2.1. Watershed Data

[9] Twelve watersheds in the eastern United States were selected to form a hydroclimatic gradient. The locations of these watersheds and their hydrologic properties are shown in Figure 2. Their climates range from very wet, such as the Amite and French Broad basins, to very arid, such as the Guadalupe and San Marcos basins. The 12 watersheds used here generally do not have significant human impacts, and are sized on the order of thousands of square kilometers; their dominant processes will likely differ from smaller headwater basins. For each watershed, streamflow, precipitation, and temperature data were taken from the Model Parameter Estimation Experiment (MOPEX) dataset [Duan et al., 2006], which provides these and other hydrologic data at a daily timescale for 438 U.S. watersheds. For many of the MOPEX watersheds, the available data range from 1 January 1948 to 31 December 2003, with occasional interruptions. Here we analyze a 10 year period within 1961–1972, preceded by a warm-up period to remove the effects of initial conditions. This period was selected because it contains uninterrupted daily data for the 12 watersheds in this study.


Figure 2. Locations and basic properties of the 12 MOPEX watersheds in the eastern United States. These watersheds represent a wide range of runoff coefficients (ROC) and aridity indices (AI). Adapted from van Werkhoven et al. [2008a].


2.2. Models

[10] The Hymod, HBV, and Sacramento Soil Moisture Accounting (SAC-SMA) models were investigated in this study. Schematics of all three models are shown in Figure 3, along with a coarse grouping of their parameters based on intended function. The models were modified to use the same simple degree-day snow model [Bergström, 1975] to ensure that any behavioral inconsistencies were not caused by differences in snow components.
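A degree-day snow model of the kind adopted here can be sketched in a few lines. The parameter names and default values below are illustrative, not those of Bergström [1975]:

```python
def degree_day_melt(temp, swe, ddf=2.5, t_melt=0.0):
    """One step of a simple degree-day snowmelt scheme (a sketch).

    Melt (mm) is proportional to the temperature excess above a threshold
    t_melt (deg C), with degree-day factor ddf (mm/deg C/day), and is
    capped by the available snow water equivalent swe (mm).
    Returns (melt, remaining swe)."""
    melt = max(ddf * (temp - t_melt), 0.0)  # potential melt from temperature excess
    melt = min(melt, swe)                   # cannot melt more snow than exists
    return melt, swe - melt
```

Using one shared scheme of this form in all three models ensures that differences in the sensitivity results cannot be attributed to differing snow components.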


Figure 3. Model schematics for the Hymod, HBV, and SAC-SMA conceptual watershed models. Storage elements are shaded, and flow partitions are shown as unshaded diamonds. Model parameters are shown in color and are grouped by the process to which they belong.


[11] Hymod is a parsimonious watershed model based on the Probability Distributed Model (PDM) [Moore, 2007], which is used operationally by the River Flow Forecasting System in the United Kingdom. The amount of soil storage in this model is controlled by a maximum size parameter (Cmax) and the coefficient governing the nonlinearity of the storage size distribution (B), which is intended to replicate spatial variability in the size of storage elements. Potential evapotranspiration is computed with the Hamon method [Hamon, 1961; Vorosmarty et al., 1998]. Actual evapotranspiration then depends on the potential amount and the saturation of the soil moisture store. The parameter α divides soil overflow into quick and slow routing, which are controlled by the rate constants Kq and Ks, respectively. The quick flow process is represented by a Nash cascade of three reservoirs, while the slow flow process contains only a single reservoir [Boyle et al., 2000; Wagener et al., 2001]. The simulated streamflow is then the sum of quick and slow flow.
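Hymod's routing stage, as described above, can be sketched as follows. The function and state names are ours, and the soil moisture accounting that produces the effective rainfall is omitted:

```python
def route(effective_rain, alpha, Kq, Ks, quick=(0.0, 0.0, 0.0), slow=0.0):
    """One routing step of a Hymod-like model (an illustrative sketch).

    Soil overflow (effective_rain) is split by alpha between a Nash cascade
    of three linear reservoirs (rate constant Kq) and a single slow linear
    reservoir (rate constant Ks). Returns (streamflow, quick states, slow state)."""
    out = alpha * effective_rain          # quick-flow share enters the cascade
    new_quick = []
    for S in quick:
        S = S + out                       # each reservoir receives the previous outflow
        out = Kq * S                      # linear outflow with rate constant Kq
        new_quick.append(S - out)
    slow = slow + (1.0 - alpha) * effective_rain
    slow_out = Ks * slow                  # single slow linear reservoir
    # Simulated streamflow is the sum of quick and slow flow.
    return out + slow_out, tuple(new_quick), slow - slow_out
```

With Kq = Ks = 1 every reservoir empties each step, so all effective rainfall reappears as streamflow immediately; smaller rate constants delay and attenuate the response.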

[12] The HBV model [Bergström, 1995; Seibert, 2000; Arheimer and Liden, 2000], used for operational flood forecasting in Sweden, contains elements of the PDM concept but is a more complex watershed representation. The soil moisture storage element contains the same mathematical formulation as that of Hymod, including Hamon evapotranspiration. Overflow from soil storage enters a shallow reservoir, where it exits via direct streamflow (with rate K1), spillover exceeding a value L (with rate K0), or percolation (with rate PERC) into the deep layer. The deep layer in the HBV model has no size limit, and storage here can only exit via the rate constant K2. In general, HBV employs the same surface model as Hymod, but reformulates the routing component to account for vertical rather than horizontal variability.
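The two-box response routine described above can be sketched as follows. This is our illustrative reading of the formulation, not the operational HBV code, and the ordering of the outflow computations is an assumption:

```python
def hbv_response(recharge, S1, S2, K0, K1, K2, L, PERC):
    """One step of an HBV-like response routine (a sketch).

    The shallow box S1 percolates at most PERC into the unbounded deep box
    S2, spills above threshold L at rate K0, and drains directly at rate K1;
    S2 drains only at rate K2. Returns (streamflow, S1, S2)."""
    S1 += recharge
    perc = min(PERC, S1)                  # percolation into the deep layer
    S1 -= perc
    S2 += perc
    q0 = K0 * max(S1 - L, 0.0)            # spillover above the threshold L
    q1 = K1 * S1                          # direct outflow from the shallow box
    S1 -= q0 + q1
    q2 = K2 * S2                          # slow outflow from the unbounded deep box
    S2 -= q2
    return q0 + q1 + q2, S1, S2
```

Note the contrast with the Hymod sketch: the reservoirs here are stacked vertically and connected by percolation, rather than arranged as parallel quick and slow pathways.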

[13] The Sacramento Soil Moisture Accounting (SAC-SMA) model [Burnash and Singh, 1995; Smith et al., 2003] is used by the National Weather Service for flood forecasting in the United States. It offers a very different conceptualization of watershed behavior than the previous two models. In SAC-SMA, a portion of precipitation becomes direct runoff depending on the amount of impervious area in the watershed, specified by the parameters PCTIM and ADIMP. Remaining water enters the upper zone, where it can exit via evaporation from tension water (UZTWM), streamflow from free water storage (UZFWM), or percolation to the lower zone. The same possibilities exist for the lower zone, with the addition of a supplemental storage element (LZFSM). A noteworthy difference in SAC-SMA is that all of its storage elements have maximum limits, unlike Hymod and HBV, which each have at least one theoretically infinite reservoir. An additional difference is that percolation in SAC-SMA is governed by moisture deficiency in the lower zone rather than spillover from the surface storage elements.

[14] We selected these models because they are widely applied in practice and represent a gradient of increasing complexity. To perform a sensitivity analysis, it is necessary to assign a prior distribution for each of the model parameters. In the absence of additional information, uniform prior distributions are assumed for this study. Figure 4 shows the allowable ranges for the parameters of all three models. Parameter ranges are based on recommendations from prior studies [Wagener et al., 2004; Moore, 2007; Seibert, 1997; Harlin and Kung, 1992; Anderson, 2002]. We would like the parameter ranges to produce acceptable performance to ensure that the sensitivity analysis captures a reasonable portion of the parameter space; this is confirmed with a table of performance metrics provided in the supporting information. However, our primary focus in selecting parameter ranges is not the quality of model performance, but rather to elicit a broad range of model responses in order to determine the parameter groups responsible for these variations.


Figure 4. Prior parameter ranges for all three models based on recommendations from previous studies. Uniform distributions are assumed for the sensitivity analysis.


3. Methods


3.1. Sobol Sensitivity Analysis

[15] Sobol sensitivity analysis [Sobol, 2001; Saltelli, 2002] is a global, variance-based method that attributes variance in the model output to individual parameters and their interactions. In previous work, this approach was found to provide the most accurate and robust sensitivity indices, particularly in models with strong parameter interactions [Tang et al., 2007a]. The number of model evaluations required by the Sobol method increases significantly with the number of parameters, but it remains a feasible option for the relatively simple models studied here.

[16] In the Sobol method, the decomposition of total output variance into its constituent parameters and their interactions can be written as:

  D(f) = \sum_{i=1}^{p} D_i + \sum_{i<j} D_{ij} + \sum_{i<j<k} D_{ijk} + \cdots + D_{12 \cdots p}    (1)

where D(f) represents the total variance of the output metric f; D_i is the first-order variance contribution of the ith parameter; D_{ij} is the second-order contribution of the interaction between parameters i and j; and D_{12⋯p} contains all interactions higher than third order, up to p total parameters. In this study, each parameter's total sensitivity index is used, i.e., its individual effects plus an estimate of its interactions with all other parameters. The first-order and total sensitivity indices are defined as follows:

  S_i = \frac{D_i}{D}    (2)
  S_{Ti} = 1 - \frac{D_{\sim i}}{D}    (3)

[17] The first-order index is simply the fraction of the total output variance comprised by a single parameter i. The total order index is one minus the fraction of total variance attributed to D_{∼i}, the variance contribution of all parameters except i. The total order index effectively removes parameter i from the analysis and attributes the resulting reduction in variance to that parameter. The difference between a parameter's first and total order indices represents the effects of its interactions with other parameters.

[18] Sensitivity indices were calculated according to the methods proposed by Sobol [2001] and Saltelli [2002, 2008], in which sensitivity indices are approximated using numerical integration in a Monte Carlo framework. A global sample of the parameter space is taken using a quasi-random Sobol sequence of values to achieve a uniform coverage of the space [Sobol, 2001]; in this study, N = 10,000 was used. The sampling ranges of model parameters can significantly affect sensitivity results if some parameter sets cause poor performance. In this study, we have adopted recommended sampling ranges from prior studies as described in section 2.2 and confirmed that the distributions of performance metrics (root mean square error (RMSE) and runoff coefficient error (ROCE)) do not contain extreme values that would skew our sensitivity results. The parameter sets generated from these sampling ranges are then evaluated in the model, resulting in a distribution of output values, f, which have a total variance D as follows:

  f_0 = \frac{1}{N} \sum_{s=1}^{N} f(\theta_s)    (4)
  D = \frac{1}{N} \sum_{s=1}^{N} f^2(\theta_s) - f_0^2    (5)

[19] This is a typical calculation of statistical variance, where f0 is the mean of the distribution and θs represents the parameter set associated with sample s. Identifying the variance contributions requires more complicated expressions derived by Sobol [2001] and Saltelli [2008] for the values D_i and D_{∼i} appearing in equations (2) and (3). First, the N sampled parameter sets are divided into two equal groups, A and B. The sample set A is used to calculate the total variance as shown in equations (4) and (5). The sample set B is used to resample or fix each parameter as necessary in the following expressions:

  D_i \approx \frac{1}{N} \sum_{s=1}^{N} f(\theta_s^A) \, f(\theta_{\sim i,s}^B, \theta_{i,s}^A) - f_0^2    (6)
  D_{\sim i} \approx \frac{1}{N} \sum_{s=1}^{N} f(\theta_s^A) \, f(\theta_{\sim i,s}^A, \theta_{i,s}^B) - f_0^2    (7)

[20] The parameter sets θ in equations (6) and (7) are sub- and superscripted to indicate which parameters are sampled from which set. The sample set is denoted by the superscript A or B; the parameters taken from that set are denoted either by i (the ith parameter) or ∼i (all parameters except i). This scheme allows the estimation of first and total order sensitivity indices with a total of N(p + 1) model evaluations, where p is the number of parameters for which indices are to be calculated. To incorporate time-varying sensitivity, this study repeats the entire analysis at a predefined temporal resolution rather than over a single aggregated time window. This approach aligns with other studies of this type [e.g., van Werkhoven et al., 2008a; Reusser et al., 2011; Reusser and Zehe, 2011]. One potential issue with this approach is that model errors in a given time period will manifest themselves as storage errors, which are liable to carry over to the following time period. This means that the sensitivity of error metrics will potentially reflect a combination of the current time period and accumulated errors from the previous period. This is an acknowledged issue with time-varying sensitivity analysis for which no simple solution exists, since updating model states would interfere with the normal influence of the parameters. However, the sensitivity of error metrics produced by our analysis will still identify the processes responsible for that error, regardless of whether those processes occurred during the current or prior period. Thus, the time-varying Sobol results in this study provide a valid interpretation of dominant model parameters.
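The estimation procedure of equations (4)–(7) can be sketched compactly. This is a minimal illustration assuming a generic scalar metric f; a plain pseudorandom sample stands in for the quasi-random Sobol sequence:

```python
import numpy as np

def sobol_indices(f, A, B):
    """Estimate first-order (S1) and total-order (ST) Sobol indices.

    f    : scalar output metric evaluated on one parameter set
    A, B : (N, p) arrays of independently sampled parameter sets
    Implements the Monte Carlo estimators of equations (4)-(7)."""
    N, p = A.shape
    fA = np.array([f(theta) for theta in A])
    f0 = fA.mean()                          # eq. (4)
    D = (fA ** 2).mean() - f0 ** 2          # eq. (5), total variance
    S1, ST = np.empty(p), np.empty(p)
    for i in range(p):
        ABi = B.copy(); ABi[:, i] = A[:, i]   # parameter i from A, rest from B
        BAi = A.copy(); BAi[:, i] = B[:, i]   # parameter i from B, rest from A
        Di = np.mean(fA * np.array([f(t) for t in ABi])) - f0 ** 2      # eq. (6)
        Dnoti = np.mean(fA * np.array([f(t) for t in BAi])) - f0 ** 2   # eq. (7)
        S1[i] = Di / D                        # eq. (2)
        ST[i] = 1.0 - Dnoti / D               # eq. (3)
    return S1, ST
```

As a sanity check, for a metric that depends only on the first parameter, e.g. f(θ) = θ₁, the first parameter should receive S1 ≈ ST ≈ 1 and all other parameters ≈ 0.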

3.2. Model Evaluation Metrics and Timescales

[21] The model output analyzed in the Sobol method can be as simple as the streamflow generated by the model. However, it is common to substitute a measure of model performance for the raw output, since we wish to analyze the controls on model performance rather than the output alone. The choice of output metric has been found to significantly impact measurements of model behavior and thus the sensitivity results [Diskin and Simon, 1977; Martinec and Rango, 1989; Michaud and Sorooshian, 1994; Yapo et al., 1998; Gupta et al., 1998; Yilmaz et al., 2008; Gupta et al., 2008; van Werkhoven et al., 2008b]. In this study, RMSE and ROCE are used as model output metrics. The RMSE metric is the root mean square of the residuals over a particular time window:

  \mathrm{RMSE} = \sqrt{ \frac{1}{T} \sum_{t=1}^{T} \left( Q_{s,t} - Q_{o,t} \right)^2 }    (8)

where Qs and Qo are the simulated and observed flows, respectively. The ROCE metric represents the error in the long-term water balance, calculated as the difference between the simulated and observed runoff coefficients:

  \mathrm{ROCE} = \left| \frac{\sum_{t} Q_{s,t}}{\sum_{t} P_t} - \frac{\sum_{t} Q_{o,t}}{\sum_{t} P_t} \right|    (9)

[22] Similarly, the choice of the temporal resolution at which these metrics are calculated will affect the sensitivity results [Tang et al., 2007a]. Here we use a monthly timescale alongside the RMSE objective. This combination of timescale and metric explores the model responses at a moderate temporal resolution, with a focus on peak flows. An annual timescale is combined with the ROCE metric to explore the long-term bias of the water balance calculated by the model. These two settings are intended to provide contrasting pictures of model responses and to highlight the importance of choosing an appropriate metric and timescale for a modeling exercise.
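Both metrics are straightforward to compute per evaluation window. The sketch below is ours; the ROCE form (absolute difference in runoff coefficients) is one plausible reading of equation (9):

```python
import numpy as np

def rmse(q_sim, q_obs):
    """Root mean square error over one evaluation window, as in eq. (8)."""
    q_sim = np.asarray(q_sim, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    return float(np.sqrt(np.mean((q_sim - q_obs) ** 2)))

def roce(q_sim, q_obs, precip):
    """Runoff coefficient error over one window: the absolute difference
    between simulated and observed runoff ratios (one reading of eq. (9))."""
    p_total = float(np.sum(precip))
    return abs(float(np.sum(q_sim)) / p_total - float(np.sum(q_obs)) / p_total)
```

Because RMSE squares the residuals, it emphasizes peak-flow errors within each monthly window, whereas ROCE aggregates all flows and so measures long-term bias, which is why the two metrics are paired with monthly and annual windows, respectively.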

3.3. Sorting Time Periods to Create a Hydroclimatic Gradient

[23] The applicability of a diagnostic model assessment in a particular watershed is necessarily limited to the observed hydrologic conditions during the simulation period. We would like to extend our experiment to include conditions which may arise outside the range of observed variability for a particular watershed. To achieve this, we apply a trading-space-for-time approach in which a spatial gradient of watersheds is used as a proxy for temporal hydrologic variability [Hundecha and Bárdossy, 2004; Yadav et al., 2007; Singh et al., 2011]. In order to compare model performance across a broad range of hydrologic conditions, we sort the months and years for all watersheds in our analysis according to the streamflow, precipitation, and potential evapotranspiration that occurred during that time period. This follows a prior study in which watersheds of moderate wetness were found to behave similarly to either wet or dry watersheds depending on the monthly and annual conditions [van Werkhoven et al., 2008a]. All 12 watersheds are allowed to mix during the sorting; in general, the driest watersheds account for the driest months and years and vice versa, but the potential for overlap exists. Figure 5 shows how the 12 watersheds combine to create hydroclimatic gradients.


Figure 5. Sorted hydroclimatic gradients containing all 12 watersheds for the annual and monthly timescales. Watersheds are allowed to mix during sorting, but the low-flow and high-flow periods generally correspond with the watersheds that are dry and wet on a long-term basis, respectively. Watersheds of medium wetness exhibit significant overlap, more so at the monthly timescale where hydrologic conditions are more variable. Streamflow, precipitation, and potential evapotranspiration quantities have been normalized by watershed area to remove size from the comparison of monthly and annual characteristics.


[24] As Figure 5 indicates, the driest months and years typically occur in the driest watersheds (Guadalupe and San Marcos), while the wettest months and years typically occur in the wettest watersheds (French Broad and Tygart Valley). A significant amount of overlap occurs between watersheds of medium wetness, particularly at the monthly timescale where individual periods are subject to greater variability. We sort the resulting sensitivity indices according to these same gradients. Combining watersheds together in this manner allows us to form a well-defined hydroclimatic gradient to interpret model controls with respect to a broad range of hydrologic characteristics that could not be captured within a single basin.
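Operationally, the trading-space-for-time step amounts to pooling the periods of all watersheds and ordering them by wetness. A minimal sketch with assumed array shapes:

```python
import numpy as np

def sort_by_gradient(sensitivities, wetness):
    """Order per-period sensitivity indices along a hydroclimatic gradient.

    sensitivities : (n_periods, n_params) array of Sobol indices, with the
                    months (or years) of all watersheds pooled together
    wetness       : (n_periods,) sorting variable, e.g. area-normalized
                    monthly streamflow, precipitation, or PET
    Returns (indices, wetness) sorted from driest to wettest period."""
    order = np.argsort(wetness)
    return sensitivities[order], wetness[order]
```

Applying the same function three times with streamflow, precipitation, and PET as the sorting variable yields the three panels of the gradient plots discussed in section 4.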

4. Results


4.1. Time-Varying Sensitivity Analysis

[25] Sobol sensitivity indices were computed at the monthly and annual timescales for all 12 watersheds using the RMSE and ROCE metrics, respectively. We expect the temporally discretized sensitivity analysis to provide significantly more information about model behavior than a temporally aggregated approach, better capturing the natural variability of hydrologic processes over time. An example of the inherently dynamic nature of sensitivity indices is shown in Figure 6 for a single watershed, the Bluestone River, West Virginia, on a monthly timescale for the decade beginning in the water year 1963.


Figure 6. Time-varying monthly sensitivity of the RMSE metric to model parameters for the Bluestone River, West Virginia. The seasonal streamflow cycles lead to clear patterns in the sensitivity indices. For example, the controls on the SAC-SMA model shift between surface parameters during dry months (RIVA, PCTIM) and lower zone parameters during high-flow months (LZFSM, LZFPM, LZSK, LZPK).


[26] The parameter sensitivities of all three models change significantly through time as streamflow undergoes its seasonal cycles. A sensitive parameter does not necessarily identify which processes are occurring in the actual watershed. Rather, it indicates that parameter variation within a specific range of values strongly influences the performance metric (RMSE in Figure 6), and the absence of a process can impact performance just as much as the presence of one. Sensitivity indices must be interpreted simply as controls on model performance. Several parameters show cyclical sensitivities with respect to the hydrograph, particularly in the SAC-SMA model. During low-flow months, the dominant parameters in SAC-SMA are the percent impervious area, PCTIM, and RIVA, which defines the riparian area subject to evapotranspiration. These sensitive parameters indicate that low direct runoff and high evapotranspiration are expected to influence the error in streamflow during dry months. During high-flow months, SAC-SMA is controlled by a combination of the lower zone storage maxima, indicating that high flows are primarily generated by overflows from lower zone storage. We observed similar results in the remaining eleven watersheds, although their streamflows (and thus their sensitivities) are typically less cyclical in nature. Plotting the time periods in chronological order, as in Figure 6, clearly shows the time-varying sensitivity of the parameters. However, in order to investigate the relationship between sensitivity and the hydroclimatic gradient, a different visualization approach is needed.

4.2. Monthly RMSE

[27] Plots such as Figure 6 provide interesting insight into the temporal nature of sensitivity indices—in this case, the sensitivity of monthly RMSE, which focuses on peak flow errors. However, they do not allow for simple, direct interpretation of the relationships between sensitivities and changing streamflow, precipitation, and temperature conditions, i.e., across a strong hydroclimatic gradient. This is achieved in Figure 7 for the RMSE objective by sorting the monthly sensitivity indices for all 12 watersheds along ascending gradients of streamflow, precipitation, and potential evapotranspiration.
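The sorting underlying Figure 7 is mechanically simple: pool the monthly total-order indices from all watersheds into one matrix, then reorder its rows by the chosen hydroclimatic variable. A sketch with illustrative array shapes (the random values stand in for real indices and flows):

```python
import numpy as np

rng = np.random.default_rng(0)

n_rows = 12 * 120                    # 12 watersheds x 120 months, pooled
n_params = 14                        # e.g., the SAC-SMA parameters varied
S = rng.uniform(0.0, 1.0, (n_rows, n_params))   # monthly total-order indices
Q = rng.gamma(2.0, 30.0, n_rows)                # monthly streamflow, same rows

order = np.argsort(Q)                # ascending hydroclimatic gradient
S_by_flow = S[order]                 # columns of Figure 7 now run dry -> wet

# repeating with precipitation or PET in place of Q yields the other panels
```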


Figure 7. Monthly sensitivity of RMSE for all 12 watersheds. All three panels contain the same sensitivity indices, but they are sorted separately according to monthly streamflow (Q), precipitation (P), and potential evapotranspiration (PET). The entire 10 year period is shown, i.e., each row of each panel contains (12 watersheds) × (120 months) = 1440 sensitivity indices. Sorting the indices in this way reveals clear patterns across the hydroclimatic gradient, particularly in the left panel when sorting by monthly streamflow. All 12 watersheds are sorted together—importantly, the monthly characteristics appear to determine sensitivity to a greater extent than the location of a particular watershed.


[28] Figure 7 provides a clear picture of the relationship between the monthly sensitivity of the RMSE objective and the hydroclimatic gradient. Because all 12 watersheds are included, the difference between wet and dry months comprises a broad range of conditions; Figure 7 confirms that sensitivity depends more on the monthly wetness than on the location of each watershed. The clearest patterns emerge in the left panel, where the indices are sorted by monthly streamflow. This is expected since streamflow data are used in the calculation of the RMSE metric. In the Hymod model, low-flow months are dominated by the parameters defining the size of soil moisture storage (Cmax and B) and α, which separates quick and slow flow. Hymod does not allow streamflow to occur until the soil moisture element has been saturated, which explains the importance of this storage threshold for controlling peak events during dry months. During high-flow months, α becomes increasingly dominant, and the quick flow rate constant Kq also becomes sensitive, denoting a shift in sensitivity of peak events from the soil store to the routing parameters as monthly wetness increases. Sensitivity indices for SAC-SMA are spread across a larger group of parameters. The percent impervious area (PCTIM) and riparian vegetation index (RIVA) are most sensitive in low-flow months, when low direct runoff and high evapotranspiration are required to accurately simulate the largest events of these dry months. The lower zone storage maxima LZFPM and LZFSM control SAC-SMA during high-flow months, since any water in excess of the storage maxima will become runoff. This finding contrasts with some common calibration strategies for the SAC-SMA model, in which the lower zone parameters are assumed to affect only low-flow periods [Hogue et al., 2006].
Finally, for the HBV model, the percolation quantity PERC and the shallow layer rate constant K1 are almost uniformly sensitive regardless of hydrologic conditions, indicating that variability in these parameters is the major factor in simulating peak flows and timing for this model. The parameter B (the exponent governing storage distribution) dominates during dry months but becomes unimportant as monthly streamflow increases, which represents similar behavior to the B parameter in Hymod. The presence of clear visual correlations between the sensitivity indices of all 12 watersheds and the hydroclimatic gradient in Figure 7 suggests that physical differences between watersheds (e.g., geology, land cover) may be indistinguishable when sorting by wetness over a fixed period. In general, the sensitivities during low-flow months correspond to months with high potential evapotranspiration. Relationships between the sensitivity of the RMSE metric and monthly precipitation are less clear, indicating that model output (streamflow) is more closely linked to parameter sensitivity than is model input (precipitation). This could occur because streamflow is directly involved in the calculation of performance metrics, or because the impact of precipitation on performance is obscured by the lag between precipitation and runoff response. However, the HBV model does reveal some relationships between sensitivity and precipitation, particularly for the parameters B (the soil storage exponent) and PERC (the amount of percolation from the shallow layer to the deep layer). The B parameter also appears sensitive during the months with the highest potential evapotranspiration, indicating that it controls the amount of water available for evapotranspiration during the hottest, driest months. Finally, because total-order sensitivity is used, the indices in Figure 7 do not necessarily sum to unity for each model due to the overlapping effects of parameter interactions [Saltelli, 2002]. Although total-order sensitivity is shown here, the sensitivity indices depend heavily on these interactive effects, and plots of the interactions themselves (total minus first-order effects) show the same trends across the hydroclimatic gradient as those shown in Figures 7 and 8.


Figure 8. Annual sensitivity of ROCE for all 12 watersheds, sorted by annual streamflow (Q), precipitation (P), and potential evapotranspiration (PET). The long-term water balance is dominated by only a handful of parameters in each model regardless of hydrologic conditions, although some clear relationships between sensitivity and annual streamflow are still visible.


4.3. Annual ROCE

[29] The ROCE metric identifies the error in the long-term water balance computed by the model. We calculate the ROCE metric at an annual timescale to allow for a longer aggregation window and to provide a measure of model performance with a very different focus than the monthly RMSE metric shown in Figure 7. Figure 8 shows a similar comparison of model controls across the hydroclimatic gradient for the ROCE metric at the annual timescale.
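The two metrics can be stated compactly. Assuming ROCE is the absolute difference between the simulated and observed runoff coefficients (total flow over total precipitation) within the aggregation window, a sketch with illustrative numbers is:

```python
def roce(q_sim, q_obs, precip):
    """Runoff coefficient error over one aggregation window (one year here):
    difference between simulated and observed runoff ratios.
    This formulation is an assumption; see the Methods section."""
    return abs(sum(q_sim) / sum(precip) - sum(q_obs) / sum(precip))

def rmse(q_sim, q_obs):
    """Root-mean-square error; over a monthly window this emphasizes peaks."""
    n = len(q_obs)
    return (sum((s - o) ** 2 for s, o in zip(q_sim, q_obs)) / n) ** 0.5

# a simulation with a constant positive bias tracks every peak (modest RMSE)
# yet misallocates the water balance (nonzero ROCE)
q_obs = [1.0, 2.0, 8.0, 3.0, 1.5, 1.0]
p     = [3.0, 5.0, 12.0, 6.0, 4.0, 3.0]
q_sim = [q + 0.5 for q in q_obs]
print(rmse(q_sim, q_obs), roce(q_sim, q_obs, p))   # 0.5 and ~0.091
```

Because the two metrics penalize different kinds of error, they can be controlled by entirely different parameters, as Figures 7 and 8 show.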

[30] Figure 8 shows that the sensitivity of the long-term model bias is restricted to only a few parameters for each model. The ROCE metric defines the ability of a model to simulate the water balance of a system by properly partitioning precipitation between evaporative losses and release via streamflow. Release has been defined as one of the fundamental watershed processes with which regimes of watershed behavior may be classified [Wagener et al., 2007]. Based on previous studies, we can expect these sensitive parameters to be at least tangentially related to the evaporative fluxes in each model, since this is the primary route via which water exits the system [van Werkhoven et al., 2008a]. For Hymod, parameter B dominates the long-term bias almost exclusively; as expected, this parameter controls the size of the soil moisture element and thus the amount of water available for evaporation. In SAC-SMA, the driest years are controlled by the PFREE parameter, while the majority of remaining years are dominated by the lower zone tension storage, LZTWM. Evaporation occurs from the tension stores in SAC-SMA proportional to their respective saturation levels, and the PFREE parameter in this context controls the amount of flow entering the lower zone tension store as opposed to the primary and secondary stores. The HBV model is dominated by the B parameter during dry years, similar to Hymod, but the percolation rate PERC also appears during the wettest years. The strong influence of the PERC parameter in the HBV model during high-flow years suggests that it is effectively sequestering water in the infinite lower store, preventing it from exiting the system, even though this parameter is not involved in evaporative fluxes and would not ordinarily be expected to affect the water balance. This finding indicates that selecting behavioral parameter sets on the basis of RMSE alone will not guarantee an accurate water balance.
This is potentially unexpected behavior of the HBV model, revealed by the sensitivity analysis and confirmed by recorded sample time series of the lower zone storage quantity, shown in the supporting information. In general, the long-term water balance for these models is indeed controlled by parameters associated with evaporative fluxes. The dominant parameters for the annual ROCE metric are largely different from those shown in Figure 7 for the monthly RMSE metric, highlighting the importance of selecting an appropriate metric and timescale for a modeling application. Again, dry periods correspond to periods of high potential evapotranspiration, and few meaningful patterns emerge when sorting by precipitation. We have supported the conclusions from Figures 7 and 8 by performing a sensitivity analysis of grouped model components, the results of which are shown in the supporting information. Importantly, Figures 7 and 8 show that these three models behave very differently across a hydroclimatic gradient, the implications of which will be discussed in the next section.

5. Discussion


5.1. Linking Dominant Parameters to Model Formulations

[31] Figures 7 and 8 show that the sensitivities of model parameters share a strong relationship with observed streamflow. The monthly RMSE metric focuses on model performance during peak flows over relatively short windows; in general, we can expect this metric to be controlled by surface parameters during dry months, and lower zone parameters during wet months. By contrast, the annual ROCE metric focuses on the ability of a model to reproduce the observed long-term water balance, and thus is typically most sensitive to parameters involved in evaporation processes and other means of water permanently exiting the system. A simplified, qualitative summary of these results is shown in Figure 9. In Figure 9, parameters which generally exhibit total-order sensitivities greater than 0.3 during either dry or wet periods are classified as sensitive.


Figure 9. A qualitative summary of dominant model components for the (left) monthly RMSE and (right) annual ROCE metrics as streamflow increases from low to high. Dominant parameters are highlighted in gray. Comparing the controls across models reveals fundamentally different dominant processes, suggesting that these models should not be used to infer true physical processes without additional information about the system being modeled. The tendency for each model to be controlled by a certain component can also inform the model selection process.


[32] The differences in dominant controls across these three models can be attributed to their contrasting mathematical formulations. For the RMSE metric, surface parameters tend to dominate during low-flow months across all three models. However, these parameters do not necessarily represent the same physical processes. In Hymod and HBV, the surface storage limit (Cmax and Fcap, respectively) and nonlinear storage coefficient (B) are classified as sensitive during low-flow months—a reasonable similarity, considering their identical soil moisture formulations. In these models, streamflow cannot occur until the soil moisture store is filled, so these parameters define the threshold at which flow occurs during low-flow months. During high-flow months, the surface storages of Hymod and HBV are likely saturated regardless of their parameterizations and thus do not contribute to variance in output performance. In the SAC-SMA model, however, the parameters dominating the RMSE metric during low-flow months are PCTIM and RIVA, which represent the percent impervious area and the riparian vegetation area, respectively. The sensitivity of these parameters during low-flow months indicates that they represent a model process similar to that of Cmax, Fcap, and B in the Hymod and HBV models, allowing water to leave the system via direct runoff and evaporation, although the physical processes they are intended to represent are entirely different.
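The threshold behavior described above can be illustrated with a minimal nonlinear store in the spirit of the shared Hymod/HBV soil moisture formulation, where the fraction of rainfall becoming runoff grows with relative saturation. The runoff function and parameter values below are illustrative assumptions, not the calibrated forms used in the study:

```python
def soil_moisture_step(storage, precip, s_max, b):
    """One time step of a nonlinear soil moisture store: the runoff
    fraction rises from ~0 (dry) toward 1 (saturated) as
    (storage / s_max) ** b. Illustrative parameterization."""
    runoff = precip * (storage / s_max) ** b
    storage = min(storage + precip - runoff, s_max)
    return storage, runoff

s_max, b = 300.0, 1.5
# dry store: almost all rain infiltrates, little runoff is produced, so
# s_max and b control whether and when flow occurs at all
s, r_dry = soil_moisture_step(30.0, 10.0, s_max, b)
# near-saturated store: most rain becomes runoff regardless of the exact
# value of s_max, so the storage parameters lose influence
s, r_wet = soil_moisture_step(290.0, 10.0, s_max, b)
print(r_dry, r_wet)
```

This is why the storage parameters dominate during dry months but fade once the store is effectively saturated during wet months.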

[33] During high-flow months, the RMSE metric is controlled by different components across the models. Hymod is dominated by a mix of the α and Kq routing parameters, likely due to the importance of quick routing for matching the magnitude and timing of peak flows. SAC-SMA is controlled by its lower zone parameters during high-flow months, owing to its mathematical formulation: the lower zone in SAC-SMA is filled according to its moisture demand, whereas the lower layers in Hymod and HBV are only passive receptors of spillover from above. The importance of this demand-driven percolation function in the SAC-SMA model was also noted by van Werkhoven et al. [2008a]. Finally, during high-flow months the HBV model is controlled by a combination of its percolation parameter and the rate constant K1, suggesting similar behavior to that of Hymod. Importantly, Figure 9 indicates that the dominant physical processes one would infer from a modeling exercise will depend on the choice of model. The various formulations of these models inherently emphasize different processes, and the choice of model structure must therefore be justified with respect to the physical system.

[34] As shown in the right panel of Figure 9, the parameters responsible for the annual ROCE metric will be those that contribute to water leaving the system, typically either via evaporation or runoff. For Hymod, parameters associated with the surface storage element (from which evaporation occurs) dominate the long-term water balance regardless of streamflow and forcing characteristics. As expected, the upper and lower tension storages in the SAC-SMA model control the long-term water balance (or bias). These storage elements are the primary means of evaporation in the model. Also present are the PCTIM and PFREE parameters; the former controls direct runoff, while the latter in this context is responsible for percolation between the upper and lower tension zones. Figure 9 shows that the long-term water balance in the HBV model is slightly more complicated. During low-flow years, the expected combination of Fcap and B is classified as sensitive, similar to Hymod. However, during high-flow years, HBV is dominated by the PERC and K1 parameters, neither of which affects the evaporative fluxes from the soil storage zone. These two parameters are responsible for the division of water between the shallow and deep layers. Because the deep reservoir in HBV is infinite, it is “removing” water from the system by storing it for the duration of the model period when the rate constant K2 is sufficiently slow. This is a significant example of model formulation impacting performance in an unexpected way. A plot showing the constant growth of HBV's lower zone storage element under wet conditions is shown in supporting information. Again, these results show that conceptual models will not necessarily exhibit the same process-level behavior, regardless of whether the output streamflow can be fit to an observed time series.
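The sequestration effect described for HBV can be reproduced with a minimal two-bucket sketch: constant percolation feeds a lower store drained at a linear rate K2, and when K2 is small the store grows throughout the simulation period, effectively withholding water from the simulated water balance. All values are illustrative:

```python
def lower_store_trace(perc, k2, n_days, s0=0.0):
    """Daily trace of an HBV-style lower (deep) storage:
    inflow = constant percolation, outflow = k2 * storage."""
    s, trace = s0, []
    for _ in range(n_days):
        s += perc          # percolation from the shallow layer
        s -= k2 * s        # linear release to streamflow
        trace.append(s)
    return trace

slow = lower_store_trace(perc=2.0, k2=0.001, n_days=3650)   # sluggish release
fast = lower_store_trace(perc=2.0, k2=0.1,   n_days=3650)   # quicker release
# with a slow K2 the store keeps growing through the 10 year period,
# so percolated water never returns as streamflow within the simulation
print(slow[-1], fast[-1])
```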

[35] These differences in dominant parameters are supported by an analysis of streamflow contributions from the different layers of each model, as shown in Figure 10. The divisions between surface and subsurface flows are shown for one year of daily data in the French Broad watershed, which is one of the wetter watersheds in this study. In the Hymod model, precipitation events are accompanied by spikes in quick flow, while low-flow periods are composed primarily of slow flow. This is also the case in the HBV model, with the addition of overflow from the shallow layer during the peak events. On the other hand, streamflow generated by the SAC-SMA model is almost always controlled by subsurface flows, even during peak events. This supports the findings in the sensitivity analysis where the lower zone of SAC-SMA is most sensitive during wet periods due to its demand-driven percolation function. Figure 10 does not contain any truly dry periods, since the French Broad River maintains fairly steady flow year round. The differences in model behavior in response to precipitation events reinforce the idea that the dominant processes inferred from a model will depend on the choice of model, even for the same watershed and time period.


Figure 10. Flow contributions for the three models, divided into surface and subsurface components. The Hymod and HBV models respond to precipitation events with spikes in near-surface flows. The SAC-SMA model is controlled by subsurface flows, even during peak events, as a result of its demand-driven percolation function. The models exhibit different dominant flow contributions, which supports the results of the sensitivity analysis.


[36] On a per-model basis, the results shown in Figures 9 and 10 are generally in agreement with sensitivity analyses performed in prior studies. Wagener et al. [2003] and Yang [2011] each found the performance of the Hymod model to be controlled by surface parameters during low-flow conditions and the quick routing parameter during high-flow conditions, despite using very different watersheds and model variants. Lidén and Harlin [2000] used probability distributions of best-performing solutions to estimate the sensitivity of parameters in the HBV model across watersheds spanning multiple continents; the driest watershed in their study was dominated by the storage parameter Fcap, while the wettest watersheds revealed much higher sensitivities to the rates PERC and K1. Abebe et al. [2010] performed both sensitivity and identifiability analyses on the HBV model in the Leaf River, Mississippi, finding that the RMSE metric is typically dominated by routing parameters (specifically, MaxBas), while the water balance metric depends more on the surface parameters Fcap and B. Finally, van Werkhoven et al. [2008a] found that for the SAC-SMA model, the performance of the ROCE metric is almost always controlled by the LZTWM parameter across the same 12 watersheds studied here. They found that the RMSE metric is sensitive to a mix of impervious area and lower zone parameters in dry watersheds and a combination of LZFSM and LZFPM in wet watersheds; these results held when the analysis was discretized on an annual basis. This study has built on these prior findings by contributing a rigorous exploration of model sensitivity across a gradient of hydrologic conditions in space and time to begin to generalize the understanding of model sensitivity. 
The temporal discretization of sensitivity indices allows us to compare model controls across hydrologic conditions both within and across watersheds, combining the insights of a dynamic analysis [Wagener et al., 2003; Uhlenbrook et al., 1999; Pappenberger et al., 2008; Reusser et al., 2011; Reusser and Zehe, 2011] with those of a long-term hydroclimatic analysis [van Werkhoven et al., 2008a]. The differences in dominant controls can be traced back to differences in model formulations, often resulting in counterintuitive diagnostic insights.

5.2. Broader Implications of Model Differences

[37] The results of this study indicate that model performance across a gradient of hydrologic conditions is controlled by components intended to represent different physical functions across the Hymod, HBV, and SAC-SMA models. This is not a surprising conclusion; any model will necessarily emphasize certain processes while excluding others, and different formulations have arisen to meet the needs of diverse modeling applications. It follows naturally that a comparative sensitivity analysis will reveal contrasting controls across multiple models. In the context of this study, contrasting controls can be viewed as both a strength and a weakness of watershed modeling. In many ways, modelers benefit from the existence of multiple hypotheses to explain the behavior of a hydrologic system. Given sufficient system knowledge, a modeler could use the comparisons presented here to identify a single appropriate model for an application, or a collection of appropriate model components to be applied in a multimodel [e.g., Clark et al., 2008] or Bayesian model averaging [e.g., Neuman, 2003; Duan et al., 2007] framework. Model averaging benefits from an understanding of model controls to identify which components display overlapping behavior with other models and which contain unique information. In this study, the true dominant system processes remain unknown, so our analysis does not identify which model is “best” for these particular watersheds, metrics, and timescales. However, Figures 7 and 8 clearly highlight the benefits of this approach for the purposes of diagnosing model performance based on the behavior of its underlying components.
It has been suggested that models be built or selected based on an evaluation of dominant processes within the system [Grayson and Blöschl, 2001; Sivakumar, 2008], and tools such as the sensitivity approach in this study can be applied to help select a model capable of reproducing the desired system behavior. Contrasting model behaviors, once identified, can be exploited to the advantage of the user. However, these contrasting behaviors also complicate the interpretation of conceptual model performance. For instance, it can be tempting to use the results of a sensitivity analysis to infer dominant processes in the physical system. This study cautions against such inferences; in the absence of additional system data, the dominant processes one infers from a conceptual modeling exercise will strongly depend on the initial choice of model. Thus, it is very difficult to interpret the dominant controls in a lumped watershed model beyond the model's intended function, which is simply to convert forcing data into accurate estimates of streamflow. However, we can interpret dominant controls in the context of individual model formulations, which allows us to understand the effects of the underlying mathematical assumptions on performance across a range of conditions.

6. Conclusion


[38] This study has built on the findings of recent studies by analyzing the temporal discretization of sensitivity over a well-defined hydroclimatic gradient. Our sensitivity results for individual models reveal which components are responsible for model performance across metrics, timescales, and importantly, across a gradient of hydrologic conditions. The results highlight significant intermodel differences in dominant parameters, even in cases where the models share the same process formulation (e.g., the soil moisture component shared by the Hymod and HBV models). The dominant parameters identified are often counterintuitive; even these simple models represent complex, nonlinear systems, and the links between formulation and behavior are difficult to discern a priori as complexity increases. Furthermore, the sensitivities of model parameters will change significantly in time as hydrologic conditions change, which highlights the value of time-varying sensitivity analysis to extract the maximum possible information from variable model behavior [Wagener et al., 2009]. Finally, analysis across a rigorous hydroclimatic gradient permits a more general understanding of model sensitivity, ideally providing diagnostic value in diverse, heterogeneous watersheds where we must evaluate the realism of candidate models in terms of their dominant processes [Wagener, 2003]. We plan to extend this analysis of lumped models to distributed watershed models to explore the spatial variability of sensitivity, building on prior studies [Tang et al., 2007b; van Werkhoven et al., 2008b]. Isolating the dominant assumptions in a model structure is essential to choosing an appropriate system representation for Predictions in Ungauged Basins and Predictions Under Change, both of which remain critical obstacles to reliable hydrologic forecasting.

Acknowledgments


[39] The authors of this work were partially supported by the U.S. National Science Foundation under grant EAR-0838357. The computational resources for this work were provided in part through instrumentation funded by the National Science Foundation through grant OCI-0821527. Any opinions, findings, and conclusions are those of the authors and do not necessarily reflect the views of the U.S. National Science Foundation.

References


Supporting Information


Additional supporting information may be found in the online version of this article.

Filename                           Format   Size      Description
wrcr20124-sup-0001-suppinfo01.eps  EPS      21885K    Figure 11
wrcr20124-sup-0001-suppinfo02.eps  EPS      8082K     Figure 12
wrcr20124-sup-0001-suppinfo03.eps  EPS      868K      Figure 13
wrcr20124-sup-0001-suppinfo04.eps  EPS      111974K   Figure 14

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.