We investigate the observed positive trends in annual runoff in several basins in central Nebraska using the Budyko hypothesis as a diagnostic tool. In basins where runoff is dominated by base flow we found that the estimated annual evapotranspiration (ETa) to precipitation (P) ratio (ETa/P) from data is negatively related to the aridity index (ETp/P, where ETp is potential annual evapotranspiration). This observation is inconsistent with the Budyko hypothesis. We hypothesized that the observed negative trend results from significant interannual changes in basin water storage. This hypothesis is tested using data from groundwater monitoring wells in the Sand Hills region of Nebraska. Plots of the yearly changes in groundwater storage versus the annual aridity index revealed the mean annual aridity index as a critical climatological variable that controls basin storage gain-loss dynamics. For the same absolute deviation from the mean climate, we found that a wetter year leads to a larger gain in groundwater storage than the net loss in a drier year. We argue that this storage gain-loss behavior builds a climate memory in the hydrologic system, causing persistence and statistically significant trends in annual runoff. A parsimonious model was developed that couples the Budyko hypothesis with a linear reservoir equation for base flow and was used to examine the possible causes of observed positive trends of annual runoff. We found that subtle, statistically insignificant, increases in annual P have led to positive and statistically insignificant trends in annual ETa and P − ETa. Annual runoff, on the other hand, was predicted to have high persistence and statistical significance, consistent with observations. 
Further model sensitivity analyses showed that increasing the size of the groundwater reservoir is associated with increased long-term (multidecadal) persistence in annual runoff and translates high-frequency, high-amplitude variation in climate to low-frequency, low-amplitude runoff response. Our results underscore the importance of evaluating apparent trends in any system variable in a complete water budget context.
The success of adaptive water resources management to meet growing societal demands under constantly changing environmental conditions (e.g., climate, land use, and urbanization) is contingent on the success and credibility of hydrologic predictions in time and space [Vörösmarty et al., 2000; Middelkoop et al., 2001; Barnett et al., 2005; Intergovernmental Panel on Climate Change, 2007; Wagener et al., 2010]. Identifying and understanding the sources of natural variability and trends in hydrologic variables in relation to both hydrologic processes and their driving forces is fundamental for improved water resources predictions at scales relevant to society. This improved understanding remains one of the grand challenges in hydrology [e.g., Eltahir and Yeh, 1999].
 A prime example of such a challenge is the observed positive trend in streamflow across broad regions of the United States, excluding the Pacific Northwest and the Southeast, in the 20th century [Lins and Michaels, 1994; Lins and Slack, 1999; Milly and Dunne, 2001; McCabe and Wolock, 2002]. While a general precipitation increase has been noted for most of the United States, temperature has also gone up during the same period [Groisman et al., 2004], and droughts have, for the most part, become shorter, less frequent, less severe, and covered a smaller portion of the country over the last century [Andreadis and Lettenmaier, 2006]. In addition, land use change and anthropogenic disturbances have extensively altered the environment during this time frame, adding another level of complexity to deciphering the underlying causes of the observed hydrologic trends.
Aside from urbanization, crop areas have grown extensively, especially in the Upper Mississippi River, the Ohio River, and the Missouri River basins, where runoff amounts have also gone up [Raymond and Cole, 2003; Garbrecht et al., 2004; Zhang and Schilling, 2006]. A number of studies have pointed to increased regional precipitation [Milly and Dunne, 2001; Garbrecht et al., 2004; Jha et al., 2004] and intensified anthropogenic disturbances [Zhang and Schilling, 2006] as major causes of streamflow increases in the Mississippi River basins. Zhang and Schilling [2006] hypothesized land use change as being the major underlying cause of hydrologic change, arguing that the conversion of perennial vegetation to seasonal row crops led to reduced evapotranspiration (ET), increased groundwater recharge, and thus increased base flow and streamflow. Milly and Dunne [2001], on the other hand, showed that ET in the Mississippi River basin actually increased from 1949 to 1997 (in association with higher precipitation and consumptive water use), but that the larger trend in precipitation still led to an increase in observed streamflow. Others have demonstrated the influence of groundwater pumping for irrigation in southern Nebraska and northern Kansas, where streamflow has dropped dramatically in the second half of the 20th century [Szilagyi, 2001; Wu et al., 2008].
 These effects also play out differently whether streamflow is generated via overland flow (from infiltration excess or saturation excess) or groundwater [Winter, 2001] and are influenced by the regional interplay between wetting and drying forces [Sankarasubramanian et al., 2001; Sankarasubramanian and Vogel, 2003]. In the Great Plains and the Central Lowlands regions of the United States, where trends in annual runoff have been identified, flat topography and deeper soils facilitate the development of groundwater. Especially where soil permeability is high, groundwater provides relatively stable streamflow and longer recession periods, leading to significant lags between precipitation and streamflow [e.g., Changnon et al., 1988; Eltahir and Yeh, 1999]. Basins with lower permeability, on the other hand, often show a higher degree of flow variability and faster hydrologic response to precipitation [Winter, 2001].
Given the strong nonlinearities and threshold dependence of hydrologic fluxes on soil moisture and groundwater storage, as well as linkages to physical and biological basin characteristics, a richness of hydrologic response can be observed depending on prevailing climate, seasonality, geology, soil type, topography, and vegetation [Milly, 1994; Dooge et al., 1999; Knapp et al., 2002; Niemann and Eltahir, 2005; Istanbulluoglu and Bras, 2006; Wang et al., 2009]. In trying to understand and quantify the underlying causes of hydrologic change, short- and long-term memory in hydrologic response can also contribute to complexities and misinterpretations of data [Mandelbrot and Wallis, 1969; Lettenmaier and Burges, 1978; Rao, 2001]. The question of whether the observed streamflow increases in central portions of the United States indicate “trends” caused by climate and environmental change or whether they are simply a reflection of inherent long-term variability superimposed upon short-term environmental change (or both) is one of the key global change research questions in hydrology.
In this paper we investigate the observed trends in annual runoff in central Nebraska in relation to soil texture and groundwater dynamics in several selected basins, where anthropogenic influences on regional hydrology are known to be insignificant. We concentrate on three aspects of the hydrologic system in relation to the predictability of hydrologic trends: the role of geology and soil texture, annual precipitation trends, and interannual changes in basin water storage. We used the Budyko hypothesis (BH) as a diagnostic tool to examine the influence of groundwater on annual runoff response and observed runoff trends. The BH relates the partitioning of precipitation (annual or longer) into evapotranspiration (and runoff) directly to climate. In the application of the BH, basin evapotranspiration is estimated from water balance, first by neglecting the role of storage, and second by incorporating the changes in basin water storage, quantified from observed basin-averaged water table fluctuations. Our data analysis in a regional basin led to the development of a parsimonious model of annual runoff, which is then used to examine the modulating effect of groundwater on annual runoff in response to precipitation forcing. In sections 2 and 3, we describe the study region and illustrate observed patterns of annual runoff (in relation to soil texture and climate) that motivated this study.
2. Study Site
 We examined four basins in the state of Nebraska (central United States) with different soil textures and varying levels of base flow contribution to annual runoff, and investigated the differences in their annual hydrologic response. Climatic and hydrologic observations were obtained for the Elkhorn (18,100 km2), Loup (39,200 km2), and Niobrara (39,900 km2) river basins (Figure 1a). The regional climate is generally semiarid to subhumid continental and is characterized by a strong precipitation gradient from ∼350 mm yr−1 in the west to ∼750 mm yr−1 in the east. Native grasslands are dominant across the landscape, with limited croplands [Dappen et al., 2007], except for the Elkhorn basin where croplands are more significant. The central portion of the study area (red boundary in Figure 1) is known as the Nebraska Sand Hills (NSH). Soils in the NSH are characterized by Holocene eolian sand deposits overlying Quaternary and/or Pliocene alluvial sand and silt (Figure 1a) [Bleed and Flowerday, 1989]. Because of the high infiltration capacity of sand [Wang et al., 2008], ∼30% of the High Plains Aquifer storage lies under the NSH [Weeks and Gutentag, 1988]. The average aquifer thickness is ∼170 m and has a bowl-like shape, ranging from >305 m depth in the west central NSH to only a few meters along the boundaries of the NSH [Chen et al., 2003]. This leads to a strong groundwater influence on streamflow in NSH rivers [McGuire and Fischer, 1999; Wang et al., 2009]. As a result, the region exhibits a wide spectrum of hydrologic response, from strong groundwater control in the center portions of the region to more surface and shallow subsurface-driven systems near the boundaries of the region [Wang et al., 2009].
The regional groundwater flows eastward as land elevation gradually dips to the east, feeding the basins toward the east of the NSH. However, the contribution of the regional groundwater flow to the annual water balance of the watersheds studied in this paper is not known. Most published studies have focused on lake water balance in the NSH and involved basins much smaller than our study basins [e.g., Winter, 2001; Winter et al., 2003]. There are, however, a few numerical modeling studies that can be used to infer the role of groundwater in the NSH. Schaller and Fan [2009] used the ratio of observed stream discharge (Qr) to modeled basin recharge (Rg) to assess groundwater contributions to streamflow across the continental United States. Basin recharge was derived from 50 years of hydrologic simulations using the VIC model. A basin is considered to be a net groundwater exporter when Qr/Rg < 1, a groundwater importer when Qr/Rg > 1, or a basin having no net gain or loss of groundwater when Qr/Rg ≈ 1. Except for the headwaters of the Middle Loup with Qr/Rg > 1 and parts of the lower Elkhorn with Qr/Rg < 1, the other basins in our study region have Qr/Rg ≈ 1 [Schaller and Fan, 2009].
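The Qr/Rg criterion described above can be expressed as a small helper. This is only a minimal sketch: the function name and the tolerance used to decide when the ratio is "approximately 1" are assumptions made for illustration and are not taken from Schaller and Fan.

```python
def classify_basin(q_r, r_g, tol=0.1):
    """Classify a basin from observed discharge (q_r) and modeled recharge (r_g).

    tol is a hypothetical half-width around 1 within which the basin is
    treated as having no net groundwater gain or loss.
    """
    if r_g <= 0:
        raise ValueError("modeled recharge must be positive")
    ratio = q_r / r_g
    if ratio < 1 - tol:
        return "exporter"   # discharge below recharge: groundwater leaves the basin
    if ratio > 1 + tol:
        return "importer"   # discharge above recharge: groundwater enters the basin
    return "balanced"       # ratio close to 1: no net gain or loss
```

For example, `classify_basin(150.0, 100.0)` returns `"importer"` under the assumed tolerance.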
In a recent modeling study, Stanton et al. used a suite of numerical models including MODFLOW 2000 [Harbaugh, 2005] to examine the regional groundwater dynamics and watershed hydrology of the Loup and Elkhorn River basins. Groundwater elevations along the boundaries of the study domain were determined from water table observations collected in 1995 and were fixed during the simulation. Stanton et al. found the model-predicted annual base flow and runoff to be consistent with observations. During the simulation period (1940–2005), roughly 2% of the total water entering and 5% of the total water leaving the modeled aquifer was found to flow across the fixed water level boundaries. This suggests that regional groundwater flow was responsible for a net loss of roughly 3% of the total water input to the aquifer.
In another study utilizing MODFLOW 2000, Chen and Chen [2004] developed a groundwater flow model for a smaller area in the central NSH (for the period 1979–1990), including the headwaters of the Middle Loup and Dismal Rivers. The modeled groundwater fluxes into and out of the study domain (across the model boundaries) were found to be approximately 43% and 27% of the total water fluxes to the aquifer, respectively. As a result, there was a net gain of roughly 13% of the total modeled inflow to the domain [Chen and Chen, 2004, p. 426, Table 3]. This led to an increase in water table elevations, which was found to be consistent with regional water table monitoring records. Chen and Chen [2004] further suggested that groundwater inflow and outflow nearly balance as the regional domain size gets larger. These independent modeling studies are consistent with one another.
Besides the regional flow discussed above, local groundwater flow within basin boundaries largely follows topography. Hydrologically, the rivers and wetlands in this region tend to focus the flow toward them, with the surrounding landscapes acting as recharge zones [Gosselin et al., 1999; Harvey et al., 2007]. As a result, in most cases, water converges toward channels and valleys and diverges along basin divides. This is illustrated in Figure 1c, which shows a map of the 1995 groundwater table (based on well data for the NSH obtained from the University of Nebraska School of Natural Resources; http://snr.unl.edu/data/geographygis/NebrGISwater.asp). The map shows that groundwater mounds tend to follow basin divides along the boundaries of the Loup River watershed (and its subbasins). These lines of evidence suggest predominantly local topographic controls on groundwater flow patterns, with minimal net gain or loss due to regional groundwater flow (relative to other water balance components), especially as basin area gets large. Thus, in this study we use watershed divides to define basin control volumes for examining hydrologic fluxes, and we neglect any net regional groundwater flow contribution. It is not our intent here to quantitatively assess the validity of this assumption to a higher degree of precision than the studies reported above, which would be difficult using available observations. Rather, our intent is to provide a conceptual framework for interpreting and diagnosing the observed trends and patterns in annual runoff response. If regional groundwater plays a significant role in the annual water balance, then we anticipate that this would present itself in our analysis and could be detected by our framework. However, given the significance of this assumption for understanding the regional water balance, the implications, as they relate to this study, are discussed in greater detail in section 5.
3.1. The Budyko Hypothesis and Mean Annual Water Balance in the NSH
When changes in basin water storage can be neglected (e.g., over sufficiently long time scales) and convergence/divergence of regional groundwater flow is also negligible, the basin water balance can be written in normalized form as

$$\frac{ET_a}{P} = \frac{P - R}{P} = 1 - \frac{R}{P}, \qquad (1)$$

where R, ETa, and P are annual runoff, evapotranspiration, and precipitation (mm yr−1), respectively. The BH relates the evapotranspiration ratio (ETa/P) to climate through an aridity index, φ = ETp/P, where ETp is potential evapotranspiration [Budyko, 1974]. This relationship takes the following functional form:

$$\frac{ET_a}{P} = F(\varphi), \qquad (2)$$

where F(φ) is an empirical asymptotic function that could take various forms (see review by Arora). The respective limits of F(φ) for humid (φ → 0) and arid (φ → ∞) climates are

$$F(\varphi) \to \varphi \quad \text{as } \varphi \to 0 \quad (ET_a \to ET_p), \qquad (3)$$

$$F(\varphi) \to 1 \quad \text{as } \varphi \to \infty \quad (ET_a \to P). \qquad (4)$$

The form of F(φ) given by Zhang et al. is

$$\frac{ET_a}{P} = \frac{1 + w\varphi}{1 + w\varphi + \varphi^{-1}}, \qquad (5)$$

where w is an empirical coefficient used to relate the effects of vegetation and soil texture on the slope of F(φ). Wang et al. [2009] studied the mean annual water balance of major NSH basins and found a negative dependence between ETa/P (estimated from the mean annual water balance using equation (1)) and basin sand coverage. For a given φ, ETa/P values for the NSH basins were up to 15% lower than those outside the NSH region (Figure 2). In equation (5), this difference is represented by a smaller w value as the areal fraction of sandy soil grows in a basin. The decrease in ETa/P with increasing sand coverage is related to the highly permeable nature of sand and its relatively smaller water holding capacity, resulting in rapid movement of water through the soil column and recharge to the water table in the NSH [Wang et al., 2009].
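To make the curve concrete, the sketch below evaluates a one-parameter Budyko-type function and the water-balance estimate of the evapotranspiration ratio. It assumes the form usually attributed to Zhang et al., F(φ) = (1 + wφ)/(1 + wφ + 1/φ). The function names and the w values in the usage note are illustrative assumptions, not values reported in this paper.

```python
def aridity_index(et_p, p):
    """Aridity index phi = ETp / P (both in mm per year)."""
    if p <= 0:
        raise ValueError("precipitation must be positive")
    return et_p / p


def budyko_zhang(phi, w):
    """Assumed one-parameter curve: F(phi) = (1 + w*phi) / (1 + w*phi + 1/phi)."""
    if phi <= 0:
        raise ValueError("aridity index must be positive")
    return (1.0 + w * phi) / (1.0 + w * phi + 1.0 / phi)


def et_ratio_from_balance(p, r):
    """ETa/P estimated as (P - R)/P, i.e. with storage change neglected."""
    if p <= 0:
        raise ValueError("precipitation must be positive")
    return (p - r) / p
```

As a usage illustration, `budyko_zhang(1.5, 0.5)` is smaller than `budyko_zhang(1.5, 2.0)`, which mirrors the statement that a smaller w corresponds to a lower evapotranspiration ratio at the same aridity. The function also approaches φ for small φ and 1 for large φ, in line with the humid and arid limits.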
3.2. Annual Runoff Response in the NSH
 Besides the differences in mean annual ET and R, annual basin hydrology in the NSH shows distinct patterns in terms of soil texture and base flow contribution to total runoff. To illustrate this, four basins with varying soil textures and base flow contributions were analyzed. These include two basins in the NSH: Middle Loup at Dunning (ML-D) and North Loup at Taylor (NL-T), both having nearly uniform sandy soils; and two outside the NSH: Ponca Creek at Verdel (PC-V), in the Niobrara River basin, and Maple Creek at Nickerson (MC-N), in the Elkhorn River basin (Figure 1d). The soil texture in PC-V is composed of various types of silt, loam, and very little sand, while soil texture in MC-N is mostly silt clay loam in hillslopes and silt loam in the valleys.
To quantify trends in annual climate and runoff and the differences in annual hydrology, various hydrologic descriptors are calculated for each basin and are reported in Table 1. These include the mean annual precipitation (P), runoff (R), potential evapotranspiration (ETp), aridity index (φ), base flow index (BFI), Hurst exponent (H), lag 1 autocorrelation coefficient, and precipitation elasticity of streamflow (εp). BFI is approximated using a recursive digital filter [Nathan and McMahon, 1990], which has been used in the NSH in earlier studies [Szilagyi et al., 2003; Wang et al., 2009]. We use the parameter values of the recursive digital filter from Szilagyi et al. [2003]. The nonparametric Mann-Kendall (M-K) test [Hirsch et al., 1982] is used for detecting trends in P, R, ETp, and φ (α = 0.05). The types and sources of hydroclimatological data are described in section 4. In general terms, annual P shows a very subtle and statistically insignificant increase in the basins (Z > 0) and does not have a long-term memory. Note that φ has a negative trend in all basins, suggesting a general reduction in aridity (due to both lower ETp and higher P), but only MC-N shows a statistically significant trend in φ (Table 1). Annual R shows a positive trend in all catchments that is significant in every instance except PC-V. In the NSH basins, R clearly shows stronger positive trends than in the basins outside the NSH.
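For readers who want to reproduce the trend statistics, the following is a minimal sketch of a two-tailed Mann-Kendall test using the normal approximation with a correction for tied values. It is a generic textbook version and may differ in detail from the implementation used for Table 1. The function name is an assumption.

```python
import math


def mann_kendall(series):
    """Return (Z, two-tailed p) for a monotonic trend in an ordered series."""
    n = len(series)
    # S statistic: sum of the signs of all forward pairwise differences.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for groups of tied values.
    counts = {}
    for value in series:
        counts[value] = counts.get(value, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-tailed p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

A trend would be called significant at α = 0.05 when the returned p is below 0.05; the sign of Z gives the direction.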
Table 1. Some Basin Characteristics and Hydrologic Indices for the Four Selected Basins Used to Illustrate the Assortment of Hydrologic Response in the Nebraska Sand Hills

| | Middle Loup at Dunning (ML-D) | North Loup at Taylor (NL-T) | Ponca Creek at Verdel (PC-V) | Maple Creek at Nickerson (MC-N) |
|---|---|---|---|---|
| Drainage area (km2) | | | | |
| Mean elevation (m above sea level) | | | | |
| Main channel slope (%) | | | | |
| Sand Hills coverage (%) | | | | |
| Base flow index | | | | |
| R elasticity to P, εp | 0.19 | 0.25 | 3.19 | 2.27 |
| Soil texture | | | silt loam and loam | silt clay loam |
| P (mm yr−1) | | | | |
| P: lag 1 autocorrelation | | | | |
| P: two-tail Mann-Kendall test | Z = 0.368, p = 0.713, insignificant | Z = 1.240, p = 0.215, insignificant | Z = 1.454, p = 0.146, insignificant | Z = 1.329, p = 0.184, insignificant |
| ETp (mm yr−1) | | | | |
| ETp: lag 1 autocorrelation | | | | |
| ETp: two-tail Mann-Kendall test | Z = −3.310, p < 0.001, decreasing | Z = −0.948, p = 0.343, insignificant | Z = −2.014, p = 0.044, decreasing | Z = −4.991, p < 0.001, decreasing |
| R (mm yr−1) | | | | |
| R: lag 1 autocorrelation | | | | |
| R: two-tail Mann-Kendall test | Z = 7.574, p < 0.001, increasing | Z = 4.706, p < 0.001, increasing | Z = 0.861, p = 0.389, insignificant | Z = 2.430, p = 0.015, increasing |
| φ | | | | |
| φ: lag 1 autocorrelation | | | | |
| φ: two-tail Mann-Kendall test | Z = −0.854, p = 0.393, insignificant | Z = −1.259, p = 0.208, insignificant | Z = −1.673, p = 0.094, insignificant | Z = −2.182, p = 0.029, decreasing |
To examine the differences in the annual hydrologic response in these basins, time series of annual P and R and scatterplots of R as a function of P are shown in Figure 3. Also shown are plots of the estimated evapotranspiration ratio (equation (1)), (P − R)/P, as a function of φ. The first two basins (Figures 3a–3h) are located in the NSH (ML-D and NL-T), while the latter two (PC-V and MC-N; Figures 3i–3p) are located outside the NSH. With the fining of soil texture from left to right (Table 1), the BFI grows progressively smaller. As noted above, there is a clear positive and statistically significant (p < 0.001) trend in annual R in the NSH basins and a positive but less significant trend (p = 0.015) in MC-N (Table 1 and Figure 3). It is also clear from Figure 3 that annual R in the NSH basins is not closely tied to annual P, but instead exhibits a nearly “flat” relationship (Figures 3c and 3g). In contrast, annual R in the non-NSH basins (PC-V and MC-N) shows much higher interannual variability (Figures 3j and 3n), a stronger positive dependence on P (Figures 3k and 3o), and a less pronounced long-term trend. Interesting differences are also evident in the (P − R)/P versus φ plots. The non-NSH basins (with lower BFIs) follow the overall shape of the Budyko curve (Figures 3l and 3p), showing the “expected” climatological control on basin ETa in response to varying aridity. NSH basins, on the other hand, clearly show a relationship that is counter to what would be predicted from the Budyko curve (Figures 3d and 3h). Zhang et al.'s equation (5) is used in Figures 3d, 3h, 3l, and 3p to illustrate the shape of the Budyko curve for various commonly used w values found in the literature (based on a range of soil and vegetation types).
Differences in the annual hydrology of the basins are manifested in a number of the hydrologic indices, particularly H and εp, as shown in Table 1. Persistence in hydrologic and climatic systems can be quantified by the Hurst coefficient, H [Hurst, 1951]. It is generally agreed that when 0.5 < H ≤ 1, the process has long-range dependence or memory; when H = 0.5, the process is purely random; and when 0 < H < 0.5, the process is antipersistent [Hurst, 1951; Sakalauskiene, 2003]. For streamflow in the NSH, Wu et al. used 0.4 < H < 0.59 for white noise and H > 0.59 for long-term persistence.
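This section does not state how H was estimated, so the sketch below uses classical rescaled-range (R/S) analysis purely as an illustration. The window-doubling scheme, the minimum window length, and the function names are assumptions, and R/S estimates from records of only a few decades are known to be noisy and biased upward.

```python
import math


def rescaled_range(chunk):
    """R/S of one window: range of cumulative deviations over the standard deviation."""
    mean = sum(chunk) / len(chunk)
    cumulative, low, high, sum_sq = 0.0, 0.0, 0.0, 0.0
    for value in chunk:
        dev = value - mean
        cumulative += dev
        low = min(low, cumulative)
        high = max(high, cumulative)
        sum_sq += dev * dev
    std = math.sqrt(sum_sq / len(chunk))
    return (high - low) / std if std > 0 else None


def hurst_exponent(series, min_window=8):
    """Slope of log(mean R/S) against log(window length), with window lengths doubling."""
    n = len(series)
    log_w, log_rs = [], []
    w = min_window
    while w <= n // 2:
        values = []
        for start in range(0, n - w + 1, w):
            rs = rescaled_range(series[start:start + w])
            if rs is not None:
                values.append(rs)
        if values:
            log_w.append(math.log(w))
            log_rs.append(math.log(sum(values) / len(values)))
        w *= 2
    if len(log_w) < 2:
        raise ValueError("series too short for the chosen minimum window")
    # Ordinary least-squares slope of log(R/S) on log(window length).
    mean_x = sum(log_w) / len(log_w)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_w, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_w)
    return num / den
```

Applied to an uncorrelated series and to its running sum, the second should give the larger estimate, which is the qualitative contrast between annual P and base flow-dominated R discussed here.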
 While all climatic variables (P, ETp, φ) show some degree of persistence, with P having relatively lower values of H, R generally exhibits a higher degree of persistence than most of the climatic variables. This is particularly true for the NSH basins, which have higher BFIs and show much stronger long-term persistence than the two basins located outside the NSH domain, as indicated by the higher H values (H = 0.96 for ML-D and H = 0.86 for NL-T) and higher lag 1 autocorrelation coefficients (Table 1).
 The sensitivity of streamflow to precipitation can be quantified by the precipitation elasticity of streamflow (εp) [Sankarasubramanian et al., 2001]. A value of εp > 1 indicates that a 1% change in P is associated with a greater than 1% change in R. Across the conterminous United States, the general reported range of εp is 1.5 – 2.5, and higher εp values are generally associated with semiarid and arid climates [Sankarasubramanian and Vogel, 2003]. With respect to the εp values reported in the literature, elasticity numbers for the NSH basins (εp = 0.19 for ML-D and εp = 0.25 for NL-T) are among the lowest found in the United States [Sankarasubramanian and Vogel, 2003]. Given the semiarid to subhumid climate in the NSH, this suggests that base flow contributions to runoff override the influence of interannual climate variability in this region of sandy soil. This is further supported by the fact that the non-NSH basins show some of the highest elasticity values in the United States (εp = 3.19 for PC-V and εp = 2.27 for MC-N), despite being located in a regional climate that is nearly identical to that of the NSH basins.
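The elasticity can be estimated directly from paired annual series. The sketch below uses the nonparametric median estimator commonly associated with Sankarasubramanian et al., εp = median[(R_t − R̄)/(P_t − P̄) · P̄/R̄]. Whether this exact estimator was used for Table 1 is an assumption, and the function name is illustrative.

```python
import statistics


def precipitation_elasticity(p, r):
    """Median-based elasticity of annual runoff r with respect to annual precipitation p."""
    if len(p) != len(r):
        raise ValueError("series must have equal length")
    p_mean = statistics.mean(p)
    r_mean = statistics.mean(r)
    ratios = [
        (r_t - r_mean) / (p_t - p_mean) * (p_mean / r_mean)
        for p_t, r_t in zip(p, r)
        if p_t != p_mean  # years exactly at the mean carry no slope information
    ]
    return statistics.median(ratios)
```

A basin whose runoff scales in direct proportion to precipitation gives εp = 1, while a strongly buffered basin, in which runoff barely responds to year-to-year precipitation anomalies, gives values well below 1, as reported for the NSH basins.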
 The role of basin storage in altering streamflow variability and persistence has been well documented, both in observational and modeling studies [Winter, 2007; Shun and Duffy, 1999; Koch and Markovic, 2007]. An aquifer may act as a low-pass filter, taking nonfractal (white noise) recharge signals and producing fractal (persistent) water table fluctuations and base flow variations. In general, base flow persistence grows proportionally with the size of the aquifer and is inversely proportional to aquifer transmissivity [Duffy and Gelhar, 1986; Shun and Duffy, 1999; Zhang and Schilling, 2004; Zhang and Li, 2005]. Our observations of higher Hurst coefficients (H) and lower streamflow elasticities (εp) in the NSH basins (as compared to those located outside the NSH) are consistent with the earlier work cited above. While the filtering effect of aquifers has been discussed and modeled in the aforementioned studies, relatively little is known about how climate change signals (e.g., trends, extreme events) will alter recharge variability, propagate into the groundwater memory, and influence streamflow variability and persistence over time. Examining this question requires a hydrological perspective, by relating recharge to observed climate [e.g., Koch and Markovic, 2007]. To this end, our observations in Figure 3 naturally lead to the following critical questions, which relate runoff trends to basin storage and water balance: (1) Why are the positive R trends strongest in the NSH basins, and is this due to the amplification of climate signals through the groundwater reservoir or instead to the local convergence of regional groundwater? (2) What is the cause of the negative relationship between (P − R)/P and φ, and is it related to the observed persistence in annual R? (3) Can we estimate annual changes in basin water storage and recharge-release dynamics in relation to climate using annual P, R, and φ?
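The low-pass filtering effect can be illustrated with a single linear reservoir driven by a recharge series. This is a minimal sketch rather than the model developed later in the paper: the explicit annual time step, the release coefficient k (between 0 and 1), and the function names are assumptions.

```python
def lag1_autocorrelation(x):
    """Lag 1 autocorrelation coefficient of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def linear_reservoir(recharge, k, s0=0.0):
    """Base flow Q = k * S from one linear reservoir, stepped once per year.

    k is a hypothetical release coefficient (0 < k <= 1); a smaller k
    stands for a larger, slower-draining groundwater store.
    """
    storage = s0
    flows = []
    for rg in recharge:
        q = k * storage                # release proportional to current storage
        storage = storage + rg - q     # update storage with this year's recharge
        flows.append(q)
    return flows
```

With uncorrelated recharge as input, a slowly draining reservoir (small k) returns base flow with a much higher lag 1 autocorrelation and a much smaller variance than a quickly draining one, which is the behavior attributed to large aquifers in the text.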
The overall aim of this paper is to develop a simple framework to diagnose, evaluate, and predict the annual hydrologic response to climate (e.g., runoff, recharge, and evapotranspiration) for basins with significant groundwater influence. For this purpose, we first present our BH analysis with basin evapotranspiration calculated from water balance, followed by the development of a simple model for simulating and interpreting the observed patterns in the hydrologic data to address the research questions posed above.
4. Role of Basin Water Storage in the Budyko Hypothesis
4.1. Incorporating Storage Change Into the Annual Water Balance
The hydrologic indices discussed above highlight the influence of base flow in shaping a basin's annual hydrologic response. As such, before interpreting the causes of long-term trends in annual runoff using concepts of the basin water balance, it is first important to understand why (P − R)/P is negatively related to annual φ in the NSH basins, as well as why this negative relationship gradually transforms into a positive relationship as BFI drops (i.e., in better general agreement with the BH; Figures 3d, 3h, 3l, and 3p).
Following Wang et al., we interpret the influence of changes in basin water storage in the context of the BH as illustrated in Figure 4. Here we only consider changes in groundwater storage, since interannual changes in water content within the unsaturated zone in sand-dominated basins are likely to be relatively small. Furthermore, the simple framework considered here is not intended to be capable of evaluating changes in unsaturated storage. In a base flow-dominated basin with highly permeable soils, R in a dry year (high φ, low P) is largely derived from basin groundwater storage, S, causing a net loss in S (ΔS < 0) and a higher R than predicted by the BH (Figure 4b). As such, when P − R is used to estimate ETa (which assumes ΔS = 0, equation (1)), this would lead to an underestimation of basin ETa and (P − R)/P < F(φ) because of the high base flow contribution to R (Figure 4a). In contrast, during a wet year (low φ, high P), enhanced groundwater recharge due to high infiltration rates leads to a net gain in storage (ΔS > 0). This water “loss” to storage results in a lower R than predicted by the BH. If the gain in S (ΔS > 0) is neglected, P − R would then overestimate the basin ETa, causing (P − R)/P > F(φ) (Figure 4a). Therefore, when changes in basin groundwater storage are not negligible, equations (1) and (2) can be modified to include a storage term (ΔS) for closing the water balance:

$$\frac{ET_a}{P} = \frac{P - R - \Delta S}{P} = F(\varphi). \qquad (6)$$
It is important to note that similar to equation (1), equation (6) still assumes that the net convergence/divergence of regional groundwater flow into/out of the basin is negligible. In other words, any measured change in storage, ΔS, is assumed to be entirely due to an imbalance between P – ETa and R. However, as we will discuss in section 6, a net flux of groundwater can be easily incorporated in this equation. Because the amount of net regional groundwater influence is highly uncertain in the study region, we first explored the hypothesis without this effect. In section 4.2, we explore this modified method of applying the BH (i.e., equation (6)) for the case of the North Loup River Basin (NLRB), where long-term observations of groundwater fluctuations are available.
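The effect of the storage term can be seen with a short calculation based on the closed annual balance ETa = P − R − ΔS, with net regional groundwater flow neglected as stated above. The function name and the numbers in the usage note are hypothetical and serve only to show the direction of the correction.

```python
def et_ratio(p, r, delta_s=0.0):
    """ETa/P from the annual water balance ETa = P - R - delta_s (all in mm per year).

    delta_s is the change in basin groundwater storage; leaving it at 0.0
    reproduces the storage-free estimate (P - R)/P.
    """
    if p <= 0:
        raise ValueError("precipitation must be positive")
    return (p - r - delta_s) / p
```

For a hypothetical dry year with P = 400, R = 70, and ΔS = −30, the storage-free estimate is 0.825 while the storage-corrected ratio is 0.9, so neglecting the storage loss underestimates ETa/P. For a hypothetical wet year with P = 700, R = 75, and ΔS = +60, the storage-free estimate exceeds the corrected one, consistent with the behavior sketched in Figure 4.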
4.2. Testing the Role of Storage in the BH Using Regional Hydrologic Data
The model represented by equation (6) (and shown conceptually in Figure 4) is tested in the North Loup River Basin (NLRB) in central Nebraska (Figure 1e). The drainage area of the basin is ∼11,200 km2 at the USGS NL-Saint Paul (NL-SP) gauging station, which is located near the confluence of the North and Middle Loup Rivers and is used as the outlet of the NLRB in the remainder of the paper. The NLRB has two main subbasins: NL-Taylor (NL-T, southern tributary) and NL-Calamus (NL-C, northern tributary), both with historic streamflow records. The confluence of NL-C and NL-T is located near Burwell, NE (Figure 1e). Approximately 80% of the drainage area of the NLRB, including the two tributaries, is located within the NSH. The basin area below the NL-C and NL-T confluence is outside the NSH, where the soil is mostly silt loam and the vegetation is predominantly grass, with some cropland (corn and soybean) along the main river valley. Because of this difference, we test equation (6) separately for the entire NLRB and its two main tributaries.
4.2.1. Data Sources and Methods
 Annual P, R, ETp, and ΔS are needed to implement equation (6). Annual P was obtained from the Parameter-elevation Regressions on Independent Slopes Model (PRISM) data set, which contains monthly and annual P at a spatial resolution of 4 × 4 km across the conterminous United States [Daly et al., 1994, 2002]. Maps of annual P for each basin are first generated from the PRISM data set and then spatially averaged across the basin to estimate basin mean annual P. Annual R is obtained from the USGS National Water Information System (http://waterdata.usgs.gov/nwis). The gauging stations at NL-SP (main basin) and NL-T (subbasin) have streamflow records since 1929 and 1938, respectively. A dam was completed near the outlet of the NL-C subbasin in 1986 (known as the Calamus Dam). There are two USGS gauging stations in the NL-C subbasin, namely the Harrop station (upstream of the Calamus Dam, with streamflow records from 1979 to 2003) and the Burwell station (downstream of the Calamus Dam, with streamflow records from 1941 to 2003). These stations have similar annual R values during the overlapping period before the completion of the dam (i.e., 1979–1985). Therefore, to obtain a complete data set reflecting the hydrologic changes over time in the NL-C subbasin, we simply combine the streamflow data from both stations. Before 1986, data from the Burwell station are used, while data from the Harrop station are used from 1986 onward.
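The splicing of the two NL-C gauge records can be sketched as follows, assuming each record is held as a mapping from year to annual runoff. The data structure, function names, and the helper that compares the overlap years are illustrative assumptions.

```python
def splice_records(downstream, upstream, switch_year=1986):
    """Use the downstream (Burwell) record before switch_year and the
    upstream (Harrop) record from switch_year onward."""
    combined = {}
    for year, value in downstream.items():
        if year < switch_year:
            combined[year] = value
    for year, value in upstream.items():
        if year >= switch_year:
            combined[year] = value
    return dict(sorted(combined.items()))


def mean_overlap_difference(a, b, first_year, last_year):
    """Mean absolute difference between two records over their shared years."""
    years = [y for y in range(first_year, last_year + 1) if y in a and y in b]
    if not years:
        raise ValueError("records do not overlap in the requested period")
    return sum(abs(a[y] - b[y]) for y in years) / len(years)
```

Checking `mean_overlap_difference` over 1979 to 1985 before calling `splice_records` is one simple way to confirm that the two stations agree during the pre-dam overlap.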
 To calculate annual φ, estimates of ETp are needed. Two weather monitoring networks exist in the NSH: (1) the Automated Weather Data Network (AWDN), which is managed by the High Plains Regional Climate Center and (2) the NOAA National Weather Service Cooperative (COOP) network (see Figure 1e for station locations). There are 16 COOP stations within the NLRB which record daily precipitation and minimum and maximum air temperatures (Tmin and Tmax). COOP stations typically have long-term weather data beginning as early as the 1880s in this region. By comparison, AWDN stations have relatively shorter data records but measure more meteorological variables, including daily precipitation, Tmin and Tmax, net radiation, relative humidity, and wind speed. The AWDN stations also report calculated reference grass evapotranspiration (ETr) [e.g., Allen et al., 1998]. However, only two AWDN stations exist in the NLRB, and the available data cover the period 2000 to 2008.
 To make full use of the available hydrometeorological data in this region, the temperature-based Hargreaves equation was used to calculate ETp using data from the 16 COOP stations [Hargreaves and Samani, 1982]. Wang et al.  found that the original Hargreaves equation generally underestimates ETr in this region, despite the fact that ETp and ETr should essentially be identical for a grassland site. Therefore, to improve the accuracy of the Hargreaves equation, we first calibrate it using daily ETr reported by the two AWDN stations (following the method described by Wang et al. ), which leads to a slight modification of the constant in the original Hargreaves equation. This modification is then used to calculate daily ETp at each COOP station within the NLRB (Figure 1e). The daily values are summed up to obtain annual total ETp, followed by an arithmetic mean of all annual ETp values within each subbasin to obtain the basin-scale annual potential evapotranspiration. In the end, the modified Hargreaves equation is found to underpredict annual ETp by only 2% in comparison to annual total ETr (as reported for the AWDN stations).
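The calibration step can be illustrated with a short sketch. It assumes the commonly cited Hargreaves form, ETp = C · Ra · (Tmean + 17.8) · (Tmax − Tmin)^0.5 with C = 0.0023 and Ra expressed as equivalent evaporation (mm d−1), and it refits C by least squares through the origin against reference ETr. Both the equation form and the fitting method are assumptions made here for illustration; the procedure of Wang et al. may differ in detail.

```python
import math


def hargreaves_et(ra_mm_day, tmax_c, tmin_c, coef=0.0023):
    """Daily ETp (mm/day) from the assumed Hargreaves form.

    ra_mm_day: extraterrestrial radiation as equivalent evaporation (mm/day).
    """
    tmean = 0.5 * (tmax_c + tmin_c)
    trange = max(tmax_c - tmin_c, 0.0)
    return coef * ra_mm_day * (tmean + 17.8) * math.sqrt(trange)


def calibrate_coef(ra, tmax, tmin, etr_obs):
    """Least-squares (through the origin) estimate of the Hargreaves constant."""
    x = [hargreaves_et(r, tx, tn, coef=1.0) for r, tx, tn in zip(ra, tmax, tmin)]
    den = sum(xi * xi for xi in x)
    if den == 0.0:
        raise ValueError("predictor is identically zero; cannot calibrate")
    return sum(xi * yi for xi, yi in zip(x, etr_obs)) / den
```

The calibrated constant would then replace 0.0023 when computing daily ETp at each COOP station.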
 Groundwater data are crucial for this study, as the modified application of the BH (equation (6)) directly involves quantifying changes in annual basin water storage. Depth to water table (DWT) data are obtained from multiple monitoring wells within the USGS National Water Information System. As of 2008, there were 86 registered wells in the NLRB. Groundwater monitoring in the NLRB started with only a few wells in 1935. Between 1955 and 1958, the number of wells rose to 35 and subsequently to 84 between 1975 and 1978. After 1995, the total number of wells varied each year, with a minimum of 20 and a maximum of 86. In the NL-T subbasin there were 26 monitoring wells as of 2008. Continuous groundwater monitoring in this subbasin began in 1958, before which only 2 years of observations were available. In the NL-C subbasin, there are 11 wells with data since 1969, some with occasional gaps. Excluding the years with missing data, DWT data for the NL-C subbasin are available for the following periods: 1969–1973, 1976–1996, and 1998–2003, for a total of 32 years of data.
 To examine if any of the observed hydrologic trends may be attributed to changes in planted crop area, we compiled land use data for the Loup River basin (LRB) from the National Agricultural Statistics Service of the U.S. Department of Agriculture [http://www.nass.usda.gov/#top]. Cropland in the basin is dominated by winter wheat, followed by corn and sorghum (both for grain and silage), and then by barley and alfalfa. These predominantly dryland crops require only limited irrigation in a small portion of the cropping area during the peak growing season. County-level data on annual planted acreage for each crop were summed up to obtain the total annual planted acreage within the LRB. Overall, the crop area in the LRB is relatively small, growing from approximately 10% of the LRB drainage area in 1939 to 30% in 2008. The trend is not linear through time, but instead shows step-like changes in the late 1940s (increasing from ∼10% crop coverage to ∼20%) and the early 1990s (increasing from ∼20% to ∼30%), after which the total crop area remained relatively constant. Compared to natural grasslands, field crops (especially corn) transpire more water during the main growing season [e.g., Irmak et al., 2008], and this could potentially reduce R in regions with greater crop coverage. Before we examine the land use data in relation to the storage change component in the BH framework (equation (6)), we first examine the observed trends in annual P, R, and DWT for the NLRB.
4.2.2. Variations in Annual Hydrology
 Various hydrologic descriptors for characterizing trends in the water balance of the NLRB are reported in Table 2. Figure 5 presents time series of annual P and R, basin mean DWT, and the number of monitoring wells. Although statistically insignificant (α = 0.05), annual P shows a positive trend of ∼0.99 mm yr−1 for the 1929–2008 period in the NLRB (at NL-SP). Annual R shows significant positive trends of 0.20 mm yr−1 (NL-SP), 0.23 mm yr−1 (NL-T), and 0.55 mm yr−1 (NL-C). Annual R also shows much higher Hurst exponents (H; Table 2) than the hydrometeorological variables in the basins, with NL-C showing the highest H of the three.
Table 2. Some Hydrologic Statistics for the North Loup River Basin and Its Subbasins
Drainage area (km2)
Base flow index
Sand Hills coverage (%)
Precipitation (mm yr−1)
Lag 1 autocorrelation
Two-tail Mann-Kendall test
Z = 1.9402, p = 0.0524, insignificant
Z = 1.2396, p = 0.2151, insignificant
Z = 0.8990, p = 0.3687, insignificant
ETp (mm yr−1)
Lag 1 autocorrelation
Two-tail Mann-Kendall test
Z = −2.7296, p = 0.0063, decreasing
Z = −0.4424, p = 0.6582, insignificant
Z = 1.0744, p = 0.2826, insignificant
Runoff (mm yr−1)
Lag 1 autocorrelation
Two-tail Mann-Kendall test
Z = 3.8264, p < 0.0001, increasing
Z = 4.7592, p < 0.0001, increasing
Z = 6.1591, p < 0.0001, increasing
Lag 1 autocorrelation
Two-tail Mann-Kendall test
Z = −2.0732, p = 0.0382, decreasing
Z = −1.1327, p = 0.2574, insignificant
Z = −0.5517, p = 0.5811, insignificant
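For readers who wish to reproduce statistics of the kind listed in Table 2, the following is a minimal sketch of the two-tailed Mann-Kendall test and the lag 1 autocorrelation. It uses the standard normal approximation without a correction for tied values, which is an assumption here; the study does not state which variant was applied.

```python
import math


def mann_kendall(x):
    """Two-tailed Mann-Kendall trend test (normal approximation, no tie correction).

    Returns (Z, p).
    """
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed p value
    return z, p


def lag1_autocorrelation(x):
    """Lag 1 autocorrelation coefficient of a (non-constant) series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den
```

With this convention, a Z value of 1.9402 corresponds to a two-tailed p of about 0.052, in line with the first precipitation entry in Table 2.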
 Although the positive annual P trend is statistically insignificant, considerable interannual variability is present, with 3 to 6 year droughts that are typically followed by wet periods of similar duration. Major dry spells include the Dustbowl era in the 1930s (1931–1937), a less severe and shorter drought in the 1950s (1952–1956), and the periods of 1974–1976, 1989–1991, and 2002–2003. In addition to these, below-average precipitation prevailed from the mid 1960s until the late 1970s. This was followed by a generally wet period from 1977–2001 (interspersed with a few dry years). Consistent with the precipitation records, a decadal drop in annual R during the 1930s drought and relatively lower R during the mid 1950s and 1970s can be seen. From 1976 until 2000, annual R experienced a positive trend in the basin, with the NL-C showing the most significant increase (Figure 5). Thus, there is considerable correspondence between P and R on interannual to decadal time scales, despite the weak relationship in terms of the overall, long-term trend.
 Given the high BFI values in the NLRB (Table 2), one would expect to see a close correspondence between annual R and mean DWT. During the period 1957–1976, DWT was monitored at only a few wells in the NL-T and NL-C subbasins (total ∼10, with some annual variations). However, the number of monitoring wells grew significantly from 1973 to 1976 (to ∼10 for NL-C, greater than 20 for NL-T, and greater than 70 for the entire NLRB; Figure 5d). The majority of the wells (more than 40) are located downstream of the NL-C and NL-T confluence (Figure 1d). Realizing that the sudden increase in the number of wells in the early to mid 1970s may influence the calculated mean DWT for the basin, we focus on the period 1976–1999, which provides a relatively stable number of monitoring wells for the analysis. As noted above, this period also coincides with generally above-normal annual P (Figure 5a).
 During this 24 year period, both annual R and mean water table elevation rose in the NLRB (i.e., a decrease in mean DWT; Figure 5c). The mean DWT in the NL-C subbasin decreased from ∼7.5 m in 1976 to less than 3 m around 1996 (Figure 5c). In the NL-T subbasin, the water table first rose from about 10 m below the ground surface (around 1976) to a depth of about 8 m (in the mid 1980s), after which the mean DWT stayed relatively constant. Overall, the NLRB experienced an even greater (∼7 m) decrease in mean DWT, from ∼14 m below ground surface in 1976 to ∼7 m in 1996. Annual R generally mirrors this rise in water table. The NL-C subbasin experienced the highest long-term increase in annual R (as much as 60% from 1975 to 1999), while the increases in R for the NL-T subbasin and entire NLRB were around 30%. These observed positive trends in streamflow and water table elevation for the NLRB are consistent with the aforementioned wetter climate after 1976 (i.e., higher annual P).
4.2.3. Effects of Groundwater Storage Change on ETa Estimates in the BH
 In order to implement equation (6), interannual changes in basin water storage need to be quantified. We make the assumption that in the highly permeable sandy soils of the NSH, interannual changes in unsaturated soil moisture storage can be neglected and that annual changes in basin water storage, ΔS, are therefore dominated by variability in groundwater storage. ΔS is calculated for each year by spatially averaging the annual changes in local groundwater storage at each available monitoring well in a given year according to
where i is the well number, n is the number of wells in the basin in a given year j, and Sy is the specific yield (dimensionless) of the unconfined aquifer. Sy can be defined as the volume of stored groundwater released per unit surface area of the aquifer per unit decline of water table [Dingman, 2002]. In this study, Sy is assumed to be 0.16 for the NLRB on the basis of field measurements conducted by geologists at the University of Nebraska–Lincoln (L. Howard, personal communication, 2010). In applying equation (7a), well measurements conducted in the spring of each year were used. Along with observed annual P and R, the values of ΔS calculated from equation (7b) are then used in the normalized water balance equation (6) to calculate the annual basin evapotranspiration ratio, ETa/P.
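The storage calculation can be sketched as below. The sketch assumes that the basin value for year j is the specific yield times the mean water table rise (the decrease in DWT from year j − 1 to year j) over the wells measured in both years. This form is inferred from the definitions above rather than taken from the equation itself, and the well identifiers are hypothetical.

```python
def basin_storage_change(dwt_prev, dwt_curr, specific_yield=0.16):
    """Annual basin storage change (mm) from depth-to-water-table data.

    dwt_prev, dwt_curr: {well_id: DWT in m} for years j-1 and j.
    Only wells measured in both years are used (an assumption of this sketch).
    """
    wells = sorted(set(dwt_prev) & set(dwt_curr))
    if not wells:
        raise ValueError("no wells with measurements in both years")
    # A decrease in DWT is a water table rise, i.e., a storage gain.
    rise_m = [dwt_prev[w] - dwt_curr[w] for w in wells]
    return specific_yield * 1000.0 * sum(rise_m) / len(wells)
```

For example, a mean water table rise of 0.1 m with Sy = 0.16 corresponds to a storage gain of 16 mm.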
 With the inclusion of ΔS in equation (6), it is expected that the estimated annual ETa/P will more closely follow the relationship predicted by the BH, including the functional form of F(ϕ,w) given by equation (5). To examine this, we first present our results for the entire NLRB in Figure 6 in four different plots. These include, with the x axis being the annual aridity index (φ), estimates of ETa/P from (P – R)/P (i.e., equation (1), which ignores any influence from groundwater storage; Figure 6a) and estimates of ETa/P from equation (6), which includes contributions from ΔS (Figure 6b). Also shown in Figures 6c and 6d are the annual values of ΔS and ΔS/P, respectively (as a function of φ). To illustrate the expected range of ETa/P for a given φ, equation (5) is plotted for commonly used values of w [Zhang et al., 2001; Wang et al., 2009] in Figures 6a and 6b. To take full advantage of the groundwater monitoring data, all years with groundwater observations are included in Figure 6. However, data for the 1976–1999 period (which has the highest and most stable number of groundwater monitoring wells) are highlighted in Figure 6 (red circles) to allow one to assess any potential impacts from changes in well density.
 Consistent with the observations presented earlier in Figures 3a and 3b (for the Middle and North Loup basins, respectively), Figure 6a shows an inverse relationship between ETa/P and φ that does not follow the expected form of the BH. However, when the effects of groundwater storage, ΔS, are incorporated into the estimates of annual ETa/P (using equation (6)), the resulting relationship closely follows the BH (Figure 6b), with most ETa/P values falling within the typical range for equation (5) [Zhang et al., 2001]. This significant improvement in the ETa/P versus φ relationship underscores the dominant control of climate on ETa and suggests that the BH can provide a useful framework for estimating ETa, even in base flow-dominated regions, so long as groundwater storage effects are properly accounted for. These findings lead to the following questions: (1) How much influence does groundwater have on regional ETa? (2) What insights can we gain from observations about the general basin recharge-release behavior in relation to climate? (3) Is ΔS primarily driven by local drainage or regional groundwater flow, and can this be determined by means of the BH?
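The two estimates of ETa/P compared in Figures 6a and 6b can be written compactly. The sketch below assumes that equation (5) takes the Zhang et al. [2001] form, F(φ, w) = (1 + wφ)/(1 + wφ + 1/φ), and that equation (6) is the water balance normalized by P; the numerical values in the usage note are hypothetical.

```python
def budyko_zhang(phi, w):
    """Assumed form of equation (5): ETa/P as a function of aridity index and w."""
    return (1.0 + w * phi) / (1.0 + w * phi + 1.0 / phi)


def et_ratio_no_storage(p, r):
    """ETa/P from (P - R)/P, i.e., neglecting storage change."""
    return (p - r) / p


def et_ratio_with_storage(p, r, delta_s):
    """ETa/P from the normalized water balance including storage change."""
    return (p - r - delta_s) / p
```

In a wet year with, say, P = 500 mm, R = 80 mm, and ΔS = 20 mm, neglecting storage gives ETa/P = 0.84, whereas including it gives 0.80. Neglecting a storage gain therefore inflates ETa/P in wet (low φ) years, which is the direction of the bias seen in Figure 6a.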
 To gain some insight into the potential connection between ΔS and the regional hydroclimatic variability, we classified the observed annual ΔS values according to the aridity index, φ. More specifically, the ΔS-φ and ΔS/P-ϕ domains shown in Figures 6c and 6d were divided into four regions. Data above the horizontal zero line show a net annual gain of basin groundwater storage, denoted in the upper two quadrants by GG. Data below the horizontal zero line show a net annual loss of basin groundwater storage, denoted by LG. The vertical lines in the middle of Figures 6c and 6d correspond to the long-term mean aridity index. To the left and right of the line are wetter and drier annual climatic anomalies, denoted by WA and DA, respectively. Overall, this results in four potential climatic classifications of groundwater response. As would be expected, the observed ΔS values concentrate primarily in two regions, namely, WA-GG and DA-LG. This is consistent with the intuition that groundwater in the basin is stored during wetter years and released during drier years. In addition, the clear relationship between φ and ΔS (and ΔS/P), as well as the distinct switch point for gain-loss behavior (centered around the mean annual φ), suggest a fundamental role of climate in basin storage-release behavior.
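The four-way classification described above amounts to two threshold tests per year, as in the following sketch. The function names are illustrative, and the treatment of values falling exactly on a dividing line is an arbitrary choice made here.

```python
from collections import Counter


def classify_year(phi, delta_s, mean_phi):
    """Label a year by climate anomaly (WA/DA) and storage response (GG/LG)."""
    climate = "WA" if phi < mean_phi else "DA"   # wetter years have lower aridity
    storage = "GG" if delta_s > 0 else "LG"      # gain or loss of groundwater storage
    return f"{climate}-{storage}"


def quadrant_counts(phis, delta_ss):
    """Count years in each quadrant, using the mean aridity index as the divider."""
    mean_phi = sum(phis) / len(phis)
    return Counter(classify_year(f, d, mean_phi) for f, d in zip(phis, delta_ss))
```

A data set dominated by the WA-GG and DA-LG labels corresponds to the pattern seen in Figures 6c and 6d.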
 The above analysis is repeated for the two tributaries of the NLRB, and the results are shown in Figure 7 for NL-Calamus (NL-C, Figures 7a–7d) and NL-Taylor (NL-T, Figures 7e–7h). The 1976–1999 period with the highest number of monitoring wells is distinguished by the red circles. NL-C has a slightly lower mean annual φ, but a higher BFI than NL-T. Similar to the basin-wide results, when ΔS is neglected in the water balance estimates for ETa/P (i.e., (P − R)/P), the calculated values deviate from the predicted Budyko curve in both basins (Figures 7a and 7b). Accounting for annual ΔS in equation (6) using groundwater observations clearly brings the data more in line with the BH. These observations for the two upper tributaries further confirm our findings at the whole-basin scale that annual variations in ETa are strongly dictated by climatic aridity for these high base flow basins.
 The relationship between φ and ΔS (and ΔS/P) continues to be strong for the NL-C and NL-T subbasins, with switch points between GG and LG clearly falling on the mean annual φ line. ΔS has a noticeably steeper negative dependence on φ in the wet anomaly domain (WA) as compared to dry anomalies (DA). The mean annual φ acts as a breakpoint between steep and less steep storage change responses to aridity (Figures 7c and 7g). This indicates that for the same absolute deviation from the mean φ (which is proportional to 1/P), a wetter year leads to a larger gain in groundwater storage, ΔS, than the amount of loss in a drier year. The normalized form of this relationship (i.e., ΔS/P versus φ; Figures 7d and 7h) shows a more linear response to annual φ, since the deviations from the horizontal (i.e., no-change) line are reduced as P increases (decreases) with lower (higher) aridity. Lower evapotranspiration ratios (ETa/P) associated with the wetter anomalies in the Budyko curve would also amplify the recharge contribution to storage.
 In addition to interannual variations in ΔS playing an important role in the calculated ETa/P response to changes in aridity (Figures 6 and 7), ΔS is also found to have an influence on long-term mean estimates of annual ETa/P. In the North Loup basin, between 1976 and 2008 (period with the highest number of wells) ( mm), the calculated mean annual from equation (6) is found to be , which is slightly lower than obtained from . This difference between the two estimates of ETa comes from the fact that over the 33 years studied, there has been an average of ∼13.4 mm yr−1 increase in groundwater storage (i.e., mm yr−1), with a cumulative water storage of ∼442 mm. When the full period of observations between 1957 and 2008 is considered ( mm), groundwater storage is found to have increased by roughly 8 mm yr−1, leading to a cumulative water storage of ∼522 mm. In the NL-T and NL-C subbasins, the mean annual ΔS is estimated to be 4.97 mm yr−1 and 6.18 mm yr−1, respectively, during the most data-rich period from 1976 onward. These positive values of mean annual ΔS indicate that, at least during the periods over which the mean annual water balance has been calculated, basin groundwater storage has shown a net increase, leading to subtle differences in the calculated values of ETa/P (i.e., with and without ΔS included in the water balance).
 The net change in basin storage could be due to a combination of factors, including the positive annual P trend, decreasing aridity, and regional groundwater contribution. From the values reported above, we estimate the net mean annual change in basin storage for the entire North Loup River basin as ∼17% (for the 1976–2008 period) and ∼10% (for the 1957–2008 period) of the mean annual basin runoff (∼80 mm yr−1). In NL-T and NL-C, the mean annual storage changes were smaller, corresponding to net changes of approximately 6% and 8% of mean annual runoff, respectively. These empirical findings are consistent with the model experiments of Chen and Chen , who found a net gain of ∼13% of the total influx to groundwater in the headwaters of the Middle Loup basin. Both neighboring basins receive flow from the same regional groundwater system and have similar morphology, soil, and vegetation types. We note, however, that to estimate the contribution of groundwater to basin evapotranspiration more accurately, the net groundwater flux should be included in equation (6). Nevertheless, our analysis shows that over the studied time period (1957–2008), if the role of other factors in the storage change (such as increasing annual P) is neglected, the net contribution of groundwater to evapotranspiration could be as high as approximately 2.3% of mean annual P and 13% of mean annual R.
5. Annual Water Balance Model (AWBM)
 The above results were incorporated into a simple numerical model to investigate the role of groundwater storage in the annual water balance. The model calculates annual R as the sum of direct runoff (Rd) and base flow (Rb). Rd is assumed to be a constant fraction of annual P. Rb is represented by a linear reservoir model. Recharge (or leakage, L) to the base flow reservoir is calculated by subtracting annual ETa and Rd from annual P. A net groundwater flux component (GW) into the base flow reservoir is included. A key assumption in our water balance model, consistent with Figures 6 and 7, is that annual ETa in the basin is controlled by climate only (i.e., annual P and φ). Using equation (5) to represent ETa/P as a function of φ and w, the basin annual ETa is estimated as
$$ET_a = P\,F(\phi, w) \tag{8}$$
Direct runoff, Rd (due to impervious and saturated areas of the basin), is calculated using a constant runoff coefficient, Fs, according to
$$R_d = F_s P \tag{9}$$
The amount of water available for groundwater recharge, L, is then obtained from the land surface water balance (ignoring soil moisture storage effects between years):
$$L = P - ET_a - R_d \tag{10}$$
The rate of base flow, Rb, is directly proportional to the amount of groundwater stored in the catchment:
$$R_b = \frac{S}{T} \tag{11}$$
where S is the volume of groundwater stored per unit area of the basin upstream of the outlet [L], and T is the characteristic time scale of the basin groundwater drainage process. In this model, the half time of the groundwater storage is 0.693T [Brutsaert, 2008]. In addition to L, a constant net regional groundwater flux component GW [L/T] is considered as input to S, and conservation of mass leads to
$$\frac{dS}{dt} = L + GW - R_b \tag{12}$$
Assuming a constant rate of L, equations (11) and (12) can be solved analytically for Rb from an initial value, Rbo, resulting in
$$R_b(t) = R_{bo}\,e^{-t/T} + (L + GW)\left(1 - e^{-t/T}\right) \tag{13}$$
where Rbo can be determined from an initial storage value, So (at t = 0), according to $R_{bo} = S_o/T$.
 In the numerical application of this model, we first calculate annual values of ETa and Rd (from equations (8) and (9)) and then use these values in equation (10) to obtain L. Assuming that the annual recharge and net groundwater gain occurs at a constant rate throughout a given year, L and GW are then used in equation (13) to calculate base flow.
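The annual stepping scheme described above can be sketched as follows. The sketch assumes the Zhang et al. [2001] form for equation (5), a linear reservoir with Rb = S/T, and a one-year time step. The annual base flow volume is obtained here by integrating the exponential solution over the year so that the water balance closes exactly; this is a choice made for the sketch and may differ from the original implementation. All function and variable names are illustrative.

```python
import math


def budyko_zhang(phi, w):
    """Assumed form of equation (5): ETa/P as a function of aridity index and w."""
    return (1.0 + w * phi) / (1.0 + w * phi + 1.0 / phi)


def run_awbm(p, etp, w, t_scale, fs, s0, gw=0.0):
    """Annual water balance with a linear base flow reservoir.

    p, etp: annual precipitation and potential ET (mm/yr); t_scale: T (yr);
    fs: direct runoff coefficient; s0: initial storage (mm); gw: net
    groundwater flux (mm/yr). Returns one dict of fluxes per year.
    """
    rb = s0 / t_scale                    # initial base flow rate, Rb = S/T
    decay = math.exp(-1.0 / t_scale)     # exp(-t/T) evaluated at t = 1 yr
    out = []
    for p_j, etp_j in zip(p, etp):
        eta = p_j * budyko_zhang(etp_j / p_j, w)   # climate-controlled ETa
        rd = fs * p_j                              # direct runoff
        leak = p_j - eta - rd                      # recharge to the reservoir
        recharge = leak + gw
        rb_end = rb * decay + recharge * (1.0 - decay)  # rate at end of year
        delta_s = t_scale * (rb_end - rb)          # storage change, from S = T * Rb
        rb_annual = recharge - delta_s             # base flow volume over the year
        out.append({"ETa": eta, "Rd": rd, "L": leak, "Rb": rb_annual,
                    "R": rd + rb_annual, "dS": delta_s})
        rb = rb_end
    return out
```

Under constant forcing with the reservoir started at equilibrium, the sketch returns ΔS = 0 and R = P − ETa in every year. No guard is included for very arid years in which the computed recharge becomes negative.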
 In the remainder of this study, we use the model outlined above to address the research questions posed earlier in the paper. As a first step, however, it is important to evaluate the model performance and calibrate the available parameters (T, w, and Fs). As noted earlier, annual runoff at the outlet of the NLRB shows strong, long-term persistence, high lag 1 autocorrelation, and an overall increasing trend (Table 2), which makes the NLRB an ideal test case for the model.
5.1. AWBM Application to the NLRB
 The AWBM was run for the NLRB during the 1895–2008 period forced by the identical annual P (from PRISM) and annual ETp data (based on Tmin and Tmax of COOP stations) used to examine the Budyko hypothesis (BH) in the basins. In addition to the three parameters noted above, the model requires an initial estimate for So. But given the early start year of 1895, the selection of So does not have a significant effect on the modeled annual runoff during the main observation period (1929–2008). Thus, we simply selected a value for So that yields an annual R for 1895 that is approximately equal to the observed long-term mean annual R. Model calibration was performed manually (to match observed R) using the Nash-Sutcliffe (NS) model efficiency coefficient. The most critical calibration parameter in the model is w, which predominantly controls the mean annual ETa and, therefore, L. Over the long term, L + GW + Rd gives the mean annual R for the basin. The characteristic time scale, T, for the groundwater reservoir drainage is responsible for and builds the memory (i.e., persistence) in base flow. Finally, Fs controls the higher-frequency (but low-amplitude) interannual fluctuations in runoff (superimposed on low-frequency, high-amplitude base flow).
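The Nash-Sutcliffe efficiency used as the calibration criterion compares the squared model errors with the variance of the observations about their mean. A minimal sketch follows; the function name is illustrative.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squares about the observed mean)."""
    if len(obs) != len(sim):
        raise ValueError("observed and simulated series must have equal length")
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst
```

A value of 1 indicates a perfect match, and a value of 0 indicates that the model performs no better than the observed mean.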
 The model is calibrated for two end-member cases, first with no groundwater influence (GW = 0), and second with a constant rate of net groundwater gain. This constant rate is set to 13 mm yr−1 (GW = 13 mm yr−1), which is the mean annual net increase in basin storage calculated for the 1976–2008 period, the period with the highest number of monitoring wells. This assumption implies that groundwater was the source of all “extra” water that led to the net increase in basin storage discussed above. Model predictions under the two scenarios are presented in Figure 8. Although we anticipate interannual variations in the net water input to the basin, groundwater movement is relatively slow, and representing it with a constant net flux seemed reasonable at annual scales.
 For the first case with GW = 0, the calibrated parameters were found to be w = 0.325, T = 9.5 yr, and Fs = 0.04, which leads to an NS coefficient of 0.67. Although reasonably high, this NS value is lower than the “well-calibrated” value of 0.95 (or higher) that has been recommended by James and Burges  for hydrologic models. This calibration gave calculated mean annual evapotranspiration fraction as . In the second calibration we set GW = 13 mm yr−1, used the same T and Fs values from the first simulation, and calibrated the model using only w, which yielded w = 0.49. This recalibration of w was needed because the net gain of storage due to groundwater led to higher annual runoff in the basin. With GW = 13 mm yr−1 and w = 0.49, we obtained NS = 0.67 (the same as in the case with no groundwater influence), however with a slightly higher mean annual evapotranspiration ratio, , while both simulations yielded an identical mean annual R of 78.4 mm yr−1. As expected, in the latter simulation, the additional groundwater input to the basin is balanced by increased evapotranspiration, which maintains the observed runoff amounts under a constant net flux.
Figure 8 presents time series of observed annual P for the period 1895–2008, the modeled and observed R and ΔS, and scatterplots of ΔS against φ (the latter presented only for the GW = 0 case, as the GW = 13 mm yr−1 case gives nearly identical results). In Figure 8a, two trend lines are plotted for annual P. The solid blue line shows the trend for the whole data period (1895–2008), while the red dashed line shows the trend for the 1929–2008 period. While there is no discernible trend in the full data set, the 1929–2008 period shows a ∼0.99 mm yr−1 increase in annual P. Both model runs capture the significant drop in runoff during the 1930s drought, subsequent runoff recovery until the early 1950s, and the droughts in the 1970s and early 2000s. The trend in modeled annual R is 0.17 mm yr−1 when a net rate of groundwater input is considered, and 0.19 mm yr−1 when groundwater is neglected. The trend in observed annual R is 0.2 mm yr−1 and is statistically significant at p < 0.0001. Modeled R has a slightly higher Hurst exponent (H ∼ 0.92) than observed R (H = 0.814) and consistently shows a high degree of long-term persistence.
 The agreement between modeled and observed ΔS is very good (Figures 8c and 8d), despite the varying number of monitoring wells during the study period (Figure 5d; with 1976–1999 having the highest number of wells). During the 1958–2008 period with continuous monitoring well data, the observed and modeled ΔS show statistically insignificant trends: 0.19 mm yr−1 for the observations, and ∼0.1 mm yr−1 (GW = 0) and 0.09 mm yr−1 (GW = 13 mm yr−1) for the two model runs. During the same period, annual P and R grew by 0.97 and 0.17 mm yr−1, respectively. Similar to the observations, the relationship between modeled ΔS and φ (Figure 8d) shows a clear gradation in ΔS from steep groundwater gain during wet anomalies (quadrant WA-GG) to less steep groundwater loss during dry anomalies (quadrant DA-LG).
 The two end-member simulations suggest that even when the net input of regional groundwater is not negligible, the interannual variability of storage change is controlled by annual precipitation variability. Because the two simulations compare so closely, and because a constant net gain of groundwater storage would not be a realistic assumption over the long term, we set the groundwater flux component to zero in the subsequent sensitivity simulations, in which the model is forced by generated annual precipitation over a long period.
5.2. Evaluation of Modeled ETa and R + ΔS
 Before examining controls on the ΔS–φ relationship shown in Figure 8d (and whether this behavior leads to any trends in R), we first verify the predicted annual ETa and subsequent water available for runoff, P – ETa = R + ΔS (which is equal to annual R when storage effects are absent). Annual ETp within the NLRB exhibits a general regional decrease (Table 2). According to the complementary relationship, lower ETp is often associated with an increase in ETa [Brutsaert and Parlange, 1998]. To verify annual ETa predictions in the AWBM (i.e., equation (8)), as well as to examine trends, we used two other independent estimates of annual ETa. The first estimate comes from a numerical grassland ecohydrology model (known as the Bucket Grassland Model, or BGM), which has been calibrated and tested in the NSH [Istanbulluoglu et al., 2012]. BGM solves for the root zone average soil moisture, estimates daily ETa as a function of soil moisture, and simulates grass biomass at a daily time step. The soil moisture component of the model is presented in Appendix 1. Model details and its extensive application across the NSH are presented by Istanbulluoglu et al. . BGM is run as a lumped-parameter model in the NLRB and requires daily P and ETp as forcing variables. We used the daily ETp data developed for this study using the calibrated Hargreaves equation (described in section 4.2.1). Daily P for the basin is estimated by averaging the daily P data from the COOP (16 stations) and AWDN (only 2 stations, available after 2000) weather stations. At the annual time scale, the station-averaged data for the catchment compare very well with the PRISM data used for the AWBM.
 A second, observationally based, estimate of annual ETa was determined for the period 1958–2008 from the basin water balance equation:
$$ET_a = P - R - \Delta S \tag{14}$$
where observations of annual P, R, and ΔS for the NLRB were used to estimate ETa as a residual.
 As shown in Figure 9a, time series of annual ETa from the AWBM (i.e., estimated from the BH using equation (8)) compare reasonably well with both the observations (equation (14)) and the results of the BGM. In general, the AWBM predictions are in better agreement with the observational estimates than the BGM, which tends to slightly overestimate ETa in some years. This analysis further confirms the dominant control of climate on annual ETa in this region. The linear trend in annual ETa observations, as estimated from equation (14), shows an increase of 0.63 mm yr−1 (Figure 9a) during the period 1958–2008. The AWBM-predicted ETa shows an increasing trend of 0.56 mm yr−1 for the same time period. This increasing trend in ETa is consistent with what would be expected from the complementary relationship (i.e., decreasing ETp), as well as the observed increase in annual P from 1929 to 2008 (at a rate of ∼0.99 mm yr−1, although the trend is not statistically significant at α = 0.05 using the Mann-Kendall test). Milly and Dunne  also reported an increase in ETa in the Mississippi River basin from 1949 to 1997.
 As noted earlier, the observed trend in annual runoff for the basin is ∼0.2 mm yr−1 and is statistically significant (p < 0.0001). The model simulates a similar trend in runoff (∼0.19 mm yr−1 when groundwater input is neglected) (Figure 8b) that is also statistically significant. Given that both P and ETa show positive, but insignificant trends, the source of the significant upward trend in R is not immediately clear. We examine here whether there has been a trend in P – ETa, a quantity which we refer to as the annual available water for runoff (AAWR). AAWR can also be calculated as the sum of annual leakage, L, and direct runoff, Rd:
$$AAWR = P - ET_a = L + R_d = R + \Delta S \tag{15}$$
where R is the annual total runoff (i.e., sum of base flow and direct runoff, R = Rb + Rd). Figure 9b shows annual values of AAWR calculated from observations (R + ΔS), as well as from observed P and calculated ETa (using equation (8), referred to as AWBM in Figure 9b). Observed annual R for the NLRB is also shown in Figure 9b for comparison. The two independent estimates of AAWR are in good agreement with each other, both in terms of the interannual variability and the long-term trend. The trend line in Figure 9b is fit to observed AAWR (R + ΔS). As might be expected, AAWR is highly variable (i.e., more so than R), since it is strongly controlled by interannual climate variability. Under negligible annual groundwater storage change, annual R would be the same as AAWR (see equation (15)).
 Differences between AAWR and annual R illustrate the important role of groundwater storage in the NLRB. In a given year, a higher (lower) AAWR than R indicates storage gain (loss). Figure 9b provides a temporal context to the ΔS–φ relations presented earlier in Figures 6–8, and also illustrates how annual R would behave under negligible storage. AAWR estimates from both methods show positive trends (0.43 mm yr−1 for the AWBM estimate, P – ETa, where ETa was calculated, and 0.37 mm yr−1 for R + ΔS from observations), which are consistent with the positive trend in annual R. However, neither of the AAWR trends is statistically significant according to the Mann-Kendall test (p = 0.52 for P – ETa, and p = 0.74 for R + ΔS). This reduced level of statistical significance (as compared to that for annual R) is at least partly due to the smaller variance and higher persistence in annual R. For example, the Hurst exponent for AAWR predicted from P – ETa (R + ΔS) is 0.75 (0.69), while for annual R, H = 0.81.
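The two AAWR estimates and the storage interpretation above reduce to simple differences, as in the following sketch. The function names and the values in the usage note are hypothetical.

```python
def aawr_observed(r, delta_s):
    """AAWR from observations: runoff plus storage change (mm/yr)."""
    return r + delta_s


def aawr_modeled(p, eta):
    """AAWR from precipitation and calculated ETa (mm/yr)."""
    return p - eta


def storage_signal(aawr, r):
    """Interpret the difference between AAWR and R as storage gain or loss."""
    if aawr > r:
        return "gain"
    if aawr < r:
        return "loss"
    return "no change"
```

For instance, a year with R = 80 mm and ΔS = 25 mm has an observed AAWR of 105 mm, and the excess of AAWR over R marks it as a storage gain year.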
 On the basis of the analysis above, we present the observed and calculated trends in the NLRB in Table 3. Although none of the trends other than that in R is statistically significant, their directions suggest that the ∼0.99 mm yr−1 increase in P is balanced by a ∼0.6 mm yr−1 increase in ETa, a 0.19 mm yr−1 increase in ΔS, and a 0.2 mm yr−1 increase in R. Despite the high uncertainty of the observations and the simplicity of our calculation, the trends reported above roughly balance each other in the water balance equation. The observed and simulated annual R both exhibit statistically significant trends and a higher degree of long-term persistence. Thus, Figure 9b clearly illustrates the role of groundwater storage in acting as a low-pass filter that smooths the climate-driven, high-frequency, high-amplitude fluctuations in P − ETa into low-frequency, low-amplitude responses in runoff (primarily base flow). The key evidence for this filtering process is provided in the observed ΔS–φ relationships (Figures 6–8). In section 5.3, we explore these relationships further using the AWBM.
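The closure of the trend budget quoted above can be written out explicitly. The symbol \tau_X is introduced here only for illustration and denotes the linear trend in variable X; the numbers are those given in the text.

```latex
\tau_P \approx \tau_{ET_a} + \tau_{\Delta S} + \tau_R
\quad\Longrightarrow\quad
0.99 \approx 0.60 + 0.19 + 0.20 = 0.99\ \text{mm yr}^{-1}
```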
Table 3. Observed and Modeled Trends for Annual Precipitation P, Runoff R, Change in Basin Storage ΔS, Evapotranspiration ETa, and Annual Available Water for Runoff (AAWR)
Columns: Observed Trend (mm yr−1) | Modeled Trend (mm yr−1), GW = 0 | Modeled Trend (mm yr−1), GW = 13 mm yr−1
5.3. Model Simulations
 The AWBM is used to explore the role of base flow in the observed ΔS–φ relation and to investigate the annual hydrologic response patterns. To focus solely on the role of base flow, we generated annual P forcing using a stationary probability density function. Examining the cumulative probability plot of annual P for 1895–2008 using various parametric distributions, including normal, lognormal, and gamma, we found that the lognormal distribution best represented the variability in P. Parameters of the lognormal distribution were obtained following Chow . This approach can be justified, since there is no trend in annual P during the overall period 1895–2008 (see Figure 8a), and since the lag 1 autocorrelation of the time series is also negligible (less than 0.01).
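The forcing described above can be sketched as follows. This is an illustration only: the lognormal parameters are obtained here by the method of moments, which may differ from the procedure actually followed, and the mean and standard deviation of annual P (570 and 110 mm) are hypothetical placeholders rather than values from this study.

```python
import math
import random

def lognormal_params(mean, std):
    """Method-of-moments parameters (mu, sigma) of ln(P) for a lognormal P."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def generate_precip(mean, std, n_years, seed=0):
    """Independent annual P values drawn from a stationary lognormal distribution."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, std)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_years)]

def lag1_autocorr(x):
    """Sample lag 1 autocorrelation coefficient."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

# Hypothetical statistics (mm); replace with values fitted to the observed record.
precip = generate_precip(mean=570.0, std=110.0, n_years=700)
print(sum(precip) / len(precip), lag1_autocorr(precip))
```

Because the draws are independent, the synthetic series has no trend and near-zero lag 1 autocorrelation by construction, so any persistence in modeled runoff must come from storage rather than from the forcing.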
 The model also requires an annual value for the aridity index, φ. Although φ is dependent on both ETp and P, interannual variations in φ tend to be most strongly associated with variations in P, rather than ETp (which is generally less variable). For example, Wang et al.  found a strong dependence between P and φ in the NSH basins that avoided the need for an independent estimate of annual ETp. In a similar fashion, we found the following close relationship between φ and P for the NLRB:
This relationship is subsequently used to calculate annual φ from statistically generated values of annual P.
 A series of model experiments is reported below, in which model parameters for the NLRB are used in all simulations (i.e., identical to those noted previously). The initial storage value is varied on the basis of the selected T as described below. In each run, the model is forced with the same 700-year-long annual P time series (generated by the lognormal distribution). This model time frame was chosen since (1) contemporary flood and paleoflood records for most rivers only date back centuries and (2) simulations over centuries are relevant to the discussion of streamflow persistence recorded on human time scales.
5.3.1. Effects of Basin Drainage Time Scale, T
 It was noted earlier in Figures 2 and 3 that the observed patterns in annual R and ETa/P are related to differences in soil texture and regional geology among the basins. The NSH basins that exhibit larger Hurst coefficients for annual runoff overlie a deep groundwater reservoir (i.e., the High Plains Aquifer), while the Ponca Creek and Maple Creek basins are outside the NSH, with much shallower soil reservoirs. If, for the sake of argument, we neglect the effect of soil texture on ETa, the only difference among the basins reduces to the geological constraints that determine the size of the base flow reservoir in each basin. Under a similar regional climate, the observed differences in response can be directly linked to the role of groundwater storage. Our data analysis in section 4 suggested that the ΔS–φ relationship plays a strong role in the annual hydrologic response of a basin. Here, to better diagnose the various observed responses in the basins in Figure 3, we used the AWBM. Within the AWBM, the time scale parameter, T, can be used to adjust the size of the active water storage in the basin. For example, under stationary P, mean annual S can be calculated from equations (9) and (11) as
The term in parentheses gives the amount of base flow contribution, while T has been related to a number of different basin-wide soil and geological properties, including hydraulic conductivity, relief, and drainage density [e.g., Brutsaert, 2008]. In our model, T accounts for aquifer/groundwater release processes throughout the year in a fashion similar to Eltahir and Yeh  and Atkinson et al. [2002], and is not limited to low-flow conditions, as is typically considered for recession curve analysis [Brutsaert, 2008]. Compiling estimated T values from base flow recession analysis for medium to large basins ranging from several thousand to less than 100,000 km2, Brutsaert [2008] suggested a typical range on the order of one to two months (i.e., roughly 0.08–0.17 years) on the basis of hydrograph recession in dry seasons with negligible drainage input. In numerical models employing the linear reservoir equation to represent base flow input to streamflow throughout the year, T has been reported to take on values ranging from several months to years [e.g., Atkinson et al., 2002]. For the NLRB, our model calibration gave T = 9.5 years, which suggests ∼6.5 years for the half-life of the reservoir.
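The half-life quoted above follows from the linear reservoir assumption. As a brief check, assuming that storage drains exponentially when recharge is switched off (the symbols S_0 and t_{1/2} are introduced here for illustration):

```latex
S(t) = S_0\, e^{-t/T}
\quad\Longrightarrow\quad
t_{1/2} = T \ln 2 \approx 9.5 \times 0.693 \approx 6.6\ \text{years}
```

This is consistent with the ∼6.5 years stated for the NLRB.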
 To examine the role of T in the AWBM, we ran the model by varying T between 0.1 and 50 years. In each simulation the initial value of storage, So, is calculated from equation (17). This forces the base flow in the first year of the model to be equivalent to its long-term value. Since T was the only variable allowed to change, the annual ETa, L, and long-term mean R are identical in all simulations. Figure 10 presents modeled runoff time series for various T values, along with their corresponding Hurst coefficients (H), the relationships between ΔS and φ, estimated ETa/P ratios from water balance closure (P − R)/P (i.e., under the assumption of negligible ΔS), and annual R as a function of annual P. Each column shows outputs for a given T value, in the following order from left to right: T = 0.1, 1, 10, and 50 years. As indicated in the first row of Figure 10, H grows with increasing T, suggesting that T is a variable that can be related to the long-term persistence of annual R. This argument can be better examined in connection with the modeled annual ΔS–φ and L–φ relations. The L–φ relation shows the net amount of recharge that goes to groundwater storage in each model year (as a function of φ), as calculated from equation (10). The difference between L and ΔS (for a given φ) yields the amount of annual base flow (Rb).
 A small T corresponds to a shallow base flow reservoir (T = 0.1 yr, mean S = 5.2 mm), leading to a negligible storage influence on annual R. As a result, the following observations can be made: (1) There is little to no memory in the system passed on from one year to the next, as evident from the low value of H. (2) There is a strong positive dependence between annual R and P (Figure 10, bottom left). (3) The estimated ETa/P (i.e., ignoring ΔS) follows the form of the Budyko hypothesis. Basins with shallow soils and/or steep slopes where soil water residence is negligible would behave in this fashion (i.e., similar to the response of Maple Creek in Figure 3).
 As T grows, the size of the groundwater reservoir (and associated base flow) increases to maintain the input-output balance over the long term (i.e., Rb = L), under a slow drainage process. As a result, during wet anomalies (WA) the drainage process cannot keep up with recharge (L > Rb), leading to ΔS > 0. During dry anomalies (DA), the system maintains base flow from available groundwater, leading to a net loss in basin storage, ΔS < 0 (Figure 10, second, third, and fourth columns). Consistent with our field observations, ΔS shows a steeper response to changes in φ during wet anomalies. As noted by equation (12), the upper limit to this ΔS–φ relation is the L–φ curve minus base flow, Rb. This upper limit essentially follows the inverse of the BH curve. By retaining a significant portion of L during wet years, systems with large T build a longer memory of the past climate variability (i.e., higher H).
 For systems with larger groundwater storage, base flow becomes less variable, and the large S buffers annual runoff against interannual climate fluctuations. This manifests itself in the BH domain (Figure 10, third row) by forming an inverse relationship between (P − R)/P and the aridity index. In this context, P − R no longer represents basin annual ETa (due to the large influence of ΔS). Instead, the climatological signature in annual runoff (and, therefore, estimates of ETa from P − R) is overwhelmed by the overriding effects of base flow. This finding is consistent with our initial discussion of patterns observed in data from the NSH. For example, in Figure 3, data from ML-Dunning, NL-Taylor, and the entire NLRB show behavior similar to that predicted by the model for large T (Figure 10). These results further demonstrate a physical basis for the emergence of hydrologic persistence in runoff for different basins that are under the same climatic influence but different storage conditions.
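The buffering effect of T discussed above can be illustrated with a short sketch. It is not the calibrated AWBM: it assumes that the Budyko curve takes the form of Zhang et al. [2001], that ETp is constant from year to year (instead of using the φ–P relation), and that all of P − ETa recharges the reservoir, so direct runoff is zero. The function names and all numerical values (mean P, ETp, w) are hypothetical.

```python
import math
import random
import statistics

def budyko_et_ratio(phi, w):
    """ETa/P as a function of aridity index phi (assumed Zhang et al. [2001] form)."""
    return (1.0 + w * phi) / (1.0 + w * phi + 1.0 / phi)

def run_awbm(precip, etp, w, T):
    """Annual water balance with a linear base flow reservoir of time scale T (years).

    Simplification for illustration: recharge L = P - ETa, no direct runoff,
    so annual runoff equals base flow.
    """
    decay = math.exp(-1.0 / T)
    recharge = [p * (1.0 - budyko_et_ratio(etp / p, w)) for p in precip]
    # Start storage at T times mean recharge so base flow begins near its long-term mean.
    S = T * statistics.mean(recharge)
    runoff, dS = [], []
    for L in recharge:
        # Exact solution of dS/dt = L - S/T over one year with constant L.
        S_new = S * decay + L * T * (1.0 - decay)
        dS.append(S_new - S)
        runoff.append(L - (S_new - S))  # base flow from the annual water balance
        S = S_new
    return runoff, dS

rng = random.Random(1)
precip = [rng.lognormvariate(math.log(570.0), 0.2) for _ in range(700)]  # hypothetical P (mm)
for T in (0.1, 1.0, 10.0, 50.0):
    R, dS = run_awbm(precip, etp=1300.0, w=0.5, T=T)
    print(T, round(statistics.mean(R), 1), round(statistics.pstdev(R), 1))
```

In this sketch the long-term mean of R is essentially the same for every T, while its interannual standard deviation shrinks as T grows, which is the low-pass filtering behavior described in the text. Using the exact one-year solution of the reservoir equation keeps the scheme stable for T shorter than the annual time step.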
 The relationship between the Hurst exponent (H) calculated for each 700 year modeled annual R time series and the corresponding T value used in the model is shown in Figure 11. Also highlighted in Figure 11 is the observed range of H for the NSH (i.e., 0.70–0.95 on the basis of annual R data). As one would expect, S increases with increasing T, leading to long-term persistence in the system, as reflected in the positive H–T relation for T up to ∼50 years (Figure 11). Beyond T = ∼50 years, however, H decreases with T (i.e., as S gets large). We have not further explored the factors that cause the downturn in the H–T relationship. However, one plausible explanation is that, as T grows larger (>50 years), the amplitude of low-frequency fluctuations in base flow becomes dramatically reduced, leading to a nearly constant base flow. The variability in annual R then becomes directly correlated with annual P, which generates direct runoff from saturated portions of the landscape. Since P is essentially simulated as random noise, H decreases as base flow becomes less variable (i.e., as T grows beyond a certain value), with direct runoff therefore becoming a more prominent driver of variability.
5.3.2. Effects of Vegetation and Soil Texture
 The ΔS–φ relationship, which affects the interpretation and application of the BH, is controlled by the interplay between recharge and base flow release. In section 5.3.1, we examined the effects of T, which controls the rate of base flow release. On the other hand, recent studies have also demonstrated the significant role of vegetation in the BH across various basin spatial and temporal scales [Zhang et al., 2001; Donohue et al., 2007, 2010]. In this section we examine the potential role of vegetation and soil texture on ETa and the ΔS–φ relationship using the AWBM. In the literature, w (equation (5)) is typically used as a calibration parameter to account for the effects of soil and vegetation differences on ETa [Zhang et al., 2001; Wang et al., 2009]. Typical w values are between ∼0.0 and ∼2.0 for many basins around the globe, with w increasing in the order of grass, shrub, and forest vegetation.
 To examine the implications of vegetation change for basin hydrology and runoff response, the AWBM was run for three different w values (0.0, 1.0, and 2.0) using a fixed base flow time scale of T = 10 years. The results are shown in Figure 12. Under identical annual P forcing, annual R was found to undergo a roughly fivefold decrease as w increased from 0.0 (e.g., sparse grasses) to 2.0 (forests). In addition, under higher annual ETa (i.e., higher w), interannual climate fluctuations are found to have a smaller influence on both runoff variability (Figure 12a) and modeled storage change (Figure 12b). As would be expected, the reduced runoff fluctuations are also associated with lower Hurst coefficients (e.g., H = 0.70 at w = 2.0, as compared to H = 0.82 at w = 0.0). These model results imply that under sparsely vegetated conditions (i.e., low w), climatic fluctuations could have a larger influence on changes in basin groundwater storage and could lead to higher streamflow persistence.
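The sensitivity of the runoff fraction to w can be illustrated with a few lines of code. The sketch assumes that the curve has the form proposed by Zhang et al. [2001]; whether equation (8) matches this form exactly is an assumption, and the aridity values used below are arbitrary examples rather than NLRB values.

```python
def budyko_et_ratio(phi, w):
    """ETa/P as a function of aridity index phi (assumed Zhang et al. [2001] form)."""
    return (1.0 + w * phi) / (1.0 + w * phi + 1.0 / phi)

def runoff_ratio(phi, w):
    """Long-term (P - ETa)/P, i.e., the fraction of P available for runoff and recharge."""
    return 1.0 - budyko_et_ratio(phi, w)

# Compare the three w values used in the experiment at a few example aridity indices.
for w in (0.0, 1.0, 2.0):
    print(w, [round(runoff_ratio(phi, w), 3) for phi in (1.5, 2.0, 3.0)])
```

Under this assumed form, the fraction of P left for runoff at a given φ falls steadily as w rises, and at φ = 3 the ratio between the w = 0.0 and w = 2.0 cases is between five and six, the same order as the fivefold decrease reported above.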
 In a different simulation, we ran the model using a constant F(φ), set to the mean annual evapotranspiration ratio of the basin, 0.856. This implies that every year a constant fraction of precipitation forms runoff regardless of the annual climate aridity. This scenario simplifies the land surface component of the AWBM. The model produced decreased interannual variability of runoff and a reduced H (H = 0.77) compared to the base simulation with w = 0.0 (H = 0.82), in which the evapotranspiration ratio followed the BH as a function of the annual aridity index. This finding is a manifestation of the shape of the Budyko curve, which drops sharply as climate gets wetter. A higher amount of recharge during wet years builds a longer base flow memory in the system.
6. Summary and Conclusions
 The research presented in this paper was motivated by the observed runoff trends in four basins located in central Nebraska, United States. Two of these basins are located within the Nebraska Sand Hills (NSH) overlying the High Plains Aquifer, and the other two are located outside the NSH. The basins are characterized by a nearly identical climate and topography, but with significant variations in annual hydrologic response, driven by differences in soil texture, groundwater hydrogeology, and varying base flow contributions. The Budyko hypothesis (BH), which positively relates the ratio of basin annual evapotranspiration to precipitation (ETa/P) to climate aridity (defined by the aridity index, φ, the ratio of basin potential evapotranspiration, ETp, to P), is used to examine the modulating role of groundwater on annual ETa and runoff (R).
 In the study basins in the NSH we note that as base flow contribution becomes more significant: (1) the annual R persistence grows and the positive R trend gets stronger; (2) annual R becomes less sensitive to annual precipitation; and (3) the positive relationship between the estimated annual ETa/P and φ (consistent with the BH) gives way to a negative relation, inconsistent with the Budyko hypothesis. In this first implementation of the BH, ETa was approximated from water balance closure, ETa = P − R, as is typically done in most applications of the BH. Instead of “rejecting” the BH in groundwater-driven basins, we improved the accuracy of ETa calculations by incorporating the annual changes in basin water storage (ETa = P − R − ΔS), quantified from observed basin-averaged water table fluctuations. Adding the role of interannual fluctuations in basin water storage made the estimated relationship between ETa/P and φ for the basin consistent with the form of the BH.
 This finding has three important implications. First, it provides evidence for the validity of the BH for approximating basin annual ETa in streams driven by groundwater. Therefore, when basin annual R is also available, along with calculated ETa, changes in water storage (ΔS) can be approximated in the basin. Furthermore, in our study basin, relating yearly changes in groundwater storage obtained from data to climate (through φ) revealed the mean climate of the basin as a critical climatological variable that controls groundwater dynamics. In the data we saw that net groundwater gains take place during wetter anomalies and groundwater losses occur during drier anomalies. The switch from a gaining to a losing pattern appears to occur around the value of the mean annual aridity index, suggesting that the annual base flow amount is always lower than recharge during wet years and higher than recharge during dry years. The observed close relation between storage change and climate in the basins indicates that interannual changes in basin water storage are strongly controlled by climate fluctuations, even in basins where annual runoff is persistent, relatively steady, and driven by groundwater (as in the NSH basins). Understanding the groundwater-climate coupling is critical to improving hydrologic models as well as climate change predictions in regions where groundwater plays an important role in runoff generation.
 Our data analysis in the NSH basins led to the development of a parsimonious (three-parameter) model of annual water balance (AWBM). The annual runoff predicted by the AWBM for the NLRB between 1929 and 2008 agreed with observed runoff relatively well. We further confirmed AWBM predictions against estimated ETa (from observations using ETa = P − R − ΔS) and ΔS, obtained directly from water table fluctuations. The AWBM was capable of explaining the factors that caused the observed trends in annual runoff in the North Loup River Basin (NLRB) and underscored the role of storage, acting as a low-pass filter, in the persistence of the annual runoff time series. While previous analytical and numerical modeling studies have illustrated the role of groundwater as a low-pass filter, they often used random recharge to groundwater. Tying the role of climate into the estimation of groundwater recharge in a simple model provides a framework to link the stochastic properties of climate with those of basin storage and runoff. We argued that the key to this filtering process seems to be the observed ΔS–φ relation and the stochastic nature of the climate. As demonstrated in the vegetation change simulations (Figure 12), higher recharge under grass vegetation than under forest led to a higher Hurst coefficient. The Hurst coefficient dropped when the BH (equation (8)) was replaced by a constant ratio in the AWBM.
 Using the AWBM we explored how the size of the groundwater reservoir influences the R and ETa response in the Budyko hypothesis domain. The model was forced by annual P sampled from a lognormal distribution fitted to P data, and groundwater reservoir size was represented by changing the reservoir time scale parameter, T, in equation (17) in different simulations. In examining the simulation results in the BH domain, ETa is approximated only from (P − R), as is typically done in the absence of other storage-related data. In the model results, for systems with larger groundwater storage (high T), base flow becomes less variable, and the large S buffers annual runoff against interannual climate fluctuations. This manifests itself in the BH domain by forming an inverse relationship between (P − R)/P and φ. In this context, P − R no longer represents basin annual ETa (due to the large influence of ΔS). Instead, the climatological signature in annual runoff (and, therefore, estimates of ETa from P − R) is overwhelmed by the overriding effects of base flow. However, the model results suggest that basin water storage effectively absorbs the fluctuations in annual P, and therefore annual recharge in a basin may be predicted reasonably well with the water balance equation with ETa calculated using the Budyko hypothesis.
 Model sensitivity analyses show dependence between the size of the groundwater reservoir and the long-term (multidecadal) persistence in annual runoff. When viewed over a period of typical hydrologic measurement time scales, such fluctuations might be erroneously interpreted as trends. These findings also underscore the importance of evaluating apparent trends in any system variable in a complete water budget context. Our simulations were only concerned with the hydroclimatological conditions of the specific region examined. As such, results of our simulation model cannot be generalized to other climates. Simple top-down approaches, similar to the one described in this paper, could help to identify at least the first-order controls of the systems and develop hypotheses that can be tested using observations and more complex numerical models.
Appendix A: Bucket Grassland Model
 The ecohydrological point model consists of both hydrologic and dynamic vegetation components [Istanbulluoglu et al., 2012]. A brief description of the governing equations used by the model is given here. The water balance at a point for a homogeneous soil can be given as [e.g., Laio et al., 2001]
n Zr (ds/dt) = Ia − ETa − D
where n (dimensionless) is porosity, Zr [L] is effective rooting depth, s (dimensionless) is the saturation degree of soil moisture, t [T] is time, Ia [L/T] is infiltration rate, ETa [L/T] is actual evapotranspiration rate, and D [L/T] is drainage rate. When the soil is unsaturated, Ia is the smaller of the rainfall intensity and the soil infiltration capacity, which is assumed to equal KS; when the soil is saturated, Ia equals the drainage rate D. At the lower boundary of the root zone, the unit hydraulic gradient assumption is used for calculating drainage following [Campbell, 1974]
where K(s) [L/T] is unsaturated hydraulic conductivity, b (dimensionless) is an empirical parameter, and sfc is the saturation degree at field capacity. Actual evapotranspiration ETa is calculated by
ETa = βs ETp
where ETp [L/T] is potential evapotranspiration, and βs (dimensionless) is a lumped parameter representing the effect of soil moisture on ETa [Laio et al., 2001]:
where sh is the saturation degree at soil hygroscopic capacity and sw and s* are saturation degrees corresponding to plant water potentials at the wilting point and incipient stomata closure, respectively. The Campbell retention model was used to calculate sfc, s*, sw, and sh for sand in this study [Campbell, 1974; Laio et al., 2001]. Daily ETp is based on the Penman-Monteith equation [Allen et al., 1998] and the fraction of grass cover. The model has a dynamic vegetation component that tracks aboveground live and dead biomass and root biomass. The model was calibrated and tested in the NSH [Istanbulluoglu et al., 2012].
 We appreciate comments from Steve Burges on an earlier draft of this paper, which improved the presentation of this paper. The research reported herein was supported in part by the NE Department of Natural Resources, NE Environmental Trust (project 08-141-2), and the NE Game and Parks Commission. This paper greatly benefited from the comments of three anonymous reviewers.