Statistical estimation of streamflow depletion from irrigation wells



[1] A multiple regression model was applied to annual time series data (1936–1998) in an attempt to quantify the influence of irrigation wells on annual streamflows of Frenchman Creek in Southwestern Nebraska where intensive well development has taken place since 1950. A strong statistical relationship was found between the logarithm of streamflow and number of wells, current and lagged annual precipitation, and two variables that are the geometric mean of precipitation and number of wells in the current year and the year before last. Estimated mean streamflow from the statistical model in 1998 is approximately one third of that in 1950. A simpler model without the interaction terms between number of wells and precipitation was estimated for the Republican River near its final entry into Kansas from Nebraska, and an even greater decline in mean annual streamflow was found. The present mean is approximately one quarter of the 1950 level.

1. Introduction

[2] Intensive groundwater development for irrigation is typically accompanied by local streamflow depletion. “When pumping of a well located near a stream or surface water body starts, the well initially obtains its supply of water from aquifer storage. The resulting decline of groundwater levels around the well creates gradients which capture some of the ambient groundwater flow that otherwise would have discharged as base flow to the stream. Eventually the cone of depression of the well intercepts the stream, thus inducing flow out of the stream into the aquifer, and the aquifer drawdown comes to an equilibrium, with the streamflow reduced by the rate at which the well is pumping. The sum of these two effects leads to streamflow depletion.” [Sophocleous et al., 1995]. In a basin-wide development, both the primary and tributary streams are depleted through this general principle.

[3] Computer modeling of hydrological relationships can now be conducted with considerable sophistication under the rubric of inverse modeling in which modern statistical methods are applied [Cooley, 1977, 1979; Yeh, 1986; Carrera and Neuman, 1986; Finsterle and Najita, 1998; Hill et al., 1998], and inexpensive high-speed computing technology has removed a major obstacle to such research. However, direct statistical estimation from time series data of streamflow depletion caused by irrigation wells is an alternative that shows considerable promise with respect to quantifying the effects of irrigation wells on mean annual streamflow. Results of such a statistical model could provide a useful comparison with the implied steady state of a comprehensive simulation model, besides providing information directly at a relatively small cost. Although nonlinear multiple regression and maximum likelihood methods are the foundation of inverse modeling, direct applications of multiple regression methods to streamflow data appear to have been limited to short-term forecasting problems [e.g., Garen, 1992; Fernandez and Salas, 1990; Tasker, 1980].

[4] Water rights to streamflow in the Republican River are currently in dispute between Kansas and Nebraska under a federal interstate river compact. Kansas has filed suit against Nebraska in the U.S. Supreme Court with the contention that [Bennet and Howe, 1998, p. 486] “decreases in annual estimates of virgin streamflows are not due to natural phenomena but are due to increases in groundwater withdrawals.” The Special Master, appointed by the court, ruled in January 2000 that groundwater pumping in Nebraska and its effect on Republican River flows will be considered in the alleged violation of the interstate compact, and thus the compact is not limited to stream withdrawals only as Nebraska contended.

[5] Nebraska water law provides a decentralized system of control to accommodate scarcity of supplies, with respect to both surface water and groundwater. The state is divided into 23 natural resources districts (NRDs) that have a great deal of autonomy. Approval is not required from the Nebraska Department of Water Resources (DWR) for an NRD to establish a “groundwater control area”, and the DWR cannot itself establish one. The regulations imposed by an NRD can include well spacing, groundwater withdrawals, rotation of pumping, and a moratorium on new wells [Aiken, 1998].

[6] This paper reports results from an application of multiple regression analysis to streamflow depletion in Southwestern Nebraska where there has been intensive development of groundwater for irrigation during the past fifty years. Relatively detailed results that include nonadditive terms (interactions) for number of irrigation wells and annual precipitation were obtained for Frenchman Creek, which is a tributary of the Republican River. The same modeling approach was applied to annual streamflows of the Republican River at the gauging station south of Hardy, Nebraska, which is located just over the border into Kansas about 322 km east of the Colorado-Nebraska border. This latter analysis demonstrates a strong negative relationship between annual streamflows at the Hardy gauging station and upstream irrigation well development over the past 50 years. An attempt to estimate an interaction effect between annual precipitation and the number of wells at this station was unsuccessful, which is not surprising for such a large, heterogeneous watershed.

2. Frenchman Creek Analysis

[7] Frenchman Creek, with annual streamflows measured at Culbertson, Nebraska (U.S. Geological Survey streamflow gauging number 6835500), is the stream to which detailed regression analysis was applied, using an annual time series sample with precipitation and number of large-bore wells to explain streamflows. Stinking Water Creek merges into Frenchman Creek (southern stream with reservoir in Figure 1), which flows into the Republican River at Culbertson. At their juncture, Frenchman Creek had approximately twice the mean annual flow of Stinking Water Creek during 1952–1989 [Peckenpaugh et al., 1995]. Major portions of Chase and Perkins Counties and a small wedge-shaped section from southwest Hayes and northwest Hitchcock Counties comprise the study area. The irregular boundaries are a consequence of using surface water hydrology as the primary criterion to delineate the Frenchman Creek watershed within Nebraska. The western edge of Figure 1 is the border between Colorado and Nebraska.

Figure 1.

Frenchman Creek Study Area.

[8] The South Platte and Republican River valleys lie to the north and south, respectively, of the boundaries in Figure 1. The Upper Republican NRD in Figure 2 comprises Perkins, Chase and Dundy Counties, in which irrigation wells are metered and an annual groundwater pumping quota per unit area has been imposed since 1978; the quota has been 37 cm since 1980 [Aiken, 1998]. Only the southeastern neck of the study area in Figure 1 is excluded from the NRD where these regulations apply.

Figure 2.

Upper Republican Natural Resources District.

[9] The U.S. Bureau of Reclamation has used less than 19.3 km (12 miles) from a perennial stream as the criterion for counting irrigation wells that influence a stream, a choice that is convenient when using the survey unit of townships, which is 93.3 km² (36 mi²) on U.S. maps. By this criterion the northwest corner of Figure 1 is farther than 19.3 km from a perennial stream, but earlier maps from the Bureau of Reclamation show Frenchman and Stinking Water Creeks as perennial streams extending over into Colorado [U.S. Department of Interior, 1996, p. 4]. Therefore all of the wells in Figure 1 were included in the time series analysis where the sample begins in 1936.

[10] Declining groundwater levels associated with irrigation wells in the Frenchman Creek and similar areas in the upper Republican River basin appear to be associated with declining streamflows in the area, because many of the streams in the basin receive a portion of their flow from the aquifers. As groundwater levels decline, hydraulic gradients toward the streams are reduced, thus reducing aquifer-to-stream discharge. Nearby wells can lower the water table to the point where the hydraulic gradient to the stream is reversed, thus causing streamflow depletion.

[11] According to the U.S. Department of Interior [1996], “A good example showing the probable connection between a declining water table and its effects on streamflows would be Frenchman Creek above Enders Reservoir. A majority of the flow in Frenchman Creek is derived from the adjacent alluvial and Ogallala aquifer system… Historically, the upstream point where Frenchman Creek was believed to have a perennial flow was located several miles west of the Nebraska-Colorado state line. Now the point of “perenniality” appears to be several miles to the east of the state line.”

[12] Wells of more than 10.16 cm in diameter drilled in Nebraska must be registered with the State within one year of completion. The date of registration therefore provides an excellent estimate of the time at which pumping of a well began. The registration information also includes the legal description of the quarter section in which the well is located, and a well must pump at least 3.218 m³/min to be classified as an irrigation well. This information, together with the mapping of groundwater aquifers, was used to identify those wells that were likely to influence Frenchman Creek streamflow at Culbertson; well locations are shown in Figure 1. Information on the wells within the townships that covered the groundwater aquifer was retrieved from the electronic database maintained by the Nebraska Natural Resources Commission; this information comes from well registration forms that are filed with the Nebraska Department of Water Resources.

[13] The first recorded irrigation wells were drilled in 1951 and development continued rapidly, but with fluctuations that followed farm commodity price cycles and improvements in irrigation technology. The total number of wells that had been registered through 1998 (the last year of the sample) was 1238. The 15 years from 1936 to 1950 were included to serve as a “control” in the implicit experimental design of the data, i.e., no irrigation wells had been registered; thus the sample comprises 63 years, 1936–1998. This 15-year predevelopment period not only improves statistical precision by adding degrees of freedom, but substantially reduces the possibility of a spurious regression between streamflow and number of irrigation wells, both of which are strongly trended after 1950. Annual precipitation measurements at two locations, Imperial and Culbertson, Nebraska (see Figure 1), were found to be important explanatory variables in the analysis. Streamflow data are from U.S. Geological Survey [1936–1998b], and precipitation data are from U.S. National Climatic Data Center [1936–1998].

[14] The multiple linear regression model can be written as

$$Y_t = \alpha + \beta_1 X_{1t} + \beta_2 X_{2t} + \cdots + \beta_k X_{kt} + \varepsilon_t$$

where α and β₁, β₂, …, βₖ are fixed unknown parameters. Classical assumptions for the random disturbance, εₜ, are that it is temporally uncorrelated, has expectation zero, and has constant variance σ². The subscript t denotes a particular observation, which in this application is an observed year in the time series sample; Yₜ is annual streamflow; and the {Xᵢₜ} are the set of independent variables. The sample of 63 observations is sufficiently large that the generalized central limit theorem justifies application of traditional statistical inference procedures in the linear regression model based on the normal distribution [Theil, 1971, p. 378; Jennrich, 1969].

[15] Visual inspection of the time series streamflow data suggested nonstationarity with a strong downward trend. A linear regression using the logarithm of Frenchman Creek streamflows was estimated with current and lagged precipitation as explanatory variables, but neither variable was close to significant at traditional levels, and the Durbin-Watson statistic calculated from the regression residuals was 0.2 (expected value with independent residuals is approximately 2.10 [Maddala, 1992, p. 231]). Reestimation with a first-order autoregressive disturbance gave good statistical precision on the two precipitation variables; the autoregression coefficient on the lagged residual was 1.00, which yields a first-difference model, for which the Durbin-Watson statistic was 2.5. This exercise clearly indicated nonstationarity of Frenchman Creek annual streamflow.
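For reference, the Durbin-Watson statistic quoted above is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The following is a minimal sketch of that calculation; the function name and the two toy residual series are illustrative assumptions and are not the Frenchman Creek residuals.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences of the residuals
    divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals (not from the paper).
drifting = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]        # strong positive autocorrelation
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # strong negative autocorrelation

print(durbin_watson(drifting))     # close to 0
print(durbin_watson(alternating))  # well above 2
```

A value far below 2, such as the 0.2 reported for the regression on precipitation alone, points to strongly positively autocorrelated residuals.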

[16] The analysis of the effect of irrigation wells on streamflow was begun with a simple linear regression where annual streamflow was regressed on the existing number of wells. A statistically significant result was obtained, but the plotted residuals exhibited an irregular pattern with some large outlier observations. The pattern suggested that the assumption of a constant variance for εt is violated, and that the variance becomes smaller as the mean decreases. Reflection on the physical situation, in which well numbers increase and average streamflow declines, suggests that the standard deviation of streamflow might be proportional to its mean. Insofar as irrigation wells decrease the average water table, this would tend to create a buffering effect on streamflows, and thus reduce variability of the flow. Only seasonal and not annual variation might be affected, but nevertheless, the time series data suggested that the standard deviation of annual streamflow is proportional to the mean. Using a logarithmic transformation of the dependent variable creates a model with this property.

[17] The above discussion leads to the following multiple regression model,

$$Y_t = \exp\left(\alpha + \beta_1 X_{1t} + \beta_2 X_{2t} + \cdots + \beta_k X_{kt}\right) v_t$$

where vₜ is the multiplicative random disturbance. Taking logarithms of both sides of this equation yields the linear-in-parameters model,

$$\ln Y_t = \alpha + \beta_1 X_{1t} + \beta_2 X_{2t} + \cdots + \beta_k X_{kt} + \varepsilon_t$$

where εₜ equals ln(vₜ). When the model was fitted to streamflow data, the convex function that is implied for this relationship between mean streamflow and number of irrigation wells exhibited a better fit to the data than did a statistically more complex model that forced a linear relationship between mean streamflow and irrigation wells but with a multiplicative disturbance. This latter model is nonlinear in the parameters and requires a nonlinear least squares algorithm. In this application we have a fortuitous situation where the logarithmic transformation of the dependent variable removes the problem of heteroscedasticity in the disturbance term, and also improves specification of the functional form of the mean relationship between the dependent and independent variables.
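The property that motivates the logarithmic transformation can be illustrated numerically. In the sketch below the disturbance values and the two mean levels are hypothetical choices made only for illustration. With a multiplicative disturbance, the standard deviation of streamflow scales with its mean, while the standard deviation of the logarithm does not change.

```python
import math
import statistics

# Hypothetical multiplicative disturbances v_t (illustrative values only).
V = [0.80, 0.90, 1.00, 1.10, 1.25]

def spread(mean_level):
    """Return (sd, coefficient of variation, sd of logs) for Y_t = mean_level * v_t."""
    y = [mean_level * v for v in V]
    sd = statistics.pstdev(y)
    cv = sd / statistics.mean(y)
    sd_log = statistics.pstdev([math.log(v) for v in y])
    return sd, cv, sd_log

# Assumed "early" and "late" mean levels, in m3/s d.
for level in (1300.0, 420.0):
    sd, cv, sd_log = spread(level)
    print(level, round(sd, 1), round(cv, 4), round(sd_log, 4))
```

The standard deviation falls with the mean level, but the coefficient of variation and the standard deviation of the logarithms are the same in both cases, which is the constant-variance condition the log-linear model needs.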

2.1. Additive Model

[18] In principle, a complete statistical model should allow for dynamic factors that might affect streamflow, such as precipitation and number of irrigation wells in previous years. A larger number of wells in previous years would affect the water table (at least seasonally) and this could affect the amount and timing of water entering the stream from springs. Precipitation in previous years could also affect present streamflow, and there could be an interaction between the wells and precipitation with respect to their joint effects. An interaction between two factors implies nonadditive net effects within the context of the linear statistical model [Scheffé, 1959]. The statistical analysis was begun without any allowance for interactions between precipitation and number of wells.

[19] The unit of measure for streamflow is m³/s d (86,400 m³), and the unit for precipitation is centimeters. The observation for 1960 had a very large residual, and when it was removed from the sample by means of an indicator variable (zero for all observations except 1960 where it was assigned a value of one), the t ratio was 3.6, which is significant at the 0.001 level. An examination of monthly precipitation in 1960 showed March precipitation to be four times normal, and substitution of the sample mean for March made the annual total nearly normal. Therefore the 1960 indicator variable was left in the regression equation in order to obtain more reliable statistical inferences.
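The effect of a single-year indicator variable can be seen in the simplest possible case, a regression on a constant plus the indicator. This is a simplification of the model in the text, which also contains precipitation and wells. In that case least squares has a closed form, and the flagged observation is fitted exactly. The function name and data below are hypothetical.

```python
def fit_constant_plus_indicator(y, flagged_index):
    """Least squares fit of y on a constant and an indicator equal to 1
    only at flagged_index. Closed form for this special case:
    intercept = mean of the unflagged observations,
    indicator coefficient = flagged value - intercept."""
    others = [v for i, v in enumerate(y) if i != flagged_index]
    intercept = sum(others) / len(others)
    indicator_coef = y[flagged_index] - intercept
    fitted = [intercept + (indicator_coef if i == flagged_index else 0.0)
              for i in range(len(y))]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return intercept, indicator_coef, residuals

# Hypothetical log streamflows with one unusually high year at index 3.
log_flow = [6.9, 7.0, 7.1, 7.6, 7.0]
intercept, coef, resid = fit_constant_plus_indicator(log_flow, 3)
print(intercept, coef, resid[3])
```

The flagged year no longer influences the intercept, and its residual is zero, which is why observed and predicted values coincide for an observation handled this way.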

[20] The analysis was begun with a separate precipitation variable for each of the two locations, but the respective coefficients on the current and lagged precipitation variables were so nearly equal that the adjusted R² was larger in a constrained model that used average precipitation over the two locations. Results for this parsimonious model are given in the first equation of Table 1, where the numbers in parentheses are t ratios of the regression coefficients. All coefficients are significantly different from zero at the 0.01 level. Sample means of ln(Y), P, and W are 6.7202, 50.40, and 485, respectively.

Table 1. Regression Equations^a

| Explanatory Variable | Equation (1) | Equation (2) | Equation (3) | Equation (4) |
|---|---|---|---|---|
| Intercept | 6.2920 (58.3) | 6.2556 (63.4) | 6.2285 (60.2) | 6.7626 (12.5) |
| Precipitation(t) | 0.01074 (7.37) | 0.01221 (8.86) | 0.0126 (8.97) | 0.0391 (6.38) |
| Precipitation(t-1) | 0.00632 (4.28) | 0.00606 (4.52) | 0.00633 (4.76) | 0.0157 (2.47) |
| Wells(t) | −0.000907 (−29.3) | −0.000919 (−32.6) | −0.00328 (−3.13) | −0.000111 (−4.73) |
| Wells(t-1) | | | 0.00479 (2.31) | |
| Wells(t-2) | | | −0.00243 (−2.17) | |
| Sqrt(wells*precip(t-1)) | | −0.00248 (−3.69) | −0.00293 (−4.17) | |
| Sqrt(wells*precip(t-2)) | | 0.00248 (3.69) | 0.00293 (4.17) | |
| Standard error of estimate | 0.109 | 0.09895 | 0.09615 | 0.45 |
| Adjusted R² | 0.936 | 0.948 | 0.9505 | 0.64 |
| Degrees of freedom | 58 | 57 | 55 | 52 |
| Durbin-Watson statistic | 1.73 | 1.84 | 1.80 | 1.73 |

^a Numbers in parentheses are t ratios. Columns 2–4 (equations 1–3) are for Frenchman Creek, and column 5 (equation 4) is for the Republican River.

[21] The primary variable of interest in this regression equation is the number of wells, which is significant at the 0.001 level with a t statistic equal to −29.3. Since the dependent variable is the logarithm of streamflow, the standard error of the estimate can be thought of as a proportional measure with respect to mean streamflow expressed in the original unit, and the precision with which the regression equation will predict streamflow is proportional to its mean in a given year. A conditional prediction of annual streamflow based on the first regression equation in Table 1 is profoundly more accurate than using the long-run mean of streamflow over the sample period, 1936–1998. This point is illustrated by using 1998 as an example; from the sample mean of the logarithm and the predicted logarithm of streamflows in 1998, the respective streamflows are 829 and 401 m³/s d.
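The arithmetic behind this comparison can be checked directly from the numbers quoted in the text. The sketch below back-transforms the sample mean of ln(Y) and expresses the standard error of estimate as a multiplicative band; the variable names are illustrative.

```python
import math

mean_log_flow = 6.7202        # sample mean of ln(Y), from the text
predicted_1998_flow = 401.0   # m3/s d, from the text
see = 0.109                   # standard error of estimate, equation (1)

# Streamflow implied by the sample mean of the logarithm (about 829 m3/s d).
flow_at_mean_log = math.exp(mean_log_flow)

# The long-run value is roughly twice the conditional prediction for 1998.
ratio = flow_at_mean_log / predicted_1998_flow

# One standard error in logs corresponds to a multiplicative band on streamflow.
band = (math.exp(-see), math.exp(see))

print(round(flow_at_mean_log), round(ratio, 2), [round(b, 3) for b in band])
```

Read this way, a standard error of 0.109 in logarithms corresponds to roughly plus or minus 11% of the predicted streamflow, whatever its level in a given year.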

2.2. Interactions Between Wells and Precipitation

[22] Nonadditivity in the net effects of precipitation and number of wells is expected because the seasonal cones of depression in the water table caused by irrigation wells would be interacting with surface water emanating from precipitation and irrigation return flows, and thus affect the amount and timing of water reaching the stream. Two different functional forms for the interaction were tested, (1) the product of precipitation and number of wells and (2) the geometric mean of these two variables. The latter was definitely the better choice based on statistical fitting to the data, and appeared to be more plausible based on a priori reasoning about the structure of response, i.e., the partial derivatives of the regression equation with respect to number of wells and precipitation. Much better results were obtained by using precipitation at the Imperial location in the interaction terms than the average at Culbertson and Imperial; a subscript i on P denotes the Imperial location. It would appear that these contemporaneous and lagged interaction terms for nonadditivity should sum to zero because the dynamics involve precipitation which is a strictly short-period phenomenon, even though number of wells is not.

[23] The estimated regression equation from the 1936–1998 sample with the coefficients on current and lagged interactions forced to sum to zero is the second equation in Table 1, where the t ratios of the regression coefficients demonstrate excellent precision. The additive model represented by equation (1), and the interaction model by equation (2), are contrasted statistically by a formal test of the hypothesis that there are no interactions. Because the coefficients on the two interaction terms are constrained to sum to zero, the single t ratio for the two interaction terms provides the test statistic which is significant at the 0.005 level. The mathematical structure of the regression equation is analyzed in Appendix A.
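One way to impose the sum-to-zero constraint, sketched here under the assumption that it is done by substitution, is to enter a single regressor equal to the difference between the current and twice-lagged geometric means. A single coefficient on that difference is equivalent to two coefficients of equal size and opposite sign. The function name and data are hypothetical.

```python
import math

def constrained_interaction(wells, precip_imperial):
    """Return z_t = sqrt(W_t * P_t) - sqrt(W_{t-2} * P_{t-2}) for t >= 2,
    where sqrt(W * P) is the geometric mean of wells and precipitation."""
    g = [math.sqrt(w * p) for w, p in zip(wells, precip_imperial)]
    return [g[t] - g[t - 2] for t in range(2, len(g))]

# Hypothetical series: no wells in the first three years, then development.
wells = [0, 0, 0, 4, 9]
precip_imperial = [49.0, 36.0, 64.0, 25.0, 16.0]  # cm, illustrative values

z = constrained_interaction(wells, precip_imperial)
print(z)
```

The regressor is zero throughout any period with no wells, so in the predevelopment years the interaction model reduces to the additive one.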

[24] A statistical test of the hypothesis that a total of four interaction terms constrained to sum to zero (periods t, t-1, t-2 and t-3) is the correct model instead of only two (t and t-2) was not significant at the 25% level (P value = 0.32). A split sample test was performed to evaluate specification of the regression model as the second equation in Table 1, where the subsamples were 1936–1967 and 1968–1998. The F statistic with 4 and 53 degrees of freedom was 0.55, which is not close to being significant at conventional levels, thus supporting the model represented by equation (2) in Table 1.
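Tests of this kind are commonly computed from the residual sums of squares of a restricted model (one set of coefficients for the whole sample) and an unrestricted model (coefficients allowed to differ between subsamples). The sketch below shows that generic F statistic. The sums of squares are made-up numbers and only the degrees of freedom, 4 and 53, come from the text.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for linear restrictions, from residual sums of squares."""
    numerator = (ssr_restricted - ssr_unrestricted) / n_restrictions
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator

# Hypothetical residual sums of squares (not the paper's values).
ssr_pooled = 1.10   # one set of coefficients for 1936-1998
ssr_split = 1.00    # coefficients allowed to differ between subsamples

f_value = f_statistic(ssr_pooled, ssr_split, n_restrictions=4, df_unrestricted=53)
print(round(f_value, 3))
```

A small F value, such as the 0.55 reported above, indicates that letting the coefficients differ between the two subsamples does little to improve the fit.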

[25] A graph comparing observed and predicted logarithms of streamflow over the 1936–1998 sample period is given in Figure 3. Close agreement between the model and the data is apparent in this graph as well as the graph of residuals given in the lower part of Figure 3. The graph in Figure 4 presenting observed and predicted streamflows in their natural units provides another view of the data vis-à-vis the model. The multiplicative disturbance term in conjunction with the decreasing mean streamflow produces a declining variance of the random disturbance over time that is apparent in the graph of residuals. The same phenomenon appears in the time path of the observed and predicted streamflows too; notice how they tend to converge. Mean streamflow is constant between 1936 and 1950 before groundwater development started; it then declines steadily throughout the remainder of the sample period. The spike in the graph for predicted streamflow in 1960, with observed and predicted flows equal, is the result of treating 1960 as an outlier observation by means of an indicator variable.

Figure 3.

Natural logarithm of Frenchman Creek annual streamflows (m³/s d): observed, predicted, and residual, 1936–1998.

Figure 4.

Frenchman Creek annual streamflows (m³/s d): observed, predicted, and residual, 1936–1998.

[26] The ability of a time series regression equation to accurately forecast outside the sample used for statistical estimation of the unknown parameters is arguably the best test of its validity and usefulness. The test is performed by truncating the sample used in estimation and then observing the forecast accuracy of the fitted relationship on the remaining observations. This exercise was performed on the Frenchman Creek data by deleting the last 20 years of the sample, 1979–1998; the forecast residuals are presented in Figure 5. The largest residual in absolute value equaled −0.24 with standard error 0.118 and occurred in 1991; this is the only residual that reached the boundary of a 95% confidence interval. The mean and standard deviation of the 20 postsample forecast residuals are −0.032 and 0.0992, respectively; these statistics for the same observations when the entire sample is used for estimation are −0.045 and 0.0913. Overall, this is remarkable consistency for a time series regression on nonexperimental data.
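The mechanics of such a postsample test can be sketched with a one-variable regression. This is a deliberately reduced stand-in for the multiple regression in Table 1, and the data and function names are hypothetical. The model is fitted to the early observations only, and forecast residuals are computed for the held-out years.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def postsample_residuals(x, y, n_holdout):
    """Fit on all but the last n_holdout points; return forecast residuals
    (observed minus predicted) for the held-out points."""
    split = len(x) - n_holdout
    intercept, slope = fit_line(x[:split], y[:split])
    return [yi - (intercept + slope * xi)
            for xi, yi in zip(x[split:], y[split:])]

# Hypothetical regressor and log streamflow; the last two points are held out.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [7.0, 6.9, 6.8, 6.7, 6.6, 6.5, 6.45, 6.25]

resid = postsample_residuals(x, y, n_holdout=2)
mean_resid = sum(resid) / len(resid)
print([round(r, 3) for r in resid], round(mean_resid, 3))
```

The mean and standard deviation of these held-out residuals are the quantities compared in the text with the corresponding full-sample residuals.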

Figure 5.

Postsample forecast residuals of Frenchman Creek annual streamflows with 95% confidence band (logarithm m³/s d), 1979–1998.

[27] These postsample forecast results demonstrate that the pumping quota imposed on irrigators within the Upper Republican NRD since 1980 has had little influence on annual streamflows in Frenchman Creek, probably because consumptive use was nearly the same with or without the pumping quota. Irrigation return flows were apparently the variable that adjusted in the contrast between the two periods with and without the pumping allotment.

[28] However, it would appear that the variance of the logarithm of Frenchman Creek streamflows has decreased, that the 1979–1998 period is by chance “unrepresentative”, or both, because the mean of the standard errors of forecast for these 20 observations is 0.125, in contrast to the computed standard deviation of the observed residuals, which is 0.0992. Normally, a reverse ordering in magnitude would be expected for these two measures because of at least some specification error in the model, which would inflate the standard deviation of the observed residuals. The assumptions underlying the regression model used here are that the mean is changing in relation to the independent variables, but the variance of the stochastic disturbance term is constant over the sample period. Hydrologically, the following question is raised by these reported results: Are the irrigation wells creating a buffer effect on annual streamflows of Frenchman Creek such that there is less variability compared to its virgin state?

[29] A statistical test of the hypothesis that the variances of the disturbance term are equal over the entire sample period, 1936–1998, was made against the alternative that the variances are unequal among the three 21-year subsamples, but a common mean exists over the entire sample, i.e., the regression equation. The differences in variances were not close to significant at conventional levels, but as noted earlier for this model, the variance of streamflow as contrasted to the variance of its logarithm is declining. If the additive model represented by equation (1) in Table 1 is used (well-precipitation interactions excluded), the mean and standard deviation of the 20 postsample forecast residuals are 0.027 and 0.1153, respectively. The interactions between precipitation and number of wells definitely improve the postsample forecasts; the standard error of forecast is reduced by 14%.

[30] Irrigation wells close to the stream should have more influence on annual streamflows than those more distant, particularly with respect to interactions between precipitation and number of irrigation wells. This possibility was explored by classifying the original set of wells into “close” and “far” well categories, where the former excluded wells in Perkins County (see Figure 1). Because the economic factors influencing well development over the sample period were the same for both areas in the dichotomy of close and far wells, the inclusion of a separate variable for each category created what is commonly called the multicollinearity problem in multiple regression [Helsel and Hirsch, 1992]. The regression coefficients associated with close and far wells were so severely confounded that neither was independently significant.

[31] Another model was evaluated in which the total number of wells was used to estimate the additive effect of wells, and the number of close wells was used to define the interaction between wells and precipitation. The model using close wells in the definition of the interactions gave an R² equal to 0.9482 compared to 0.9477 from the model that used total wells. Qualitatively, this ordering is in the direction expected, but this is clearly a trivial difference and not significant at conventional levels, although the lack of precision is not particularly surprising because of the faulty “experimental design” with which we must work, i.e., the high correlation between the two variables defined as numbers of close and far wells. Formally, this situation results in low power for the statistical test to discern any difference between the two alternative models.

[32] This exercise with respect to close and far wells helps clarify what can and cannot be learned from this type of statistical analysis. Our analysis has provided essentially no information about how the distance of well development from a stream quantitatively affects streamflow because the development pattern over time has been so nearly the same over wide areas. What it does unequivocally tell us is that the historic pattern of well development in the Frenchman Creek basin has resulted in profound decreases in average streamflow. Traditional hydrological modeling will have to give us details on the process [Peckenpaugh et al., 1995].

[33] Nevertheless, this type of statistical analysis shows promise in the validation of a dynamic hydrological model of the region. Such a model should in principle be able to duplicate the estimated mean streamflow of Frenchman Creek that was obtained here for a given period using the statistical model. However, the recent Peckenpaugh study was focused on simulating future groundwater pumping depths and reported only cursory results related to Frenchman Creek streamflow, primarily seasonal in nature. The greatest projected increase in pumping depths was for northwestern Chase County (see Figure 1), which in 1998 was more than 19.3 km above Frenchman Creek and its tributaries but was not that far above them in the early years of the sample. The Peckenpaugh study area was bounded on the north by the South Platte River and on the south by the Republican River, and extended about 10 km east and west of Chase and Perkins Counties (Figure 2). Estimated mean streamflow as a function of number of irrigation wells is given in Figure 6. There were 1238 irrigation wells in 1998, for which the graph indicates a mean streamflow equal to 424 m³/s d; this is a little less than one third of the mean in 1950 before groundwater development for irrigation. Extrapolation beyond the range of data used in a regression analysis is always tenuous, but these results strongly suggest a continued decline in mean streamflow as more wells are drilled. Good predictive performance by the equation estimated here might depend on how future new wells are spatially distributed.
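The "one third" figure can be reproduced approximately from the wells coefficient in equation (2) of Table 1. The sketch assumes that precipitation is held fixed and that the two interaction terms cancel when wells and precipitation are constant over time, since their coefficients sum to zero. Under those assumptions the ratio of mean streamflow with 1238 wells to mean streamflow with none is the exponential of the coefficient times the number of wells.

```python
import math

coef_wells = -0.000919   # Wells(t) coefficient, equation (2), Table 1
wells_1998 = 1238        # registered irrigation wells in 1998
mean_flow_1998 = 424.0   # m3/s d, read from Figure 6 per the text

# Ratio of mean streamflow with 1998 well numbers to that with no wells.
ratio = math.exp(coef_wells * wells_1998)

# Predevelopment mean implied by that ratio (an inferred value, not one
# reported in the paper).
implied_1950_mean = mean_flow_1998 / ratio

print(round(ratio, 3), round(implied_1950_mean))
```

The ratio comes out at about 0.32, which agrees with the statement that the 1998 mean is a little less than one third of the 1950 mean.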

Figure 6.

Frenchman Creek mean annual streamflows (m³/s d).

[34] A general analysis of the mathematical structure of the second regression equation in Table 1 is given in Appendix A. The equation has negative slope and convex curvature with respect to number of wells in the current period t. The slope is positive with respect to number of wells in period t-2, and the curvature is concave. The positive response of streamflow in year t with respect to number of wells in year t-2 can be attributed to interyear dynamics of irrigation return flows. In trying to understand the dynamics of year-to-year streamflows, as they are affected by added irrigation wells in the basin, it is important to recognize both the space and time dimensions in the physical processes involved. The number of wells has been increasing in an irregular way both with respect to time and space.

[35] Although the sign of ∂ϕt/∂Pit is ambiguous in the simple qualitative analysis in equation (A4), it has a positive outcome for the empirical results. An important contributing factor is the spatial correlation between precipitation at Culbertson and Imperial, which means that ∂ϕt/∂Pit must include (∂ϕt/∂Pct)(∂Pct/∂Pit) as was done in using equation (A6) implicitly to derive equation (A7) in Appendix A. A larger and larger proportion of the runoff from precipitation will tend to reach the stream as precipitation increases, given a fixed number of wells, which explains the positive second-order partial derivative in equation (A4). Precipitation at Imperial in period t-2 increases streamflow in period t at a decreasing rate, i.e., the response is a positively sloped concave function, and is associated with the interaction term in equation (2), Table 1 for year t-2.

2.3. Lagged Wells Model

[36] The above model was generalized to include lagged irrigation wells separately from the lagged interaction terms for wells and precipitation jointly, thus allowing for a more complex dynamic response with respect to number of irrigation wells. Results of this regression are given as equation (3) in Table 1, where it is seen that wells in the current and two preceding years have significant effects on mean annual streamflow. The net effects of current and lagged well numbers alternate in sign from year to year, starting with a negative effect in the current year. The net long-run effect given by the sum of the three coefficients on wells is −0.000919, which is exactly the same as the coefficient on wells in equation (2), Table 1 to three significant digits. The t statistic for the interaction term is larger than that in equation (2), primarily because of the larger parameter estimate, 0.00293 instead of 0.00248, while the respective standard errors are 0.00070 and 0.00067.
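The figures in this paragraph can be checked from the rounded values in Table 1 and the standard errors quoted above. Because the inputs are rounded, the results agree with the reported values only to within rounding error.

```python
# Coefficients on Wells(t), Wells(t-1), Wells(t-2) in equation (3), Table 1.
well_coefs = [-0.00328, 0.00479, -0.00243]
long_run_effect = sum(well_coefs)   # compare with -0.000919 in equation (2)

# t ratio = coefficient / standard error, using the values quoted in the text.
t_eq3 = 0.00293 / 0.00070   # interaction term, equation (3)
t_eq2 = 0.00248 / 0.00067   # interaction term, equation (2)

print(round(long_run_effect, 5), round(t_eq3, 2), round(t_eq2, 2))
```

The sum of the three rounded coefficients is −0.00092, and the two ratios are close to the t ratios of 4.17 and 3.69 shown in Table 1.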

[37] The coefficients on the current and lagged number of wells are alternating in sign with the current time period negative as in equation (2). In the aggregate, existing wells in year t have a negative effect on streamflow in that year; then return flows apparently dominate the following year to give a positive effect, and the negative effects again dominate in the third year. This pattern of alternating signs suggests that a 2-year cycle dominates the interyear dynamics of groundwater movements, which is consistent with the alternating signs of the two interaction terms.

[38] The relatively weak precision of the individual regression coefficients associated with Wt, Wt-1, and Wt-2 is a result of the intercorrelations among these variables in the time series sample. For adjacent time periods, the correlation between the regression coefficients is −0.96, and for Wt and Wt-2, it is −0.83. This is a common phenomenon in nonexperimental data where orthogonality is absent in the set of independent variables, but the standard error of the estimate in equation (3) is smaller than without the lagged wells. Nevertheless, the standard errors of forecast tend to be considerably larger than those from equation (2). This forecasting weakness is probably caused by the irregular way in which wells were added over time and space, particularly with respect to distance from Frenchman Creek. Generalization of the model to include lagged irrigation wells separately from the interaction terms demonstrates another aspect of the dynamics of surface/groundwater interrelationships, but the lagged response of streamflow to new irrigation wells is not temporally consistent enough to improve forecasting performance.

3. Application to the Republican River

[39] Frenchman Creek is an important tributary of the Republican River, and the success of the above analysis encouraged an exploration of the possibilities for a similar analysis of the Republican River. Streamflows at the gauging station near Hardy, NE slightly south of the Kansas-Nebraska border where the river makes its final entry into Kansas are used in the analysis (U.S. Geological Survey streamflow gauging number 06853500), and the data are from U.S. Geological Survey [1936–1998a]. The statistical properties of the Republican River are quite different from those of Frenchman Creek. For the period 1936–1998, mean annual streamflows are 10,349 and 936 m3/s d, and the coefficients of variation are 0.7236 and 0.3940, for the Republican River and Frenchman Creek, respectively. The means differ by a factor of 11 and the coefficients of variation by a factor of 1.84, making the Republican River a much larger and more variable stream.

[40] Since the statistical analysis is performed on a logarithmic transformation of both data series, the two time series are also compared in that form. For the Republican River and Frenchman Creek, respective means of logarithms of annual streamflow are 8.9350 and 6.7202, while the coefficients of variation are 0.0600 and 0.0403. Respective geometric means of streamflow are 7593 and 829 m3/s d. The geometric mean of the highly variable Republican River is substantially less than the arithmetic mean, viz., 27% less, while the comparable figure for Frenchman Creek is 11%.
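The relation between the logarithmic and arithmetic summaries quoted above can be checked directly, since the geometric mean is the exponential of the mean of the logarithms. The following is a minimal sketch that uses only the figures reported in the text; the function and variable names are illustrative and are not from the original analysis.

```python
import math

# Figures quoted in the text for 1936-1998 (annual streamflow, m3/s d).
streams = {
    "Republican River": {"arith_mean": 10349.0, "mean_log": 8.9350},
    "Frenchman Creek": {"arith_mean": 936.0, "mean_log": 6.7202},
}


def geometric_mean_from_logs(mean_log):
    """Geometric mean implied by the mean of natural logarithms."""
    return math.exp(mean_log)


def shortfall(arith_mean, geo_mean):
    """Fraction by which the geometric mean falls below the arithmetic mean."""
    return 1.0 - geo_mean / arith_mean


summary = {}
for name, s in streams.items():
    geo = geometric_mean_from_logs(s["mean_log"])
    summary[name] = (geo, shortfall(s["arith_mean"], geo))
    print(f"{name}: geometric mean {geo:.0f}, "
          f"{100 * summary[name][1]:.0f}% below arithmetic mean")
```

The gap between the two means widens with the variability of the series, which is why it is larger for the Republican River than for Frenchman Creek.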

[41] A statistical procedure similar to that used on Frenchman Creek was applied to the Republican River to estimate the decline in mean annual streamflow that has occurred in response to large-bore wells dominated by irrigation development using groundwater. Data were obtained from the U.S. Bureau of Reclamation on the number of registered irrigation, municipal, and industrial wells in the Republican River basin above the Hardy, Nebraska gauging station for the period 1930–1993. These wells are “the approximate annual number of registered irrigation, municipal, and industrial wells within 19.3 km of a perennial stream within the study area (annual delineation was based on water right date).” The data appeared as a graph in the work of U.S. Department of the Interior [1996, attachment B], and more recent data are not available. Although the variable for wells includes other large-bore wells besides irrigation wells, these other wells would be trivial in relative numbers in this predominantly rural region.

[42] A three-step process was used to update the relationship between the number of large-bore wells and Republican River streamflow for forecast purposes, and simultaneously to test the regression model for specification error and to test the accuracy of the projected number of wells drilled during 1994–1998.

  1. The two annual time series of new irrigation wells drilled in the Frenchman Creek and Republican River basins during 1936–1993 were used in a simple regression with the Republican wells as the dependent variable. Logically, the same economic and technological forces that were associated with drilling new irrigation wells are present with about equal force in both basins. The equation was estimated with the constant term suppressed, i.e., homogeneous form.
  2. The equation estimated in step 1 was used to forecast the unknown number of large-bore wells drilled annually in the Republican River basin during 1994–1998.
  3. In order to evaluate the accuracy of the forecasted wells in step 2, the Republican River regression equation for 1936–1993 (equation (4) in Table 1) was used to forecast the logarithms of streamflow during the 5-year postsample period, 1994–1998 (using forecasted wells from step 2), which were then compared to actual streamflows in the Republican River during that period. The standard errors of these forecasts were, surprisingly, less than the standard error of the estimate in the 1936–1993 regression.
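Steps 1 and 2 above amount to a least squares fit with the constant term suppressed, followed by a forecast from the fitted slope. The following is a minimal sketch of that calculation; the well counts are made-up illustrative numbers, not the series used in the study, and the function names are hypothetical.

```python
def fit_through_origin(x, y):
    """Least squares slope for y = b * x with the constant term suppressed."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one nonzero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def forecast(slope, x_new):
    """Predicted y for new x values under the homogeneous equation."""
    return [slope * xi for xi in x_new]


# Hypothetical annual counts of new wells (NOT the study data).
frenchman_new_wells = [10, 25, 40, 30]
republican_new_wells = [95, 260, 390, 310]

slope = fit_through_origin(frenchman_new_wells, republican_new_wells)

# Step 2: project Republican basin wells from known Frenchman Creek counts.
projected = forecast(slope, [12, 8])
print(f"slope = {slope:.3f}, projected new wells = {projected}")
```

Suppressing the constant forces the fitted line through the origin, so a year with no new wells in one basin implies none in the other, which matches the argument that the same forces drive drilling in both basins.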

[43] The estimated regression equation for the Republican River (1936–1993) is the fourth equation in Table 1, where the precipitation variable is the same as used for Frenchman Creek and the numbers in parentheses are t ratios. The sample means of annual precipitation, number of wells, and the logarithm of annual streamflow are 50.19, 4633, and 8.9869, respectively. All coefficients are significant at the 0.01 level, and the range of well numbers is 123 to 11,453.

[44] The consequences of the joint effects of precipitation outcomes and number of irrigation wells on conditional mean streamflow of the Republican River are illustrated by the comparisons given in Table 2. The nine outcomes for conditional mean annual streamflow are from three categories each of number of irrigation wells and annual precipitation. The three precipitation levels are the minimum and maximum observed values and the sample mean for 1936–1998 (location average for Culbertson and Imperial, Nebraska).

Table 2. Republican River Conditional Mean Annual Flows
Precipitation, cm    Number of Wells    Mean Streamflow, m3/s d

[45] Substitution of mean precipitation (1936–1998) into equation (4), Table 1 yields

equation image

as the equation for the mean of the logarithm of streamflow. The graph of Y implied by equation (4) is given in Figure 7. Estimated number of wells in 1998 is approximately 12,500, which when substituted into equation (4) yields 3380 m3/s d for the geometric mean of streamflow, and for the 123 wells in 1936 the result is 13,350 m3/s d. By 1998 the predevelopment geometric mean of annual streamflow in the Republican River had been reduced to about one quarter of its original amount. A reduction in the mean of this magnitude is a serious problem for downstream users, but the relatively large annual variation of streamflow compounds the problem.
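The "one quarter" statement follows from the two geometric means quoted above. The sketch below reproduces that ratio and, as a further illustration, backs out the semilogarithmic slope on wells that these two points would imply. The slope calculation assumes that the logarithm of streamflow is linear in the number of wells at fixed precipitation, which is an assumption made here for illustration because Table 1 is not reproduced in this section.

```python
import math

# Geometric means of annual streamflow quoted in the text.
flow_1936 = 13350.0   # 123 wells
flow_1998 = 3380.0    # approximately 12,500 wells
wells_1936 = 123
wells_1998 = 12500

# Ratio of the 1998 geometric mean to the predevelopment geometric mean.
ratio = flow_1998 / flow_1936

# ASSUMPTION: ln(Y) is linear in wells at mean precipitation, so the two
# points determine a single slope per additional well.
implied_slope = math.log(ratio) / (wells_1998 - wells_1936)

print(f"ratio = {ratio:.3f}")
print(f"implied change in ln(Y) per well = {implied_slope:.6f}")
```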

Figure 7. Republican River mean annual streamflows (m3/s d).

[46] The following set of conditional prediction intervals at the 90% level demonstrates the consequences of reduced mean annual streamflow in the Republican River vis-à-vis its large variability. Letting Y denote streamflow, the 90% prediction intervals (m3/s d) for 123, 6305, and 12,487 wells are respectively:

equation image

These are asymmetric intervals constructed from the corresponding symmetric prediction intervals of the logarithm of streamflow. Groundwater development has reduced both the mean and the 5% probability lower bound on streamflow to about one quarter of their respective values in an undeveloped state. In calculating the prediction intervals in equation (5), the variance of ln(Yt) was obtained by a direct application of the variance operator to the fourth equation in Table 1 and treating current and lagged precipitation as independent random variables, while taking the parameter estimates as given and adding the variance of the disturbance term. The variance of annual precipitation averaged across the two locations was calculated from annual data, 1936–1998, and the standard deviation is 9.768 cm.
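The construction described in this paragraph can be sketched as follows: the variance of ln(Yt) is the sum of the squared precipitation coefficients times the precipitation variance, plus the disturbance variance, and the symmetric interval in logarithms is exponentiated to give an asymmetric interval for streamflow. Only the precipitation standard deviation (9.768 cm) is taken from the text. The coefficients, the disturbance standard deviation, and the use of a normal quantile are hypothetical placeholders, since the estimates in Table 1 are not reproduced here.

```python
import math
from statistics import NormalDist


def prediction_interval(mean_log, precip_coefs, sd_precip, sd_disturbance,
                        level=0.90):
    """Asymmetric interval for Y built from a symmetric interval for ln(Y).

    Treats current and lagged precipitation as independent with a common
    variance and takes the parameter estimates as given.
    """
    var_log = sum(b * b for b in precip_coefs) * sd_precip ** 2
    var_log += sd_disturbance ** 2
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal quantile (assumed)
    half_width = z * math.sqrt(var_log)
    return math.exp(mean_log - half_width), math.exp(mean_log + half_width)


# Hypothetical inputs, for illustration only (not the Table 1 estimates).
center = 3380.0
lower, upper = prediction_interval(
    mean_log=math.log(center),
    precip_coefs=[0.02, 0.01],   # current and lagged precipitation
    sd_precip=9.768,             # cm, from the text
    sd_disturbance=0.40,
)
print(f"90% interval: {lower:.0f} to {upper:.0f}")
```

Because the interval is symmetric in logarithms, the upper bound lies farther from the geometric mean than the lower bound does, which is the asymmetry noted above.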

[47] The unconditional nature of the prediction intervals in equation (5) makes them directly and intuitively applicable to the practical problems and costs associated with streamflow depletion in the Republican River. Since the years of unusually low streamflow are the primary concern of agriculture and other industries dependent on Republican River streamflow for their water supply, the risks to viable economic activity dependent on the river are self-evident. The change in relative risk to agricultural crop production will force farmers to plant crops that are less subject to drought damage, but produce lower average value product per acre than when irrigation water is dependable, e.g., grain sorghum would tend to replace corn as a field crop.

4. Other Sources of Stream Depletion

[48] The U.S. Bureau of Reclamation has estimated the influence of irrigation wells and other sources of “development” on Republican River streamflow by using statistical methods [Lane et al., 1995]. A “basin factor” that aggregates soil and water conservation practices with number of irrigation wells was defined, and quantitatively measured by the number of irrigation wells within 19.3 km of a stream. This definition and measure of the basin factor tacitly assumes perfect correlation between irrigation wells and soil and water conservation practices and is therefore incapable of providing information about the relative importance of irrigation wells versus conservation practices. There were technical problems with respect to the way in which the multiple regression coefficients were calculated, but the above definition of the “basin factor” precludes multiple regression from providing any useful information about whether soil and water conservation practices have had any significant effect on streamflows.

[49] In reference to the Republican River, it has been asserted [U.S. Department of the Interior, 1996, p. 14] that “Soil and water conservation practices (residue management, terracing, and farm ponds) contribute the largest depletion to the basin water supply.” This opinion also appears to be fairly common among federal and state soil and water professionals in Nebraska. Therefore the authors performed statistical tests to shed some light on this question.

[50] After accounting for net effects of the increased number of irrigation wells, the intercept of the regression equation for Frenchman Creek would be expected to shift downward during the last two or three decades in the sample if these conservation practices have an important negative effect on streamflows. Such a downward shift was tested statistically by adding an indicator (dummy) variable to the regression equation in equation (5). This variable assumed the value of zero for all observations except the last 20 years of the sample, where it was assigned unity. The coefficient on the “conservation variable” was negative, but insignificant at any level of interest (t value = −0.5). A set of three indicator variables of 10 years each was also tested, and the set was insignificant at the 10% level.
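The indicator variables described above are simple to construct. The sketch below builds the 20-year "conservation variable" for a 1936–1998 sample, together with one possible reading of the set of three 10-year indicators, namely the last three decades of the sample. That reading is an assumption, and the function name is illustrative.

```python
def indicator(years, first, last):
    """1 for years in [first, last], 0 otherwise."""
    return [1 if first <= y <= last else 0 for y in years]


years = list(range(1936, 1999))   # 1936-1998, 63 observations
last_year = years[-1]

# Zero for all observations except the last 20 years of the sample.
conservation = indicator(years, last_year - 19, last_year)

# ASSUMPTION: the three 10-year indicators cover the last three decades.
decade_dummies = [
    indicator(years, last_year - 10 * k - 9, last_year - 10 * k)
    for k in range(3)
]

print(sum(conservation), [sum(d) for d in decade_dummies])
```

Each indicator would enter the regression as an additional column, so a negative and significant coefficient would signal a downward shift of the intercept in the flagged years.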

[51] Probably the most rigorous statistical evidence that the conservation effect is relatively trivial is provided in Figure 5. Out of 20 postsample residuals, only one reached the 95% confidence boundary. Without any specification error in the regression equation, i.e., assuming that no error was committed by ignoring soil and water conservation practices, the expected number of residuals in the forecast to reach or fall below the lower confidence boundary is 0.025 × 20 = 0.50, thus making the observed outcome in Figure 5 not at all improbable.
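The expected count of 0.50 can be supplemented with the probability of seeing at least one such residual. The sketch below treats the 20 postsample residuals as independent, which is a simplifying assumption made here because forecast errors share the same parameter estimates; the variable names are illustrative.

```python
n_residuals = 20
p_below_lower = 0.025   # one-sided tail of a 95% two-sided boundary

# Expected number of residuals at or below the lower boundary.
expected_count = n_residuals * p_below_lower

# ASSUMPTION: independent residuals, so the count is binomial.
p_at_least_one = 1.0 - (1.0 - p_below_lower) ** n_residuals

print(f"expected count = {expected_count:.2f}")
print(f"P(at least one) = {p_at_least_one:.3f}")
```

Under this assumption the chance of at least one residual reaching the lower boundary is roughly 40%, so a single occurrence is unremarkable.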

[52] Additional evidence that the water conservation effect is relatively unimportant is provided in the results of the “split-sample” regressions described earlier to test for specification error; the postsample residuals of the regression equation fitted to data in the first half of the sample had no negative trend. These 31 postsample forecast residuals appeared random with respect to signs and magnitude. In summary, the data do not support the claim that changes in farming practices are an important cause of the decreased average annual streamflow in Frenchman Creek, nor do they support that claim for the Republican River, in view of the evidence presented earlier on the parallel between the historic depletion of these two streams.

5. Conclusions

[53] Statistical models of annual streamflow in relation to irrigation well development and precipitation would appear quite promising based on results reported here for Frenchman Creek in Western Nebraska and the Republican River slightly upstream from the Kansas-Nebraska border. Such models provide the basis for making probabilistic statements about annual streamflow at various stages of development defined by total number of existing irrigation wells. These statements can be constructed for conditional mean annual streamflow, observed streamflow in a future year taking parameter estimates as given, or allowing for all sources of variation (including parameter estimators).

[54] The probabilistic statements in equation (5) for the Republican River are an example of such an application of the empirical regression equation. In this case the parameter estimates are taken as given, and the variation is associated with both annual precipitation and the random disturbance of the regression equation. Such probabilistic statements of projected future streamflows are much more useful for policy analysis and formation than mere projections of mean streamflows, particularly with a highly variable stream such as the Republican River. Although no statistical analysis was attempted on shorter periods than annual in this study, the positive results here suggest that similar analyses might be useful based on shorter periods; for example, a particular month when irrigation water supply is critical for successful crop production.

Appendix A

[55] The mathematical structure of the primary regression equation for Frenchman Creek (second equation in Table 1) is analyzed in this appendix, particularly the signs of partial derivatives of the regression equation. Parameters b1, b2, b3, and c are defined to be positive, and the appropriate signs are prefixed to denote the parameter estimates in the regression equation. Thus equation (2) in Table 1 becomes

equation image

where Y is streamflow, P is average precipitation at Culbertson and Imperial, Pi is precipitation at Imperial, Pc is precipitation at Culbertson, and W is number of wells, where the subscript t denotes a particular year.

[56] Let ϕ(•) denote the right hand side of equation (A1). Then Y = exp(ϕ(•)), and if X denotes an argument of ϕ(•), ∂Y/∂X = Y(∂ϕ/∂X). Therefore the partial derivative of ϕ(•) in equation (A1) with respect to one of the variables gives the proportional effect on streamflow associated with an increment in that variable. Nonzero partial derivatives of ϕ(•) with respect to number of irrigation wells are given below in the form of inequalities that designate their respective signs. A subscript t is applied to ϕ to clarify the relative dating of variables:

equation image
equation image

The corresponding partial derivatives with respect to precipitation at Imperial are

equation image
equation image
equation image
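The identity ∂Y/∂X = Y(∂ϕ/∂X) stated in paragraph [56], which is what allows the partial derivatives of ϕ to be read as proportional effects on streamflow, can be checked numerically. The sketch below uses a hypothetical one-variable ϕ with made-up coefficients; it is not equation (A1).

```python
import math


def phi(x, a=7.0, b=-0.001):
    """Hypothetical log-streamflow function (NOT equation (A1))."""
    return a + b * x


def streamflow(x):
    """Y = exp(phi(x))."""
    return math.exp(phi(x))


def central_difference(f, x, h=1e-3):
    """Two-sided finite-difference approximation to df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x0 = 500.0
numeric = central_difference(streamflow, x0)              # dY/dX
analytic = streamflow(x0) * central_difference(phi, x0)   # Y * dphi/dX
proportional_effect = numeric / streamflow(x0)            # (dY/dX) / Y

print(f"dY/dX = {numeric:.6f}, Y * dphi/dX = {analytic:.6f}")
print(f"proportional effect = {proportional_effect:.6f}")
```

In this toy case the proportional effect equals the coefficient b, which is why the derivatives of ϕ can be interpreted as proportional changes in streamflow.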

[57] Although the sign of ∂ϕt/∂Pit is apparently in question, the spatial correlation of precipitation at the two locations needs to be recognized in an analysis of this equation, i.e., the variable Pct should be taken as implicitly a function of Pit in deriving ∂ϕt/∂Pit, which makes equation (A4) incomplete. Results of the simple regression of Pct on Pit gave an r² equal to 0.40 and a standard error of the estimate of 3.2. The estimated equation is (t ratios in parentheses)

equation image

and taking ∂ϕt/∂Pit while recognizing this implicit relation between Pct and Pit yields

equation image

which is equivalent to equation (A4) with 0.587b1/2 added to it. For convenient reference, b1 = 0.01221 and c = 0.00248. The maximum number of wells in the sample is 1238 in 1998, which when substituted into the above inequality together with minimum precipitation at Imperial (26.5 cm) yields ∂ϕt/∂Pit = 0.0024 > 0, and therefore ∂ϕt/∂Pit is positive for all observations in the sample.

[58] In using regression methods on nonexperimental data, especially time series, the analysis is limited by the implicit experimental design associated with the independent variable data set. Since the years of least precipitation tended to occur before any irrigation wells existed, we might have expected a problem in this application. With the time series sample mean of annual precipitation equal to 50.4 cm (averaged over the two locations), the three smallest values, 31.4, 34.3, and 34.4, were before 1951 when the first wells were drilled. The next three smallest observations, the largest of which was 36.1, occurred before the number of wells exceeded 100. After the number of wells had reached 900, the smallest precipitation observed was 40.2. This unfortunate distribution of precipitation causes relatively low precision on the coefficient of the two terms for interactions in the empirical regression equation, and particularly increases the standard error for an estimate of mean streamflow in the domain of the independent variables where the number of irrigation wells is large jointly with relatively low precipitation. Therefore updating the sample after a relatively dry year has been experienced could produce a nontrivial change in the estimate of the parameter c associated with the interaction terms, because the number of wells will equal at least 1238, which is the largest number in the present sample.


[59] The authors are grateful to David J. Aiken, Arthur M. Havenner, Richard K. Perrin, and Raymond J. Supalla for their constructive comments. Mark A. Phillips and Fred Wood with the U.S. Bureau of Reclamation provided historic time series data on the number of large-bore wells in the upper Republican River basin. The research, conclusions, and opinions contained in this article are not the official position of the State of Nebraska but only those of the authors. This is University of Nebraska Agricultural Research Division Journal Series 13625.