We estimated long-term annual evapotranspiration (ETQ) at the watershed scale by combining continuous daily streamflow (Q) records, a simplified watershed water balance, and a nonlinear reservoir model. Our analysis used Q measured from 11 watersheds (area ranged from 12 to 1386 km2) from the uppermost section of the Neuse River Basin in North Carolina, USA. In this area, forests and agriculture dominate the land cover and the spatial variation in climatic drivers is small. About 30% of the interannual variation in the basin-averaged ETQ was explained by the variation in precipitation (P), while ETQ showed a minor inverse correlation with pan evaporation. The sum of annual Q and ETQ was consistent with the independently measured P. Our analysis shows that records of Q can provide approximate, continuous estimates of long-term ET and, thereby, bounds for modeling regional fluxes of water and of other closely coupled elements, such as carbon.
 Although seasonal vegetation dynamics have large effects on ET at fine temporal scales [Donohue et al., 2007], at annual and longer time scales, an approximate balance exists between groundwater inflow and outflow, and the difference between precipitation (P) and streamflow (Q) must be balanced by evapotranspiration (ET) [Brutsaert, 1982]. Thus, at long time scales and large catchments, the hydrological role of vegetation may be indirectly inferred based on this steady state assumption. The water balance at long time scales is useful for testing hydrologic and climatologic models with reconstructed hydrologic fluxes from the past. Recent studies suggest that despite increasing upstream consumption of water over the past 100 years, continental and global Q has increased in a manner inconsistent with the changes in P [Labat et al., 2004; Milly et al., 2005; Gedney et al., 2006; Piao et al., 2007; but see Peel and McMahon, 2006]. Whether this is an artifact of uncertainties in scaling point measurements of P or a true pattern attributable to a decrease in ET has important social implications for the amount of usable water in the future [Foley et al., 2005].
 A number of factors can contribute to a reduction in global ET. Some studies report decreases in continental and subcontinental pan evaporation rates, suggesting a decreasing drying power of the atmosphere [Peterson et al., 1995; Ramanathan et al., 2001]. The observed changes in pan evaporation rates and in diurnal temperature have been shown to be consistent with the cyclic variations in the transparency of the atmosphere resulting in solar “dimming” and “brightening” [Wild et al., 2005; Roderick, 2006; Wild, 2009]. The decreased drying power of the atmosphere may also be attributed to reduced near-surface wind or “global stilling” [Roderick et al., 2007]. Declines in near-surface wind speeds have been reported at many terrestrial midlatitude sites in both hemispheres over the past 30–50 years [Roderick et al., 2007, and the references therein; McVicar et al., 2008; Pryor et al., 2009; Jiang et al., 2010]. Others propose that ET is decreasing due to reduction in stomatal conductance with increasing atmospheric [CO2] [Jackson et al., 2001; Gedney et al., 2006; but see Piao et al., 2007] or due to global deforestation trends [Foley et al., 2005]. In addition to potential global effects, these two factors would likely have regionally variable effects. For instance, experimental results show decreased stomatal conductance under elevated [CO2] only in some species, with the stomatal responses having a marginal effect on the canopy scale fluxes [e.g., Ellsworth et al., 1995; Pataki et al., 1998; Field et al., 1995; Wullschleger et al., 2002; Reid et al., 2003; Schäfer et al., 2002; Ainsworth and Rogers, 2007]. In addition, although the global trend of decreasing forest cover over the past years is indisputable [Foley et al., 2005], a given watershed may experience changes to a number of land covers with no net effect on ET. For example, a portion of a grassland watershed may undergo a partial conversion to coniferous plantation with higher ET rates [Jackson et al., 2005], while other parts are being developed and, thus, show reduced ET. At this time, the cause, magnitude, and direction of the historical changes in the components of the global water cycle remain uncertain. One reason is that none of the causes for the hypothetical reduction in ET can be directly validated at global scales because independent and direct estimates of long-term global ET are unavailable.
 To narrow this knowledge gap and provide constraints to hydrologic and climatologic models, methods for estimating long-term watershed scale ET are needed. Approaches have been proposed for estimating watershed scale ET based only on Q records. Daniel  first presented a method for estimating intra-annual variation in ET based on streamflow recession hydrographs and a nonlinear storage-outflow relationship. Recently, Kirchner  presented an inverse modeling method to compute seasonal and diurnal fluctuations in ET. The basic premise in the approach presented here is similar to that in the work of Daniel  and Szilagyi et al. , combining continuous Q records, a “zero-dimensional” watershed water balance (i.e., no spatially explicit routing component), and a nonlinear reservoir model modified from Brutsaert and Nieber . The water stored in the watershed is modeled as water residing in a simple reservoir and/or aquifer (Figure 1). During periods of no P (i.e., no inflow), water leaves the aquifer storage (S) either as ET or Q. An estimate of ET (ETQ) can then be calculated from Q by assuming a functional relationship between S and Q. Unlike the traditional water balance approach and forward hydrologic models, this method does not use P as input and, hence, can provide estimates of ET independent of the rainfall records.
 The parameters of the storage-outflow relationship depend on the physical dimensions and bulk hydraulic properties of the aquifer and the soil storage. The values of the parameters can be inferred from the dynamics of Q following rain events using recession curve analysis methods, i.e., by analyzing the part of the hydrograph where the flow rate decreases in time [Brutsaert and Nieber, 1977]. Motivated by recent studies by Szilagyi et al.  and Kirchner , the objective here was to estimate annual ET from watershed to basin scale and provide a reliable constraint for modeling regional water and carbon fluxes. The study focused on the upper portion of the Neuse River Basin in North Carolina, southeastern United States. In North Carolina, despite the general increase in urbanization, forested land area has changed little since the early 1900s. Forests covered 59% of the state in the early 1900s, 65% in the 1960s, and 59% in the beginning of the 21st century (http://www.fia.fs.fed.us/). In the upper portion of the northernmost section of the Neuse River Basin, our study area (1751 km2; hereafter UNRB), the land cover under woody, agricultural, or herbaceous vegetation was ∼90% in 1999 [Lunetta et al., 2003]. We estimated ETQ based on streamflow data from the 11 USGS gauging stations in UNRB.
 We used three methods to evaluate the ETQ estimates. First, we compared the sum of measured Q and ETQ estimated based on the proposed inverse modeling approach to independent estimates of P, providing a first check of the accuracy of ETQ. This check is based on a similar assumption used for the traditional estimate of annual watershed ET as the balance between the annual values of P and Q, assuming all other terms are negligible. Second, ETQ was compared to eddy-covariance–based ET (ETEC) scaled from measurements performed over three different vegetation cover types (old field, pine plantation, and hardwood forest) nearby at the Duke Forest AmeriFlux sites [Stoy et al., 2006a, 2006b]. Third, we compared ETQ to modeled ET (ETBBGC) based on output from Biome-BGC [Running and Coughlan, 1988]. One novel application of this “basin scale” flux integration approach is the possibility of providing independent constraints on regional carbon fluxes through ET, which we also assessed based on Biome-BGC output.
2. Data and Analyses
2.1. Study Area
 The study area UNRB covers parts of the Piedmont region of North Carolina (1751 km2; Figure 2). In this region, summers are warm (July mean temperature is 25.6°C) and winters are moderate (January mean temperature is 3.3°C). Long-term mean annual precipitation (from 1926 to 2004) is 1151 mm, with a fairly even seasonal distribution. The Piedmont is characterized by highly erodible clay soils, rolling topography, and low-gradient streams [NC DEHNR, 1993]. The mean elevation for the subbasin is 148 m and ranges from 57 to 270 m. The mean slope (± standard deviation, SD) is 2.7° ± 1.8°. Soils are underlain by a fractured rock formation with limited water storage capacity that offers only a limited supply of groundwater [NC DEHNR, 1993]. We do not have information on residence times in our watersheds, but Michel  studied sections of the Neuse River Basin using tritium concentrations at the outflow and suggested that about 75% of the Neuse River's outflow consists of water residing in the basin for a year or less, indicating shallow watersheds.
 To describe the land cover in this area, we classified Landsat images from 1998 to 1999 [Lunetta et al., 2003], summarized within counties and hydrological subunits (13-digit U. S. Geological Survey (USGS) HUCODE). We recombined the original 30 land cover classes in Lunetta et al.  into five classes. The land was classified as 8% urban, 24% agriculture or herbaceous vegetation, 44% woody deciduous, 22% woody evergreen and mixed species forests, and 2% water. Across the USGS hydrologic subunits (average area ± SD of 70 ± 38 km2) where the gauging stations were located, the land area varied 3–23% under urban (including suburban) and 65–78% under woody vegetation. Note that the drainage area of an individual gauging station may be smaller or larger than the USGS hydrologic subunit, and therefore, the land cover fractions presented here are only approximations. However, with the exception of the gauging stations with the smallest drainage area (12 km2), the drainage areas are relatively large compared to the hydrologic subunits, and the area is almost uniformly dominated by forests and, thus, the error caused by this approximation is likely to be small.
2.2. Streamflow and Weather Data
 To represent the watersheds in UNRB, we selected 11 USGS measuring stations in the study area (Figure 2 and Table 1; http://waterdata.usgs.gov/nwis/sw). For each station, the continuous record of daily mean streamflow (Q) was at least 9 years long, and the drainage area was ≥12 km2 and covered mostly with forests and agricultural land.
Table 1. Streamflow Measurement Stations and Estimates of Storage-Outflow Model Parameters (a)
[Table body not recovered. Surviving fragments: column heading “Drainage Area (km2)”; station names “Little Lick Creek” and “Seven Mile Creek”.]
(a) Information on the streamflow gauging stations, associated catchments, the length of the continuous data record used in the study (except for station 11, for which there are no data from 1972 through 1984), and the parameters (standard error) of the storage-outflow relationship obtained from the recession analysis (slope and intercept; see Figure 4).
 To study the variation in annual Q (Figure 3b) and streamflow-based ET (ETQ) with climate, we constructed time series of annual precipitation (P) from 1926 to 2004 to match the longest streamflow record (Figure 3a) and pan evaporation (ETP; from 1964 to 2004). Weather data were obtained from nine local weather stations within or close to the study area from the National Climatic Data Center (http://www.ncdc.noaa.gov) and from Duke Forest FACE (FACTS-I) experiment (http://face.env.duke.edu). There was a slight increasing trend in the annual P over the 79 year period (slope = 1.7 mm yr−1, p = 0.02).
 Daily pan evaporation measurements were available at one weather station close to the study area (Chapel Hill 2 W; 150 m ASL; ∼8 km south of the Duke Forest AmeriFlux sites marked in Figure 2). ETP is the free water evaporation derived from the pan measurements using a pan coefficient of 0.7 following Kohler et al. . Note that year-round pan evaporation was measured only through the first 11 years of the record (1964–1974), after which only warm season (April 1 – October 31) measurements were available. On the basis of the early part of the record, we calculated a ratio between the annual and the growing season ETP, where growing season was defined as from April through September. The mean ratio of 1.46 (coefficient of variation, CV = 2%) was used to estimate annual ETP when only warm season values were available.
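The two conversions described above (pan measurement to free water evaporation, and warm-season total to annual total) can be sketched as follows. This is a minimal illustration that assumes the ratio of 1.46 is applied to ETP after the pan coefficient; the function names and the seasonal pan total are hypothetical.

```python
PAN_COEFFICIENT = 0.7           # pan coefficient stated in the text
ANNUAL_TO_SEASON_RATIO = 1.46   # mean annual/growing-season ETP ratio stated in the text


def free_water_evaporation(pan_mm):
    """Convert a pan evaporation total (mm) to free water evaporation ETP (mm)."""
    return PAN_COEFFICIENT * pan_mm


def annual_etp_from_season(season_etp_mm):
    """Estimate annual ETP (mm) for years with only seasonal measurements."""
    return ANNUAL_TO_SEASON_RATIO * season_etp_mm


# Hypothetical seasonal pan total, for illustration only.
season_pan_mm = 1000.0
season_etp_mm = free_water_evaporation(season_pan_mm)
annual_etp_mm = annual_etp_from_season(season_etp_mm)
```

With the hypothetical 1000 mm seasonal pan total, the seasonal ETP is 700 mm and the estimated annual ETP is about 1022 mm.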
2.3. Estimation of ET From Streamflow Time Series
 In the following sections, we first describe the foundation of the proposed approach: the simple watershed water balance combined with a nonlinear storage-outflow relationship, followed by explaining the steps in the data analysis, the parameterization of the storage-outflow function, and the calculation of annual streamflow-based ETQ.
2.3.1. Watershed-Scale Water Balance Model
 Watershed-scale water balance, over a given time period, is the balance between the water inflow to and outflow from the watershed. The change in the watershed water storage S can be expressed as
$$\frac{dS}{dt} = P - ET - Q + Q_{GIN} - Q_{GOUT} \qquad (1)$$
where S may be further separated into two parts, an unsaturated subsurface storage and a saturated subsurface storage. QGOUT and QGIN are groundwater outflow and inflow, respectively, across the watershed boundaries. During and immediately after a rainfall event, Q is dominated by overland flow and lateral subsurface flow (quick flow). Between rain events, Q is dominated by groundwater discharge (base flow).
 In the “zero-dimensional” watershed water-balance model, the watershed (hereafter the terms watershed and reservoir are used interchangeably) includes the contributing drainage area of a gauging station (Figure 1). The main assumptions of the model are (1) the net groundwater flow across the reservoir boundaries is zero (QGIN − QGOUT = 0) and, hence, the only inflow of water is P and the only outflows are ET and Q; (2) the saturated and unsaturated storages are lumped in a single term (S); and (3) the entire water storage is accessible by roots of the vegetation cover in the watershed.
 For a given watershed with a drainage area of Aws (m2), the change in storage is the difference between inflows and outflows [Brutsaert, 1982],
$$\frac{dS}{dt} = P - ET^{*} - Q \qquad (2)$$
where S is given in m3, t is time in days, and the units for P, ET*, and Q are m3 d−1. Hence, ET (mm d−1) = 1000 ET*/Aws. To determine daily ET* from Q, denoted ETQ, the mass balance for days without rainfall (P = 0) is considered. Days without rainfall are defined as those where dQ/dt < 0, and therefore, they can be identified without using the P record. Note that for large watersheds dQ/dt can be less than zero even on days with rain, depending on where in the watershed it rained and how intense the rainfall was. In this study, the storage-outflow relationship is described with a nonlinear reservoir model of the form S = a′Q^b′ [Brutsaert and Nieber, 1977]. Strictly speaking, this functional relationship is related to the saturated storage, but Szilagyi  and Szilagyi et al.  showed that such a relationship can be maintained between outflow and the water volume stored in the unsaturated and saturated zones combined. ETQ can then be expressed as
$$ET_Q = -Q - \frac{dS}{dt} = -Q - a' b' Q^{b'-1} \frac{dQ}{dt} \qquad (3)$$
This approach allows ET to be derived from Q but independent of P. Similar models have been used to simulate annual, seasonal, and diurnal fluctuations of ET [Szilagyi et al., 2007; Kirchner, 2009].
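As a minimal sketch of the calculation described in this section: on a rainless day the mass balance reduces to dS/dt = −ET* − Q, and with S = a′Q^b′ the chain rule gives dS/dt = a′b′Q^(b′−1) dQ/dt. The function name and all numerical values below are hypothetical and are not taken from Table 1; the linear-reservoir case b′ = 1 is used only to keep the arithmetic transparent.

```python
def et_q_mm_per_day(q, dq_dt, a_prime, b_prime, area_m2):
    """Daily ET (mm/d) for a rainless recession day.

    q         streamflow, m3/d
    dq_dt     rate of change of streamflow, m3/d per day (negative in recession)
    a_prime,
    b_prime   parameters of the storage-outflow relation S = a' * Q**b' (S in m3)
    area_m2   drainage area, m2
    """
    ds_dt = a_prime * b_prime * q ** (b_prime - 1.0) * dq_dt  # chain rule on S(Q)
    et_star = -q - ds_dt                                      # m3/d, with P = 0
    return 1000.0 * et_star / area_m2                         # m3/d -> mm/d


# Hypothetical inputs for illustration only.
example_et = et_q_mm_per_day(q=1.0e5, dq_dt=-8.0e3,
                             a_prime=30.0, b_prime=1.0, area_m2=1.0e8)
```

Here storage declines by 2.4 × 10^5 m3 d−1 while streamflow removes only 1.0 × 10^5 m3 d−1, so the remaining 1.4 × 10^5 m3 d−1 is attributed to ET, or 1.4 mm d−1 over the 100 km2 drainage area.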
2.3.2. Parameterization of the Storage-Outflow Model
 To estimate the storage-outflow model parameters a′ and b′ (equation (3)), we selected data representing conditions in which ET is minimal [Brutsaert and Nieber, 1977; Szilagyi et al., 2007]. During interstorm periods, ET is smallest when temperature or atmospheric water vapor deficit is low or water availability is limiting. Under such conditions, equation (3) can be further simplified to
$$-\frac{dQ}{dt} = \frac{1}{a' b'} Q^{2-b'} = a Q^{b} \qquad (4)$$
The lower envelope of the data in a log-log scatterplot of −dQ/dt to Q (see Figure 4) reflects the smallest changes in streamflow (|dQ/dt|) and therefore minimal ET at a given Q [Brutsaert and Nieber, 1977]. The intercept of this relationship (a = 1/(a′b′)) is a function of the physical dimensions and hydraulic properties of the aquifer [Brutsaert and Nieber, 1977]. On the basis of the solutions for the nonlinear Boussinesq equation, the slope (b = −b′ + 2) can be constrained theoretically for an unconfined, homogenous, horizontal aquifer [Brutsaert and Nieber, 1977]. Accordingly, in the early part of the recession (early-time drawdown and high Q) b = 3, but declines to 1.5 during the late-time drawdown and at low to moderate Q (dashed lines in Figure 4a). A linearized solution of the Boussinesq equation predicts a slope of unity for the late-time domain [Brutsaert and Nieber, 1977; Brutsaert and Lopez, 1998; Szilagyi et al., 2007]. Once the parameters are estimated, they can be used to estimate ET (equation (3)) under other conditions given that the range of Q used for estimation corresponds to that used for parameterization.
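The relations a = 1/(a′b′) and b = −b′ + 2 can be inverted to recover the storage-outflow parameters from a fitted lower envelope. The sketch below assumes the envelope has the form −dQ/dt = aQ^b; the function name and the example values are hypothetical, and the numerical value of a depends on the units chosen for Q and t.

```python
def storage_params_from_envelope(a, b):
    """Convert an envelope fit -dQ/dt = a * Q**b into (a', b') of S = a' * Q**b'."""
    b_prime = 2.0 - b                  # from b = -b' + 2
    if b_prime == 0.0:
        # b = 2 makes a = 1/(a' b') undefined for a power-law storage relation.
        raise ValueError("b = 2 gives b' = 0; power-law storage form is degenerate")
    a_prime = 1.0 / (a * b_prime)      # from a = 1/(a' b')
    return a_prime, b_prime


# Hypothetical envelope parameters, for illustration only.
a_prime_example, b_prime_example = storage_params_from_envelope(a=0.05, b=1.5)
```

For the theoretical late-time slope b = 1.5 this gives b′ = 0.5, and with the hypothetical a = 0.05 it gives a′ = 40.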
 Next, before describing the parameter fitting procedure, we explain how the numerical expressions of the terms in equation (3) were defined to minimize the potential bias caused by measurement precision, following Rupp and Selker [2006a].
2.3.2.1. Using Variable Δt to Decrease Discretization in the Data
 The terms in equation (4) can be numerically expressed as dQ/dt = (Q(t + Δt) − Q(t))/Δt and Q = (Q(t + Δt) + Q(t))/2, where Δt is typically constant. It can be set to equal 1 day [Brutsaert and Nieber, 1977; Szilagyi et al., 2007] or to 5 or 15 min where data frequency permits [Kirchner, 2009; Rupp and Selker, 2006a]. A log-log scatterplot of −dQ/dt to Q typical of a constant Δt of 1 day is shown in Figure 4a. Most data sets showed a similar orientation of observations along horizontal lines, particularly at low values of −dQ/dt. These “lines” of observations are produced when successive measurements differ by integer multiples of the measurement precision, creating an apparent discretization of the data. Rupp and Selker [2006a] first showed that using a constant Δt in the parameter estimation may lead to such a phenomenon and bias the parameters' values. They proposed an improved method, in which, instead of using a constant Δt for each observation in time, Δt is scaled to the observed drop in discharge, ΔQ. This is done by defining a threshold for the difference (Q(t + Δt) − Q(t)) depending on the measurement precision and then calculating the difference by increasing Δt until the threshold is met.
 We adopted the “scaled Δt” analysis method with a few adjustments. First, in the work of Rupp and Selker [2006a], Q was measured every 5 min, while our data set consisted of daily mean Q, which sets the lower limit of Δt to be 1 day. We set 5 days as the upper limit for Δt. Second, instead of forward or backward difference, we used the numerically more accurate central difference approximation: dQ/dt = (Q(t + Δt) − Q(t − Δt))/(2Δt). Third, in the absence of information about the measurement precision of stage height (the height of the water surface) or flow rate (calculated based on stage-streamflow relationship), we set the threshold to 0.001 × based on the results from watersheds studied by Rupp and Selker [2006b]. Data discretization was clearly reduced when variable Δt was used (Figure 4b); thus variable Δt was generally employed.
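The variable-Δt central difference can be sketched as below. This is an illustration under stated assumptions rather than the exact procedure used here: Δt is treated as the half-width of the central difference (1 to 5 days), the threshold is taken as 0.001 times the flow on the day in question, and windows in which flow rises are skipped so that the difference does not span a rain event. The function name and the test series are hypothetical.

```python
def scaled_dt_derivative(q, rel_threshold=0.001, max_half_width=5):
    """Return (index, Q, dQ/dt) tuples from daily mean flows q.

    For each day the half-width k of the central difference is increased
    from 1 day until the drop Q(t - k) - Q(t + k) reaches the threshold
    (assumed here to be rel_threshold * Q(t)) or k exceeds max_half_width.
    """
    results = []
    n = len(q)
    for i in range(n):
        for k in range(1, max_half_width + 1):
            lo, hi = i - k, i + k
            if lo < 0 or hi >= n:
                break  # window runs off the record
            window = q[lo:hi + 1]
            # Assumption: a rise anywhere in the window indicates rain; skip the day.
            if any(later > earlier for earlier, later in zip(window, window[1:])):
                break
            drop = q[lo] - q[hi]
            if drop >= rel_threshold * q[i]:
                results.append((i, q[i], (q[hi] - q[lo]) / (2.0 * k)))
                break
    return results


# Hypothetical daily flows with a plateau caused by limited measurement precision.
flows = [10.0, 9.0, 8.0, 8.0, 8.0, 8.0, 7.0]
derivatives = scaled_dt_derivative(flows)
```

For the plateau days (indices 3 and 4) a 1 day half-width gives a zero difference, so the window widens to 2 days and yields −0.25 per day instead of an artificial zero.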
2.3.2.2. Estimation of the Parameters for the Late-Time Lower Envelope
 In all streamflow datasets, the two time domains (early and late) were recognizable (see Figure 4), but the parameter estimation was most robust for the low-to-moderate range in Q (solid lines in Figure 4). Therefore, parameters obtained from this range only were used in the subsequent calculations of ETQ (described in section 2.3.3). We delineated the lower envelope of the −dQ/dt to Q plot using a boundary line analysis [Schäfer et al., 2000]: data were divided into six Q bins, and for each bin, the “low value” of ∣dQ/dt∣ was calculated as the mean over the observations that were more than a preset multiple (c) of standard deviations (SD) below the mean ∣dQ/dt∣ in that bin. The six “low mean values” of ∣dQ/dt∣ were then regressed against the mean Q of each bin, and the parameter values were obtained from least-squares linear fits. Although this method leaves some observations “on the wrong side” of the envelope (i.e., resulting in negative estimates of ET), it is more statistically robust than using the single lowest value for each bin. To set the value of c, we used data from one gauging station and tested the effect of varying c (from 1 to 4 at 0.25 intervals) on the regression parameters. The slope decreased with c, stabilizing when c ≈ 2; we thus used c = 2 for analyses of data from all watersheds. Approximately 70% of the data used in calculating the “low value” of ∣dQ/dt∣ for each bin was measured during the nongrowing season (1 October through 31 March) and thus during conditions of minimum ET.
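The boundary line analysis can be sketched as follows. Several details are assumptions made for illustration and are not specified in the text: the binning and regression are done in log10 space, the bins have equal width in log Q, and bins with fewer than three observations or with no observation below the cutoff are skipped. The function names are hypothetical.

```python
import math
import statistics


def _fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def lower_envelope(q, neg_dqdt, n_bins=6, c=2.0):
    """Fit -dQ/dt = a * Q**b to the lower envelope; returns (a, b).

    q and neg_dqdt must be positive, and q must span a nonzero range.
    """
    x = [math.log10(v) for v in q]
    y = [math.log10(v) for v in neg_dqdt]
    x_min = min(x)
    width = (max(x) - x_min) / n_bins
    bins = [[] for _ in range(n_bins)]
    for xi, yi in zip(x, y):
        j = min(int((xi - x_min) / width), n_bins - 1)
        bins[j].append((xi, yi))
    bin_x, bin_low = [], []
    for points in bins:
        if len(points) < 3:
            continue
        ys = [p[1] for p in points]
        cutoff = statistics.mean(ys) - c * statistics.stdev(ys)
        low = [v for v in ys if v < cutoff]      # observations > c SD below the mean
        if low:
            bin_x.append(statistics.mean(p[0] for p in points))
            bin_low.append(statistics.mean(low))
    slope, intercept = _fit_line(bin_x, bin_low)
    return 10.0 ** intercept, slope
```

The returned pair (a, b) corresponds to the intercept and slope of the envelope on the log-log plot; at least two bins must contribute a low value for the regression to be defined.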
 Estimates of the parameters (slope and intercept) for the late-time regime of all gauging stations are given in Table 1. The fitted value of slope b ranged from 1.1 to 1.6 and averaged 1.3. We do not have the detailed aquifer information (e.g., average saturation depth, depth to the impervious rock layers, or whether the stream channels are fully incised) to assess characteristics that may play a role in varying the storage-outflow relationships [Szilagyi, 2003]. However, the area is generally characterized by gentle slopes, and the range of values of b is intermediate among plausible solutions (including those for sloping aquifers) reported in the work of Rupp and Selker [2006b]. Typical values of b estimated based on streamflow data range from ∼1 to somewhat higher than 3 [Rupp and Selker, 2006b, and references therein].
2.3.3. Calculation of Annual ETQ
2.3.3.1. Data Filtering
 We used equation (3) with the storage-outflow parameters as given in Table 1 to calculate daily estimates of ET (ETQ), from which daily mean ETQ and annual ETQ were generated. Equation (3) implies that when ET > 0, the diurnal rate of change in Q must be larger than predicted by the lower envelope of the −dQ/dt to Q relationship. This is because the lower envelope represents conditions where ET ≈ 0 (Figure 4) [see also Szilagyi et al., 2007, their equation (6)]. From equation (3), the estimated ETQ for a given Q increases with increasing ∣dQ/dt∣. For a given ∣dQ/dt∣, ETQ can either decrease (when b′ − 1 < 0) or increase (when b′ − 1 > 0) with increasing Q.
 Across the studied watersheds, during the late-time drawdown, the estimated slope (b = −b′+2) was ∼1.3 (see Table 1). When equation (3) is applied to high values of Q, this parameterization predictably resulted in much higher estimates of ETQ when compared to using the equivalent, early-time parameters. Accounting for the two time domains requires parameterization of the two lower envelopes [see Szilagyi et al., 2007]. However, as discussed earlier, in most cases defining the early-time lower envelope was highly uncertain. We therefore estimated ETQ based only on the late-time parameters and excluded unreasonably high values of ETQ generated at high Q values. Thus, daily ETQ estimates smaller than zero or larger than the maximum measured daily mean pan evaporation (ETP) of the month were not included in the calculation of annual mean ETQ. The upper cutoff point, the monthly maximum ETP, was calculated as the mean of the highest values of daily ETP in each month over the pan evaporation record. Note that even when using the two envelopes, some daily ETQ values would still be excluded, implying that hydrologic processes not incorporated in the model are important at high Q and ∣dQ/dt∣.
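The filtering rule can be illustrated with a short sketch. It assumes that the monthly cutoff is the across-year mean of each year's highest daily ETP in that calendar month; the function names and the data values are hypothetical.

```python
def monthly_max_etp(daily_etp):
    """Monthly cutoff from (year, month, etp_mm_d) records.

    For each calendar month, take the highest daily ETP in each year and
    average those maxima over the years of record.
    """
    highest = {}
    for year, month, etp in daily_etp:
        key = (month, year)
        highest[key] = max(highest.get(key, etp), etp)
    by_month = {}
    for (month, _year), etp in highest.items():
        by_month.setdefault(month, []).append(etp)
    return {month: sum(v) / len(v) for month, v in by_month.items()}


def filter_daily_etq(daily_etq, cutoff_by_month):
    """Keep (month, etq_mm_d) records with 0 <= ETQ <= the monthly cutoff."""
    return [(m, et) for m, et in daily_etq if 0.0 <= et <= cutoff_by_month[m]]


# Hypothetical pan record and daily ETQ estimates, for illustration only.
pan_record = [(1964, 7, 6.0), (1964, 7, 8.0), (1965, 7, 10.0), (1965, 7, 4.0)]
cutoffs = monthly_max_etp(pan_record)
kept = filter_daily_etq([(7, -0.5), (7, 3.0), (7, 9.5), (7, 9.0)], cutoffs)
```

In this example the July cutoff is the mean of 8.0 and 10.0 mm d−1, so the negative estimate and the 9.5 mm d−1 estimate are excluded.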
2.3.3.2. Gap-filling, Scaling Up, and Comparisons With Other ET Estimates
 Given the number of rainless days in which dQ/dt < 0 (averaging 256 days annually) and the filtering procedure described above, 180 ± 15 days yr−1 were available for calculating annual ETQ of each gauging station. The daily mean ETQ was multiplied by the number of days in the year. Thus the datasets of daily ETQ were gap-filled with the mean daily ETQ of the year. We did not assume that ETQ = 0 for rainy days, because in this region a large portion of the summertime P falls as convective late-afternoon showers; thus, assuming zero ET for days with P > 0 is often not justified [Juang et al., 2007]. On the basis of the EC data used in this study [Stoy et al., 2006a, 2006b], the mean daily ETEC averaged over days where P > 0 mm was 75% of that over days where P = 0 mm. Similarly, the daily ETP averaged over days where P > 0 mm was 82% of that over days with P = 0 mm. We also did not assume that ETQ = ETP for the quick flow days, because summertime atmospheric vapor pressure deficit is high, the soil surface dries quickly, and ET rarely equals potential ET. Nevertheless, had we gap-filled based on both assumptions, the annual ETQ estimates would have been, on average, ∼5% lower than our current estimates. Compared to the effect of gap-filling on estimates of mean annual ETQ, the estimates are relatively more sensitive to variation in the parameter values of the storage-outflow relationship. For example, analyzing the data from gauging station 6 (see Figure 4 and Table 1) and using the ETP-based filtering scheme described above showed that a 10% decrease in the slope (accompanied by an increase in the intercept) decreases the mean ETQ by ∼40%, while a 10% increase in b (accompanied by a decrease in the intercept) increases the mean ETQ by ∼30%.
 Finally, only 1–4% of the area in each watershed was classified as “urban, high density,” mostly impervious land, and the rest of the urban land was suburban, from which the energy-driven ET was assumed to be similar to that from vegetated areas [Grimmond and Oke, 1999]. Thus, ETQ estimates were not corrected by the fraction of urban land cover, and ETQ for UNRB was obtained as a simple drainage area–weighted average over the gauging stations.
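The gap-filling and scaling steps amount to two short calculations, sketched below. The function names and the numerical values are hypothetical and serve only to show the arithmetic.

```python
import calendar


def annual_etq(valid_daily_mm, year):
    """Annual ETQ (mm): mean of the retained daily values times days in the year."""
    days = 366 if calendar.isleap(year) else 365
    return sum(valid_daily_mm) / len(valid_daily_mm) * days


def area_weighted_mean(values, areas_km2):
    """Drainage-area-weighted average over gauging stations."""
    return sum(v * a for v, a in zip(values, areas_km2)) / sum(areas_km2)


# Hypothetical example: two stations with different drainage areas.
station_annual = [annual_etq([2.0, 2.0, 2.0], 2001), 800.0]   # mm/yr
basin_etq = area_weighted_mean(station_annual, [100.0, 300.0])
```

Multiplying the mean of the retained days by the number of days in the year is equivalent to filling every missing day with that year's mean daily ETQ.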
 To assess how well the traditional annual watershed balance model and the proposed approach agree, we compared the annual sum of ETQ and Q scaled for UNRB with P. Precipitation was independently estimated simply by averaging data from nine weather stations in the area (see details above) and was not used as an input in the model. In addition, over a 4 year period from 2001 to 2004, ETQ was compared to eddy-covariance–based ET (ETEC) measured nearby at the three Duke Forest AmeriFlux sites (location marked in Figure 2) over (1) an old field (OF, abandoned from agricultural use), (2) a maturing pine plantation (PP; 18 years old in 2001), and (3) a mature hardwood forest (HW; 80–100 years old) [Stoy et al., 2006a] and scaled to UNRB. The scaling was done by allocating the study area into three cover types (agricultural/grassland, evergreen coniferous forest, and deciduous broadleaf forest) represented by the three AmeriFlux sites. Another modeled estimate of the ET (ETBBGC) for the UNRB was based on output from Biome-BGC [Running and Coughlan, 1988]. Biome-BGC runs are described in the next section.
2.4. Simulations of Ecosystem Gas Exchange Using Biome-BGC
 Biome-BGC is a biochemical and ecophysiological model that uses daily meteorological data and general soil information to model energy, carbon, water, and nitrogen cycling in various ecosystems. In Biome-BGC (v4.1.2), evaporation and transpiration are calculated using a modified Penman-Monteith approach [Kimball et al., 1997]. Available energy is partitioned between the canopy and soil surface, and evaporation is a function of time since soil wetting. Transpiration is a function of leaf area index and total canopy conductance. Stomatal conductance is estimated by reducing a maximum value based on the variation in environmental factors such as soil moisture availability, atmospheric vapor pressure deficit, leaf water potential, and air temperature. The amount of water lost in ET is subtracted from the soil water compartment creating a feedback to the calculation of stomatal conductance, ET, and carbon uptake. The general model structure and processes are documented elsewhere [Running and Coughlan, 1988; Kimball et al., 1997; Thornton et al., 2002; Churkina et al., 2003].
 For the modeling approach, we first computed ecosystem gas exchange to represent the Duke Forest AmeriFlux sites (described above). The three EC towers measure carbon and water fluxes over three vegetation types (and developmental stages) that represent the majority of the area found in UNRB: (1) old field, (2) pine plantation, and (3) mature hardwood forest. The model runs for these flux sites used site-specific soil and weather data and were adjusted for the time from last disturbance to represent the developmental state of the sites (Peter Thornton, personal communication). We ran the model using mostly the default ecophysiological parameters and those used in earlier studies (for PP) [Thornton et al., 2002; Siqueira et al., 2006]. For HW, we replaced some default parameter values for local ones [Oren and Pataki, 2001; Pataki and Oren, 2003] and used them for deciduous forests in UNRB as well. The default C3 grass parameterization was used for both the old field characterization and for agricultural land in UNRB. To estimate the uncertainty around the modeled flux estimates, a simple sensitivity analysis was performed. On the basis of the findings by White et al. , we selected three parameters (maximum stomatal conductance, fraction of nitrogen in Rubisco, and specific leaf area) that have a large effect on gross primary production owing to their impact on leaf area index and canopy conductance. The values of these parameters were varied by ±20% individually and in all possible combinations.
 We then simulated ecosystem fluxes in UNRB by creating distributions of different input regimes to represent the spatial variation in the weather (see above), soil, vegetation cover, and the age distributions of forest stands in the basin. The land cover classification (agricultural/grassland, coniferous forest, and deciduous broadleaf forest) was based on that by Lunetta et al. , the distribution of soil types (loam, loamy sand, and sandy loam) was obtained from the Soil Survey Staff (http://soildatamart.nrcs.usda.gov) and the forest stand age distribution (three age classes: 0–15, 16–49, and 50 years and older) from the USDA-Forest Service Forest Inventory Analysis Database (http://www.fia.fs.fed.us/). The model was run for the years 2001–2004 with all possible combinations of the input data: seven weather stations by three soil classes by three vegetation types by three age classes resulted in 189 different input regimes. The UNRB-scaled estimates of ecosystem fluxes were then calculated as weighted averages from the output distributions, where the weight for each input regime (weather station by soil type by vegetation type by age class) was determined by its proportional cover in the landscape.
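The construction of the input regimes and the weighted scaling can be sketched as follows. The station labels, flux values, and weights are hypothetical placeholders; how the proportional cover of each regime was derived is not specified here, so the weights are simply taken as given.

```python
from itertools import product

stations = [f"station_{i}" for i in range(1, 8)]      # seven weather stations
soils = ["loam", "loamy sand", "sandy loam"]
vegetation = ["agricultural/grassland", "coniferous forest",
              "deciduous broadleaf forest"]
age_classes = ["0-15", "16-49", "50+"]

# Every combination of weather station, soil, vegetation type, and age class.
regimes = list(product(stations, soils, vegetation, age_classes))


def weighted_flux(flux_by_regime, weight_by_regime):
    """Landscape-scale flux as a cover-weighted average over input regimes."""
    total = sum(weight_by_regime[r] for r in flux_by_regime)
    return sum(flux_by_regime[r] * weight_by_regime[r]
               for r in flux_by_regime) / total


# Hypothetical fluxes (mm/yr) and cover weights for two regimes only.
demo_flux = {regimes[0]: 500.0, regimes[1]: 700.0}
demo_weight = {regimes[0]: 0.25, regimes[1]: 0.75}
demo_scaled = weighted_flux(demo_flux, demo_weight)
```

The product of the four input lists yields the 189 regimes stated above (7 × 3 × 3 × 3).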
3.1. Relationship Between Q, ETQ, and P
 Both annual streamflow (Q) and Q-based estimates of annual evapotranspiration (ETQ) increased with precipitation (P, 1926–2004, averaged over the five watersheds with the longest records; Table 1, Figures 5a and 5b). In contrast to the positive ETQ-P relationship, ETQ showed a slight inverse correlation with pan evaporation (ETP, r2 = 0.11, p = 0.01, inset in Figure 5b). The variation of the sum of Q and ETQ followed the variation of the mean annual P reasonably well (Figure 5c). Note, however, that plotting P as dependent on Q + ETQ produces a large intercept (480 mm, p < 0.01; not shown). The average sum (±SD) of Q and ETQ (340 + 756 = 1096 ± 217 mm yr−1) was similar (p = 0.08) to average P (1151 ± 167 mm yr−1). Annual ET calculated as P − Q and ETQ were weakly correlated (r2 = 0.01, p = 0.02, not shown) and the ratio of their cumulative sums (ETQ/(P − Q)) was 0.93 (Figure 5d).
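The closure of the water balance can be checked with the long-term means reported above. The sketch assumes that 340 mm yr−1 is the mean Q and 756 mm yr−1 is the mean ETQ, which is the assignment consistent with the reported ratio of 0.93; the variable names are hypothetical.

```python
# Long-term means in mm/yr; the split of 340/756 between Q and ETQ is an assumption.
p_mm, q_mm, etq_mm = 1151.0, 340.0, 756.0

closure = (q_mm + etq_mm) / p_mm          # (Q + ETQ) relative to measured P
etq_vs_residual = etq_mm / (p_mm - q_mm)  # ETQ relative to ET computed as P - Q

print(round(closure, 2), round(etq_vs_residual, 2))   # prints 0.95 0.93
```

Under this assumption the sum of Q and ETQ accounts for about 95% of the measured P, and ETQ is about 93% of the traditional P − Q estimate.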
3.2. Variation in ETQ With Land Cover
 The consistency of the relationships between P, Q, and ETQ indicated that the hydrologic properties of the watersheds changed little through time. We also analyzed the variation in the values of the storage-outflow relationship parameters, temporally from one 10-year period to another, and spatially across the watersheds. We found no directional change in either the values of the parameters or in the estimates of ETQ with time (Figure 6). The mean difference between the two estimates of ETQ through time, calculated using temporally varying parameter values versus fixed values, was <4% (30 mm yr−1, Student's t test, p = 0.56). Hence, temporal variability in the parameter values did not significantly affect the long-term mean ETQ, the pattern in the time series, or the variability among the watersheds (Figure 6b). Moreover, we detected no clear land use signal in either Q or ETQ (averaged over the period of 1988–2003) among 8 of the 11 watersheds (1, 2, 4–6, 9–11, Table 1) where the urban (mostly suburban) land cover was 10% on average and ranged from 3% (in 6) to 23% (in 9).
3.3. Comparison of ETQ With Other Modeled and Measured Estimates of ET
 Annual ETQ for UNRB, estimated for 2001–2004 using data from the five active gauging stations (1, 4–6, 11, Table 1), was similar to or higher than the eddy covariance-based estimates of ET scaled for UNRB (ETEC) (Figure 7a) [Stoy et al., 2006a] and ET simulated with Biome-BGC (ETBBGC). The mean ETEC/ETQ was 0.94 and the mean ETBBGC/ETQ was 0.72. While the differences in the 4-year average ET between the three methods were not statistically significant (t test, minimum p = 0.09), the estimates of ETBBGC were consistently lower than those based on Q and EC. Some explanations for the difference between ETBBGC and ETEC were found in the comparisons at the level of individual AmeriFlux sites (Figure 7b). The ratio of ETBBGC/ETEC was 0.90, 0.81, and 0.61 for the hardwood site (HW), the pine plantation (PP), and the old field (OF), respectively.
4.1. Strengths of Streamflow-Based ET Methods
 The method used in this study produces annual estimates of watershed scale ET that are based on measured Q and are independent of P. This is in contrast to the traditional annual water balance approach, where the mean annual ET is estimated as the difference between annual P and Q. It is also different from forward hydrologic models, where P is distributed and routed across the watershed to arrive at the runoff, and ET must be modeled a priori. Here the annual ETQ estimates were based on the recessions of the streamflow hydrographs of selected watersheds over days without rainfall. The recessions reflect the relationship between changes in the watershed water storage and Q, where part of the change in the storage is the loss as ET. The rest of the watershed-scale hydrologic processes are “invisible” to the model. The recession slopes are not identical but vary with the initial conditions, such as antecedent soil moisture [Rupp and Selker, 2006b]. To estimate the parameters of the storage-outflow relationship and “average out” the effects of variable initial conditions, we used at least a 10-year record of daily measurements of Q. In principle, if the rainless days are defined as dQ/dt < 0, no additional climate data are needed. In addition, changes in the watershed properties, such as vegetation cover, can be detected as changes in the storage-outflow relationship.
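As a rough illustration of the recession analysis described above, the sketch below selects days with dQ/dt < 0 from a daily Q series and fits a power law of the form −dQ/dt = aQ^b. Both the power-law form and the ordinary least-squares fit in log space are assumptions made for this example. They stand in for, and are not the same as, the parameter estimation used in this study, and the function names are hypothetical.

```python
import numpy as np


def recession_pairs(q):
    """Return (Q, -dQ/dt) for daily steps on which streamflow declined."""
    q = np.asarray(q, dtype=float)
    dq_dt = np.diff(q)                # change in Q per day (daily time step)
    q_mid = 0.5 * (q[1:] + q[:-1])    # mean flow over each daily step
    falling = dq_dt < 0               # rainless days defined as dQ/dt < 0
    return q_mid[falling], -dq_dt[falling]


def fit_power_law(q, neg_dq_dt):
    """Least-squares fit of log(-dQ/dt) = log(a) + b*log(Q); returns (a, b)."""
    b, log_a = np.polyfit(np.log(q), np.log(neg_dq_dt), 1)
    return float(np.exp(log_a)), float(b)


# Synthetic check: an exponential recession behaves as a linear reservoir (b = 1).
q_synthetic = 10.0 * np.exp(-0.1 * np.arange(30))
a_fit, b_fit = fit_power_law(*recession_pairs(q_synthetic))
```

Fitting a central tendency through all recession points is the simplest choice; a lower-envelope fit, intended to represent periods when ET is minimal, would require an additional selection step not shown here.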
 Similar parsimonious approaches for extracting watershed scale ET from measured streamflow were recently proposed by Szilagyi et al. and Kirchner. The main differences between these studies and ours are the way the storage-outflow relationship is defined and parameterized and how monthly, seasonal, and annual estimates of ET are aggregated. Szilagyi et al. demonstrated, using numerical experiments, that a model based on a single storage-outflow relationship (i.e., a lumped storage model) reproduced values of daily ET reasonably well under ideal conditions (e.g., no measurement error, simple geometry, and aquifer properties) and idealized aquifer flow [see also Szilagyi, 2003]. When applied to data from real watersheds, the estimates worsened and the model was unable to capture the seasonal fluctuations in ET [Szilagyi et al., 2007]. Kirchner, on the other hand, computed diurnal and seasonal variations of ET that, at least semiquantitatively, followed other modeled estimates. It was concluded that even in cases where the streamflow-based ET fails to quantitatively predict the absolute rates of ET, the approach may be useful for estimating relative temporal changes.
 Our results showed that the annual estimates of ETQ were comparable to estimates obtained from traditional annual watershed water balance, eddy covariance measurements, and Biome-BGC model simulations (Figure 7). Taken together, these findings suggest that while the approach presented here may not replace traditional hydrologic models, especially at short time scales, it can be used for estimating annual ET, particularly when long-term Q is available so ETQ can be averaged over a period of a few years. Moreover, we found that despite differences in land cover types among the watersheds and decadal changes in the land cover of some, the parameters of the storage-outflow relationship obtained from Q were fairly conservative among the watersheds (see Figure 6). This is consistent with the finding of Stoy et al. [2006a] that ET was similar at three nearby AmeriFlux sites, an abandoned field, a pine plantation and a broadleaf forest, suggesting that land cover change may have a small effect on this region's energy-limited ET, as long as the area remains vegetated. The land cover in UNRB remained largely under forest and agricultural land (∼90% in 1999), and thus, our results seem to extend the stand-level finding to the watershed scale.
4.2. Uncertainties and Limitations
 In the following section, we briefly assess some of the uncertainties and limitations related to the ET estimation methods compared in this study. With regard to the Q-based methods, both Szilagyi et al. and Kirchner suggested a number of potential limitations to their respective approaches. While their observations and our earlier discussion on various methodological issues (in section 2) are not repeated here, we bring up some issues most pertinent to the present approach.
4.2.1. Pan Evaporation, Annual Watershed Water Balance, and Eddy Covariance–Based ET
 The watershed water balance approach assumes that the estimates of P and Q are unbiased. However, P is discontinuous, with many complex interacting factors governing its spatial and temporal distribution [Roe, 2005], and even small variation in altitude (∼50 m) can drive large differences (100%) in local P between hill tops and valley bottoms [Bergeron, 1961]. Moreover, underestimation is inherent in the standard P measurements due to undercatch [Legates and Willmott, 1990]. The global average of the underestimation (undercatch mostly due to snowfall and wind) in gauge-based P estimates is ∼11% [Legates and Willmott, 1990]. A summertime undercatch estimate of 4–6%, applicable to most of the United States [Legates and DeLiberty, 1993], is likely to represent our study area better than the global average. Accounting for underestimation in P, and assuming no bias in Q and no spatial bias in P, would increase (i.e., worsen) the difference between our ETQ estimate and P−Q.
 The eddy covariance–based method is also likely to underestimate ET. This is partly because the two instruments (sonic anemometer and infrared gas analyzer) must be thoroughly dry for proper operation. This requirement generates data gaps during and immediately following rain events, periods in which intercepted water is reevaporated [Stoy et al., 2006a]. The locally estimated interception losses are on the order of ≤20% of P [e.g., Oren et al., 1998; Schäfer et al., 2002; Oishi et al., 2008]. As with eddy covariance measurements, interception losses are not accounted for in streamflow-based ET estimates [Szilagyi et al., 2007; Kirchner, 2009], and hence, this bias has the same sign in both ETQ and ETEC estimates. Finally, the simple areal scaling scheme (from AmeriFlux sites to UNRB) may have caused a bias, the magnitude and sign of which are difficult to estimate.
4.2.2. Limitations to the Streamflow-Based ET Method
4.2.2.1. Parameterization of the Storage-Outflow Function
 Perhaps the most important limitation of the streamflow-based approach employed in this study is that it is difficult to independently validate the underlying assumptions of the lumped watershed response based on available data. For instance, as discussed in the work of Szilagyi et al., this is reflected in the placement and parameterization of the storage-outflow function (relating −dQ/dt to Q when ET is minimal), which remains ambiguous due to the various ways watershed drainage can be influenced by processes not included in the model, such as snowmelt or overland flow [Rupp and Selker, 2006b; Szilagyi et al., 2007; Kirchner, 2009]. Among the watersheds in our study, the parameter estimation appears to be most robust when Q is low to moderate. However, using this parameterization throughout the Q range resulted in unreasonably high ETQ estimates at high Q. We used the monthly maximum of daily ETP to exclude these high values, and although this may seem to limit the use of the method to areas where pan evaporation data are available, modeled potential evaporation could be used instead.
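One way to picture the exclusion of unreasonably high ETQ values is the sketch below, which masks any daily ETQ that exceeds the largest daily pan evaporation recorded in the same month. It is an illustration under stated assumptions: the function name and its arguments are hypothetical, and the cap is computed per individual month of record, which is one possible reading of "monthly maximum of daily ETP".

```python
import numpy as np


def filter_by_monthly_max(et_q, et_pan, month_id):
    """Set daily ETQ to NaN where it exceeds that month's maximum daily ETP.

    et_q, et_pan : daily values in the same units (e.g., mm per day)
    month_id     : one label per day identifying its month (e.g., year * 12 + month)
    """
    et_q = np.asarray(et_q, dtype=float)
    et_pan = np.asarray(et_pan, dtype=float)
    month_id = np.asarray(month_id)
    filtered = et_q.copy()
    for m in np.unique(month_id):
        in_month = month_id == m
        cap = np.nanmax(et_pan[in_month])  # largest daily pan evaporation in month m
        filtered[in_month & (et_q > cap)] = np.nan
    return filtered


# Hypothetical values: two days in each of two months.
example = filter_by_monthly_max([1.0, 9.0, 3.0, 4.0], [5.0, 6.0, 2.0, 3.5], [1, 1, 2, 2])
```

Modeled potential evaporation could be passed as `et_pan` in the same way where pan data are unavailable.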
 An alternative approach to parameterization was adopted by Szilagyi et al. To ensure that the estimated annual ET (calculated as the cumulative sum of daily ET over rainless days) remained within a reasonable range, the placement of the envelope was guided with information on ET estimated as P − Q. In contrast, Kirchner analyzed streamflow data collected at 15 min intervals using nighttime measurements only for the parameterization of the sensitivity function (i.e., the storage-outflow relationship discussed here), thus ensuring that ET = 0. His approach has the advantage of needing no further guiding or filtering. Indeed, in areas where high-resolution data are not available, finding conditions where ET ≈ 0 may limit the applicability of the approach presented here. For example, in areas with high and temporally well-distributed P, short interstorm periods, and/or low seasonality in temperature, this approach may not be as useful.
4.2.2.2. Oversimplification of the Watershed Water Balance Description
 There is another possible reason why ETQ could underestimate ET. While a single storage term is assumed in the derivation of equation (3) [Szilagyi, 2003; Szilagyi et al., 2007], the nonlinear reservoir model (S = a′Q^b′) is for saturated storage. To incorporate unsaturated storage as well, it could be written as S = S1 + a′Q^b′, and equation (3) modified so that:
where dS1/dt describes the change in unsaturated storage. While this quantity is negligible on an annual basis, it is positive immediately following a storm event. This means that, because water for ET can originate from the unsaturated storage in the watershed, ETQ likely represents a lower bound for actual ET (i.e., ET ≥ ETQ). However, S1 is also likely to be time dependent, thus increasing the dimensionality of the problem because time-dependent parameters must be included. Our data do not allow us to further evaluate the relative magnitude of the two storage terms in any meaningful way.
4.3. Can Basin Scale Estimates of ET Be Used for Estimating Carbon Exchange?
 Broader “ecological implications” are often cited as one of the motivations in studies that focus on estimating watershed-level ET [e.g., Dias and Kan, 1999; Szilagyi et al., 2007]. This is justified by the importance of ET as an indicator of energy and mass transfer and photosynthetic activity of the catchment [Szilagyi et al., 2007]. High productivity is typically accompanied by high water use because stomata regulate both transpiration (T, the dominant term in ET of most vegetated land covers) and photosynthesis (i.e., gross primary production, GPP, the total canopy carbon uptake). Thus, if the “ecosystem water use efficiency” or the relationship between carbon uptake and T is known, estimates of T can be translated to carbon uptake. This type of relationship is utilized in many process-based ecosystem models, such as Biome-BGC [Running and Coughlan, 1988]. Indeed, recently Beer et al. estimated carbon uptake for Europe based on its water balance, using the traditional watershed-wide estimates of ET (= P − Q) multiplied by the ratio of ecosystem carbon uptake and ET derived from EC measurements of the EuroFlux network. When calculated in this way, the uncertainties in the estimates of carbon uptake are related to the estimates of T (or ET) and/or the conversion of water to carbon. At the leaf scale, this conversion is defined as the water use efficiency (WUE) and varies with exogenous factors such as atmospheric CO2 concentration and vapor pressure deficit, given as WUE ≈ ca(1 − ci/ca)/(aD), where ca is the atmospheric CO2 concentration, D is the vapor pressure deficit, a ≈ 1.6 accounts for the ratio of the molecular diffusivities of water vapor to CO2, and ci/ca is the effective intercellular-to-ambient CO2 concentration, which reflects the physiological attributes of the plant and may be treated as a constant for a given species type (at long time scales).
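The leaf-scale relationship can be made concrete with a short numerical sketch. It assumes the commonly used form WUE ≈ ca(1 − ci/ca)/(aD), which is consistent with the symbol definitions above, with ca and D both expressed as mole fractions. The function name and the input values are hypothetical and serve only to show the arithmetic.

```python
def leaf_wue(ca, ci_over_ca, vpd, a=1.6):
    """Leaf-scale water use efficiency in mol CO2 per mol H2O.

    ca         : atmospheric CO2 concentration as a mole fraction (mol mol-1)
    ci_over_ca : ratio of intercellular to ambient CO2 concentration
    vpd        : vapor pressure deficit D as a mole fraction (mol mol-1)
    a          : ratio of molecular diffusivities of water vapor and CO2
    """
    return ca * (1.0 - ci_over_ca) / (a * vpd)


# Illustrative values only: 400 ppm CO2, ci/ca = 0.7, D = 0.015 mol mol-1.
wue_example = leaf_wue(400e-6, 0.7, 0.015)
```

With these inputs the numerator is 1.2e-4 and the denominator 0.024, so WUE is about 0.005 mol CO2 per mol H2O. The expression also shows that, for a fixed ci/ca, WUE rises in proportion to ca and falls in inverse proportion to D.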
 We estimated carbon uptake for UNRB over the 2001–2004 period using two methods: Biome-BGC (for estimating GPP) and scaling of the ecosystem-level EC measurements based on vegetation cover and ETQ (for estimating gross ecosystem productivity, GEP) [Stoy et al., 2006a, 2006b]. Our simple scaling scheme was based on a vegetation-specific, constant ratio of T to ET, and an empirical linear relationship between T and GEP (= net ecosystem exchange + ecosystem respiration) (Figure 8) [Goulden et al., 1997; Stoy et al., 2006a, 2006b]. The interannual variability in both T and ET and their ratio was small in all our AmeriFlux sites (mean T/ET over 2001–2004 ± SD: 0.54 ± 0.14, 0.74 ± 0.02, and 0.72 ± 0.001 for OF, PP, and HW, respectively) [Stoy et al., 2006a]. In addition, we modeled net ecosystem exchange (NEE; here positive values indicate net uptake of carbon) with Biome-BGC and calculated it from the EC measurements based on the annual ratio of NEE to GEP (see inset in Figure 8). Logging and other losses of carbon from UNRB were not considered.
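The scaling scheme can be sketched as a cover-weighted sum in which ETQ is first converted to T with the site-specific T/ET ratio and then to GEP with a linear relationship. The T/ET means below are those quoted in the text. Everything else is hypothetical: the function name, the cover fractions, the regression coefficients, and the use of the OF, PP, and HW site values to represent the basin's land cover classes.

```python
# Mean T/ET over 2001-2004 for the old field, pine plantation, and hardwood sites.
T_OVER_ET = {"OF": 0.54, "PP": 0.74, "HW": 0.72}


def scale_gep(et_q, cover_fraction, slope, intercept):
    """Cover-weighted GEP from annual ETQ using GEP = slope * T + intercept.

    et_q            : annual ETQ (mm yr-1)
    cover_fraction  : fraction of area per vegetation type (should sum to 1)
    slope, intercept: hypothetical GEP-T regression coefficients per vegetation type
    """
    total = 0.0
    for veg, fraction in cover_fraction.items():
        transpiration = T_OVER_ET[veg] * et_q          # T from ETQ and T/ET
        gep = slope[veg] * transpiration + intercept[veg]
        total += fraction * gep
    return total


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
gep_example = scale_gep(
    1000.0,
    {"OF": 0.5, "PP": 0.25, "HW": 0.25},
    {"OF": 2.0, "PP": 2.0, "HW": 2.0},
    {"OF": 0.0, "PP": 0.0, "HW": 0.0},
)
```

Because the T/ET ratio is held constant for each vegetation type, any bias in ETQ carries through proportionally to T and, through the linear relationship, to GEP.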
 Depending on the year, the Biome-BGC-based estimate of T (BBGC; Figure 9) was similar to or much lower than that obtained from ETQ and the ratio of TEC/ETEC (ETQ-EC; Figure 9). The differences increased as T was converted first to GPP and then to NEE. Thus, based on our simple scaling approach, UNRB is a strong sink for carbon, whereas the Biome-BGC simulations suggest it is a much weaker sink and even close to carbon neutral in some years. In addition to the differences between the T estimates of the two methods, GEP (or GPP) estimates differed as a result of the different ecosystem water use efficiency (WUEE = GPP (or GEP)/T) generated (Biome-BGC) or used (ETQ-EC) by the two approaches. The differences in T and WUEE between the ETQ-EC scaling approach and the Biome-BGC model were large for some land covers but not for others. For example, at HW, TEC/TBBGC was 0.97, but at PP and OF, the Biome-BGC-based T estimates were considerably less than the corresponding estimates from the Q-EC measurements with ratios of 0.52 and 0.56, respectively. The differences in WUEE partly compensated at PP but increased the difference at OF, such that the resulting GEPQ-EC/GPPBBGC was 1.02 for HW, 0.70 for PP, but only 0.27 for OF.
 Finally, both methods estimated similar site-scale NEE (≈0) at OF and on average agreed reasonably well at the two forested sites; among years the variability of NEEEC/NEEBBGC was large, ranging at HW from 3.3 during a drought year to 0.8, and at PP from 2.1 in a wet year to 0.9. This suggests that, in Biome-BGC, broadleaved forests are too sensitive to drought and pine forests have too little capacity to take advantage of ample water. Despite a reasonable agreement of the site scale averages of the two methods, the agreement between NEE estimates at the subbasin scale degraded even more than the agreement between GPP and GEP estimates (compare the lower panels in Figure 9). In part, the difference was related to the ∼20% of the study area that was covered by young forest stands (ages between 0 and 15 years). In these areas, Biome-BGC simulated unrealistically negative NEE, reflecting a strong source of carbon for several years during the regeneration-establishment phase following harvest [Lai et al., 2002; Thornton et al., 2002], whereas the ETQ-EC–based method estimated an unrealistically strong sink for carbon in these young stands.
 We used a streamflow-based approach, formulated in a similar way to those by Szilagyi et al. and Kirchner, to quantify long-term ET at large spatial scales and demonstrated that annual ET can be reasonably constrained with this method. The information obtained here may have important applications. For example, ETQ has the potential to be used as an alternative method to estimate carbon uptake if the spatial variation in ecosystem water use efficiency, which changes with vegetation type and developmental stage, can be quantified. As this information becomes more widely available from a combination of sources, including continuous forest inventory plots and remote sensing, ETQ may provide a complementary method for estimating carbon uptake at regional scales.
 This study was supported by Duke University's Center on Global Change, National Science Foundation (NSF-EAR-06-28432 and 06-35787), and Office of Science (BER), U.S. Department of Energy, through the Southeast Regional Center of the National Institute for Global Environmental Change (SERC-NIGEC-04Duo13CR) and Southeastern National Institute for Climatic Change Research (NICCR). The climate data in the Neuse River Basin were provided by State Climate Office of North Carolina, North Carolina State University.