## 1. Introduction

[2] Efforts to accurately predict patterns of carbon dioxide exchange between terrestrial ecosystems and the atmosphere are currently limited by our ability to represent the relevant biogeochemical processes in unifying models, which typically parameterize fluxes as a function of environmental variables. Models of the global carbon cycle need to accurately capture the dynamics of terrestrial biosphere-atmosphere exchange at a range of timescales, because forcings and responses occur across a broad temporal spectrum, from seconds (e.g., light capture by leaves) to years (e.g., community dynamics). Field biometric studies have historically been used to validate model predictions at long timescales, and evaluation of the rapid ecophysiological mechanisms has been limited to important, but temporally sparse, leaf and soil chamber measurements.

[3] In the past decade, at several hundred locations around the world, eddy flux tower measurement programs have been established to quantify ecosystem-atmosphere CO_{2} exchange with high-frequency, near-continuous, multiyear measurements. These net ecosystem exchange (NEE) measurements provide another data source for ecosystem model evaluation. One primary advantage of using eddy flux data for process studies and model evaluation is the continuity of the measurements, with time intervals typically 0.5–1 hour. Many time series are now between 5 and 15 years in duration (e.g., Harvard Forest [*Wofsy et al.*, 1993], Walker Branch Watershed [*Baldocchi and Vogel*, 1996], and Howland Forest [*Hollinger et al.*, 2004]). Another advantage is that the measurements are associated with a growing and coordinated effort (e.g., AmeriFlux) to establish networks of towers that span a range of ecosystem types and environmental conditions. Also, eddy flux sites tend to be foci for a suite of other measurements including meteorological variables, biometry, and other types of flux measurements. The primary disadvantage, with respect to understanding terrestrial biogeochemistry, is that measurements of eddy flux do not themselves directly quantify specific ecosystem processes but rather the net result of several processes. Of secondary concern are occasional instrument failures and other normal data collection gaps and errors.

[4] Net ecosystem exchange observations record the typically small imbalances between the gross component fluxes of ecosystem respiration and photosynthesis [*Wofsy et al.*, 1993], and while NEE data can be compared to model predictions, it is often more desirable to validate modeled component fluxes independently. The gross fluxes individually reflect distinct sets of processes whose mechanisms might influence one another but are largely separable. The net flux does not constrain the overall dynamics as well as the component fluxes because the net flux could be mistakenly modeled by gross fluxes having large compensating errors. Furthermore, some models, for example those driven by remote sensing observations, focus on uptake by photosynthesis, also known as gross ecosystem exchange (GEE), with little or no attempt to predict respiration [e.g., *Prince and Goward*, 1995; *Xiao et al.*, 2004]. Models such as these require independent GEE estimates for validation, and eddy flux observations of NEE can be useful in estimating these independent GEE data sets.

[5] In principle, the eddy flux data, along with associated meteorological drivers (e.g., temperature, solar radiation, humidity), contain enough information to allow separation of the net flux into its gross components [*Goulden et al.*, 1996a; *Braswell et al.*, 2005], though there is currently no agreed-upon approach for doing so, and the underlying uncertainties are not well quantified. The basis for this disaggregation is the fact that nighttime NEE reflects respiration processes only, and to the extent that respiration can be predicted during the day on the basis of relationships with predictor variables at night, daytime GEE can be estimated essentially as the difference between NEE and modeled respiration. Thus GEE estimates rely heavily on model predictions for large contiguous intervals (i.e., all daylight hours). Like any statistical inference, this process carries with it some prediction uncertainty that should be quantified in order to compare tower-based GEE with independent observations or model predictions.
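As a minimal illustration of the disaggregation logic described above, the Python sketch below fits a respiration model to nighttime fluxes and subtracts its daytime predictions from daytime NEE. The exponential temperature response, the log-linear least squares fit, the sign convention (positive flux means release to the atmosphere), the function names, and the synthetic noise-free data are all assumptions made for illustration; they are not taken from any particular study.

```python
import math


def fit_night_respiration(temps, night_nee):
    """Fit R = a * exp(b * T) by ordinary least squares on log(R).

    Assumption: every nighttime flux is positive (net release), which
    real, noisy tower data would not guarantee.
    """
    logs = [math.log(r) for r in night_nee]
    n = len(temps)
    mean_t = sum(temps) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(temps, logs))
    b = sxy / sxx
    a = math.exp(mean_l - b * mean_t)
    return a, b


def estimate_day_gee(day_nee, day_temps, a, b):
    """Daytime GEE as observed NEE minus modeled respiration."""
    return [f - a * math.exp(b * t) for f, t in zip(day_nee, day_temps)]


# Hypothetical "true" respiration parameters used to build synthetic data.
TRUE_A, TRUE_B = 2.0, 0.07

# Nighttime: NEE equals respiration (no photosynthesis).
night_temps = [5.0, 8.0, 10.0, 12.0, 15.0]
night_nee = [TRUE_A * math.exp(TRUE_B * t) for t in night_temps]
a_hat, b_hat = fit_night_respiration(night_temps, night_nee)

# Daytime: NEE is respiration plus (negative) uptake.
day_temps = [18.0, 22.0, 25.0]
true_gee = [-10.0, -15.0, -12.0]
day_nee = [g + TRUE_A * math.exp(TRUE_B * t)
           for g, t in zip(true_gee, day_temps)]

gee_hat = estimate_day_gee(day_nee, day_temps, a_hat, b_hat)
```

Because the synthetic data contain no noise, the fit recovers the assumed parameters and the uptake values almost exactly. With real data, nighttime fluxes are noisy and can be negative, so the log transform used here would need modification, and daytime temperatures may lie outside the range sampled at night.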

[6] An additional factor that must be considered in utilizing eddy flux data is the existence of missing data resulting from inevitable instrumental lapses. Also, periods of low atmospheric turbulence result in CO_{2} flux measurements that are not representative of the actual ecosystem-atmosphere exchange, and these data typically are removed prior to analysis [*Goulden et al.*, 1996b]. Altogether, the resulting gaps can be extensive and nonrandomly distributed in time. The implication for estimating GEE is that an additional model to fill daytime NEE gaps must be defined and parameterized, which adds some amount of quantifiable prediction uncertainty.

[7] One possible framework for constructing a time series of ecosystem uptake (GEE), given the data and a choice of models, is

$$
G =
\begin{cases}
F - \hat{R}, & \text{daytime, } F \text{ observed} \\
\hat{F} - \hat{R}, & \text{daytime, } F \text{ missing} \\
0, & \text{nighttime}
\end{cases}
$$

where *G* is GEE, *F* is the observed net flux (NEE), and $\hat{R}$ and $\hat{F}$ are the modeled respiration and daytime NEE, respectively. Several previous studies have focused separately on issues related to “gap filling” [e.g., *Falge et al.*, 2001], i.e., defining and evaluating the model $\hat{F}$, and on the general problem of disaggregating NEE into component fluxes, where work has focused principally on choosing an appropriate regression model for $\hat{R}$ [e.g., *Goulden et al.*, 1996a]. More recently, however, data assimilation techniques have been used to both fill gaps in flux records and disaggregate NEE into component fluxes [*Jarvis et al.*, 2004; *Gove and Hollinger*, 2006].
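One way to read this framework is as a simple rule applied at each time step: by day, GEE is the net flux minus modeled respiration, with the modeled net flux substituted wherever the observation is missing. The sketch below is a hypothetical rendering of that rule. The function name, the use of `None` to mark gaps, the example values, and the choice to set nighttime uptake to zero (consistent with nighttime NEE reflecting respiration only) are assumptions of this illustration.

```python
def assemble_gee(nee_obs, resp_model, nee_model, is_day):
    """Build a gap-free GEE series from observed and modeled fluxes.

    nee_obs    : observed NEE, with None marking a gap
    resp_model : modeled respiration for every time step
    nee_model  : modeled (gap-filling) NEE for every time step
    is_day     : True for daytime steps, False for nighttime steps
    """
    gee = []
    for f, r_hat, f_hat, day in zip(nee_obs, resp_model, nee_model, is_day):
        if not day:
            gee.append(0.0)            # assumed: no uptake at night
        elif f is None:
            gee.append(f_hat - r_hat)  # gap: modeled NEE minus modeled R
        else:
            gee.append(f - r_hat)      # observed NEE minus modeled R
    return gee


# Six hypothetical time steps: two at night, three by day, one at night.
nee_obs = [3.0, None, -8.0, None, -5.0, 2.5]
resp_model = [3.0, 3.0, 4.0, 4.5, 4.0, 2.5]
nee_model = [3.0, 3.0, -7.5, -9.0, -5.5, 2.5]
is_day = [False, False, True, True, True, False]

gee = assemble_gee(nee_obs, resp_model, nee_model, is_day)
print(gee)  # [0.0, 0.0, -12.0, -13.5, -9.0, 0.0]
```

Every daytime value depends on the respiration model, and every gap-filled value depends on two models, which is why the prediction uncertainty of both enters the final series.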

[8] To most appropriately use eddy flux-derived GEE for comparison with process models, satellite data, or other field observations, the statistical uncertainties associated with the inference of daytime respiration and NEE during gaps should be quantified so that error bars can be applied at any given choice of timescale. Commonly used statistical approaches for providing error bounds using analytical formulas, such as the formula used to estimate the prediction interval for least squares regression predictions, are not applicable to these data because the underlying assumptions of these approaches do not hold [*Hollinger and Richardson*, 2005]. For example, eddy flux CO_{2} data and the predictions obtained from regressions using these data have (1) nonconstant variance, (2) nonindependence of residuals, (3) non-Gaussian noise, and (4) potential sampling bias due to the nonrandom distribution of data gaps. *Hollinger and Richardson* [2005] conclude that the first three properties listed above result from a combination of the stochastic nature of turbulence, occasional large instrument errors, and the nonuniform occurrences of environmental driving conditions (e.g., over 24 hours, there are far more instances of zero solar radiation than higher values).

[9] Monte Carlo-based statistical techniques such as resampling with replacement (“bootstrapping”) [*Robert and Casella*, 1999] provide a computational solution to the problem of estimating statistical uncertainty in nonlinear model predictions and data with complicating features such as severe heteroscedasticity. Previous studies have utilized ad hoc approaches inspired by bootstrapping to estimate uncertainties of net CO_{2} exchange. Often, the technique is used to estimate uncertainty in a sum of flux estimates over time. The most common application includes the random simulation and filling of additional data gaps [*Falge et al.*, 2001; *Griffis et al.*, 2003]. Another Monte Carlo technique applied to net flux data involves modeling and repeatedly resampling residuals to estimate uncertainty [*Saleska et al.*, 2003]. Uncertainty due to gaps has also been estimated by creating seasonal populations of daily carbon balance that are randomly sampled for comparison with actual fluxes [*Goulden et al.*, 1996b]. Quantification of the measurement uncertainty in flux observations has recently been addressed (this includes defining a suitable probability density function and some measure of the variance) [e.g., *Hollinger and Richardson*, 2005]. Following model parameter optimization using maximum likelihood techniques, random noise with the same statistical characteristics as the measurement uncertainty of the original data can be added back to the model output [*Press et al.*, 1993]. By using repeated simulation, as in a Monte Carlo approach, uncertainty limits can be estimated for model parameters, gap-filled values, or annual sums [e.g., *Richardson and Hollinger*, 2005].
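To make the idea of resampling with replacement concrete, the following Python sketch applies a generic case-resampling bootstrap to a sum of model predictions. It is not a reproduction of any of the cited methods: the straight-line model, the synthetic heteroscedastic data, the percentile interval, and all names and parameter values are assumptions chosen only to keep the example short. It captures uncertainty in the fitted parameters but not the random noise in individual observations.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def bootstrap_sum_interval(xs, ys, x_new, n_boot=2000, level=0.90, seed=0):
    """Percentile interval for the summed prediction over x_new."""
    rng = random.Random(seed)
    n = len(xs)
    sums = []
    for _ in range(n_boot):
        # Resample (x, y) pairs with replacement, then refit the model.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if max(bx) == min(bx):
            continue  # degenerate resample: slope is undefined
        b0, b1 = fit_line(bx, by)
        sums.append(sum(b0 + b1 * x for x in x_new))
    sums.sort()
    lo = sums[int((1 - level) / 2 * len(sums))]
    hi = sums[int((1 + level) / 2 * len(sums)) - 1]
    return lo, hi


# Synthetic data whose noise grows with x (nonconstant variance).
data_rng = random.Random(42)
xs = [0.5 * i for i in range(1, 41)]
ys = [1.0 + 0.5 * x + data_rng.gauss(0.0, 0.2 * x) for x in xs]

# Conditions at which predictions are needed (e.g., inside data gaps).
x_new = [3.0, 7.0, 11.0, 15.0]

b0, b1 = fit_line(xs, ys)
point_sum = sum(b0 + b1 * x for x in x_new)
lo, hi = bootstrap_sum_interval(xs, ys, x_new)
print(f"sum = {point_sum:.2f}, 90% interval = ({lo:.2f}, {hi:.2f})")
```

Because the interval is read directly from the empirical distribution of refitted sums, no assumption of constant variance or Gaussian noise is needed, which is the property that makes resampling attractive for flux data.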

[10] In this paper, we present an example of statistical uncertainty estimation and error analysis for a GEE time series, based on eddy flux data from the Howland Forest in Howland, Maine, USA. Our analysis differs from previous work in several ways. First, we focus on gross ecosystem exchange, a component flux that reflects a distinct set of ecosystem processes, as opposed to ecosystem respiration or net flux. Second, we account for uncertainty due to model parameterization as well as the uncertainty associated with the random nature of the flux observations (earlier studies have focused on one or the other). We recognize that uncertainty in ecosystem flux arises from sources other than the statistical modeling, including different choices of friction velocity thresholds for filtering, variability in tower footprint, and changes in the system (e.g., insect infestations and large tree blowdowns). In this analysis, we estimate patterns of uncertainty that are related only to statistical inference. Third, our method does not require the generation of additional gaps and therefore allows us to estimate statistical uncertainty at any timescale, from half hour to multiyear. Last, we perform a sensitivity analysis of the uncertainty of half-hourly to annual GEE estimates using different modeling approaches and different statistical assumptions, in an attempt to understand the effect of model choice on the estimates. We examine and quantify the 90% prediction intervals for one site, but our discussion of the general implications of our results for the role of data and models in understanding ecosystem processes is not site specific.