High-elevation forests represent a large fraction of potential carbon uptake in North America, but this uptake is not well constrained by observations. Additionally, forests in the Rocky Mountains have recently been severely damaged by drought, fire, and insect outbreaks, which have been quantified at local scales but not assessed in terms of carbon uptake at regional scales. The Airborne Carbon in the Mountains Experiment was carried out in 2007 partly to assess carbon uptake in western U.S. mountain ecosystems. The magnitude and seasonal change of carbon uptake were quantified by (1) paired upwind-downwind airborne CO2 observations applied in a boundary layer budget, (2) a spatially explicit ecosystem model constrained using remote sensing and flux tower observations, and (3) a downscaled global tracer transport inversion. Top-down approaches had mean carbon uptake equivalent to flux tower observations at a subalpine forest, while the ecosystem model showed less. The techniques disagreed on temporal evolution. Regional carbon uptake was greatest in the early summer immediately following snowmelt and tended to lessen as the region experienced dry summer conditions. This reduction was more pronounced in the airborne budget and inversion than in flux tower or upscaling, possibly related to lower snow water availability in forests sampled by the aircraft, which were lower in elevation than the tower site. Changes in vegetative greenness associated with insect outbreaks were detected using satellite reflectance observations, but impacts on regional carbon cycling were unclear, highlighting the need to better quantify this emerging disturbance effect on montane forest carbon cycling.
 To improve prediction of how these ecosystems will respond to environmental change, there is a need to increase the observational coverage in mountain regions across the Western U.S. and identify areas of high and low carbon exchange. However, there are many logistical challenges in making long-term observations of surface-atmosphere exchange in complex mountainous terrain. Short-term field campaigns of airborne atmospheric CO2 budgets in complex terrain provide a complementary approach to longer-term local observations to better extrapolate the magnitude (absolute value of net ecosystem exchange) and variability of these fluxes across space. Here, we demonstrate the potential of airborne regional budgets in helping address this issue across the central Rocky Mountains in the United States.
 The carbon balance in the Rocky Mountains is primarily modulated by the responses of the dominant evergreen montane and subalpine forests to environmental change. Ecosystem heterogeneity at the landscape scale is associated with significant variability in topography, climate, and soil type. While widespread sampling of point-based observations (e.g., eddy covariance) has been useful in many ecosystems, deployment of these observations in the complex terrain of the Rocky Mountains, while technically tractable [Yi et al., 2008], would be logistically infeasible and cost prohibitive.
 Consequently, there are very few observations of carbon exchange over mountain evergreen forests [e.g., Kominami et al., 2003]. The Niwot Ridge AmeriFlux research site (NWT) located in a Rocky Mountain subalpine forest (Figure 1) is one of a few sites, and now has over 10 years of near-continuous carbon and water exchange observations [Monson et al., 2002, 2006] that provide a rich source of information on the temporal variability and climatic controls of net ecosystem exchange (NEE). At this site, the largest carbon uptake occurs in late spring and early summer, followed by significant reductions in midsummer, and a secondary increase of carbon uptake in late summer coincident with the onset of frequent convective storms associated with the North American monsoon [Monson et al., 2002; Sacks et al., 2006]. This secondary peak is typically weaker than the initial springtime peak. Isotopic evidence suggests that trees in the Niwot Ridge forest exploit snowmelt as a water source well into the late summer [Hu et al., 2010] and while late summer rainstorms can allow some uptake, snowmelt drives more gross primary productivity than rain annually, leading to complex interactions between hydrology and carbon cycling.
 It is unclear to what extent the pattern of NEE observed at the Niwot Ridge forest site reflects the seasonal patterns and controls across the entire region. For example, Blanken et al.  showed that NEE at a nearby alpine tundra flux tower had a net uptake period half as long as that at the forest site. Cumulative growing season NEE at the tundra site was nearly an order of magnitude smaller. Consequently, we expect that tundra and alpine vegetated regions across the domain may behave quite differently than the subalpine forest flux tower. Still, the uptake of carbon in the region is likely to be dominated by subalpine forest, as other areas are primarily high-altitude grassland or shrubland with less carbon uptake during the growing season. Thus, we expect the first-order climatic controls on regional NEE to be more similar to those at the Niwot Ridge forest than to those at tundra or grassland sites.
 Superimposed on the climatic patterns, many of the conifer forests in the region are experiencing significant mortality due to mountain pine beetle attack [Raffa et al., 2008] (Figure 2), and its impact on regional carbon cycling is not well understood [Kurz et al., 2008]. Approaches to quantify the seasonal pattern at regional scales are needed to gain better traction on the effects of this outbreak on the carbon cycle. In many regions, top-down global tracer transport inversions and bottom-up satellite remote sensing based ecosystem models have been found to be powerful tools to estimate regional carbon budgets [e.g., Desai et al., 2010]. However, in complex terrain, significant uncertainty exists about the accuracy of these methods given the complexities of atmospheric transport for top-down models, and the large spatial variation in ecosystem structure, forest composition, and slope aspect affecting model accuracy.
 In and of themselves, airborne flux budgets do not typically provide sufficient temporal sampling or coverage to fully quantify regional carbon budgets. Rather, these airborne-derived boundary layer budget fluxes are more useful for evaluating the consistency of magnitudes and variability across other continuous observations of regional carbon fluxes, such as flux towers, top-down inversions, and bottom-up models [Dolman et al., 2006; Miglietta et al., 2007; Sellers et al., 1997], so as to better evaluate hypotheses on climatic controls of regional NEE.
 In this study, we integrated information about regional carbon exchange made during an intensive airborne field campaign over the central Rocky Mountains of Colorado and Wyoming, United States. Three approaches (Figure 3) were used to estimate daytime regional carbon uptake: an airborne boundary layer budget, a remote sensing calibrated ecosystem model, and a high-resolution atmospheric tracer transport inverse model. Regionally derived fluxes were also compared to direct eddy covariance based observations of NEE from the Niwot Ridge forest site. With these flux estimates, we asked the following: (1) Are estimates of regional carbon uptake magnitude consistent among the methods and how do they compare to direct eddy covariance-based fluxes measured at the Niwot Ridge AmeriFlux subalpine forest site? (2) What information about the seasonal pattern of growing season carbon uptake do the methods provide? (3) Can signals of environmental stress impacts (such as that from bark beetles) on regional carbon cycle be detected from analysis of regional flux methods?
2. Data and Methods
2.1. Airborne Carbon in the Mountains Experiment 2007 Field Campaign
 The Airborne Carbon in the Mountains Experiment 2007 (ACME07) occurred from April to August 2007 across a domain surrounding the central Rocky Mountains, spanning approximately 37.5°–42.5°N latitude and 105°–109°W longitude, though the core sampling for paired upwind-downwind flights occurred between June and August and from 39°N to 42°N latitude (Figure 1). This campaign was a follow-on to the ACME04 field campaign, which led to many advances in our understanding of tracer transport in terrain [Sun et al., 2010]. In ACME07, the NSF University of Wyoming King Air airplane was instrumented by the National Center for Atmospheric Research (NCAR) for high-accuracy observations of CO2, CO, and O2 along with the standard micrometeorological and radiation observations that are usually available on the King Air (see http://flights.uwyo.edu/ for more information). A total of 18 flights were flown on 11 days and air masses were sampled from near the surface to approximately 7000 m above ground level across the domain.
 Airborne carbon dioxide concentrations were derived from a modified infrared gas analyzer developed at NCAR and based on the LI-COR Biosciences Inc. LI-6262 infrared gas analyzer. Air was pumped into the analyzer from a port located near the nose of the aircraft. Routine preflight, onboard, and postflight calibrations were performed against known standard gases at multiple altitudes and pressure levels. Real-time calibration data were then used to convert absorption sample voltages into CO2 mole fraction, given cell pressure and temperature, and flight meteorological data. Accuracy was assessed to be 0.5 ppm when compared against in-flight surveillance CO2 standards. This value was above the expected 0.2 ppm target owing to unexpected sensitivities of the sampling system to pressure and inertial motion. An additional CO2 sensor within the oxygen instrument was available for the last three analyzed flight days. These data had higher accuracy and showed patterns similar to those observed by the previously described instrument. Further screening for data spikes was also performed prior to the analysis done here. Uncertainty estimates were propagated into the flux calculations.
 Morning and afternoon paired flights were successfully performed on seven of the flight days (Table 1), which are used here for regional flux analysis (section 2.2). On paired flight days, we used an ensemble of forecast meteorology wind fields combined with computation of ensemble (Lagrangian) particle dispersion back trajectories for an afternoon particle release from five downwind receptor points (Figure 1b). These ensemble trajectories were derived from two particle models, STILT [Gerbig et al., 2006; Lin et al., 2003] and Flexpart-WRF [Fast and Easter, 2006; Stohl et al., 1998], coupled to a set of meteorological transport fields derived from the NOAA NCEP forecast models and three versions of the NCAR WRF models run at 3 km, 12 km, and 22 km resolution, respectively. The sampling approach was then targeted to morning sampling of upwind locations based on particle model output and afternoon sampling of the receptor locations. The approach followed that of Lin et al. , but to minimize sampling uncertainty in complex terrain due to spatial variability in trace gas concentration [De Wekker et al., 2009; Sun et al., 2010], multiple parallel upwind-downwind pairs were flown and flux budgets among those averaged, as described in more detail in section 2.2.
Table 1. Upwind and Downwind Paired ACME07 Flights Analyzed in This Study, Including Time of Flight Transit (Takeoff to Touchdown) and Approximate Latitude and Longitude of Sampling Box as Determined by Northern and Southern Extent of Airborne Flight Tracks and Lagrangian Air Parcel Trajectories (a)
(a) Also shown is thermodynamic maximum PBL depth (zmax).
Flight dates: 1 June 2007; 15 June 2007; 21 June 2007; 18 July 2007; 1 August 2007; 3 August 2007; 9 August 2007
2.2. Boundary Layer Budget Regional Fluxes
 Boundary layer budget fluxes were derived from seven sets of multiple paired morning (upwind) and afternoon (downwind) flight tracks (Figure 3 and Table 1). The essence of this approach is to follow a column of air as it moves across a region and measure how its CO2 content responds to changes in vertical fluxes at the top and surface. Dates with strong model consistency of particle trajectories and no expectation of precipitation were chosen to maximize the signal-to-noise ratio, allow visual avoidance of terrain, and minimize potential errors arising from deep moist convection. On each day, morning flights sampled upwind particles by following a descending path parallel to the mean wind. In weak-shear cases, spiral or crosswind profiles were flown. Morning flight times were scheduled for late morning to allow the breakup of valley cold air pools and subsequent mixing in the atmosphere aloft [De Wekker et al., 2009; Sun et al., 2010; Stewart et al., 2002]. Flights sampled air masses from nearly 7000 m to within 50 m of the surface, and even lower for missed approaches at airports, and mostly in and just above the boundary layer. Forward trajectory models were used prior to the afternoon flights to adjust receptor points to updated particle locations in the time between morning and afternoon flights. Afternoon flights then sampled CO2 from spiral descents over rural areas and low approaches at airports at the previously identified or updated receptor points (Figure 1).
 Once CO2 concentrations were calibrated for each flight, we determined which samples belonged to which air masses. This identification was challenging, especially when trajectories overlapped or models diverged. CO2 profiles from morning air masses were constructed based on the afternoon receptor target and the predicted ensemble spread of backward particle trajectories (identifying the 4-D source region) as derived from mesoscale forecast models NCAR-WRF 3 km and 22 km, NCEP WRF 12 km, NCEP RUC, GFS, and MM5 [Ahue, 2010]. Such an approach allows for a quasi-3-D sampling of the air mass, which makes the method much more reliable than using simple vertical profiles. In many profiles, CO2 measurements were missing in the lowest 50–150 m above ground. We assumed constant CO2 in this layer, which is likely an underestimate of total column CO2 for morning. However, since the near-surface layer encompasses only 10% or less of the height of the air column considered in the budget calculation, the assumed CO2 profile in this layer had a negligible effect (<1%) on the flux calculation. Once each vertical profile in the morning was identified with one of the receptors based on the ensemble particle trajectory, we computed column average CO2 for all representative profiles. Afternoon vertical profiles were directly taken from ascending or descending flight patterns over each receptor.
 In the budget equation presented below (equation (1)), the primary variable needed is the maximum planetary boundary layer depth during the afternoon sampling (zmax). By averaging both morning and afternoon CO2 profiles up to zmax, the net effect of free troposphere entrainment of CO2-rich air was incorporated into budget calculations without needing an explicit time-resolved entrainment parameterization or boundary layer depth [Stephens et al., 2000]. Ahue  compared three approaches to estimate zmax and found root mean square error (RMSE) uncertainty to be 15%, and seasonal variability among the three was strongly coherent. Here, we relied on the observational thermodynamic zmax (Table 1), since it is the most data-constrained approach.
 Maximum boundary layer depth using the thermodynamic approach involves plotting all afternoon airborne profiles of virtual potential temperature (θv) and moisture (q) against height and estimating boundary layer depth by visual inspection of the jump in these quantities between the well-mixed, cooler, and moister boundary layer and the stably stratified, warmer, and drier free troposphere. Variables θv and q were derived from observations of air temperature from an airborne mounted platinum resistance thermocouple (Model 102, Rosemount Analytical Inc.), dewpoint temperature from a chilled mirror dewpoint sensor (Model 137C3, Cambridge Systems Inc.), and atmospheric static pressure from a digital solid state absolute pressure transducer (Model 1501, Rosemount Analytical Inc.). Boundary layer depths were estimated for each source-receptor pair on each of the seven flight dates.
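The derivation of θv and q from temperature, dewpoint, and static pressure can be illustrated with a minimal Python sketch. It assumes the widely used Bolton (1980) approximation for saturation vapor pressure and the common approximation θv ≈ θ(1 + 0.61r); the study does not state which formulas were applied. The function names are hypothetical, and the `sharpest_jump_height` helper is only a crude automated stand-in for the visual inspection described above, not the authors' procedure.

```python
import math

def vapor_pressure_hpa(dewpoint_c):
    """Vapor pressure (hPa) from dewpoint (deg C), Bolton (1980) approximation."""
    return 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))

def theta_v_and_q(temp_c, dewpoint_c, pressure_hpa):
    """Return virtual potential temperature (K) and specific humidity (kg/kg)."""
    e = vapor_pressure_hpa(dewpoint_c)
    r = 0.622 * e / (pressure_hpa - e)            # water vapor mixing ratio
    q = r / (1.0 + r)                             # specific humidity
    theta = (temp_c + 273.15) * (1000.0 / pressure_hpa) ** 0.2854
    theta_v = theta * (1.0 + 0.61 * r)            # approximate virtual correction
    return theta_v, q

def sharpest_jump_height(heights_m, theta_v):
    """Midpoint height of the layer with the largest theta_v increase per meter.

    Assumption: a simple gradient search, used here only to illustrate the
    idea of locating the jump between boundary layer and free troposphere.
    """
    best_grad, best_z = float("-inf"), None
    for k in range(len(heights_m) - 1):
        grad = (theta_v[k + 1] - theta_v[k]) / (heights_m[k + 1] - heights_m[k])
        if grad > best_grad:
            best_grad = grad
            best_z = 0.5 * (heights_m[k] + heights_m[k + 1])
    return best_z

thv, q = theta_v_and_q(20.0, 10.0, 1000.0)
z_jump = sharpest_jump_height([0, 500, 1000, 1500, 2000],
                              [300.0, 300.0, 300.1, 303.0, 304.0])
```

In practice a moisture profile would be inspected alongside θv, since the text relies on the jump in both quantities.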
 Given the CO2 pressure-corrected column density of each source (Cmorn) and receptor (Caft) pair, the mean particle transit time as derived from difference in aircraft sample times (ttransit), and the maximum boundary layer depth within that domain (zmax), the surface-atmosphere net ecosystem CO2 exchange (NEE) was computed as
The core idea behind the multiple paired profiles is to minimize the uncertainty in NEE estimation from any one pair, given the spatial variability of transport in mountainous terrain. For example, given the variation in surface elevation and strong vertical wind shear, horizontal divergence of CO2 as it advects across terrain could be large, affecting assumptions used for the parameters of equation (1). But multiple profiles sampling an extensive spatial air volume tend to average out this error. Most cases in our study had three to five pairs.
 There is also an assumption that each source-receptor pair is sampling a roughly similar footprint, and thus each can be considered a sample of a population mean regional flux. As such, the average represents an estimate of the regional flux, and the standard error across the source-receptor pairs serves as an estimate of the uncertainty of this mean flux, though it cannot be separated from true spatial variability in flux across pairs. This multiple-profile standard error was added to the 0.5 ppm accuracy error in CO2 concentrations and the RMSE contribution in uncertainties in boundary layer (BL) height, in quadrature since we expect these errors to be independent, to derive the total flux uncertainty. Uncertainty in the paired-profile sampling was the dominant term, accounting for ∼60% of the total error, much larger than boundary layer uncertainty (∼15%) or observational uncertainty (∼25%).
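The budget and its error propagation can be sketched in Python under stated assumptions. The sketch assumes that Cmorn and Caft are column-mean CO2 molar densities averaged from the surface to zmax, so that the flux for one pair is zmax (Caft − Cmorn) / ttransit; this is a form consistent with the definitions given for equation (1), not a confirmed transcription of it. The function names, the example values, and the way the concentration accuracy is expressed in flux units (`conc_err_flux`) are hypothetical.

```python
import math
import statistics

def budget_nee(c_morn, c_aft, z_max, t_transit):
    """NEE (umol m-2 s-1, negative = uptake) for one source-receptor pair.

    c_morn, c_aft: column-mean CO2 molar density (umol m-3) up to z_max
    (assumed interpretation); z_max in m; t_transit in s.
    """
    return z_max * (c_aft - c_morn) / t_transit

def regional_flux(pairs, conc_err_flux=0.0, bl_frac_err=0.15):
    """Mean flux across pairs and total uncertainty summed in quadrature."""
    fluxes = [budget_nee(*pair) for pair in pairs]
    mean_flux = statistics.fmean(fluxes)
    # Standard error across the source-receptor pairs
    se_pairs = statistics.stdev(fluxes) / math.sqrt(len(fluxes))
    # Boundary layer depth uncertainty, applied as a fraction of the mean flux
    bl_err = abs(mean_flux) * bl_frac_err
    total_err = math.sqrt(se_pairs ** 2 + conc_err_flux ** 2 + bl_err ** 2)
    return mean_flux, total_err

# Three illustrative pairs: (c_morn, c_aft, z_max, t_transit)
pairs = [
    (13000.0, 12990.0, 2000.0, 10000.0),
    (13000.0, 12980.0, 2000.0, 10000.0),
    (13000.0, 12985.0, 2000.0, 10000.0),
]
mean_flux, total_err = regional_flux(pairs, conc_err_flux=0.3)
```

Summing in quadrature is appropriate only if the three error sources are independent, as the text assumes.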
 To optimally compare airborne regional fluxes, whose footprints change with location of upwind sources, to other models with fixed footprints, simple polygons were defined by the spatial extent of the primary (95%) particle clouds across all source-receptor pairs (Table 1). These polygons were then used for estimating mean regional fluxes from the ecosystem and inverse model as described below.
2.3. Regional Ecosystem Model
 Spatially explicit estimates of NEE over the domain were derived from the Simplified Photosynthesis and EvapoTranspiration (SiPNET, hereafter SiP) model, operated at 224 gridded locations (32 × 32 km each) throughout the entire domain. SiP has been described in detail in previous manuscripts [Sacks et al., 2006, 2007; Moore et al., 2008; Zobitz et al., 2008]. Here, we briefly review the most relevant details.
 SiP is based on the well-established Photosynthesis-EvapoTranspiration (PnET) model [Aber and Federer, 1992; Aber et al., 1995] but modified to allow ecosystem level measurements to constrain model parameters in a data assimilation framework [Braswell et al., 2005; Sacks et al., 2006, 2007] and simplified to reduce the number of parameters which need to be estimated. SiP contains leaf, wood and root vegetation carbon pools, soil carbon pools of varying recalcitrance and a microbial pool (see Braswell et al.  and later modifications by Zobitz et al. ). Leaf pools can be either deciduous or evergreen with a prescribed phenology and wood refers to the combined pool of boles, branches, coarse roots, and fine roots.
 The model as implemented in this experiment was driven by six climate variables: mean air temperature, soil temperature, relative humidity, photosynthetically active radiation (PAR), wind speed, and precipitation. The values for these variables were derived from reanalysis surface meteorology extracted from the National Centers for Environmental Prediction North American Regional Reanalysis (NARR) [Mesinger et al., 2006]. NARR surface meteorology was derived from weather forecast model analysis run in data assimilation mode against surface, profile, and satellite observations of atmospheric state, providing a set of meteorological state variables optimally consistent with model structure and spatially disjoint observations. NARR 3-hourly, 32 km resolution surface meteorological fields were averaged by day for use in the model, which was run at daily time resolution, to minimize errors seen in the diurnal cycle of surface meteorology in terrain from reanalyses.
 Given the ecosystem heterogeneity in the region, SiP was run for three vegetation land types and aggregated within each grid cell based on maps of land cover. Land cover across the domain was prescribed based on the U.S. Geological Survey (USGS) satellite-derived 30 m land cover data and specific cover classes were aggregated to distinguish evergreen forest, deciduous forest and grassland land cover types, which comprise the primary land covers in the region. This scale of data was required to avoid biases that can be introduced into regional carbon budgets by using coarser resolution (circa 1 km) land cover information [Quaife et al., 2008].
 SiP was parameterized for evergreen forest and deciduous forest, while grassland and shrubland were assumed to have a much smaller NEE and were therefore set to the ensemble mean daytime flux from a nearby grassland flux tower in Fort Peck, Montana [Gilmanov et al., 2005]. Default model parameters based on the work of Braswell et al.  and global literature estimates were used for the deciduous forest cover, given the relatively smaller amount (7%) of deciduous cover compared to evergreen (24%). The remaining cover types consisted of water, ice, and barren and were all assumed to have zero NEE.
 For the evergreen forest land cover type, 9 years of carbon and water exchange estimated using the eddy covariance technique at the Niwot Ridge AmeriFlux site (see section 2.5) were used to constrain free parameters of the model applying a Monte Carlo Markov chain approach with a Metropolis-simulated annealing algorithm [cf. Braswell et al., 2005; Metropolis and Ulam, 1949]. After a spin-up period of 150,000 iterations the model was run forward 350,000 times. During each iteration a random change was made in one parameter and resultant NEE and evapotranspiration (ET) estimates were compared to measured NEE and ET. Parameter values that decreased the model data error (evaluated by calculating the log likelihood) were retained while others were ignored. Some “poor” parameter sets were occasionally retained during random walks to increase the chances of finding the true global maximum likelihood for model simulation of NEE and ET. This sequential process resulted in the parameters which best fit the available data [Braswell et al., 2005; Sacks et al., 2006].
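The accept/reject logic described above can be sketched on a toy problem. This is a minimal Python illustration, not the SiP assimilation code: it fits a two-parameter straight line instead of an ecosystem model, assumes a Gaussian log likelihood, and uses a hypothetical linear cooling schedule during spin-up. All names, step sizes, and iteration counts are illustrative assumptions.

```python
import math
import random

def log_likelihood(params, xs, ys, sigma=1.0):
    """Gaussian log likelihood (up to a constant) of a toy model y = a*x + b."""
    a, b = params
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return -0.5 * sse / sigma ** 2

def metropolis(xs, ys, start, n_spinup=5000, n_sample=15000,
               step=0.1, t0=10.0, seed=0):
    """Metropolis random walk with a cooling temperature during spin-up."""
    rng = random.Random(seed)
    current = list(start)
    cur_ll = log_likelihood(current, xs, ys)
    best, best_ll = list(current), cur_ll
    for i in range(n_spinup + n_sample):
        # Hypothetical schedule: cool linearly from t0 to 1 over the spin-up
        temp = t0 + (1.0 - t0) * i / n_spinup if i < n_spinup else 1.0
        # Perturb one randomly chosen parameter per iteration
        proposal = list(current)
        k = rng.randrange(len(proposal))
        proposal[k] += rng.gauss(0.0, step)
        prop_ll = log_likelihood(proposal, xs, ys)
        delta = prop_ll - cur_ll
        # Always keep improvements; occasionally keep a worse parameter set
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_ll = proposal, prop_ll
            if cur_ll > best_ll:
                best, best_ll = list(current), cur_ll
    return best, best_ll

xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
best, best_ll = metropolis(xs, ys, start=[0.0, 0.0])
```

Occasionally accepting a worse parameter set is what lets the walk leave a local maximum of the likelihood surface; tracking the best set separately means those excursions do not cost the final estimate.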
 Spatial variation in leaf area index (LAI) was prescribed using MODIS remotely sensed 1 km2 LAI observations [Myneni et al., 2002] and these were aggregated to provide a single number per cover type for each location. The MODIS QA information was used to exclude any suboptimal retrievals from the final value. For forested sites, aboveground biomass was extrapolated using the relationship between LAI and biomass defined from a large set of intensively sampled plots [Bradford et al., 2008]. The model was then run at each of the 224 locations for each land cover type and fluxes were weighted based on the fractional area of the land cover type in each location.
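The per-location weighting can be illustrated with a short Python sketch. The cover fractions and flux values below are made-up numbers for demonstration, and `cell_nee` is a hypothetical name. Cover types without a modeled flux (water, ice, barren) default to zero NEE, following the assumption stated in the text.

```python
def cell_nee(fractions, nee_by_cover):
    """Area-weighted NEE for one grid cell.

    fractions: cover type -> fractional area (should sum to 1).
    nee_by_cover: cover type -> NEE; missing types are treated as zero flux.
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("cover fractions must sum to 1")
    return sum(frac * nee_by_cover.get(cover, 0.0)
               for cover, frac in fractions.items())

# Illustrative values only (NEE in umol m-2 s-1, negative = uptake)
fractions = {"evergreen": 0.24, "deciduous": 0.07,
             "grassland": 0.50, "barren": 0.19}
nee_by_cover = {"evergreen": -5.0, "deciduous": -3.0, "grassland": -1.0}
weighted = cell_nee(fractions, nee_by_cover)
```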
 The spatial and temporal scales for the spatial modeling were chosen based on the availability and resolution of climate driver data. The spatial resolution of the NARR driver data set was chosen to be fine enough to account for topographic variation while minimizing artifacts introduced to precipitation and temperature estimates by statistical downscaling. The SiP model can run at subdaily time steps. Indeed, the current model structure exploited the day-night contrast to extract information from flux observations in assimilation mode [Sacks et al., 2006]. Even finer temporal resolution does not result in more information retrieval from eddy covariance data. A more complex biophysical subroutine would allow intradaily variability to be exploited; however, this would increase the number of parameters to estimate, and without additional complementary data streams there would be substantial uncertainty in the retrieved parameters controlling diurnal variability. Forward runs of SiP using the same model parameters at daily and half-daily time steps were equivalent, and therefore we opted to use daily meteorology and instead resample SiP model output for comparison at shorter time scales.
 To compare these daily average gridded NEE against the boundary layer budget fluxes, temporal resampling and spatial averaging were performed on modeled daily NEE. Twenty-four hour average fluxes were converted to daytime (1000–1400 LT) averages using a 21 day moving window linear regression of flux tower observed 24 h to daytime average NEE, which showed a strong relationship in the growing season (r2 = 0.65–0.82 across the moving window). This choice was made instead of running the model at finer time resolution to reduce uncertainty of forcing data on the model.
 For each case analyzed, gridded estimates of mean regional daytime flux were subset based on the polygon defined by the approximate north-south extent of western upwind and eastern downwind airborne sampling and particle trajectories (Table 1). Mean and spatial standard deviation of modeled NEE were then derived within this polygon. This averaging technique allowed the regional flux estimates to be compared while minimizing artifacts of spatial and temporal sampling mismatch that would occur by taking model averages over the entire domain.
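The moving-window conversion can be sketched as follows. This Python illustration assumes an ordinary least squares fit within a window centered on the target day (10 days on either side, 21 days in total); the text does not specify the window alignment or fitting details, so those choices and all names are assumptions.

```python
def ols(xs, ys):
    """Ordinary least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def daytime_from_daily(day, model_daily, tower_daily, tower_daytime,
                       half_width=10):
    """Convert a modeled 24 h mean NEE to a daytime mean for one day.

    The regression of tower daytime NEE on tower 24 h NEE is refit in a
    window of 2 * half_width + 1 days (truncated at the ends of the record).
    """
    lo = max(0, day - half_width)
    hi = min(len(tower_daily), day + half_width + 1)
    slope, intercept = ols(tower_daily[lo:hi], tower_daytime[lo:hi])
    return slope * model_daily + intercept

# Synthetic tower record in which daytime NEE = 3 * (24 h NEE) - 2 exactly
tower_daily = [-0.5 * (i % 7) for i in range(60)]
tower_daytime = [3.0 * v - 2.0 for v in tower_daily]
daytime = daytime_from_daily(30, -1.0, tower_daily, tower_daytime)
```

Refitting per window lets the daily-to-daytime relationship drift through the season, which a single regression over the whole growing season would not capture.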
2.4. Inverse Model Regional Fluxes
 Inverse models constrain prior estimates of surface-atmosphere exchange against trace gas observations and transport information. We analyzed high-resolution (1° × 1°) regional fluxes from the NOAA ESRL CarbonTracker (CT) model, release 2009, a nested-grid global inverse model for CO2 flux [Peters et al., 2007]. Atmospheric CO2 observations from the NOAA ESRL Cooperative Air Sampling network, including those monitored at two mountaintop locations in the Rocky Mountains RACCOON network, along with modeled transport fields and an ecosystem model were used in data assimilation mode to optimize ecosystem flux parameters.
 In this model, fossil fuel and fire CO2 fluxes were prescribed from existing databases (Global Fire Emissions Database v. 2). Ocean and land fluxes were then adjusted to match the flask and in situ atmospheric CO2 observations. Terrestrial ecosystems were divided into 25 ecoregions based on continent and land cover, while the oceans were divided into 11 basins. The optimization approach adjusted weekly linear scaling factors for each basin or ecoregion using an Ensemble Kalman Filter approach [Peters et al., 2005]. Prior land fluxes were prescribed from the Carnegie Ames Stanford Approach (CASA) ecosystem model [Potter et al., 2007]. Weather model and satellite vegetation greenness information drove the biosphere fluxes of CASA, while the linear scaling factor adjusted the flux scaling for each ecoregion based on the atmospheric constraint.
 While CT was designed to estimate fluxes at the continental scale, variability in smaller regional fluxes will still be reflected in the information content of CO2 concentration measurements, especially from the mountaintop locations. We extracted the 3-hourly surface biosphere fluxes for 2007 within the ACME domain. These fluxes were interpolated to hourly time steps and further resampled to a 0.1° × 0.1° resolution by bilinear interpolation. For each case analyzed, mean regional daytime flux from CT was extracted based on the time and space polygon defined by the approximate extent of upwind and downwind airborne sampled particle clouds, as described for the ecosystem model (section 2.3).
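Bilinear resampling of a coarse flux grid can be sketched in plain Python. The grid layout (values stored at regularly spaced nodes, starting at `lat0`, `lon0`) and the function name are assumptions made for illustration; the actual CarbonTracker grid conventions are not described in the text.

```python
import math

def bilinear(grid, lat0, lon0, step, lat, lon):
    """Bilinearly interpolate grid[i][j], defined at (lat0 + i*step, lon0 + j*step)."""
    fi = (lat - lat0) / step
    fj = (lon - lon0) / step
    # Index of the lower-left node, clamped so that i+1 and j+1 stay in range
    i = min(max(int(math.floor(fi)), 0), len(grid) - 2)
    j = min(max(int(math.floor(fj)), 0), len(grid[0]) - 2)
    ti, tj = fi - i, fj - j
    low = grid[i][j] * (1 - tj) + grid[i][j + 1] * tj
    high = grid[i + 1][j] * (1 - tj) + grid[i + 1][j + 1] * tj
    return low * (1 - ti) + high * ti

# Toy 1 degree grid with four nodes; sample it on a 0.1 degree lattice
grid = [[0.0, 1.0],
        [2.0, 3.0]]
center = bilinear(grid, 40.0, -107.0, 1.0, 40.5, -106.5)
fine = [[bilinear(grid, 40.0, -107.0, 1.0, 40.0 + 0.1 * a, -107.0 + 0.1 * b)
         for b in range(11)] for a in range(11)]
```

Interpolation of this kind smooths the coarse field but adds no information below the native 1° resolution, so a regional mean over the sampling polygon remains governed by the original grid cells.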
2.5. Eddy Covariance at Niwot Ridge Forest
 Eddy covariance is a well-established technique for long-term continuous observations of surface-atmosphere exchange [Baldocchi, 2008]. The Niwot Ridge (NWT) AmeriFlux eddy covariance tower (39.907°N, 105.883°W) is at 3050 m elevation, 8 km east of the continental divide in a subalpine mixed conifer forest (spruce, fir, lodgepole pine) [Monson et al., 2002]. The forest was extensively logged in the early 20th century and has been in recovery since then. Maximum leaf area index is 4.2 m2 m−2 and mean canopy height is 11.4 m.
 Long-term continuous observations of fluxes of CO2, H2O, energy, and momentum have been made since late 1998 (version 2011.04.20 of these flux data was used for our study). Given the complexities of flux measurement in terrain, significant research has been undertaken to quantify terrain-induced flow impacts on flux uncertainty [Yi et al., 2008], and rigorous data screening and gap-filling processes are used to estimate NEE at this site. In this study, half-hourly NEE from 1999 to 2010 were analyzed over the warm season (April–October). For comparison to regional fluxes, these fluxes were further temporally averaged to daytime periods based on mean times of upwind and downwind flight sampling (Table 1). Additionally, tower observations of above-canopy air temperature and incoming precipitation were compared against NARR meteorology across the region.
3.1. Seasonal Climate
 Domain averaged (Figure 1) climate (Figure 4) for the 2007 field campaign reveals patterns typical of the southwestern U.S. Warm season (April–October, daily T > 0°C) temperature and precipitation showed four distinct periods. The first was in spring (April–mid-June) with regular cycles of frontal precipitation and steadily warming temperatures. At higher elevations, much of this precipitation continued to be snow, though over most of the domain, the snow was melting in this time period, providing a source of moisture for vegetation [Hu et al., 2010]. This regime was followed by a period of relatively high temperatures and limited precipitation from mid-June to late July, leading to midsummer drought conditions and vegetative productivity decline, though it should be noted that the Niwot Ridge area experienced a wetter than average July, followed by a drier than average August. Over the region, however, a respite from this drought occurred in the next period from late July toward late August, where warm temperatures continued, but regular convective precipitation occurred, in association with moisture transported along the eastern range of the Rocky Mountains, bringing moist air into the region from the Gulf of Mexico. Finally, cooling temperatures and the return of more synoptically forced precipitation brought the active growing season to a close. The seven flights analyzed here primarily covered the first three periods, with the first flight during cool, moist conditions, the next three during warm, dry phases, and the final three during the return of convective precipitation. For logistical reasons, all flights were flown during days with little or no precipitation.
 Also shown in Figure 4 is the 1σ spatial standard deviation of temperature as derived from NARR reanalysis. The subalpine Niwot Ridge tower, which sits above the domain-averaged elevation, was significantly cooler (April–October mean temperature of 8.1°C) and wetter (429 mm total precipitation) than the domain mean (12.9° ± 3.5°C and 284 ± 75 mm, respectively). Despite these offsets, seasonal variation in climate at the tower and across the region was remarkably similar for warm season daily mean temperature (r2 = 0.89) and cumulative precipitation (r2 = 0.97).
 Elevation imparted a strong signature of temperature variation; the maximum-minimum difference in temperature across space can reach nearly 10°C for daily averages in summer. Spatial variability across the domain generally followed a pattern of drier and warmer conditions to the west (not shown). The largest spatial variation in temperature was found during the summer drought period, and consequently, we expected to find large spatial variation in carbon fluxes during this time.
3.2. Carbon Uptake at Niwot Ridge Forest
 The imprint of seasonal climate variation was evident in the Niwot Ridge forest flux tower NEE time series (Figure 5), shown for the same time period as the climate data. As mean temperature increased above 0°C in April, ensuing warm conditions in May and early June led to strong negative NEE (carbon uptake) at the tower. The first flight campaign occurred near the central part of this peak uptake period.
 Immediately following the May–June period, carbon uptake rapidly declined, though the site continued to take up carbon both during the day and over the integrated day-night cycle (daytime photosynthesis, A, exceeded daily respiration, R) until late July, when 24 h mean fluxes approached zero (A = R). Three flight campaigns spanned this time period. The reductions in daytime-only and 24 h average carbon uptake started out similar in magnitude, but daytime uptake then remained steady while 24 h average uptake approached zero by the end of July, implying an increase in both ecosystem respiration and photosynthesis during this period, though with a slightly larger impact on respiration [Moore et al., 2008].
 The long-term mean steadily declined from peak uptake to drought onset. In contrast, 2007 featured a short period of enhanced carbon uptake in late June–early July, followed by the regular procession of declining uptake. This pattern likely reflected the wetter than average conditions experienced during July, which followed drier than average conditions in June. Because of the stormy weather, no flights were conducted during this period.
 The final three flights occurred at the onset of the monsoon flow in early August, leading to a secondary maximum in carbon uptake at Niwot Ridge, which began in August and continued through October, peaking in late September. In 2007, a short period of carbon uptake weakening occurred in late August, leading to a pattern that featured four minima in NEE (early June, early July, mid-August, late September).
 Total NEE from the April–November time period was −276 gC m−2, and 2007 annual NEE was −220 gC m−2. Seasonal patterns averaged from 1999 to 2010 (Figure 5, shaded line) showed a very similar pattern in both magnitude and variability. Long-term mean cumulative NEE for April–November was −267 gC m−2 and annually was −216 gC m−2. Advection corrections in terrain have shown these fluxes may be approximately 10% underestimates of true NEE [Yi et al., 2008].
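 The seasonal totals above are reported in gC m−2, whereas the case study fluxes in section 3.3 are given in μmol CO2 m−2 s−1, so a unit bridge can help when comparing the two. The following is a minimal Python sketch and not part of the original analysis; the function names and the 244 day length assumed for April–November (1 April to 30 November) are illustrative assumptions.

```python
# Illustrative unit bridge between seasonal NEE totals (gC m-2) and
# mean flux densities (umol CO2 m-2 s-1). All names are hypothetical.

CARBON_MOLAR_MASS_G = 12.011   # grams of carbon per mole of CO2
SECONDS_PER_DAY = 86400.0


def umol_per_s_to_gc_per_day(flux_umol_m2_s):
    """Convert umol CO2 m-2 s-1 to gC m-2 d-1 for a flux sustained over 24 h."""
    return flux_umol_m2_s * 1e-6 * CARBON_MOLAR_MASS_G * SECONDS_PER_DAY


def seasonal_total_to_mean_flux(total_gc_m2, n_days):
    """Convert a cumulative NEE (gC m-2) over n_days into the equivalent
    24 h mean flux in umol CO2 m-2 s-1."""
    gc_per_day = total_gc_m2 / n_days
    return gc_per_day / umol_per_s_to_gc_per_day(1.0)


if __name__ == "__main__":
    # Assumption: April-November is taken as 1 April to 30 November (244 days).
    mean_flux = seasonal_total_to_mean_flux(-276.0, 244)
    print(f"Equivalent 24 h mean NEE: {mean_flux:.2f} umol CO2 m-2 s-1")
```

 A 24 h seasonal mean obtained this way includes nighttime respiration, so it is expected to be much smaller in magnitude than daytime-only case study fluxes.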
3.3. Magnitude of Regional Carbon Exchange
 Magnitudes of NEE on individual case study days ranged from −0.3 ± 3.4 to −12.5 ± 4.5 μmol CO2 m−2 s−1 among the four methods (Figure 6), with the largest mean uptake of −7.5 μmol CO2 m−2 s−1 in the boundary layer budget (BLB), followed by −7.3 μmol CO2 m−2 s−1 at the flux tower (NWT), −7.0 μmol CO2 m−2 s−1 from the inverse model (CT), and −4.6 μmol CO2 m−2 s−1 in the ecosystem model (SiP). The differences in rank of mean magnitude across method were not consistent across time, with CT having the largest uptake in the early part of the campaign, but then the weakest uptake by the end of the observation period.
 Despite the uncertainties among the methods, mean NEE was broadly similar: case study average fluxes for BLB, CT, and NWT agreed to within 0.5 μmol CO2 m−2 s−1, while SiP modeled significantly less uptake. The general agreement in magnitude between the top-down methods and the flux tower is somewhat surprising, given the differences in methodologies and, for NWT, a different flux footprint from the others and a likely underestimate of approximately 1 μmol CO2 m−2 s−1 from unaccounted advective fluxes [Yi et al., 2008]. NWT sampled primarily a stand of subalpine spruce-fir and lodgepole pine forest, while the other three methods sampled on average 47% ± 7% forest and 44% ± 7% grassland and shrubland, in a domain where 12% ± 3% of the forests were impacted by mountain pine beetle. This distinction is best reflected in the difference between NWT and SiP, which was partly parameterized with NWT. Uncertainty in BLB was large, often similar to the absolute magnitude of NEE, primarily reflecting the large difference in flux for each source-receptor pair. However, the uncertainty was only slightly larger than the spatial standard deviation in CT. Spatial standard deviation in SiP was more muted.
3.4. Seasonal Pattern of Carbon Uptake
 Flux temporal variability exhibited a similar pattern of relative variability in the four methods, but timing of seasonal patterns, in terms of peak uptake, reduction, and secondary uptake varied. Consequently, direct correlations between the methods were poor and not significant, except for that between SiP and NWT (r = 0.39) and SiP and CT (r = −0.58). A positive correlation between SiP and NWT might be expected given how SiP was parameterized. The correlation between SiP and CT, however, was negative. These differences reflect not only methodological error but differences in scale and so represent both information and uncertainty.
 Though temporal correlation was poor, there was a consistent pattern of peak NEE during early summer and eventual weakening of uptake in midsummer (Figure 6). The four techniques resolved net daytime carbon uptake that varied substantially, from −4.3 to −12.5 μmol CO2 m−2 s−1, on the first campaign date, but all except SiP showed less uptake in the next flight, and then all showed a decrease or leveling off in the next two flights. The final three flights had greater variability in relative response among techniques, but generally showed greater uptake (secondary peak) after the midsummer decline. SiP had the weakest peak uptake. Postpeak response was stronger at NWT, but more moderate in SiP. BLB had an earlier strong period of reduced uptake (15 June) and then quickly returned to uptake similar to the first flight day, except for reduced uptake on 3 August, though methodological uncertainty was large for this technique. CT was an outlier in this response pattern and displayed a later and very strong postpeak decline in uptake, nearly shutting off daytime uptake by late July, though this was likely an artifact of the way the CT inversion worked, as discussed in more detail in section 4.3. Consequently, temporal standard deviation of mean NEE was largest in CT.
 NWT is a subalpine forest, while the other three methods represented a regional response dominated by (but not exclusively) subalpine forest, so it is useful to compare NWT (Figure 6b) to the other three methods together (Figures 6a, 6c, and 6d). The peak uptake using BLB was hard to detect visually. However, when NWT NEE was shifted by about 1 week earlier, the correlation to BLB increased significantly, suggesting that regionally, peak uptake occurred earlier than NWT. In contrast, SiP peak uptake occurred later than NWT and lasted longer, while CT had strong daytime uptake throughout June. Consequently, conclusions drawn from the different approaches about differences in timing of regional versus site level peak uptake are hard to reconcile. Similarly, it was unclear if a secondary peak uptake in NEE occurred in the region, with SiP showing a pattern similar to NWT, though with greater relative uptake in the second period, and BLB suggesting that reduced uptake period was limited to a short period in time.
 While the footprint-adjusted daytime-only NEE is useful for comparing BLB to the other regional fluxes, additional analysis on whole domain 24 h NEE from SiP and CT and 24 h NEE from NWT can be used to further assess regional flux responses that are not apparent in the seven daytime flights (Figure 7). These differences mainly reflected the role of nighttime respiration on net uptake. Here, it is clear that peak uptake at NWT is larger than all regional models including SiP and CT. While midday uptake across the region was similar to NWT, the weaker regional 24 h NEE implies greater total ecosystem respiration across the region, as expected given the colder local soil temperatures at NWT [Monson et al., 2003] and suggests possibly lower total NEE across the entire region than at NWT.
 In contrast to the daytime NEE seen in Figure 6, 24 h NEE patterns in Figure 7 reveal earlier and weaker peak uptake in SiP compared to NWT. While the reduction of peak uptake was stronger in NWT (in terms of slope of NEE decline per day), SiP also showed a decline in NEE to above zero. The absolute difference in peak NEE for NWT and SiP for the secondary peak of NEE is smaller than for the primary peak, suggesting that the secondary peak was a less common feature in lower-elevation forests, where higher temperatures continued to suppress uptake. In contrast, CT peak uptake occurred later than NWT, with a larger magnitude than SiP, followed by a near-shutdown of uptake for the rest of the season and no obvious secondary peak. Annual total NEE for 2007 was −46 gC m−2 yr−1 for SiP and −66 gC m−2 yr−1 for CT, both much smaller in total uptake than NWT (−220 gC m−2 yr−1), even when considering the advective contributions, which may change this total flux on average by 11% based on previous year estimates [Yi et al., 2008].
3.5. Spatial Patterns
 Given the uncertainty in the BLB technique over mountainous terrain, spatial gradients among individual upwind-receptor pairs for any one flight day cannot be compared (hence the need for multiple profile averaging). This limitation can be avoided for flights where consistent upwind-receptor pairs were flown with roughly similar footprints. In these cases, averaging across time for any one individual sample should allow for spatial comparison across a latitudinal gradient. However, when averaged in this way, no gradient related to the pattern of forest cover or mountain pine beetle associated mortality was seen across the five receptor pairs (Table 2), with the largest uptake for the WIL receptor, and weakest at GNB. This pattern did not correspond to spatial variations in either forest cover or bark beetle disturbance.
Table 2. Comparison of Receptor-Based Mean Flux (μmol m−2 s−1) for Cases 2–4 (16–19 June), Where All Five Receptors Were Sufficiently Sampled, Ordered by Latitude^a
Mean Flux (μmol m−2 s−1)
−7.2 ± 5.4
−1.2 ± 3.1
−13.2 ± 3.9
−5.2 ± 3.7
−1.6 ± 7.1
^a Error represents measurement error and temporal variability across the three cases summed in quadrature. Significant variability exists across the receptor mean fluxes, and no strong latitudinal gradient was found.
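 The Table 2 footnote states that measurement error and temporal variability were summed in quadrature. As a minimal sketch of that operation (the function name and the input values are hypothetical and are not taken from the campaign data), independent error terms can be combined in Python as follows:

```python
import math


def combine_in_quadrature(*errors):
    """Root-sum-square of independent error terms (all in the same units)."""
    return math.sqrt(sum(e * e for e in errors))


if __name__ == "__main__":
    # Hypothetical inputs in umol m-2 s-1, chosen only for illustration.
    measurement_error = 3.0
    temporal_variability = 4.0
    total = combine_in_quadrature(measurement_error, temporal_variability)
    print(f"Combined uncertainty: {total:.1f} umol m-2 s-1")
```

 This form assumes the error terms are uncorrelated; correlated terms would require covariance terms that are not treated here.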
 The gridded models (CT and SiP) can also be directly examined for spatial patterns (Figure 8). Relative difference in mean May (peak uptake) to mean July (drought) NEE for SiP and CT revealed that most of the domain had weaker uptake in July, though SiP showed a few areas in southern Wyoming and also northwest Colorado (near 40°N, 108°W) with increased uptake, coincident with areas of greater deciduous forest and overall lower vegetative cover, which may not experience as much plant drought response. CT showed a NW to SE spatial gradient of stronger to weaker drought response. This pattern was not apparent in SiP, which mostly showed a stronger drought NEE response in higher-elevation regions. Both models, however, had a strong region of reduced carbon uptake around 41°N, 107°W, coincident with a forested area north of Steamboat Springs, CO, that has recently been attacked by bark beetle, an attack that was picked up in the remotely sensed leaf area inputs used by both models. However, other areas with significant beetle mortality do not show a similar reduction, so coincidence is the most reasonable explanation at this point.
4.1. Timing and Magnitude of Uptake
 We initially set out to ask whether regional carbon uptake could be reliably estimated by comparison of multiple complementary techniques, and further how processes that relate NEE to climate at a single flux tower (NWT) compare to regional climatic processes and NEE. There is no expectation that a single flux tower is representative of regional carbon exchange in areas with variation in environmental conditions such as elevation and climate. Consequently, it is not surprising to find variation in estimates of temporal evolution of regional NEE among methods, some of which is due to methodological error (see section 4.3), but some also reflecting undersampled ecological variability. As such, the similarity that we observed in magnitude of uptake among the regional methods and between the tower and these methods was unexpected.
 In particular, we introduced a novel budget approach based on aircraft observations. While it is clear that refinements to the BLB approach can be made, it is also clear that with sufficient sampling and careful consideration of sampling footprints and multiple profiles, airborne boundary layer budgets provide useful information about regional carbon cycles and their sensitivity to climatic variability. The BLB method is best used not as a stand-alone method, but rather as a method to evaluate hypotheses developed from observations at smaller scales with more temporal frequency. Multiple parallel paired profiles are critical for success in deploying this approach over complex terrain, in contrast to results shown in flat terrain where flight tracks perpendicular to the flow can be used [e.g., Lin et al., 2003]. Assuming similar land cover within each upwind-receptor pair, each additional pair reduces error by averaging out the effects of divergence and convergence of air masses in complex terrain.
 Our results suggest that NWT daytime NEE was similar in magnitude to regional uptake estimated by top-down methods (CT and BLB) during the growing season, almost surprisingly so because the footprint of regional fluxes covers a mix of forest types (from pine-dominated to spruce-fir dominated forests), beetle-killed forest, and nonforest cover. Despite the observed similarities among methods with regard to determination of mean fluxes, when 24 h average flux tower observations were compared to the two models over the whole domain, this similarity in magnitude disappeared.
 In contrast, SiP showed less uptake than NWT but a temporal evolution more similar in timing to NWT than CT or BLB. Given that SiP's forested functional types were parameterized directly from this flux tower, the primary differences between SiP and NWT reflect almost exclusively the integration of seasonal differences in meteorology and remotely sensed LAI across the elevation gradient. For example, while daytime uptake in SiP was lower than at NWT, the 24 h uptake in SiP was much smaller than at NWT, suggesting that regional fluxes had greater ecosystem respiration (ER) than NWT, perhaps related to a larger proportion of nonforest and dead forest cover as well as the higher average soil temperatures across the domain compared to the flux tower site. However, given the relative similarity of daytime uptake and assuming that higher regional ER persisted through the day, it follows that there was greater GPP over the region. Perhaps this pattern of fluxes is related to the average elevation of the region being lower than the elevation of NWT, leading to warmer temperatures conducive to greater GPP and ER, at least before midsummer drought dominated. This explanation is further supported by the smaller difference of 24 h NEE later in the season between SiP and NWT, when moisture was a stronger limiting factor than temperature. Nonetheless, further analysis of these differences is required.
 While SiP essentially maps the effect of meteorology, LAI, and land cover variability based on the expected NEE variation observed at NWT, inconsistencies between SiP and the patterns in BLB and CT demonstrate the importance of unobserved ecological variability in driving variation in regional NEE. Of course, seasonal pattern analysis with the BLB method was severely limited by the small number of cases (7) and the large uncertainty that comes with the method. A larger number of flights would have been ideal to clearly identify seasonal patterns by airborne flux analysis, though weather factors will always limit the number of safe flying days in the mountains. Still, the comparison did suggest that high-elevation forests of this region behave quite differently from deciduous forests at lower elevations and in simpler terrain, and possibly even from the cooler, wetter NWT site, likely owing to the strong impact of reduced transpiration and carbon uptake in midsummer.
4.2. Controls on Regional Carbon Uptake
 The seasonal pattern of carbon exchange in high-elevation ecosystems of the western mountain regions of North America is less regular and punctuated by multiple maxima, compared to what is typically observed in temperate forests in eastern North America. Our primary goal with regional flux observation was to evaluate the regional applicability of hypotheses that explain this effect and have been confirmed at the local scale at NWT. The hypothesis for seasonality of NEE for the region based on work at Niwot Ridge suggests a strong link between snowmelt and the seasonal cycle of flux [Hu et al., 2010]. All models and the flux tower showed some reduction from peak uptake in June, though the estimated strength of this reduction differed considerably by method. Analysis of SnoTel snow water equivalent (SWE) data showed a positive correlation between elevation and SWE above 2400 m (r = 0.51, p < 0.001, data not shown), implying that lower moisture availability at lower elevation would reduce net productivity, while higher soil temperature should lead to greater net respiration.
 Lagged correlation analysis of BLB NEE suggested a similar pattern of regional peak uptake followed by reduction, but one that occurred earlier than at NWT, supporting the idea that the mean lower elevation of the region shifts the timing of uptake and drought response. However, if this result was solely an effect of climate, then the results from the SiP model, which was driven by local temperature and precipitation, would agree more with BLB. Instead, SiP daytime NEE had its highest correlation to NWT at zero lag, while for daily NEE using SiP, the uptake pattern occurred a week earlier. This result implies that ER is driving the difference between the two, given the inclusion of nighttime fluxes in the 24 h NEE and suggests that in the SiP model, ER processes were more affected by elevation-driven variations in climate than GPP.
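 The lagged correlation analysis referred to above can be illustrated with a short sketch. The Python code below is a generic lagged Pearson correlation on equally spaced, equal-length series; it is not the authors' implementation, and the function names and the synthetic series are assumptions made for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(reference, target, lag):
    """Correlate reference[t] with target[t + lag] over their overlap.
    A positive lag means the target pattern occurs later than the reference."""
    n = len(reference)
    if lag >= 0:
        a, b = reference[: n - lag], target[lag:]
    else:
        a, b = reference[-lag:], target[: n + lag]
    return pearson_r(a, b)


def best_lag(reference, target, max_lag):
    """Lag (in time steps) that maximizes the lagged correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: lagged_correlation(reference, target, k))


if __name__ == "__main__":
    # Synthetic example: the target repeats the reference two steps later.
    regional = [0, 1, 4, 9, 4, 1, 0, 0, 0, 0]
    tower = [0, 0, 0, 1, 4, 9, 4, 1, 0, 0]
    print("Best lag (steps):", best_lag(regional, tower, 3))
```

 With only seven flight days, any lag identified this way rests on very few overlapping points, which is one reason the uncertainty in timing remains large.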
 The secondary peak in NEE, apparently driven by cooler temperatures and increased precipitation in late summer, was not clearly delineated in the 7 BLB cases, which showed mean fluxes of similar magnitude to the early part of the season. A secondary peak was observed, albeit weakly, in the daily SiP NEE. However, this feature was not seen in CT, though this is likely related to how CT inversions work (see section 4.3). Consequently, whether a strong secondary peak of NEE occurred in the region similar to that detected in the NWT tower flux observations cannot be addressed by this analysis.
 Still, it does appear that the key controls over carbon sequestration in montane systems are fundamentally different from those in mesic, low-elevation ecosystems, requiring a specific effort to improve our ability to predict carbon fluxes in these high-elevation, semiarid forests of the western U.S. Climate observations and models suggest that warming temperatures will cause more precipitation to fall as rain in mountainous terrain and that early melting of the snowpack will lead to the early onset of spring conditions [Barnett et al., 2008]. On the basis of evidence from deciduous and coniferous forest ecosystems in eastern North America [Barr et al., 2007; Desai, 2010; Goulden et al., 1996; Hollinger et al., 2004], one might expect an earlier spring to enhance CO2 uptake in terrestrial ecosystems by increasing the length of time for photosynthetic activity [Myneni et al., 1997]. However, an analysis of a decade of measurements of CO2 exchange at NWT indicated that earlier onset of spring conditions led to less annual carbon uptake because of the strong dependence of forest carbon uptake on the winter snowpack, which tends to be lowest in years with earlier spring [Hu et al., 2010].
 Findings from this study also need to be reconciled with trends in remote sensing observations of net primary productivity over western North America in the past decade, including negative trends in parts of the U.S. West [Zhao and Running, 2010]. Simulations of future carbon accumulation in these ecosystems under scenarios of climatic change further hint that carbon accumulation is likely to decline as moisture availability declines and water stress increases [Boisvenue and Running, 2010].
 Beyond direct responses to climate, forests in this region are also sensitive to secondary abiotic and biotic disturbances, many of which are ultimately caused by climate stresses. For example, drought can lead to increased fire frequency. Warmer winters can lead to increased severity and range of pest outbreaks [Raffa et al., 2008]. The current outbreak of mountain pine bark beetle is widespread in the Rocky Mountains (Figure 2) and ongoing, and the impact of this outbreak on carbon fluxes in this region is not well documented or modeled. It is possible that these kinds of outbreaks will have a much greater impact on regional carbon balance than the direct influences of climatic variability. However, in this study, we were unable to clearly show significant variations in carbon accumulation due to disturbance, owing to the limited ability of boundary layer budget fluxes to resolve within-flight spatial variations and the relatively coarse resolution of the ecosystem models. The models did show some areas with high bark beetle mortality having different seasonal patterns and magnitudes, but they were not widespread or consistent. Further analysis, especially as beetles infest sites like NWT, will provide more insight into how climate sensitivity of carbon accumulation in these ecosystems will be mediated by changes in forest structure, plant stress responses, and long-term mortality of overstory species.
 It is unlikely that aircraft measurements, especially with only seven flights, can be used to develop these types of detailed hypotheses, but these campaigns can evaluate whether regional patterns are consistent with hypotheses developed using local data and modeling. Here we found the flight observations could not falsify hypotheses developed using local data and process models. In the future, efficient use of limited airborne assets suggests that a priori hypotheses about flux phenology and mountain pine beetle outbreaks can be used to develop flight schedules that efficiently test the scaling of local theories to regions.
4.3. Sources of Error
 While our results revealed some areas of consistent magnitudes and similar patterns between regional carbon exchange and local carbon exchange, there are many sources of uncertainty and error, which are dependent on analysis method. Consequently, there is no single way to assess whether any one observation or method is “right” per se. Instead we rely on areas of consistency to test assumptions and hypotheses about controls on regional carbon exchange. There are a number of improvements that could be made to each method.
 The NWT eddy covariance flux tower record contains not only the traditional errors associated with random flux uncertainty but also a potentially large underestimate of NEE owing to vertical and horizontal advection in complex mountain terrain. A study by Yi et al. [2008] showed that the standard low-turbulence filtering of NEE at NWT underestimates the advective flux contribution, derived from a multitower array, by ∼10%, or roughly 1 μmol m−2 s−1 for turbulent summer daytime conditions. Since no estimate of these terms was made beyond the initial 5 year study period, we are unable to correct the record presented here for advection, but note that given the uncertainties in the regional flux observations and the variability in the flux tower observations, the interpretation of our results is not expected to change, especially for daytime observations.
 In the case of the BLB, the use of multiple upwind-downwind pairs over reasonably similar flux footprints does have the effect of reducing uncertainty in the flux. The results that we report do suggest that at least three pairs are needed (based on random sampling of pairs to estimate mean flux of all pairs), but it is unclear how many pairs are sufficient and how much sampling error would be reduced with additional pairs. Further tests of this assumption are critical to advance the use of airborne budgets in terrain, including approaches similar to Mahrt . Other sources of uncertainty, such as errors in mean boundary layer depth, extrapolation to surface for total column CO2 estimation, and calibration of CO2 concentration should also be addressed since decreasing the uncertainty of these directly reduces uncertainty in estimated flux. Still, a major advantage of this technique over more complex data assimilation approaches is its simplicity in implementation and ability to retain independence from other data on regional NEE.
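 The idea of subsampling pairs to see how the mean flux estimate stabilizes can be sketched as follows. This is an illustrative Python example and not the procedure used in the study: it enumerates every subset of k pairs instead of drawing random samples, and it reuses the five receptor mean fluxes from Table 2 purely as example inputs. The function name is hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def subset_mean_spread(pair_fluxes, k):
    """Standard deviation of the mean flux across all subsets of k pairs.
    Smaller values indicate that k pairs pin down the all-pair mean better."""
    subset_means = [mean(subset) for subset in combinations(pair_fluxes, k)]
    return pstdev(subset_means)


if __name__ == "__main__":
    # Receptor mean fluxes from Table 2 (umol m-2 s-1), used as example inputs.
    fluxes = [-7.2, -1.2, -13.2, -5.2, -1.6]
    for k in range(1, len(fluxes) + 1):
        spread = subset_mean_spread(fluxes, k)
        print(f"{k} pair(s): spread of subset means = {spread:.2f}")
```

 The spread falls as more pairs are included and reaches zero when all pairs are used, so a sketch like this shows only how quickly the estimate converges toward the sampled mean, not how close that mean is to the true regional flux.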
 A more difficult issue to be confronted with the budgets is stitching together multiple flight campaigns to estimate the regional temporal patterns, when each flight case has a different footprint with somewhat different coverages of forest, beetle mortality, and mean elevation. The preferred way around this is to use a more sophisticated assimilation framework (i.e., regional inversion) for these data [cf. Matross et al., 2006; Lin et al., 2007]. Airborne carbon cycle data have also been shown to be especially valuable in evaluating the performance of inverse models that assimilate surface-only data [Stephens et al., 2007].
 As with the issues of properly assimilating data in regional inverse models, similar problems arise when using a global nested-grid inverse model at the regional scale. CT divides the world into ecoregions that can be continental in scale and adjusts scaling factors for the sum of productivity and respiration in response to assimilation of 3-hourly atmospheric CO2 data at weekly time intervals, the information content of which may not necessarily resolve many smaller-scale features [Brooks et al., 2011]. Since the problem is underconstrained, multiple and unphysical solutions may exist, though much of this is mediated in CT by using an ecosystem model driven by remote sensing and gridded surface meteorology as the prior. Still, the scaling factors for these fluxes represent a compromise solution to best simulate atmospheric CO2 across an entire region.
 In the global model, the North American evergreen needleleaf ecoregion spans both the northwestern Pacific coast and the Rocky Mountains. Scaling factors in any one week will be the same for both, and tradeoffs in the fit of CO2 data in one region versus another will lead to scaling factors that may not be optimal for the Rocky Mountain region alone. Some individual grid points and times in the Rocky Mountain region in 2007, for example, had reversed diurnal patterns of NEE occurring in late summer owing to negative scaling factors, an issue that is currently being addressed in CT (A. Jacobson, personal communication, 2010). These tradeoffs partially explain the shutdown of NEE in the latter part of the season in CT and the lack of a secondary peak. Thus, our use of CT in this analysis was not optimal and either CT should be analyzed at a large scale (as we attempted in comparison to daily SiP NEE) or used as a prior or inflow to a regional high-resolution inversion.
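 The effect of a negative scaling factor on the diurnal cycle can be shown with a toy calculation. The Python sketch below assumes, as a simplification of the description above, that the optimized flux is a single multiplicative factor applied to a prior NEE series; the function name, the prior values, and the scaling factors are all hypothetical and are not CT output.

```python
def scale_prior_nee(prior_nee, scaling_factor):
    """Apply one multiplicative scaling factor to a prior NEE series."""
    return [scaling_factor * flux for flux in prior_nee]


if __name__ == "__main__":
    # Hypothetical 3-hourly prior NEE for one day (umol CO2 m-2 s-1):
    # positive at night (respiration), negative at midday (uptake).
    prior = [2.0, 2.0, -1.0, -6.0, -8.0, -5.0, 0.5, 2.0]

    damped = scale_prior_nee(prior, 0.5)      # weaker cycle, same sign
    reversed_ = scale_prior_nee(prior, -0.5)  # negative factor flips the cycle

    print("Prior midday NEE:   ", prior[4])
    print("Scaled by 0.5:      ", damped[4])
    print("Scaled by -0.5:     ", reversed_[4])
```

 In this toy case a negative factor turns midday uptake into a midday source and nighttime respiration into nighttime uptake, which is the kind of reversed diurnal pattern described above.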
 While CT may have limited appeal when downscaling, the opposite is the case for SiP. SiP was originally designed to simulate carbon exchange at flux tower sites and has been shown to reproduce short- and long-term variation in carbon exchange at NWT. SiP may have been overly conditioned on the flux tower, thus emphasizing local-scale dynamics. The deciduous parameterization, on the other hand, was poorly constrained, and use of a single grassland flux tower to represent the significant coverage of grassland and low shrub was not ideal. It is expected that this simplified model, when extrapolated with gridded meteorology and remotely sensed LAI, should capture the key variations in carbon flux across the domain driven by spatial gradients in forcing. However, this assumes that the climate sensitivity of all evergreen needleleaf forests in the Rocky Mountains, regardless of elevation, soil type, or age since disturbance, is similar to Niwot Ridge subalpine forest, which is probably not true [Bradford et al., 2008]. Further estimates of site-level carbon exchange across a range of elevation and forest types would significantly aid model upscaling efforts, but this is logistically challenging. However, the length of the growing season [Churkina et al., 2005] and the timing of snowmelt [Hu et al., 2010] strongly influence the carbon balance at any given site. Given the discrepancy between the BLB and SiP estimates (Figures 6a and 6c), additional spatial constraints on the timing of the onset of photosynthesis by monitoring a combination of budburst and the timing of snowmelt could be a cost effective alternative for model evaluation and parameterization.
 Given appropriate care and estimation of uncertainty, aircraft estimates of NEE have great potential to complement and thus overcome the sampling limitations of upscaling approaches to provide more robust estimates of regional NEE. Upscaling with ecosystem models makes strong assumptions on our ability to sample ecological variability across the domain, while aircraft budgets make major assumptions about air transport and boundary layer flux exchanges. However, together, they provide a clearer picture of the controls on regional NEE. Ultimately a fully coupled data assimilation framework is the likely way forward to reconcile these methods. Here, our goal was to set the first step of that in motion.
 Seasonal patterns from our analysis confirmed the general importance of moisture availability driving a pattern of early summer uptake. While one flux tower was not sufficient to understand the climatic controls of carbon cycling in this region, the analysis here showed that the sensitivity of carbon cycling to seasonal climate at the Niwot Ridge subalpine forest was mostly representative of the region. The impacts of bark beetle mortality on carbon cycling were not well detected by our approaches, though it is possible that the impact of this disturbance on regional carbon cycling had not yet become significant in 2007 over the domain sampled.
 Multimodel comparisons are always fraught with analytical complexity. This caution is further warranted when models are built to simulate responses at one scale but are then applied at another scale. Future work on how to best test model assumptions and rigorously compare models is needed [e.g., Schwalm et al., 2010]. Assimilation systems that integrate the observations and best estimates of uncertainty at appropriate space and time scales for each approach are likely to improve upon these relatively simplistic comparisons.
 Even given these uncertainties, intensive field campaigns combined with regional models and flux towers do provide insight on the likely gains in understanding to be made in carbon cycling in complex terrain, especially for diagnosing and predicting changes to the seasonal cycle of NEE. Our study did confirm the importance of repeated sampling to capture temporal changes during intensive field campaigns. Given that the largest uncertainties here occurred from temporal and spatial sampling and less so from observational precision, future flight campaigns may want to focus on routine, low-cost, frequent sampling.
 The authors acknowledge support of the ACME07 field campaign by the National Science Foundation (NSF) via the University Corporation for Atmospheric Research (UCAR). A principal source of funding was a grant from the U.S. National Science Foundation Biocomplexity Program (grant EAR-0321918). Analysis was further supported by the U.S. National Oceanic and Atmospheric Administration through the Office of Atmospheric Research (OAR) Climate Program Office (CPO) (grant NA09OAR4310065). Flight operations were logistically supported by the University of Wyoming Department of Atmospheric Sciences, and we thank the pilots and support staff for their dedication and hard work in carrying out the flights, especially J. French. The Niwot Ridge AmeriFlux measurements were supported by grants from the Department of Energy (DOE)-funded National Institute for Climate Change Research and DOE Terrestrial Carbon Processes programs. We also acknowledge T. Meyers, NOAA, for permission to use Fort Peck AmeriFlux observations. NCAR is sponsored by the National Science Foundation.