We provide quantitative estimates for the spatial variability of CO2, crucial for assessing representativeness of observations. Spatial variability determines the mismatch between point observations and spatial averages simulated by models or observed from space-borne sensors. Such “representation errors” must be properly specified in determining the leverage of observations to retrieve surface fluxes or to validate space-borne sensors. We empirically derive the spatial variability and representation errors for tropospheric CO2 over the North American continent and the Pacific Ocean, using in-situ observations from extensive aircraft missions. The spatial variability and representation error of CO2 are smaller over the Pacific than over the continent, particularly at the lowest altitudes, and decrease with altitude. Representation errors resulting from spatial variability in the summer continental PBL are as large as 1∼2 ppmv for typical grid resolutions used in current models for inverse analyses.
 Current knowledge about global CO2 sources and sinks has been derived principally from the distribution of atmospheric CO2 concentrations, often using inverse modeling approaches [Gurney et al., 2002]. Most CO2 observations have been acquired in the marine boundary layer, at remote sites separated by distances of order 1000 km [National Oceanic and Atmospheric Administration (NOAA), 1997]. These stations provide a measure of global trends and seasonal variation but do not adequately characterize CO2 variations in the vertical dimension, in the horizontal at scales smaller than 1000 km, and over continents. Spatial variability determines the representation error associated with comparing point observations with model values averaged over finite gridcells [Gerbig et al., 2003a], or with pixel-mean values measured by space-borne sensors [Rayner et al., 2002]. As horizontal spatial heterogeneity increases, point observations characterize smaller areas and representation errors increase, with implications for design of observational networks [Wofsy and Harriss, 2002]. Variability in the vertical determines the number of observations needed to characterize tracer column amounts to specified accuracy, an important consideration in designing flask-sampling networks [Bakwin et al., 2003].
 Representation errors have to be properly specified within assimilation/inversion frameworks as part of error covariance matrices [Rodgers, 2000]. Neglect of these errors causes overestimation of the observational constraint derived from atmospheric observations or satellite-observed columns and may produce biased solutions [Gerbig et al., 2003a, 2003b]. Further, quantifying representation errors addresses the following question, critical for validation of space-borne sensors: Can differences between validation and satellite observations be explained by spatial variability, or does the difference reveal instrument problems?
This paper analyzes aircraft observations to characterize the spatial variability of CO2 and the associated representation error over North America and the Pacific Ocean. Only one study to date [Gerbig et al., 2003a] has quantified CO2 variability in the PBL at scales of 10^1–10^2 km^2 (relevant scales for resolution of atmospheric models and satellites), using aircraft data collected during August 2000 over North America. Aircraft provide the unique capability to observe spatial variability at the relevant scales by three-dimensional sampling over a short period of time, covering distances from a few km up to hundreds or even thousands of km. Here we significantly expand on the analyses of Gerbig et al. [2003a] by examining CO2 variability at altitudes above the PBL, including observations over the Pacific from the NASA Global Tropospheric Experiment (GTE) missions [McNeal, 1983] and a recent mission over North America (COBRA-2003).
 Data sets used in this study are from several aircraft missions. Continental observations derive from the CO2 Budget and Rectification Airborne (COBRA) missions in August 2000 and June 2003 over North America [Gerbig et al., 2003a, 2003b; Lin et al., 2004]. Data over the Pacific come from NASA's GTE missions [McNeal, 1983]: PEMWEST-A (Sept.∼Oct. 1991), PEMWEST-B (Feb.∼Mar. 1994), PEMTROPICS-A (Aug.∼Oct. 1996), PEMTROPICS-B (Mar.∼Apr. 1999), and TRACE-P (Feb.∼Apr. 2001). Data from these Pacific missions are pooled to comprise an “average” picture of CO2 variability. Locations of vertical profiles used in this analysis are shown in Figure 1. We limited the Pacific observations to those away from the East Asian coast and north of 10°N.
 The COBRA CO2 instrument was a non-dispersive infrared gas analyzer based on the design of the Harvard ER-2 CO2 analyzer with measurement errors typically ±0.25 ppmv (2-σ) or better [Daube et al., 2002]. The CO2 sensor on board the GTE missions was similar, with comparable precision and accuracy [Anderson et al., 1996]. An intercomparison between the ER-2 and GTE instruments showed agreement within 0.1 ppmv [Daube et al., 2002].
3. Analysis Methods
 We first analyze uncertainties in column-averaged CO2, caused by both instrument errors and atmospheric variability. Then we characterize the horizontal variability of column-averaged CO2 and quantify the additional uncertainty (“representation error”) due to spatial variability when column averages are represented in transport models with finite grids or when observed columns at point locations are compared with space-borne sensors that resolve with finite pixels. See Gerbig et al. [2003a] for more details concerning the analysis method summarized below.
We assume that the signal used by models and relevant for satellite validation [cf. Rayner et al., 2002] is the density-weighted CO2 concentration averaged over a finite atmospheric column (hereafter the column-averaged CO2). We defined columns for four altitude ranges over the Pacific: 0.15∼3 km, 3∼6 km, 6∼9 km, and <9 km. The lowest altitude (150 m) was selected to remove contamination from airports but still include enough observations within the marine boundary layer. For continental observations we chose the PBL height instead of the fixed altitude of 3 km as the top of the lowest column, because of strong tracer gradients across the PBL top and large diurnal variability in PBL heights. The <9 km column was selected to approximate total-atmosphere column observations (neglecting the stratosphere) [e.g., Yang et al., 2002]. Note that a more detailed calculation of column-averaged CO2 should use weighting functions corresponding to the specific averaging kernels for future space-borne sensors. The <9 km column was based on profiles that span at least 20% of the lowest layer (PBL or 3 km) and 60% of the entire 9 km column; not enough profiles spanned these altitudes in COBRA-2000.
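As an illustration of the density-weighted column average defined above, the following Python sketch integrates a single profile with the trapezoidal rule. The function name `column_average`, its arguments, and the use of an externally supplied air-density profile are assumptions made for this example; they are not taken from the analysis code used in this study.

```python
import numpy as np

def column_average(altitude_m, co2_ppmv, air_density, z_bottom_m, z_top_m):
    """Density-weighted mean CO2 between z_bottom_m and z_top_m (hypothetical helper)."""
    z = np.asarray(altitude_m, dtype=float)
    c = np.asarray(co2_ppmv, dtype=float)
    rho = np.asarray(air_density, dtype=float)

    # Sort by altitude and keep only samples inside the requested column.
    order = np.argsort(z)
    z, c, rho = z[order], c[order], rho[order]
    inside = (z >= z_bottom_m) & (z <= z_top_m)
    z, c, rho = z[inside], c[inside], rho[inside]
    if z.size < 2:
        raise ValueError("need at least two samples inside the column")

    # Trapezoidal integrals of rho*CO2 and rho over altitude.
    dz = np.diff(z)
    weighted = np.sum(0.5 * (rho[1:] * c[1:] + rho[:-1] * c[:-1]) * dz)
    weight = np.sum(0.5 * (rho[1:] + rho[:-1]) * dz)
    return float(weighted / weight)
```

Because air density decreases with height, a profile with elevated CO2 near the surface yields a column average slightly above the unweighted altitude mean.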
The uncertainty in column-averaged CO2, denoted σ(column-averaged CO2), reflects variability in the column average introduced by instrument errors and filaments in tracer profiles not represented by transport models. Since measurement errors and filaments cannot be cleanly separated, fluctuations around the column average, σ(CO2), are regarded here as an upper bound for variance arising from atmospheric variability.
σ(column-averaged CO2) needs to account for the effects of covariance between observations at proximate altitudes associated with layering of tracers. We estimated this covariance by calculating the autocorrelation between observations in different altitude bins, defining the thickness of tracer filaments to be the altitude range over which the autocorrelation decays to ∼1/e.
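The e-folding criterion described above can be sketched as follows. The example assumes that residuals from the column mean have already been averaged into equally spaced altitude bins. The second helper, `sigma_column_mean`, further assumes that the uncertainty of the column mean scales as σ(CO2) divided by the square root of the number of independent layers in the column. That scaling is a common simplification adopted here for illustration and is not stated in the text; both function names are hypothetical.

```python
import numpy as np

def layer_thickness(residuals, bin_size_m):
    """Altitude lag at which the vertical autocorrelation first drops to 1/e."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    variance = np.mean(r * r)
    if variance == 0.0:
        raise ValueError("profile has no variability")
    for lag in range(1, r.size):
        # Sample autocorrelation at this lag (in units of altitude bins).
        acf = np.mean(r[:-lag] * r[lag:]) / variance
        if acf <= 1.0 / np.e:
            return lag * bin_size_m
    return None  # autocorrelation never decayed to 1/e within the profile

def sigma_column_mean(sigma_point, thickness_m, column_depth_m):
    """Assumed scaling: sigma of the mean = sigma_point / sqrt(number of layers)."""
    n_layers = max(column_depth_m / thickness_m, 1.0)
    return sigma_point / np.sqrt(n_layers)
```

Under this assumed scaling, thinner layers imply more independent samples within a column of fixed depth and therefore a smaller uncertainty in the column mean for the same point-level σ(CO2).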
The spatial variability of column-averaged CO2 was estimated using variogram estimation. The variogram is the variance of differences between signals S, Var(Si–Sj), measured at different locations, as a function of the distance h between the measurement locations [Cressie, 1993]. The observed column-averaged CO2 served as the S in this study, and pairs of column averages were aggregated into bins of at least 20 pairs separated by distance h. We used only pairs within a three-hour window to minimize temporal variation. The average separation time between pairs (Table 1) ranges from 0.75 to 1.82 hours; this means that part of the observed spatial variability may be attributed to temporal changes in column-averaged CO2.
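A minimal sketch of this pair-binning procedure is given below. It assumes planar coordinates in km (real flight data would require great-circle distances), fixed distance bins supplied by the user, and the mean squared difference as the estimator of Var(Si–Sj), which is valid when the mean difference within a bin is zero. The function name and signature are hypothetical.

```python
import numpy as np

def empirical_variogram(x_km, y_km, t_hr, s, bin_edges_km,
                        max_dt_hr=3.0, min_pairs=20):
    """Return (mean distance, Var(Si-Sj) estimate, pair count) per distance bin."""
    x, y, t, s = (np.asarray(a, dtype=float) for a in (x_km, y_km, t_hr, s))

    # All unique pairs (i < j), restricted to the allowed time separation.
    i, j = np.triu_indices(s.size, k=1)
    close_in_time = np.abs(t[i] - t[j]) <= max_dt_hr
    i, j = i[close_in_time], j[close_in_time]

    h = np.hypot(x[i] - x[j], y[i] - y[j])
    squared_diff = (s[i] - s[j]) ** 2

    results = []
    for lo, hi in zip(bin_edges_km[:-1], bin_edges_km[1:]):
        in_bin = (h >= lo) & (h < hi)
        n = int(in_bin.sum())
        if n >= min_pairs:  # discard bins with too few pairs
            results.append((float(h[in_bin].mean()),
                            float(squared_diff[in_bin].mean()), n))
    return results
```

Bins that contain fewer than `min_pairs` pairs are simply dropped here; an adaptive binning that widens each bin until it holds enough pairs would be an equally plausible reading of the procedure.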
Table 1. Results of Statistical Analyses of CO2 Variability Within an Atmospheric Column That Gives Rise to σ(column-averaged CO2), the Uncertainty in Column-Averaged Concentration (a)
Columns: Altitude [km ASL] | Average σ(CO2) [ppmv] | Average Layer Thickness [m] | σ(column-averaged CO2) [ppmv] | Number of Pairs | Average Separation Time Between Pairs [hrs]
Row groups: N. Amer. Aug 2000 (3 rows) | N. Amer. Jun 2003 (4 rows) [table values not recovered]
(a) The horizontal variability of CO2 is characterized by fitting the power variogram model (equation (1)) to pairs of observed column-averaged CO2.
A variogram model characterizing the spatial variability of column-averaged CO2 was fit to the data set of Var(Si–Sj) versus distance h. We aimed for a conservative (low-end) estimate of the variability, to assess whether even a low-end estimate would give rise to non-negligible representation errors; thus the power variogram model [Cressie, 1993] was chosen to avoid the overestimation at small distances typical of many other variogram models [Gerbig et al., 2003a]:
where c0 (the “nugget”) represents the variability at h = 0 and was prescribed from the observed σ^2(column-averaged CO2); c1 and λ are parameters to be estimated.
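The display for equation (1) is not reproduced in this text. The sketch below therefore assumes the standard power-model form Var(Si–Sj) = c0 + c1·h^λ, which is consistent with the parameter definitions above but should be checked against Cressie [1993] and Gerbig et al. [2003a]. With c0 prescribed, c1 and λ are estimated here by a straight-line fit in log-log space; this is a simplification chosen for brevity, and a nonlinear least-squares fit may be preferable for noisy data. The function name is hypothetical.

```python
import numpy as np

def fit_power_variogram(h_km, var_diff, c0):
    """Estimate (c1, lam) in the assumed model var_diff = c0 + c1 * h**lam."""
    h = np.asarray(h_km, dtype=float)
    v = np.asarray(var_diff, dtype=float)

    # Only bins that exceed the nugget can be used in log space.
    excess = v - c0
    usable = (h > 0) & (excess > 0)
    if usable.sum() < 2:
        raise ValueError("need at least two bins above the nugget")

    # log(excess) = log(c1) + lam * log(h); polyfit returns slope first.
    lam, log_c1 = np.polyfit(np.log(h[usable]), np.log(excess[usable]), 1)
    return float(np.exp(log_c1)), float(lam)
```

Bins whose variance does not exceed the prescribed nugget are excluded from the fit, which can bias the estimate when many short-distance bins fall in that category.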
The fitted variogram model was used in a Monte Carlo simulation to generate stochastic realizations of column-averaged CO2 fields. The representation error for a given gridcell size was estimated as the standard deviation of column-averaged CO2 over all sub-grids within the gridcell, averaged over 50 simulations.
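The sketch below does not reproduce the field simulation described above. It substitutes a simpler calculation based on a standard geostatistical identity: the expected variance of point values within a block equals the semivariogram (half of Var(Si–Sj)) averaged over all pairs of points in the block. The average is approximated by Monte Carlo sampling of random point pairs in a square gridcell. The power-model form c0 + c1·h^λ, the inclusion of the nugget in the result, and the function name are assumptions of this example.

```python
import numpy as np

def representation_error(cell_km, c0, c1, lam, n_pairs=200_000, seed=0):
    """Approximate std. dev. of point values within a square gridcell."""
    rng = np.random.default_rng(seed)

    # Random pairs of points drawn uniformly inside the gridcell.
    p = rng.uniform(0.0, cell_km, size=(n_pairs, 2))
    q = rng.uniform(0.0, cell_km, size=(n_pairs, 2))
    d = p - q
    h = np.hypot(d[:, 0], d[:, 1])

    # Semivariogram is half the variance of differences.
    semivariogram = 0.5 * (c0 + c1 * h ** lam)
    return float(np.sqrt(semivariogram.mean()))
```

With c0 set to zero, doubling the gridcell size multiplies the result by 2^(λ/2), so the growth of the representation error with resolution is governed by the fitted exponent.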
4.1. Variability Within Atmospheric Columns
Table 1 summarizes the statistics of CO2 variability. The σ(CO2) shown is the average, over all profiles, of the standard deviations around the column average. σ(CO2) represents the deviation of a point measurement (e.g., flasks) from the column mean. σ(CO2) is largest at the lowest altitudes and generally decreases with altitude. σ(CO2) has values of over 0.7 ppmv in the continental PBL and 0.5 ppmv below 3 km over the Pacific, much larger than the short-term instrument precision of ∼0.05 ppmv [Anderson et al., 1996; Vay et al., 2003]. This suggests that atmospheric variability dominates over instrument errors and that σ(column-averaged CO2) is controlled primarily by unresolved atmospheric variability rather than instrument limitations.
 The covariance between CO2 at different altitudes leads to estimates of the average thickness of tracer layers as shown in Table 1. The mean tracer layer thickness within the PBL was ∼100 m over North America. The layers were thicker, approximately 300 m, at all altitudes over the Pacific and in the continental free troposphere. The higher values of σ(CO2) and thinner layers within the PBL arise from turbulent eddies. The free tropospheric variability and thicker layering can be traced to signatures of boundary-layer air transported into the free troposphere or stratospheric air transported into the troposphere [Newell et al., 1999].
The variability of column-averaged CO2 in the continental PBL of 0.19 ppmv (Aug 2000) and 0.33 ppmv (Jun 2003) reflects unresolved variance for column-averaged CO2, irreducible in practice because models cannot simulate individual turbulent eddies exactly. In the free troposphere over North America and the Pacific, σ(column-averaged CO2) ranges between 0.1 and 0.2 ppmv, potentially resolvable if transport models can simulate CO2 deviations at high fidelity. However, this is challenging for current-generation models with free tropospheric gridcells generally coarser than the ∼300 m thickness of tracer layers. σ(column-averaged CO2) is an uncertainty that may be avoided in validation efforts if the same tracer layers are sampled by space-borne measurements and validation sensors; however, this is difficult to achieve within the PBL, where rapid turbulent fluctuations take place.
4.2. Spatial Variability and Representation Error
Two examples of Var(Si–Sj) and the fitted power variogram model are shown in Figure 2 for the PBL during COBRA-2003 and 0.15∼3 km over the Pacific. Marked differences can be seen: Var(Si–Sj) at comparable distances was an order of magnitude larger for the continental case.
 The striking difference in variograms of Pacific and North America directly translated into differences in representation error (Figure 3). At 200 km, a typical resolution of atmospheric models used in inverse analyses, the representation error in the continental PBL is ∼1.0 ppmv for June 2003 and ∼1.4 ppmv for Aug 2000. The representation error increases much more gradually for the Pacific results (Figure 3c) than for the continental results, reflecting the lower variability shown in the variograms (Figure 2b). The representation error for the 0.15 ∼ 3 km atmospheric column at a gridcell size of 1000 km rises to only ∼0.5 ppmv over the Pacific. Representation errors decrease markedly with altitude over North America; above the PBL the errors are comparable to the Pacific values, but still larger. Representation errors for the <9 km column, averaging over variability at different altitudes, resulted in values similar to the 3∼6 km column over both North America and the Pacific.
Terrestrial CO2 fluxes are much stronger than oceanic fluxes [Lefevre et al., 1999] and exhibit greater spatiotemporal variability, contributing to the significantly larger spatial variability in column-averaged CO2 and representation errors observed over the continent. Gerbig et al. [2003b] have shown that the variance associated with representation errors is directly attributable to sub-gridscale variations in upstream surface fluxes. The activity of sources/sinks probably accounts for the larger representation error in the continental PBL during Aug 2000 than during Jun 2003: biospheric CO2 fluxes are stronger and probably more heterogeneous during Aug, deep into the growing season, as compared to June, when leaf-out has just occurred and soil moisture levels are generally high.
 Representation errors over the Pacific (Figure 3c) are consistent with observed large-scale gradients within the marine boundary layer [Nakazawa et al., 1992; NOAA, 1997]. We infer that representation errors over the Pacific largely reflect hemispheric scale variations in fluxes that are resolvable by typical models, with minimal contribution from strong, localized sources/sinks.
5. Summary and Conclusions
 The variability of CO2 is important for determining errors necessary for inversion studies and validation of spaceborne sensors. We stress the importance of using quantitative error estimates as shown in this study when comparing observations and models. Point CO2 observations in the PBL are expected to deviate from the PBL average simulated in models by as much as 1 ppmv due to turbulent fluctuations (Table 1). Even when observations of column averages are available, the presence of tracer layers leads to errors of 0.1 to 0.3 ppmv (Table 1). These errors should be included within error covariance matrices in any framework for assimilation of CO2 data—e.g., from aircraft or satellites that measure atmospheric columns.
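As one possible way to carry these error terms into an assimilation framework, the sketch below builds a diagonal observation-error covariance matrix by adding instrument, column (tracer-layer), and representation error variances. Treating the three terms as independent, and the errors of different observations as uncorrelated, are simplifying assumptions of this example; the function name and arguments are hypothetical.

```python
import numpy as np

def observation_error_covariance(sigma_instrument, sigma_column,
                                 sigma_representation):
    """Diagonal covariance with the three error variances summed per observation."""
    terms = np.broadcast_arrays(
        np.atleast_1d(np.asarray(sigma_instrument, dtype=float)),
        np.atleast_1d(np.asarray(sigma_column, dtype=float)),
        np.atleast_1d(np.asarray(sigma_representation, dtype=float)),
    )
    # Independent errors add in quadrature (assumption of this sketch).
    total_variance = terms[0] ** 2 + terms[1] ** 2 + terms[2] ** 2
    return np.diag(total_variance)
```

Scalars are broadcast across observations, so a single instrument error can be combined with site-specific representation errors. Correlated representation errors between nearby sites would require off-diagonal terms that this sketch omits.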
The horizontal variability in CO2 causes deviations between values at a point location and spatial averages measured by space-borne sensors or represented in models. In the case of the proposed space-borne Orbiting Carbon Observatory [http://oco.jpl.nasa.gov], the small footprint of ∼1 × 1.5 km for column CO2 retrieval minimizes the scale mismatch with validation observations from, e.g., aircraft. However, when these CO2 columns are used in current-generation transport models (with typical horizontal resolutions between 200 and 400 km [Gurney et al., 2002]), the scale mismatch results in representation errors, as suggested by the observed <9 km column, of 0.6∼0.7 ppmv (continent) and 0.2∼0.3 ppmv (Pacific). The significantly higher variability in the continental PBL translates into representation errors in current-generation models of 1.0∼1.5 ppmv (June 2003) and 1.4∼2.2 ppmv (August 2000). These errors can cause not only random noise but also biases if the measurement site is systematically influenced by sub-gridscale fluxes that differ from grid-averaged values [Gerbig et al., 2003a, 2003b]. Thus when interpreting CO2 observations in the continental PBL, models with higher resolution are necessary to resolve the large spatial variability, reduce the representation error, and minimize any potential biases.
 In contrast to the continental PBL, representation errors in the lowest 3 km over the Pacific are 0.4∼0.5 ppmv for gridcells between 200 and 400 km. The attribution of spatial variability over the Pacific to large-scale gradients implies that observed gradients from the current observational network can be used to quantify representation errors at the different measurement sites. Since observed large-scale gradients over the ocean can be resolved by current atmospheric transport models, simple schemes which interpolate the model-resolved large-scale gradient and predict concentrations at observation locations could be used to minimize the representation error for atmospheric inverse models. The problem is thus fundamentally simpler than over land.
 This paper has highlighted errors arising from spatial variability, but similar errors follow from temporal variability. Modeled CO2 concentrations are often averaged to coarser time windows (e.g., monthly, annual) before being compared to observations, and validation measurements may not take place at the same time as the space-borne measurement. Discrepancies in these cases would arise from variability within the time window. We point out the importance of future analyses to quantify this temporal variability and the associated representation error by, e.g., analyzing fast-response, continuous observations at monitoring stations.
 We thank S. Denning for helpful discussions. COBRA was jointly funded by NSF, DoE, NASA, and NOAA. NASA's Earth Science Enterprise supported the CO2 work during the GTE missions.