Two community-wide undertakings define the current state of knowledge for the atmospheric CO2 budget: the GLOBALVIEW-CO2 in situ measurement network [GLOBALVIEW-CO2, 2005] and the TransCom 3 transport/flux estimation experiment [Gurney et al., 2002, 2003, 2004, 2005; Law et al., 2003; Baker et al., 2006a]. The GLOBALVIEW-CO2 network’s emphasis on acquiring accurate measurements through rigorous experimental methods, constant calibration using procedures and materials traceable to WMO standards, and continual vigilance against biases has created the recognized reference standard data set for atmospheric CO2 observations. The network collects surface in situ CO2 measurements at approximately 120 stations worldwide, spanning latitudes from the South Pole to 82.4°N (Alert, Canada). Typical measurement uncertainties are on the order of 0.1 ppm (0.03%).
2.1. Spatial and Temporal Gradients
Distributions of atmospheric CO2 were simulated with the Model of Atmospheric Transport and Chemistry (MATCH) three-dimensional atmospheric transport model [Olsen and Randerson, 2004]. MATCH represents advective transport using a combination of horizontal and vertical winds and has parameterizations of wet and dry convection and boundary layer turbulent mixing [Rasch et al., 1994]. MATCH operates off-line using archived meteorological fields, which for this study were derived from the NCAR Community Climate Model version 3 with T21 horizontal resolution (approximately 5.5° × 5.5°) and 26 vertical levels from the surface up to 0.2 hPa (about 60 km) on hybrid sigma pressure levels. The top of the first model level is approximately 110 m. The meteorological fields represent a climatologically “average” year rather than any specific year. These meteorological data were archived every 3 model hours and interpolated to the 30-min MATCH time step. In this configuration MATCH has an interhemispheric transport time of approximately 0.74 years, near the middle of the 0.55 to 1.05 year range of the models that participated in the TransCom 2 experiment [Denning et al., 1999]. A single year of dynamical inputs was recycled for the multiyear runs used in this study.
Constraints on CO2 sources and sinks incorporated fossil fuel emissions as estimated by Andres et al. , atmosphere-ocean exchange as estimated from sea-surface pCO2 measurements by Takahashi et al. , and biospheric fluxes modeled using the Carnegie-Ames-Stanford Approach (CASA) model [Randerson et al., 1997], including a diurnal cycle of photosynthesis and respiration. Simulated data were obtained from the model output by integrating vertically according to the OCO averaging kernel at the T21 horizontal resolution. In these simulations terrestrial ecosystem exchange was annually balanced; in other words, we omitted the “missing” carbon sink necessary to balance fossil carbon sources with the atmospheric CO2 growth rate [Gurney et al., 2002; Tans et al., 1990].
Figure 1. Modeled column-averaged CO2 dry air mole fraction (XCO2) simulated for the year 2000 using the CASA/MATCH model [Olsen and Randerson, 2004]. Values for 1300 LST on 15 January and 15 July are plotted relative to the annual average south of 60°S.
These results indicate that space-based data with precisions better than 2 ppm are needed to resolve the peak-to-peak amplitudes in monthly and annual XCO2. This precision is also sufficient to resolve regional scale meridional variations over the Northern Hemisphere boreal forests or the Southern Ocean. The OCO sampling strategy (section 4) is specifically designed to return space-based measurements with the high sensitivity and dense sampling in space and time required to attain this precision even when cloud and aerosol interference prevents observations of the complete atmospheric column in the majority of the observed scenes.
2.2. Global and Regional Variability
Global spatial variability was quantified by analyzing the CASA/MATCH data calculated for 2000. Raw and experimental variograms were calculated to quantify the global variability of XCO2. The raw variogram is defined for any two measurements as [Cressie, 1991]
$$\gamma(h) = \tfrac{1}{2}\,[z(x) - z(x')]^2$$
where γ(h) is the raw variogram, z(x) is a measurement value at location x, z(x′) is a measurement value at location x′, and h is the separation distance between x and x′. The distance was calculated using the great circle distance between points on the surface of the earth [e.g., Michalak et al., 2004]:
$$h(x_i, x_j) = r \cos^{-1}\left[\sin\phi_i \sin\phi_j + \cos\phi_i \cos\phi_j \cos(\lambda_i - \lambda_j)\right]$$
where the coordinates x_i = (ϕ_i, λ_i) are the latitude and longitude, respectively, of the sample locations, and r is the mean radius of the earth. The raw variogram values are averaged for different ranges of separation distance to obtain the experimental variogram. Because the variogram is designed to represent the portion of the distribution that cannot be represented by a deterministic trend, the consistent North-South gradient was accounted for by detrending the data with respect to latitude, using the simulated latitudinal gradient for 1300 LST each month. In this way, the variograms represent the stochastic, spatially correlated portion of the distribution, which is the portion that will need to be estimated to obtain a continuous distribution based on the point measurements taken by OCO. The resulting variograms are presented in Figure 2.
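The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the study: it assumes the input values have already been detrended with respect to latitude, that r = 6371 km, and that equal-width distance bins are acceptable. The function names and the `bin_width_km` parameter are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed value for the mean radius r


def great_circle_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return r * math.acos(max(-1.0, min(1.0, c)))


def experimental_variogram(points, bin_width_km):
    """Average raw variogram values within equal-width distance bins.

    points: list of (lat, lon, z) tuples, with z assumed already detrended.
    Returns {bin_index: (mean distance, mean raw variogram, pair count)}.
    """
    sums = {}
    for i in range(len(points)):
        lat_i, lon_i, z_i = points[i]
        for j in range(i + 1, len(points)):
            lat_j, lon_j, z_j = points[j]
            h = great_circle_km(lat_i, lon_i, lat_j, lon_j)
            gamma = 0.5 * (z_i - z_j) ** 2  # raw variogram for this pair
            acc = sums.setdefault(int(h // bin_width_km), [0.0, 0.0, 0])
            acc[0] += h
            acc[1] += gamma
            acc[2] += 1
    return {b: (s[0] / s[2], s[1] / s[2], s[2]) for b, s in sorted(sums.items())}
```

The double loop visits every pair once, which is adequate for the 2048 cells of a T21 grid (about 2.1 million pairs) but would need a vectorized or subsampled approach for much denser data.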
Figure 2. Global experimental and fitted theoretical variograms for CASA/MATCH modeled column-averaged CO2 dry air mole fraction presented in Figure 1 (1300 LST for 15 January and 15 July 2000).
The experimental variogram for each month was fitted using an exponential theoretical variogram model [Cressie, 1991; Michalak et al., 2004]
$$\gamma(h) = \sigma^2\left[1 - \exp\left(-\frac{h}{L}\right)\right]$$
where σ² is the semivariance and L is the length parameter. The theoretical variogram describes the decay in spatial correlation between pairs of measurements as a function of physical separation distance between these samples. The overall variance at large separation distances is 2σ² and the practical correlation range is approximately 3L. The σ² and L parameters were estimated using a least squares fit to the raw variogram. The fitted variograms are presented in Figure 2, and the global variance and correlation range for each month are summarized in Table 1. The correlation length represents the distance at which the expected covariance between z(x) and z(x′) approaches zero, and the measurement z(x′) no longer provides useful information about the value z(x). The variance indicates the maximum uncertainty at unsampled locations, in the absence of nearby measurements, assuming the overall mean or trend is known.
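A least squares fit of this kind can be sketched as follows. The optimizer used in the study is not stated, so this example substitutes a simple approach: for each trial length L on a user-supplied grid, the best σ² has a closed form because the model is linear in σ², and the grid point with the smallest sum of squared errors is kept. The function names, the grid, and the synthetic input values are all assumptions made for illustration.

```python
import math


def exp_variogram(h, sigma2, length):
    """Exponential variogram model: sigma2 * (1 - exp(-h / length))."""
    return sigma2 * (1.0 - math.exp(-h / length))


def fit_exp_variogram(h_vals, gamma_vals, length_grid):
    """Least squares fit of (sigma2, length) by grid search over length.

    Returns (sigma2, length, sse) for the best grid point,
    or None if no grid point is usable.
    """
    best = None
    for length in length_grid:
        f = [1.0 - math.exp(-h / length) for h in h_vals]
        denom = sum(fk * fk for fk in f)
        if denom == 0.0:
            continue
        # Closed-form least squares estimate of sigma2 for this length.
        sigma2 = sum(g * fk for g, fk in zip(gamma_vals, f)) / denom
        sse = sum((g - sigma2 * fk) ** 2 for g, fk in zip(gamma_vals, f))
        if best is None or sse < best[2]:
            best = (sigma2, length, sse)
    return best


# Synthetic check: noise-free values generated from known parameters.
h_demo = [250.0 * k for k in range(1, 41)]                # distances, km
g_demo = [exp_variogram(h, 0.5, 1000.0) for h in h_demo]  # variogram, ppm^2
sigma2_hat, length_hat, _ = fit_exp_variogram(
    h_demo, g_demo, [100.0 * k for k in range(1, 31)]
)
print("correlation range 3L (km):", 3.0 * length_hat)
print("variance 2*sigma^2 (ppm^2):", 2.0 * sigma2_hat)
```

The same function accepts either raw variogram values for individual pairs or binned experimental values. The resolution of the estimate for L is limited by the grid spacing, so a finer grid or a gradient-based optimizer would be needed for production use.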
Table 1. Global XCO2 Variability at 1:00 PM Local Time for a Representative Day in Each Month of the Year 2000 From CASA/MATCH Model Runs
| Month | Correlation Length (3L), km | Variance (2σ²), ppm² |
To assess the regional variability of XCO2, a separate variogram was constructed for each grid cell of the 5.5° × 5.5° model output (2048 cells globally), centered at that grid cell. In calculating the raw variogram for each grid cell, only pairs of data points with at least one member within a 2000-km radius of the grid cell were considered. Therefore the raw variogram consisted of data pairs where either (1) both measurements were within 2000 km of the central grid cell, or (2) one measurement was within 2000 km of the central grid cell and the other was not. In essence, this approach quantifies the variability between measurements in the vicinity of a grid cell and the global distribution.
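The pair selection rule for the local variograms can be illustrated with a short sketch. It assumes points are given as (latitude, longitude, value) tuples in degrees, that r = 6371 km, and that "within" includes the boundary; the function names are hypothetical and this is not the code used for the study.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, r=6371.0):
    """Great-circle distance in km; r is an assumed mean Earth radius."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    return r * math.acos(max(-1.0, min(1.0, c)))


def local_pairs(points, center, radius_km=2000.0):
    """Index pairs (i, j), i < j, with at least one member near the center.

    points: list of (lat, lon, z); center: (lat, lon) of the grid cell.
    """
    c_lat, c_lon = center
    near = [
        great_circle_km(lat, lon, c_lat, c_lon) <= radius_km
        for lat, lon, _ in points
    ]
    return [
        (i, j)
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if near[i] or near[j]  # cases (1) and (2) in the text
    ]
```

Pairs in which both points lie outside the 2000-km radius are the only ones excluded, so each local variogram still draws on the global field through case (2).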
Figure 3. Locally estimated correlation lengths for CASA/MATCH modeled column-averaged CO2 dry air mole fraction presented in Figure 1 (1300 LST for 15 January and 15 July 2000). In general, more retrievals will be required to characterize regions with shorter (red) correlation lengths.
Figure 4. Locally estimated variance for CASA/MATCH modeled column-averaged CO2 dry air mole fraction presented in Figure 1 (1300 LST for 15 January and 15 July 2000). In general, more retrievals will be required to characterize regions with larger (red) variances.
The results of the regional variability analysis are qualitatively consistent with the results of Lin et al. , who also found longer correlation lengths over the Pacific relative to continental North America in their analysis of aircraft-derived fields. A quantitative comparison is difficult to establish because Lin et al.  used a nonstationary power variogram to represent CO2 variability. Such a variogram does not have a finite maximum variance (i.e., sill) or correlation length to compare to those presented in Figures 3 and 4. One quantitative comparison that can be made is the separation distance at which the expected squared difference in XCO2 between two sampling locations reaches a specified value. Based on the variogram used in Lin et al. , the separation distance at which the squared difference between vertically integrated CO2 concentrations (<9 km) is expected to reach 1 ppm² is 57 km over the North American continent in June 2003, and 727 km over the Pacific Ocean. Data over the Pacific Ocean were a composite of multiple years of springtime (February to April) and fall (August to October) data. For the CASA/MATCH data, the separation distance is 460 km over the North American continent for June 2000, and 13,600 km (March 2000) and 5200 km (September 2000) over the Pacific Ocean. The two sets of results are consistent in showing greater spatial variability over the continental regions, but Lin et al.  show more overall variability at smaller scales. The higher variability inferred by Lin et al.  is most likely due largely to the scale at which the aircraft measurements were taken relative to the scale of the CASA/MATCH modeled data, and to the limited vertical extent of the aircraft profiles. As was previously discussed, data at finer scales typically exhibit more variability relative to coarser data. This will need to be considered further in interpreting global model data in the context of fine scale OCO measurements.
Regional scale variability will also be driven by local conditions and meteorology [Nicholls et al., 2004]. Spatial and temporal heterogeneity of the covariance structure must therefore be taken into account in the design of a sampling strategy and retrievals.
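For an exponential variogram, the separation distance used in the comparison with the aircraft-derived results has a closed form. The following is a sketch under stated assumptions, not necessarily the procedure used to obtain the values quoted above: it assumes that the detrended field has no remaining mean difference between the two locations, so that the expected squared difference equals 2γ(h), and it writes d² for the target squared difference (a symbol introduced here for illustration).

```latex
\begin{align}
E\big[(z(x) - z(x'))^2\big] &= 2\gamma(h)
  = 2\sigma^2\left(1 - e^{-h/L}\right) = d^2 \\
\Rightarrow\quad h &= -L \ln\!\left(1 - \frac{d^2}{2\sigma^2}\right),
  \qquad d^2 < 2\sigma^2 .
\end{align}
```

Under these assumptions a finite separation distance exists only where the local variance 2σ² exceeds d². When d² is much smaller than 2σ², the expression reduces to approximately L d²/(2σ²), so the distance grows with the correlation length and shrinks with the variance.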