3.2.1. Semivariogram Model
 The spatial variability of XCO2 is quantified by modeling the semivariogram of the XCO2 distribution, which describes the degree to which two XCO2 values are expected to differ as a function of their separation distance (h). To evaluate this relationship, the raw semivariogram γ(h) is computed for all pairs of XCO2 data:
\[ \gamma(h) = \frac{1}{2}\left[X_{CO_2}(x_i) - X_{CO_2}(x_j)\right]^2 \]
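where the distance (h) between locations xi and xj is the great circle distance between these points on the surface of the Earth:
\[ h(x_i, x_j) = r \cos^{-1}\left[\sin\phi_i \sin\phi_j + \cos\phi_i \cos\phi_j \cos(\vartheta_i - \vartheta_j)\right] \]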
and where (ϕi, ϑi) are the latitude and longitude of location xi, and r is the Earth's mean radius. The semivariogram is used to model the spatial autocorrelation of XCO2 that is not explained by a deterministic trend in the data. Therefore, the XCO2 north–south gradient is estimated for each month using linear regression and subtracted from the data prior to the analysis.
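The preprocessing and pairing steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names, the value 6371 km for r, the toy data, and the choice to regress on latitude alone are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed value for the Earth's mean radius r

def great_circle_km(lat_i, lon_i, lat_j, lon_j, r=EARTH_RADIUS_KM):
    """Great circle distance from the spherical law of cosines (degrees in, km out)."""
    p_i, p_j = math.radians(lat_i), math.radians(lat_j)
    dlon = math.radians(lon_i - lon_j)
    c = math.sin(p_i) * math.sin(p_j) + math.cos(p_i) * math.cos(p_j) * math.cos(dlon)
    # Clamp to [-1, 1] so rounding error cannot push acos outside its domain.
    return r * math.acos(max(-1.0, min(1.0, c)))

def detrend_by_latitude(lats, values):
    """Subtract an ordinary least squares line in latitude from the values."""
    n = len(lats)
    mean_lat = sum(lats) / n
    mean_val = sum(values) / n
    sxx = sum((a - mean_lat) ** 2 for a in lats)
    sxy = sum((a - mean_lat) * (v - mean_val) for a, v in zip(lats, values))
    slope = sxy / sxx
    intercept = mean_val - slope * mean_lat
    return [v - (intercept + slope * a) for a, v in zip(lats, values)]

def raw_semivariogram(lats, lons, values):
    """Return (h, gamma) for every unordered pair, with gamma = 0.5 * (z_i - z_j)**2."""
    cloud = []
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            h = great_circle_km(lats[i], lons[i], lats[j], lons[j])
            cloud.append((h, 0.5 * (values[i] - values[j]) ** 2))
    return cloud

# Hypothetical toy data: four points whose values increase northward.
lats = [-60.0, -20.0, 20.0, 60.0]
lons = [0.0, 90.0, 180.0, -90.0]
xco2 = [376.0, 377.0, 378.5, 379.0]
residuals = detrend_by_latitude(lats, xco2)
cloud = raw_semivariogram(lats, lons, residuals)
```

The all-pairs loop scales quadratically with the number of points, so for a full model grid a binned or subsampled version of the cloud would normally be used for inspection.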
 A theoretical variogram model is selected on the basis of the observed variability to represent the spatial autocorrelation structure. The theoretical variogram describes the decay in spatial correlation between pairs of XCO2 measurements as a function of physical separation distance between these measurements. The exponential semivariogram [e.g., Cressie, 1993] is selected here to model MATCH/CASA XCO2 spatial variability, based on an examination of a binned version of the raw variogram. The exponential variogram is defined as:
\[ \gamma(h) = \sigma^2\left[1 - \exp\left(-\frac{h}{L}\right)\right] \]
where σ² is the sill, equal to the variance of XCO2 and therefore to half the expected squared difference between XCO2 measurements at large separation distances, and 3L represents the practical correlation range between XCO2 measurements. These parameters also define the corresponding exponential covariance function: C(h) = σ² exp(−h/L).
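As a quick numerical check of why 3L is called the practical range, the sketch below evaluates the exponential model at h = 3L, where it reaches 1 − exp(−3), or about 95% of the sill. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def exp_variogram(h, sill, L):
    """Exponential semivariogram: gamma(h) = sill * (1 - exp(-h / L))."""
    return sill * (1.0 - math.exp(-h / L))

def exp_covariance(h, sill, L):
    """Matching covariance function: C(h) = sill * exp(-h / L)."""
    return sill * math.exp(-h / L)

sill, L = 1.0, 1000.0  # hypothetical values, in ppm^2 and km

# Fraction of the sill reached at the practical range h = 3L.
frac_at_3L = exp_variogram(3.0 * L, sill, L) / sill
```

For any h, `exp_variogram(h, sill, L) + exp_covariance(h, sill, L)` equals the sill, which is the usual relation γ(h) = C(0) − C(h) for a stationary field.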
 The exponential model parameters are fitted to the raw semivariogram of the latitudinally detrended XCO2 data using nonlinear least squares. The fitted variogram parameters define the spatial covariance structure of the modeled XCO2 signal. The uncertainties of the least squares estimates of the variance (σ²) and range parameter (L) are not reported in this study because the results are based on an exhaustive sample from the simulated field, and the uncertainty resulting from limited sampling is negligible. The majority of the uncertainty associated with variogram parameters stems from assumptions about fluxes and transport, and the sensitivity to these choices is explored in sections 3.3 and 4.3.
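The text does not state which nonlinear least squares solver was used. One simple way to carry out such a fit, shown here only as a sketch, exploits the fact that the exponential model is linear in σ²: for a fixed L the best σ² has a closed form, so only L needs to be searched. The golden-section search, the search bounds, the function name, and the synthetic data are assumptions of this example, and the search presumes the profiled sum of squares has a single minimum within the bounds. A library routine such as `scipy.optimize.curve_fit` could be substituted.

```python
import math

def fit_exponential_variogram(h, gamma, L_bounds=(1.0, 20000.0), tol=1e-10):
    """Least squares fit of gamma(h) = sill * (1 - exp(-h / L)); returns (sill, L)."""

    def profile(log_L):
        # For a fixed L, the optimal sill is a one-parameter linear regression.
        L = math.exp(log_L)
        f = [1.0 - math.exp(-hi / L) for hi in h]
        sill = sum(fi * gi for fi, gi in zip(f, gamma)) / sum(fi * fi for fi in f)
        sse = sum((gi - sill * fi) ** 2 for fi, gi in zip(f, gamma))
        return sse, sill

    # Golden-section search over log(L), since L can span orders of magnitude.
    a, b = math.log(L_bounds[0]), math.log(L_bounds[1])
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = profile(c)[0], profile(d)[0]
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = profile(c)[0]
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = profile(d)[0]
    log_L = 0.5 * (a + b)
    _, sill = profile(log_L)
    return sill, math.exp(log_L)

# Synthetic, noise-free check with hypothetical parameters (1.2 ppm^2, 800 km).
h_km = [100.0 * k for k in range(1, 61)]
gamma_obs = [1.2 * (1.0 - math.exp(-hi / 800.0)) for hi in h_km]
sill_hat, L_hat = fit_exponential_variogram(h_km, gamma_obs)
```

On noise-free synthetic data the fit should return the generating parameters; on a real raw semivariogram cloud the scatter is large, so the estimates depend on how many pairs are included and how they are weighted.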
3.2.2. Spatial Variability Analysis
 The global spatial variability is defined through semivariogram parameters fitted to the raw semivariogram. For each day, the raw semivariogram is constructed using detrended MATCH/CASA XCO2 at 1300 local time for all model grid cells. The analysis is repeated for each day of the model year to identify both the seasonal trends in global variability at daily resolution, and the relationships between these trends and seasonal changes in global CO2 flux and transport.
 Regional variability in the spatial covariance structure is evaluated through localized variograms representing subareas of the global domain. This analysis requires areas (regions) large enough to capture the scales of variability within a given subdomain of the model, while at the same time small enough to reveal the characteristics of local spatial variability.
 A regional variability analysis with a similar methodological goal was previously adopted by Doney et al.  to measure the mesoscale global spatial variability of satellite measurements of ocean color. In that study, daily anomalies from the monthly block mean of the natural log of chlorophyll concentrations were used to fit spherical variograms for nonoverlapping 5° regions globally.
 In the case of XCO2, regional covariance parameters were fit for each model grid cell, resulting in a regional spatial variability analysis at a 5.5° resolution. Because regional spatial variability may reflect global general circulation patterns as well as differences in surface fluxes between regions, correlation lengths of XCO2 may extend beyond individual continents or ocean basins. To account for this, the local semivariogram parameters in the current work are constructed to reflect both the local variability and its relationship to global spatial variability. First, regions are defined as overlapping 2000 km radius circles centered at each model grid cell, resulting in a total of 2048 regions covering the globe. A 2000 km radius was selected because it is sufficiently large to capture much of the variability in the vicinity of a given grid cell, while being small enough to capture regional variability in the spatial covariance structure. Second, the raw semivariogram (γregion(h)) is constructed using pairs of points with one point always within the defined region (XCO2(xregion)) and the other either within or outside that region (XCO2(xregion + h)) (see Figure 1). This approach focuses on the variability observed within each subregion, while also accounting for larger scales of variability:
\[ \gamma_{region}(h) = \frac{1}{2}\left[X_{CO_2}(x_{region}) - X_{CO_2}(x_{region} + h)\right]^2 \]
Third, to emphasize the covariance of XCO2 within the analyzed region, weighted nonlinear least squares is used to fit the local semivariogram parameters, with higher weights assigned to points within a separation distance less than or equal to 4000 km. Numerically, correlation lengths are also restricted to a maximum of half the Earth's circumference.
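The three steps of the regional analysis can be illustrated with the sketch below, which builds the regional semivariogram cloud for one region and attaches a weight to each pair. The 2000 km radius and 4000 km threshold come from the text; the weight values (10 and 1), the Earth radius of 6371 km, the function names, and the toy data are hypothetical, since the actual weights are not given here. The constant for half the Earth's circumference is meant as an upper bound to pass to whatever fitting routine is used.

```python
import math

EARTH_RADIUS_KM = 6371.0            # assumed mean Earth radius
REGION_RADIUS_KM = 2000.0           # region radius stated in the text
NEAR_DISTANCE_KM = 4000.0           # pairs at or below this distance get more weight
MAX_CORRELATION_LENGTH_KM = math.pi * EARTH_RADIUS_KM  # half the circumference

def great_circle_km(lat_i, lon_i, lat_j, lon_j, r=EARTH_RADIUS_KM):
    """Great circle distance from the spherical law of cosines (degrees in, km out)."""
    p_i, p_j = math.radians(lat_i), math.radians(lat_j)
    dlon = math.radians(lon_i - lon_j)
    c = math.sin(p_i) * math.sin(p_j) + math.cos(p_i) * math.cos(p_j) * math.cos(dlon)
    return r * math.acos(max(-1.0, min(1.0, c)))

def regional_cloud(center, lats, lons, values, near_weight=10.0, far_weight=1.0):
    """Return (h, gamma, weight) for pairs with at least one point inside the region."""
    inside = [k for k in range(len(values))
              if great_circle_km(center[0], center[1], lats[k], lons[k]) <= REGION_RADIUS_KM]
    inside_set = set(inside)
    cloud = []
    for i in inside:
        for j in range(len(values)):
            # Skip self-pairs, and count each inside-inside pair only once.
            if j == i or (j in inside_set and j < i):
                continue
            h = great_circle_km(lats[i], lons[i], lats[j], lons[j])
            gamma = 0.5 * (values[i] - values[j]) ** 2
            weight = near_weight if h <= NEAR_DISTANCE_KM else far_weight
            cloud.append((h, gamma, weight))
    return cloud

# Hypothetical toy data: two points in eastern North America, two far away.
lats = [40.0, 45.0, 0.0, -30.0]
lons = [-80.0, -75.0, 0.0, 150.0]
vals = [0.4, -0.1, 0.9, -0.6]
cloud = regional_cloud((42.0, -78.0), lats, lons, vals)
```

Each weight would multiply the corresponding squared residual in the least squares objective, so that the fit is dominated by pairs separated by 4000 km or less.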
Figure 1. The regional spatial covariance evaluates the spatial variability of (left) XCO2 values within a region (e.g., eastern North America) and between this region and (right) global XCO2 values.
 Conceptually, a higher variance is representative of more overall variability, as is a shorter correlation length, which is indicative of more variability at smaller scales. The parameter ho is introduced to provide a single representation of the degree of variability observed in different regions, and to merge information about both the variance and correlation lengths of XCO2 variability. If we consider a single sounding at a known location, ho is defined as the maximum distance from the sounding location at which the mean squared XCO2 prediction error is below a preset value, Vmax. The mean squared prediction error is the uncertainty associated with using the sounding to predict the unknown value at a given distance away from the sounding location, using ordinary kriging. Ordinary kriging is a minimum variance unbiased interpolator that takes advantage of knowledge of the spatial covariance structure to interpolate available measurements while providing an estimate of the interpolation error [Chiles and Delfiner, 1999]. For an exponential variogram:
\[ h_o = -L_R \ln\left(1 - \frac{V_{max}}{2\sigma_R^2}\right) \]
where σR² and LR are the fitted regional variance and range parameter, respectively.
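The expression for ho for the exponential model follows from the single-sounding case of ordinary kriging. With one observation, the unbiasedness constraint forces its weight to 1, so the prediction at any location is the observed value and the mean squared prediction error is twice the semivariogram at the separation distance. Setting that error equal to Vmax and solving for the distance gives ho. The notation X for XCO2 and x0, x1 for the prediction and sounding locations is introduced here only for the derivation.

```latex
\begin{align}
\hat{X}(x_0) &= X(x_1), \qquad h = |x_0 - x_1| \\
E\left[\left(\hat{X}(x_0) - X(x_0)\right)^2\right]
  &= E\left[\left(X(x_1) - X(x_0)\right)^2\right] = 2\gamma(h) \\
2\sigma_R^2\left[1 - \exp\left(-\frac{h_o}{L_R}\right)\right] &= V_{max} \\
h_o &= -L_R \ln\left(1 - \frac{V_{max}}{2\sigma_R^2}\right)
\end{align}
```

The result is finite only when Vmax < 2σR²; otherwise the prediction error never reaches Vmax at any distance.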
 Both a higher regional variance σR² and a shorter regional range parameter LR lead to a decrease in the overall spatial scale over which a given measurement is representative of the surrounding XCO2 values. It should be noted that no measurement error is assumed in the calculation of the regional variance σR² and range parameter LR. Therefore, the resulting ho values demonstrate the overall spatial scale of the information provided by a noise-free XCO2 measurement over the measurement region and time.
 In subsequent sections of this work, variability inferred from the MATCH/CASA model is compared to other models and field data, where different theoretical variogram models are used to represent XCO2 spatial variability. Because parameters used to describe the variability differ between variogram models, the ho parameter also provides a convenient universal metric that can be compared across models. The equivalent ho parameters for the other variogram models used in this study are presented in subsequent sections.
 Conceptually, the ho parameter can also be thought of as a measure of the expected relative spatial density of retrieved soundings that would be required to capture the spatial variability of XCO2 over different regions. The choice of Vmax is somewhat flexible, but should represent a level of interpolation uncertainty that is relevant to potential applications of the data. In the presented results, Vmax is chosen to be 0.25 ppm² (√Vmax = 0.5 ppm). This level is comparable to the 1 ppm regional-scale uncertainty described as a goal for OCO [Chevallier et al., 2007]. It should be noted that Vmax represents the interpolation uncertainty assuming no measurement error. Thus, the lower variance was chosen to compensate for the additional uncertainty that would be contributed by measurement errors and other sources of error.
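To give a feel for the magnitudes involved, the sketch below computes ho for an exponential variogram with Vmax = 0.25 ppm². It assumes ordinary kriging from one noise-free sounding, for which the prediction error variance at distance h is 2γ(h). The function name and the example values of the regional variance and range parameter are hypothetical and are not results from this study.

```python
import math

def h0_exponential(sill, L, v_max=0.25):
    """Distance (same units as L) at which 2 * gamma(h) first reaches v_max.

    Assumes gamma(h) = sill * (1 - exp(-h / L)) and a single noise-free sounding.
    """
    if v_max >= 2.0 * sill:
        # The error variance saturates at 2 * sill and never reaches v_max.
        return math.inf
    return -L * math.log(1.0 - v_max / (2.0 * sill))

# Hypothetical regional parameters: variance in ppm^2, range parameter in km.
h0_base = h0_exponential(sill=1.0, L=1000.0)          # about 134 km
h0_more_variance = h0_exponential(sill=2.0, L=1000.0) # about 65 km
h0_shorter_range = h0_exponential(sill=1.0, L=500.0)  # about 67 km
```

Doubling the variance or halving the range parameter each roughly halves ho in this example, consistent with the interpretation that more variable regions would require a denser set of soundings.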