Orbiting Carbon Observatory: Inverse method and prospective error analysis



[1] The objective, design, and implementation of the OCO inverse method are presented. The inverse method is the algorithm which finds the profile-weighted mean mixing ratio, XCO2, which best fits the measured spectrum, given a “forward model” which calculates the spectrum for a given atmospheric state, surface, and instrument properties. Minimizing bias among comparative values of XCO2 is a critical objective. The algorithm uses an “optimal,” maximum a posteriori inverse method, with weak a priori constraint, and employs a state vector containing atmospheric and surface properties expected to vary significantly between soundings. An extensive operational characterization and error analysis will be employed, producing quantities designed to aid atmospheric modelers in use of the OCO data. In particular, comparison to inverse models of surface CO2 flux will require use of the OCO column averaging kernel and a priori state vector. An off-line error analysis has also been developed for more detailed error studies, and its use is illustrated by prospective application to case studies of nadir observations in summer and winter at three sites. Uncertainties due to noise, geophysical variability, and spectroscopic parameters are considered in detail. At low and midlatitudes, the single-sounding errors due to these sources are expected to be ∼0.7–0.8 ppm for high-sun conditions and ∼1.5–2.5 ppm for low sun (winter). Errors from the same sources in semimonthly regional averages are predicted to be <1 ppm for all conditions.

1. Introduction

[2] Measurement of the sources and sinks of atmospheric carbon currently relies on a network of sites making routine in situ measurements from the surface and from tall towers, augmented by aircraft campaigns [GLOBALVIEW-CO2, 2006]. However, these in situ measurements alone have had limited success in inferring carbon flux between the surface and the atmosphere, because the existing measurement network is far too sparse to resolve uncertainties associated with atmospheric mixing, and the aliasing between atmospheric transport and surface emission [Gurney et al., 2003]. Local carbon sources further complicate their interpretation.

[3] Column measurements are much less sensitive to uncertainties in transport calculations, both because they have a larger sampling “footprint,” and because, to first order, vertical mixing merely redistributes gas within the column. The combination of column and surface measurements then is a powerful constraint on model transport. Recent studies have shown that the different response of these two types of measurement to transport and surface emission may be able to resolve uncertainties due to their covariance on diurnal and seasonal scales [Olsen and Randerson, 2004; Yang et al., 2007]. Furthermore, Olsen and Randerson conclude that even a small number of well chosen column measurement sites will be important in resolving this ambiguity.

[4] However, to be valuable in this way, column measurements must have high precision and intercalibration accuracy, indeed more so than was possible until quite recently. For example, Olsen and Randerson [2004] calculate the February/August change in CO2 column at Northern Hemisphere midlatitudes to be ∼6 ppm, so to measure it within 10% requires a precision of ∼0.6 ppm. Recent ground-based column measurements by solar-viewing Fourier transform spectrometers (FTS) have bettered this level of precision, and indeed achieve comparable accuracy once calibrated by aircraft in situ measurements [Washenfelder et al., 2006].

[5] Satellite measurements of the CO2 column have the potential to add the perspective of global coverage with much greater sampling density, providing insight into sources and sinks on both regional and continental scales. Surface flux inversion model calculations have shown that a precision of 2.5 ppm (1 part in 160) or better is necessary in regional monthly averages for CO2 [Rayner and O'Brien, 2001].

[6] Measurements from space of sunlight reflected from the Earth's surface, which can be made in the near IR and visible regions, are most sensitive to the lower troposphere where CO2 is most variable. Thus such measurements are expected to be more valuable for understanding sources and sinks than measurements of thermal radiation in the mid-IR, which are primarily sensitive to the middle and upper troposphere [Crisp et al., 2004; Chevallier et al., 2005]. Space-based near-IR observations have become available from the SCIAMACHY instrument onboard ENVISAT. CO2 retrievals from these observations have been reported by, e.g., Buchwitz et al. [2006] and Barkley et al. [2006].

[7] The NASA Orbiting Carbon Observatory (OCO) is being developed specifically to address the need for such observations [Crisp et al., 2004; Miller et al., 2007]. OCO will measure the profile-weighted mean mixing ratio of CO2 in dry air, XCO2 (also called the dry-air mole fraction). Unlike any previous mission, it will be dedicated to measurement of XCO2 with the precision and coverage needed to retrieve surface sources and sinks, and is scheduled for launch in December 2008. Its most critical objective will be to measure XCO2 with 1–2 ppm precision on regional scales (1000 × 1000 km) on semimonthly intervals for up to 2 years. OCO carries a grating spectrometer designed to measure the absorption of reflected sunlight by near infrared CO2 bands near 1.61 and 2.06 μm and in the O2 A-band near 0.765 μm with resolving power ∼17,000–21,000. It will employ two routine observing modes. In nadir mode, the observatory will point the instrument toward the local nadir. This mode provides the highest spatial resolution, and is expected to provide the most reliable data over bright land surfaces and in regions with patchy clouds. In glint mode, it will point the instrument at the glint spot, where sunlight is specularly reflected from the surface. This mode is expected to yield more reliable data over ocean and ice covered surfaces, which are relatively dark at near infrared wavelengths.

[8] The OCO retrieval algorithm is being developed to retrieve XCO2 from near-infrared radiance spectra. This algorithm is designed to minimize bias due to observing conditions, interfering species, and instrument characteristics. Each measured spectrum includes information about the atmospheric, surface, and instrument properties. We have chosen to use a state vector which incorporates as many of these properties as are likely to vary significantly during the course of a few soundings, and to apply a loose constraint to the state vector through the a priori covariance matrix.

[9] In this paper we will describe the inverse method of the OCO “first principles” retrieval algorithm, which will retrieve XCO2 by modeling the measured spectra as rigorously as is practical. The products of the algorithm will be discussed in depth, and we perform a detailed study to characterize the algorithm, testing in particular for any dependence of the derived XCO2 on other environmental quantities and on the algorithm itself. The algorithm has previously been tested, and in part validated, by application to real, existing measurements from the SCIAMACHY satellite instrument [Bösch et al., 2006]. Optimal parameters and procedures for use of the complete algorithm are under development, which is expected to continue until well after launch, through processing of initial OCO data. The algorithm will ultimately be validated for use on OCO measurements through a dedicated validation program, where it will be used to analyze both FTS and OCO data, and compared to independent analyses of the FTS results, which themselves will be validated by airborne, tower, and surface in situ measurements [cf. Crisp et al., 2004; Washenfelder et al., 2006].

2. Inverse Method

[10] The OCO Level 2 algorithm is being developed to retrieve the profile-weighted CO2 dry air mixing ratio, XCO2, from near-infrared radiance spectra. Although the retrieval algorithm will be primarily used for simultaneously fitting the O2 A-band at 0.76 μm and the CO2 bands at 1.61 μm and 2.06 μm, it can be applied to many different retrieval problems. The algorithm was designed with a flexible scheme to enable retrievals from many different atmosphere/surface/instrument scenarios. For example, Bösch et al. [2006] evaluated the algorithm by retrieving XCO2 from both SCIAMACHY nadir measurements of sunlight reflected from the Earth's surface and cotemporal measurements of direct sunlight made from a ground-based high-resolution FTS. The major components of the OCO Level 2 algorithm are the forward model and the inverse method. A description of the forward model is given by Bösch et al. [2006]. It consists of a radiative transfer model of the atmosphere coupled to a model of the solar spectrum to calculate the monochromatic spectrum at the top of the atmosphere, which is then convolved with the response function expected for OCO. The details of the inverse method are presented below.

2.1. Formulation and Implementation

[11] This section uses the notation and concepts of Rodgers [2000]. The spectrum, or measurement vector y, is expressed symbolically as y = F(x) + ɛ where x is the state vector, F is the forward model, and ɛ is the vector of measurement errors.

[12] The solution of the OCO inverse method is the state vector $\hat{x}$ with maximum a posteriori probability, given the measurement y. On the basis of earlier experience, our inverse method employs the Levenberg-Marquardt modification of the Gauss-Newton method. The operational inverse method consists of a set of routines which are essentially mathematical and independent of the physics embodied by the measurement and state vectors. This implies that the structure of both vectors may be varied, so the routines are readily applied to other experiments, such as SCIAMACHY or ground-based FTS [Bösch et al., 2006]. The ability to retrieve XCO2 from space-based and ground-based measurements using the same algorithm is critical for detecting and removing biases from the space-based data and forms a critical component of the OCO validation strategy [Crisp et al., 2004].

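[13] We solve for the state vector update $dx_{i+1}$, using a slightly modified form of Rodgers's [2000] equation 5.36, to improve numerical accuracy by avoiding inversion of a large matrix:

$$\left[(1+\gamma)\,S_a^{-1} + K_i^T S_\varepsilon^{-1} K_i\right] dx_{i+1} = K_i^T S_\varepsilon^{-1}\left[y - F(x_i)\right] + S_a^{-1}\left(x_a - x_i\right) \tag{1}$$

where K is the weighting function matrix, or Jacobian, $K = \partial F/\partial x$ (with $K_i$ evaluated at $x_i$), $x_a$ is the a priori state vector, $S_a$ is the a priori covariance matrix, $S_\varepsilon$ is the measurement covariance matrix, and γ is the Levenberg-Marquardt parameter.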

[14] After each calculation of $dx_{i+1}$, before using it to update $x_i$, we assess the impact of nonlinearity using the difference between the effect of $dx_{i+1}$ on the forward model calculation and its effect on a linear approximation to the forward model. If the effect is above an empirically determined threshold, we reject $dx_{i+1}$, increase γ, and calculate a new value of $dx_{i+1}$.
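The update and the nonlinearity check can be sketched in a few lines of NumPy. This is a minimal illustration, not the operational code: it assumes the update solves the linear system $[(1+\gamma)S_a^{-1} + K^T S_\varepsilon^{-1} K]\,dx = K^T S_\varepsilon^{-1}[y - F(x_i)] + S_a^{-1}(x_a - x_i)$, and the function names `lm_step` and `nonlinearity_ratio`, the relative-norm measure of nonlinearity, and the toy linear forward model are all assumptions made for the example.

```python
import numpy as np

def lm_step(K, S_eps_inv, S_a_inv, y, F_x, x_i, x_a, gamma):
    """Solve the damped normal equations for the update dx (no explicit matrix inverse)."""
    lhs = (1.0 + gamma) * S_a_inv + K.T @ S_eps_inv @ K
    rhs = K.T @ S_eps_inv @ (y - F_x) + S_a_inv @ (x_a - x_i)
    return np.linalg.solve(lhs, rhs)

def nonlinearity_ratio(forward, x_i, dx, K, F_x):
    """Relative mismatch between the actual and the linearized change in the spectrum."""
    actual = forward(x_i + dx) - F_x
    linear = K @ dx
    return np.linalg.norm(actual - linear) / max(np.linalg.norm(linear), 1e-30)

# Toy problem with a linear forward model F(x) = K x (hypothetical values).
rng = np.random.default_rng(0)
K = rng.normal(size=(8, 3))
x_true = np.array([1.0, -0.5, 2.0])
x_a = np.zeros(3)
forward = lambda x: K @ x
y = forward(x_true)                 # noise-free synthetic measurement
S_eps_inv = np.eye(8) / 0.01        # inverse measurement covariance
S_a_inv = np.eye(3) / 100.0         # weak a priori constraint

dx = lm_step(K, S_eps_inv, S_a_inv, y, forward(x_a), x_a, x_a, gamma=0.0)
ratio = nonlinearity_ratio(forward, x_a, dx, K, forward(x_a))
```

In a real iteration the step would be rejected and γ increased whenever `ratio` exceeds the empirically chosen threshold; for the linear toy model the ratio is zero to rounding error and a single step lands close to the true state.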

[15] After each successful iteration, we test for convergence. To facilitate that, we compute the error variance derivative

$$d\sigma_i^2 = dx_{i+1}^T\,\hat{S}^{-1}\,dx_{i+1} \tag{2}$$

where $\hat{S}$ denotes the covariance of the retrieved state, using the relation

$$\hat{S}^{-1}\,dx_{i+1} \cong K_i^T S_\varepsilon^{-1}\left[y - F(x_i)\right] + S_a^{-1}\left(x_a - x_i\right) \tag{3}$$

$d\sigma_i^2$ is effectively the square of the state vector update in units of the solution variance.

[16] If $d\sigma_i^2 < n$ (the number of state vector elements), convergence is reached. We then update the state vector a final time.
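A short sketch of the convergence test follows. It assumes that the error variance derivative is the quadratic form $dx^T \hat{S}^{-1} dx$ and that $\hat{S}^{-1} dx$ is approximated by $K^T S_\varepsilon^{-1}[y - F(x_i)] + S_a^{-1}(x_a - x_i)$, so that $\hat{S}$ never has to be formed during the iteration. The function names and all numerical values are hypothetical.

```python
import numpy as np

def error_variance_derivative(dx, K, S_eps_inv, S_a_inv, y, F_x, x_i, x_a):
    """d_sigma^2 = dx^T (S_hat^-1 dx), with S_hat^-1 dx taken from the gradient term."""
    s_hat_inv_dx = K.T @ S_eps_inv @ (y - F_x) + S_a_inv @ (x_a - x_i)
    return float(dx @ s_hat_inv_dx)

def has_converged(d_sigma2, n):
    """Converged when the squared, variance-scaled step is below the state vector size."""
    return d_sigma2 < n

# Toy linear case (hypothetical numbers) where the approximation is exact for gamma = 0.
rng = np.random.default_rng(1)
K = rng.normal(size=(6, 2))
S_eps_inv = np.eye(6) * 4.0
S_a_inv = np.eye(2) * 0.25
x_a = np.zeros(2)
x_i = np.array([0.1, -0.2])
y = K @ np.array([0.3, 0.1])
F_x = K @ x_i

S_hat_inv = S_a_inv + K.T @ S_eps_inv @ K
dx = np.linalg.solve(S_hat_inv, K.T @ S_eps_inv @ (y - F_x) + S_a_inv @ (x_a - x_i))
d_sigma2 = error_variance_derivative(dx, K, S_eps_inv, S_a_inv, y, F_x, x_i, x_a)
converged = has_converged(d_sigma2, n=x_i.size)
```

The right-hand side of the update equation is already available from the step calculation, so the test costs only one extra dot product per iteration.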

[17] Last, we compute the retrieval covariance matrix, $\hat{S}$, and the averaging kernel matrix A. $\hat{S}$ is given by

$$\hat{S} = \left(K^T S_\varepsilon^{-1} K + S_a^{-1}\right)^{-1} \tag{4}$$

The averaging kernel matrix A is given by

$$A = \frac{\partial \hat{x}}{\partial x} = \hat{S}\,K^T S_\varepsilon^{-1} K \tag{5}$$

Finally, the degrees of freedom for signal are given by the trace of the matrix A; the degrees of freedom for the CO2 profile are the trace of the CO2-only submatrix.
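As an illustration, the retrieval covariance, averaging kernel, and degrees of freedom can be computed as below. The sketch uses the standard Rodgers [2000] expressions $\hat{S} = (K^T S_\varepsilon^{-1} K + S_a^{-1})^{-1}$ and $A = \hat{S} K^T S_\varepsilon^{-1} K$; the toy dimensions, the random Jacobian, and the convention that the CO2 levels occupy the first q elements are assumptions of the example.

```python
import numpy as np

def retrieval_diagnostics(K, S_eps, S_a):
    """Return the retrieval covariance S_hat and the averaging kernel matrix A."""
    S_eps_inv = np.linalg.inv(S_eps)
    S_a_inv = np.linalg.inv(S_a)
    KtSeK = K.T @ S_eps_inv @ K
    S_hat = np.linalg.inv(KtSeK + S_a_inv)
    A = S_hat @ KtSeK
    return S_hat, A

# Toy sizes: m channels, n state elements, q CO2 levels stored first (hypothetical).
rng = np.random.default_rng(2)
m, n, q = 20, 5, 3
K = rng.normal(size=(m, n))
S_eps = 0.1 * np.eye(m)
S_a = np.eye(n)

S_hat, A = retrieval_diagnostics(K, S_eps, S_a)
dof_total = np.trace(A)           # degrees of freedom for signal, full state vector
dof_co2 = np.trace(A[:q, :q])     # degrees of freedom for the CO2 profile only
```

A useful consistency check is the identity $A = I - \hat{S} S_a^{-1}$, which shows directly how a tighter a priori constraint reduces the degrees of freedom.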

2.2. Structure of OCO Measurement and State Vectors

[18] As described in section 1, OCO will measure radiance with three spectrometers, with band pass chosen to include mostly CO2 in two of them, and O2 in the third. The spectrometer wave numbers are approximately 4800–4900 cm−1 (CO2), 6170–6270 cm−1 (CO2), and 12,950–13,190 cm−1 (O2), divided into 1024 channels each. The measurement vector consists of radiances of all three spectrometers, notionally 3072 channels.

[19] OCO will formally retrieve height-dependent quantities on a pressure grid nominally having 12 levels from the ground to near the stratopause. The exact contents of the operational state vector will be optimized on the basis of experience. In the current study, the full OCO state vector $\hat{x}$ includes the 61 elements shown in Table 1.

Table 1. OCO State Vector
Description          Number of Elements
CO2 vmr              12
H2O vmr              12
T profile            12
Surface pressure     1

[20] The surface albedo is modeled by two parameters in each band, the mean and a slope, and the instrumental dispersion in each band by additive and multiplicative values, or “shift and stretch.”

[21] The XCO2 value is obtained by averaging the retrieved CO2 profile, weighted by the pressure weighting function, h (defined in section 2.3), such that $X_{CO_2} = h^T \hat{x}$. The formal error variance in the retrieved XCO2 is therefore given by $\sigma^2_{X_{CO_2}} = h^T \hat{S} h$.

[22] We emphasize that the OCO retrieval problem as described is underdetermined. There are typically 1.0–1.5 degrees of freedom for the CO2 profile (compared to 12 profile levels) and ∼20 total degrees of freedom (compared to 61 elements). However, the use of an a priori constraint guarantees that the problem is well posed and well conditioned. Although there is little more than one degree of freedom for the CO2 profile, and thus little ability to discriminate between altitudes, the profile, rather than just the CO2 column, is formally retrieved. This is to allow for the large variations of CO2 in the boundary layer; such variations can produce bias in the results if a fixed profile shape is used in the retrieval.

[23] In future, other quantities may be included in the state vector for research purposes, or operationally as dictated by experience. For example, experiments retrieving more than one independent type of aerosol are ongoing.

2.3. Pressure Weighting Function

[24] We define the pressure weighting function h to relate the local CO2 mixing ratio values specified on the discrete pressure levels to the profile-weighted average such that $X_{CO_2} = h^T \hat{x}$. The vector h represents the pressure intervals assigned to the state vector levels, normalized by the surface pressure. The calculation of the pressure weighting function, h, is described in Appendix A.
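To make the role of h concrete, the sketch below builds a simplified pressure weighting function and applies it to a profile. It is not the Appendix A calculation: it assumes plain trapezoidal pressure intervals normalized to sum to one, and it ignores any humidity or gravity corrections. The 12-level grid, the profile values, and the covariance block are hypothetical.

```python
import numpy as np

def pressure_weights(p_levels):
    """Trapezoidal pressure-interval weights per level, normalized to sum to 1."""
    p = np.asarray(p_levels, dtype=float)
    dp = np.abs(np.diff(p))
    h = np.zeros_like(p)
    h[:-1] += 0.5 * dp    # each layer contributes half its thickness to the level above
    h[1:] += 0.5 * dp     # and half to the level below
    return h / h.sum()

p = np.linspace(1.0, 1000.0, 12)        # hPa, top of atmosphere to surface
h = pressure_weights(p)

co2 = np.full(12, 380.0)                # ppm, uniform background
co2[-1] = 390.0                         # enhanced value at the surface level
xco2 = h @ co2                          # profile-weighted mean mixing ratio

S_co2 = 4.0 * np.eye(12)                # hypothetical CO2 block of the retrieval covariance, ppm^2
sigma_xco2 = np.sqrt(h @ S_co2 @ h)     # formal 1-sigma error in XCO2
```

Because the surface level carries only half a layer of weight, a 10 ppm enhancement confined to that level changes XCO2 by well under 1 ppm in this toy grid.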

2.4. Operational Error Analysis

[25] A series of standard error analysis calculations are performed as part of each retrieval to characterize the retrieval results and quantify their uncertainties. The retrieval uncertainties and averaging kernels are calculated from the measurement Jacobian, K of equation (1), evaluated at the retrieved state, the measurement covariance matrix, and the a priori covariance matrix. From these we compute the column averaging kernel, aCO2, the uncertainty due to smoothing and interference, equation imagec, and the correlation to XCO2, equation image1. These last three quantities are defined and discussed below. A complete list of operational retrieval products is given in Table 2.

Table 2. Inverse Method Products Recorded With Each Sounding
χ²                     3        sum of squares of normalized residuals in each spectrometer
$\hat{x}$              n        retrieved state vector
$\hat{S}_{ii}$         n        diagonal elements of $\hat{S}$ (error covariance matrix)
$\hat{S}_{CO_2}$       q²       CO2-only submatrix of $\hat{S}$
equation image$_1$     n − q    correlation of XCO2 with non-CO2 elements of x
A_CO2                  q²       CO2-only submatrix of averaging kernel (A)
a_CO2                  n        column averaging kernel
equation image$_c$     n        error in XCO2 due to smoothing and interference
XCO2                   1        profile-weighted CO2 mole fraction
σ²_noise               1        variance of XCO2 due to measurement noise
σ²_smooth              1        variance of XCO2 due to smoothing
σ²_interfere           1        variance of XCO2 due to interference
σ²_total               1        total variance of XCO2 (σ²_noise + σ²_smooth + σ²_interfere)
df                     1        degrees of freedom (full state vector)
d_CO2                  1        degrees of freedom (CO2 profile only)

2.4.1. Column Averaging Kernel, aCO2

[26] Let $\hat{u}$ be the retrieved CO2 mixing ratio profile, and $\hat{e}$ be the retrieved vector of all non-CO2 quantities, i.e.,

$$\hat{x} = \begin{pmatrix} \hat{u} \\ \hat{e} \end{pmatrix} \tag{6}$$

and let

$$X_{CO_2} = h^T \hat{x} \tag{7}$$

so $\partial X_{CO_2}/\partial x_j = (h^T A)_j$.

[27] Now consider only the CO2 part of the state vector, the first q elements $u_i$, i = 1 to q, and define the column averaging kernel

$$(a_{CO_2})_j = \frac{\partial X_{CO_2}}{\partial u_j}\,\frac{1}{h_j} = (h^T A)_j\,\frac{1}{h_j} \tag{8}$$

The column averaging kernel aCO2 has the property that its elements all equal 1 in the “ideal” case, where the retrieved XCO2 responds to changes in u exactly as the true value of the profile-weighted mixing ratio. For a real retrieval, the elements of aCO2 may be more or less than 1, and will have values much less than 1 in regions where the a priori CO2 profile is important.
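The normalization by h can be illustrated with a small example. The sketch assumes the column averaging kernel is $(h^T A)_j / h_j$ over the q CO2 levels, with h equal to zero for the non-CO2 elements; the matrix A and the weights are made-up numbers chosen only to show the ideal and non-ideal cases.

```python
import numpy as np

def column_averaging_kernel(A, h, q):
    """(a_CO2)_j = (h^T A)_j / h_j for the q CO2 levels stored first in the state vector."""
    hTA = h @ A
    return hTA[:q] / h[:q]

q = 2
h = np.array([0.5, 0.5, 0.0])            # pressure weights; zero for the non-CO2 element
A = np.array([[0.6, 0.1, 0.02],          # hypothetical averaging kernel matrix
              [0.1, 0.5, 0.01],
              [0.0, 0.0, 0.90]])

a_real = column_averaging_kernel(A, h, q)            # below 1: a priori carries weight
a_ideal = column_averaging_kernel(np.eye(3), h, q)   # ideal retrieval: all ones
```

Here `a_real` is [0.7, 0.6], meaning that a change in the true mixing ratio at either level is only partly reflected in the retrieved XCO2.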

2.4.2. Smoothing and Interference Due to the State Vector, equation imagec

[28] The vector equation imagec captures the smoothing and interference (or “cross-talk”) errors in XCO2 due to each element of equation image. Thus equation imagec reveals the sensitivity of the retrieved XCO2 value to uncertainties in elements of the state vector. It may be derived from the full averaging kernel matrix as follows.

[29] The error in XCO2 is given by

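$$\hat{X}_{CO_2} - X_{CO_2} = h^T\left(A_{uu} - I\right)\left(u - u_a\right) + h^T A_{ue}\left(e - e_a\right) + \varepsilon_u \tag{9}$$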

Equation (9) is an adaptation of equation (7) of Rodgers and Connor [2003]. The first term in (9) represents smoothing error, the second interference error, and ɛu all other sources of error. Here Auu and Aue are submatrices of A, representing the CO2-only component and the cross-talk components (those which mix elements of the CO2 profile u and the non-CO2 elements e), respectively. I is the identity matrix.

[30] It follows that the error in XCO2 due to each state vector element is given by

equation image

where σj is the error in element j. Since hj ≡ 0 for j > q, the full matrix A may be used in place of the submatrices in equation (9).

[31] Alternatively, (10) may be written

equation image

The first q elements of equation imagec, corresponding to the CO2 profile, are components of the smoothing error. The remaining elements represent the interference, or cross-talk error.
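A minimal sketch of the per-element contributions is given below. It assumes that the contribution of element j is $[h^T (A - I)]_j\,\sigma_j$, which is one reading of the description above (the full matrix A is used, and h is zero for the non-CO2 elements); the function name and all numbers are hypothetical.

```python
import numpy as np

def xco2_error_contributions(A, h, sigma):
    """Per-element smoothing (j < q) and interference (j >= q) contributions to the XCO2 error."""
    n = A.shape[0]
    return (h @ (A - np.eye(n))) * sigma

h = np.array([0.5, 0.5, 0.0])             # two CO2 levels, one non-CO2 element
sigma = np.array([2.0, 2.0, 5.0])         # assumed 1-sigma uncertainty of each element
A = np.array([[0.6, 0.1, 0.02],
              [0.1, 0.5, 0.01],
              [0.0, 0.0, 0.90]])

c = xco2_error_contributions(A, h, sigma)
c_ideal = xco2_error_contributions(np.eye(3), h, sigma)
```

For the non-CO2 element the identity term vanishes because its weight in h is zero, so that entry reduces to $(h^T A)_j\,\sigma_j$, the pure cross-talk term.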

2.4.3. Correlation of XCO2 With Non-CO2 State Vector Elements, equation image1

[32] We define equation image1, the correlation of XCO2 with the non-CO2 state vector elements, to aid the diagnosis and understanding of cross-talk. As above, we assume the CO2 profile occupies the first q elements of the state vector, and define a matrix H with dimension n × (n − q + 1) such that

equation image


equation image

and the correlation matrix of HTequation image is given by HTequation imageH, where equation image is the correlation matrix corresponding to equation image.

[33] equation image1 is the first row of HTequation imageH, where equation image1 = (1 ρXCO2,j …) for j = q + 1, n. The second and subsequent elements of equation image1 are the correlation coefficients of XCO2 with each non-CO2 element of the state vector.
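The construction can be sketched numerically as follows. The layout of H (pressure weights in the first column, an identity block selecting the non-CO2 elements) is inferred from its stated dimension and purpose, and the sketch normalizes the transformed covariance $H^T \hat{S} H$ directly to obtain correlation coefficients; both choices are assumptions of the example and not necessarily the operational calculation.

```python
import numpy as np

def xco2_correlations(S_hat, h, q):
    """First row of the correlation matrix of (XCO2, non-CO2 elements)."""
    n = S_hat.shape[0]
    H = np.zeros((n, n - q + 1))
    H[:q, 0] = h[:q]                  # column 0 maps the CO2 profile to XCO2
    H[q:, 1:] = np.eye(n - q)         # remaining columns pass non-CO2 elements through
    cov = H.T @ S_hat @ H             # covariance of (XCO2, e)
    d = np.sqrt(np.diag(cov))
    return (cov / np.outer(d, d))[0]

# Hypothetical symmetric positive definite retrieval covariance.
rng = np.random.default_rng(3)
n, q = 5, 3
M = rng.normal(size=(n, n))
S_hat = M @ M.T + n * np.eye(n)
h = np.array([0.25, 0.5, 0.25, 0.0, 0.0])

rho_1 = xco2_correlations(S_hat, h, q)
```

The first element is 1 by construction, and the remaining n − q elements are the correlation coefficients of XCO2 with each non-CO2 element.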

3. Using the OCO Data Products With Models of Surface Flux

[34] The OCO retrieved profile-weighted mixing ratio and its variance, XCO2 and σ2total, are insufficient to infer accurate surface fluxes using inversion or data assimilation methods. It is also necessary to simulate what OCO would measure for a given CO2 profile, atmospheric state and surface.

[35] Ignoring error due to sources other than smoothing and interference, the retrieved state may be written

$$\hat{x} = x_a + A\left(x - x_a\right) \tag{13}$$

where x is the true CO2 profile.

[36] Thus XCO2 retrieved by OCO may be written

$$X_{CO_2} = X^a_{CO_2} + h^T A\left(x - x_a\right) \tag{14}$$

where $X^a_{CO_2}$ is the a priori value, $h^T x_a$.

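[37] The retrieved OCO value may be simulated from a model profile $x_m$ as

$$X^m_{CO_2} = X^a_{CO_2} + h^T A\left(x_m - x_a\right) \tag{15}$$

[Rodgers and Connor, 2003]. We refer to this as the “convolved model.” Now subtracting (15) from (14),

$$X_{CO_2} - X^m_{CO_2} = h^T A\left(x - x_m\right) \tag{16}$$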

So the difference between the data, XCO2, and convolved model, XCO2m, is independent of the a priori value, XCO2a. Further, the averaging kernel only enters to smooth the difference between the true and the model profile. Thus if the model profile equals the real atmospheric profile, smoothing error is nil, and the measured and modeled values of XCO2 would be equal in the absence of other error sources. The variance of equation image is given by equation image

[38] Thus comparing XCO2 retrieved from OCO to models requires both the OCO column averaging kernel aCO2 and the a priori profile, xa, in addition to the estimates of smoothing and total error, σ2smooth and σ2total. All of these are contained in the standard OCO data product (Table 2).
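For a modeler, the comparison amounts to a few lines of arithmetic. The sketch below assumes the CO2-only form of the convolved model, $X^a + \sum_j h_j (a_{CO_2})_j (x_m - x_a)_j$, and simulates a noise-free retrieval with the same expression; the profiles, weights, and kernel values are hypothetical.

```python
import numpy as np

def convolved_model_xco2(x_model, x_apriori, h, a_co2):
    """Model XCO2 as OCO would retrieve it, given the column averaging kernel and a priori."""
    return h @ x_apriori + np.sum(h * a_co2 * (x_model - x_apriori))

h = np.array([0.2, 0.3, 0.5])                  # pressure weights
a_co2 = np.array([0.6, 0.9, 1.1])              # column averaging kernel
x_true = np.array([382.0, 381.0, 385.0])       # ppm, "true" profile
x_model = np.array([381.0, 381.5, 383.0])      # ppm, transport model profile

diffs = []
for x_a in (np.full(3, 375.0), np.full(3, 390.0)):
    retrieved = convolved_model_xco2(x_true, x_a, h, a_co2)   # noise-free retrieval
    modeled = convolved_model_xco2(x_model, x_a, h, a_co2)    # convolved model
    diffs.append(retrieved - modeled)
```

Both entries of `diffs` are the same even though the two a priori profiles differ by 15 ppm, which illustrates why the data-minus-model difference is independent of the a priori once the model has been convolved.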

4. Error Analysis: Prospective Examples for Nadir Observations

4.1. Off-Line Analysis

[39] An off-line error analysis code has been developed which extends and complements the operational analysis described above. It is intended to provide a means of performing rapid, interactive, in-depth experiments to analyze retrieval behavior. The off-line code performs a linear analysis using Jacobians calculated by the operational OCO forward model. It allows rapid assessment of changes in the noise and a priori covariances used in the retrieval, quantification of the effect of best estimate errors in the instrument, forward model, and of atmospheric variability. It enables evaluation of the amount and type of information available, and the degree of cross-talk between state vector elements.

[40] The off-line calculations may also be more detailed and more realistic than those performed operationally. For example, if forward model errors are included in the Sɛ matrix used operationally in equation (1), the retrieved state may be systematically biased by the a priori state. Thus we evaluate the effect of forward model errors off-line. Further, evaluation of the smoothing error strictly requires the covariance of the ensemble of true states, Sc, which is not necessarily equal to the a priori covariance Sa [Rodgers and Connor, 2003]. Experience has shown that the a priori constraint, embodied in Sa, should be as uniform as practical over all soundings. However, the covariance of true states, Sc, varies with latitude, longitude, and season. Estimates of Sc are readily included in the off-line error estimates.

[41] Equations (17)–(21) follow the definitions of Rodgers [2000] and Rodgers and Connor [2003]. Given K, $S_\varepsilon$, and $S_a$, we first characterize the operational retrieval by calculating the averaging kernels (equation (5)), the gain function

$$G = \frac{\partial \hat{x}}{\partial y} = \left(K^T S_\varepsilon^{-1} K + S_a^{-1}\right)^{-1} K^T S_\varepsilon^{-1} \tag{17}$$

and the closely related quantities aCO2 and equation imagec (equations (8) and (10)).

[42] We then specify a list of estimated errors to include in the calculation. At the same time, we provide the ensemble covariance Sc, the forward model parameter covariances Sb and the measurement Jacobian with respect to the forward model parameters, Kb (there will typically be distinct Sb and Kb for each type of forward model error).

[43] For each error in the list, we calculate the resulting covariance of the retrieved state vector, as follows. For measurement error,

$$\hat{S}_m = G\,S_\varepsilon\,G^T \tag{18}$$

For forward model error,

$$\hat{S}_f = G\,K_b\,S_b\,K_b^T\,G^T \tag{19}$$

For smoothing error,

$$\hat{S}_s = \left(A_{uu} - I\right) S_c \left(A_{uu} - I\right)^T \tag{20}$$

For interference error, which refers to error in CO2 caused by non-CO2 components of the state vector,

$$\hat{S}_i = A_{ue}\,S_{ec}\,A_{ue}^T \tag{21}$$

where $S_{ec}$ is the ensemble covariance for the non-CO2 elements e. Finally, for each error source, we calculate the resulting variance of XCO2 via $\sigma^2_{X_{CO_2}} = h^T \hat{S} h$, and sum the various error components.
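The linear error budget can be sketched as below. The sketch assumes the standard Rodgers [2000] and Rodgers and Connor [2003] forms: gain $G = \hat{S} K^T S_\varepsilon^{-1}$, noise covariance $G S_\varepsilon G^T$, forward model covariance $G K_b S_b K_b^T G^T$, smoothing from the CO2 block of A, and interference from the cross-talk block. The function name, the block layout (CO2 levels first), and the toy inputs are assumptions of the example.

```python
import numpy as np

def error_budget(K, S_eps, S_a, S_c, h, q, K_b=None, S_b=None):
    """Variance of XCO2 from each error source, for a state vector with q CO2 levels first."""
    S_eps_inv = np.linalg.inv(S_eps)
    S_hat = np.linalg.inv(K.T @ S_eps_inv @ K + np.linalg.inv(S_a))
    G = S_hat @ K.T @ S_eps_inv              # gain matrix
    A = G @ K                                # averaging kernel matrix
    hu = h[:q]                               # pressure weights of the CO2 levels
    A_uu, A_ue = A[:q, :q], A[:q, q:]
    I = np.eye(q)
    var = {
        "noise": hu @ (G @ S_eps @ G.T)[:q, :q] @ hu,
        "smooth": hu @ (A_uu - I) @ S_c[:q, :q] @ (A_uu - I).T @ hu,
        "interfere": hu @ A_ue @ S_c[q:, q:] @ A_ue.T @ hu,
    }
    if K_b is not None and S_b is not None:
        var["forward_model"] = hu @ (G @ K_b @ S_b @ K_b.T @ G.T)[:q, :q] @ hu
    var["total"] = sum(var.values())
    return var

# Toy case: ensemble covariance set equal to the (diagonal) a priori covariance.
rng = np.random.default_rng(4)
m, n, q = 30, 5, 3
K = rng.normal(size=(m, n))
S_eps = 0.5 * np.eye(m)
S_a = np.diag([4.0, 4.0, 4.0, 1.0, 2.0])
h = np.array([0.25, 0.5, 0.25, 0.0, 0.0])

budget = error_budget(K, S_eps, S_a, S_c=S_a, h=h, q=q)
S_hat = np.linalg.inv(K.T @ np.linalg.inv(S_eps) @ K + np.linalg.inv(S_a))
```

When the ensemble covariance equals a block-diagonal a priori covariance and no forward model error is included, the three components sum to the formal variance $h^T \hat{S} h$; replacing `S_c` with a site-specific estimate is what separates the off-line budget from the operational one.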

4.2. Case Studies

[44] We present an analysis of nadir observations in six cases, encompassing summer and winter at three sites, namely Park Falls, Wisconsin, USA (46°N) (“PF”), Darwin, Northern Territory, Australia (12°S) (“DA”), and Lauder, New Zealand (45°S) (“LA”). These are not intended to be comprehensive of all relevant conditions, but to be examples of the range of conditions expected in observations over land at low and middle latitudes. They are also three of the sites where high-precision CO2 column measurements are made, which will be used for OCO validation. The goal of OCO is to produce reliable CO2 values in cloud-free footprints with aerosol optical depth τ < 0.3. The footprint of OCO was made unusually small (3 km²) to increase the number of soundings which are cloud-free, and a cloud screening algorithm is under development to identify them. Aerosol opacities >0.3 are not considered here.

[45] This analysis includes our best estimates of noise errors and geophysical variability, and a treatment of spectroscopic error, and is meant to be representative of our current understanding. A full error analysis, including instrument error, is beyond the scope of this study, and awaits completion and characterization of the OCO instrument hardware. Tests of the complete instrument will be conducted at the Jet Propulsion Laboratory during 2007–2008, followed by a complete prospective error analysis before launch.

4.2.1. Assumptions for Noise, Albedo, and Operational Constraints

[46] An example of the three spectra to be measured by OCO in each sounding is shown in Figure 1. Operationally, we expect to use a measurement noise covariance, Sɛ, which is diagonal, with values derived for each sounding by the operational calibration algorithm. These values consist of a constant component plus one varying as the square root of incident intensity. Thus noise will vary with scene brightness, so for present purposes we use the best available estimate of noise for each wave number and each scene. These vary significantly with surface type and spectral region, as given in Table 3.
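A diagonal noise covariance of this kind can be sketched as follows. The sketch takes one literal reading of the description, a per-channel noise standard deviation of the form $c_0 + c_1\sqrt{I}$; the coefficients `c0` and `c1`, the radiance values, and the function name are hypothetical and do not come from the OCO calibration algorithm.

```python
import numpy as np

def noise_covariance(radiance, c0, c1):
    """Diagonal S_eps from a constant term plus a term scaling as sqrt(intensity)."""
    sigma = c0 + c1 * np.sqrt(radiance)      # 1-sigma noise per channel
    return np.diag(sigma ** 2), sigma

radiance = np.array([100.0, 25.0, 4.0])      # bright continuum to deep absorption line
S_eps, sigma = noise_covariance(radiance, c0=0.1, c1=0.05)
snr = radiance / sigma                       # signal-to-noise ratio per channel
```

In this toy case the absolute noise grows with brightness while the signal-to-noise ratio still improves, which is the behavior that makes the covariance scene dependent.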

Figure 1. Three spectra observed by OCO on each sounding, calculated for Park Falls in July.

Table 3. Continuum Signal-to-Noise Ratio (SNR) for Aerosol Optical Depth = 0.1 for O2 A-Band, Weak CO2, and Strong CO2

[47] The albedo of each scene varies in a similarly complex manner. We have assumed the values in Table 4 which are taken from the ASTER Spectral Library (Vol. 1.2, available through http://asterweb.jpl.nasa.gov).

Table 4. Mean Surface Albedo for O2 A-Band, Weak CO2, and Strong CO2

[48] The spectra and the Jacobians have been simulated using H2O and temperature profiles from the ECMWF ERA-40 data set, and CO2 profiles and surface pressure from the MATCH/CASA model run [Olsen and Randerson, 2004], each interpolated in time and space. We have used an exponentially decreasing, tropospheric aerosol profile with a scale height of 2 km, which has been scaled to reproduce the different total aerosol optical depths. The aerosol optical properties are for a continental type.
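The scaling of the aerosol profile can be illustrated with a short sketch. It assumes an extinction profile proportional to $\exp(-z/H)$ with H = 2 km, integrated with a simple trapezoidal rule and rescaled so the column matches a chosen total optical depth; the altitude grid and the function name are hypothetical.

```python
import numpy as np

def aerosol_extinction_profile(z_km, total_aod, scale_height_km=2.0):
    """Exponential extinction profile (per km) scaled to a given total optical depth."""
    shape = np.exp(-z_km / scale_height_km)
    column = np.sum(0.5 * (shape[1:] + shape[:-1]) * np.diff(z_km))   # trapezoidal integral
    return shape * total_aod / column

z_km = np.linspace(0.0, 20.0, 201)           # 0.1 km spacing up to 20 km
ext = aerosol_extinction_profile(z_km, total_aod=0.1)
aod_check = np.sum(0.5 * (ext[1:] + ext[:-1]) * np.diff(z_km))
```

Changing `total_aod` rescales the whole profile without altering its shape, which is how a single profile can serve the range of optical depths considered.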

[49] Increasing the aerosol optical depth from 0.01 to 0.3 mostly increases the signal observed by the satellite instrument, depending on surface albedo and solar zenith angle. For very bright surfaces, aerosol extinction can also result in a decrease of the signal. For the O2 A-band region, we observe an increase of the signal of ∼1% for the continuum and by up to 10% for strongly absorbing regions. For the weak CO2 band, the intensity increases relatively smoothly throughout the band by several percent for bright surfaces and up to 50% for snow. The largest effect of aerosol is observed for the strong CO2 band region with an intensity increase for a vegetated surface of 10% for the continuum and up to 30% for strongly absorbing lines. For snow surfaces, the signal in the strong CO2 band region is dominated by contributions from aerosol scattering and we observe an increase between 130 and 170%.

[50] In addition to tropospheric aerosol, we have also included a fixed stratospheric aerosol profile, which is based on SAGE 2 measurements. The surface albedo is from the ASTER database as described above.

[51] The a priori covariance matrix, Sa, has been constructed out of several submatrices arranged along its diagonal, one for each of the physically distinct components of the state vector as given in section 2.2, i.e., CO2, H2O, etc. These components are assumed independent of each other, so all elements not within the submatrices are set to zero.
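Assembling such a matrix is straightforward; a minimal sketch follows. Only three blocks are shown, the CO2 and H2O blocks are filled with placeholder diagonal values, and the helper `block_diagonal` is written out by hand for the example. The 20 mbar surface pressure standard deviation is the value quoted in this section.

```python
import numpy as np

def block_diagonal(blocks):
    """Place square blocks along the diagonal; all cross-block elements stay zero."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

S_co2 = 9.0 * np.eye(12)                 # placeholder CO2 block, ppm^2
S_h2o = 0.04 * np.eye(12)                # placeholder H2O block
S_psurf = np.array([[20.0 ** 2]])        # surface pressure, (20 mbar)^2

S_a = block_diagonal([S_co2, S_h2o, S_psurf])
```

Keeping the blocks separate makes it easy to swap in a different constraint for one component without touching the others.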

[52] The CO2 submatrix naturally has the most impact on retrieval of XCO2. In this paper all results use a single CO2 covariance matrix. That covariance has been constructed by assuming a root-mean-square (rms) variability of XCO2 of 12 ppm, which is an estimate of global variability [Dufour and Breon, 2003]. Variability as a function of height is assumed to decrease rapidly, from ∼10% at the surface to ∼1% in the stratosphere. The covariance among altitudes in the troposphere is derived from aircraft observations at Carr, Colorado (P. Tans, private communication, 2003). The total variability embodied in this covariance is unrealistically large for most of the world (all relatively clean air sites). It is intended to be a minimal constraint on the retrieved XCO2; the use of a single covariance everywhere at all times eliminates the covariance matrix as a source for variation in retrieval characteristics.

[53] The a priori covariance for the other state vector components has been set as follows. In all cases, we have assumed variability as large as expected at any of the sites, to minimize dependence on the a priori state vector itself. For H2O, we have used the observed variability at Park Falls in July, 1.4 cm precipitable water. For temperature, we have similarly used the covariance calculated for Park Falls in July, but imposed variability increasing in the lower troposphere to 10 K at the surface. The surface pressure is assumed to have a standard deviation of 20 mbar. For aerosol, total optical depth varies by ±0.15; the variability decreases from 150% at the surface to 50% with altitude with a scale height of 2 km; it has a correlation length of 1 km. The uncertainty in mean albedo is formally set to the unphysical value of ±1 (which effectively nullifies any a priori influence), with a slope which implies a variation of ±0.5 at each end of the spectral range.

4.2.2. Ensemble Variability and Error Sources

[54] We have considered 3 classes of error source: random, systematically variable, and fixed. Random errors include noise and some portion of each type of geophysical error, for example surface pressure. The last two classes merit more extended description. Certain errors will vary systematically with time and place, producing a bias in comparative values of XCO2. For example, errors due to aerosol will vary systematically with optical depth and zenith angle. We assume that such errors will not be reduced by averaging, but will produce a consistent error in XCO2, and class them as “bias” errors. This is an inherently conservative assumption, likely to overestimate the effect of these errors.

[55] The class of fixed error sources consists of those inherently constant everywhere at all times. They are exemplified by error in spectroscopic line parameters, which will be discussed in the next section.

[56] The submatrices making up the ensemble covariance matrix, Sc, used for calculating smoothing and interference error, are in principle chosen to be the best estimate available of actual atmospheric variability at the time and site in question. For the CO2 submatrix, the a priori covariance has been scaled to correspond to XCO2 variability observed with solar viewing FTS instruments by the authors at the Park Falls and Lauder sites, namely 2.5 and 0.7 ppm, respectively. For Darwin, 0.7 ppm is used as a representative value for the Southern Hemisphere.

[57] It will be seen below that interference error, that is, between CO2 and other state vector components, is relatively small, and so does not depend critically on the actual ensemble variability. In the current work we have made the following assumptions.

[58] For H2O and for temperature we have calculated covariance matrices based on ECMWF H2O for each site. For surface pressure, a 5-mbar standard deviation is based on pressure measurements at Park Falls. The aerosol covariance is based on an ad hoc constraint using a Markov description with a scale length of 1 km. The aerosol covariance has been subsequently scaled to reproduce a standard deviation for the total aerosol optical depth of ±0.09 for the Northern Hemisphere, and values half that in the Southern Hemisphere, which is based loosely on MODIS data for nonpolluted conditions. For albedo and spectral dispersion we have assumed the ensemble variability equals the a priori variability.

4.2.3. Spectroscopic Errors

[59] Spectroscopic errors belong to the class of error sources which are truly fixed. Unfortunately, because of the varying amount of information in each measured spectrum relative to the a priori constraint, the resulting errors in retrieved XCO2 are not fixed. In fact, the errors vary systematically with the observing conditions, depending principally on ground albedo, aerosol optical depth, and, most importantly, solar zenith angle. Fixed errors in XCO2 would leave the gradients unchanged, and so not affect flux inversions. However, the implications of such systematic variations in the XCO2 errors are profound; uncorrected, they will introduce significant biases into the determination of relative changes in XCO2, and consequently into inference of surface flux. This malign influence would be somewhat ameliorated if the correlations of the systematic error among soundings were properly taken into account.

[60] We have adopted a two-pronged approach to deal with spectroscopic errors. First, we will minimize them by a combination of laboratory measurements and the calibration of spectral measurements from upward looking spectrometers by in situ aircraft measurements. Washenfelder et al. [2006] performed an extensive comparison of upward looking FTS data taken at Park Falls, Wisconsin, to simultaneous in situ aircraft measurements. The FTS data were analyzed in one of the CO2 spectral bands to be used by OCO (λ ∼ 1.61 μm) as well as an adjacent CO2 band, and the 1.27 μm band of O2. They concluded that, by appropriate adjustments to the band strength and air-broadened line width of these bands, they could achieve an absolute accuracy of 0.25% in the O2 column and 0.3% in the CO2 profile-weighted mean mixing ratio.

[61] The calibrated FTS column measurements are combined with laboratory measurements by the following procedure. It is observed that the retrieved column depends on assumed line width differently for weak and saturated lines, being proportional to width for saturated lines, and approximately proportional to the square root of width for weak lines. Thus we assume that the column measurements, C, are proportional to the product of the band intensity I and a single parameter characterizing line width, W:

$$C \propto I\,W^{\beta} \tag{22}$$

where we expect β = 1 for the strong O2 A-band, and β = 1/2 for the weaker CO2 bands.

[62] In effect, we then have two estimates of each parameter characterizing strength and width, one directly from laboratory measurement of that parameter, and the other by solving equation (22). These two estimates may be formally combined in a variance-weighted average, which statistically is the most likely value of the parameter, given the two independent estimates, and which itself has inverse variance given by the sum of the two inverse variances.

[63] In practice, we calculate the variance of this weighted average by performing a notional a posteriori retrieval of the intensity and width parameters, using the FTS column as the “measurement” and the laboratory values and uncertainties as “a priori information” on the intensity and width. For present purposes, it is a notional retrieval only because we are only interested in the net uncertainties in the strength and width parameters, not their values.
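The notional retrieval can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: it assumes the proportionality between C, I, and W is linearized in log space, so that fractional errors obey dC/C = dI/I + β dW/W, and that the laboratory errors in strength and width are independent. The function name `net_uncertainties` is hypothetical. The inputs are the CO2 values quoted in the text (0.3% column, 0.5% strength, 1.0% width) with β = 1/2 from the square-root dependence stated for weak lines; the outputs are not claimed to reproduce Table 5.

```python
import math

def net_uncertainties(sigma_col, sigma_strength, sigma_width, beta):
    """Posterior 1-sigma uncertainties (in %) of band strength and line width.

    Assumptions of this sketch: C = const * I * W**beta, linearized in log
    space, and independent laboratory errors (diagonal a priori covariance).
    """
    k = (1.0, beta)                                # d ln C / d (ln I, ln W)
    sa = (sigma_strength ** 2, sigma_width ** 2)   # a priori variances
    se = sigma_col ** 2                            # "measurement" variance
    # For a scalar measurement: S_hat = Sa - Sa k (k^T Sa k + Se)^-1 k^T Sa
    denom = sum(kj * kj * sj for kj, sj in zip(k, sa)) + se
    post_var = [sj - (sj * kj) ** 2 / denom for kj, sj in zip(k, sa)]
    return tuple(math.sqrt(v) for v in post_var)

# CO2 values quoted in the text: 0.3% column, 0.5% strength, 1.0% width
net_strength, net_width = net_uncertainties(0.3, 0.5, 1.0, beta=0.5)
print(f"net strength {net_strength:.2f}%, net width {net_width:.2f}%")
```

Because only the posterior covariance is evaluated, no actual column value enters the calculation, which is the sense in which the retrieval is notional.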

[64] There has been intensive laboratory work done recently on spectroscopy relevant to OCO. The near infrared bands of CO2 have been analyzed by Toth et al. [2006a, 2006b] and Devi et al. [2007]. On the basis of their work, we assume uncertainties in laboratory data of 0.5% for band strength and 1.0% for air-broadened line widths. The same investigators are now studying the O2 A-band. For present purposes, we will assume that uncertainties comparable to those for the CO2 bands will be achieved. We use the procedure described above with these values, and the FTS uncertainties given by Washenfelder et al. [2006] for the measured columns. The resulting estimates of net uncertainties are given in Table 5.

Table 5. Spectroscopic Uncertainties Used for CO2 and O2
Uncertainties, %
Column Measurement | Laboratory Strength | Laboratory Width | Net Strength | Net Width

[65] Second, we will assess the residual errors in XCO2 and their variability by the techniques used in this paper, combined with the net width and strength errors shown in Table 5. To that end, we have analyzed twelve cases covering the realistically useful range of solar zenith angle, albedo, and aerosol optical depth, specifically zenith angles of 10° and 70°, albedo of 0.05, 0.2, and 0.5, and aerosol optical depth of 0.01 and 0.3.

[66] The critical issue is not the magnitude of the errors themselves, but rather how they vary with observing conditions, and thus spatially and temporally. Therefore we take the RMS variability of the errors to be indicative of typical bias errors in XCO2 gradients, and the maximum difference in the errors as the upper limit, for typical observing conditions, of XCO2 gradient error due to spectroscopic parameters. These values are shown in Table 6. They show that spectroscopic error, particularly CO2 air-broadened line width error, is a potentially important source of variable bias, but one which is usually smaller than 1 ppm, and always within our 1–2 ppm target. Nevertheless these results emphasize the need for both accurate laboratory data and for precise FTS column measurements at sites representing a broad range of observing conditions, well calibrated by aircraft overflights wherever possible. The OCO validation plan incorporates all of these elements in an effort to reduce such bias errors.

Table 6. Difference in Bias Error in XCO2 Due to Spectroscopic Parameters^a

| Error Source | RMS Difference | Maximum Difference |
|---|---|---|
| CO2 band strength | 0.1 | 0.2 |
| CO2 line width | 0.4 | 1.3 |
| O2 band strength | 0.1 | 0.5 |
| O2 line width | 0.2 | 0.6 |
| H2O band strength | <0.1 | <0.1 |
| H2O line width | <0.1 | 0.2 |
| All spectroscopic | 0.4 | 1.1 |

^a Unit is ppm.

[67] It is also interesting to examine the averaging kernels for these cases, to understand how variations in observing conditions result in bias error. Figure 2 shows the column averaging kernels for 4 cases, as detailed in the Figure 2 caption. Case 1 corresponds to low aerosol values, high albedo (bright surface), and high sun, while case 4 has high aerosol for the same albedo and zenith angle. Similarly, case 2 has high aerosol, low albedo (dark surface), and low sun, while case 3 has the same except for high sun. The difference in XCO2 error from spectroscopic parameters is largest between cases 2 and 3, which have very different averaging kernels, owing to a large change in zenith angle in the presence of a dark surface and high aerosol scattering. Conversely, cases 1 and 4 show that changing aerosol makes little difference given a bright surface and high sun.

Figure 2.

Averaging kernels spanning a wide range of three key observing parameters, solar zenith angle θ, surface albedo α, and aerosol optical depth τ. For case 1, θ = 10°, α = 0.5, and τ = 0.01. For case 2, θ = 70°, α = 0.05, and τ = 0.3. For case 3, θ = 10°, α = 0.05, and τ = 0.3. For case 4, θ = 10°, α = 0.5, and τ = 0.3.

[68] The fact that fixed spectroscopic errors produce variable bias errors in XCO2 at all is worthy of comment. The fundamental reason for this behavior is the realistic assumption made in the a priori covariance input to the retrieval algorithm, that CO2 is more variable near the surface than at higher altitudes. This assumption has greater or lesser effect on the result, depending on the signal-to-noise ratio of the measured spectrum. The alternative, to assume an uncertainty uniform with altitude, would inhibit the algorithm from responding to real CO2 changes in the lower troposphere, and produce a bias dependent on the degree of boundary layer enhancement. Clearly, such enhancement is less well known than aerosol optical depth or albedo, which are retrieved from the data, or solar zenith angle.

4.2.4. Case Study Results

[69] The essential results for the six benchmark cases are shown in Figures 3 and 4 and in Table 7. The six cases fall naturally into two groups, corresponding to high and low solar elevation. The high sun group includes Park Falls and Lauder in summer (July and January, respectively) and Darwin in both seasons, while winter at Park Falls (January) and Lauder (July) are low sun. We have performed error analyses for three values of aerosol optical depth at 0.76 microns, namely τ = 0.01, 0.1, and 0.3. These values are expected to span the useful, cloud-free observing range.

Figure 3.

(a) Column averaging kernels for aerosol optical depth τ = 0.01. The four high sun cases are Lauder in January, Park Falls in July, and both months at Darwin. (b) Same as Figure 3a but for τ = 0.3.

Figure 4.

Degrees of freedom for the CO2 profile for the six standard cases (Park Falls, Darwin, and Lauder in January and July), for τ = 0.1.

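Table 7. Error Sources and Magnitudes for the Benchmark Cases^a

XCO2 errors in ppm.

| Error Source | Single Sounding, Random | Single Sounding, Random + Bias | Regional Average, Random | Regional Average, Random + Bias |
|---|---|---|---|---|
| Park Falls, Jan | | | | |
| Measurement noise | 2.38 | 2.38 | 0.17 | 0.17 |
| Water vapor | 0. | | | |
| Surface pressure | 0. | | | |
| Park Falls, Jul | | | | |
| Measurement noise | 0.66 | 0.66 | 0.05 | 0.05 |
| Water vapor | 0. | | | |
| Surface pressure | 0. | | | |
| Darwin, Jan | | | | |
| Measurement noise | 0.54 | 0.54 | 0.04 | 0.04 |
| Water vapor | 0. | | | |
| Surface pressure | 0. | | | |
| Darwin, Jul | | | | |
| Measurement noise | 0.58 | 0.58 | 0.04 | 0.04 |
| Water vapor | 0. | | | |
| Surface pressure | 0. | | | |
| Lauder, Jan | | | | |
| Measurement noise | 0.57 | 0.57 | 0.04 | 0.04 |
| Water vapor | 0. | | | |
| Surface pressure | 0. | | | |
| Lauder, Jul | | | | |
| Measurement noise | 1.42 | 1.42 | 0.10 | 0.10 |
| Water vapor | 0. | | | |
| Surface pressure | 0. | | | |

^a RSS is the square root of the sum of the squares of the listed error sources.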

[70] Figure 3 shows the column averaging kernels for the six cases. In Figure 3a, τ = 0.01, while in Figure 3b, τ = 0.3. In the lower optical depth case, all six cases are very similar, with values near 1 at the surface, decreasing to ∼0.7 at the tropopause.

[71] With increased aerosol optical depth (Figure 3b), winter at Park Falls and Lauder begin to stand out from the group, showing sensitivity slightly decreased at the ground but increased in the middle or upper troposphere. Both the winter results are driven by the same two factors: low sun and perturbed surface albedo, relative to the other cases. These factors act primarily to reduce the spectral signal-to-noise. In January at Park Falls, a snow covered surface is assumed, while at Lauder in July, “frost” is assumed. The snow surface is highly reflective in the O2 A-band, but very dark in the CO2 bands. The frost surface is intermediate, being similar to snow in the A-band, and similar to vegetation in the CO2 bands. The effect of low sun is for the contribution of light reflected from aerosol in the upper troposphere to grow relative to light reflected from the ground. This is exacerbated by lower albedo in the CO2 bands for snow covered surfaces.

[72] As illustrated in Figure 3, the effect of varying optical depth is modest for the conditions considered. For that reason, we have chosen to present all subsequent results for the intermediate case of τ = 0.1. The degrees of freedom for the CO2 profile are shown in Figure 4. There are approximately 1.4 to 1.5 degrees of freedom for the high sun cases, 1.2 for winter at Lauder, and 1.0 for winter at Park Falls.

[73] Table 7 shows detailed estimates of the contributions to XCO2 error for the six cases, both for a single sounding and for regional, semimonthly averages. The regional averages are taken to apply to an area 1000 × 1000 km. Because of the intensive computational demands of the OCO data, it is planned to process only 2% of the spectra with the algorithm described here, at least initially. Thus the regional averages are assumed to include typically 200 soundings. It is shown in Table 7 that the dominant random error, as expected, is noise. Also included in Table 7 are typical bias errors due to spectroscopic parameters, from Table 6. These are the dominant source of error in the semimonthly, regional averages.
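The relation between the single-sounding and regional-average noise entries can be checked with a short sketch. It assumes that random errors in the roughly 200 soundings are independent, so they average down as 1/√N, while bias errors are not reduced by averaging. Combining the two in quadrature is an assumption of this sketch, and the function name `regional_average_error` is hypothetical. The noise values are the "Measurement noise" entries of Table 7, and the 0.4 ppm bias is the "All spectroscopic" RMS value of Table 6.

```python
import math

def regional_average_error(single_random, bias=0.0, n_soundings=200):
    """Return (averaged random error, random + bias error) in ppm.

    Random error is reduced by sqrt(n_soundings); bias is left unchanged.
    The quadrature sum of the two is an assumption of this sketch.
    """
    random_avg = single_random / math.sqrt(n_soundings)
    total = math.sqrt(random_avg ** 2 + bias ** 2)
    return random_avg, total

# Single-sounding measurement noise from Table 7, in ppm
noise = {
    "Park Falls, Jan": 2.38, "Park Falls, Jul": 0.66,
    "Darwin, Jan": 0.54, "Darwin, Jul": 0.58,
    "Lauder, Jan": 0.57, "Lauder, Jul": 1.42,
}
for site, sigma in noise.items():
    avg, with_bias = regional_average_error(sigma, bias=0.4)
    print(f"{site}: noise {avg:.2f} ppm, with 0.4 ppm bias {with_bias:.2f} ppm")
```

Dividing by √200 ≈ 14.1 reproduces the regional-average noise column of Table 7 to the two decimal places shown there, and even in the noisiest case the averaged noise is well below the assumed spectroscopic bias.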

[74] It is worth noting that, as a group, interference errors (such as from the temperature profile) become both relatively and absolutely more important as sources of bias in the low signal-to-noise cases (Park Falls in January and Lauder in July). Even in those cases, noise is an insignificant contributor to regional average error. Thus uncertainties in the SNR values of Table 3 are unlikely to be important for regional averages.

[75] Concern is often expressed over the extent to which the a priori influences the retrieved result. Figure 5 shows the ratio of the retrieved to the a priori uncertainty for each state vector element, for the case of July at Park Falls. This ratio is a measure of the extent to which the measurement and retrieval process has reduced uncertainty. Values less than ∼0.5 indicate that the information in the measurement dominates the a priori information. Values near 1 show the a priori is dominant. It may be seen from Figure 5 that the measurement is overwhelmingly dominant for the albedo, spectral dispersion, and surface pressure. For CO2, the a priori plays a small role at each altitude, but for the column integral XCO2, the measurement dominates the a priori strongly. The temperature, water vapor, and aerosol profiles are fairly well determined by the measurement at lower altitudes, but strongly constrained by the a priori at higher altitudes.

Figure 5.

Ratio of retrieved uncertainty to a priori uncertainty (“error reduction”) for Park Falls in July, τ = 0.1. CO2, H2O, temperature, and aerosol are each represented as profiles at 12 pressure levels from the ground to the stratopause. Within each profile, pressure decreases with increasing state vector index. Albedo and dispersion are described by six parameters each (see section 2.2).

[76] Error in the retrieved values of XCO2 due to smoothing (CO2 profile variability) and interference is also a cause for concern. Figure 6 shows the quantity defined in equation (16), a direct estimate of this error due to each element in the state vector, again for the Park Falls July case. Note first that the errors are small for all state vector elements. The largest errors, 0.1–0.2 ppm, are due to smoothing by high-altitude CO2, and interference from tropospheric aerosol.

Figure 6.

Smoothing and interference error in XCO2 due to each element of the state vector, for Park Falls, July, τ = 0.1. State vector index (y axis) as described for Figure 5.

[77] Figure 7 summarizes the single sounding results from Table 7, separating error sources into noise, smoothing, interference, and forward model (the only forward model error considered here is spectroscopy). Noise is the largest error in all cases. Spectroscopic error is usually next, though interference error becomes larger in Park Falls winter. Note that smoothing is essentially negligible in all cases. Further note that interference error is near-negligible for the high sun cases. For the winter scenes with low sun, and snow covered scenes with low albedo in the CO2 bands, interference is significant, but noise becomes the dominant error.

Figure 7.

Summary of single sounding errors in XCO2 for the six standard cases, aerosol optical depth τ = 0.1.

5. Discussion and Conclusions

[78] The OCO inverse method has been developed with the needs of the atmospheric modeling community in mind. Its routine products will include all quantities needed to understand the information content of the measurement, its uncertainty, and its dependence on interfering atmospheric properties.

[79] An off-line procedure will be used for complete analysis of errors in the OCO results. We illustrate the procedure here, and produce error estimates, including noise, geophysical, and spectroscopic errors for nadir observations over land at low and midlatitudes. The single-sounding errors due to these sources are expected to be ∼0.7–0.8 ppm for high sun conditions, and ∼1.5–2.5 ppm for low sun (winter). The estimates of single sounding error are dominated by noise, and, in the high sun cases, will roughly scale with the actual noise value. However, noise is expected to be unimportant for semimonthly regional averages, where errors from these sources are predicted to be <1 ppm for all conditions.

[80] Instrument errors are not considered in detail here. Nevertheless, calculations have been performed for indicative instrument errors, which show that they are likely to produce retrieved XCO2 errors comparable to those from the sources considered in this paper. The most important instrument error source is likely to be uncertainty in the instrument line shape. After calibration and characterization of the flight instrument is complete, the instrument error will be fully evaluated.

[81] Smoothing error is very small, largely because the column density of CO2 is well determined by the spectra, but partly by design; we have chosen the retrieval parameters to minimize dependence of retrieved XCO2 on the a priori state vector, while using it to maintain numerical stability. Use of the column averaging kernel and a priori state will remove residual dependence on a priori and allow unambiguous comparison to model calculations.

[82] Systematic errors due to spectroscopy are the major source of bias, of the error sources considered, in determining gradients in XCO2. The OCO validation program has been designed to detect and correct such bias. We hope in future to develop a characterization of any residual bias in terms of a global covariance matrix, relating the error in a given semimonthly, regional average to that in every other semimonthly, regional average. These errors will be nearly fully correlated among such averages, reducing their impact on flux inversions.

Appendix A: Calculation of the Pressure Weighting Function

[83] The pressure weighting function, h, is defined so that XCO2 = $\mathbf{h}^{T}\hat{\mathbf{x}}$. We describe the calculation of the elements of h, hi, for i = 1, …, q, where q is the number of levels. Note that hi = 0 for i = q + 1, …, n, where n is the number of elements in the state vector.

[84] In order to calculate h, the pressure interval in each layer must be conceptually divided by assigning fractions of it to the two adjacent levels, in such a way that integrating over all levels conserves both total pressure and CO2 column. Let u(p) be the CO2 mixing ratio as a function of pressure, and let an infinitesimal pressure interval dp, between levels i and i + 1, be divided between the adjacent levels in proportions given by the following:

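$$dg_i = \frac{u_{i+1} - u(p)}{u_{i+1} - u_i}\,dp, \qquad dg_{i+1} = \frac{u(p) - u_i}{u_{i+1} - u_i}\,dp \tag{A1}$$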

These definitions conserve both total pressure and CO2 column, as can be checked trivially. Then the pressure interval assigned to level i is the integral of dgi from level i − 1 to level i + 1.
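The check can be written out explicitly. As a sketch, take the two proportions to be $(u_{i+1} - u)/(u_{i+1} - u_i)$ for level i and $(u - u_i)/(u_{i+1} - u_i)$ for level i + 1, where $u_i$ and $u_{i+1}$ denote the mixing ratios at the two levels. This form is an assumption, although it is the only one that satisfies both conservation conditions when $u_i \neq u_{i+1}$. Then:

$$dg_i + dg_{i+1} = \frac{(u_{i+1} - u) + (u - u_i)}{u_{i+1} - u_i}\,dp = dp$$

$$u_i\,dg_i + u_{i+1}\,dg_{i+1} = \frac{u_i u_{i+1} - u_i u + u_{i+1} u - u_{i+1} u_i}{u_{i+1} - u_i}\,dp = u(p)\,dp$$

The first line is conservation of total pressure. The second is conservation of the CO2 column, since the column contribution of the interval dp is proportional to u(p) dp.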

[85] Now if u varies linearly with a function F(p), then

$$u(p) = u_i + \frac{F(p) - F(p_i)}{F(p_{i+1}) - F(p_i)}\,(u_{i+1} - u_i) \tag{A2}$$

If we adopt the interpolation rule that u(p) varies linearly in ln p, this implies

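$$u(p) = u_i + \frac{\ln(p/p_i)}{\ln(p_{i+1}/p_i)}\,(u_{i+1} - u_i) \tag{A3}$$

$$dg_i = \frac{\ln(p_{i+1}/p)}{\ln(p_{i+1}/p_i)}\,dp \tag{A4}$$

$$dg_{i+1} = \frac{\ln(p/p_i)}{\ln(p_{i+1}/p_i)}\,dp \tag{A5}$$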

Note that the definitions (A1) of dgi and dgi+1 are indeterminate if u is constant, but u drops out of the final expressions, (A4) and (A5), for them.

[86] Now hi equals the integral of dgi over the two layers adjacent to level i, divided by the surface pressure:

$$h_i = \frac{1}{p_{\mathrm{surf}}}\left[\int_{p_i}^{p_{i+1}} dg_i + \int_{p_{i-1}}^{p_i} dg_i\right] \tag{A6}$$

For the edge layers, if i = 1, only the first term applies, while if i = q, only the second term applies.

[87] After some algebra,

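$$h_i = \left|\left(-p_i + \frac{p_{i+1} - p_i}{\ln(p_{i+1}/p_i)}\right) + \left(p_i - \frac{p_i - p_{i-1}}{\ln(p_i/p_{i-1})}\right)\right| \frac{1}{p_{\mathrm{surf}}} \tag{A7}$$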

where we have kept the two terms separated for ease of calculating the edge layers. Note that if p decreases with increasing i, hi is formally negative, thus we have taken the absolute value.
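A minimal sketch of this calculation follows, assuming the log-pressure interpolation rule above and evaluating the two layer integrals in closed form. The function name `pressure_weighting` and the example pressure grid are hypothetical, and the surface pressure is taken to be the largest level pressure. For the layer between levels i and i + 1, the share assigned to level i is taken as −p_i + (p_{i+1} − p_i)/ln(p_{i+1}/p_i), and the share assigned to level i + 1 is the remainder of the layer thickness.

```python
import math

def pressure_weighting(p):
    """Pressure weighting function h for CO2 levels with pressures p.

    Assumes mixing ratio varies linearly in ln(p) between levels and that
    the pressures are positive and strictly monotonic. The largest pressure
    is taken as the surface pressure (an assumption of this sketch).
    """
    q = len(p)
    p_surf = max(p[0], p[-1])
    h = []
    for i in range(q):
        total = 0.0
        if i < q - 1:  # first term: layer between levels i and i+1
            total += -p[i] + (p[i + 1] - p[i]) / math.log(p[i + 1] / p[i])
        if i > 0:      # second term: layer between levels i-1 and i
            total += p[i] - (p[i] - p[i - 1]) / math.log(p[i] / p[i - 1])
        h.append(abs(total) / p_surf)  # absolute value: p may decrease with i
    return h

# Hypothetical 5-level grid in mbar, surface first
levels = [1000.0, 700.0, 400.0, 100.0, 1.0]
h = pressure_weighting(levels)
co2 = [380.0] * len(levels)            # uniform profile, ppm
xco2 = sum(hi * ui for hi, ui in zip(h, co2))
print(h, xco2)
```

Because each layer's pressure interval is split entirely between its two bounding levels, the elements of h sum to the fraction of the surface pressure spanned by the grid, which approaches 1 as the top level pressure becomes negligible.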


[88] Part of the research described in this paper was performed at the Jet Propulsion Laboratory, California Institute of Technology, under contract with NASA. Additional support was provided by the Orbiting Carbon Observatory (OCO) project, a NASA Earth System Science Pathfinder (ESSP) mission. This work received additional funding from the New Zealand Foundation for Research, Science, and Technology, contract C01X0204.