The analysis of radiance measurements from the Atmospheric Infra-Red Sounder (AIRS) has been providing the first global maps of CO2 concentrations in the cloud-free upper troposphere. This paper explores the usefulness of these data for the estimation of CO2 surface fluxes. It appears that atmospheric mixing makes the upper tropospheric CO2 concentrations rather zonal, which indicates that AIRS data inform about only very broad features of the surface fluxes. Further, such a small variability imposes a stringent constraint on the size of retrieval biases and of transport model biases for the estimation of CO2 surface fluxes. We show that latitude-dependent biases larger than a few tenths of a part per million (ppm), at least south of 25°N, would harm the inversions. Significant improvements to the concentration retrieval algorithms and to the transport models are a prerequisite for the inversion of surface fluxes from AIRS.
Satellite data play an outstanding role in the monitoring of the Earth's atmosphere, not only for numerical weather prediction, but also for the study of chemical compounds, like ozone or carbon monoxide. However, no satellite instrument designed for the observation of CO2 is yet operational. The CO2-dedicated Orbiting Carbon Observatory (OCO) and the Greenhouse gases Observing Satellite (GOSAT) will not be launched until 2008. In the meantime, expectations have risen based on theoretical studies [Rayner and O'Brien, 2001; Houweling et al., 2004] and on the retrieval of CO2 concentrations from existing instruments built for purposes other than CO2 mapping [e.g., Chédin et al., 2003], despite the entanglement of many signals in the satellite radiances [e.g., Houweling et al., 2005]. In particular, the high spectral resolution of the Atmospheric Infra-Red Sounder (AIRS) flown on board the National Aeronautics and Space Administration's (NASA) Aqua platform provides significant information about the CO2 concentration in the upper troposphere [Engelen and Stephens, 2004]. This instrument has been operated since 2002 and CO2 retrieval algorithms have been developed at several institutes [Crevoisier et al., 2004; Engelen et al., 2004; Chahine et al., 2005]. This study is the first to use real data to consider the potential utility of such retrievals for inferring surface fluxes. Our analysis is based on the comparison between the AIRS retrievals produced at the European Centre for Medium-Range Weather Forecasts (ECMWF) and the CO2 concentrations simulated by the global climate model of the Laboratoire de Météorologie Dynamique (LMDZ) [Sadourny and Laval, 1984; Hourdin and Armengaud, 1999] using surface fluxes from a climatology.
2. Inferring CO2 Surface Fluxes
Up to now, Bayesian inference has guided work on the estimation of CO2 surface fluxes at the global scale [e.g., Gurney et al., 2002]. Let x be a vector of discretized CO2 surface fluxes. Under the assumption of unbiased Gaussian error statistics, theory indicates that the most probable values of x's components (the analysis xa), given some prior fluxes (or background) xb and some measurements of CO2 concentrations y, can be expressed as [e.g., Rodgers, 2000]:
$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{B}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{T} + \mathbf{R}\right)^{-1}\left(\mathbf{y} - \mathbf{H}\mathbf{x}_b\right) \tag{1}$$
where R and B are the error covariance matrices of the observations and of the background respectively, and H is the transport model that simulates the observations from the surface fluxes x. Expressions equivalent to equation (1) also exist that provide alternative numerical approaches to compute the same optimal solution. In this study, the LMDZ model, guided by ECMWF meteorological analyses, is the H operator. A detailed evaluation of the realism of the LMDZ model for the simulation of atmospheric chemistry is given by Hauglustaine et al. For the present application of LMDZ, tracer large-scale advection and subgrid-scale transport are solved on a regular 3.75° × 2.5° (longitude-latitude) grid with 19 sigma-pressure layers in the vertical.
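As an illustration of equation (1), the following minimal Python sketch computes the analysis in the gain-matrix form (the increment equals the gain applied to the observation-minus-background departures) and compares it with one algebraically equivalent expression. Everything numerical here is an assumption made for illustration: the tiny dense matrices, the two-region flux vector and the names `bayesian_analysis` and `x_true` are hypothetical, whereas in the study H is the LMDZ transport model rather than a small matrix.

```python
import numpy as np

def bayesian_analysis(x_b, y, H, B, R):
    """Linear Gaussian analysis: x_a = x_b + K (y - H x_b)."""
    S = H @ B @ H.T + R               # covariance of the departures
    K = B @ H.T @ np.linalg.inv(S)    # gain matrix
    d = y - H @ x_b                   # observation-minus-background departures
    return x_b + K @ d, K

# Hypothetical toy problem: 2 flux regions seen by 3 observations.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])
B = np.diag([1.0, 1.0])               # background error covariance (toy)
R = np.diag([0.25, 0.25, 0.25])       # observation error covariance (toy)
x_b = np.array([0.0, 0.0])            # prior fluxes
x_true = np.array([1.0, -0.5])        # "truth" used only to build synthetic y
y = H @ x_true                        # noise-free synthetic observations

x_a, K = bayesian_analysis(x_b, y, H, B, R)

# Equivalent form: x_a = x_b + (H^T R^-1 H + B^-1)^-1 H^T R^-1 (y - H x_b)
R_inv = np.linalg.inv(R)
A = np.linalg.inv(H.T @ R_inv @ H + np.linalg.inv(B))
x_a_alt = x_b + A @ H.T @ R_inv @ (y - H @ x_b)

print(x_a, x_a_alt)
```

The two forms give the same solution; which one is cheaper depends on whether the observation space or the flux space is smaller.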
3. Data Sources
The data assimilation system at ECMWF has been primarily designed to analyze atmospheric variables of direct meteorological significance, like temperature and winds. CO2 has recently been added to the list of analysis variables, as described by Engelen et al. and Engelen and McNally. The CO2 analysis is directly controlled by the radiances from the AIRS instrument and benefits from the quality of the other analyzed variables, like temperature, ozone and humidity, that also affect the AIRS measurements. The CO2 analysis is currently restricted to the ECMWF model grid points collocated with the AIRS observations during the assimilation window. For each of these grid points, a limited set of 18 channels is used to estimate one column value for the upper troposphere above about 600 hPa. In the presence of a cloud in the upper or middle troposphere, fewer channels are used and the retrieval only covers the tropospheric column above the cloud top. The horizontal resolution of the CO2 retrievals is about that of the instrument: 13.5 km at nadir and 41 × 23 km² at the end of the scans. The data processed here include the 16 million retrievals from year 2003 for which all 18 channels were available (i.e. no cloud above the boundary layer). Note that ECMWF receives only one AIRS spot in nine, so in principle more retrievals could be obtained.
For background information, we use a climatology of carbon fluxes that includes anthropogenic and natural components. Fossil fuel CO2 emissions are from the EDGAR3.0 emission database [Olivier et al., 1996]. Air-sea CO2 exchange is prescribed from the climatology by Takahashi et al. with a sink of 1.8 Gt C per year. The biosphere-atmosphere exchange of CO2 is estimated by the Terrestrial Uptake and Release of Carbon (TURC) model [Lafont et al., 2002], which is annually balanced. The daily fluxes calculated by TURC have been redistributed throughout the day to account for the diurnal cycle of the fluxes. The CO2 concentrations at the initial time step of the time window are defined from a simulation using optimized fluxes [Bousquet et al., 2000].
 In order to remove the global bias from the inference system (that may come from the observations or from the background), we calculate an offset of the atmospheric CO2 concentrations by subtracting the mean of the departure statistics y − Hxb from the prior concentrations at the initial step of the time window.
 To compare the LMDZ simulations with the individual AIRS observations, the modelled concentration profiles are first extracted at the same date, time and location as the retrievals. LMDZ upper tropospheric CO2 columns are then defined by convolving the profiles using a typical AIRS weighting function. It peaks at about 200 hPa, with a negligible contribution from the atmosphere below 500 hPa. For consistency with the definition of the ECMWF retrieval, which excludes the stratosphere, the stratospheric part of the weighting function is set to zero individually for each situation. Last, in order to take the influence of the prior information on the retrievals into account, the model values and the prior value are combined using the averaging kernel of each retrieval [e.g., Rodgers and Connor, 2003].
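The two steps described above (weighting the model profile, then blending with the retrieval prior) can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing used in the study: the pressure levels, weights, profile and prior value are invented; renormalising the weights after zeroing the stratosphere is an assumption; and the averaging kernel is reduced to a single scalar `a` for a one-value column retrieval.

```python
import numpy as np

def upper_tropospheric_column(profile, weights, pressure, p_tropopause):
    """Weighted mean of a CO2 profile (ppm) with stratospheric weights set to zero.

    Levels with pressure lower than p_tropopause are treated as stratospheric.
    The remaining weights are renormalised to sum to one (an assumption).
    """
    w = np.where(pressure < p_tropopause, 0.0, weights)
    return np.sum(w * profile) / np.sum(w)

def apply_averaging_kernel(model_column, prior_column, a):
    """Blend model and prior values: prior + a * (model - prior)."""
    return prior_column + a * (model_column - prior_column)

# Hypothetical 5-level example; the weights peak near 200 hPa and vanish below 500 hPa.
pressure = np.array([100.0, 200.0, 300.0, 500.0, 700.0])   # hPa
weights = np.array([0.2, 0.4, 0.3, 0.1, 0.0])
profile = np.array([372.0, 374.0, 375.0, 376.0, 378.0])    # ppm

column = upper_tropospheric_column(profile, weights, pressure, p_tropopause=150.0)
smoothed = apply_averaging_kernel(column, prior_column=376.0, a=0.5)
print(column, smoothed)
```

With an averaging kernel close to one the smoothed value follows the model; with a kernel close to zero it stays at the prior, which is how the influence of the prior on each retrieval is carried into the comparison.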
Figures 1 and 2 illustrate the comparison between the simulation using the prior fluxes and the AIRS observations. They condense the information in terms of seasonal cycles (Figure 1) and of zonal means (Figure 2). They show that the datasets share some common features in the tropics, where they both describe a similar seasonal cycle. Large systematic differences (i.e. up to a few ppm) appear at higher latitudes, with a significant land vs. sea contrast in the observations that is absent from the model. The differences outside the tropics are not surprising, since lower tropopause heights and smaller temperature lapse rates make the retrievals less reliable there [Engelen et al., 2004]. The simplicity of the weighting function with which the model is convolved may also contribute to the biases. This aspect is currently under investigation.
The analysis of the variability of the individual estimates from the two datasets is particularly informative for flux inversions. This study focuses on the data variability as a function of latitude. Regional variability could be discussed similarly. As shown in Figure 3, the model displays a north-south gradient of variability with a standard deviation of less than 3 ppm. The AIRS estimates behave differently. Their standard deviation is about 3.7 ppm within 25° of the equator and rises above 6 ppm at high latitudes, similarly in both hemispheres. This standard deviation actually results from the retrieval error superposed upon the natural variations of CO2. As for the model, its relatively coarse resolution is expected to smooth the simulated variability, but there are good reasons to trust its latitudinal variations, which are based on known differences in vegetation, fossil fuel emission and transport. For the observations, the retrieval accuracy diminishes from the equator to the poles [Engelen et al., 2004]. The bowl-shaped curve in Figure 3 seems to indicate that the retrieval variability is dominated by the retrieval error and that the natural variability of CO2 does not exceed a couple of ppm, at least south of 25°N. For comparison, the model variability at the surface in the tropics is five times larger than in the upper troposphere. The reduced upper tropospheric variability improves the representativeness of measurements there. On the other hand, such measurements can only constrain the broad scales of the surface fluxes.
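The latitude-dependent variability discussed above amounts to a standard deviation computed per latitude band. A minimal sketch of such a diagnostic is given below; the 10° band width, the function name `std_by_latitude` and the six sample values are assumptions made for illustration and are not taken from the AIRS or LMDZ datasets.

```python
import numpy as np

def std_by_latitude(lat, co2, band_width=10.0):
    """Sample standard deviation of co2 (ppm) within latitude bands.

    Returns band centres and standard deviations; bands holding fewer than
    two values are reported as NaN.
    """
    edges = np.arange(-90.0, 90.0 + band_width, band_width)
    # Band index of each value; the north pole is folded into the last band.
    idx = np.clip(np.digitize(lat, edges) - 1, 0, len(edges) - 2)
    centres = 0.5 * (edges[:-1] + edges[1:])
    stds = np.full(len(centres), np.nan)
    for k in range(len(centres)):
        values = co2[idx == k]
        if values.size > 1:
            stds[k] = values.std(ddof=1)
    return centres, stds

# Hypothetical values: three near the equator, three at high latitude.
lat = np.array([5.0, 5.0, 5.0, 65.0, 65.0, 65.0])
co2 = np.array([374.0, 375.0, 376.0, 370.0, 375.0, 380.0])
centres, stds = std_by_latitude(lat, co2)
print(centres[9], stds[9], centres[15], stds[15])
```

Applied separately to the retrievals and to the collocated model values, a curve of this kind is what allows the two variabilities to be compared band by band.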
From equation (1), one may notice that the Bayesian analysis increments xa − xb simply equal the observation-minus-background departures d = y − Hxb weighted by the gain matrix K = BHᵀ(HBHᵀ + R)⁻¹. This linear dependence of the increments on the departures means that observations harm the inversion (i.e. the analysis xa is worse than the background xb) if they are fraught with biases of the order of these departures. A poor estimate of the gain matrix K or a biased background xb degrades the analysis as well, but this is a separate issue. Owing to the small values of the model variability compared with that of AIRS, the standard deviation of the departures d is about that of the observations. With departure standard deviations of about 4 ppm in the tropics, biases should not exceed about 0.4 ppm (i.e. one order of magnitude less than the departures) to have negligible impact on the flux inversion. Larger biases can be tolerated only at higher latitudes, owing to the larger departures there. Now, in equation (1) the observation error is measured with respect to the forward model H, which is assumed to be perfect. It therefore combines the actual forward model errors, the representativeness errors of the measurements and the retrieval errors. Keeping its biases below a few tenths of a ppm in a latitude band is particularly challenging. On the retrieval side, this may be achieved by processing the AIRS channels with more complementary information about temperature, aerosols and clouds. On the model side, the quality of the subgrid parameterization (boundary layer turbulence and moist convection) is essential. The bias requirement needs to be re-evaluated when more retrieval datasets are available, but improvements to the retrieval algorithms usually reduce the variability of the products together with their errors.
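Because the increments are linear in the departures, a bias added to the observations passes through the gain matrix unchanged in form: the analysis receives an extra increment equal to K applied to the bias. The sketch below illustrates this with a deliberately simple toy system; the single flux unknown, the 200 identical observations, the unit background variance and the helper names `gain_matrix` and `tolerable_bias` are all assumptions, and only the 4 ppm departure scale and the one-order-of-magnitude rule come from the text.

```python
import numpy as np

def gain_matrix(H, B, R):
    """K = B H^T (H B H^T + R)^-1."""
    return B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

def tolerable_bias(departure_std, factor=0.1):
    """Bias threshold taken as one order of magnitude below the departure std."""
    return factor * departure_std

# Toy system: one flux unknown observed identically by n measurements.
n = 200
H = np.ones((n, 1))
B = np.array([[1.0]])                 # background error variance (toy)
R = 4.0**2 * np.eye(n)                # 4 ppm observation error std
K = gain_matrix(H, B, R)              # shape (1, n)

rng = np.random.default_rng(0)
d = rng.normal(0.0, 4.0, size=n)      # synthetic unbiased departures
bias = np.full(n, tolerable_bias(4.0))  # uniform 0.4 ppm bias

increment_unbiased = K @ d
increment_biased = K @ (d + bias)
spurious = K @ bias                   # part of the increment caused by the bias alone

print(increment_unbiased, increment_biased, spurious)
```

In this toy configuration the spurious increment equals the bias multiplied by n/(n + 16), so it approaches the full bias as observations accumulate, whereas the contribution of random errors shrinks through averaging.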
The prominent role of carbon dioxide in the energy balance of the Earth system makes it important to monitor the temporal and spatial variations of its concentration in the atmosphere. For the first time, global maps of CO2 in the upper troposphere are available from remote sensing. They are of direct relevance for the evaluation of transport models, in addition to the high-quality but sparse aircraft measurements. The comparison between the AIRS retrievals and a simulation using the LMDZ transport model shows some agreement in terms of broad geographical and temporal features in the tropics only. A more detailed analysis will be reported elsewhere (Y. Tiwari et al., manuscript in preparation). Together with the retrievals of carbon monoxide [McMillan et al., 2005] and of methane, the AIRS data may also contribute to the understanding of extreme events, like plumes from biomass burning. However, from the current version of the AIRS retrievals at ECMWF, CO2 concentrations in the upper troposphere appear to follow a rather zonal structure with variations usually less than 2 ppm at any given latitude south of 25°N. This weak variability makes it essential that latitude-dependent biases are kept within a few tenths of a ppm in this portion of the world for quantitative use of the concentration retrievals, in particular for assimilation in a transport model. Upper bounds for the regional biases could be investigated in a similar way. The small variability highlights the limited information brought by high-altitude concentrations on surface fluxes and indicates that only the very broad scales can be constrained. From the previous analysis, it is not surprising that the surface fluxes inverted with the AIRS data have not proved to be of particularly good quality so far (result not shown). A better situation is expected when accurate estimates of CO2 that include the boundary layer are available, for instance from OCO and GOSAT [e.g., Rayner and O'Brien, 2001].
The authors wish to thank P. Bousquet, F.-M. Bréon, P. Ciais and P. Rayner for fruitful interactions, and F. Marabelle for computer support. This study was co-funded by the European Union under the project GEMS. C. Barnet (NOAA/NESDIS) and one anonymous reviewer provided constructive comments on an earlier version of this paper.