A series of observing system simulation experiments is presented in which column-averaged dry air mole fractions of CO2 (XCO2) from the Greenhouse gases Observing SATellite (GOSAT) are made consistent or not with the transport model embedded in a flux inversion system. The GOSAT observations improve the random errors of the surface carbon budget despite the inconsistency. However, we find biases in the inferred surface CO2 budget of a few hundred MtC/a at the subcontinental scale that are caused by differences of only a few tenths of a ppm between the simulations of the individual XCO2 soundings. The accuracy and precision of the inverted fluxes are only weakly sensitive to an 8-fold reduction in the data density. This issue is critical for any future satellite constellation to monitor XCO2 and should be pragmatically addressed by explicitly accounting for transport errors in flux inversion systems.
The global distribution of CO2 arises from atmospheric transport acting on sources and sinks. Reversing causality, we can use concentration measurements to improve our knowledge about surface fluxes by adopting a probabilistic approach, usually called ‘flux inversion’. Flux inversion is mostly formulated within a Bayesian framework in order to combine 1) concentration observations and their error statistics, 2) prior flux information and its error statistics, and 3) a numerical model of atmospheric transport and its error statistics. Robust estimation of model transport error has raised serious concerns in the scientific community in view of the large differences between models [e.g., Gurney et al., 2002] and between models and observations [e.g., Stephens et al., 2007]. For instance, some studies have found uncertainties at the subcontinental scale of several tenths of a GtC/a in the inverted fluxes resulting only from the choice of transport model, when exploiting surface measurements of CO2 concentrations [Gurney et al., 2002]. It has been argued that model transport error is less important for vertically integrated columns of CO2 [Rayner and O'Brien, 2001], as provided by satellites, but there are still stringent accuracy requirements on models and observations [Chevallier et al., 2005]. Studying the concept of a space-borne LIDAR instrument, recent work has shown discrepancies in the inverted fluxes (i.e. the maximum of the flux probability density function, PDF, after the inversion) of 0.1 GtC/a per 10^6 km^2 arising from model differences only [Houweling et al., 2010]. These large uncertainties in the end products of the inversion systems currently limit their utility for political decision-making about the carbon cycle and motivate further research.
Our paper examines the impact of the differences found between the current versions of two global transport models: “LMDZ” from the Laboratoire de Météorologie Dynamique [Hourdin et al., 2006] and the GEOS-Chem transport model [Suntharalingam et al., 2004; Palmer et al., 2008]. The two models have been developed by distinct groups and have been applied to two flux inversion systems: a variational system for LMDZ [Chevallier et al., 2007] and an ensemble Kalman filter for GEOS-Chem [Feng et al., 2009]. This paper is a first attempt to assess how different the resulting CO2 fluxes will be when these different systems process real satellite observations from the pioneering GOSAT platform [Yokota et al., 2009]. We present a series of one-year Observing System Simulation Experiments (OSSEs). These OSSEs have been designed so that the same flux inversion system exploits simulated retrievals of XCO2 that are either consistent or not with the atmospheric model embedded in the inversion system. In the presence of complicated model and observation error statistics, empirical strategies have been adopted in the past [e.g., Chevallier, 2007], two of which (observation error inflation and data thinning) are tested here in an attempt to minimize the problems arising from the model differences. The paper is structured as follows. Section 2 describes the data and the models. Results are presented in Section 3 and Section 4 concludes the paper.
2.1. GOSAT Observations
We are interested here in the XCO2 products that can be generated from the GOSAT radiation measurements in the near infrared spectral range (reflected solar radiation). GOSAT has been acquiring data since February 2009, but for simplicity the present simulations rely on the instrument and platform specifications as available from the scientific literature. We apply them to the atmospheric conditions of the year 2006. The XCO2 retrievals are characterized using the retrieval algorithm developed for the Orbiting Carbon Observatory (OCO) mission [Bösch et al., 2006], which has recently been adapted to GOSAT. Using the technical specification of GOSAT given by Suto et al., we calculate scene-specific error estimates and 12-level averaging kernels for the XCO2 retrieval for five surface types, eight solar zenith angles and five aerosol optical depths. We assume that XCO2 is retrieved from a simultaneous fit to two CO2 bands (1.61 and 2.06 μm) and the O2 A band (0.765 μm) and that CO2 is retrieved together with water vapour, temperature, surface albedo, surface pressure and the aerosol optical depth. The GOSAT look-up table is similar to the OCO ones used by Feng et al., reflecting the similarity between the observed spectral bands of the two instruments. The results of Section 3 do not depend on the prior concentration profiles, which are set to zero.
GOSAT is in a sun-synchronous orbit with a three-day repeat cycle and an equator crossing time of 1:00 p.m. in descending node. For simplicity, we assume that all soundings are acquired along the satellite ground-track only. We discard observed scenes that are contaminated by cloud or have aerosol optical depths >0.3 by using seasonal PDFs for cloud and aerosol optical depths derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) and Multi-angle Imaging SpectroRadiometer (MISR) instruments [Feng et al., 2009]. In order to minimize possibly large correlated errors in the transport model and the observations at the subgrid scale, only one scan per orbit and per 3.75° × 2.5° grid box is kept for the XCO2 simulations, which leaves about 330,000 individual soundings for one year (starting from about 20 times more individual soundings).
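The selection of one sounding per orbit and per grid box can be sketched as follows. This is a minimal illustration rather than the processing actually used: the dictionary keys `orbit`, `lon` and `lat` are hypothetical, and keeping the first sounding met in each box is an assumption, since the text does not state which sounding is retained.

```python
import math

# Grid-box size taken from the 3.75° x 2.5° (longitude-latitude) grid in the text.
DLON, DLAT = 3.75, 2.5

def grid_box(lon, lat):
    """Return the (column, row) index of the grid box containing a sounding."""
    col = int(math.floor((lon % 360.0) / DLON))
    row = int(math.floor((lat + 90.0) / DLAT))
    return col, row

def thin_soundings(soundings):
    """Keep the first sounding met for each (orbit, grid box) pair.

    `soundings` is an iterable of dicts with hypothetical keys
    'orbit', 'lon' and 'lat'; the input order is preserved.
    """
    seen = set()
    kept = []
    for s in soundings:
        key = (s["orbit"], grid_box(s["lon"], s["lat"]))
        if key not in seen:
            seen.add(key)
            kept.append(s)
    return kept
```

Two soundings of the same orbit that fall in the same box collapse to one, while the same box seen on a different orbit is kept.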
The error budget associated with the look-up table does not account for radiative transfer model uncertainties, like errors related to the description of aerosols. We assign 1 ppm to errors associated with the radiation model, representation, and transport model. We sum the variance of the three components and that of the tabulated retrieval error to assign the observation errors in the inversion system. The resulting sounding uncertainty varies between 1.8 and 7.2 ppm, which is comparable to the simple GOSAT error model of Chevallier et al.
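The sum of variances described above can be written as a short sketch. It assumes that 1 ppm is assigned to each of the three components (radiation model, representation, transport model), which is one reading of the text; the function and parameter names are illustrative.

```python
import math

def observation_error(retrieval_error_ppm, extra_errors_ppm=(1.0, 1.0, 1.0)):
    """Combine independent error terms in quadrature (sum of variances).

    `extra_errors_ppm` holds the assumed 1 ppm standard deviations for the
    radiation model, representation and transport model components.
    """
    variance = retrieval_error_ppm ** 2 + sum(e ** 2 for e in extra_errors_ppm)
    return math.sqrt(variance)
```

Under this reading, a tabulated retrieval error of about 0.5 ppm gives a total of about 1.8 ppm, the lower bound quoted in the text.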
2.2. Transport Models and Boundary Conditions
We use the LMDZ model [Hourdin et al., 2006] at a horizontal resolution of 3.75° × 2.5° (longitude-latitude) with 19 vertical levels. The simulation of the atmospheric flow is constrained by nudging towards the winds analysed at the European Centre for Medium-Range Weather Forecasts. We also use the GEOS-Chem chemistry transport model (v8-02-01) at a horizontal resolution of 2.0° × 2.5° (latitude-longitude) with 40 vertical levels. The model is driven by GEOS-5 meteorological analyses from the NASA Goddard Global Modelling and Assimilation Office.
A set of reference CO2 fluxes is defined that includes 3-hourly fluxes for the terrestrial biosphere, monthly ocean fluxes, monthly biomass burning fluxes and yearly fossil fuel emissions. More details are given by Chevallier et al. for the first two types of fluxes and by Palmer et al. for the other two. Surface fluxes, like meteorology, correspond to year 2006. The 3D global field of CO2 on 1 January 2006 (at the start of the simulation) comes from a previous run of LMDZ [Chevallier et al., 2010].
2.3. Inversion System
The inversion system of Chevallier et al. estimates weekly CO2 fluxes on a 3.75° × 2.5° (longitude-latitude) grid over a long period of time, typically a year or more, together with the field of CO2 at the start of the assimilation window. The statistically optimal fluxes and initial field are found by iterative minimization of the corresponding Bayesian cost function. The inversion system is used here for a whole year in the configuration that has been finalised for a 21-year reanalysis of surface measurements [Chevallier et al., 2010]. Some skill of the system has been demonstrated with independent observations, since the 21-year flux inversion improves the posterior atmospheric simulation compared to a naive standard that simply uses the annual global CO2 growth rate.
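For reference, a Bayesian cost function of this kind usually takes the generic form below under Gaussian error assumptions. The notation is ours and is not taken from the inversion system's documentation: x is the control vector (the gridded weekly fluxes and the initial CO2 field), x_b its prior value, B the prior error covariance matrix, y the XCO2 observations, R their error covariance matrix, and H the observation operator (transport model followed by sampling with the averaging kernels).

```latex
J(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b)
              + \tfrac{1}{2}\,\bigl(H(\mathbf{x})-\mathbf{y}\bigr)^{\mathrm{T}}\,\mathbf{R}^{-1}\,\bigl(H(\mathbf{x})-\mathbf{y}\bigr)
```

The first term penalizes departures from the prior and the second penalizes misfits to the observations, each weighted by the inverse of the corresponding error covariance.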
The uncertainty of the inverted fluxes is central to this study. It is estimated from the Monte Carlo approach of Chevallier et al., in which an ensemble of inversions is built from the statistics of the prior errors and of the observation errors. By construction, the ensemble of the inverted fluxes follows the theoretical (Bayesian) error statistics of the posterior fluxes. This approach can easily be extended to assess the error statistics in the presence of known sub-optimal features, like inconsistent transport modelling. In practice, five successive steps are followed for each OSSE: 1) the reference set of CO2 surface fluxes and CO2 initial field (described in Section 2.2) is used as boundary conditions to a transport model to generate a set of pseudo-observations following the characteristics given in Section 2.1; 2) the pseudo-observations are perturbed consistently with the assumed observation error statistics; 3) the reference CO2 surface fluxes and initial field are perturbed consistently with the assumed prior error statistics; 4) the flux inversion system is applied to the perturbed pseudo-observations (as data) and the perturbed CO2 fluxes and initial field (as prior field); and 5) the error of the inverted CO2 fluxes and initial field is quantified by comparison to the reference fluxes in terms of biases and standard deviation.
 The method is applied several times with different perturbations each time, in order to compute the inversion error statistics. The present study exploits ensembles of four one-year inversions of surface fluxes in eight-day segments. As a result, we have a series of 48 monthly fluxes available at each location of the world for each ensemble, thereby providing stable statistics.
 Two types of ensembles are built. The first one is internally consistent: the transport model in Step 1 and the one used by the inversion system in Step 4 are the same (LMDZ in our case). The second type is not: Step 1 relies on GEOS-Chem while Step 4 keeps LMDZ.
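The five steps and the two ensemble types can be illustrated with a deliberately simple toy problem: a single scalar flux observed many times through a scalar "transport" coefficient. It is a sketch under strong assumptions, not the LMDZ or GEOS-Chem setup. The names `h_generate` and `h_invert` are hypothetical stand-ins for the models used in Step 1 and Step 4, and all numerical values are arbitrary.

```python
import random
import statistics

def run_osse(h_generate, h_invert, n_members=2000, n_obs=50,
             x_true=1.0, sigma_prior=0.5, sigma_obs=1.0, seed=0):
    """Return (bias, standard deviation) of the posterior flux error."""
    rng = random.Random(seed)
    # Scalar Bayesian analysis: posterior variance and gain per observation.
    post_var = 1.0 / (1.0 / sigma_prior**2 + n_obs * h_invert**2 / sigma_obs**2)
    gain = post_var * h_invert / sigma_obs**2
    errors = []
    for _ in range(n_members):
        # Steps 1-2: pseudo-observations from the generating model, then perturbed.
        obs = [h_generate * x_true + rng.gauss(0.0, sigma_obs) for _ in range(n_obs)]
        # Step 3: perturbed prior flux.
        x_prior = x_true + rng.gauss(0.0, sigma_prior)
        # Step 4: inversion with the model embedded in the inversion system.
        x_post = x_prior + gain * sum(y - h_invert * x_prior for y in obs)
        # Step 5: error against the reference flux.
        errors.append(x_post - x_true)
    return statistics.mean(errors), statistics.pstdev(errors)

# Internally consistent ensemble: same model in Step 1 and Step 4.
consistent = run_osse(h_generate=1.0, h_invert=1.0)
# Inconsistent ensemble: the generating model differs from the inverting one.
inconsistent = run_osse(h_generate=1.1, h_invert=1.0)
```

In this toy case the spread of the posterior error is the same in both ensembles, while the inconsistent ensemble acquires a bias that the assumed error statistics do not describe.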
3.1. Direct Simulations
LMDZ and GEOS-Chem are used to simulate the individual GOSAT XCO2 for the year 2006 with the boundary conditions (surface fluxes and CO2 initial state) described in Section 2.2. In spite of the large scientific and technical differences between the two transport models, a fair agreement is found, with a mean global difference of 0.04 ppm (LMDZ minus GEOS-Chem), a standard deviation of 0.6 ppm and a regression slope of 1.00156 ppm/ppm. By comparison, the variability of the simulated XCO2 is much larger (5.6 ppm standard deviation). No significant trend of the differences as a function of time or XCO2 value is noticed. The 0.04 ppm bias is likely caused by the interpolation of the model grids to the 12-level averaging kernel grids, which was found in the course of this study not to conserve mass. At this stage, using a conversion factor of 2.12 GtC/ppm [Denman et al., 2007], we may expect that this bias will be interpreted by the one-year LMDZ-based inversion system as a negative global mass increment of up to −0.08 GtC/a. By comparison, the uncertainty in the global annual carbon fluxes from the NOAA ESRL surface network is about 0.15 GtC/a (given a 0.07 ppm/a uncertainty in the global mean growth rate; see http://www.esrl.noaa.gov/gmd/ccgg/trends/, accessed 15 June 2010).
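The two mass estimates in this paragraph follow directly from the conversion factor, as the short check below shows. The function name is illustrative, and the sign convention in the comment is our reading of the text.

```python
GTC_PER_PPM = 2.12  # conversion factor quoted in the text [Denman et al., 2007]

def ppm_to_gtc(delta_ppm):
    """Convert a global mean CO2 mole fraction change (ppm) to carbon mass (GtC)."""
    return delta_ppm * GTC_PER_PPM

# LMDZ minus GEOS-Chem is +0.04 ppm, so an LMDZ-based inversion of
# GEOS-Chem pseudo-observations sees columns that are 0.04 ppm too low.
bias_increment = ppm_to_gtc(-0.04)      # about -0.08 GtC over one year
network_uncertainty = ppm_to_gtc(0.07)  # about 0.15 GtC/a
```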
3.2. Flux Inversion Results
 We first look at the inversion results in terms of fractional uncertainty reduction (FUR) for the fluxes. This quantity is defined as one minus the ratio of the posterior error standard deviation to the prior error standard deviation. It therefore quantifies the improvement in precision. It may become negative in suboptimal configurations or with insufficient realizations.
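As a minimal sketch of this definition, the FUR can be computed from two samples of flux errors (prior minus reference and posterior minus reference). The function name and inputs are illustrative. The standard deviation removes the ensemble mean, so a bias in the posterior fluxes does not change the FUR.

```python
import statistics

def fractional_uncertainty_reduction(prior_errors, posterior_errors):
    """FUR = 1 - (posterior error std) / (prior error std)."""
    sigma_prior = statistics.pstdev(prior_errors)
    sigma_post = statistics.pstdev(posterior_errors)
    return 1.0 - sigma_post / sigma_prior
```

For example, halving the error spread gives a FUR of 0.5, whereas a posterior spread larger than the prior spread gives a negative FUR, which is the suboptimal case mentioned above.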
Figure 1 shows the FUR for the grid-point weekly CO2 fluxes aggregated at the monthly scale within the 22 TransCom3 regions of Gurney et al., together with the corresponding prior errors. We acknowledge that these are coarse geographical regions, but they provide a framework that facilitates reproducibility of the results. The largest uncertainty reductions (up to 80%) are located over vegetated lands while the GOSAT observations bring little information over the oceans [Chevallier et al., 2009]. The FUR is also displayed in the case where a different transport model is used in Step 1 (GEOS-Chem) and in Step 4 (LMDZ) of the OSSE, all other inversion components being consistent. The inconsistent atmospheric transport reduces the FUR by up to 0.36. Still, the observations improve all fluxes in terms of random errors at these spatial scales.
 We now investigate the impact of the transport inconsistency in terms of the accuracy of the inverted fluxes. By construction, our prior fluxes are unbiased: their uncertainty is fully characterised by a covariance matrix. In principle, this feature holds for the posterior fluxes in the optimal configuration, even though in practice, our finite ensemble size leaves a global bias of +0.1 GtC/a. In contrast, the global budget of the sub-optimal OSSE is biased by +0.9 GtC/a. This number is larger than our guess of −0.08 GtC/a made in the previous section. The +0.9 GtC increment is actually only one side of a mass dipole. The other one occurs in the CO2 columns at the start of the inversion window, which are biased by −0.15 ppm after the inversion, even though their FUR is 8% (the FUR for this field reaches 12% in the optimal OSSE). This bias corresponds to a mass offset of −0.33 GtC, which counterbalances more than one third of the flux bias. To make the bias inventory complete, we note that the XCO2s are biased after the suboptimal inversion by −0.13 ppm, compared to 0.002 ppm in the optimal case.
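The size of the mass dipole can be checked with the conversion factor of 2.12 GtC/ppm used in Section 3.1. The arithmetic below uses the rounded values quoted in the text, which presumably explains the small difference from the stated −0.33 GtC.

```latex
\Delta M_{\text{initial}} \approx -0.15\ \text{ppm} \times 2.12\ \text{GtC/ppm} \approx -0.32\ \text{GtC},
\qquad
\frac{0.33\ \text{GtC}}{0.9\ \text{GtC}} \approx 0.37 > \tfrac{1}{3}
```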
Figure 2a shows the geographical distribution of the surface biases of the sub-optimal OSSEs. The circles marked ‘Sub-optimal reference’ correspond to the sub-optimal configuration illustrated so far, with only atmospheric transport being inconsistent. Two additional tests are reported in which some features of the sub-optimal OSSE have been modified with empirical methods in order to damp the biases [Chevallier, 2007]: 1) the data density is thinned everywhere on the globe by a factor of 8 and 2) the observation error variances assigned in Step 4 of the OSSE are inflated by a factor of 4. From Figure 2a, we see that the reference sub-optimal OSSE yields varying positive and negative regional biases. The largest one over land is in the South American Tropical region (67 MtC/month). The annual mean bias amounts to 0.53 GtC/a in Europe, nearly twice as much as the European annual sink (about 0.3 GtC/a according to Schulze et al.). Among the sub-optimal configurations, only the dramatic reduction of observation density improves the bias over Europe (down to 0.23 GtC/a). But even this configuration leaves the bias largely unchanged in other regions. These results suggest that the transport differences act at large rather than small scales.
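The two empirical modifications can be sketched as follows. Regular subsampling is an assumption, since the text does not specify how the thinning is carried out, and the function names are illustrative. Inflating the error variances by a factor of 4 is equivalent to doubling the error standard deviations.

```python
import math

def thin(soundings, factor=8):
    """Keep one sounding out of every `factor` (regular subsampling, assumed)."""
    return soundings[::factor]

def inflate_errors(sigmas_ppm, variance_factor=4.0):
    """Multiply error variances by `variance_factor`, i.e. sigmas by its square root."""
    scale = math.sqrt(variance_factor)
    return [s * scale for s in sigmas_ppm]
```

Applied to the roughly 330,000 soundings of Section 2.1, an 8-fold thinning would leave about 41,000 soundings, and the 1.8 to 7.2 ppm error range would become 3.6 to 14.4 ppm after inflation.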
The FUR for the three sub-optimal OSSEs is reported in Figure 2b. The lowest performance is seen for the OSSE with eight times fewer observations, but the FUR actually varies by less than 0.15 in each region from one case to another. Conversely, this shows that increasing the data density in the suboptimal configuration does not efficiently increase the precision of the surface CO2 budget.
GOSAT is the first satellite to be operated specifically for the remote sensing of CO2. Compared to instruments designed for other atmospheric compounds, GOSAT is not motivated by the mapping of the concentrations as such, but rather by the mapping of the underlying surface fluxes. From the raw radiance measurements to the flux product, a series of sophisticated algorithms are run whose performance has to be carefully evaluated in order to guarantee the quality of the desired product. This paper concentrates on the uncertainties associated with the numerical modelling of atmospheric transport, which is part of this process, by exploiting two state-of-the-art models. For a one-year inversion, we find biases in the inferred surface carbon budget of a few hundred MtC/a at the scale of subcontinents, resulting from existing differences between the atmospheric simulations of only a few tenths of a ppm, which would be scarcely detectable from a validation network. It is noteworthy that this inversion experiment already took model differences into account, in the form of an uncorrelated random component in the assigned observation error, but this representation appears to be insufficient. Attempts to damp the biases by adapting the inversion system have hardly improved the performance of the inversion. The initial 3D field of CO2, which is part of the inverted variables, is also biased after the inversion, with mass offsets of opposite sign compared to the global mass offset from the fluxes. If the initial 3D field were known precisely (which is not realistic), this artificial dipole of mass increments would disappear: we find that the bias of the surface carbon balance would be reduced to 0.2 GtC/a, but large flux biases at the subcontinental scale would remain, up to 66 MtC per month (not shown).
We also showed that the benefit of adding more XCO2 retrievals in the inversion system saturates early, in contrast to what would be obtained in the absence of transport inconsistencies. We have not considered here the impact of possible biases in the XCO2 retrievals themselves, which would further hamper the flux estimation.
If we make the hypothesis that the difference between LMDZ and GEOS-Chem is a fair illustration of the difference between any state-of-the-art transport model and the truth, this study points to a critical issue for the users of the GOSAT flux products. It is also of concern for the planning of future XCO2-monitoring satellites, as already noted by Houweling et al. Pragmatic strategies have to be found to address this issue, given the long-standing difficulty in modelling the subgrid-scale transport processes. We recommend that transport errors be explicitly accounted for in the inversion systems by including some representation of these errors in the inverted variables. Such strategies, initiated for methane inversions by Bergamaschi et al., will likely exploit information from the surface network, for which transport biases are different, or from inventories, in addition to the GOSAT-type total column retrievals.
This work was performed using HPC resources from GENCI-[CCRT/CINES/IDRIS] (Grant 2009-t2009012201) and DSM. It was co-funded by the European Commission under the EU Seventh Research Framework Programme (grant agreements 212196, COCOS, and 218793, MACC) and by the British Council under a grant from the Franco-British partnership programme. LF was funded by the UK Natural Environment Research Council under NE/H003940/1. HB was funded by a Research Council UK Fellowship. PR is the recipient of an Australian Research Council Professorial Fellowship (DP1096309). We are grateful for the assistance of A. Cogan (Univ. of Leicester) for the simulation of the GOSAT spectra.