We investigated differences in the five currently available datasets of column-integrated CO2 concentrations (XCO2) retrieved from spectral soundings collected by the Greenhouse gases Observing SATellite (GOSAT) and assessed their impact on regional CO2 flux estimates. We did so by estimating the fluxes from each of the five XCO2 datasets combined with surface-based CO2 data, using a single inversion system. The five XCO2 datasets are available in raw and bias-corrected versions, and we found that the bias corrections diminish the range of the five coincident values by ~30% on average. The departures of the five individual inversion results (annual-mean regional fluxes based on XCO2-surface combined data) from the surface-data-only results were close to one another in some terrestrial regions where spatial coverage by each XCO2 dataset was similar. The mean of the five annual global land uptakes was 1.7 ± 0.3 GtC yr−1, and they were all smaller than the value estimated from the surface-based data alone.
Obtaining detailed information on the distribution and temporal variability of surface CO2 sources and sinks, or fluxes, is essential for better understanding the mechanisms and dynamics of the global carbon cycle [e.g., Wigley and Schimel, 2000]. Toward this end, efforts have been devoted to inferring surface CO2 fluxes from observed gradients of atmospheric CO2 concentrations using Bayesian inverse modeling techniques. Earlier studies successfully obtained flux estimates on subcontinental scales using CO2 data from surface networks of monitoring stations [e.g., Gurney et al., 2002; Baker et al., 2006; Bruhwiler et al., 2011], but estimates for undersampled parts of the globe, particularly Africa, South America, and Asia, were associated with large uncertainties. To augment the number and spatial coverage of the CO2 data and reduce the flux uncertainties, it was proposed to use space-based spectral soundings of surface-reflected sunlight in the short-wave infrared (SWIR) wavelength range, from which column-integrated CO2 concentrations (XCO2) can be retrieved [e.g., Rayner and O'Brien, 2001; Houweling et al., 2004]. Rayner and O'Brien [2001] demonstrated that satellite-based global XCO2 retrievals could reduce uncertainties in regional flux estimates substantially if data from the surface-based monitoring stations were augmented by retrievals with precisions of 1–2 ppm (~0.5%; on a regional scale with no bias). The first attempts at retrieving space-based XCO2 were made with soundings collected by the SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) instrument aboard Envisat [Buchwitz and Burrows, 2004].
To obtain high-resolution SWIR spectra for retrieving XCO2 with a targeted precision of less than 1%, the TANSO (Thermal And Near-infrared Sensor for carbon Observation) Fourier transform spectrometer aboard the Greenhouse gases Observing SATellite (GOSAT) [Kuze et al., 2009] was placed in orbit in January 2009. In the GOSAT research community, there currently exist five XCO2 retrieval algorithms developed by four groups: the National Institute for Environmental Studies (NIES), Japan (NIES v02 and PPDF-S) [Yoshida et al., 2013; Oshchepkov et al., 2013b], the NASA Atmospheric CO2 Observations from Space (ACOS) team (ACOS B2.10) [O'Dell et al., 2012], the Netherlands Institute for Space Research/Karlsruhe Institute of Technology, Germany (RemoTeC v2.0) [Butz et al., 2011; Guerlet et al., 2013], and the University of Leicester, UK (UoL-FP v3G) [Boesch et al., 2011; Cogan et al., 2012]. These algorithms have already gone through several updates since the launch of the satellite. Although the algorithm improvement efforts continue, recent comparisons of the five XCO2 retrievals to ground-based reference data obtained at the observational sites of the Total Carbon Column Observing Network (TCCON) [Wunch et al., 2011a] showed that the mean and standard deviation of the GOSAT-TCCON differences are on the order of a few tenths of a percent [e.g., Oshchepkov et al., 2013a]. With this progress, the first attempts at estimating CO2 fluxes from the GOSAT-based XCO2 retrievals were made by multiple inverse modeling groups, and the results were cross-compared in the GOSAT CO2 inversion intercomparison campaign. The goal of the campaign is to assess the range of differences and the added value of using GOSAT-based XCO2 retrievals in the flux estimation. In the initial stage of the campaign, each group used its own choice of inverse modeling scheme and XCO2 retrieval dataset in obtaining its flux estimates. The result of the first assessment, focused on a 1 year period from June 2009 to May 2010, will be reported elsewhere.
An independent analysis, focused on the year 2010, was done elsewhere [Chevallier et al., 2014].
For evaluating differences in flux estimates that are based on various modeling setups and concentration datasets, it is critical to know the individual contributions from (1) the inverse modeling systems and (2) the XCO2 retrievals. We herein report the result of the latter assessment, which was obtained by estimating CO2 fluxes from the five different XCO2 retrieval datasets using a single inverse modeling system, for the same 1 year period between June 2009 and May 2010.
2 Data and Method
2.1 Differences in XCO2 Retrievals
The flow of data processing common to all five retrieval algorithms is as follows: (1) prescreening of GOSAT Level 1B SWIR spectral radiance data for perturbations by clouds and aerosols, (2) simulating the measured radiance spectra with a forward radiative transfer model, (3) retrieving XCO2 by optimizing the fit to the observed spectra, and (4) postscreening for low-quality retrievals. The details of the implementation of these steps vary among the individual retrieval algorithms. Some key differences among the algorithms, as well as the number of successful land retrievals yielded by each algorithm over the analyzed period, are shown in the upper part of Table 1.
Table 1. Key Differences in the Five Retrieval Algorithms
Only the retrievals based on GOSAT Level 1B spectral radiance data collected in high-gain mode (including oceanic retrievals) were used in this study.
Only the terrestrial retrievals are available.
The number of retrieval layers/levels is fixed (layer thickness or level varies with surface pressure).
Number of retrieval levels varies with local surface pressure (only the number of the lowest few levels changes).
CAI: Cloud and Aerosol Imager onboard GOSAT.
Retrieved with an O2 A-band-only algorithm based on an assumption of no clouds and aerosols present.
Each retrieval found within a ±2° grid box centered at each of 11 TCCON sites was compared with TCCON data that were averaged over ±30 min of GOSAT overpass time. The 11 TCCON sites are Sodankylä (67.368°N, 26.663°E), Bialystok (53.230°N, 23.025°E), Bremen (53.104°N, 8.845°E), Orleans (47.970°N, 2.113°E), Garmisch (47.476°N, 11.063°E), Park Falls (45.945°N, 90.273°W), Lamont (36.604°N, 97.486°W), Tsukuba (36.051°N, 140.122°E), Darwin (12.424°S, 130.829°E), Wollongong (34.406°S, 150.879°E), and Lauder (45.038°S, 169.684°E).
The assessment of biases in the obtained XCO2 values is an integral part of postretrieval data validation. The lower part of Table 1 lists global-mean GOSAT-TCCON differences of the five XCO2 retrieval datasets. Results based on both bias-corrected and raw datasets (in parentheses) are shown. Biases in the PPDF-S, ACOS, RemoTeC, and UoL-FP datasets were analyzed and corrected using multivariate linear regressions in which spurious variability in XCO2 values was correlated with retrieval parameters such as surface albedo. The regression-based bias analysis for the NIES dataset (v02.00) was underway at the start of the GOSAT CO2 inversion intercomparison campaign, and for the current study the bias was corrected by raising each retrieved value by a global-mean GOSAT-TCCON difference (1.2 ppm). While debates on how best to analyze and correct biases outside the TCCON sites continue, efforts are also devoted to investigating the causes of the biases. For more detailed descriptions of each of the five algorithms, including the bias correction approaches adopted, we refer the readers to a report on GOSAT retrieval algorithm intercomparison by Oshchepkov et al. [2013a] and the literature listed in Table 1.
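The collocation rule given in the notes to Table 1 (retrievals within ±2° of a TCCON site, compared with TCCON data averaged over ±30 min of the overpass) and a constant-offset correction of the kind applied to the NIES dataset can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the tuple layouts, the reading of the ±2° box as a latitude/longitude window, and the sample numbers are all assumptions.

```python
from statistics import mean

BOX_DEG = 2.0      # half-width of the box around a TCCON site (from the text)
WINDOW_MIN = 30.0  # half-width of the time window in minutes (from the text)

def gosat_minus_tccon(retrievals, site_lat, site_lon, tccon):
    """Return GOSAT-minus-TCCON differences (ppm) for one site.

    retrievals: list of (lat, lon, time_min, xco2_ppm)   (hypothetical layout)
    tccon:      list of (time_min, xco2_ppm) measured at the site
    """
    diffs = []
    for lat, lon, t, xco2 in retrievals:
        # Keep only retrievals inside the box centered on the site.
        if abs(lat - site_lat) > BOX_DEG or abs(lon - site_lon) > BOX_DEG:
            continue
        # Average the TCCON data within the time window of the overpass.
        ref = [v for (tt, v) in tccon if abs(tt - t) <= WINDOW_MIN]
        if ref:
            diffs.append(xco2 - mean(ref))
    return diffs

def offset_correct(values, mean_difference):
    """Remove a constant mean difference from every retrieved value."""
    return [v - mean_difference for v in values]

# Hypothetical numbers, chosen only to exercise the functions.
retrievals = [(36.5, 140.5, 600.0, 388.8), (40.0, 140.0, 600.0, 380.0)]
tccon = [(590.0, 390.0), (610.0, 390.0), (700.0, 395.0)]
diffs = gosat_minus_tccon(retrievals, 36.051, 140.122, tccon)
bias = mean(diffs)                       # about -1.2 ppm for these numbers
corrected = offset_correct([388.8], bias)
```

In this sketch a negative mean difference raises the corrected values, which matches the direction of the 1.2 ppm adjustment described above. A global-mean difference would pool the differences from all sites before averaging.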
Figure 1 shows the standard deviations (SD) of collocated XCO2 retrievals by the five algorithms for July 2009. The left panel shows SDs of coincident retrievals to which bias corrections were applied, and the right panel presents those of uncorrected retrievals. Note that the geographical distribution of these coincidences does not represent that of any particular XCO2 retrieval dataset (see Figure S1 in the supporting information for the distributions of the five datasets for July 2009). Only a fraction of the retrievals in the five datasets was found to coincide (see Figure S2 for coincidences in other months); thus, the values in these figures do not represent the spatial coverage of the individual datasets. Yet these figures indicate that the application of bias correction diminishes the spread among the five retrievals in most areas. The global-mean SDs of the bias-corrected and raw retrievals for July 2009 were 1.2 and 1.8 ppm, respectively. Over the whole analysis period, the global-mean SDs turned out to be 1.2 ppm (minimum: 0.2; maximum: 4.5) and 1.6 ppm (minimum: 0.2; maximum: 5.4), respectively. Although the bias correction reduced the global mean biases to nearly zero (Table 1), SDs of GOSAT-TCCON differences, both before and after the application of bias correction, remain approximately 2 ppm. We considered this 2 ppm uncertainty as a random error associated with the current versions of the retrieval datasets and took it into account in the flux estimation as the GOSAT data uncertainty (described in the next section).
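The spread statistic described above can be illustrated with a short sketch: for every location where all datasets provide a value, take the SD across the datasets. The text does not spell out how coincidences were identified or whether a sample or population SD was used, so the shared dictionary key, the sample SD, and the numbers below are assumptions made for illustration.

```python
from statistics import stdev

def coincident_sd(datasets):
    """SD across datasets at every key present in all of them.

    datasets: list of dicts mapping a location key to XCO2 in ppm
              (hypothetical layout; the key could be a grid cell).
    """
    common = set(datasets[0])
    for d in datasets[1:]:
        common &= set(d)
    # Sample SD (n - 1 denominator); this choice is an assumption.
    return {key: stdev([d[key] for d in datasets]) for key in common}

# Hypothetical values for three datasets; only cell (0, 0) is coincident.
example = [
    {(0, 0): 388.0, (1, 0): 390.0},
    {(0, 0): 389.0, (1, 0): 390.0},
    {(0, 0): 390.0},
]
spread = coincident_sd(example)
```

Cells missing from any one dataset drop out of the result, which is why a map of such SDs does not reflect the coverage of any individual dataset.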
2.2 Experimental Setup
The inverse modeling setup used here is described in detail by Maksyutov et al. . In brief, the system makes use of version 08.1i of the NIES atmospheric tracer transport model [Belikov et al., 2013] for the simulation of CO2 concentration and a fixed-lag Kalman Smoother optimization scheme [Bruhwiler et al., 2005] for the inference of monthly fluxes of 42 subcontinental terrestrial regions and 22 oceanic basins. The a priori flux data used here consist of two emission inventories (anthropogenic and biomass burning emissions) and two model simulations (terrestrial biosphere and ocean fluxes) as described by Maksyutov et al. .
We estimated monthly regional fluxes and their uncertainties from each of the five XCO2 retrieval datasets combined with GLOBALVIEW-CO2 (GV) surface-based network data, which are generated by smoothing, interpolating, and extrapolating selected CO2 measurements that represent baseline conditions [GLOBALVIEW-CO2, 2011]. Data from 220 surface monitoring locations, including airborne sites, were used (see the upper left panel of Figure S1 for locations). Following Law et al. , we shifted offshore the locations of all coastal sites used in order to account for the selective measurements reflected in GV data. For estimating monthly fluxes on a subcontinental scale, we used monthly-mean concentration data; after performing the forward concentration simulation of each GV and XCO2 value, the GV values were monthly-averaged, and the XCO2 retrievals were gridded to 5° × 5° cells and averaged on a monthly basis. We chose to regularize the XCO2 retrievals this way to reduce, as much as possible, the potential influence on the flux estimation of differences in the number of retrievals each algorithm yields (Table 1; the maximum difference is as large as ~40,000 retrievals per year) and in their horizontal coverage (Figure S1). Cells of 5° × 5° with fewer than three retrievals per month were not considered. The uncertainties for the GV values were taken from residual SDs about smooth curves that are stored in the GV dataset, and those for the XCO2 retrievals were determined as SDs of the retrievals found in each 5° × 5° grid cell in a month (all-data mean SD: 1.6 ppm; range: 0.02–7.8 ppm).
Following Law et al. , we accounted for errors associated with both the measurement and the forward concentration simulation by setting minimum uncertainties for the GV and XCO2 values at 0.3 and 3.0 ppm, respectively. The minimum uncertainty for XCO2 retrievals is based on the above-mentioned uncertainty associated with XCO2 retrieval (2.0 ppm) and the error in the simulation of vertical column concentrations (~1.0 ppm) as reported by Belikov et al. [2013].
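The observation-side regularization described in this section (5° × 5° gridding, monthly averaging, dropping cells with fewer than three retrievals, and a 3.0 ppm floor on the XCO2 uncertainty) can be sketched as below. This is only an illustrative outline under stated assumptions: the function name, the input layout, the cell indexing, and the sample values are hypothetical, and the sketch leaves out the forward model simulation that the text says precedes the averaging.

```python
from collections import defaultdict
from statistics import mean, stdev

CELL_DEG = 5.0       # grid cell size (from the text)
MIN_COUNT = 3        # cells with fewer retrievals per month are dropped
MIN_SIGMA_PPM = 3.0  # minimum XCO2 uncertainty (from the text)

def grid_monthly(retrievals):
    """Grid one month of retrievals and return {cell: (mean, sigma)}.

    retrievals: iterable of (lat, lon, xco2_ppm)   (hypothetical layout)
    """
    cells = defaultdict(list)
    for lat, lon, xco2 in retrievals:
        # Cell indices; the min() keeps lat = 90 and lon = 180 in range.
        i = min(int((lat + 90.0) // CELL_DEG), int(180 / CELL_DEG) - 1)
        j = min(int((lon + 180.0) // CELL_DEG), int(360 / CELL_DEG) - 1)
        cells[(i, j)].append(xco2)
    gridded = {}
    for key, values in cells.items():
        if len(values) < MIN_COUNT:
            continue
        # Uncertainty: SD within the cell, but never below the floor.
        gridded[key] = (mean(values), max(stdev(values), MIN_SIGMA_PPM))
    return gridded

# Hypothetical retrievals: three fall in one cell, one is isolated.
sample = [(36.0, 140.0, 388.0), (37.0, 141.0, 390.0),
          (38.0, 142.0, 392.0), (10.0, 10.0, 385.0)]
monthly = grid_monthly(sample)
```

Because every retained cell contributes one monthly value regardless of how many soundings fall in it, a dataset with many more retrievals does not automatically receive more weight in the inversion.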
3.1 Spread of Five Estimated Fluxes Due to Differences in XCO2
Presented in Figures 2a and 2b are the mean and SD of the five independent monthly fluxes for July 2009 estimated from the bias-corrected XCO2 retrievals. The fluxes shown include anthropogenic emissions. The influence of the XCO2 retrievals on these regional flux estimates is not uniform but depends, among other factors, on the availability of both XCO2 retrievals and GV data within and around each region. To identify flux estimates on which XCO2 retrievals had a large influence, we show in Figure 2c the uncertainty reduction rate (UR), which represents the degree to which XCO2 retrievals contribute to constraining regional fluxes. Following Takagi et al.  and Maksyutov et al. , the rate in percent is given as
$$\mathrm{UR} = \left(1 - \frac{\sigma_{\mathrm{GV+GOSAT}}}{\sigma_{\mathrm{GV}}}\right) \times 100,$$
where $\sigma_{\mathrm{GV}}$ and $\sigma_{\mathrm{GV+GOSAT}}$ denote the uncertainties of fluxes estimated from the GV data alone and from both the GV data and XCO2 retrievals, respectively. Figure 2c shows the mean of five UR values. To distinguish cases with pronounced influence by GOSAT retrievals from ambiguous ones, we set a threshold of 10% UR, which comes from doubling the annual mean URs of the Amazonian regions (regions 9 to 12), whose fluxes were constrained by data collected in distant regions because both GV data and XCO2 retrievals were nearly absent from these regions throughout the analyzed year. In Figure 2b, terrestrial regions with URs greater than the threshold are indicated with asterisks. The statistical consistency of these above-UR-threshold GV + XCO2 fluxes with the corresponding GV-only values, which determines whether the GV-GOSAT joint estimation is a refinement of the GV-only case, is supported as follows: among the high-UR GV + XCO2 fluxes (767 monthly estimates in the analyzed year across the five flux datasets), 93% fell within the uncertainty ranges (estimated flux ± a posteriori uncertainty) of the corresponding GV-only values, and for the remaining 7%, the uncertainty ranges overlapped those of the corresponding GV-only values.
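As a rough illustration of the screening described here, the sketch below computes an uncertainty reduction rate of the commonly used form UR = (1 − σ(GV+GOSAT)/σ(GV)) × 100, flags estimates above the 10% threshold, and applies the two consistency checks (joint flux inside the GV-only range, or overlapping ranges). The function names and the sample numbers are hypothetical, and the exact form of the UR expression is an assumption based on the symbol definitions in the text.

```python
UR_THRESHOLD = 10.0  # percent (from the text)

def uncertainty_reduction(sigma_gv, sigma_gv_gosat):
    """Uncertainty reduction rate in percent (assumed form)."""
    return (1.0 - sigma_gv_gosat / sigma_gv) * 100.0

def screen(flux_gv, sigma_gv, flux_joint, sigma_joint):
    """Return (UR, high_ur, within_gv_range, ranges_overlap)."""
    ur = uncertainty_reduction(sigma_gv, sigma_joint)
    distance = abs(flux_joint - flux_gv)
    # Joint estimate lies inside the GV-only flux +/- its uncertainty.
    within = distance <= sigma_gv
    # The two +/- uncertainty intervals share at least one point.
    overlap = distance <= sigma_gv + sigma_joint
    return ur, ur > UR_THRESHOLD, within, overlap

# Hypothetical monthly regional fluxes in gC m-2 day-1.
result = screen(flux_gv=-0.5, sigma_gv=0.4, flux_joint=-0.2, sigma_joint=0.3)
```

An estimate that passes the first check necessarily passes the second, so the overlap test only matters for the minority of cases that fall outside the GV-only range.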
Flux SDs for these high-UR regions ranged from 0.2 (region 18) to 0.6 (region 39) gC m−2 day−1, and each of these SD values was found to be nearly equal to or smaller than the mean of the corresponding a posteriori flux uncertainties. In the case of region 39 (associated with the largest SD in the analyzed period), the spread between the largest and smallest flux estimates among the five results was 1.2 gC m−2 day−1, which translated into a maximum SD of the five a posteriori concentrations of 3.7 ppm (Figure 2d; SD of monthly mean concentrations simulated on a 2.5° × 2.5° grid at the 0.975 sigma level within region 39).
3.2 Annual Mean Fluxes
To investigate the larger-scale influence of the differences in the five XCO2 retrievals on the flux estimation, we calculated annual global mean fluxes (net) and land/ocean partitions (without anthropogenic emissions) for each of the five inversion results. The values were obtained by aggregating the monthly regional fluxes and are listed in Table 2 (unit: GtC yr−1). The mean of the five annual global land uptakes was 1.7 ± 0.3 GtC yr−1. Relative to the GV-only result, all five results show a reduction in global terrestrial biosphere uptake or an enhancement in respiration.
Table 2. Annual Mean Fluxes in GtC yr−1
Mean and SD of Five Results
Land and ocean uptakes do not include anthropogenic emissions. Land uptakes include biomass burning emissions.
To further explore this commonality, we show in Figure 3 annual regional fluxes estimated from GV data alone (Figure 3a) and the mean of the five GV + XCO2 annual regional fluxes (Figure 3b). The anthropogenic and biomass burning emissions are not included here. Figure 3c shows the mean and SD of the departure of each of the annual mean GV + XCO2 estimates from the GV-only result. The values are shown as the GV + XCO2 result minus the GV-only result. Similar to the approach presented in the previous section, we identified annual regional flux estimates with pronounced influence of GOSAT retrievals based on annual-mean UR values. Those are marked with asterisks in Figure 3b and colored in Figure 3c. URs of the temperate North America and Australia regions were below the threshold because the fluxes there were constrained more strongly by the surface-based data, whose uncertainties are smaller than those of the XCO2 retrievals. URs of upper boreal regions (> ~60°N) were low because GOSAT retrievals were only available during the local summer months. Oceanic URs were all below the threshold, and therefore, only the terrestrial results are presented in Figure 3c.
Integrated over the 11 continental-scale TransCom terrestrial regions [Gurney et al., 2002] (the names of the 11 regions are listed at the bottom of Figure 3c), the GV-only annual estimates in Figure 3a show a pattern of tropical land regions (tropical America, tropical Africa, and tropical Asia) being CO2 sources and Northern Hemisphere extratropics (temperate North America, Europe, and boreal Eurasia) being CO2 sinks, which agrees with the results of surface-based, long-term inversion studies previously reported [Baker et al., 2006; Gurney et al., 2008; Bruhwiler et al., 2011]. The GV + XCO2 result in Figure 3b shows the same pattern, but in the finer subcontinental-scale framework of 42 terrestrial regions (Figure 3c), it indicates uptake reductions or respiration enhancements in northern parts of the South America region (regions 15 and 16), southeastern boreal Eurasia (region 26), and northeastern temperate Asia (region 32), which partly account for the changes of the global terrestrial uptake values from the GV-only result shown in Table 2. It also shows uptake enhancements or respiration reductions in northern parts of the South Africa region (regions 23 and 24) and southwestern temperate Asia (region 30).
4 Discussion and Concluding Remarks
Among the departures of the high-UR GV + XCO2 flux estimates from the GV-only results presented in Figure 3c (colored), values for regions 16, 23, 24, and 26 are associated with small SDs (<0.1 GtC yr−1), indicating that the flux estimates are less dependent on the choice of XCO2 dataset. The spatial coverage of the five 5° × 5°-gridded XCO2 datasets over these regions was found to be similar throughout the analyzed year. The number of 5° × 5°-gridded data that cover region 16 for July 2009, for instance, is nearly the same among the five datasets (8 to 9; see Figure S3 for the spatial coverage). On the other hand, the departures for the remaining colored regions (15, 17, 18, 22, and 29 through 32) are variable, with SDs greater than ~0.2 GtC yr−1. The error bars of the values for regions 18, 22, 29, and 31 cross the zero departure line in Figure 3c, showing that the sense of the five departure values (enhancement or reduction) was not uniform in these cases. The larger SDs may be linked to the following: (1) the agreement among XCO2 retrievals within and around these regions, which did not appear in Figures 1 and S2, was difficult to reach, and/or (2) the horizontal distribution of the number of available XCO2 retrievals was quite different from dataset to dataset. While the former link remains unclear, the spatial coverage of the five 5° × 5°-gridded XCO2 datasets was found to differ from one dataset to another, particularly over the temperate Asia regions (see Figure S3). The number of 5° × 5°-gridded data that cover region 32 (temperate Asia NE) in July 2009, for instance, varied from 6 to 20, and the number of individual XCO2 values (not averaged to monthly-gridded values) counted in the same region and month ranged from 57 to 161 (see Figure S1 for the distribution differences).
How strongly fluxes are constrained in the inversion (as reflected in UR values) depends on the number and geographical locations of the observations and the data uncertainty prescribed to them. The influence of differences in horizontal data coverage on a posteriori flux estimates has been addressed in previous surface-data-based inversion studies by Law et al.  and Bruhwiler et al. . The implication is that the impact of the differences in the number of XCO2 retrievals would likely be more pronounced if the retrievals were processed in the inversion without the data-number regularization applied in the present study. A check on the sensitivity of SDs of the departures (shown in Figure 3c) to changes in the minimum uncertainty for the XCO2 retrievals reveals that with a reduction by 1 ppm (from 3 to 2 ppm, meaning more constraint exerted by the XCO2 retrievals), SDs of the temperate Asia departures increase by ~23%. Care should be taken in analyzing flux estimates of regions in which the number of XCO2 retrievals varies largely from dataset to dataset.
We herein demonstrated that GOSAT XCO2 retrievals can improve the ability of the inversion to resolve surface CO2 sources and sinks and therefore have the potential to provide carbon cycle researchers with new information on the magnitude and distribution of the fluxes for poorly sampled regions in South America, Africa, and Asia that remained practically unresolved before. However, efforts at reducing the inter-dataset differences, which include analyzing and correcting the biases outside the TCCON reference data sites as well as removing the biases themselves at the level of retrieval algorithm development/tuning, are needed to increase the consistency of GOSAT-based CO2 flux estimates.
The GOSAT Project is a joint undertaking of three organizations: the Japan Aerospace Exploration Agency, the National Institute for Environmental Studies (NIES), and the Japanese Ministry of the Environment. The authors would like to thank the members of the GOSAT Project for their contribution to this work. The computational resources were provided by NIES. The meteorological data used in the forward transport modeling were provided from the cooperative research project of the JRA-25 long-term reanalysis by Japan Meteorological Agency and Central Research Institute of Electric Power Industry. We thank P. Wennberg, R. Kivi, N. Deutscher, T. Warneke, R. Sussmann, D. Griffith, and J. Robinson for making their TCCON measurements available for this study. R.J.A. was sponsored by U.S. Department of Energy, Office of Science, Biological and Environmental Research programs and performed at Oak Ridge National Laboratory under U.S. Department of Energy contract DE-AC05-00OR22725. A.B. was supported through the Emmy-Noether programme of Deutsche Forschungsgemeinschaft, grant BU2599/1-1 (RemoTeC). The authors would like to thank the anonymous reviewers for their helpful and constructive comments that contributed to improving the manuscript.
The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.