This report is the second in a series of companion papers describing the effects of atmospheric light scattering in observations of atmospheric carbon dioxide (CO2) by the Greenhouse gases Observing SATellite (GOSAT), in orbit since 23 January 2009. Here we summarize the retrievals of column-averaged dry air mole fractions of CO2 (XCO2) from six previously published algorithms during 22 months of GOSAT operation from June 2009. First, we compare data products from each algorithm with ground-based remote sensing observations by the Total Carbon Column Observing Network (TCCON). Our GOSAT-TCCON coincidence criteria select satellite observations within a 5° radius of 11 TCCON sites. We have compared the GOSAT-TCCON XCO2 regression slope, standard deviation, correlation and determination coefficients, and global and station-to-station biases. The best agreement with TCCON measurements was found for NIES 02.xx and RemoTeC. Next, the impact of atmospheric light scattering on XCO2 retrievals was estimated for each data product using scan-by-scan retrievals of light path modification with the photon path length probability density function (PPDF) method. After a cloud pre-filtering test, approximately 25% of GOSAT soundings processed by NIES 02.xx, ACOS B2.9, and UoL-FP: 3G and 35% processed by RemoTeC were found to be contaminated by atmospheric light scattering. This study suggests that NIES 02.xx and ACOS B2.9 algorithms tend to overestimate aerosol amounts over bright surfaces, resulting in an underestimation of XCO2 for GOSAT observations. Cross-comparison between algorithms shows that ACOS B2.9 agrees best with NIES 02.xx and UoL-FP: 3G, while RemoTeC XCO2 retrievals are in best agreement with NIES PPDF-D.
 Densely sampled global observations of carbon dioxide (CO2) from space, with sensitivity throughout the entire vertical column down to the planetary surface, are recognized as being important for improving our understanding of the spatial and temporal distributions of CO2 in the atmosphere. With the launch of the Greenhouse gases Observing SATellite (GOSAT) that has been in orbit since 23 January 2009, a large number of high-resolution spectroscopic observations of reflected sunlight are available from the National Institute for Environmental Studies (NIES).
 A number of algorithms have been developed in different groups throughout the world to process the GOSAT data to systematically retrieve global and temporal distributions of the gas amounts. The algorithms are focused on the retrievals of column-averaged gas abundance to yield the column-averaged dry air mole fractions of atmospheric carbon dioxide (XCO2) along with, from most algorithms, methane (XCH4). In particular, the NIES and the National Aeronautics and Space Administration (NASA) Atmospheric CO2 Observations from Space (ACOS) projects have routinely processed the GOSAT data with their own operational algorithms to provide the standard data products. The NIES and ACOS algorithms are described in Yoshida et al. [2011, 2012a], and O'Dell et al.  and Crisp et al. , respectively. In addition, research products are provided from the University of Leicester Full Physics (UoL-FP: 3G) [Connor et al., 2008; Boesch et al., 2011; Cogan et al., 2012], the Netherlands Institute for Space Research (SRON)/the Karlsruhe Institute of Technology (KIT) (Remote sensing of greenhouse gases for carbon cycle modeling, RemoTeC) [Butz et al., 2009; 2010; 2011], and NIES (PPDF-D) [Bril et al., 2007; Oshchepkov et al., 2008, 2009, 2011]. Both ACOS B2.9 and UoL-FP: 3G retrieval methods evolved from the algorithm originally developed for the Orbiting Carbon Observatory (OCO) mission that was lost due to a launch vehicle malfunction on 24 February 2009.
 The major source of error in retrieving gas amounts from space-based measurements of reflected sunlight is atmospheric light scattering. Even in clear-sky conditions, high-altitude subvisible cirrus or aerosols can introduce large biases in gas retrievals [O'Brien and Rayner, 2002; Dufour and Bréon, 2003; Mao and Kawa, 2004; Houweling et al., 2005; Aben et al., 2007; Oshchepkov et al., 2008, 2009; Reuter et al., 2010]. Significant effort has been undertaken to reduce this bias in the GOSAT retrieval algorithms. Due to the initial developments by the independent research groups, these algorithms apply different cloud pre-screening and post-processing filters to remove satellite soundings that are highly contaminated by atmospheric light scattering. They also explore different basic approaches, aerosol and cloud models, and prior assumptions to account for these effects for the remaining observations.
 It is generally agreed that the primary source of errors due to atmospheric light scattering is the uncertainty in the modification of the light path through the atmosphere [e.g., Oshchepkov et al., 2011]. To quantify and control the light path modification in space-based remote sensing of greenhouse gases, Bril et al.  and Oshchepkov et al.  have proposed including the photon path length probability density function (PPDF) in the retrieval process. The PPDF-based method provides rapid data processing because the light path modification is similar within each GOSAT band. A small number of PPDF parameters, which are representative of the optical path lengthening and shortening, are retrieved from measurements of radiance spectra in the molecular oxygen (O2) A-band. These parameters approach zero when the impact of atmospheric light scattering on the gas retrievals is negligible. Other algorithms set upper limits on the total aerosol/cloud optical depth and examine the difference between retrieved and prior surface pressure to exclude cases in which large contributions of atmospheric light scattering make the gas retrievals impractical.
 At present, all GOSAT retrieval algorithms are under continuing active development to make the results of actual GOSAT data processing useful for surface flux inversion. According to several theoretical studies, the user requirements for the GOSAT retrieval quality are rather stringent [Rayner and O'Brien, 2001; Houweling et al., 2004; Miller et al., 2007; Chevallier et al., 2007]; for example, regional biases of a few tenths of a part per million could hamper surface flux inversions [Miller et al., 2007; Chevallier et al., 2007]. An important aspect of validating the GOSAT retrievals and meeting the user requirements is the use of the ground-based high-resolution Fourier transform spectrometer (FTS) measurements from the Total Carbon Column Observing Network (TCCON) [Washenfelder et al., 2006; Wunch et al., 2011a], whose uncertainties in XCO2 are expected to be within 0.8 ppm (2σ) [Wunch et al., 2010; Deutscher et al., 2010; Messerschmidt et al., 2010]. Several studies have compared GOSAT gas retrieval products from different algorithms against TCCON measurements [Morino et al., 2011; Butz et al., 2011; Wunch et al., 2011b; Parker et al., 2011; Oshchepkov et al., 2012; Cogan et al., 2012]. Most of these studies find encouraging validation results using different periods of GOSAT observations, GOSAT-TCCON coincidence criteria, and equations to represent statistical characteristics of the comparison. It is also important to interpret the GOSAT-TCCON XCO2 discrepancies for further algorithm improvements. One of the main reasons that TCCON has become a reliable reference source of greenhouse gas measurements is the direct solar-viewing geometry, which virtually eliminates the impact of atmospheric light scattering on the measurements. In contrast, space-based observations, such as those from GOSAT, are often contaminated by atmospheric light scattering, and hence GOSAT-TCCON comparisons should include close analysis of these effects.
 In our recent companion paper [Oshchepkov et al., 2012] (hereafter referred to as Part I), we have applied the PPDF-based method to evaluate the light path modification from GOSAT observations. For a representative statistical GOSAT-TCCON comparison, we collected weekly mean GOSAT data within a rather large 15° latitude by 45° longitude grid box (both over land and ocean) centered at each TCCON station. To constrain the natural variability in CO2 within each sampling domain, we excluded observations for which the NIES atmospheric transport model (NIES-TM) [Belikov et al., 2012] showed >1 ppm difference in CO2 column abundance from that of the TCCON. In particular, the results revealed the effects of optical path lengthening over Northern Hemispheric sites, essentially from May to September of each year, and of light path shortening for the GOSAT sun-glint observations in tropical regions. We showed that these effects compromise accurate gas retrievals.
 This study is the second in a series of companion papers describing the effects of atmospheric light scattering on the observations of atmospheric carbon dioxide from space. The paper introduces a comparative analysis of six retrieval algorithms for the GOSAT SWIR data processing during 22 months of GOSAT operations from June 2009 to March 2011. We include an algorithm comparison against TCCON measurements, an algorithm cross-comparison, and analysis of the impact of atmospheric light scattering on the gas retrieval products.
 The paper is organized as follows. Section 2 briefly outlines the data (GOSAT, TCCON, atmospheric transport model) used in this study. We give an overview of the inversion schemes and basic specifications of each GOSAT retrieval algorithm in section 3. Section 4 compares the results of the GOSAT validation study from different algorithms in terms of CO2 seasonal variability (section 4.1) and a pairwise GOSAT–TCCON XCO2 statistical comparison (section 4.2). Section 5 analyzes the contribution of atmospheric light scattering to GOSAT data products represented by different algorithms over TCCON sites. We analyze algorithm cross-comparison in section 6 and summarize the results in section 7.
 The basic specifications of the GOSAT instrument [Hamazaki et al., 2005; Kuze et al., 2009; Nakajima et al., 2010] and TCCON measurements [Washenfelder et al., 2006; Wunch et al., 2011a], as well as the atmospheric transport model (NIES-08.1i) [Belikov et al., 2012], were outlined in Part I. In this section, we specify observation conditions and GOSAT-TCCON coincidence criteria used in this study.
 We analyzed L2 data (XCO2) derived from 22 months of GOSAT operation from June 2009 to March 2011. In this period, the following versions of L1B (radiance spectra) data derived from the thermal and near-infrared sensor for carbon observation-Fourier transform spectrometer (TANSO-FTS) [Kuze et al., 2009] have been applied to L2 data processing: V050.050, V080.080, V100.100, and V110.110. Suto et al.  give details of the updates applied through these versions.
 When comparing against TCCON measurements, we considered only GOSAT observations over land, selected within a 5° radius latitude/longitude circle centered on each TCCON site. To increase its dynamic range, the GOSAT TANSO-FTS instrument collects data over land in one of two gain states. Over most surfaces, it uses “high gain” (Gain H), but over bright desert surfaces (e.g., Sahara, central Australia), and for solar calibration, it uses “medium gain” (Gain M). There appears to be an offset in the calibration between Gain H and Gain M channels of Band 1 (O2 A-band) that can introduce a −5 hPa bias in the optical path length and surface pressures between these gain states. Only data collected in Gain H are used here.
 Eleven TCCON sites have been selected for the GOSAT validation study, namely, eight sites in the Northern Hemisphere: Bialystok, Poland (53.2°N, 23.1°E); Bremen, Germany (53.1°N, 8.85°E); Garmisch, Germany (47.5°N, 11.1°E); Lamont, USA (36.6°N, 97.5°W); Orleans, France (48.0°N, 2.11°E); Park Falls, USA (45.9°N, 90.3°W); Sodankylä, Finland (67.4°N, 26.6°E); and Tsukuba, Japan (36.0°N, 140.2°E); and three sites in the Southern Hemisphere: Darwin, Australia (12.4°S, 130.9°E); Lauder, New Zealand (45.0°S, 169.7°E); and Wollongong, Australia (34.4°S, 150.9°E). The ground-based TCCON data considered here are mean values measured within ±1 h of the GOSAT overpass time (around 13:00 local time).
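 The coincidence criteria above can be illustrated with a minimal Python sketch. It is not the code used in this study. It assumes that the 5° radius is measured as a great-circle angular distance (the text does not specify the metric), and the dictionary keys, function names, and the three-site subset are hypothetical choices made for illustration.

```python
import math

# Subset of the TCCON site coordinates listed above (deg N, deg E).
SITES = {
    "Lamont": (36.6, -97.5),
    "Tsukuba": (36.0, 140.2),
    "Darwin": (-12.4, 130.9),
}


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle angular separation in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def is_coincident(sounding, site, radius_deg=5.0):
    """Keep land soundings in Gain H within radius_deg of the site.

    The great-circle metric is an assumption; `sounding` is a hypothetical
    dict with keys "lat", "lon", "land", and "gain".
    """
    site_lat, site_lon = SITES[site]
    distance = angular_distance_deg(sounding["lat"], sounding["lon"], site_lat, site_lon)
    return sounding["land"] and sounding["gain"] == "H" and distance <= radius_deg


def tccon_mean_near_overpass(tccon, overpass_hour, window_h=1.0):
    """Mean TCCON XCO2 within +/- window_h hours of the overpass.

    `tccon` is a list of (local_hour, xco2_ppm) pairs; returns None if empty.
    """
    values = [x for hour, x in tccon if abs(hour - overpass_hour) <= window_h]
    return sum(values) / len(values) if values else None
```

 Each retained GOSAT sounding would then be paired with the value returned by `tccon_mean_near_overpass` for the same site and day.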
2.3 NIES TM
 The NIES global atmospheric tracer transport model was used in this study, primarily to fill gaps in the seasonal variability of XCO2 when TCCON data were not available. The present version (NIES-08.1i) of the NIES TM [Belikov et al., 2012] has a horizontal resolution of 2.5° × 2.5° and implements a 32-level flexible hybrid sigma-isentropic (σ-θ) vertical coordinate system. This combines both terrain-following and isentropic levels switched smoothly near the tropopause. The vertical transport in the stratosphere is calculated from the climatological heating rate derived from JRA-25/JCDAS reanalysis [Onogi et al., 2007]. The heating rate was adjusted to fit to the observed mean age of air in the stratosphere as described by Belikov et al. .
 The model incorporates fossil fuel CO2 fluxes derived from the Emission Database for Global Atmospheric Research (EDGAR) 1998 distribution [Olivier and Berdowski, 2001] scaled by the growth rate obtained from the Carbon Dioxide Information Analysis Center (CDIAC) [Boden et al., 2009]. All biospheric source/sink distributions over land and ocean are represented by the climatological inversion flux derived using inverse modeling with 12 TransCom3 models and observational data obtained from GLOBALVIEW-CO2 at 87 sites during 1999–2001 [Miyazaki et al., 2008].
 The XCO2 calculated from NIES TM tracer distributions was compared with measurements acquired at 12 TCCON ground-based FTS sites for the period from January 2009 to January 2011. The model was able to reproduce the seasonal and inter-annual variability of XCO2 with correlation coefficients of 0.8–0.9. A comparison of modeled data with TCCON observations revealed model biases of ±0.8 ppm for XCO2 except at the Sodankylä site, for which the model showed a larger bias of 1.21 ppm [Belikov et al., 2012].
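 For readers who want to reproduce this kind of model-versus-TCCON summary, the sketch below computes a mean bias, the standard deviation of the differences, and the Pearson correlation coefficient for paired XCO2 values. The exact equations used by Belikov et al. are not given here, so the conventional definitions are assumed, and the function name is hypothetical.

```python
import math


def comparison_stats(model, reference):
    """Return (bias, std of differences, Pearson r) for paired samples.

    Conventional definitions are assumed: bias is the mean of
    (model - reference) and the standard deviation uses n - 1.
    """
    n = len(model)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally sized samples with n >= 2")

    diffs = [m - r for m, r in zip(model, reference)]
    bias = sum(diffs) / n
    std = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))

    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(model, reference))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    corr = cov / math.sqrt(var_m * var_r)
    return bias, std, corr
```

 The same function applies to GOSAT-TCCON pairs once the coincidence criteria have been applied.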
3 Overview of Algorithms
 Most of the retrieval algorithms for the GOSAT data processing include the numerical solution of the radiative transfer equation when modeling measured radiance spectra; they are often referred to as “full physics” algorithms. The PPDF-based algorithm utilizes the photon path length statistical characteristics and the equivalence theorem [Bennartz and Preusker, 2006]. In contrast to “full physics” algorithms, the PPDF-based method considers the optical path modification through the atmosphere, which is similar within each GOSAT band. This minimizes the number of parameters to account for atmospheric light scattering and excludes time-consuming line-by-line radiative transfer calculations. The basic specifications of each algorithm in regard to data processing, aerosol and cloud modeling, and post-processing filtering are summarized in Table 1.
Table 1. The Basic Specifications of Each Algorithm (a)
(a) In regard to data processing (spectral interval at each band (Δν), SNR), aerosol and cloud modeling (the number of gas and aerosol/cloud layers, the number of aerosol and cloud components (AC)), and post-processing filtering (the total aerosol and cloud optical depth (AOD), the absolute difference between retrieved and prior surface pressure (ΔP), degree of freedom for signal (DFS), and a posteriori error of XCO2 (ΔXCO2)).
[Table body not recoverable from the source. Surviving row labels: χ2 in bands 1, 2, and 3; number of gas layers; number of aerosol and cloud layers; solar irradiance spectrum. Surviving cell entries: G. C. Toon model (five entries) and R. Kurucz model (one entry) for the solar irradiance spectrum; “Cloud pre-screening, coherence test” (two entries); “Simultaneously with CO2” (three entries).]
 All algorithms described in this work have been developed within the optimal estimation or maximum a posteriori rule depending on interpretation of prior information. These algorithms minimize a cost function in terms of the weighted least squares deviation between the observed and modeled radiance spectra for the GOSAT SWIR bands under constraints on the state vector of desired parameters. Although some algorithms offer an additional a posteriori bias correction, this paper uses the raw XCO2 retrievals from GOSAT single soundings with no a posteriori bias correction.
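 As an illustration only, a cost function of this kind is commonly written in the following generic optimal-estimation form. The symbols are assumptions introduced here and do not appear in the source: y is the measured radiance vector, F(x) the forward model, x the state vector, x_a its prior, and S_ε and S_a the measurement-noise and prior covariance matrices. The individual algorithms differ in how each term is constructed.

```latex
J(\mathbf{x}) =
  \left[\mathbf{y} - \mathbf{F}(\mathbf{x})\right]^{T}
  \mathbf{S}_{\varepsilon}^{-1}
  \left[\mathbf{y} - \mathbf{F}(\mathbf{x})\right]
  + \left(\mathbf{x} - \mathbf{x}_{a}\right)^{T}
  \mathbf{S}_{a}^{-1}
  \left(\mathbf{x} - \mathbf{x}_{a}\right)
```

 The first term is the weighted least squares deviation between observed and modeled spectra, and the second term expresses the constraint on the state vector.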
3.1 ACOS B2.9
 Build 2.9 (B2.9) of the ACOS operational algorithm was used to routinely process GOSAT observations collected between April 2009 and April 2012 by the NASA ACOS project. Earlier versions of this retrieval algorithm are described by Crisp et al.  and O'Dell et al. , and its performance on GOSAT data is described by Crisp et al. . The latter work also describes a few key updates incorporated into B2.9, including scaling of the O2 A-band absorption cross sections by 1.025 to eliminate a 10 hPa surface pressure bias, retrieving a zero-level offset in the O2 A-band to reduce the impact of the TANSO-FTS band 1 nonlinearity, and correcting a few coding and implementation issues that introduced errors in the B2.8 product. Other changes in B2.9 include updated gas absorption cross sections (ACOS v3.3) for the 1.61 and 2.06 µm CO2 bands that incorporate revised line strengths [Toth et al., 2008], widths [Predoi-Cross et al., 2010], isotopic abundances [Wunch et al., 2010], and line mixing parameters [Hartmann et al., 2009].
 Only those soundings that pass a cloud filter based on the O2 A-band are processed through the full retrieval; the cloud filter is described in more detail in Taylor et al. . O'Dell et al.  found that soundings contaminated by low water clouds or aerosol layers of significant optical depth (greater than 1) occasionally pass this pre-screening filter. One important feature of the ACOS retrieval algorithm is that the observation (versus a priori) part of the cost function is constructed, not using the true instrument noise to weight each channel, but rather with an ad hoc noise term that is a function of the continuum signal as well as the true noise [Crisp et al., 2012]. This “empirical noise” is used to ensure that persistent spectral residuals associated with errors in gas absorption coefficients, and other forward model errors do not produce values of the reduced χ2 in each band that increase with the signal level; it has the primary effect of changing the relative weights of the three spectral bands with respect to each other as well as to the prior.
 To account for atmospheric light scattering, the ACOS retrieval forward model solves for an admixture of four atmospheric aerosol and cloud components, each with fixed wavelength-dependent optical properties. These four types are chosen to cover a wide range of optical properties, such that when combining them appropriately, the retrieval is expected to reproduce virtually any profile of atmospheric light scattering in all three spectral bands. The four airborne particle types chosen are water cloud, ice cloud, and two different types of aerosol, corresponding to the aerosol types “2b” and “3b” of Kahn et al. . The liquid water cloud has a Gamma size distribution [Hansen and Travis, 1974] of spherical drops with an effective radius of 8 µm, while the ice water cloud optical properties are taken from the model of Baum et al. [2005a, 2005b], assuming an effective particle radius of 70 µm. Type “2b” Kahn aerosol is a mixture of sulfate, sea salt, and coarse and fine-mode dust, while type “3b” Kahn aerosol is a mixture of sulfate, sea salt, and carbonaceous particles. The wave number dependency of light scattering properties is fixed for each particle component. The extinction efficiency and single scattering albedo of each type in the three spectral bands can be found in O'Dell et al. . For the version B2.9 retrieval algorithm, the state vector parameters of these scatterers are vertical profiles of the logarithm of the extinction optical depth for each type; the logarithm is used to avoid negative optical depths. The prior total cloud plus aerosol optical depth (total AOD) is 0.15. Rayleigh light scattering is included in the O2 A-band according to the parameterization of Bodhaine et al. . The radiative transfer code employed in the forward model is fully polarized and accurate to approximately 0.1% for even relatively optically thick scenes [O'Dell, 2010].
 A post-processing filter is used to remove retrievals of bad or questionable quality. The filtering parameters include the number of diverging steps taken in the iterative retrieval, the reduced χ2 of the residuals in each GOSAT SWIR band (χ2 < 2 in both weak and strong CO2 absorption bands and χ2 < 1.4 in the O2 A-band), the retrieved total AOD (≤ 0.15), the a posteriori error estimate, the calculated degrees of freedom for CO2, and the difference between the retrieved and prior surface pressure within 10 hPa [O'Dell et al., 2012]. The average throughput of the pre- and post-processing filters in the ACOS B2.8 algorithm is about 7% [Crisp et al., 2012]. The yield for B2.9 is similar for TANSO-FTS soundings collected in Gain H.
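 The numerical thresholds quoted above can be collected into a small screening function, sketched below in Python. This is not the ACOS implementation. It covers only the criteria for which the text gives a value; the limits on diverging steps, a posteriori error, and CO2 degrees of freedom are not stated here and are therefore omitted. The dictionary keys are hypothetical, and treating the 10 hPa pressure limit as inclusive is an assumption.

```python
def passes_acos_like_filter(retrieval):
    """Apply the thresholds quoted in the text to one retrieval.

    `retrieval` is a hypothetical dict. Criteria without a stated value
    (diverging steps, a posteriori error, CO2 degrees of freedom) are
    deliberately left out of this sketch.
    """
    delta_p = abs(retrieval["psurf_retrieved_hpa"] - retrieval["psurf_prior_hpa"])
    return (
        retrieval["chi2_o2a"] < 1.4            # reduced chi-square, O2 A-band
        and retrieval["chi2_weak_co2"] < 2.0   # weak CO2 absorption band
        and retrieval["chi2_strong_co2"] < 2.0 # strong CO2 absorption band
        and retrieval["total_aod"] <= 0.15     # retrieved total AOD
        and delta_p <= 10.0                    # assumed inclusive limit, hPa
    )
```

 Keeping each criterion on its own line makes it straightforward to tabulate which test rejects a given sounding.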
3.2 NIES 01.xx and NIES 02.xx
 Details of the NIES version 01.xx operational algorithm that routinely processed the GOSAT radiance spectra at the National Institute for Environmental Studies are outlined by Yoshida et al.  (where “xx” is defined corresponding to updated L1B radiance spectra products [Suto et al., 2011]). After pre-screening for cloud, using the TANSO-CAI cloud flag test [Yoshida et al., 2011], this algorithm additionally applies the TANSO-CAI spatial coherence test (over sea) and the TANSO-FTS 2 µm band test using the measurement radiance of the H2O-saturated absorption area of the 2.0 µm band (5150–5200 cm−1) to look for evidence of elevated cirrus clouds and to screen out the corresponding GOSAT scans [Yoshida et al., 2011].
 This algorithm divides the atmosphere into 15 layers between the surface and 0.1 hPa. The a priori gas profiles are calculated for every day of observation by the NIES TM [Maksyutov et al., 2008]. Prior variance-covariance matrices of gas profiles were evaluated by Eguchi et al.  by comparing 2008 NIES TM with observation-based reference data (GLOBAL VIEW 2008). To avoid unexpected strong constraints on the a priori values and to gain as much information from the observed spectra as possible, the original variance-covariance matrices were multiplied by a factor of 100.
 The version 01.xx NIES algorithm retrieves aerosol optical depth assuming that it is uniformly distributed from the surface to 2 km altitude. The aerosol optical properties are calculated from an offline three-dimensional aerosol transport model, the Spectral Radiation-Transport Model for Aerosol Species (SPRINTARS) [Takemura et al., 2000], which simulates aerosol mass concentration distributions (soil dust, carbonaceous, sulfate, and sea-salt aerosols). The wave number dependency of the aerosol optical depth, as well as the single-scattering albedo and phase function are also calculated from SPRINTARS and were fixed during the retrieval. The aerosol optical depth is simultaneously retrieved with the gas concentrations, surface pressure, surface albedo, temperature, and stretch factor from the O2 A-band (12,950–13,200 cm−1), the 1.6-µm CO2 band (6180–6380 cm−1), and the 1.67-µm CH4 band (5900–6150 cm−1). Since the Jacobian for aerosol optical depth in CO2 and CH4 bands showed high correlation with Jacobians for gas concentrations, the aerosol optical depth is mainly retrieved from the oxygen A-band. A post-processing quality assessment accepts only those retrievals with χ2 < 3, degrees of freedom for signal (DFS) ≥ 1, retrieved aerosol optical depth at a wavelength of 1.6 µm (AOD) ≤ 0.5, and the signal-to-noise ratio of polarization synthesized spectra (SNR) > 100.
 An improved NIES algorithm, version 02.xx, has been recently developed [Yoshida et al., 2012a; 2012b] to avoid the large negative bias and scatter seen in the retrieved XCO2 with the version 01.xx algorithm (section 4). First of all, in version 02.xx, the solar irradiance database was updated to the G. C. Toon line list because the remaining terrestrial absorption structure in the solar spectrum of R. Kurucz introduced a non-negligible bias [Uchino et al., 2012]. To remove the surface pressure bias, the absorption cross section of the O2 A-band was scaled by a factor of 1.01. Aerosols simulated by SPRINTARS are categorized in NIES 02.xx into fine and coarse-mode particles. The logarithms of the vertical profile of the number density for both components are retrieved, instead of the total aerosol optical depth retrieved in NIES 01.xx. Further, the zero-level offset for the O2 A-band is retrieved in order to reduce the impact of the response nonlinearity in TANSO-FTS band 1 [Suto et al., 2011; Butz et al., 2011; Crisp et al., 2011]. These parameters are simultaneously retrieved with the gas concentration from the above-mentioned spectral windows and the 2.0 µm CO2 band (4800–4900 cm−1). Within the retrieval, the true noise is amplified empirically to reduce the impact of systematic spectroscopic residuals; the amplification factor is a function of the signal-to-noise ratio, which ensures that the χ2 itself is not a function of the signal-to-noise ratio.
 The post-processing quality assessment for version NIES 02.xx algorithm included restriction to retrievals with degrees of freedom for the signal (DFS) ≥ 1, the retrieved aerosol optical depth at 1.6 µm ≤ 0.1, differences between the prior and retrieved surface pressure ≤20 hPa, and the signal-to-noise ratio ≥70 (Table 1).
 Although NIES 02.xx shows smaller bias and scatter than 01.xx (see following sections), differences remain between Gain-H/M and land/ocean retrievals. Further investigations are necessary to reduce these differences. In addition, the possibility of using retrievals affected by cirrus, instead of excluding them as is currently done, is under investigation.
3.3 NIES PPDF-D
 The photon path length probability density function method (currently built on version PPDF-D) has been developed at NIES. This version of the PPDF-based method incorporates PPDF retrievals in the O2 A-band (12,950–13,190 cm−1) as a pre-screening step to identify those satellite soundings that are not significantly affected by atmospheric light scattering. CO2 is then retrieved from these observations using the CO2 band 2 (6192–6368 cm−1) and band 3 (4815–4885 cm−1) on the basis of differential optical absorption spectroscopy (DOAS) [Buchwitz et al., 2000; Frankenberg et al., 2005; Oshchepkov et al., 2008].
 The PPDF-based radiative transfer model describes atmospheric light scattering by optical path modification. The light path is mainly expressed through two PPDF parameters, α and ρ, which are representative of the optical path shortening and the optical path lengthening, respectively. These parameters can be interpreted as follows: α is the relative reflectivity of the aerosol layer, i.e., the ratio of photons that were backscattered by the layer and detected by the satellite to the total number of detected photons; ρ is the scaled first moment of the PPDF within the aerosol layer [Oshchepkov et al., 2012]. In section 5, we estimate PPDF parameters for the GOSAT observations used for CO2 retrievals by all algorithms.
 Details of the PPDF-D version of the retrieval algorithm have been published elsewhere [Bril et al., 2007; Oshchepkov et al., 2008, 2009, 2011]. The key features of this algorithm were outlined in Part I and Table 1. In this paper, we use an additional post-processing filter that restricts spectral variability in surface albedo between band 1 (Γ1) and band 3 (Γ3):
 We apply this criterion to exclude observations with strong inter-band variations of PPDF parameters whose estimation in the oxygen A-band by PPDF-D algorithm might not be appropriate to accurately represent the optical path modification in the target CO2 absorption bands (e.g., for observations over snow and sea ice). In section 5.2, we also apply this filter to all other algorithms to demonstrate the improvement of the retrievals.
 Aerosol and cloud screening using two PPDF parameters offers an advantage over the light path detection based on the surface pressure retrievals in the oxygen A-band. This is because the deviation of the retrieved surface pressure from the meteorological prior characterizes only the integrated effect of the light path. Therefore, the GOSAT scans with a non-modified retrieved surface pressure (i.e., non-modified integrated optical path in the oxygen A-band) under a fortuitous combination of aerosol, cloud, and surface optical properties might be misinterpreted as clear-sky scenes when effects of optical path lengthening and shortening compensate each other [Oshchepkov et al., 2008; O'Dell et al., 2012]. The PPDF-based method excludes such misinterpretation because it retrieves both light path shortening and light path lengthening as separate parameters. Furthermore, negligible light path modification in the oxygen A-band does not imply that there is a similar effect in the target gas bands. In this respect, the PPDF-based method permits inter-band transformation of PPDF parameters using the surface albedo information [Bril et al., 2007; Oshchepkov et al., 2008]. Band-to-band variability of the optical path modification will be allowed in the new version of the retrieval algorithm (PPDF-S), which simultaneously estimates target gas amount and PPDF parameters from all available GOSAT SWIR bands. The PPDF-S version processing is currently underway in the NIES GOSAT project.
3.4 RemoTeC
 The RemoTeC retrieval algorithm has been developed at SRON-Netherlands Institute for Space Research and at the Karlsruhe Institute of Technology (KIT) and utilizes the radiative transfer model developed by Hasekamp and Landgraf [2002, 2005] and by Hasekamp and Butz . The inversion scheme has been described in detail by Butz et al. [2009, 2010, 2011]. RemoTeC is designed to simultaneously infer gas concentrations and particle light scattering characteristics in the observed atmosphere. Aerosols are parameterized as a single atmospheric layer with a Gaussian vertical distribution of the particle optical depth. The location of the distribution peak zs (under the assumed distribution width) is retrieved simultaneously with the total column number density and the particle size parameter αs. The particle number density size distribution is a power law, n(r) ∝ r^(−αs), with r the particle radius. The particle complex refractive index is fixed at 1.400 − i0.003. The algorithm allows simultaneous retrieval of the 12-layer profiles of CO2 and CH4 column number densities.
 The present retrieval setup exploits radiances in four windows covering the O2 A-band (12,920–13,195 cm−1), a weakly absorbing CO2 band (6170–6278 cm−1), a CH4 band (6045–6138 cm−1), and a strongly absorbing CO2 band (4806–4896 cm−1). Calculation of molecular absorption by O2 and CO2 includes line-mixing spectroscopy and considers collision-induced absorption by O2 [Tran and Hartmann, 2008] and line mixing for CO2 [Lamouroux et al., 2010]. Absorption cross sections of CH4 and H2O are modeled using HITRAN 2008 line parameters in combination with a Voigt line shape model. Solar Fraunhofer lines are represented through an empirical line list provided by G. C. Toon, JPL, USA. O2 absorption cross sections in the A-band are scaled by a factor of 1.030 to account for a spectroscopic bias. An important feature of RemoTeC is that surface pressure is not retrieved.
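 Because the algorithm descriptions in this paper mix wave number windows (cm−1) with band names given in micrometers, a short conversion sketch may help in relating the two. It uses the standard relation λ [µm] = 10^4 / ν [cm−1] for vacuum wavelengths. The dictionary and function names are hypothetical, and the window limits are the RemoTeC values quoted above.

```python
# RemoTeC spectral windows quoted above, in wave numbers (cm^-1).
WINDOWS_CM1 = {
    "O2 A-band": (12920.0, 13195.0),
    "weak CO2": (6170.0, 6278.0),
    "CH4": (6045.0, 6138.0),
    "strong CO2": (4806.0, 4896.0),
}


def wavenumber_to_wavelength_um(nu_cm1):
    """Convert a wave number in cm^-1 to a vacuum wavelength in micrometers."""
    return 1.0e4 / nu_cm1


def window_in_um(nu_low_cm1, nu_high_cm1):
    """Return the (short, long) wavelength limits of a window in micrometers.

    The higher wave number maps to the shorter wavelength.
    """
    return (
        wavenumber_to_wavelength_um(nu_high_cm1),
        wavenumber_to_wavelength_um(nu_low_cm1),
    )


if __name__ == "__main__":
    for name, (low, high) in WINDOWS_CM1.items():
        short, long_ = window_in_um(low, high)
        print(f"{name}: {short:.3f}-{long_:.3f} um")
```

 With these limits, the O2 A-band window falls near 0.76–0.77 µm, and the weak and strong CO2 windows fall near 1.6 µm and 2.06 µm, respectively.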
 The RemoTeC retrievals shown here are processed and filtered as proposed by Butz et al., . Most importantly, the TANSO-CAI cloud flags are used to filter out cloudy scenes. Post-processing filters include convergence criteria such as number of iterations, degrees of freedom for CO2 > 1, and quality of the fit (χ2 < 4), as well as filters based on the retrieved scattering optical thickness (AOD < 0.25) and on a combination of retrieved aerosol parameters zs, αs, and AOD (). This latter filter screens out difficult scenes where many large particles were retrieved at high altitudes.
3.5 UoL-FP: 3G
 The University of Leicester Full-Physics (UoL-FP, currently version 3G) retrieval method is based upon the OCO Level 2 algorithm [Connor et al., 2008; Boesch et al., 2011]. Like the ACOS model, the UoL-FP forward model employs the LIDORT (Linearized Discrete Ordinate Radiative Transfer) model combined with a fast two-orders-of-scattering vector radiative transfer code [Natraj et al., 2008]. In addition, the code uses the low-streams interpolation functionality [O'Dell, 2010] to accelerate the radiative transfer calculations. The algorithm was modified to allow retrievals from GOSAT spectra and has also been used to successfully retrieve XCH4 [Parker et al., 2011]. The retrievals use a 20-level atmosphere, retrieving profiles of CO2, extinction profiles for cirrus ice clouds, liquid water clouds, and the same two aerosol types used by the ACOS algorithm. It also retrieves surface pressure, surface albedo and its spectral slope, scaling factors for CH4, H2O and temperature profiles, additive intensity offset in the O2 A-band, and spectral shift/stretch. The prior for the aerosol extinction profile assumes a Gaussian-shaped profile with a height and width of 2 km and a total optical depth of 0.05 for each type. The optical properties for the two aerosol types are taken from Kahn et al. and represent carbonaceous/dusty continental and carbonaceous/sooty continental aerosol mixtures, thereby providing the retrieval with two different optical properties that are used to describe the unknown scene-dependent aerosol. The a priori extinction profile for cirrus is also Gaussian-shaped but with height and width that are latitudinally dependent, based on Eguchi et al., with a total optical depth of 0.05. The cloud optical properties are taken from Baum et al. [2005b] for an effective radius of 60 µm. Aerosol and cloud extinction are retrieved as a log-value with an a priori covariance with a 1-σ uncertainty of a factor of 50 at each level.
UoL-FP: 3G uses the v3.2 OCO spectroscopy and solar irradiance model as described by Crisp et al. and the TCCON spectroscopy for CH4 and H2O, which is based on HITRAN 2008 but includes updates to H2O based on Toth [2005] and Jenouvrier et al. and to CH4 based on Frankenberg et al.
 Cloudy scenes were removed by selecting only observations where the difference between ECMWF surface pressure and surface pressure retrieved from a narrow window O2 A-band fit (13,056–13,074.8 cm−1) is less than 20 hPa. GOSAT scans with an SNR < 50 in each of the three spectral bands were removed from the processing. Similar to the ACOS retrieval, a post-processing filter based on χ2 of the fit residual, a posteriori error, surface pressure bias, aerosol and cloud optical depth, number of divergence steps, and a number of minor criteria is applied. Relatively loose thresholds are used for the filter to ensure good global coverage. To account for an observed bias in retrieved surface pressure, which is most likely due to deficiencies in the O2 A-band spectroscopy [Butz et al., 2011], the retrieved XCO2 is normalized with the observed surface pressure bias.
4 Comparison of GOSAT and TCCON XCO2
 In total, 8638 GOSAT single soundings from all GOSAT retrievals were detected within a 5° radius of 11 TCCON sites during the 22 months from June 2009 to March 2011. We use all these scans when comparing the seasonality of GOSAT XCO2 retrievals with TCCON measurements (section 4.1) and when performing an algorithm retrieval intercomparison (section 6). From these observations, 5561 GOSAT single soundings were available for the GOSAT-TCCON XCO2 pairwise comparison (section 4.2). The “Observation fraction” in Table 2 indicates the percentage of the total (5561) coincident observations available from each individual algorithm.
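The spatial part of the coincidence criterion can be illustrated with a short sketch. It assumes that the "5° radius" is a great-circle angular separation between the sounding footprint and the site; the function names are hypothetical, and the site coordinates in the usage lines are approximate values used only for illustration.

```python
import math


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle angular separation in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def is_coincident(snd_lat, snd_lon, site_lat, site_lon, radius_deg=5.0):
    """True if the sounding lies within radius_deg of the site."""
    return angular_distance_deg(snd_lat, snd_lon, site_lat, site_lon) <= radius_deg


# Usage with approximate coordinates for a mid-latitude site (illustrative only).
site = (36.6, -97.5)
near = is_coincident(38.0, -97.5, *site)   # about 1.4 degrees away
far = is_coincident(42.0, -97.5, *site)    # about 5.4 degrees away
```

An angular-distance test avoids the longitude stretching that a fixed latitude/longitude box would introduce at high-latitude sites.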
Table 2. Statistical Characteristics of the GOSAT XCO2 Retrievals From Six Algorithms Compared Against TCCON Measurements Using Single Scans, Daily Mean, and Weekly Mean XCO2 Data^a
^a All GOSAT soundings are collected over land within a 5° radius circle around 11 TCCON sites (Bialystok, Bremen, Darwin, Garmisch, Lamont, Lauder, Orleans, Park Falls, Sodankyla, Tsukuba, and Wollongong). The TCCON XCO2 data are mean values measured within ±1 h of the GOSAT overpass time. The statistical characteristics are: the number of GOSAT individual scans coincident with TCCON soundings (Nc), the number of averaged points (Na, days or weeks) meeting the coincidence criteria, the regression slope (a), bias (Bias), standard deviation (σ), determination coefficient (R2), Pearson's correlation coefficient (r), and interstation bias (i − Bias) between GOSAT and TCCON XCO2. Values in parentheses are derived after additional scan selection by spectral variability in albedo (equation (1)).
4.1 CO2 Seasonal Trends
 Figures 1-3 display the weekly mean seasonal variability in XCO2 from GOSAT (blue open symbols), TCCON (green closed symbols), and the NIES-TM (red crosses). Light blue crosses represent XCO2 retrievals from GOSAT single scans and the bars correspond to the XCO2 standard deviations for the GOSAT weekly mean data. We show the results over the Lamont site separately in Figure 1 as the GOSAT observations over this site have the largest sample size, due to multiple orbit overpasses within the coincidence criteria, comparatively clear skies, and special observation requests from the GOSAT Research Announcement (RA) activity [http://www.gosat.nies.go.jp/eng/proposal/proposal.htm]. The results for the other seven Northern Hemisphere TCCON sites (Bialystok, Bremen, Garmisch, Orleans, Park Falls, Sodankyla, Tsukuba), as well as for three Southern Hemisphere sites (Darwin, Lauder, Wollongong) are plotted together in Figures 2 and 3, respectively. Each of the six satellite retrieval products is shown in a separate panel. We display the number of weekly mean observations (Na), average bias (Bias), standard deviation (σ), and Pearson's correlation coefficient (r) between GOSAT and TCCON weekly mean XCO2 in each panel. These characteristics are derived from a weighted least squares fit by minimizing the perpendicular offset between GOSAT-TCCON coincident observations [York et al., 2004; Oshchepkov et al., 2012]. This derivation makes it possible to account for uncertainties in both the GOSAT and TCCON datasets. The weights are inversely proportional to the error variance of the data. The variances are defined by the GOSAT a posteriori retrieval error and by uncertainties in the TCCON coincident measurements.
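A straight-line fit with errors in both variables can be sketched as follows. This is a minimal illustration of the iterative scheme of York et al. [2004] for uncorrelated x and y errors, with weights equal to the inverse error variances; it is not necessarily identical to the processing used for the figures, and the function names and the numbers in the usage lines are hypothetical.

```python
def _weighted_means(x, y, wx, wy, b):
    """Combined weights W_i and weighted means of x and y for slope b."""
    w = [p * q / (p + b * b * q) for p, q in zip(wx, wy)]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    return w, xbar, ybar


def york_fit(x, y, sx, sy, tol=1e-12, max_iter=200):
    """Fit y = a + b*x given 1-sigma errors sx, sy (assumed uncorrelated)."""
    n = len(x)
    wx = [1.0 / s ** 2 for s in sx]  # weight of each x: inverse error variance
    wy = [1.0 / s ** 2 for s in sy]  # weight of each y: inverse error variance
    # Starting slope from ordinary least squares of y on x.
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    for _ in range(max_iter):
        w, xbar, ybar = _weighted_means(x, y, wx, wy, b)
        u = [xi - xbar for xi in x]
        v = [yi - ybar for yi in y]
        beta = [wi * (ui / q + b * vi / p)
                for wi, ui, vi, p, q in zip(w, u, v, wx, wy)]
        b_new = (sum(wi * bi * vi for wi, bi, vi in zip(w, beta, v))
                 / sum(wi * bi * ui for wi, bi, ui in zip(w, beta, u)))
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            break
    # Intercept from the weighted means evaluated at the final slope.
    _, xbar, ybar = _weighted_means(x, y, wx, wy, b)
    return ybar - b * xbar, b


# Usage with made-up weekly means (ppm); x plays the role of TCCON, y of GOSAT.
tccon = [390.1, 391.5, 392.2, 393.8, 394.9]
gosat = [389.6, 391.9, 391.8, 394.1, 394.3]
intercept, slope = york_fit(tccon, gosat, [0.4] * 5, [1.0, 0.8, 1.2, 0.9, 1.0])
```

Unlike ordinary least squares, this fit treats both data sets symmetrically: exchanging the roles of x and y (together with their errors) yields the reciprocal slope.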
 As evident from Figures 1 and 2, all GOSAT XCO2 products except NIES 01.xx perform reasonably in reproducing the temporal patterns in the Northern Hemisphere observed in the TCCON measurements and simulated by the NIES-TM throughout 2 years. Minima in late summer and maxima in spring are clearly reproduced in the seasonal cycle of CO2 column abundance. Note that the rather large scatter seen in Figure 2 (Northern Hemisphere sites) and Figure 3 (Southern Hemisphere sites) does not allow characterization of the GOSAT-TCCON discrepancy in XCO2 because the data in these two figures are collected from several sites. The standard deviation, bias, and correlation coefficient in the insets of Figures 1-3 are derived from the coincident station-to-station observations.
 The NIES 02.xx retrieval algorithm provided the smallest standard deviation (1.17 ppm) and the highest correlation coefficient (0.94) over Lamont (second row in right column of Figure 1). Here we also found the largest sample size (98 weeks) and the smallest bias (−0.31 ppm) for UoL-FP: 3G (second row of left column in Figure 1). As with all other algorithms, GOSAT-TCCON XCO2 standard deviations and correlation coefficients are somewhat degraded (1.71 ppm and 0.89 for NIES 02.xx retrieval algorithm) over other Northern Hemisphere sites (Figure 2), even though the total number of coincident GOSAT-TCCON observations (Na) is approximately twice as large compared to Lamont.
 Consistent with the TCCON measurements, GOSAT data over the three Southern Hemisphere sites, Darwin, Wollongong, and Lauder, show much weaker seasonal cycles of XCO2 (Figure 3) than those from the Northern Hemisphere sites. The largest correlation coefficient of 0.44 and the lowest standard deviation of <1.7 ppm are found for NIES 02.xx and ACOS B2.9 over these stations.
4.2 Pairwise GOSAT-TCCON XCO2 Comparison
 Figure 4 represents GOSAT(Y)-TCCON(X) XCO2 correlation diagrams from the six algorithms. Both XCO2 from GOSAT and from TCCON are weekly mean values. Also shown in the insets of each panel in Figure 4 are the number of coincident GOSAT-TCCON observations (Nc), the regression slope (a) (for the slope-intercept form of the linear regression), the coefficient of determination (R2) (goodness of fit), and other statistical characteristics mentioned above (σ, r) [section 4.1; Part 1]. Table 2 summarizes these characteristics for each data product, including GOSAT-TCCON comparisons of single scans, as well as the daily and weekly mean XCO2. In Table 2 we also present station-to-station bias variability: the interstation bias (i − Bias) and the standard deviation of the set of 11 individual biases from each TCCON site. Both interstation mean bias and the standard deviation i − Bias were calculated with diagonal 11 × 11 dimensional weight matrix
 which accounts for the number of coincident GOSAT and TCCON observations (Nci) and standard deviations (σi) from each ith individual TCCON site. In equation (2), δij is the Kronecker delta.
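Because a diagonal weight matrix reduces to one weight per site, the interstation statistics amount to a weighted mean and a weighted standard deviation of the 11 site biases. The sketch below illustrates this under a loudly stated assumption: the per-site weight is taken as Nci/σi², which is only one plausible choice built from the quantities named above and is not necessarily the form of equation (2). The function name is hypothetical.

```python
import math


def interstation_bias(biases, n_coincident, sigmas):
    """Weighted mean and weighted standard deviation of per-site biases.

    ASSUMPTION: weight_i = N_ci / sigma_i**2 (illustrative, not taken
    from equation (2) of the paper).
    """
    weights = [n / s ** 2 for n, s in zip(n_coincident, sigmas)]
    total = sum(weights)
    mean = sum(w * b for w, b in zip(weights, biases)) / total
    var = sum(w * (b - mean) ** 2 for w, b in zip(weights, biases)) / total
    return mean, math.sqrt(var)
```

With such weights, a site with many coincident soundings and low scatter dominates the mean, so a sparsely sampled station cannot drive the reported interstation bias on its own.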
 Compared to our previous validation study of PPDF-D retrievals [Oshchepkov et al., 2012] (Part 1), the exclusion of observations over sea and the much tighter coincidence criteria (5° radius latitude/longitude circle versus 15° latitude × 45° longitude grid box around TCCON sites in Part 1), and therefore the smaller sample size, used in this paper somewhat degrade the correlation coefficient (r = 0.85 → 0.79) and the standard deviation (σ = 1.80 ppm → 2.10 ppm) for weekly mean data (Table 2). For GOSAT-TCCON single scan and daily mean XCO2, PPDF-D retrievals provide r ≥ 0.73 and σ ≤ 2.48 ppm (Table 2). A similar trend toward degraded r and σ holds for other algorithms when using daily mean or single scan data (Table 2). We found the regression slope in the GOSAT-TCCON XCO2 best fit is closest to unity (|1 − a| ≤ 0.04) for NIES 02.xx, RemoTeC, and PPDF-D when using daily and weekly mean data. The number of satellite data available from PPDF-D retrievals is among the lowest of the algorithms (the only algorithm with fewer data points available is NIES 01.xx). The low number of coincident GOSAT-TCCON data for this algorithm can be explained by the filtering of data to only include small light path modifications (α ≤ 0.04 and ρ ≤ 0.04 [Part 1]) and also by the restriction to a small range of spectral variability in the surface albedo (section 3.3).
 Figure 4 and Table 2 demonstrate the substantial improvements in the NIES 02.xx data product compared with the previous version, NIES 01.xx (top of left and middle columns for weekly mean data). Both the negative bias (Bias = − 7.90 ppm) and scatter (σ = 3.97 ppm) are significantly decreased in the version NIES 02.xx to −1.07 and 1.68 ppm, respectively (insets of the panels). Morino et al.  derived similar values of the statistical characteristics for NIES 01.xx data product using 10° latitude × 10° longitude grid box over TCCON stations and even poorer values using a smaller sample size.
Uchino et al. found that the root cause of the negative bias that was corrected in NIES 02.xx is the solar irradiance database. The solar irradiance line list for the NIES 01.xx retrieval was derived from high-resolution solar FTS measurements at Kitt Peak National Observatory by removing the terrestrial atmospheric absorption structure (R. Kurucz, http://kurucz.harvard.edu/sun/irradiance2008/). However, a small residual CO2 absorption structure was found in the solar irradiance database that is probably due to insufficient atmospheric correction. This bias showed an air-mass dependency, i.e., a large negative bias for small air mass and a small negative bias for large air mass. In NIES 02.xx, the Kurucz model of solar irradiance spectra was replaced by the model of G. C. Toon, based on balloon-borne and ground-based spectra, which is utilized in all other algorithms (section 3).
 There are several other reasons why NIES 01.xx has a large negative bias and scatter. The assumption of uniform aerosol vertical distribution from the surface to 2 km altitude has a small impact on spectra and Jacobians in the CO2 bands, but a relatively large impact on the spectrum and Jacobian for aerosol optical depth in O2 A-band. Therefore, the retrieved aerosol optical depth and surface pressure tend to show large positive biases when minimizing the residuals in O2 A-band. Positive biases in aerosol optical depth and surface pressure cause a negative bias in XCO2 and its impact changes from scan to scan. Simultaneous retrievals of the aerosol mass concentration profile implemented in NIES 02.xx reduce both the bias and scatter and result in more data passing the post-screening (Figure 4 and Table 2). Another reason for the bias reduction in NIES 02.xx is that the absorption cross section of O2 A-band was scaled by a factor of 1.01 according to the detected surface pressure bias. Finally, in contrast to all other algorithms, NIES 01.xx did not use the strong CO2 absorption band at 2.0 µm (4800–4900 cm−1) where the radiance spectra are rather sensitive to atmospheric light scattering. Incorporation of this band to the data processing in NIES 02.xx, therefore, improves simultaneous gas and aerosol retrievals or, at least, reduces the scatter by filtering out those GOSAT scans for which the effect of the erroneous aerosol model is critical. The remaining XCO2 bias from NIES 02.xx is approximately −1 ppm. This algorithm provided the lowest GOSAT-TCCON XCO2 standard deviation ranging between 1.7 and 2.17 ppm from weekly mean to single scan observations (Table 2).
 ACOS B2.9 and UoL-FP: 3G data products provide the largest number of observations coincident with TCCON measurements (insets in the bottom of the left and middle columns of Figure 4) but the correlation and determination coefficients are below 0.8, ranging from 0.70 to 0.79 depending on the temporal averaging period of observations (Table 2). The correlation coefficients from these products are much larger over the Lamont site (0.90–0.92) and rather low over all other Northern Hemisphere sites (0.76–0.78) for weekly mean data (insets in Figures 1 and 2). UoL-FP: 3G provides the lowest interstation bias (0.39–0.68 ppm) and simultaneously the lowest global bias <0.2 ppm (Table 2). Although the global XCO2 bias from ACOS B2.9 is also rather small (−0.25–0.20 ppm), the station-to-station bias approaches 1 ppm for daily and weekly mean data (Table 2). Along with the noticeable deviation of the slope a from unity (0.81–0.90), this could hinder carbon flux inversions as noted by O'Dell et al. However, as demonstrated in Wunch et al. [2011b], employing a suitable bias correction based on a multi-linear regression against a small number of geophysical variables, such as signal level and surface albedo, leads to a value of a consistent with unity for ACOS B2.9. The values of global bias and standard deviations between GOSAT ACOS B2.9 and TCCON XCO2 generally agree with those previously reported by Wunch et al. [2011b] when using other coincident GOSAT-TCCON criteria (mean data from a much larger grid box of ±10° latitude × ±30° longitude combined with a potential temperature constraint of ±2 K at 700 hPa and a 10 day temporal constraint). We should also note that although the ACOS B2.9 and UoL-FP: 3G algorithms provide the largest number of observations, they still only utilize ~60% of the total number of individual GOSAT soundings available from all data products.
 Except for the number of available coincident observations, RemoTeC and NIES PPDF-D have rather similar statistical characteristics of the GOSAT-TCCON XCO2 relationship (Figure 4 and Table 2). This could be related to similar constraints on height distribution of the aerosol. We discuss this in more detail in section 5.
5 Effects of Optical Path Modification
 Among all GOSAT retrieval algorithms presented in this study, only the PPDF-D inversion scheme is designed to retrieve gas amounts under negligible optical path modification (section 3.3). This is achieved by excluding those GOSAT soundings for which the light path variability is rather large (α or ρ ≥ 0.04) and, thereafter, the DOAS-based method can be applied to the remaining observations [Oshchepkov et al., 2008]. Note that negligible optical path modification does not imply zero aerosol and cloud approximation because, for example, light scattering from near-ground aerosols does not modify the light path significantly and, therefore, its contribution to the radiance spectra can be taken into account by spectral polynomials of low orders when applying DOAS. All other algorithms target retrieval of aerosol (RemoTeC, NIES 02.xx) or aerosol and cloud (ACOS B2.9, UoL-FP: 3G) characteristics simultaneously with gas amount (section 3). We should distinguish between two important issues in this context. First of all, even in clear-sky conditions, there is always a certain threshold of atmospheric light scattering beyond which the gas retrievals become impractical. To overcome this problem, an upper limit to the acceptable aerosol optical depth retrieved from full physics algorithms is implemented in the post-processing filters to remove the contaminated observations (section 3 and Table 1). One important consideration here is whether the remaining observations are substantially affected by atmospheric light scattering in terms of light path modification, or if the full physics approach is important only to screen out the contaminated observations. Next, we also need to investigate how accurately simultaneous gas and aerosol retrievals below the AOD threshold correct for atmospheric light scattering and provide an unbiased estimate of gas concentration, or whether a systematic bias due to the optical path change remains.
This section aims to clarify these issues by looking at PPDF distributions of the satellite soundings selected by each algorithm; to this end, we apply the PPDF method to estimate light path modification in the individual scans available from all products. We also discuss the possible biases that result from light path variability for the data product from each algorithm.
5.1 PPDF Seasonal Variability
 Figure 5 displays seasonal variability in GOSAT single scan counts (green histograms) available from each algorithm (top to bottom rows). We also plot here the PPDF parameters α (blue symbols) and ρ (red symbols), which are mainly responsible for the light path shortening and for the light path lengthening, respectively [Part 1]. The data over eight Northern Hemisphere sites are shown in the left column and the data over three Southern Hemisphere sites are shown in the right column. The number of GOSAT single scans Ns (Ns ≥ Nc) available from each algorithm is in the inset of each panel. These observations are mostly after the Cloud and Aerosol Imager (TANSO-CAI) pre-screening test [Part 1] or other tests (section 3) that identify atmospheric conditions under clear skies.
 The number of observations available from each algorithm shows considerable seasonal variation, except in September to October 2010 over the Northern Hemisphere stations (left column of Figure 5) and in July 2010 over the Southern Hemisphere stations (right column of Figure 5) when all algorithms have large numbers of processed observations, probably due to clear-sky conditions. We have not found any significant correlations between the seasonal trends of GOSAT scans (counts) available for processing and PPDF parameters (scatterers). Only the PPDF-D retrieval product removes a large number of observations over Northern Hemisphere sites from May to September of each year (third from the top of the left column of Figure 5), when the effects of light path lengthening are essential (ρ ≥ 0; red symbols).
 The seasonal variability in PPDF parameters from all data products is similar to that previously reported in Part 1 for PPDF-based retrievals when using weekly mean GOSAT and TCCON data within larger grid boxes (15° latitude × 45° longitude) centered on TCCON sites. In particular, the results reveal that the effects of optical path lengthening (heightened values of the ρ parameter) occur especially over Northern Hemisphere sites from May to September (red symbols in the left column of Figure 5). The light path shortening shows no remarkable seasonal variability, with rather small values of the α parameter, which are mostly within the threshold α ≤ 0.04 (blue symbols in Figure 5) where the impact of atmospheric light scattering on CO2 retrievals can be neglected [Part 1].
5.2 PPDF Counts
 To quantify the consistency between GOSAT soundings and PPDF parameters, in Figure 6 we plot the α-distribution (red histogram) and ρ-distribution (yellow histogram) of GOSAT single scan counts available from each of the six algorithms. These histograms are convenient for comparing the significance of the light path modification within each data product. Figure 6 shows that the number of GOSAT soundings falls drastically with increasing α parameters (red histograms), that is, the effects of the light path shortening due to aerosol and cloud over dark surface are not representative of these observations, which were all over land. Previously in Part 1, we detected substantial light path shortening when processing GOSAT observations over ocean. All six data sets show an elevated number of GOSAT observations at around ρ = 0.08 where the lengthening of the optical path is in excess of the threshold ρ = 0.04 (Part 1). The values on the top left side of each panel in Figure 6 indicate the percentages of the GOSAT scans for each data product that were processed under negligible path length modification (α < 0.04, ρ < 0.04) when the simple DOAS-based method could be applicable for data processing. The portion of these observations is seen to be approximately 75% for NIES 02.xx, ACOS B2.9, UoL-FP: 3G, and even larger for NIES 01.xx (89%). RemoTeC and PPDF-D data sets consist of a lower portion (65%) of observations that are not affected by path length variability (right column in Figure 6). While PPDF-D provides gas retrievals only for this 65% of observations, RemoTeC processes all scans. According to our estimations, atmospheric light scattering with ρ ≤ 0.2 (ρ interval for yellow histograms in Figure 6) is much less than for conditions beyond which the gas retrievals become impractical. Specifically, applying PPDF-D for ρ ≤ 0.2 does not significantly degrade the GOSAT-TCCON correlation diagram but increases the number of available GOSAT scans by a factor of 1.5. 
We assume, therefore, that most of the current versions of the full physics algorithms implement post-processing filters that are too conservative, and therefore remove potentially useful GOSAT observations. This might be associated with post-screening by surface pressure and aerosol/cloud optical depth (section 3) that are not always appropriate for characterizing the light path variability [Oshchepkov et al., 2011].
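The percentages quoted for Figure 6 are fractions of soundings whose PPDF parameters both fall below the threshold. A minimal sketch of that count is given below; the function name is hypothetical, the strict inequalities follow the (α < 0.04, ρ < 0.04) wording used for Figure 6, and the numbers in the usage lines are made up.

```python
def fraction_negligible_path_modification(alpha, rho, threshold=0.04):
    """Fraction of soundings with both alpha and rho below the threshold."""
    pairs = list(zip(alpha, rho))
    if not pairs:
        raise ValueError("no soundings supplied")
    clean = sum(1 for a, r in pairs if a < threshold and r < threshold)
    return clean / len(pairs)


# Usage with synthetic PPDF parameters for four soundings.
alpha = [0.01, 0.02, 0.05, 0.00]
rho = [0.00, 0.08, 0.01, 0.03]
share = fraction_negligible_path_modification(alpha, rho)
```

Raising the `threshold` argument (for example to 0.2 for ρ-type screening) shows directly how many additional soundings a looser criterion would admit.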
5.3 Effect of Surface Albedo
 The last important issue addressed in this section is the efficiency of each algorithm in accounting for light scattering after post-processing filtering. Although the PPDF retrieval is the most natural way to control optical path modification, its application to the post-retrieval screening could also be too conservative, depending on surface albedo. For example, as indicated earlier using synthetic GOSAT data [Oshchepkov et al., 2008], the effect of light path changes on XCO2 is rather small if surface albedo is around 0.2, where light path shortening and light path lengthening largely compensate each other.
 Figure 7 displays the GOSAT counts and GOSAT-TCCON XCO2 bias distributed by surface albedo in the 1.6 µm CO2 band (Γ2). The original data from each algorithm are displayed in yellow. The red histogram (counts) and black curve (retrieval bias) in the right column at the top of Figure 7 refer to PPDF-D retrievals prior to PPDF-based screening and before applying equation (1), that is to say, all the data that pass the TANSO-CAI cloud pre-screening test are crudely processed with the assumption that there is no change in path length. As for the numerical studies [Oshchepkov et al., 2008], the zero retrieval bias in the actual GOSAT data processing holds at Γ2 ≃ 0.2. Apart from this, the magnitude and direction of the retrieval bias follow our physical understanding when neglecting optical path modification. Specifically, the light path tends to be shorter over dark surfaces (Γ2 < 0.2), because the backscattered light from photons that reach the absorbing surface is not detected. Correspondingly, the retrieved gas concentration tends to be underestimated. For satellite observations over bright surfaces (Γ2 > 0.2), aerosol and cloud could give rise to multiple light scattering/reflection between the ground surface and cloud/aerosol particles. As a result, the light path tends to be longer and the retrieved gas amount becomes overestimated. The behavior of the GOSAT-TCCON XCO2 bias as a function of surface albedo in the 2.0 µm band is similar to that for the 1.6 µm band.
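The curves of bias against surface albedo can be obtained by binning the GOSAT-TCCON differences by Γ2 and averaging within each bin. The following sketch shows one way to do this; the function name, the bin edges, and the sample values are hypothetical and are not taken from Figure 7.

```python
def mean_bias_by_albedo(albedo, bias, edges):
    """Mean bias and sounding count in half-open albedo bins [edges[k], edges[k+1])."""
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for a, d in zip(albedo, bias):
        for k in range(n_bins):
            if edges[k] <= a < edges[k + 1]:
                sums[k] += d
                counts[k] += 1
                break
    means = [s / c if c else None for s, c in zip(sums, counts)]
    return means, counts


# Usage with synthetic data: negative bias over dark and positive over bright surfaces.
albedo = [0.05, 0.15, 0.25, 0.35]
bias_ppm = [-1.0, -0.5, 0.5, 1.5]
means, counts = mean_bias_by_albedo(albedo, bias_ppm, [0.0, 0.2, 0.4, 0.6])
```

Returning the counts alongside the means makes it possible to draw the histogram and the bias curve from the same pass over the data, and empty bins are reported as `None` rather than as a misleading zero bias.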
 For NIES 01.xx, NIES 02.xx, and ACOS B2.9, the surface albedo dependence of the retrieval bias is the reverse of that for DOAS-based retrievals, retaining zero bias at Γ2 ≃ 0.2 (the global GOSAT-TCCON XCO2 bias has been removed, as noted in the caption of Figure 7). This is direct evidence that these three algorithms tend to overestimate aerosol/cloud amounts. UoL-FP: 3G has a slightly reduced bias that might be explained by the post hoc pressure correction in this algorithm [O'Dell et al., 2011]. The RemoTeC data product displays the smallest bias with surface albedo variability (right column at the bottom of Figure 7).
 The brown histograms in Figure 7 represent the GOSAT observation counts after applying equation (1) to all products, which restricts spectral variability in the surface albedo. For the remaining sets of these observations, we have recalculated the XCO2 bias as a function of surface albedo (brown curves in Figure 7), as well as all the statistical parameters in the GOSAT-TCCON XCO2 relationship (values in brackets in Table 2), and the PPDF observation counts (Figure 8). Statistical characteristics are substantially improved for most of the products, especially for NIES 01.xx, ACOS B2.9, and UoL-FP: 3G, with a reduction of the standard deviation and station-to-station bias as well as an increase in both the determination and correlation coefficients (Table 2). A significant improvement for NIES 02.xx arises in the interstation bias, although other characteristics are also improved (Table 2). Most of the removed observations can be associated with contamination by light path modification. We support this important insight by plotting updated PPDF observation counts in Figure 8; the fraction of GOSAT observations (%) for which the DOAS-based technique can be applied is as great as 88–96% for the current versions of full-physics algorithms. In addition, the total remaining number of GOSAT soundings Nc becomes comparable with that available from PPDF-D (Table 2).
 The RemoTeC retrievals were the only retrievals for which we did not find substantial improvements in the statistical characteristics when removing observations with heightened light path variability (Table 2). Taking also into consideration that RemoTeC retrievals deal with the largest portion of observations contaminated by path length modification (35%, right column at the bottom of Figure 6) and provide comparatively good agreement with TCCON measurements (section 4.2 and Table 2), it is reasonable to suppose that this algorithm has the best setup for simultaneous gas and aerosol retrievals.
 Also, it is worth noting that PPDF GOSAT counts presented in this section for all algorithms are specific to land observations near TCCON stations. For example, the counts can be quite different and the retrieval biases could be much larger for observations over oceans and lakes, where shortening of the optical path due to aerosols and clouds contaminates most of the scans (Part 1), and for bright surfaces such as over the Sahara desert or Arabian Peninsula where multiple reflections of light between aerosol/cloud and surface tend to increase the path length [Oshchepkov et al., 2008; 2011]. Reliable schemes of simultaneous gas and aerosol/cloud retrievals are more essential for these cases than for those processed over land near current TCCON stations.
6 Algorithm Cross-Comparison
 Figure 9 displays pairwise statistical comparisons between XCO2 retrieved by different algorithms using coincident single scan observations (nine combinations excluding data from NIES 01.xx and NIES 02.xx-UoL FP 3G cross-comparisons). Scatter plots represent correlation diagrams, for which the statistical characteristics are presented in the insets of each panel. In the same way as in Figure 4, we discriminate between observations over Lamont (cyan), other Northern Hemisphere stations (yellow), and Southern Hemisphere sites (magenta). Red solid lines display the best fit for all sites (the linear regression slopes a are given in the insets; dashed red lines are the best fit ± σ) and the green line represents a one-to-one correspondence.
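The inset statistics for a pair of products can be computed from the soundings that both algorithms processed. The sketch below is illustrative only: it assumes that the bias is the mean difference, that σ is the sample standard deviation of the differences, and that the determination coefficient equals the squared Pearson correlation (true for a simple linear regression, and an assumption here). The function name and sample values are hypothetical.

```python
import math


def pairwise_stats(ref, other):
    """Bias, scatter, and correlation between two coincident XCO2 series (ppm)."""
    n = len(ref)
    diff = [o - r for r, o in zip(ref, other)]
    bias = sum(diff) / n
    sigma = math.sqrt(sum((d - bias) ** 2 for d in diff) / (n - 1))
    mr, mo = sum(ref) / n, sum(other) / n
    sxy = sum((r - mr) * (o - mo) for r, o in zip(ref, other))
    sxx = sum((r - mr) ** 2 for r in ref)
    syy = sum((o - mo) ** 2 for o in other)
    r_pearson = sxy / math.sqrt(sxx * syy)
    return {"n": n, "bias": bias, "sigma": sigma, "r": r_pearson, "r2": r_pearson ** 2}


# Usage with made-up values: the second product is offset by +1 ppm.
stats = pairwise_stats([390.0, 392.0, 394.0], [391.0, 393.0, 395.0])
```

A constant offset between two products, as in the usage lines, changes only the bias and leaves σ and r untouched, which is why the three quantities are reported separately.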
 To a large extent the algorithm disagreements are comparable with the differences between TCCON and GOSAT XCO2 retrieved from individual algorithms (section 4 and Figure 4). The best agreement, as reflected by the largest determination coefficient (0.98), the lowest standard deviation (1.21 ppm), and the largest number of coincident observations (3386), is detected between NIES 02.xx and ACOS B2.9 (left column at the bottom of Figure 9). Both algorithms adjust the aerosol similarly, making no assumptions on the aerosol profiles. Another distinctive feature of these algorithms as opposed to others is that they utilize “empirical noise” instead of “true noise” (section 3), which changes the interband balance. We also speculate that the high correlation is partially due to the fact that both of these algorithms simultaneously retrieve surface pressure as well as XCO2 and do not perform any post hoc correction based on the deviation of the retrieved surface pressure from the expected value based on the prior meteorology (as is done in UoL-FP). However, the regression slope between XCO2 according to these algorithms (1.18) is considerably different from unity. The bias of this slope is attributed to ACOS B2.9 because the NIES 02.xx-TCCON XCO2 slope is close to perfect (0.99) (Figure 4).
 The next best agreement is obtained from the comparison of ACOS B2.9 against UoL-FP: 3G (right column at the top of Figure 9), probably because they both evolved from the algorithm originally developed for the OCO mission. The regression slope for ACOS B2.9-UoL FP: 3G is close to perfect (0.99) with sub-parts per million global bias (−0.64 ppm), standard deviation of 1.49 ppm, and 2013 coincident observations.
 It is also noticeable that NIES PPDF-D and RemoTeC are in comparatively good agreement with a regression slope of 1.04 and with the lowest global bias of 0.14 ppm (right column at the middle of Figure 9). The scatter diagram for this combination is rather compact even for Southern Hemisphere sites (magenta symbols). Although these algorithms have been developed independently using different physical backgrounds (section 3), the retrieval setups are rather similar in some aspects of the aerosol treatments. First, surface pressure is prescribed from the meteorological data sets in both algorithms. This excludes possible correlation of surface pressure and aerosol characteristics when simultaneously retrieving these from the radiance spectra in the oxygen A-band. Next, in a similar fashion to how RemoTeC constrains the height distribution of the aerosol optical depth by a Gaussian function, NIES PPDF-D divides the atmosphere into either two or three layers and constrains the scattering effects only at the borders of these layers. This also minimizes the possible correlation between aerosol and gas characteristics when retrieving both aerosol and gas vertical profiles simultaneously. The next version of the ACOS (B2.10) retrieval algorithm also constrains the height and logarithm of the aerosol optical depth assuming a Gaussian vertical profile shape (section 3.1); we therefore expect better agreement of this version with PPDF-D and RemoTeC (not shown in this paper) although they still fit for four different types of light scattering components.
7 Summary and Concluding Remarks
 In this paper, we have introduced a comparative analysis of XCO2 retrievals from six global algorithms (ACOS B2.9, NIES 01.xx, NIES 02.xx, NIES PPDF-D, RemoTeC, and UoL-FP: 3G), which were used to process the GOSAT radiance spectra during approximately 2 years from June 2009.
 We have focused on the GOSAT observations over TCCON stations, at which ground-based remote-sensing measurements by Fourier transform spectrometers provide a reliable reference source for the column-averaged dry air mole fractions of atmospheric carbon dioxide. The GOSAT-TCCON coincidence criteria select satellite observations over land within a 5° radius of 11 TCCON sites. Under these criteria, we detect 8638 GOSAT single soundings available from all GOSAT retrieval algorithms during the 22 months from June 2009 to March 2011. Among these observations, 5561 GOSAT single soundings were available for the GOSAT-TCCON XCO2 pairwise comparison.
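 The spatial part of the coincidence criteria can be sketched as follows. This is an illustration only: it interprets the "5° radius" as a great-circle angular separation, which is an assumption (a latitude/longitude box would be another reading), and it ignores the land-only and temporal conditions. The function names and the dictionary keys 'lat' and 'lon' are hypothetical.

```python
import math

def angular_separation_deg(lat1, lon1, lat2, lon2):
    """Great-circle separation between two points, in degrees of arc."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Haversine form, numerically stable for small separations.
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def select_coincident(soundings, site_lat, site_lon, radius_deg=5.0):
    """Keep soundings (dicts with 'lat' and 'lon') within radius_deg of a site."""
    return [
        s for s in soundings
        if angular_separation_deg(s["lat"], s["lon"], site_lat, site_lon) <= radius_deg
    ]
```

 Applying select_coincident once per TCCON site and pooling the results would give a set of candidate soundings analogous to the one described above, subject to the stated assumptions.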
 The XCO2 data products from each algorithm were statistically compared against TCCON measurements with respect to the GOSAT-TCCON XCO2 regression slope, standard deviation, correlation and determination coefficients, and global and station-to-station biases, as well as the number of coincident observations for each algorithm. For this comparison, we selected single scans as well as daily and weekly mean data. Most products reasonably reproduced the temporal patterns in the Northern Hemisphere observed in the TCCON measurements and simulated by the NIES TM. We have demonstrated and explained the substantial improvements of the NIES 02.xx data product over the previous NIES 01.xx version. In particular, the large negative bias (approximately −8 ppm) and standard deviation (~4 ppm) as well as low correlation coefficient (~0.44) between weekly mean NIES 01.xx and TCCON XCO2 were significantly improved in the version NIES 02.xx to −1.07 ppm, 1.68 ppm, and 0.88, respectively. The lowest GOSAT-TCCON XCO2 standard deviation was detected for the NIES 02.xx and ACOS B2.9 algorithms. At the same time, however, both of these algorithms show rather large interstation biases. For daily and weekly mean data, the best agreement with TCCON measurements was detected for NIES 02.xx and RemoTeC. UoL-FP: 3G provided the largest number of GOSAT-TCCON coincident observations: 3339 single soundings, 672 days, and 362 weeks. This, however, covers only 63.5% of the single scans available in total from all algorithms. PPDF-D retrievals showed comparatively good agreement with TCCON measurements, but the number of observations available after PPDF-based screening is the lowest among the algorithms.
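 The daily and weekly mean data mentioned above can be formed from single soundings by simple grouping, as in the following sketch. It is an illustration under stated assumptions, not the procedure used in the paper: weeks are taken to be ISO calendar weeks, means are unweighted, and the function name and record layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

def period_means(records, period="week"):
    """Average XCO2 (ppm) per day or per ISO week.

    records: iterable of (datetime.date, xco2_ppm) pairs.
    Returns a dict keyed by date ('day') or (ISO year, ISO week) ('week').
    """
    groups = defaultdict(list)
    for day, xco2 in records:
        if period == "day":
            key = day
        elif period == "week":
            iso = day.isocalendar()
            key = (iso[0], iso[1])  # (ISO year, ISO week number)
        else:
            raise ValueError("period must be 'day' or 'week'")
        groups[key].append(xco2)
    # Unweighted mean of all soundings falling in each period.
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}
```

 The same grouping would be applied to the GOSAT and TCCON series before they are compared, so that each daily or weekly GOSAT mean is paired with the TCCON mean for the same period.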
 We have estimated the impact of atmospheric light scattering on XCO2 retrievals within each data product using scan-by-scan retrievals of light path modification with the photon path length probability density function (PPDF) method. Approximately 25% of GOSAT soundings processed by NIES 02.xx, ACOS B2.9, and UoL-FP: 3G were found to be contaminated by atmospheric light scattering, primarily due to increased optical path length over Northern Hemispheric TCCON sites from May to September of each year. We also found that the effect of these contaminated scans in the NIES 02.xx and ACOS B2.9 algorithms is a tendency to overestimate aerosol amount, resulting, in particular, in underestimation of CO2 for GOSAT observations over bright surfaces. Our preliminary results over TCCON sites suggest that the RemoTeC algorithm has the best setup for simultaneous gas and aerosol retrievals because this algorithm provides an accurate aerosol correction for the largest portion of observations contaminated by light path modification (~35% of the total available from RemoTeC).
 We have performed an algorithm cross-comparison by analyzing pairwise correlation diagrams built on coincident GOSAT single soundings over TCCON stations. We found the best agreement, as reflected by the largest determination coefficient (0.98), the lowest standard deviation (1.21 ppm), and the largest number of coincident observations (3386), between NIES 02.xx and ACOS B2.9. However, the regression slope between the retrievals from these algorithms (1.18) is not ideal. The next best quality of agreement occurs between the ACOS B2.9 and UoL-FP: 3G algorithms, as they both evolved from the algorithm originally developed for the OCO mission. The regression slope for ACOS B2.9-UoL-FP: 3G is near-perfect (0.99), with a sub-parts per million global bias (−0.64 ppm), a standard deviation of 1.49 ppm, and 2013 coincident observations. NIES PPDF-D and RemoTeC are in comparatively good agreement, with a regression slope of 1.04, a standard deviation of 1.87 ppm, a correlation coefficient of 0.84, and the lowest global bias of 0.14 ppm.
Acknowledgments
 GOSAT is a joint effort of the Japan Aerospace Exploration Agency (JAXA), the National Institute for Environmental Studies (NIES), and the Ministry of the Environment (MOE), Japan. Part of this work on ACOS B2.9 was performed at the Jet Propulsion Laboratory, California Institute of Technology, under contract with NASA. GOSAT spectra were kindly provided to the California Institute of Technology through a memorandum of understanding between JAXA and NASA. U.S. funding for TCCON is provided by NASA's Terrestrial Ecology Program (grant number NNX11AG01G), the Orbiting Carbon Observatory Program, the Atmospheric CO2 Observations from Space (ACOS) Program, and the Department of Energy/Atmospheric Radiation Measurement (DOE/ARM) Program. The Darwin TCCON site was built at Caltech with funding from the OCO project and is operated by the University of Wollongong, with travel funds for maintenance and equipment costs funded by the OCO-2 project. We acknowledge funding to support Darwin and Wollongong from the Australian Research Council, Projects LE0668470, DP0879468, DP110103118, and LP0562346. Lauder TCCON measurements are funded by New Zealand Foundation of Research Science and Technology contracts C01X0204 and CO1X0406. We acknowledge financial support of the Białystok and Orléans TCCON sites from the Senate of Bremen and EU projects IMECC, GEOMON and InGOS as well as maintenance and logistical work provided by AeroMeteo Service (Białystok) and the RAMCES team at LSCE (Gif-sur-Yvette, France) and additional operational funding from the NIES GOSAT project. The Garmisch TCCON team acknowledges funding by the EC-INGOS project. Development of RemoTeC was partly funded by ESA through the GHG-CCI project (S. Guerlet) and by Deutsche Forschungsgemeinschaft (DFG) through grant BU2599/1-1 (A. Butz). 
The JRA-25/JCDAS data sets used for atmospheric transport modeling were provided by the cooperative, long-term reanalysis project by the Japan Meteorological Agency (JMA) and Central Research Institute of Electric Power Industry (CRIEPI). The authors thank Dr. Sasano, Director of the Center for Global Environmental Research at the NIES, and the members of the NIES GOSAT and NASA ACOS projects.