Corresponding author: S. Guerlet, Now at Laboratoire de Météorologie Dynamique/IPSL/CNRS/UPMC, 4 Place Jussieu, F-75252 Paris, France. (firstname.lastname@example.org)
 Inadequate treatment of aerosol scattering can be a significant source of error when retrieving column-averaged dry-air mole fractions of CO2 (XCO2) from space-based measurements of backscattered solar shortwave radiation. We have developed a retrieval algorithm, RemoTeC, that retrieves three aerosol parameters (amount, size, and height) simultaneously with XCO2. Here we evaluate the ability of RemoTeC to account for light path modifications by clouds, subvisual cirrus, and aerosols when retrieving XCO2 from Greenhouse Gases Observing Satellite (GOSAT) Thermal and Near-infrared Sensor for carbon Observation (TANSO)-Fourier Transform Spectrometer (FTS) measurements. We first evaluate a cloud filter based on measurements from the Cloud and Aerosol Imager and a cirrus filter that uses radiances measured by TANSO-FTS in the 2 micron spectral region, with strong water absorption. For the cloud-screened scenes, we then evaluate errors due to aerosols. We find that RemoTeC is well capable of accounting for scattering by aerosols for values of aerosol optical thickness at 750 nm up to 0.25. While no significant correlation of errors is found with albedo, correlations are found with retrieved aerosol parameters. To further improve the XCO2 accuracy, we propose and evaluate a bias correction scheme.
Measurements from 12 ground-based stations of the Total Carbon Column Observing Network (TCCON) are used as a reference in this study. We show that spatial colocation criteria may be relaxed using additional constraints based on modeled XCO2 gradients, to increase the size and diversity of validation data and provide a more robust evaluation of GOSAT retrievals. Global-scale validation of satellite data remains challenging and would be improved by increasing TCCON coverage.
 Carbon dioxide (CO2) is the dominant anthropogenic greenhouse gas. Quantifying its emission and uptake processes on regional scales is crucial to better understand our climate and its evolution. Current inverse model estimates of sources and sinks of CO2 are generally only constrained by measurements of CO2 concentrations near the surface (obtained from flask samples, continuous in situ measurements and tall towers), that provide accurate but spatially sparse information. These measurements are influenced both by large-scale and local fluxes, which makes their interpretation complex. In addition, flux inversion results using surface concentration data are very sensitive to how the boundary layer height and vertical mixing are described in transport models. Satellite observations, if accurate enough, have the potential to map CO2 total columns with near-global coverage, which could reduce uncertainties in sources and sinks characterization [e.g., Rayner and O'Brien, 2001], especially in areas not covered by flask measurements. In addition, assimilation of column-averaged measurements is insensitive to planetary boundary layer height assumptions.
 The Greenhouse Gases Observing Satellite (GOSAT) was launched on 23 January 2009. It is the first satellite primarily dedicated to the measurement of column-averaged dry-air mole fractions of CO2 and CH4 (denoted XCO2 and XCH4). Onboard GOSAT, the Thermal and Near-infrared Sensor for carbon Observation-Fourier Transform Spectrometer (TANSO-FTS) measures spectra of sunlight backscattered by the Earth's surface and atmosphere [Kuze and Suto, 2009]. The main advantage of near infrared observations is their sensitivity to CO2 (and CH4) in the lowest layers of the atmosphere, which facilitates investigation of surface-atmosphere exchange. However, satellite observations face challenging user requirements: in order to match the current flux uncertainties obtained from the flask network, measurements of XCO2 have to reach a relative accuracy on regional scales of a few 10ths of a percent, and systematic errors as small as a few 10ths of ppm could hamper source/sink inversions [Chevallier et al., 2007; Miller et al., 2007].
 The main source of error in analyzing such measurements is an uncertain knowledge of the light path, which is modified by scattering events. Light path modifications (lengthening or shortening effects) depend strongly on cloud and cirrus coverage, aerosol type, height and load, and surface albedo. Several studies, mostly based on simulations, have shown that neglecting scattering caused by cirrus and/or aerosols can introduce errors of several percent in XCO2 [Kuang and Margolis, 2002; O'Brien and Rayner, 2002; Houweling and Hartmann, 2005; Aben and Hasekamp, 2007; Oshchepkov and Bril, 2008; Butz and Hasekamp, 2009; Reuter and Buchwitz, 2010]. Various retrieval algorithms have been developed to account for scattering by particles in GOSAT observations [e.g., Yoshida et al., 2011; Butz et al., 2011; Boesch et al., 2011; O'Dell et al., 2012]. They intrinsically differ in their aerosol physical parametrization and state vector elements. For instance, in version v01.xx of their Level 2 algorithm, Yoshida and Ota , from the Japanese National Institute for Environmental Studies (NIES), assume a uniform distribution of aerosols in a 2 km thick layer from the ground and retrieve only aerosol optical thickness. On the other hand, in version B2.9 of their algorithm, O'Dell et al. , from the NASA Atmospheric CO2 Observations from Space (ACOS) team, retrieve the extinction profiles of two aerosol types and two cloud types (one water cloud and one cirrus cloud). As their model considers 20 vertical levels, they retrieve 4 × 20 aerosol parameters. Similarly, the University of Leicester Full Physics (UoL-FP) algorithm includes the retrieval of the extinction profiles of one cirrus and two aerosol types on 20 levels [Cogan et al., 2012]. We have developed a full physics retrieval algorithm, called RemoTeC, that simultaneously retrieves XCO2 and XCH4 as well as (only) three effective aerosol parameters representing particle amount, size, and height distributions. 
This aerosol parametrization aims at effectively describing scattering events within a small parameter space.
 Treatment of aerosols in these retrieval algorithms is evolving rapidly and has been revised in the most recent versions of the NIES and ACOS algorithms. In version v02.xx of the NIES algorithm, the vertical profiles of the number concentration of fine-mode and coarse-mode aerosol particles are now retrieved instead of the total aerosol optical depth, which was deemed an overly simplistic approach [Yoshida et al., 2013]. In version B2.10 of the ACOS algorithm, the parameter space was reduced from 80 profile quantities to eight parameters representing height and logarithm of optical depth of the four scattering particle types, which is similar to the RemoTeC parametrization of aerosol vertical distribution.
 Another difference between existing algorithms is their cloud filter. The TANSO-Cloud and Aerosol Imager (CAI) onboard GOSAT delivers cloud flags that are used by Yoshida and Ota  and RemoTeC for cloud screening. In contrast, O'Dell et al.  and Boesch and Baker  use retrieved surface pressure from the O2 A-band as a proxy for cloud contamination: data are filtered out if the retrieved surface pressure differs by at least 20 hPa from the prior value. The latter method not only filters for thick water clouds but also quite efficiently filters out cirrus and scenes with large modifications of light path in general but has been shown to fail to detect some low clouds [O'Dell et al., 2012]. Using CAI L2 cloud flags has the advantage that one can separate the effects of optically thick water clouds from subvisual cirrus clouds and aerosols, allowing for detailed sensitivity studies and error analyses, which is one of the goals of this paper.
 For validation purposes, satellite retrievals are often compared with measurements of XCO2 from the Total Carbon Column Observing Network (TCCON) [Wunch et al., 2011a]. TCCON is a network of intercalibrated ground-based Fourier transform spectrometers that measure the absorption of direct sunlight by trace gas species. These measurements are thus much less influenced by atmospheric scattering by cirrus and aerosols than GOSAT observations. TCCON XCO2 measurements have been calibrated and validated against dedicated aircraft campaigns, and their resulting precision and accuracy have been estimated at 0.8 ppm (2σ value) [Wunch et al., 2010; Messerschmidt et al., 2011].
 A first validation study of GOSAT XCO2 retrievals with RemoTeC was performed around six TCCON stations by Butz et al. , who reported first estimates of their precision (2.8 ppm) and accuracy (1.4 ppm). Scattering errors have been investigated from simulated GOSAT measurements in Butz and Hasekamp . The aim of this paper is to evaluate the ability of RemoTeC to account for atmospheric light scattering in XCO2 retrievals from real GOSAT measurements. Based on this evaluation, we propose a number of criteria to filter out scenes with very large light path modifications (i.e., clouds, subvisual cirrus, and high aerosol loading). We then characterize residual errors and propose a bias correction to obtain a GOSAT XCO2 data set with improved accuracy.
 In addition to RemoTeC retrievals, we also perform retrievals under the assumption of a nonscattering atmosphere. First, this allows us to characterize the performance and added value of accounting for aerosol scattering in the RemoTeC retrievals. Second, we can study the range of aerosol-induced errors in the vicinity of TCCON stations to evaluate the representativeness of this validation network. In this context, two different methods for matching TCCON measurements and colocated GOSAT overpasses are evaluated.
 In section 2, the GOSAT data set and recent developments of our full physics algorithm are presented. In section 3, we evaluate methods to filter for clouds and thin cirrus, using TCCON measurements as a reference. In section 4, we investigate the residual errors caused by aerosols and evaluate the performance of RemoTeC. In section 5, we further characterize our product by studying potential biases with instrumental or geophysical parameters and propose a bias correction to cancel out some correlation of error in our retrievals. Finally, we discuss in section 6 the impact of the choice of colocation method on validation studies and the representativeness of TCCON stations at global scale in terms of scattering errors, before concluding in section 7.
2 Data and Methods
2.1 GOSAT Observations
 GOSAT is a joint project of the National Institute for Environmental Studies (NIES), the Japanese Space Agency (JAXA), and the Ministry of the Environment (MOE). The TANSO-FTS instrument records backscattered solar spectra in three channels in the short wavelength infrared (SWIR) centered at 0.76 μm (band 1), 1.6 μm (band 2), and 2 μm (band 3). It is also equipped with a band in the thermal infrared, not used here. It has a spectral resolution of ∼0.36 cm−1 in band 1 and ∼0.27 cm−1 in bands 2 and 3. Its instantaneous field-of-view of 15.8 mrad maps into a circular footprint of ∼5 km radius at the subsatellite point. GOSAT follows a polar sun-synchronous orbit with a 3 day repeat pattern and crosses the equator around 1 P.M. local time.
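The quoted footprint size can be checked with a short calculation. The sketch below assumes a nominal orbital altitude of 666 km, a value that is not given in the text, and a flat-Earth geometry at the subsatellite point; the function name is illustrative only.

```python
import math

def footprint_radius_km(ifov_rad: float, altitude_km: float) -> float:
    """Radius of the nadir footprint of a circular field of view (flat-Earth approximation)."""
    return altitude_km * math.tan(ifov_rad / 2.0)

# 15.8 mrad is the field of view quoted in the text.
# 666 km is an ASSUMED nominal orbital altitude, not stated in the text.
radius = footprint_radius_km(15.8e-3, 666.0)
print(f"footprint radius ~ {radius:.2f} km")
```

Under these assumptions the radius comes out slightly above 5 km, in line with the approximate value quoted above.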
 Backscattered sunlight is recorded in two orthogonal polarization directions from which we calculate the total backscattered radiance (Stokes parameter I) as suggested by Yoshida and Ota . The radiometric calibration of the spectra is based on the Mueller matrix calculus of Kuze and Suto  and on the prelaunch measured calibration data available from the GOSAT User Interface Gateway (https://data.gosat.nies.go.jp/GosatUserInterfaceGateway/guig/GuigPage/openTechInfo.do). The latter also provides the tabulated instrument line shape used by our algorithm. In this paper, we used the calibrated L1B data up to version v110110.
 Onboard GOSAT, the CAI instrument aims at characterizing clouds and aerosols. It has a spatial resolution of 0.5 km and delivers cloud confidence levels for several hundred ground pixels within a single TANSO-FTS footprint, which are used here for cloud screening.
2.2 Retrieval Algorithm
 RemoTeC is a flexible algorithm developed to accurately retrieve CO2, CH4, and other absorbing species from SWIR satellite observations of backscattered sunlight. It has been described in detail in the framework of synthetic studies by Butz et al. [2009, 2010, 2012] and applied to a first analysis of GOSAT data in Butz et al. . It is based on an efficient radiative transfer model developed by Hasekamp and Butz .
 In RemoTeC, scattering particles are parameterized as spherical particles with a fixed refractive index (1.400 − i × 0.003), and their size distribution follows a power law, n(r) ∝ r^(−αs), where r is the particle radius and the exponent αs is called the size parameter. We consider a single scattering layer with a Gaussian height distribution of central height zs. The strength of the algorithm is its capability to simultaneously retrieve the 12-layer profiles of CO2 and CH4 column number densities along with three effective aerosol parameters: the mean height of the scattering layer (zs), the size parameter of the power-law distribution (αs), and the total column number density of aerosols (Ns). Our initial guess for the aerosol layer is given by zs = 3 km, αs = 3.5, and a scattering optical thickness (SOT) in the O2 A-band of 0.1.
 For this so-called full physics method, we analyze radiances in four spectral windows: 12920–13195 cm−1 (covering the O2 A-band), 6170–6277.5 cm−1 (weak CO2 band), 6045–6138 cm−1 (2ν3 CH4 band), and 4806–4896 cm−1 (strong CO2 band). Other retrieved parameters are the water vapor total column, a second-order polynomial albedo per window, and spectral shifts per window. We also retrieve an intensity offset in the O2 A-band window to account for nonlinearity of the analogue electrical circuit and contribution from plant fluorescence [Frankenberg and Butz, 2011].
 At some points in this paper, we also discuss retrievals performed under the assumption of a nonscattering atmosphere. For these retrievals, we only use spectra recorded in the CO2 band at 1.6 μm and retrieve a four-layer profile.
 Since RemoTeC version 1.0, discussed in Butz et al. , several modifications have been made to the algorithm, which is currently at version 1.9. The main ones are the following:
 We modified the side constraint in the Phillips-Tikhonov regularization scheme so that the degree of freedom for CO2 is pulled down to values between 1 and 1.5 (instead of 2 to 2.5 in the previous version, which was unrealistic);
 We now retrieve a radiance instead of a reflectance offset in the O2 A-band;
 We retrieve a spectral shift of the solar spectra instead of a shift of the solar line list;
 European Center for Medium-Range Weather Forecasts meteorological data, from which surface pressure and vertical profiles of pressure, humidity, and temperature are extracted, now come from a higher resolution grid (0.75° × 0.75° instead of 1.5° × 1.5° previously).
 Among these updates, the modification of Phillips-Tikhonov regularization parameters had the main impact on the retrievals.
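The link between the regularization strength and the degrees of freedom for signal can be illustrated with a linear toy problem. The sketch below is not the RemoTeC side constraint: it assumes a first-difference operator L, uncorrelated noise of equal standard deviation, and a random synthetic Jacobian K, all of which are hypothetical. The degrees of freedom are taken as the trace of the averaging kernel A = (KᵀSe⁻¹K + γLᵀL)⁻¹KᵀSe⁻¹K.

```python
import numpy as np

def degrees_of_freedom(K, noise_std, gamma):
    """Trace of the averaging kernel for a Tikhonov-regularized linear retrieval."""
    n = K.shape[1]
    L = np.diff(np.eye(n), axis=0)            # first-difference operator, shape (n-1, n)
    KtSeK = K.T @ K / noise_std ** 2           # K^T Se^-1 K for uncorrelated, equal noise
    A = np.linalg.solve(KtSeK + gamma * (L.T @ L), KtSeK)
    return float(np.trace(A))

# Synthetic Jacobian: 50 spectral points, 12 layers (the layer count follows the text,
# everything else is ASSUMED for illustration).
rng = np.random.default_rng(0)
K = rng.normal(size=(50, 12))
for gamma in (0.0, 1.0, 1e2, 1e4):
    print(gamma, round(degrees_of_freedom(K, 1.0, gamma), 2))
```

With γ = 0 every layer is resolved independently, and as γ grows the trace falls toward 1, since a first-difference constraint leaves only the total column unpenalized. A stronger side constraint therefore pulls the degrees of freedom down, which is the qualitative behaviour described in the first item of the list above.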
2.3 Data Selection and Filtering
 Currently, 2 years of TANSO-FTS data have been processed with RemoTeC v1.9, from April 2009 to mid-April 2011, at global scale. In this paper, we only consider data acquired with TANSO-FTS high gain setting and over land. The medium gain setting is used over regions of high albedo, mainly covering the Sahara desert and part of central Australia, but these data suffer from a scan speed instability due to microvibrations within the instrument [Kuze and Suto, 2012] and will not be discussed here.
 For validation purposes, we select GOSAT data acquired in the vicinity of 12 TCCON stations, at latitudes ranging from 67°N (Sodankyla, Finland) to 45°S (Lauder, New Zealand). In order to match TCCON measurements and GOSAT overpasses, spatiotemporal colocation criteria are applied. The temporal colocation method considers only TCCON measurements acquired the same day and within ± 2 h of each GOSAT measurement. For spatial colocation, we consider two distinct criteria. For the first one, we consider all GOSAT soundings within a 5° circle of a TCCON station to be colocated with that station. The second criterion is motivated by the fact that the observed XCO2 field is a convolution of surface fluxes with atmospheric transport. Given a TCCON station at location x, we want to delineate the area A over which the XCO2 field is identical to XCO2(x) (within some tolerance δ) under this convolution. A GOSAT sounding at location y∈A should, given a “perfect” instrument, be within δ of XCO2(x). Therefore, for the purpose of validation, we consider all soundings inside A to be colocated with the TCCON station at x. To implement this criterion, we generate XCO2 fields by propagating bottom-up estimates of CO2 surface fluxes detailed in Basu et al.  with the TM5 atmospheric transport model [Krol and Houweling, 2005] run globally at 1°×1° resolution. The idea behind it is that even if CO2 prior fluxes are not accurately known in the model, the XCO2 modeled gradients (not the absolute values) at short timescales are expected to be accurate. Since neither our transport model nor our fluxes are perfect, we put further restrictions by (a) averaging the modeled XCO2 field over weekly periods to minimize the impact of short-term transport errors, (b) demanding A to be a continuous, simply connected region, and (c) restricting A to be within ±7.5° latitude and ±22.5° longitude of x, keeping in mind that zonal transport is faster than meridional transport. 
The concept is illustrated around Lamont for a typical week in Figure 1. The area between the two dark red lines has a modeled XCO2 value within δ=0.5 ppm of the modeled value at Lamont (cyan star). GOSAT soundings taken that week are the green circles, of which those that fall within A are colored red. At the time we performed this analysis, bottom-up CO2 emission estimates were available until 31 December 2010, and thus GOSAT-TCCON comparisons presented in this paper stop at that date.
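The model-based colocation criterion can be sketched as a region-growing step on a gridded, weekly averaged XCO2 field. The example below is a simplified illustration and not the code used in this study: it assumes a regular latitude-longitude grid, uses 4-neighbour connectivity, ignores longitude wraparound, and enforces connectivity but does not fill holes, so the region is not guaranteed to be simply connected. The function name and the synthetic field are hypothetical.

```python
from collections import deque
import numpy as np

def colocation_region(xco2, lats, lons, st_lat, st_lon, delta=0.5, dlat=7.5, dlon=22.5):
    """Boolean mask of grid cells connected to the station cell whose modeled XCO2
    is within delta (ppm) of the station value, restricted to a lat/lon box."""
    i0 = int(np.argmin(np.abs(lats - st_lat)))
    j0 = int(np.argmin(np.abs(lons - st_lon)))
    close = np.abs(xco2 - xco2[i0, j0]) <= delta
    box = (np.abs(lats - st_lat) <= dlat)[:, None] & (np.abs(lons - st_lon) <= dlon)[None, :]
    candidate = close & box
    region = np.zeros_like(candidate)
    region[i0, j0] = True
    queue = deque([(i0, j0)])
    while queue:                                # breadth-first flood fill from the station
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            inside = 0 <= a < candidate.shape[0] and 0 <= b < candidate.shape[1]
            if inside and candidate[a, b] and not region[a, b]:
                region[a, b] = True
                queue.append((a, b))
    return region

# Synthetic 1-degree field with a purely meridional gradient of 0.2 ppm per degree.
lats = np.arange(20.0, 51.0)
lons = np.arange(-120.0, -79.0)
field = 390.0 + 0.2 * (lats[:, None] - 36.0) + 0.0 * lons[None, :]
region = colocation_region(field, lats, lons, st_lat=36.0, st_lon=-97.0)
print(int(region.sum()), "grid cells in the colocation region")
```

A sounding would then be treated as colocated if its nearest grid cell lies inside the returned mask. In this synthetic case the region is a zonal band around the station, reflecting the larger longitude tolerance.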
 This method is similar to the one described by Oshchepkov et al. , but here we use a different transport model at a higher resolution (1°×1° grid instead of a 2.5°×2.5° grid for Oshchepkov et al. ) and a stricter threshold for the difference between sampled XCO2 modeled values at and around TCCON stations (0.5 ppm instead of 1 ppm). Our colocation method is different from that of Wunch et al. [2011b], who use potential temperature as a proxy for dynamical patterns, whereas our approach considers variations in XCO2 due both to transport and surface fluxes. In addition, their assumption is only valid at midlatitudes in the Northern Hemisphere, whereas our method can be applied to the entire globe. A world map highlighting the location of the 12 TCCON stations used as well as selected GOSAT overpasses using the second colocation method is shown in Figure 2. Unless stated otherwise, in this paper, we discuss in detail the results obtained with this second colocation method, while results from both methods will be compared in a dedicated section 6.1.
 Following Butz et al. , we filter and exclude GOSAT spectra with low signal-to-noise ratio (< 70), high solar zenith angle (> 70°), or high variability of surface elevation within the field of view (standard deviation of elevation variability within the field of view > 75 m) prior to retrieval. We identify cloudy scenes using TANSO-CAI Level 2 products, which provide cloud confidence levels for each CAI pixel ranging from confidently cloudfree (16th level) to confidently cloudy (first level). In a preprocessing step, we compute the number of CAI pixels that have a given cloud confidence level within an area twice as large as the TANSO-FTS inner field of view. Selecting an area larger than the FTS field of view for the cloud mask limits the effects of straylight by surrounding clouds and pointing errors of the satellite instrument. A threshold value on the fraction of confidently cloudfree CAI pixels can then be used to a priori filter out cloudy scenes. Details on the evaluation of the cloud filter are discussed in section 3.1.
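The a priori screening described above can be summarized in a few lines. The sketch below is illustrative only: it assumes that "confidently cloudfree" means CAI confidence level 16 exactly, it takes the CAI levels for the enlarged area as a flat array, and it uses the 99% cloudfree threshold adopted in section 3.1. All function and parameter names are hypothetical.

```python
import numpy as np

CLOUDFREE_LEVEL = 16  # "confidently cloudfree" level quoted in the text

def cloudfree_fraction(cai_levels):
    """Fraction of CAI pixels in the screening area that are confidently cloudfree."""
    levels = np.asarray(cai_levels)
    return float(np.mean(levels == CLOUDFREE_LEVEL))

def passes_a_priori(snr, sza_deg, elev_std_m, cai_levels, min_cloudfree=0.99):
    """Apply the pre-retrieval thresholds: SNR, solar zenith angle, terrain roughness, clouds."""
    return bool(
        snr >= 70.0
        and sza_deg <= 70.0
        and elev_std_m <= 75.0
        and cloudfree_fraction(cai_levels) >= min_cloudfree
    )

# Hypothetical scene: 400 CAI pixels, 3 of them not confidently cloudfree.
levels = np.full(400, 16)
levels[:3] = 10
print(passes_a_priori(snr=150.0, sza_deg=35.0, elev_std_m=20.0, cai_levels=levels))
```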
 The CAI instrument is not able to detect subvisual cirrus, however, since it has neither a SWIR H2O band nor an appropriate thermal channel that is sensitive to clouds in the upper troposphere [Nakajima and Higurashi, 2008]. Recently, it was shown that the strong water vapor absorption bands in the 2 μm region (5150 −5179 cm−1) could be used to detect cirrus contamination in TANSO-FTS measurements [Yoshida et al., 2011]. Because the H2O mixing ratio decreases strongly with height, the bulk of atmospheric water vapor is situated in the middle and lower troposphere. Thus, for clear sky conditions, the solar radiation is entirely absorbed by atmospheric water, and as a result, the measured signal is close to the instrument noise level. However, in the presence of cirrus, radiation is scattered in the upper troposphere and only absorbed by a small amount of water vapor above the cirrus layer. Therefore, it was assumed by Yoshida and Ota  that whenever the radiance at strong H2O absorption bands exceeds the noise level, this indicates the presence of cirrus clouds in the air mass observed by the spectrometer. A cirrus filter based on this assumption is evaluated in section 3.2.
 A posteriori we screen and exclude additional retrievals, based on convergence and quality of the fit (χ2 > 4, Degree of Freedom for Signal < 1 for CO2, number of iterations > 20, XCO2 retrieval error > 1.2 ppm). We also filter for high values of the retrieved scattering optical thickness (SOT) in the O2 A-band (SOT < 0.25, justified in section 4.1), extreme values of the retrieved size parameter (allowed range: 3 < αs < 4.7), and extreme values of the retrieved intensity offset in the O2 A-band (−1×10−9 < Ioffset < 7×10−9), and we define an empirical aerosol filter ωs combining the three aerosol parameters [Butz et al., 2010, 2011]:
 ωs = SOT × zs / αs
 We filter out data for which ωs> 300 m, which corresponds to difficult scenes where many large particles were retrieved at high altitudes (justified in section 4.1).
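The a posteriori screening can be written as a single predicate. The sketch below is an illustration rather than the operational code: it assumes the aerosol filter takes the form ωs = SOT × zs / αs with zs in meters, which is consistent with the description of scenes with many large particles at high altitude, and it treats SOT < 0.25 as the retained range. Function and argument names are hypothetical.

```python
def aerosol_filter_parameter(sot, z_s_m, alpha_s):
    """Empirical aerosol filter in meters, ASSUMING the form SOT * z_s / alpha_s."""
    return sot * z_s_m / alpha_s

def passes_a_posteriori(chi2, dfs_co2, n_iter, xco2_err_ppm,
                        sot, alpha_s, z_s_m, i_offset):
    """Return True if a retrieval survives the quality, aerosol, and offset filters."""
    fit_ok = chi2 <= 4.0 and dfs_co2 >= 1.0 and n_iter <= 20 and xco2_err_ppm <= 1.2
    aerosol_ok = (
        sot < 0.25
        and 3.0 < alpha_s < 4.7
        and aerosol_filter_parameter(sot, z_s_m, alpha_s) <= 300.0
    )
    offset_ok = -1e-9 < i_offset < 7e-9
    return fit_ok and aerosol_ok and offset_ok

# First-guess aerosol state from the text: SOT = 0.1, z_s = 3 km, alpha_s = 3.5.
print(aerosol_filter_parameter(0.1, 3000.0, 3.5))
```

Under this assumed form, the first-guess state sits well below the 300 m threshold, whereas a scene with SOT = 0.2 at 6 km and αs = 3.2 would be rejected even though each parameter is individually within its allowed range.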
 We summarize in Table 1 the number of data acquired over land that remains after each filtering step at global scale for 2 months: July 2009 and January 2010. These particular months represent two extreme cases in the number of data that passed different filters, due to changing solar elevation and cloud coverage with season. From Table 1, we see that we only analyze a small fraction of the global data set: about 8% and 17% of the data pass the a priori filters for January and July, respectively. After data processing by RemoTeC, about a third of the retrievals are filtered out due to nonconvergence or bad quality of the fit, and another 25% due to aerosol and/or cirrus filters. After all filters are applied, about 4000 data points per month (on average) are retained over land (gain H) surfaces.
Table 1. Number of Data that Passed Different Quality Filters for July 2009 and January 2010 (Data Over Land Only)a
aSZA, solar zenith angle; SNR, signal-to-noise ratio.
bThe saturation flag is one of the GOSAT quality flags and already partially filters for clouds. Here we choose to show the effect of the saturation flag filter after the CAI cloud filter is applied, as there is some overlap between the two.
cOther filter parameters are the retrieved size parameter and the retrieved intensity offset in the O2 A-band.
3 Evaluation of Errors Due to Clouds and Thin Cirrus
3.1 Evaluation of the Cloud Filter and Related Cloud-Induced Errors
 In this section, we investigate to what extent the full physics retrievals are degraded by cloud contamination. To address this topic, we performed test retrievals by running RemoTeC on all GOSAT data, unfiltered for clouds, acquired in the vicinity of the 12 TCCON stations.
 Histograms of the error on XCO2 as a function of the fraction of cloudfree pixels (as determined by CAI data) are presented in Figure 3. Error on XCO2 is defined as the difference between colocated GOSAT and TCCON XCO2 retrievals. GOSAT retrievals discussed here have been filtered for quality of the fit and convergence, but no other filters were applied. At this stage, scattering errors can be caused by clouds, cirrus, and/or aerosol, and here we only interpret the trend of the error as a function of cloud fraction.
 As the fraction of cloudfree pixels decreases, GOSAT retrievals have increasing low biases compared to TCCON, and they exhibit an increased scatter. We also note that as the cloudfree fraction decreases, more and more GOSAT retrievals either did not converge or led to a bad fit of the spectra: for instance, in the range 50 to 80% cloudfree, only 12% of data points passed the convergence and χ2 quality filters. This results in poor statistics for the cloudy cases, and hence in Figure 3, we only show the results for the range 80% to 100% cloudfree (along with their fraction of converged data). Failure to converge provides useful information in its own right: nonconvergence is expected to occur to some extent in cloudy scenes as the full physics algorithm tries to account for scattering events from cloud particles by fitting effective aerosol parameters, which have different optical properties. Also, in RemoTeC, a retrieval is interrupted and flagged as nonconvergent if the retrieved SOT is higher than one, which is often the case for clouds. The fact that nonconvergence occurs more frequently as the fraction of cloudfree pixels decreases, together with the degraded performance of the retrievals (increased bias and scatter), can thus be seen as evidence that the CAI cloud filter is working as expected. The best performance is obtained for cases that are at least 99% cloudfree, and only this subset of data will be considered in the following sections.
 Finally, we also note that even for these 99% cloudfree cases, the mean bias (∼0.8%, i.e., 3.1 ppm) is offset from the median of the error distribution (∼0.6%, i.e., 2.3 ppm). This is due to a tail of low values of retrieved XCO2 probably caused by residual clouds or scattering by thin cirrus and/or aerosol particles, not captured by the CAI cloud flags.
3.2 Evaluation of the Cirrus Filter and Related Cirrus-Induced Errors
 As in the previous section, we investigate here to what extent RemoTeC XCO2 retrievals are degraded in the presence of cirrus clouds. We tested the efficacy of the cirrus filter proposed by Yoshida and Ota  (and described in section 2.3) with the help of simulations. We calculated synthetic spectra in the 2 μm region in the presence of a cirrus cloud located at 10 km altitude, composed of ice crystals, taking the optical properties of Hess and Wiegner . Figure 4 presents examples of synthetic spectra for different cirrus optical thicknesses (COT): 0, 0.02, 0.05, and 0.1. We also show the typical noise level of TANSO-FTS spectra. As COT increases, the signal in strong H2O absorption bands increases, as expected. Our cirrus filter considers a small spectral window encompassing two water absorption bands (5154.8–5157.8 cm−1). Figure 4 shows that such a cirrus detection method will typically flag scenes with COT > 0.02 as cirrus-contaminated.
 We note that the presence of an elevated aerosol layer would produce a similar spectrum (see the example in Figure 4, where a scene with an aerosol layer located at 10 km altitude and SOT = 0.13 leads to a spectrum similar to one with a cirrus optical thickness of 0.05). Cases of very high altitude aerosol layers would thus be flagged as cirrus with this method. However, except in the case of volcanic plumes, aerosols should not be found at such high altitudes. For scattering layers located at moderate altitudes (less than 5 km), the resulting signal in the strong water absorption bands drops to the noise level; hence, these cases would not be flagged as cirrus.
 We also investigated if this detection method is reliable when there is a low atmospheric water vapor content, since it could be that for clear dry scenes, the signal in the considered spectral window significantly exceeds the noise level due to a lower absorption by water (leading to a false positive detection). No particular bias of the method was found with water content, and both simulated and observed spectra for cirrus-free regions display a saturation of the water bands even in the case of a relatively dry atmosphere. To illustrate the different cases, Figure 5 shows examples of actual TANSO-FTS measurements in this spectral region in the vicinity of Orléans (France), for cases flagged as cirrus-contaminated or not, including a relatively dry scene (with a retrieved water column of 1.5 ×1022 molecules cm−2).
 To summarize, this method efficiently detects high altitude scattering layers that are most likely cirrus (or occasionally volcanic aerosol plumes) and remains reliable even in the case of relatively dry scenes.
 In the following, we study the corresponding scattering errors caused by cirrus (as detected by the method described above) in RemoTeC retrievals colocated with TCCON measurements. The error on XCO2 as a function of signal in the selected strong water absorbing bands is shown in Figure 6. Results are presented for both RemoTeC and the nonscattering retrievals. We note that a small fraction of the data contaminated by cirrus did not converge using the full physics algorithm but passed the quality filters of the nonscattering setup. To compare the performance of both methods, we thus only plot the common GOSAT soundings that passed quality filters of each algorithm.
 Figure 6 shows that as the cirrus signal increases, retrieved XCO2 values exhibit low biases of typically 2 to 8 ppm compared to the average error (the bulk of the retrievals are offset by about −2.3 ppm compared to TCCON). Such a low bias is expected in the case of nonscattering retrievals, as the presence of (unaccounted) cirrus over low or moderate ground albedo scenes yields an overestimation of the actual light path by the algorithm, which in turn yields an underestimation of XCO2. For instance, according to Aben and Hasekamp , a cirrus with an optical thickness of 0.05 over an area of surface albedo at 1.6 μm of 0.1 yields an underestimation of 8 ppm in the CO2 column if scattering effects are neglected. The correlation between retrieval errors and cirrus signal is less pronounced in the full physics case and outliers have less extreme values, which suggests that RemoTeC can partly account for scattering errors by cirrus by fitting effective aerosol parameters. However, filtering for cirrus remains necessary to improve accuracy by removing low-biased XCO2 values. From this analysis, we set the threshold of our cirrus filter at a radiance level of 2.5 ×10−9 in the region 5154.8–5157.8 cm−1, as for higher values, the error in XCO2 becomes too significant. Using this filter, the cirrus-flagged cases represent ∼13% of GOSAT data at global scale (after CAI cloud filtering is applied).
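The cirrus test reduces to comparing the signal in the saturated water window with a fixed threshold. The sketch below is a minimal illustration, not the operational implementation: it assumes that the "radiance level" is the mean radiance over the 5154.8–5157.8 cm−1 window and that the spectrum is expressed in the same (unstated) radiance units as the 2.5 × 10−9 threshold. The function name and the synthetic spectra are hypothetical.

```python
import numpy as np

def cirrus_flag(wavenumber, radiance, window=(5154.8, 5157.8), threshold=2.5e-9):
    """Flag a scene as cirrus-contaminated if the mean radiance in the strong
    H2O absorption window exceeds the threshold (ASSUMED to use the mean)."""
    wn = np.asarray(wavenumber, dtype=float)
    rad = np.asarray(radiance, dtype=float)
    in_window = (wn >= window[0]) & (wn <= window[1])
    if not in_window.any():
        raise ValueError("spectrum does not cover the cirrus window")
    return bool(rad[in_window].mean() > threshold)

# Synthetic spectra: a saturated (clear) scene near an assumed noise level,
# and a scene with excess signal in the water bands.
wn = np.linspace(5150.0, 5160.0, 101)
clear = np.full_like(wn, 1.0e-9)
cirrus = np.full_like(wn, 4.0e-9)
print(cirrus_flag(wn, clear), cirrus_flag(wn, cirrus))
```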
 Another way of looking at the impact of cirrus contamination is shown in Figure 7, which displays time series of retrieved XCO2 by RemoTeC around each TCCON station and highlights cases flagged by the cirrus filter. These cirrus cases are present around all stations at all seasons. A fraction of cirrus cases of typically 15% is observed in the vicinity of European stations (except at Sodankyla) and Tsukuba and is found to be the lowest around Sodankyla, Wollongong, and Lamont (6–7% only). During the period November to February, the occurrence of cirrus seems to drop for European stations and Park Falls. This is due to an observational bias, as a large fraction of GOSAT data are already filtered by our CAI cloud filter due to high cloud contamination in northern hemispheric winter.
 From Figures 6 and 7, we see that these cirrus cases are mostly outliers, but we note that a small fraction of the cirrus-contaminated data seems to agree with the bulk of retrieved XCO2. This might be explained by the fact that some scenes are more challenging than others: depending on albedo, cirrus, and aerosol load, light path enhancing and shortening effects can occasionally cancel out, and the resulting scattering error could be negligible even if many scattering events take place [see also O'Dell et al., 2012]. We estimate these cases with negligible error to represent about 10% of the cirrus-flagged data (at least based on the study of the TCCON stations surroundings), and we conclude that the cirrus filter's efficacy is satisfactory.
 Statistics of the agreement between RemoTeC retrievals and TCCON are shown in Table 2 for the data set that is at least 99% cloudfree, before and after cirrus filtering and using the 2 h temporal colocation. In the following, we refer to the bias as the mean difference between colocated TCCON and GOSAT individual measurements, the single-sounding precision as the 1-σ scatter of this difference, and the interstation bias as the standard deviation of the set of 12 individual biases (sometimes used as an estimate of regional accuracy).
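The three statistics defined above can be written out explicitly. The sketch below assumes that differences are taken as GOSAT minus TCCON (consistent with the negative offsets quoted in the text) and uses the sample standard deviation; both choices, as well as the function names, are assumptions of this illustration.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def station_stats(diffs_ppm: List[float]) -> Tuple[float, float]:
    """Bias (mean difference) and single-sounding precision (1-sigma scatter).

    diffs_ppm holds GOSAT minus TCCON XCO2 for colocated pairs at one station
    (sign convention assumed here).
    """
    return mean(diffs_ppm), stdev(diffs_ppm)


def interstation_bias(per_station: Dict[str, List[float]]) -> float:
    """Standard deviation of the individual station biases."""
    biases = [mean(d) for d in per_station.values()]
    return stdev(biases)
```

For example, `interstation_bias({"A": [-1.0, -3.0], "B": [0.0, 0.0]})` compares station biases of −2 and 0 ppm and returns their standard deviation.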
Table 2. Statistics of the Comparison of TCCON XCO2 Measurements With Colocated GOSAT Full Physics Retrievals Using the Large Colocation Box and Model Fields (First Three Groups of Rows) or the 5° Radius Colocation Region (Last Group of Rows)
 Adding the cirrus filter significantly improves the precision of full physics retrievals, from 3.60 to 2.75 ppm on average while removing 10% of the colocated data points. The mean bias is increased by 0.6 ppm on average, as low outliers are removed. The interstation bias is only slightly improved (from 0.9 to 0.8 ppm) after cirrus filtering, as most stations exhibit similar cirrus contamination and hence a similar change of bias when cirrus are filtered out.
 In the case where we neglect scattering, filtering for cirrus improves the precision of nonscattering retrievals as well, going from 6.1 to 3.8 ppm, while the interstation bias is improved from 1.8 to 1.3 ppm (Table 3).
Table 3. Statistics of the Comparison of TCCON XCO2 Measurements With Colocated GOSAT Retrievals That Neglect Scattering Using the Large Colocation Box and Model Fields (First Two Groups of Rows) or the 5° Radius Colocation Region (Last Group of Rows)
4.1 Aerosol-Induced Errors and Added Value of Full Physics Retrievals
 The previous section justified the use of two cloud filters, which can be used as a priori filters as they do not depend on retrieved quantities. We can thus assume that light path modifications that remain at this stage are mostly due to scattering events by aerosols. Here we continue with the evaluation of RemoTeC with the study of errors caused by aerosols. We aim to both justify the use of additional a posteriori filters for the full physics method and to evaluate the ability of RemoTeC to account for aerosol scattering by comparing its performance with retrievals performed under the assumption of a nonscattering atmosphere, once both data sets are filtered for clouds and thin cirrus.
 The main additional filters in the full physics setup are the aerosol filter (ωs, see equation (1)) and the total retrieved SOT. First, we note that a strong overlap exists between the cirrus filter and the filter ωs. This is illustrated in Figure 8, where the retrieved central height of the parameterized aerosol layer is plotted as a function of retrieved SOT. Most cases flagged as cirrus-contaminated have a retrieved effective height between 7 and 12 km and have a value of ωs that exceeds the threshold of 300 m. This shows that even if cirrus clouds are not included in our radiative transfer model, our algorithm tends to retrieve an effective height of scatterers that is typical of cirrus layers, i.e., located close to tropopause level. The aerosol filter ωs is thus a valuable tool that can be a strong asset when information on cirrus is not available. This will be the case for the future instruments OCO-2 (Orbiting Carbon Observatory-2) and Tropospheric Monitoring Instrument, as their spectrometers will not cover the strong water bands.
 The remaining cases with high values of ωs that are not flagged as cirrus-contaminated (blue squares in Figure 8) represent about 3–4% of the data set. They mostly correspond to cases where high SOT (>0.2) values were retrieved at rather low altitudes (2 to 5 km) and with values of the size parameter αs smaller than average (corresponding to particles larger than average). A few of these cases with high values of ωs also correspond to high altitude scattering layers (>12 km) with a low SOT (<0.1), which could be residual thin cirrus. These cases (ωs>300 but not cirrus-flagged) are represented in blue in the XCO2 time series of Figure 7 and mostly correspond to outliers compared to TCCON. Two illustrative examples are Park Falls and Bialystok during summer, where a significant number of these cases are observed and correspond to low values of XCO2. It is thus justified to add this filter to the set of quality filters of full physics retrievals, as it is quite efficient: it removes only a small fraction of data (once the cirrus filter is applied) that are mostly strong outliers. The last combination of cases corresponds to cirrus-flagged data with low values of ωs (green diamonds in Figure 8 and purple stars in Figure 7). These cases are a minority (less than 1%); about half of them seem to correspond to outliers, but not as strongly as the other cases.
 RemoTeC retrievals are further improved when an upper limit is additionally set on SOT. The reason for this is not that we observe a predominance of low- or high-biased retrievals, but rather that the scatter of XCO2 retrievals increases with SOT, hence the necessity to add this filter to further improve the precision. This is illustrated in Figure 9, which shows the error on XCO2 as a function of retrieved SOT up to values of SOT =0.5.
 In the end, the best performance is achieved when both cloud and aerosol filters are applied. Corresponding statistics are summarized in Table 2. Precision is now improved to 2.45 ppm and interstation bias to 0.7 ppm. We find that the largest gain in precision comes from applying the cirrus filter, and not the combination of aerosol filters (ωs and SOT). This is in agreement with other studies that showed that scattering errors in solar backscattered measurements are primarily coming from light path shortening effects from cirrus layers and that aerosols play a secondary role [e.g., Aben et al., 2007; Heymann et al., 2012]. Of course, as our cirrus filter and ωs strongly overlap, the outcome and influence of these filters depend on which filter is applied first to the data. However, as demonstrated by calculation of synthetic spectra and by the analysis of retrieved parameterized height, we show that this ensemble of cases where the filters overlap corresponds to an elevated (7–12 km) scattering layer, which is most likely cirrus.
 Considering the different TCCON stations individually, we note that for Darwin, Lamont, or Karlsruhe for instance, adding the two aerosol filters only marginally improves GOSAT precision and/or accuracy, whereas for Sodankyla or Bremen, there are more occurrences of difficult scenes (with a high retrieved value of SOT or ωs) that need to be filtered out. In addition, for retrievals that neglect scattering, a precision as low as 3 ppm is achieved around Darwin and Lamont (after strict cloud and cirrus filtering), which is surprisingly good for such a simplistic approach. These two results show that the range of aerosol-induced scattering errors varies significantly from station to station.
 At this stage, looking at the statistics of TCCON-GOSAT agreement, one could thus conclude that the advantage of the full physics setup over the nonscattering assumption lies mainly in the additional filtering steps, which increase precision relative to nonscattering results (2.4 ppm versus 4 ppm), as the interstation bias is rather similar. To better assess the added value of full physics retrievals, in Figure 9, we first compare the correlation of errors on XCO2 as a function of retrieved SOT for both setups. Only GOSAT data that pass the full physics convergence filters are selected, and we assign the value of SOT that was obtained for the corresponding RemoTeC retrieval to the nonscattering one. As expected, the error and scatter increase with SOT for retrievals that neglect scattering, with a correlation of 0.33. The range of error between values of SOT =0.05 and SOT =0.45 almost reaches 8 ppm, which is significant. On the other hand, full physics retrievals do not exhibit any significant correlation of errors with retrieved SOT.
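The remark that the apparent influence of overlapping filters depends on the order of application can be made concrete with a small sketch. The thresholds are those quoted in the text (cirrus radiance 2.5 × 10^−9, ωs of 300, SOT of 0.25); the dictionary keys and function name are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Sounding = Dict[str, float]

# Each filter returns True when the sounding passes. Keys are hypothetical names.
FILTERS: Dict[str, Callable[[Sounding], bool]] = {
    "cirrus": lambda s: s["cirrus_radiance"] <= 2.5e-9,
    "omega_s": lambda s: s["omega_s"] <= 300.0,
    "sot": lambda s: s["sot"] < 0.25,
}


def first_rejection(sounding: Sounding, order: List[str]) -> Optional[str]:
    """Name of the first filter (in the given order) that rejects the sounding."""
    for name in order:
        if not FILTERS[name](sounding):
            return name
    return None  # the sounding passes every filter
```

A sounding with both a high cirrus radiance and a high ωs is attributed to whichever filter runs first, so per-filter rejection counts are only meaningful for a fixed order.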
 We also expect scattering errors to depend on albedo. Indeed, as derived from simulations [Butz et al., 2009, 2010], scattering events by aerosols tend to shorten the light path in regions of low ground albedo (leading to underestimation of XCO2 if scattering is neglected) and enhance the light path in regions of high ground albedo. Figure 10 shows the correlation of errors with albedo for both setups (common data only), once the filter SOT < 0.25 is applied. As expected, in the nonscattering case, there is a nonnegligible correlation of errors with albedo (0.28) leading to a 2.3 ppm difference in retrieved XCO2 for a variation of 0.15 in albedo. This error is also probably underestimated, because for comparison purposes we have only plotted data points common to the RemoTeC retrievals, i.e., the most challenging scenes with high SOT and/or high ωs values are removed from the nonscattering data set. In the full physics case, we note a small anticorrelation of errors (−0.13) with albedo. This residual error indicates that our algorithm tends to overestimate multiple scattering effects, or backscattering effects. Using a linear regression fit, we estimate the error on XCO2 to be ∼0.8 ppm per increment of albedo of 0.15. In spite of this negative point for RemoTeC, systematic errors with albedo at global scale are thus expected to be about three times smaller for the full physics setup than for nonscattering retrievals. A bias correction scheme will be presented in section 5 to further reduce systematic errors.
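The two numbers used in this comparison, a correlation coefficient and an error per 0.15 increment of albedo, follow from an ordinary least-squares fit. A minimal sketch, with hypothetical function names and no claim to reproduce the exact fitting procedure used for Figure 10:

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson_and_slope(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Pearson correlation of y with x and the least-squares slope dy/dx."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy), sxy / sxx


def error_per_albedo_step(albedo: Sequence[float],
                          error_ppm: Sequence[float],
                          step: float = 0.15) -> float:
    """Systematic XCO2 error (ppm) implied by the fit for one albedo step."""
    _, slope = pearson_and_slope(albedo, error_ppm)
    return slope * step
```

Quoting the slope per 0.15 albedo step rather than per unit albedo keeps the number comparable to the interstation bias, since 0.15 is a typical regional albedo contrast within the 0.1–0.45 range discussed in the text.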
 We can use correlation of errors on albedo to estimate the accuracy of the retrievals at regional scales. Globally, the albedo at 1.6 μm is in the range 0.1–0.45 (apart from desert areas, not studied here, for which albedo values up to 0.7 are observed). Systematic errors on XCO2 for variations of 0.15 in albedo (as quoted above) can thus be considered a reasonable estimate of regional accuracy. The corresponding error obtained for the full physics retrievals, of 0.8 ppm, is of the same order of magnitude as the TCCON interstation bias. However, the 2.3 ppm systematic error with an albedo change of 0.15 derived in the nonscattering case is much higher than the corresponding interstation bias, estimated to be 1.3 ppm. We can thus reasonably say that in the nonscattering case, the station-to-station variability (here based on 12 TCCON stations) is a figure of merit that underestimates regional systematic errors. This conclusion could change depending on future network extension.
 Finally, we have also investigated how the validation results presented in Table 2 were impacted by the use of different CO2 prior profiles in the TCCON and GOSAT data analysis. Indeed, because GOSAT averaging kernels are not equal to unity, the XCO2 retrievals are intrinsically dependent on the choice of the CO2 prior profile, which is an effect we neglected in our validation study. To estimate this effect, we have calculated the following quantity for four TCCON stations (Lamont, Darwin, Wollongong, and Sodankyla):
where ak is the column averaging kernel matrix for GOSAT measurements and xapr TCCON and xapr GOSAT are the CO2 a priori profiles used for TCCON and GOSAT XCO2 retrievals, respectively. We find that A has a peak-to-peak amplitude of typically 0.5 ppm, with a 1-σ scatter of 0.2 ppm. The mean value of A varies slightly from station to station (between −0.15 and −0.55 ppm for the four stations considered) but is uncorrelated with the station-to-station variability of the TCCON-GOSAT biases. For instance, the mean bias at Lamont reported in Table 2 is −2.4 ppm and that at Sodankyla is 0.0 ppm, whereas A has a mean value of −0.15 ppm for both these stations. We thus conclude that the values reported in Table 2 are only marginally influenced by the use of different priors.
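To illustrate how a prior-related term of this kind can be evaluated, the sketch below uses a commonly used form, the pressure-weighted sum of (1 − ak) times the difference between the two priors. This form, the per-layer weights `h`, and the function name are assumptions of the example and may differ from the expression actually used for A.

```python
from typing import Sequence


def prior_adjustment(h: Sequence[float],
                     ak: Sequence[float],
                     x_apr_tccon: Sequence[float],
                     x_apr_gosat: Sequence[float]) -> float:
    """Assumed form: sum_i h_i * (1 - ak_i) * (x_apr_TCCON_i - x_apr_GOSAT_i).

    h: hypothetical pressure weights per layer (summing to 1)
    ak: column averaging kernel per layer
    x_apr_*: CO2 a priori profiles in ppm on the same layers
    """
    if not (len(h) == len(ak) == len(x_apr_tccon) == len(x_apr_gosat)):
        raise ValueError("all profiles must be defined on the same layers")
    return sum(hi * (1.0 - ai) * (xt - xg)
               for hi, ai, xt, xg in zip(h, ak, x_apr_tccon, x_apr_gosat))
```

Under this form, layers where the averaging kernel equals 1 contribute nothing, which matches the statement that the prior dependence arises because the averaging kernels are not equal to unity.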
 Here we demonstrated that the strong added value of full physics results is not only to improve single-sounding precision but also to reduce the dependency of errors on aerosol optical thickness and albedo (linked to multiple scattering errors). This is expected to greatly improve accuracy at global scale compared to nonscattering retrievals. We also showed that the interstation bias, as calculated from these 12 TCCON stations alone, is currently not an appropriate figure of merit by itself for estimating accuracy but that the study of correlation of errors is at the moment more reliable for this purpose.
5 Residual Errors and Bias Correction
5.1 Assessment of Biases
 In the previous section, a residual dependence on the albedo at 1.6 μm was found for the full physics retrievals. To better evaluate and characterize our products and their accuracy, here we also look for potential correlations of errors with other parameters than SOT and albedo: with instrumental, geophysical, meteorological, retrieved parameters, etc. A similar approach was followed by Wunch et al. [2011b] to estimate biases in the ACOS B2.8 and 2.9 data products and by Cogan et al.  in UoL-FP XCO2 retrievals. As a reference, Wunch et al. [2011b] used the property that XCO2 fields south of 25°S undergo very little variations, after the mean annual increase and seasonal cycles are subtracted. Wunch et al. [2011b] found a correlation of their retrieved XCO2 with the following four physical parameters:
 Blended albedo, defined as follows: ;
 Signal in O2 A-band;
 Difference between retrieved and meteorological pressure (ΔP);
 Air mass, defined as 1/cos(SZA) +1/cos(VZA), where SZA and VZA are the solar and viewing zenith angles, respectively.
 They subsequently derived a bias correction based on linear regression fits to the data to improve the precision and accuracy of their XCO2.
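The air mass definition in the list above can be evaluated directly. A minimal sketch, assuming angles are given in degrees (the function name is hypothetical):

```python
from math import cos, radians


def air_mass(sza_deg: float, vza_deg: float) -> float:
    """Air mass = 1/cos(SZA) + 1/cos(VZA), with both zenith angles in degrees."""
    return 1.0 / cos(radians(sza_deg)) + 1.0 / cos(radians(vza_deg))
```

A nadir view with the Sun overhead gives the minimum value of 2, and the value grows as either angle approaches the horizon.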
 Unlike Wunch et al. [2011b], here we look for correlation of errors in the difference between GOSAT and TCCON colocated XCO2 measurements, instead of using detrended XCO2 fields south of 25°S. Correlation coefficients of the error on XCO2 with 11 parameters are listed in Table 4, while examples of correlation plots are shown in Figure 11 as a function of six parameters: air mass, retrieved water column, blended albedo, signal in O2 A-band, SOT ×zS, and the inverse of the retrieved aerosol size parameter, 1/αs. The main correlation of errors (0.28) is found with 1/αs. To some extent, this error may be linked to the aerosol effective parametrization in RemoTeC. For instance, it could mean that the power law size distribution is not a valid approximation within the whole range of particle size considered.
Table 4. Correlation Coefficients of Several XCO2 Data Products (The Original Retrievals, and the Updated XCO2 Data Sets After Each Step of Bias Correction) With Various Instrumental, Geophysical, and Retrieved Parameters
Parameters (row labels): Albedo at 1.6 μm; Albedo at 2 μm; Signal in O2 A-band
 The range of error in XCO2 over the complete range of values of αs considered (3 to 4.7) is about 4 ppm, which is significant. Such correlation of error may lead to regional and/or seasonal systematic errors and hence may hinder the accuracy of our XCO2 product and its use for inverse modeling. If possible, we aim at reducing correlation of errors in the framework of future updates of RemoTeC, for instance by modifying our effective parametrization of aerosols. In the meantime, a bias correction was developed and tested to reduce potential systematic errors in the current data product.
5.2 Bias Correction: Method and Evaluation
 We first applied a linear correction to the XCO2 data as a function of 1/αs, based on a linear regression fit. Then, we repeated the analysis of correlation of errors of this new data set. Values of the different correlation coefficients are summarized in Table 4. Applying this first correction reduces the errors with most parameters, for instance with albedo in each band and air mass, even if these parameters are not taken into account in the bias correction. However, a nonnegligible correlation is introduced as a function of SOT ×zS (correlation of −0.23). To correct for it, we apply the following bilinear bias correction (derived after repeated tests):
 This time, when we apply equation (3), no significant correlation of errors remains in the updated data set (Table 4), as far as the comparison with TCCON and the studied parameters are concerned. Corresponding correlation plots with the bias-corrected product are shown in Figure 12. The fact that the dependency of errors on albedo is much reduced after bias correction (even though this parameter is not taken into account in equation (3)) indicates that this ad hoc correction partly cancels out residual scattering errors, as they depend on ground albedo. We thus expect a significant overall improvement in terms of reducing systematic errors. We note that other bias corrections were tested, with other parameters or applied in different orders, but they were not found to be satisfactory.
 We then evaluated this new product by (1) repeating our validation study to estimate in particular the gain in precision and (2) comparing the bias-corrected product to independent data (CarbonTracker2010) [Peters et al., 2007] at global scale.
 The comparison to TCCON now yields a bias near zero by construction. The new statistics indicate that the single-sounding precision of GOSAT retrievals, once bias corrected, is systematically better at each station, by values between 0.1 and 0.4 ppm. Now on average, a precision of 2.3 ppm is obtained, varying between 1.9 ppm for Lamont and 2.6 ppm around Garmisch. This improvement is a direct consequence of the cancellation of error dependency, and the conclusion that a smaller scatter is obtained at each station is quite robust. The interstation bias is similar, about 0.7 ppm, but as explained previously, it is not to be taken as an exhaustive representation of accuracy. We find that Sodankyla and Wollongong are the two outliers in terms of individual biases, where GOSAT retrievals are higher than TCCON by, respectively, 1.8 and 1.1 ppm, after bias correction. These two sites were already outliers before the bias correction, with values about 1.7 and 0.9 ppm higher than the average bias (Table 2). We could not find the reason for these high biases, which cannot be explained by a correlation of errors with the different studied parameters. These biases could result from an unidentified source of error left in RemoTeC, and/or a bias in TCCON measurements at these stations. For the former possibility, it could be that we overlooked a correlation with a given parameter or with a complex combination of various parameters, or that because the ground-based network is rather sparse, some sources of systematic error at global scale are not visible in the GOSAT subset of data colocated with TCCON. We also note that Wollongong is a challenging site for validation purposes as it is an urban area on a narrow coastal plain, located approximately 2 km from both the ocean and a 400 m high escarpment, which is hardly ideal for comparing to large-scale XCO2 measurements obtained from the GOSAT 10 km footprint.
 Finally, we investigated the effect of the bias correction at global scale. The change in XCO2 induced by the bias correction is plotted in Figure 13 for the combined months August and September 2009. After a global 2 ppm bias was removed, regional patterns in the range ± 2 ppm are observed. This range of variation in XCO2 at global scale is in agreement with the range of errors induced by the main correlation of errors with 1/αs observed around TCCON stations, mentioned previously (a 4 ppm amplitude, see also Figure 11). In addition, we recall that at global scale, the range of αs is the same as in TCCON vicinity, as we chose to restrict its values to 3 to 4.7 as part of one of the postfiltering steps, to remove a few outliers in the TCCON-GOSAT comparison. In other words, the extrapolation of the bias correction from TCCON surroundings to global scale seems consistent.
 Over some regions, systematic changes in XCO2 are observed: for instance, the bias correction has the effect of systematically increasing XCO2 by 1 to 1.5 ppm over the Mongolian region throughout the year. Over the United States, we note a systematic asymmetry between the west, which sees its XCO2 increased by 0.5–1 ppm on average, and the east, where the bias correction has the effect of reducing XCO2 by 0.5–1 ppm. These two regions are highlighted in Figure 13. In order to evaluate whether these changes are reasonable or if they degrade the quality of our product, we performed comparisons with our prior XCO2 coming from the CarbonTracker assimilation system. The difference between CarbonTracker and our retrievals, for 6 months of data, is plotted as a function of longitude in Figure 14, before and after bias correction. Concerning the United States, we find that the difference between CarbonTracker and our original XCO2 retrievals increases with longitude, which is not expected. CarbonTracker XCO2 estimates are supposedly the most reliable in the United States, as the model assimilates flask sample data from numerous U.S. surface stations. When the bias correction is applied, this trend with longitude is efficiently canceled out. We can thus assume that the asymmetry present in the bias correction function actually corrects for an erroneous asymmetry observed in the original data set. Over Mongolia, we also find that applying the bias correction has the effect of reducing the difference between GOSAT and CarbonTracker (Figure 14). Indeed, in our original data product, the difference between model and observed XCO2 over Mongolia shows an offset, of about −1 ppm, compared to the surrounding longitudes. Such variations with longitude are not expected in this region, where there are no strong sources or sinks of CO2. For these two examples, we find that the reported biases are primarily caused by the dependency of errors on αs.
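A regression-based correction in two predictors, here 1/αs and SOT × zS, can be sketched as a least-squares fit followed by subtraction of the fitted error. The additive form, the function names, and the fitting details are assumptions of this illustration; the coefficients and exact functional form of the correction used in the text are not reproduced here.

```python
from typing import List, Sequence, Tuple


def fit_bilinear(u: Sequence[float], v: Sequence[float],
                 err: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit err ~ c0 + c1*u + c2*v (u = 1/alpha_s, v = SOT*z_s)."""
    n = len(err)
    mu, mv, me = sum(u) / n, sum(v) / n, sum(err) / n
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    de = [e - me for e in err]
    suu = sum(a * a for a in du)
    svv = sum(b * b for b in dv)
    suv = sum(a * b for a, b in zip(du, dv))
    sue = sum(a * e for a, e in zip(du, de))
    sve = sum(b * e for b, e in zip(dv, de))
    det = suu * svv - suv * suv
    if det == 0.0:
        raise ValueError("predictors are collinear")
    c1 = (sue * svv - sve * suv) / det
    c2 = (sve * suu - sue * suv) / det
    return me - c1 * mu - c2 * mv, c1, c2


def apply_correction(xco2: Sequence[float], u: Sequence[float], v: Sequence[float],
                     coeffs: Tuple[float, float, float]) -> List[float]:
    """Subtract the fitted error from each retrieval (additive form assumed)."""
    c0, c1, c2 = coeffs
    return [x - (c0 + c1 * a + c2 * b) for x, a, b in zip(xco2, u, v)]
```

Fitting both predictors simultaneously, rather than one after the other, avoids the situation described in section 5.2 where correcting for 1/αs alone introduces a correlation with SOT × zS.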
We conclude that although the bias correction is based on comparisons with TCCON, it seems to efficiently reduce systematic biases of the order of 1 to 2 ppm at global scale, which gives us confidence in this new data product.
6 Scattering Errors and Representativeness of the TCCON Network
 In this section, we address the issue of the representativeness of the 12 TCCON stations used in this study in terms of the range of light path modifications by aerosols surrounding each site. Indeed, the larger the range of scattering effects in the surrounding scenes is, the more robust the validation exercise of satellite data is. We first investigate the impact of the choice of the colocation method in validation studies, then estimate the range of scattering errors in GOSAT data at global scale.
6.1 Impact of the Choice of the GOSAT-TCCON Colocation Criterion
 We have investigated the added value of using the large colocation box (with additional constraints from model fields) compared to the 5° radius colocation area. The validation results obtained with the two different colocation methods are reported in Tables 2 and 3. The first difference lies in the number of colocated pairs, which is on average a factor of 3 larger when the larger colocation box is used, which renders the statistics more robust. We can first check that these additional data points, coming from regions farther away from TCCON stations, do not contain especially high- or low-biased XCO2 values compared to those within 5° of TCCON stations by looking at the full physics results: going from the 5° colocation to the larger box, neither the bias nor the precision changes significantly for most stations. Exceptions are found for Sodankyla and Wollongong, where the bias changed by ∼1 ppm, but at the same time, the precision was improved by 1 ppm, which is why we consider the larger colocation method to give more robust and representative biases.
 However, a higher number of data in itself is not the only advantage of using this new colocation method. We also show that the range of scattering errors due to aerosols in TCCON surroundings is enhanced using the large colocation box, more or less significantly depending on the station considered. This can be seen by looking at the range of errors in the nonscattering retrievals compared to TCCON results. Two examples are shown, for Sodankyla (Finland) and Park Falls (USA), in Figure 15, which compares the range of scattering errors in the vicinity of these stations using the two colocation methods. These results are plotted as a function of albedo, and we also color-code data points whose SOT value, as determined from the RemoTeC retrievals, is greater than 0.15. For Sodankyla, extending the colocation area has the effect of adding more data that have a high SOT. As Sodankyla surroundings exhibit a low albedo at 1.6 μm (0.14 on average), higher SOT means even more light path shortening effects, and these data points correspond to low outliers. As a result, the scatter of the GOSAT nonscattering retrievals is increased significantly (from 3.9 to 5.5 ppm), and the mean bias is pulled down from −2.7 to −4.2 ppm. In the case of Park Falls, extending the colocation area significantly changes the distribution of ground albedo sounded, with a greater fraction of data toward larger albedo (the albedo range in itself is not significantly increased). As a consequence, because of the positive correlation of errors with albedo, the mean bias is increased, by 0.8 ppm, when going from a 5° colocation criterion to a larger colocation box. We also note that a larger fraction of high SOT cases are found when using the large colocation box, as for Sodankyla.
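For reference, the simpler of the two criteria, a 5° radius combined with the 2 h temporal window, can be sketched as follows. Interpreting the radius as a great-circle angular distance and the window as an absolute time difference are assumptions of this example, as are the function names; the model-field constraints of the large colocation box are not represented.

```python
from math import asin, cos, degrees, radians, sin, sqrt


def angular_distance_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle angular separation in degrees (haversine formula)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return degrees(2 * asin(min(1.0, sqrt(a))))


def is_colocated(sat_lat: float, sat_lon: float, sat_time_h: float,
                 stn_lat: float, stn_lon: float, stn_time_h: float,
                 radius_deg: float = 5.0, max_dt_h: float = 2.0) -> bool:
    """True when a sounding lies within the radius and time window of a station."""
    close = angular_distance_deg(sat_lat, sat_lon, stn_lat, stn_lon) <= radius_deg
    return close and abs(sat_time_h - stn_time_h) <= max_dt_h
```

A fixed radius treats all directions equally, whereas the large colocation box relies on modeled XCO2 fields to decide which more distant soundings remain comparable to the station.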
Overall, when the large colocation box is used, nonscattering results present a slightly degraded precision and a larger interstation bias, as the coverage of albedo and/or SOT values is modified and the range of scattering errors increased. The overall range of albedo covered by the ensemble of all colocated TCCON-GOSAT data is not significantly extended when the larger colocation box is used; the main impact of the change of colocation method is rather that more challenging scenes are included in the validation studies at the scale of individual TCCON stations.
 To summarize, the large colocation box, additionally constrained by modeled fields, is found beneficial for the analysis of correlation of errors as it covers more challenging scenes around individual TCCON stations, even though it does not extend significantly the range of albedo. In the nonscattering case, the statistics of the agreement with TCCON change quite considerably depending on the choice of the colocation method, whereas the statistics are more stable for the full physics setup. This also demonstrates the robustness of our RemoTeC retrievals.
6.2 Estimation of Scattering Errors at Global Scale
 In the previous section, we investigated the range of scattering errors due to aerosols in the vicinity of the TCCON stations. However, many regions are not covered by the TCCON network, and it could be that more challenging scenes for satellite retrievals are not currently covered by this network and cannot be validated. This is the issue we want to address in this section.
 We expect that the more challenging scenes are located in desert areas such as the Sahara, where the combination of high albedo and high aerosol load during dust storms induces complex light path modifications. No TCCON station lies in the Sahara vicinity (except one at Tenerife, which is located on the Canary Islands at 2370 m altitude, so that direct comparisons are not straightforward); however, the corresponding GOSAT data over the Sahara are acquired with the medium gain setting, and these data are not discussed here.
 We thus focus at the moment on the range of scattering errors by aerosols at global scale for TANSO-FTS high gain data only, to investigate if TCCON surroundings are representative of the whole variability of errors at least in this subset of GOSAT data. Because we cannot compare our retrievals to the “truth” at global scale, we choose here to study the range of the differences between RemoTeC (bias-corrected) and nonscattering XCO2 retrievals. The differences between the two retrievals are linked to the range of scattering errors, as RemoTeC partially accounts for them. Indeed, we showed in the previous section that no significant correlation of errors remains in the bias-corrected product with SOT, albedo, aerosol parameters, etc. We note that RemoTeC retrievals are already filtered for difficult aerosol scenes; hence, this study should give an estimate of the range of aerosol-induced scattering errors at global scale for the “good” RemoTeC retrievals only.
 We show in Figure 16 the difference in XCO2 from the two sets of retrievals at global scale for 1 year (June 2009 to May 2010), as a function of albedo at 1.6 μm. Color-coded are different ranges of SOT. We note that the difference between the two XCO2 data sets increases with albedo and SOT. Hence, the data in this figure reproduce the expected trend, as the error from retrievals that neglect scattering should also increase with albedo and SOT (see for instance Aben and Hasekamp ). At global scale, the range of error in nonscattering retrievals (compared to RemoTeC) is within ± 5.8 ppm, spanning a range of albedo from 0.1 to ∼0.46. About 12% of the data points have retrieved values of SOT between 0.2 and 0.25, which is quite significant. On the other hand, when the same analysis is performed for TCCON surroundings only, with the large colocation box, the range of error is slightly smaller (−5.4 to 5 ppm), and the albedo values are smaller, up to 0.38. In addition, only 7% of these data points have SOT values between 0.2 and 0.25. We thus conclude that TCCON surroundings, even when the more appropriate colocation criterion is used, are currently not completely representative of the whole globe and exhibit a lack of coverage of high albedo (higher than 0.38 at 1.6 μm) regions, and of regions with larger SOT, compared to GOSAT global coverage (with high gain setting only). We note that with the recent addition of several new TCCON stations (Caltech in the Los Angeles basin, Réunion Island, Ascension Island), TCCON is becoming increasingly representative, and this situation may be mitigated. This will be tested when enough TCCON-GOSAT colocated pairs are available from these stations to update our analyses.
 In the meantime, we conclude that for validation purposes, it would be interesting to choose future locations of validation stations not only according to scientific relevance and other practical considerations but also with respect to the difficulty of the scene for satellite retrievals. Regions with combined low albedo (< 0.15) and frequent occurrence of SOT larger than 0.2 were mostly found at latitudes higher than 50°N in Eurasia and North America as well as in central Africa, whereas regions of higher albedo (0.35–0.45) and high SOT were mostly found in the Middle East and central Asia.
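The three quantities compared here between the global data set and the TCCON surroundings (range of the XCO2 difference, maximum albedo, and fraction of soundings with SOT between 0.2 and 0.25) can be collected in one helper. The function and key names are hypothetical, and the inclusive SOT bounds are an assumption.

```python
from typing import Dict, Sequence


def representativeness_summary(diff_ppm: Sequence[float],
                               albedo: Sequence[float],
                               sot: Sequence[float]) -> Dict[str, float]:
    """Summarize the spread of nonscattering minus full physics XCO2 differences.

    diff_ppm, albedo and sot are per-sounding values of equal length.
    """
    n = len(diff_ppm)
    if n == 0 or n != len(albedo) or n != len(sot):
        raise ValueError("need non-empty sequences of equal length")
    high_sot = sum(1 for s in sot if 0.2 <= s <= 0.25)
    return {
        "diff_min_ppm": min(diff_ppm),
        "diff_max_ppm": max(diff_ppm),
        "albedo_max": max(albedo),
        "frac_sot_0p20_0p25": high_sot / n,
    }
```

Running the same summary on the global set and on the colocated subset gives a like-for-like comparison of how much of the difficult part of parameter space the validation sites cover.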
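The two categories of difficult scenes named above can be expressed as a simple classification. The thresholds follow the text; treating "high SOT" as SOT larger than 0.2 in both categories is an assumption, and the function name and labels are hypothetical.

```python
def scene_class(albedo_1p6um: float, sot: float, sot_limit: float = 0.2) -> str:
    """Label a scene by the albedo and SOT regimes discussed in the text."""
    if sot > sot_limit and albedo_1p6um < 0.15:
        return "low albedo, high SOT"
    if sot > sot_limit and 0.35 <= albedo_1p6um <= 0.45:
        return "high albedo, high SOT"
    return "other"
```

Counting how often each label occurs around a candidate site would give a rough indication of how much that site could add to the range of validated scattering conditions.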
7 Summary and Conclusion
 In this paper, we presented a detailed characterization of our full physics retrieval algorithm, RemoTeC. In particular, we evaluated how RemoTeC handles scattering errors due to the presence of water clouds, thin cirrus, and aerosols based on GOSAT measurements colocated with TCCON. Comparisons with nonscattering retrievals were performed to estimate the added value and performance of the full physics retrievals, and this comparison study also allowed us to broaden the discussion to the field of validation methodology. Finally, we also investigated potential biases and systematic errors in our data product and proposed a bias correction based on two parameters.
 We found that RemoTeC retrievals need to be strictly filtered for water clouds (99% of the outer field of view must be cloudfree) and cirrus to avoid systematic errors and obtain the best performance, although the full physics results are slightly less affected by cirrus than the nonscattering retrievals. Applying the cirrus filter to RemoTeC retrievals is quite efficient, as it removes about 10% of the cloudfree data set while improving the single-sounding precision from 3.6 to 2.8 ppm. We then justified the use of two additional filters for the full physics method, based on retrieved scattering parameters, to remove difficult scenes that RemoTeC cannot currently process with sufficient accuracy. Scattering errors caused by (low layers of) aerosols were found to be secondary compared to errors caused by cirrus (or by elevated layers of aerosols, indistinguishable in our retrievals).
 We next compared the performance of both RemoTeC and nonscattering retrievals in terms of their ability to account for light path modifications caused by aerosols, once the data sets are filtered for clouds and thin cirrus (and the most difficult aerosol scenes). We showed that the full physics algorithm, compared to nonscattering retrievals, significantly reduces the correlation of XCO2 errors with albedo and removes the correlation with SOT, and hence reduces regional and/or systematic errors. However, correlations are introduced with other effective aerosol parameters, in particular the size parameter αs, and we propose a bias correction that improves both precision and accuracy.
 The detailed error analysis that we performed can provide insights into which elements of our algorithm need further improvements. For instance, a retrieval of two types of particles, with a fine mode and a coarse mode, could be investigated to reduce this error dependency, as well as other size distributions. However, one has to keep in mind that the degrees of freedom for aerosol parameters are rather low (∼2.5 in the current setup), hence the number of retrieved effective aerosol parameters should remain low, which makes the task challenging.
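 To illustrate how a bias correction based on two retrieved parameters can be constructed, the sketch below fits the XCO2 error (satellite minus colocated reference) as a linear function of two parameters by ordinary least squares and subtracts the fitted dependence. The function names, the purely linear form, and the synthetic numbers are assumptions made for illustration; they are not the correction implemented in RemoTeC.

```python
import numpy as np

def fit_bias_correction(xco2_sat, xco2_ref, p1, p2):
    """Fit (sat - ref) = a + b*p1 + c*p2 by least squares; return (a, b, c).

    p1 and p2 stand for two retrieved parameters (hypothetical choice).
    """
    err = np.asarray(xco2_sat, dtype=float) - np.asarray(xco2_ref, dtype=float)
    design = np.column_stack([np.ones_like(err),
                              np.asarray(p1, dtype=float),
                              np.asarray(p2, dtype=float)])
    coeffs, *_ = np.linalg.lstsq(design, err, rcond=None)
    return coeffs

def apply_bias_correction(xco2_sat, p1, p2, coeffs):
    """Subtract the fitted parameter-dependent error from the retrievals."""
    a, b, c = coeffs
    return (np.asarray(xco2_sat, dtype=float)
            - (a + b * np.asarray(p1, dtype=float) + c * np.asarray(p2, dtype=float)))

# Synthetic demonstration (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 500
ref = 390.0 + rng.normal(0.0, 1.0, n)          # reference XCO2, ppm
size_param = rng.uniform(3.0, 5.0, n)          # stand-in for a size parameter
sot = rng.uniform(0.0, 0.25, n)                # stand-in for optical thickness
sat = ref + 1.0 - 0.5 * size_param + 4.0 * sot + rng.normal(0.0, 2.0, n)

coeffs = fit_bias_correction(sat, ref, size_param, sot)
corrected = apply_bias_correction(sat, size_param, sot, coeffs)
print("scatter before correction (ppm):", np.std(sat - ref))
print("scatter after correction (ppm): ", np.std(corrected - ref))
```

 In practice the coefficients would be derived from colocated satellite and reference pairs and then applied to all soundings, so the quality of the correction depends on how well the colocated sample covers the range of each parameter.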
 Our study also has several implications for validation methodology in general:
 We showed that the range of scattering errors due to aerosols varied quite significantly from station to station. It is thus very important to include as many TCCON stations as are available in validation studies of spaceborne retrievals, in order to cover different aerosol scenarios.
 Because the 12 TCCON stations used in this study do not provide global coverage, the interstation bias is currently not, by itself, an exhaustive measure of retrieval accuracy and should be complemented by analysis of the correlation between measurement errors and/or biases and retrieval parameters.
 The size of the spatial colocation region may be relaxed using additional constraints based on modeled XCO2 gradients within the colocation area, to increase the size and diversity of validation data sets. Wunch et al. [2011b] reached a similar conclusion using a colocation criterion based on potential temperature at 700 hPa and showed that the number of colocated data points increased significantly. Here we additionally show that this colocation method also extends the range of scattering errors sampled in TCCON surroundings (mostly at the scale of individual stations). As a result, the overall statistics are more representative, and the evaluation of the biases in the GOSAT retrievals is more robust.
 Highly accurate validation of XCO2 at global scale, based on GOSAT-TCCON comparison only, remains challenging. While the TCCON network surroundings represent a fairly large subset of conditions encountered at global scale, they lack coverage of the more challenging scenes for satellite retrievals (with higher albedo and/or higher SOT). Therefore, for validation purposes, an extension of the TCCON network that takes this aspect into account would be very valuable.
 We note that the precision currently reached by the satellite retrievals themselves is another limiting factor in the characterization of satellite retrieval biases and accuracy. At the same time, the satellite retrievals are starting to approach the accuracy of the TCCON validation network itself, estimated at 0.1% (1-σ value) [Wunch et al., 2010]. Further developments are thus still needed to improve the precision and accuracy of satellite-derived XCO2, to meet the demanding user requirements for inverse modeling of sources and sinks. Validation of satellite data will also benefit from TCCON network extensions and an improved characterization of TCCON accuracy and site-dependent biases.
 Future efforts will focus on validation of glint retrievals over the ocean obtained with RemoTeC, not discussed here, which play an important role in obtaining a more global coverage. We will also investigate in more detail the biases in retrievals obtained from the medium gain setting of TANSO-FTS over deserts, which should be among the most difficult scenes for RemoTeC (high albedo and large amount of dust aerosols). We note that for these two subsets of GOSAT data, validation will remain challenging, as currently only a few stations are located in the vicinity of the ocean or desert areas.
 In the meantime, the developments presented in this paper show that we have improved the quality of our retrievals compared to previous work: the precision achieved is now ∼2.45 ppm (2.3 ppm after bias correction), instead of 2.8 ppm as reported by Butz et al. Furthermore, the bias correction appears to improve accuracy, as no significant bias with, for instance, albedo is left after the correction is applied, even though albedo is not used in the bias correction. Improved accuracy at global scale is also assessed by independent comparison with CarbonTracker model fields. Retrieved global XCO2 fields from GOSAT using RemoTeC v1.9 are now being used in inverse models to evaluate their capacity to better constrain sources and sinks of CO2.
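 The relaxed colocation idea discussed above can be sketched as follows: a sounding is kept if it lies inside a large latitude/longitude box around a station and if the modeled XCO2 at the sounding location is close to the modeled XCO2 at the station. The function name, the argument names, and the numerical values in the example are hypothetical and chosen only for illustration; they are not the criteria used in this study.

```python
import numpy as np

def colocation_mask(sound_lat, sound_lon, model_at_sounding,
                    station_lat, station_lon, model_at_station,
                    half_width_lat, half_width_lon, max_model_diff_ppm):
    """Boolean mask of soundings colocated with one station.

    A sounding passes if it is inside the box AND the modeled XCO2
    difference between sounding and station is small (assumed criterion).
    """
    lat = np.asarray(sound_lat, dtype=float)
    lon = np.asarray(sound_lon, dtype=float)
    model = np.asarray(model_at_sounding, dtype=float)

    # Longitude difference wrapped to [-180, 180) to handle the date line.
    dlon = (lon - station_lon + 180.0) % 360.0 - 180.0

    in_box = (np.abs(lat - station_lat) <= half_width_lat) & \
             (np.abs(dlon) <= half_width_lon)
    similar_air = np.abs(model - model_at_station) <= max_model_diff_ppm
    return in_box & similar_air

# Illustrative call with made-up soundings around a made-up station.
mask = colocation_mask(
    sound_lat=[46.0, 46.0, 60.0],
    sound_lon=[12.0, 12.0, 12.0],
    model_at_sounding=[390.1, 392.0, 390.1],
    station_lat=45.0, station_lon=10.0, model_at_station=390.0,
    half_width_lat=7.5, half_width_lon=20.0, max_model_diff_ppm=0.5,
)
print(mask)
```

 The model constraint is what allows the box to be enlarged: soundings far from the station are accepted only where the model indicates that they sample air with similar XCO2.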
 Access to GOSAT data was granted through the second GOSAT research announcement jointly issued by JAXA, NIES, and MOE. SG acknowledges funding from ESA's Climate Change Initiative on GHGs and the European Commission's seventh framework program under grant agreement 218793. AB is supported by the Emmy-Noether programme of Deutsche Forschungsgemeinschaft (DFG) through grant BU2599/1-1 (RemoTeC). DS is funded by the Dutch User Support Program under project GO-AO/21. SB was supported by the Gebruikersondersteuning ruimteonderzoek program of the Nederlandse organisatie voor Wetenschappelijk Onderzoek (NWO) through project ALW-GO-AO/08-10. We wish to thank Jean-Michel Hartmann and Ha Tran for providing line-mixing parameters. CarbonTracker 2010 results were provided by NOAA ESRL, Boulder, Colorado, USA from the website at http://carbontracker.noaa.gov. TCCON data were obtained from the TCCON Data Archive, operated by the California Institute of Technology from the website at http://tccon.ipac.caltech.edu/. U.S. funding for TCCON comes from NASA's Terrestrial Ecology Program, grant NNX11AG01G, the Orbiting Carbon Observatory Program, the Atmospheric CO2 Observations from Space (ACOS) Program and the DOE/ARM Program. Some of the research described in this paper was performed at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. The Darwin TCCON site was built at Caltech with funding from the OCO project and is operated by the University of Wollongong, with travel funds for maintenance and equipment costs funded by the OCO-2 project. We acknowledge funding to support Darwin and Wollongong from the Australian Research Council, Projects LE0668470, DP0879468, DP110103118, and LP0562346. Lauder TCCON measurements are funded by New Zealand Foundation of Research Science and Technology contracts C01X0204 and CO1X0406. 
The Garmisch TCCON team acknowledges funding by ESA (GHG-CCI project via subcontract with University of Bremen) and by the EC within the INGOS project. We acknowledge financial support of the Bialystok and Orléans TCCON sites from the Senate of Bremen and EU projects IMECC and Geomon as well as maintenance and logistical work provided by AeroMeteo Service (Bialystok) and the RAMCES team at LSCE (Gif-sur-Yvette, France).