Possible representation errors in inversions of satellite CO2 retrievals


Abstract

[1] Owing to global spatial sampling and sheer data volume, satellite CO2 concentrations can be used in inverse models to enhance our understanding of the carbon cycle. Using column measurements to represent a transport model grid column may introduce spatial, local clear-sky, and temporal sampling errors into inversions: the footprint is smaller than a grid cell, total column concentrations are only retrieved in clear skies, and the mixing ratios are only sampled at one time. To investigate these errors, we used a coupled ecosystem-atmosphere cloud-resolving model to create CO2 fields over fine (∼1° × 1°) and coarse (∼4° × 4°) grid columns from 1 km2 and 25 km2 pixels that utilized explicit microphysics. We performed two simulations in August 2001: one in central North America and one in the Brazilian Amazon. Differences between satellite and grid column concentrations were calculated by subtracting the domain mean column concentration from 10-km-wide simulated satellite measurements. Spatial and local clear-sky errors were less than 0.5 ppm for the fine grid column; however, these errors became large and biased over the coarse grid column in North America. To avoid these errors, transport models should be run at high resolution. Using satellite measurements to represent bimonthly averages created large (>1 ppm) errors for all cases. The errors were negatively biased (approximately −0.4 ppm) in the North American simulation, indicating that inverse models cannot use satellite measurements to represent temporal averages. Simulated representation errors did not arise because of differences in ecosystem metabolism in cloudy versus sunny conditions; rather, they reflected large-scale CO2 gradients in midlatitudes that were organized along frontal boundaries and masked under regional cloud cover. Such boundaries were not found in the dry-season tropical simulation presented here and may be less prevalent in the tropics in general. To avoid incurring errors, inversions must accurately model synoptic-scale atmospheric transport, and CO2 concentrations must be assimilated at the time and place observed.

1. Introduction

[2] Variations of atmospheric CO2 concentrations contain information about the sources and sinks with which air interacts as it is transported from place to place. Using atmospheric tracer transport models, inverse modelers can quantitatively estimate the strengths and spatial distribution of sources and sinks around the world from concentration data [Gurney et al., 2002; Rödenbeck et al., 2003; Baker et al., 2006]. These flux estimates are still highly uncertain in many regions because of sparse data coverage [Gurney et al., 2003]. Satellite CO2 measurements have the potential to help inverse modeling studies by improving the data constraint because of their global spatial sampling and sheer data volume. Previous studies have indicated that using spatially resolved, global measurements of the column-integrated dry air mole fraction (XCO2) with precisions of ∼1 ppm will reduce the uncertainties in regional estimates of sources and sinks of atmospheric CO2 [Rayner and O'Brien, 2001; Miller et al., 2007; Chevallier et al., 2007].

[3] Two existing satellites, the Atmospheric Infrared Sounder (AIRS) and the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY), provide information about CO2 concentrations. AIRS, on the Aqua platform launched in May 2002, measures 2378 spectral channels in the infrared (IR) from 3.74 to 15.4 μm [Aumann et al., 2003]. AIRS has a 1330 LST equator crossing time, nine 1.1° by 0.6° footprints in a single FOV, and scans ±48.95° from nadir, making 90 measurements per scan. A study by Engelen et al. [2004] demonstrated the feasibility of global CO2 estimation using AIRS data in a numerical weather prediction data assimilation system. Since AIRS measures IR radiances rather than reflected sunlight, it can be used to measure upper tropospheric-weighted CO2 concentrations during the day and at night; however, atmospheric mixing makes the upper tropospheric CO2 concentrations rather zonal, indicating that AIRS data can inform only about very broad features of the surface fluxes [Chevallier et al., 2005]. SCIAMACHY, which was launched on board the European Space Agency (ESA) Envisat satellite in 2002, is a polar-orbiting, nadir-looking instrument that measures reflected sunlight in the UV, visible, and near IR regions from 240 to 2400 nm. SCIAMACHY has a 30 × 60 km2 footprint that scans across a 960-km-wide swath and a 35 d repeat cycle with global coverage in ∼6 d. Studies by Houweling et al. [2004] and Buchwitz et al. [2005] indicate that SCIAMACHY measurements may be capable of detecting regional CO2 surface source/sink regions; however, accurate SCIAMACHY CO2 retrievals are limited to land regions because of low surface reflectivity over the ocean and are difficult because of calibration issues and spectral and spatial resolution [Houweling et al., 2004; Buchwitz et al., 2005].

[4] Two satellites designed specifically to measure XCO2 with ∼0.3–0.5% (1–2 ppm) precision are scheduled to launch in late 2008: the Orbiting Carbon Observatory (OCO) [Crisp et al., 2004; Miller et al., 2007] and the Greenhouse gases Observing Satellite (GOSAT) [National Institute for Environmental Studies, 2006]. Both satellites will collect high-resolution spectra of reflected sunlight in the 0.76 μm O2 A-band and the CO2 bands at 1.61 μm and 2.06 μm. A single sounding will consist of simultaneous observations from all three bands. OCO and GOSAT will fly in a polar Sun-synchronous orbit to provide global coverage with an equator crossing time ∼1300 LST. OCO will orbit just ahead of the Earth Observing System (EOS) Aqua platform in the A-train, which has a 16-d repeat cycle. To obtain an adequate number of soundings on regional scales even in the presence of patchy clouds, OCO will have a 10-km-wide cross-track field of view (FOV) that is divided into eight 1.25-km-wide samples with a 2.25 km down-track resolution at nadir. GOSAT will orbit at an altitude of 666 km with a 3-d recurrence. GOSAT is designed with cross-track pointing ability and will sample points with a variable width from 88 to 800 km.

[5] CO2 concentration fields retrieved from satellites will be used as inputs to synthesis inversion and data assimilation models to help reduce uncertainties in flux estimates; however, to utilize these measurements, care must be taken to sample the models following the satellite sampling strategy as closely as possible. Spatial representativeness errors may be introduced into inversions that compare CO2 concentrations from a model grid column to satellite concentrations sampled over only a fraction of the domain. Local clear-sky errors may exist in inversions that compare concentrations in a grid column that may be partially cloudy to total-column CO2 concentrations sampled at the same time but only over clear areas. Temporal sampling errors can result from comparing satellite measurements to temporally averaged concentrations. Incorrectly accounting for these errors could lead to errors in the flux estimates, particularly if they are biased. Spatially coherent biases as small as 0.1 ppm will alter flux estimates and must be accounted for [Miller et al., 2007]. Chevallier et al. [2007] simulated the impact of undetected biases and showed that regional biases of only a few tenths of a ppm in column averaged CO2 can bias the inverted yearly subcontinental fluxes by a few tenths of a gigaton of carbon. To avoid incurring errors in inversions, the spatial, clear-sky, and temporal sampling errors need to be investigated and quantified.

[6] Spatial representation errors are determined by the spatial variability: as horizontal spatial heterogeneity increases, observations characterize smaller areas and representation errors increase [Gerbig et al., 2003; Wofsy and Harriss, 2002]. Gerbig et al. [2003] used aircraft data to investigate spatial representation errors of mixed layer averaged CO2 mixing ratios and concluded that spatial representation errors reach 1–2 ppm for a typical 200–400 km horizontal resolution grid cell. Expanding on Gerbig's analysis, Lin et al. [2004] found column CO2 spatial representation errors of ∼0.6–0.7 ppm over North America and ∼0.2–0.3 ppm over the Pacific Ocean. Consistent with the results from Lin et al. [2004], an analysis of regional XCO2 variability using coarsely modeled (5.5° × 5.5°) total column CO2 shows that the spatial variability is smaller over oceans than over land and reveals that the spatial variability varies seasonally as well as geographically, with higher variability during the northern hemisphere summer and lower variability in winter [Miller et al., 2007].

[7] Although studies have investigated the spatial variability and associated representation errors of total column CO2, little research has been focused on clear-sky and temporal representation errors. This study analyzes spatial, local clear-sky and temporal sampling errors using a cloud resolving, coupled ecosystem-atmosphere model, SiB2-RAMS. We performed simulations over a temperate forest region and a tropical region, and we investigated these errors for both fine (∼1° × 1°) and coarse (∼4° × 4°) grid columns by simulating CO2 concentrations over these regions using explicit microphysics and grid cell increments of 1 km and 5 km, respectively.

2. Methods

2.1. Model Description, SiB2-RAMS

[8] The Simple Biosphere Model (SiB2) calculates the transfer of energy, mass, and momentum between the atmosphere and the vegetated surface of the earth [Sellers et al., 1996a, 1996b]. The coupled meteorological model is the Brazilian version of the Colorado State Regional Atmospheric Modeling System (RAMS) [Freitas et al., 2006]. RAMS is a comprehensive mesoscale meteorological modeling system designed to simulate atmospheric circulations spanning in scale from hemispheric scales down to large eddy simulations of the planetary boundary layer [Pielke et al., 1992; Cotton et al., 2002]. Details of the coupled model are given by Denning et al. [2003], Nicholls et al. [2004], Wang et al. [2007], and L. Lu et al. (Simulating the two-way interactions between vegetation biophysical processes and mesoscale circulations during the 2001 Santarem Field Campaign, submitted to Journal of Geophysical Research, 2007, hereinafter referred to as Lu et al., submitted manuscript, 2007).

[9] This study focuses on two simulations, one in North America (NA) and one in South America (SA). Both simulations consist of four levels of nested grids down to a fine domain of 97 km by 97 km with a grid increment of 1 km (Figures 1 and 2). The NA simulation has 45 vertical levels extending up to 23.5 km, and the SA simulation has 32 vertical levels up to 24 km. To simulate cloud and precipitation processes explicitly, both simulations use the bulk microphysical parameterization in RAMS [Meyers et al., 1997; Walko et al., 1995]. We use the Mellor and Yamada [1982] scheme for vertical diffusion, the Smagorinsky [1963] scheme for horizontal diffusion, and the two-stream radiation scheme developed by Harrington [1997]. At the lateral boundaries we utilize the radiation condition discussed by Klemp and Wilhelmson [1978].

Figure 1.

Grid setup over North America, with the nested grids outlined in red. The vegetation classifications for the coarse grid column (grid 3) and the fine grid column (grid 4) are shown in the bottom left and right images, respectively. The red cross indicates the location of the WLEF tower.

Figure 2.

Grid setup in the South American simulation. The four grids in the simulation are outlined in red. The red cross displays the Tapajos Km 67 tower.

2.2. Input Data

[10] The vegetation cover is derived from the 1-km AVHRR land cover classification data [Hansen et al., 2000], and this study used 1-km resolution Normalized Difference Vegetation Index (NDVI) data from SPOT-4 (Systeme Probatoire d'Observation de la Terre polar orbiting satellite; United States Department of Agriculture/Foreign Agricultural Service and Global Inventory Modeling and Mapping Studies). In NA, the meteorological fields are initialized with, and the lateral boundaries are nudged every 3 h toward, the National Centers for Environmental Prediction (NCEP) mesoscale Eta–212 grid reanalysis with 40-km horizontal resolution (AWIPS 40-km). The SA simulation is initialized and driven by 6-hourly lateral boundary conditions derived from Centro de Previsao de Tempo e Estudos Climaticos (CPTEC) analysis products.

[11] Surface carbon fluxes due to fossil fuel combustion, cement production, and gas flaring are prescribed from the 1995 CO2 emission estimates of Andres et al. [1996], with a 1.112 scaling factor applied to adjust the strength for August 2001 [Marland et al., 2005; Wang et al., 2007]. The air-sea CO2 fluxes are the monthly 1995 estimates from Takahashi et al. [2002]. The initial CO2 field and the lateral boundaries in SiB2-RAMS are set to 370 ppm for NA and 360 ppm for SA. A more detailed description of the input data and initialization can be found in Wang et al. [2007] and Lu et al. (submitted manuscript, 2007) for NA and SA, respectively.

2.3. Case Descriptions

[12] The NA simulation was centered on the WLEF tower in Wisconsin (Figure 1) (see Davis et al. [2003], Bakwin et al. [1998] and Ricciuto et al. [2007] for a description of the site and measurements). We analyzed the third grid (Figure 1, bottom left), which will be referred to as the coarse grid column, and the fourth grid (Figure 1, bottom right), which we denote as the fine grid column. The coarse grid column was 450 km by 450 km with a 5 km grid increment. The northeastern portion of the domain included Lake Superior, the upper and middle regions were dominated by mixed forest, and the southern third contained significant areas of agriculture and cropland. The fine grid column, which was 97 km × 97 km with a 1 km grid increment, was primarily mixed forest and broadleaf deciduous trees with a few patches of evergreens. This grid had several small lakes, with one of the larger lakes just north of the WLEF tower.

[13] This case ran from 0000 UT 11 August to 0000 UT 21 August 2001. During this 10-d time period, three cold fronts passed over the WLEF tower. The first simulated front passed at 0200 local standard time (LST) on 12 August, the second front passed at 2300 LST the night of 15 August, and the third front passed over the tower at 1800 LST on 17 August. During the simulation, the wind was light and southwesterly except during the frontal passages, when the wind strengthened and rotated clockwise to northerly flow. For a more complete description of this case and the meteorological conditions, see Wang et al. [2007].

[14] This 10-d time period was chosen to capture the front on 15–16 August, which caused the most significant CO2 concentration variation seen at the WLEF tower that summer. Investigating the representation errors over a time period when the concentration at 396 m varied by more than 40 ppm in 36 h provides an estimate of the errors during a significant synoptic event. Since the simulation is characterized by considerable CO2 variability, the error estimates from this case are likely to be the maximum errors associated with this site.

[15] The simulation in SA was centered over the Tapajos River in Brazil (Figure 2), and ran from 0000 UT 1 August to 0000 UT 16 August 2001. Similar to the NA case, we analyzed the third (Figure 2, bottom left) and fourth (Figure 2, bottom right) grids, denoted as the coarse and fine grid columns, respectively. The coarse column was 335 km by 335 km with a 5-km grid increment. The Tapajos River flowed northward through the center of the domain, and the region was covered primarily by broadleaf evergreen forest and short vegetation, which consisted of pasture and mixed farming. The fine domain was 97 × 97 km, with a 1-km grid increment. The dominant land type for this region was pasture and mixed farming, inland water comprised ∼30% of the domain, and the remaining vegetation was broadleaf evergreen forest. On the east side of the Tapajos River, the Km-67 eddy covariance flux tower measured heat, moisture and trace gas fluxes, CO2 concentrations, and radiation profiles [Saleska et al., 2003]. This case occurred during the dry season and was characterized by calm conditions without fronts or squall lines. During the simulation, intense trade winds blew almost constantly, little precipitation fell over most of the domain, and the clouds were predominantly cumulus. Lu et al. [2005, also submitted manuscript, 2007] provide a further discussion of this simulation.

[16] The unique physical setting of the SA case with respect to the topography and the Tapajos River produces a unique mesoscale and micrometeorological environment [Lu et al., 2005]. This time period was chosen to avoid squall lines and organized weather patterns, highlighting CO2 variability due to the heterogeneous river and vegetation cover and mesoscale circulations. Analyzing this simulation will provide estimates of the representation errors expected from water-vegetation interactions including a low-level convergence line. The error estimates from this simulation represent estimates from local circulation patterns rather than from large-scale features, and these errors provide the expected maximum error of CO2 due to surface heterogeneity.

2.4. Model Evaluation

[17] The two simulations analyzed in this study are evaluated against observations in complementary publications. Wang et al. [2007] focused on the 15 August frontal passage in the North American simulation to analyze the impact of fronts and synoptic events on the CO2 concentration. A high CO2 air mass built up in the southern Great Plains on 14–15 August because of the slow photosynthesis rate caused by hot and dry air over Oklahoma and Texas and strong nighttime respiration in the southeast. This air mass traveled north and was primarily responsible for the high concentrations just prior to the front on 15 August, although weak local photosynthesis on 15 August and strong nighttime respiration under overcast sky conditions also contributed to the accumulation of CO2. Wang et al. [2007] concluded that the atmospheric CO2 variations during this time period were dominated by coherent regional anomalies that were advected by synoptic-scale systems. In the study, Wang et al. [2007] compared the near-surface meteorological fields between observations and SiB2-RAMS for the period 11 August 2001 through 20 August 2001, including evaluations of temperature, water vapor mixing ratio, wind speed, wind direction, net ecosystem exchange (NEE), and CO2 concentration anomalies.

[18] Lu et al. (submitted manuscript, 2007) analyzed the SA simulation depicted here to investigate mesoscale circulations and atmospheric CO2 variations over a heterogeneous landscape during the Santarem Mesoscale Campaign (SMC) of August 2001. They evaluated the modeled CO2 concentrations and fluxes, sensible and latent heat fluxes, temperature, and winds compared to observations, showing that the model captured the temperatures, winds, NEE, and daytime CO2 concentrations reasonably well. Lu et al. (submitted manuscript, 2007) found that the topography, the differences in roughness length between water and land, the juxtaposition of the Amazon and Tapajos Rivers, and the resulting horizontal and vertical wind shears all facilitated the generation of local mesoscale circulations and a low-level convergence line.

[19] To evaluate the effect of clouds on the carbon flux and CO2 concentration, we compared modeled NEE and CO2 to the observations sampled at the towers located in the domains (see section 2.3 for the tower descriptions). For the NA case, we sampled the model at the WLEF tower location and compared hourly net radiation, CO2 concentrations at 396 m, and NEE over the 10-d simulation to the corresponding hourly observations at the WLEF tower. We performed a similar comparison for the SA simulation: we sampled the model at the location of the Km-67 tower and compared the modeled shortwave radiation, the CO2 concentration sampled at 60 m, and NEE over the simulation to the corresponding hourly data sampled at the flux tower.

[20] To investigate the response of the carbon flux to various cloud conditions, we compared the modeled and observed NEE to incoming radiation and overlaid a 2-harmonic function fit to both the model output and the tower observations (Figure 3). At both locations for conditions with radiation values higher than 650 W m−2, which corresponds primarily to clear or mostly clear conditions, the fits to both the model and the in situ observations have a constant uptake of ∼10 μmol m−2 s−1 and ∼13.5 μmol m−2 s−1 for NA and SA, respectively. As the radiation decreases from 650 W m−2 the carbon uptake also decreases; however, the observed decrease occurs at higher radiation values. Simulated uptake remains relatively constant until the radiation decreases to ∼400 W m−2, while the observed uptake has a higher light saturation and thus begins decreasing at higher radiation values. SiB2.5 calculates photosynthesis for a single sun leaf and uses an empirical adjustment to extinction law in conjunction with satellite information to adjust carbon flux up to canopy scale [Baker et al., 2005]. Using this technique is known to result in model photosynthesis reaching light saturation too soon, resulting in enhanced uptake for moderate radiation values [e.g., Dai et al., 2003; Dickinson et al., 1998]. The enhanced uptake in the model could decrease the surface CO2 concentrations during moderately cloudy to overcast conditions and just after sunrise and before sunset.

Figure 3.

Observed (solid) and modeled (shaded, sampled from the grid cells including the towers) NEE, in μmol m−2 s−1, versus radiation, in W m−2. (top) Evaluation at NA and (bottom) results from SA. For NA, the radiation includes longwave, and 200 W m−2 has been subtracted from the values for easier comparison. The SA radiation is shortwave only. A two-harmonic fit to each time series has been overlaid. Mean NA and SA model/data NEE values for radiation >650 W m−2 are −9.7/−10.1 and −13.8/−13.1 μmol m−2 s−1, respectively. For moderate radiation values between 300 and 650 W m−2 the resulting NA and SA model/data NEE means are −8.5/−6.7 and −14.9/−7.3 μmol m−2 s−1, respectively. Finally, for radiation <300 W m−2 the NA and SA model/data NEE mean values are 0.2/2.8 and −4.2/2.3 μmol m−2 s−1, respectively.

[21] To investigate the relationship between cloud cover and CO2 concentrations, we compared modeled and observed CO2 concentrations to the corresponding radiation (Figure 4). Since the CO2 concentration in the model has a prescribed background, we compared the concentration anomalies, which are calculated by subtracting the mean of the CO2 concentrations during the simulated time period from the data sets. In both NA and SA, the variability of the CO2 concentration increases with decreasing radiation, and this characteristic is seen in both the model and the observations. For clear-sky conditions with radiation values above 650 W m−2, the concentrations are lower than the mean. Over NA, the concentrations are highest for moderate radiation (between 650 and 300 W m−2), while over SA the concentrations increase as radiation decreases. Despite the model having enhanced uptake for moderate to low radiation, the mean values for these radiation bins remain within ∼1 ppm. The relatively small differences between the modeled and observed concentrations indicate that the model does a reasonable job of capturing the overall behavior of the CO2 concentration in various sky conditions.
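The anomaly and binning procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the hourly values are hypothetical, and the treatment of values falling exactly on the 300 and 650 W m−2 limits (assigned here to the moderate bin) is our choice, since the text does not specify it.

```python
from statistics import mean

def radiation_bin(radiation_w_m2):
    """Assign a radiation value (W m-2) to one of the three bins used in the text."""
    if radiation_w_m2 > 650.0:
        return "clear"      # > 650 W m-2
    if radiation_w_m2 >= 300.0:
        return "moderate"   # 300 to 650 W m-2 (boundary handling is an assumption)
    return "low"            # < 300 W m-2

def binned_anomalies(co2_ppm, radiation_w_m2):
    """Mean CO2 anomaly (ppm) per radiation bin.

    Anomalies are the series minus its own mean over the whole period.
    Empty bins are reported as None.
    """
    series_mean = mean(co2_ppm)
    groups = {"clear": [], "moderate": [], "low": []}
    for c, r in zip(co2_ppm, radiation_w_m2):
        groups[radiation_bin(r)].append(c - series_mean)
    return {name: (mean(vals) if vals else None) for name, vals in groups.items()}

# Hypothetical hourly values, for illustration only (not model output or tower data).
co2 = [368.0, 369.0, 374.0, 373.0, 371.0, 371.0]
rad = [700.0, 800.0, 500.0, 400.0, 100.0, 200.0]
result = binned_anomalies(co2, rad)
```

Applying the same function to the modeled and to the observed series gives the model/data pairs that are compared bin by bin.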

Figure 4.

Observed (solid) and modeled (shaded, sampled from the grid cells including the towers) CO2 anomalies (ppm) versus radiation (W m−2). (top) NA results and (bottom) SA results. The anomalies are calculated by subtracting the mean CO2 concentration over each case from the corresponding series. Mean NA and SA model/data CO2 values for radiation >650 W m−2 are −1.6/−1.1 and −4.3/−3.1 ppm, respectively. For moderate radiation values between 300 and 650 W m−2 the resulting NA and SA model/data CO2 mean anomalies are 3.5/1.6 and −2.9/−4.0 ppm, respectively. Finally, for radiation <300 W m−2 the NA and SA model/data CO2 mean values are −0.4/−0.2 and 0.7/1.9 ppm, respectively.

2.5. Simulating Satellite Measurements Using SiB2-RAMS Output

[22] To simulate satellite CO2 retrievals over the two simulations, we mimic the OCO sampling strategy. Since OCO will estimate total column CO2 concentrations, the modeled concentrations are vertically integrated by pressure weighting using a standard atmosphere. All simulated tracks are sampled at 1300 LST to approximate the satellite overpass time. Since we are investigating small domains that satellites will fly over very quickly, we assume that OCO travels due south and that all the footprints in a track will be averaged together to yield only one concentration for the grid. We created a track width of 10 km by averaging the appropriate number of pixels in the x direction, which corresponds to 10 pixels for the fine domain and 2 pixels for the coarse domain. To create one satellite value for each possible track, we meridionally averaged the pixels to create a single measurement. Using these criteria, the fine domains have 88 different possible satellite tracks: the first track is on the western edge of the domain (x = 1:10), the second track is one pixel eastward (x = 2:11), and the final track is along the eastern edge (x = 88:97). The coarse domain in NA has 87 different tracks, and the SA coarse domain has 65 possible satellite tracks.

[23] Since the satellite retrieval requires clear conditions, only pixels with clear-sky are included in the simulated satellite concentrations (unless otherwise specified). A pixel is considered clear if the cloud optical depth τ < 0.2. This threshold was selected as it is the approximate threshold for which precise XCO2 retrievals are possible [Miller et al., 2007; Crisp et al., 2004]. In the NA simulation, 2 d are primarily clear, 5 d are partly cloudy, and 2 d are overcast. Over SA, 6 d are completely clear and 9 d are partly cloudy.
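A minimal sketch of the clear-sky screening follows. The function name and the sample values are hypothetical; only the τ < 0.2 threshold comes from the text. Returning None for a track without any clear pixel is our assumption about how a missing retrieval would be represented.

```python
CLEAR_SKY_TAU = 0.2  # cloud optical depth threshold quoted in the text

def clear_sky_track_mean(co2_track, tau_track, tau_max=CLEAR_SKY_TAU):
    """Mean column CO2 (ppm) over the clear pixels of one track.

    A pixel counts as clear when its cloud optical depth is strictly below
    tau_max. Returns None when no pixel in the track is clear.
    """
    clear = [c for c, tau in zip(co2_track, tau_track) if tau < tau_max]
    if not clear:
        return None
    return sum(clear) / len(clear)

# Hypothetical track: the last two pixels fail the tau < 0.2 test.
partly_cloudy = clear_sky_track_mean([370.0, 372.0, 374.0, 376.0],
                                     [0.0, 0.1, 0.2, 5.0])
overcast = clear_sky_track_mean([370.0, 372.0], [3.0, 8.0])
```

In this toy case the cloudy pixels carry the higher concentrations, so the screened mean is lower than the all-sky mean of the same track.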

3. Results

3.1. Total Column CO2 Concentrations

[24] In the NA simulation, the main driver of total column CO2 temporal variability is synoptic-scale systems (Figure 5). A Fourier analysis of the CO2 concentrations reveals a significant spectral peak at ∼3.5 d (at the 95% confidence level using an F test), which indicates that the fronts in the simulation set the dominant timescale of variability. The diurnal cycle also has a significant spectral peak, although it is much smaller. Rather than displaying a strong diurnal cycle, the simulated total column concentration sampled from the grid cell that includes the WLEF tower has three spikes associated with the three frontal passages. The column CO2 range is ∼6 ppm and the standard deviation is ∼1 ppm. The fronts, which are associated with clouds, advect high concentrations from the southwest, where a heat wave reduces carbon uptake, causing high CO2 anomalies [Wang et al., 2007]. The lowest concentrations during the simulation occur in clear conditions, when the main influence on CO2 is the local vegetation rather than advection.
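To illustrate how a spectral peak of this kind is identified, the sketch below computes a plain periodogram of a synthetic hourly series. Everything in it is an assumption for illustration: the series is not model output, the synoptic period is set to 10/3 d so that it falls exactly on a harmonic of a 10-d record, and the F test for significance used in the study is omitted.

```python
import cmath
import math

def periodogram(series, dt_days):
    """Return (period_days, power) for each positive harmonic of a de-meaned series."""
    n = len(series)
    avg = sum(series) / n
    anomalies = [v - avg for v in series]
    out = []
    for k in range(1, n // 2 + 1):
        # Direct discrete Fourier transform at harmonic k.
        coeff = sum(a * cmath.exp(-2j * math.pi * k * t / n)
                    for t, a in enumerate(anomalies))
        out.append((n * dt_days / k, abs(coeff) ** 2 / n))
    return out

# Synthetic 10-d hourly series: a 1.0 ppm synoptic wave (3 cycles in 10 d)
# plus a weaker 0.3 ppm diurnal wave (10 cycles in 10 d).
n_hours = 240
series = [370.0
          + 1.0 * math.sin(2.0 * math.pi * 3 * t / n_hours)
          + 0.3 * math.sin(2.0 * math.pi * 10 * t / n_hours)
          for t in range(n_hours)]

spectrum = periodogram(series, dt_days=1.0 / 24.0)
peak_period, peak_power = max(spectrum, key=lambda item: item[1])
diurnal_power = spectrum[9][1]  # harmonic k = 10, period of 1 d
```

In this synthetic case the largest power sits at the synoptic harmonic, with a smaller secondary peak at 1 d, which is the qualitative pattern described for the NA simulation.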

Figure 5.

Simulated total column CO2 concentrations at the WLEF tower (solid line) and the sky conditions (shaded line), where 0 indicates clear sky and 1 indicates the tower was cloud covered. The vertical dashed lines indicate the three frontal passages.

[25] The NA total column CO2 spatial variability is also predominantly affected by the weather via the frontal passages. The range of column CO2 at 1300 LST over the fine grid column varies from 0.2 to 1.8 ppm, with an average of 0.8 ppm (Table 1). Over the coarse grid column, the CO2 range at 1300 LST varies from 1 ppm to 13.7 ppm, with an average range of 3.5 ppm across the domain and a mean standard deviation of 0.6 ppm. Although the surface heterogeneity of the coarse domain contributes to increased CO2 variability, the greatest concentration ranges occur when the southwestern portion of the domain has high concentrations from advection while the northeastern half of the domain has low concentrations. Optically thick clouds that are associated with the fronts contribute to higher concentrations by reducing photosynthesis due to light limitation.
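The range and standard deviation statistics quoted here (and collected in Table 1) can be computed as in the following sketch. The function names and the two toy fields are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import pstdev

def daily_range_and_sigma(column_field):
    """Spatial range and standard deviation (ppm) of one 1300 LST column field."""
    values = [v for row in column_field for v in row]
    return max(values) - min(values), pstdev(values)

def summarize(daily_fields):
    """Mean and maximum of the daily spatial range and sigma over all days."""
    stats = [daily_range_and_sigma(f) for f in daily_fields]
    ranges = [r for r, _ in stats]
    sigmas = [s for _, s in stats]
    return {
        "range_mean": sum(ranges) / len(ranges),
        "range_max": max(ranges),
        "sigma_mean": sum(sigmas) / len(sigmas),
        "sigma_max": max(sigmas),
    }

# Two hypothetical 2 x 2 "days", for illustration only.
day1 = [[370.0, 371.0], [370.0, 371.0]]
day2 = [[370.0, 373.0], [370.0, 373.0]]
summary = summarize([day1, day2])
```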

Table 1. Range and Standard Deviation (σ) of the Simulated Grid Columns at 1300 LST (a)

             Range            σ
             Mean    Max      Mean    Max
NA Fine      0.76    1.81     0.15    0.4
NA Coarse    3.53    13.71    0.64    1.9
SA Fine      1.46    2.1      0.4     0.53
SA Coarse    2.15    2.91     0.44    0.58

(a) Both the mean values over the entire simulation and the maximum values are displayed. Unit is ppm.

[26] Ground-based measurements of total column CO2 are being made at the WLEF tall tower site [Washenfelder et al., 2006]. The observatory uses a technique similar to that of OCO, GOSAT, and SCIAMACHY, measuring XCO2 with an upward-looking Fourier Transform Spectrometer (FTS). The observatory has been measuring XCO2 since May 2004. At WLEF, XCO2 is minimally influenced by the diurnal rectifier effect. Washenfelder et al. [2006] present results from a validation study involving aircraft data where column observations were measured on five dates in July and August of 2004. The column average concentration varies ∼7 ppm between these samples, which is similar in magnitude to the column variations seen in the SiB2-RAMS simulations due to the frontal passages. A plot of the seasonal cycle of daytime daily averaged XCO2 shows day-to-day variability of ∼6–7 ppm during the summer [Washenfelder et al., 2006].

[27] The dominant causes of column CO2 temporal variability in SA are the diurnal cycle and mesoscale circulations (Figure 6), since this simulation occurs in the dry season and is characterized by steady trade winds, nocturnal decoupling, river breezes, boundary layer cumulus clouds, and no air masses or fronts. A power spectrum of this series shows the only significant spectral peak is at 1 d. The temporal CO2 variability in SA is smaller than in NA, as the range and standard deviation of the simulated column concentrations sampled at the Tapajos tower are only 3.1 ppm and 0.7 ppm, respectively. The amplitude of the mean diurnal cycle is 1.1 ppm. Unlike in NA, there is no correlation between cloud cover and mixing ratios. Since this simulation was selected to isolate the influence of local vegetation and circulations, the clouds are midafternoon cumulus clouds primarily seen on the east bank of the Tapajos River due to the low-level convergence line [Lu et al., 2005, also submitted manuscript, 2007].
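A mean diurnal cycle and its amplitude can be obtained by compositing an hourly series by hour of day, as sketched below. The function names and the two-day toy series are hypothetical, and defining the amplitude as the maximum minus the minimum of the composite is an assumption; the text does not state which definition was used.

```python
def mean_diurnal_cycle(hourly_series):
    """Composite an hourly series by hour of day.

    The series is assumed to start at hour 0; any incomplete final day is ignored.
    """
    n_days = len(hourly_series) // 24
    return [
        sum(hourly_series[d * 24 + h] for d in range(n_days)) / n_days
        for h in range(24)
    ]

def diurnal_amplitude(hourly_series):
    """Peak-to-trough amplitude (ppm) of the mean diurnal cycle."""
    cycle = mean_diurnal_cycle(hourly_series)
    return max(cycle) - min(cycle)

# Hypothetical two-day series: a 1 ppm step within each day on top of a
# 2 ppm day-to-day offset, which compositing averages out.
day_a = [360.0 + (1.0 if h < 12 else 0.0) for h in range(24)]
day_b = [362.0 + (1.0 if h < 12 else 0.0) for h in range(24)]
amplitude = diurnal_amplitude(day_a + day_b)
```

The toy series has a total range of 3 ppm but a mean diurnal amplitude of only 1 ppm, which shows why the two statistics reported for SA differ.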

Figure 6.

Simulated total column CO2 concentrations at the Tapajos tower (solid line) and the modeled sky conditions (shaded line), where 0 indicates clear sky and 1 indicates the tower was cloud covered.

[28] Since the SA case has significant surface heterogeneity due to the rivers, the spatial variability in this simulation is larger for the fine grid column compared to the NA simulation; however, the spatial variability over the coarse domain is smaller, which is due to the lack of synoptic-scale features which advected high CO2 in NA. The average total column spatial range at 1300 LST is 1.46 ppm and 2.15 ppm for the fine and coarse grid columns, respectively (Table 1). The CO2 spatial pattern at 1300 LST was similar for all days, with a low concentration on the eastern half of the domain and higher concentrations in the northwest corner, which is primarily due to the topography and surface cover [Lu et al., 2005].

[29] The total column concentrations in SiB2-RAMS are consistent with results presented by Olsen and Randerson [2004]. Using the Model of Atmospheric Transport and Chemistry (MATCH) three-dimensional atmospheric transport model, Olsen and Randerson [2004] investigated the total column CO2 concentrations. They found that at WLEF the greatest variability of column CO2 was linked to synoptic events on the order of 2 to 6 d. In order to influence the column, CO2 flux anomalies had to accumulate in the lower troposphere over a period of several days or there had to be a large-scale replacement of air in the column. Day-to-day variations of up to ∼8 ppm can be seen at the WLEF tower during the summer because of synoptic events. Similar to SiB2-RAMS, results from Olsen and Randerson [2004] show the main driver of column CO2 variability over WLEF during the summer is synoptic-scale systems, as midlatitude air masses with distinct CO2 concentrations develop in response to surface fluxes and are separated by fronts [Parazoo, 2007].

[30] In the Amazon, modeled vertical CO2 profiles were qualitatively similar to the observed profiles near the surface, but did not exhibit the same degree of variability [Olsen and Randerson, 2004]. The amplitude of the average diurnal cycle within the Amazon basin was 0.9 ppm in July, which is slightly weaker than the diurnal cycle from SiB2-RAMS; however, comparison of MATCH column CO2 to column CO2 profiles from aircraft data revealed that MATCH tended to have lower diurnal variability than observed. In the tropics, the dominant cause of CO2 variability is the diurnal cycle because of the productive ecosystems and the lack of synoptic-scale features.

3.2. Spatial Representativeness Errors

[31] Since satellite track widths are not the same size as an inverse model grid column, using satellite concentrations to optimize a grid column may introduce spatial representativeness errors into the inversion. In this study, the sizes of the coarse and fine domains in both NA and SA correspond roughly to global model grid sizes. We calculated the spatial errors that inversions would incur from using satellite measurements to represent grid columns in central NA and in the Amazon by subtracting the domain-averaged 1300 LST total column concentrations from the simulated satellite concentrations, which use only clear-sky pixels. The daily results are compiled into a single sampling distribution for each domain and location (Figure 7). The mean and standard deviation of the sampling distributions for the fine and coarse domains for both NA and SA are provided in Table 2.
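The subtraction described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the 2-D field, the cloud mask, the representation of a track as a set of pixel columns, and the names `track_mean`, `domain_mean`, and `spatial_error` are all hypothetical and are not taken from the study's code.

```python
from statistics import mean


def track_mean(column_co2, cloud_mask, track_cols, clear_only=True):
    """Mean total column CO2 (ppm) over the pixels of one simulated track.

    column_co2 : 2-D list [row][col] of total column CO2 at one time (ppm)
    cloud_mask : 2-D list [row][col], True where the pixel is cloudy
    track_cols : column indices that make up the track (assumed geometry)
    Returns None when no pixel qualifies, e.g. a fully clouded track.
    """
    values = [
        column_co2[r][c]
        for r in range(len(column_co2))
        for c in track_cols
        if not (clear_only and cloud_mask[r][c])
    ]
    return mean(values) if values else None


def domain_mean(column_co2):
    """Mean over every pixel in the grid column, cloudy or clear."""
    return mean(v for row in column_co2 for v in row)


def spatial_error(column_co2, cloud_mask, track_cols):
    """Clear-sky track mean minus the domain mean (ppm)."""
    sat = track_mean(column_co2, cloud_mask, track_cols, clear_only=True)
    return None if sat is None else sat - domain_mean(column_co2)


# Hypothetical 2 x 4 pixel domain; the highest value sits under cloud.
co2 = [[370.0, 371.0, 372.0, 373.0],
       [370.0, 371.0, 372.0, 377.0]]
cloudy = [[False, False, False, False],
          [False, False, False, True]]

err_west = spatial_error(co2, cloudy, [0])  # track over the low-CO2 edge
err_east = spatial_error(co2, cloudy, [3])  # track that loses its cloudy pixel
```

In this toy domain the mean is 372 ppm, so the western track gives an error of −2 ppm and the eastern track gives +1 ppm. Collecting such differences over every track and every day would yield a sampling distribution of the kind compiled for each domain.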

Figure 7.

Sampling distributions of the spatial representativeness errors in NA (solid) and SA (shaded) at 1300 LST compiled from all days of the simulations. The x axis is the difference between the simulated satellite concentration and the domain mean concentration, and the y axis is the number of satellite tracks that correspond to each difference. Negative values indicate an underestimation by the simulated satellite measurements and positive values indicate an overestimation. (top) Results from the fine grid columns and (bottom) distribution of the errors from the coarse grid columns.

Table 2. Mean (μ) and Standard Deviation (σ) of the Sampling Distributions of the Spatial Representation Errors, the Local Clear-Sky Errors, the Diurnal Sampling Errors, and the Temporal Sampling Errors for All Four Cases^a

             Spatial          Local Clear-Sky   Diurnal          Temporal
             μ       σ        μ       σ         μ       σ        μ       σ
NA fine      −0.01   0.06     −0.02   0.06      −0.19   0.33     −0.44   0.31
NA coarse    −0.13   0.43     −0.12   0.51      −0.25   0.51     −0.42   0.5
SA fine      −0.04   0.21     −0.04   0.18      0.1     0.26     0.06    0.66
SA coarse    −0.04   0.24     −0.03   0.19      0       0.25     −0.01   0.64

^a Unit is ppm.

[32] The spatial errors for both fine grid columns are unbiased, as the means of the distributions are close to 0. Over NA, all of the errors are within 0.3 ppm; however, over SA only 13% of the simulated satellite concentrations were within 0.3 ppm of the mean. The standard deviation for SA is 0.2 ppm and the maximum error is −0.72 ppm. 97% of the simulated SA tracks are within 0.5 ppm, which is only half of the expected spectroscopic retrieval error [Miller et al., 2007]. The errors are larger over SA than over NA because of the surface heterogeneity in the SA domain and because of cloud masking in NA: the greatest NA variability occurred when there were clouds and hence no satellite retrievals. The relatively small magnitude of the errors is due to the limited total column CO2 variability in the domains.

[33] The errors that would be introduced into inversions that use satellite measurements to represent coarse grid columns are larger than the errors for a fine grid column, which is not surprising since the total column CO2 is more variable. The spatial errors over SA remain unbiased and have a standard deviation similar to that of the fine domain. 95% of the satellite tracks capture the domain mean within 0.5 ppm. The errors for the NA coarse domain are much larger and negatively biased, with a mean of −0.13 ppm and a standard deviation of 0.43 ppm. Although nearly 25% of the tracks are within 0.1 ppm of the mean, 18% of the tracks have errors larger than 0.5 ppm and 6% of the tracks have errors larger than 1 ppm, which is larger than the expected retrieval error. The large and negatively biased spatial errors arise from the large CO2 gradients produced by the frontal passages and from the cloud masking of the higher concentrations associated with the fronts.
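The summary statistics quoted in this section (mean, standard deviation, and the share of tracks within a given error bound) can be computed from a list of per-track errors as in the sketch below. The error values, the thresholds argument, the use of the population standard deviation, and the name `summarize_errors` are assumptions made for illustration; the study does not state which estimator was used.

```python
from statistics import mean, pstdev


def summarize_errors(errors, thresholds=(0.3, 0.5, 1.0)):
    """Summarize a sampling distribution of representation errors (ppm).

    Returns (mu, sigma, within), where within[t] is the fraction of
    tracks whose absolute error is at most t ppm.
    """
    mu = mean(errors)
    sigma = pstdev(errors)  # population standard deviation (assumed choice)
    within = {
        t: sum(abs(e) <= t for e in errors) / len(errors)
        for t in thresholds
    }
    return mu, sigma, within


# Hypothetical per-track errors: mostly small, with one large negative outlier.
errors = [-1.2, -0.4, -0.1, 0.0, 0.1, 0.2, 0.0, -0.2]
mu, sigma, within = summarize_errors(errors)
```

Here a single large negative error is enough to pull the mean to about −0.2 ppm even though three quarters of the tracks lie within 0.3 ppm, which shows how a small subset of tracks can bias a distribution.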

3.3. Local Clear-Sky Errors

[34] We define local clear-sky errors as errors that are introduced into inversions that use clear-sky satellite concentrations to represent a transport model grid column that includes clouds. These errors are calculated by subtracting the simulated satellite concentrations at 1300 LST using all pixels from the simulated satellite concentrations using only clear-sky pixels. The resulting errors are smaller than the retrieval error and are unbiased for the fine domains (Figure 8). All the NA satellite tracks over the fine grid column that only use clear-sky footprints capture the true mean track value within 0.3 ppm. For the SA case, the standard deviation is larger and 87% of the tracks are within 0.3 ppm of the true mean. The largest error is 0.7 ppm. The SA coarse domain errors are very similar to the errors in the fine domain, with 85% of the errors less than 0.3 ppm. The similarity between the fine and coarse sampling distributions indicates that differences in carbon uptake due to local cloud cover have a minimal impact on the concentration at a single snapshot in time. The local clear-sky errors over the NA coarse grid column are negatively biased with a sampling distribution mean of −0.12 ppm. The negative bias is due to a few tracks that have large negative errors. Although 80% of the simulated satellite concentrations using only clear footprints have errors less than 0.3 ppm, 3% of the tracks have errors greater than 1 ppm, with errors as large as 4 ppm. Similar to the spatial errors, large and negatively biased local clear-sky errors are due to cloud masking of high frontal CO2.
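The definition at the start of this paragraph amounts to a difference of two means over the same track. The Python sketch below shows that calculation for a single hypothetical track; the pixel values, the cloud flags, and the name `clear_sky_error` are assumptions for illustration only.

```python
from statistics import mean


def clear_sky_error(track_co2, track_cloudy):
    """Clear-sky track mean minus all-pixel track mean (ppm).

    track_co2    : total column CO2 for each pixel in the track (ppm)
    track_cloudy : True where the corresponding pixel is cloudy
    Returns None when every pixel is cloudy (no retrieval possible).
    """
    clear = [v for v, is_cloudy in zip(track_co2, track_cloudy) if not is_cloudy]
    if not clear:
        return None
    return mean(clear) - mean(track_co2)


# Hypothetical track: the high values associated with a front lie under cloud.
track_co2 = [372.0, 372.5, 376.0, 377.5]
track_cloudy = [False, False, True, True]
error = clear_sky_error(track_co2, track_cloudy)
```

In this example the clear pixels average 372.25 ppm while the whole track averages 374.5 ppm, so the error is −2.25 ppm. The sign matches the mechanism described above: when clouds hide the highest concentrations, the clear-sky sample underestimates the track mean.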

Figure 8.

Local clear-sky total column CO2 errors for NA (solid) and SA (shaded), which are the differences between the simulated satellite concentrations at 1300 LST using only clear-sky pixels and the simulated satellite concentrations at the same time using all the pixels in the satellite track. (top) Errors from the fine grid and (bottom) results from the coarse grid.

[35] To further examine the clear-sky errors, we analyzed local clear minus all-sky differences in net ecosystem exchange (NEE), which were calculated in a similar manner by subtracting the mean NEE value in a satellite track containing all pixels from the corresponding satellite track NEE mean utilizing only clear-sky pixels. The resulting errors are very small (<1 μmol m−2 s−1). For the fine domains, the errors are shifted toward enhanced uptake in clear conditions due to reduced photosynthesis under clouds; however, the errors in the coarse domains are symmetrical about 0. Since the clear-sky NEE errors are small, their effect on the column CO2 concentration is minimal, indicating that the main driver of the large errors seen in the clear-sky CO2 is the organization of regional CO2 gradients along frontal boundaries, which are masked by large-scale cloud systems and not observed by satellites.

3.4. Temporal Sampling Errors

[36] Temporal sampling errors can occur in inversions that use satellite concentrations to optimize temporally averaged concentrations in the model. We calculated temporal errors that arise from using satellite measurements to represent diurnal averages and bimonthly averages.

3.4.1. Diurnal Sampling Errors

[37] To calculate the diurnal errors, we subtracted the domain average diurnal mean (0000 UT to 0000 UT) from the simulated 1300 LST satellite tracks (Figure 9). All the standard deviations for the diurnal errors are larger than the standard deviations seen for both spatial and local clear-sky errors. Over the SA fine grid column, the mean of the sampling distribution is positively biased by a tenth of a ppm, and the entire distribution is positively shifted, indicating that on a fine domain satellite concentrations at 1300 LST are slightly higher than the diurnal domain mean. 94% of the simulated satellite tracks have errors less than 0.5 ppm, and all the tracks have representation errors less than the expected retrieval error. For the SA coarse grid column, the diurnal errors are unbiased, the sampling distribution is symmetric about 0, and 95% of the errors are less than 0.5 ppm. The errors indicate that, in the absence of synoptic systems, 1300 LST satellite measurements over productive ecosystems are generally within 0.5 ppm of the diurnal mean and actually become less biased as the domain size increases. This result is similar to results from Olsen and Randerson [2004] and Miller et al. [2007] that indicate that column measurements over productive ecosystems have a diurnal maximum in the early morning, a minimum in the late afternoon, and are near the diurnal mean at 1300 LST.

Figure 9.

Diurnal sampling errors for NA (solid) and SA (shaded), which are the differences between the simulated satellite concentrations from each track using only clear-sky pixels and the diurnal mean CO2 concentration from the entire domain, from 0000 to 0000 UT.

[38] The diurnal sampling errors for NA are negatively biased by ∼0.2 ppm for both the coarse and the fine grid columns, indicating that sampling at 1300 LST underestimates the diurnal average for this case. Over the fine domain, 85% of the tracks capture the diurnal average within 0.5 ppm. The remaining tracks underestimate the mean by ∼1 ppm. Since the total column concentration over the domain is driven by synoptic variability associated with cloud cover rather than the diurnal cycle due to vegetation, the large errors are idiosyncratic, resulting both from clouds masking the high concentrations and from the timing of the fronts. The bias and standard deviation on the NA coarse domain are even larger. Rather than having a small subset of tracks underestimating the diurnal mean, the distribution is negatively shifted. Only 65% of the tracks have errors less than 0.5 ppm, indicating that over regions that have large synoptic variability the diurnal mean is not well sampled with a clear-sky satellite measurement taken at a single snapshot in time.

3.4.2. Bimonthly Sampling Errors

[39] We calculated temporal sampling errors that arise from comparing satellite concentrations to a domain average bimonthly mean by subtracting the domain-averaged CO2 mean for the entire simulation from the 1300 LST satellite tracks (Figure 10). These errors are very large for all cases, as evidenced by the large standard deviations. The errors are biased by −0.4 ppm over NA. The NA sampling distributions for both the fine and coarse domains are negatively shifted, showing that the clear-sky simulated satellite concentrations systematically underestimate the temporal average. Over both domains in NA, only ∼50% of the tracks had errors less than 0.5 ppm. The large positive errors seen by a few tracks in the large domain are a result of the satellite concentrations sampled on 15 August, as a few pixels in the northwest corner of the domain were clear just prior to the frontal passage as the CO2 concentration was increasing. Sampling between clouds enabled the satellite to observe higher concentrations associated with the front, but the front caused such a large anomaly in column CO2 that the concentrations were actually higher than the domain-averaged temporal mean. At synoptic scales, horizontal and vertical mixing work together to cause these strong CO2 variations along cold fronts [Parazoo, 2007]. Since synoptic weather patterns can carry large CO2 anomalies and since these weather disturbances and frontal passages are associated with clouds, clear-sky satellite measurements have large errors compared to temporal averages over regions with synoptic variability.

Figure 10.

Temporal sampling errors for NA (solid) and SA (shaded), which are the differences between the simulated satellite concentrations from each track using only clear-sky pixels and the 10-d domain average. (top) Fine grid column and (bottom) coarse grid column results.

[40] Over SA the standard deviation is also large for bimonthly errors; however, the sampling distributions are unbiased. Even in a case driven by local vegetation and circulation, a substantial number of simulated satellite tracks have errors larger than 1 ppm. On the fine domain, only 40% of the tracks have errors less than 0.5 ppm, and only 45% of the simulated satellite concentrations have errors less than 0.5 ppm on the coarse domain. The large errors indicate that even in conditions dominated by local fluxes and circulation patterns, clear-sky satellite measurements sampled at 1300 LST cannot represent bimonthly temporal averages without a substantial chance of introducing large errors.
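Both temporal errors in this section follow the same recipe: a single 1300 LST clear-sky sample minus a time average of the domain mean, taken over one day for the diurnal error or over the whole simulation for the bimonthly error. The Python sketch below illustrates this with a hypothetical two-day hourly series; the values and the name `temporal_sampling_error` are assumptions and do not come from the simulations.

```python
from statistics import mean


def temporal_sampling_error(sat_value, domain_mean_series):
    """Satellite sample minus the time average of the domain mean (ppm).

    sat_value          : clear-sky satellite concentration at overpass time
    domain_mean_series : domain-mean column CO2 over the averaging window
    """
    return sat_value - mean(domain_mean_series)


# Hypothetical hourly domain means: a clear day followed by a cloudy,
# high-CO2 day on which no retrieval is possible.
day1 = [374.0] * 24
day2 = [378.0] * 24
sat_day1 = 374.0  # the only available sample, taken at 1300 LST on day 1

diurnal_error = temporal_sampling_error(sat_day1, day1)
whole_period_error = temporal_sampling_error(sat_day1, day1 + day2)
```

In this toy case the sample matches its own diurnal mean (error 0 ppm) but underestimates the two-day mean by 2 ppm, because the high concentrations on the cloudy day are never sampled. The same logic underlies the requirement that satellite values be compared with modeled concentrations sampled at the same time rather than with temporal averages.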

4. Conclusions

[41] Using a coupled ecosystem-atmosphere cloud-resolving model, we investigated sampling errors that may be introduced into inversions that use satellite retrievals of total column CO2 in clear conditions. We performed two simulations: one over the midcontinental United States and one in the Brazilian Amazon. The main driver of column CO2 variability in the NA case was synoptic systems associated with cloud cover, while the source of CO2 variability in SA was the diurnal cycle and mesoscale circulations.

[42] Spatial representation errors were unbiased and less than 0.5 ppm for a 100 × 100 km domain; however, the errors increased in the NA case when a single satellite track was used to represent a coarse (450 × 450 km) grid column. The local clear-sky errors exhibited the same patterns as the spatial errors: the majority of the errors were <0.3 ppm for a 100 × 100 km domain, but the errors became negatively biased and large (>2 ppm) for the coarse grid column of the NA simulation. Neither the spatial nor the local clear-sky errors increased over the coarse SA grid column, where the variability was due to surface heterogeneity and local circulations. The main cause of large and biased spatial and clear-sky errors was not surface heterogeneity but rather synoptic systems associated with cloud cover. CO2 observations across North America showed large day-to-day CO2 variations associated with passing weather disturbances manifested as surface cold fronts [Parazoo, 2007]. Parazoo [2007] found that although ecosystem response to frontal weather played a role, the majority of the CO2 variations (70–90%) along fronts were due to horizontal and vertical mixing. The resulting strong, coherent CO2 patterns were then transported across the continent by horizontal advection. Since frontal systems create large gradients of CO2 that are masked by clouds and cannot be sampled, inversions that use satellite measurements to represent coarse regions may incur large and biased spatial and local clear-sky errors. As inversions are influenced by a bias as small as a tenth of a ppm in the total column [Chevallier et al., 2007; Miller et al., 2007], satellite concentrations cannot be used to represent large regions with significant CO2 variability due to synoptic systems. Our analysis suggests that transport models should be run at high resolution to avoid introducing biases.

[43] Using satellite measurements to represent bimonthly temporal averages created large and biased errors. Even in a location where the main temporal variability was due to the diurnal cycle and local circulations, the bimonthly errors were larger than the expected retrieval error. Over NA, the errors were substantially negatively biased (approximately −0.4 ppm) for both the fine and coarse grid columns. Frontal systems that created CO2 gradients and that could not be sampled because of cloud cover caused not only errors larger than the expected spectroscopic retrieval error, but also sampling biases. Since sampling biases are harmful to inversions, satellite measurements cannot be used to represent temporal averages. As our case study chose the synoptic event with the strongest CO2 signal, the errors presented here are likely maximum error estimates; however, it is likely that biases will exist for all synoptic systems that are associated with clouds. In addition, the model overestimated the photosynthetic uptake for moderate radiation values, which could cause the role of large-scale advection relative to local changes in carbon flux to be overestimated. However, decreasing the uptake would increase the concentrations in cloudy conditions not visible by the satellite and would thus increase the negative bias in NA, making the results presented here robust despite this model deficiency.

[44] Systematic variations of CO2 along midlatitude fronts make model transport a priority. The model and the atmosphere must be sampled consistently, and observation operators in inversions must be accurate, including precise modeling of winds, clouds, fronts, and frontal timing. To avoid temporal sampling errors and biases, atmospheric transport must be modeled accurately and satellite mixing ratios must be used to optimize modeled concentrations sampled at the same time.

Acknowledgments

[45] We thank Pedro Silva Dias, Maria Assunção Silva Dias, and Saulo Freitas for their SiB2-RAMS coupling work. Flux measurements at the WLEF tower were supported by the Department of Energy's Office of Science (BER) through the National Institute for Global Environmental Change and the Terrestrial Carbon Processes program. The carbon dioxide mixing ratio measurements at the WLEF tower are maintained by the National Oceanic and Atmospheric Administration. Flux and CO2 data in South America were supported by NASA grant NCC5-684. We thank Kenneth Davis for the data from the WLEF tower and Steven Wofsy for the Km 67 tower data. This research was funded by the NASA Earth System Science Fellowship NCC5-61 and NASA contracts NNG0SGD15G and NNX06AC75G. We gratefully acknowledge the constructive comments by two anonymous reviewers, which improved the quality of the manuscript.
