A geostatistical approach to surface flux estimation of atmospheric trace gases



[1] Inverse modeling methods have been used to estimate surface fluxes of atmospheric trace gases such as CFCs, CH4, and CO2 on the basis of atmospheric mass fraction measurements. A majority of recent studies use a classical Bayesian setup, in which prior flux estimates at regional or grid scales are specified in order to further constrain the flux estimates. This paper, on the other hand, explores the applicability of using a geostatistical approach to the inverse problem, a Bayesian method in which the prior probability density function is based on an assumed form for the spatial and/or temporal correlation of the surface fluxes, and no prior flux estimates are specified. The degree to which surface fluxes at two points are expected to be correlated is defined as a function of the separation distance in space or in time between the two points. Flux estimates obtained in this manner are not subject to some of the limitations associated with traditional Bayesian inversions, such as potential biases created by the choice of prior fluxes and aggregation error resulting from the use of large regions with prescribed flux patterns. In essence, they shed light on the information contained in the measurements themselves. The geostatistical algorithm is tested using CO2 pseudodata at 39 observation locations to recover surface fluxes on a 3.75° latitude by 5.0° longitude grid. Results show that CO2 surface flux variations can be recovered on a significantly smaller scale than that imposed by inversions that group surface fluxes into a small number of large regions.

1. Introduction

[2] The use of inverse modeling methods as a tool for estimating surface fluxes of atmospheric trace gases has become increasingly common as the need to constrain their global and regional budgets has been recognized [Houghton et al., 2001; Committee on the Science of Climate Change, Division on Earth and Life Studies, National Research Council, 2001; Wofsy and Harriss, 2002]. Inverse methods attempt to deconvolute the effects of atmospheric transport and recover source fluxes (typically surface fluxes) on the basis of atmospheric measurements. Information about regions that are not being directly sampled can potentially be inferred from downwind atmospheric measurements. Inverse modeling methods have been used to estimate regional contributions to global budgets of trace gases such as CFCs, CH4, and CO2, and a review of recent applications is presented by Enting [2002, chap. 14–17].

[3] The ill-conditioned nature of the inverse problem constitutes a principal difficulty in constraining trace gas emissions. Substantially differing source/sink configurations do not necessarily lead to substantial differences in modeled mixing ratios at observational network sites. Therefore small uncertainties in the observational data correspond to much higher uncertainties in the estimated emission magnitudes [Enting and Newsam, 1990; Brown, 1993; Hein et al., 1997]. In order to extract a meaningful solution, either the number of unknowns has to be decreased by substantially limiting the number of flux regions that are estimated [e.g., Brown, 1993; Tans et al., 1990], or additional information on the sources and sinks has to be introduced into the calculation.

[4] In atmospheric science, this additional information has often been introduced by requiring that the source estimates resulting from the inversion be close to a first guess, or a priori information, on the sources. This can be done in a consistent way by adopting a classical Bayesian approach, in which all parameters are expressed as statistical probability distributions. This paper investigates the applicability of an alternate geostatistical approach, where the prior information is defined solely on the basis of a spatial and/or temporal correlation between the fluxes.

[5] In the classical Bayesian approach, the solution to the inverse problem of flux estimation is defined as the set of parameter values that represent an optimal balance between two requirements. First, the optimized, or a posteriori, fluxes should be as close as possible to the first-guess, or a priori, fluxes. Second, the measurement values that would result from the inversion-derived (a posteriori) fluxes should agree as closely as possible with the actual measured concentrations. Mathematically, this solution corresponds to the minimum of a cost function Ls, defined as

$$L_s = \frac{1}{2}(z - Hs)^T R^{-1} (z - Hs) + \frac{1}{2}(s - s_p)^T Q^{-1} (s - s_p) \tag{1}$$

where z is an n × 1 vector of observations, H is a known n × m matrix, the Jacobian representing the sensitivity of the observations z to the function s (i.e., $H_{i,j} = \partial z_i / \partial s_j$), s is an m × 1 vector of the discretized unknown surface flux distribution, R is the n × n model-data mismatch covariance, $s_p$ is the m × 1 prior estimate of the flux distribution s, Q is the covariance of flux deviations from the prior estimate $s_p$, and the superscript T denotes the matrix transpose operation. Typically, both R and Q have been modeled as diagonal matrices. A solution in the form of a superposition of all statistical distributions involved can be computed, from which a posteriori means and covariances can be derived [e.g., Enting et al., 1995]. The solution is [Tarantola, 1987; Enting, 2002]

$$\hat{s} = s_p + Q H^T (H Q H^T + R)^{-1} (z - H s_p) \tag{2}$$

$$V_{\hat{s}} = Q - Q H^T (H Q H^T + R)^{-1} H Q \tag{3}$$

where $\hat{s}$ is the posterior best estimate of s and $V_{\hat{s}}$ is its posterior covariance.
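As an illustration of equations (1) through (3), the following minimal sketch applies the classical Bayesian update to a small synthetic problem. The problem sizes, the random matrix standing in for H, and the variances are all hypothetical choices made for illustration; they are not taken from the application presented in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 8                          # hypothetical sizes: 5 observations, 8 flux cells
H = rng.normal(size=(n, m))          # stand-in sensitivity matrix, not a transport model
s_true = rng.normal(size=m)          # "actual" fluxes, used only to create pseudodata
R = 0.1**2 * np.eye(n)               # diagonal model-data mismatch covariance
Q = 1.0**2 * np.eye(m)               # diagonal prior covariance
s_p = np.zeros(m)                    # prior flux estimate
z = H @ s_true + rng.normal(scale=0.1, size=n)

# Equations (2) and (3): G = Q H^T (H Q H^T + R)^{-1}, using a solve instead of an inverse
Psi = H @ Q @ H.T + R
G = np.linalg.solve(Psi, H @ Q).T
s_hat = s_p + G @ (z - H @ s_p)
V_s_hat = Q - G @ H @ Q

def cost(s):
    """Cost function of equation (1)."""
    rz, rs = z - H @ s, s - s_p
    return 0.5 * rz @ np.linalg.solve(R, rz) + 0.5 * rs @ np.linalg.solve(Q, rs)
```

The estimate `s_hat` is the minimizer of `cost`, so the gradient of the cost function vanishes there, and `V_s_hat` coincides with the inverse of $H^T R^{-1} H + Q^{-1}$.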

[6] As will be presented in more detail in section 2.3, the geostatistical approach entails modifying the Bayesian objective function to

$$L_{s,\beta} = \frac{1}{2}(z - Hs)^T R^{-1} (z - Hs) + \frac{1}{2}(s - X\beta)^T Q^{-1} (s - X\beta) \tag{4}$$

where X is a known m × p matrix, β are p × 1 unknown drift coefficients, and Xβ is the model of the mean of the surface flux distribution. The covariance matrix Q is based on a spatial and/or temporal correlation structure for the flux distribution s and will therefore have nonzero off-diagonal components. The inverse problem involves solving for both β and s. In addition, the parameters (e.g., variance and correlation length) of R and Q can also be estimated using the data themselves.

[7] One of the limitations of the classical Bayesian approach is that it is often difficult to estimate the prior uncertainty and model-data mismatch, making it difficult to estimate the reduction in uncertainty and the absolute a posteriori uncertainties of source magnitudes resulting from the integration of atmospheric data [Hein et al., 1997; Houweling et al., 1999; Bousquet et al., 1999; Rayner et al., 1999]. Also, similar data are sometimes used in defining and updating prior flux estimates [Hein et al., 1997; Houweling et al., 1999; Bousquet et al., 1999], which is not strictly correct given the assumptions of the Bayesian approach. In addition, erroneous prior flux estimates can lead to estimated fluxes that are inconsistent with the atmospheric data and/or do not correspond to actual flux patterns [Brown, 1993]. This can be due to narrow uncertainty bounds being assigned to unrealistic prior flux estimates or to incorrect spatial flux patterns being assigned within regions, which can lead to aggregation errors [e.g., Kaminski et al., 2001]. Finally, if all available data are used in defining and/or updating the prior flux estimates, no additional data are available for independently validating the obtained final flux estimates.

[8] A second issue to be considered is the resolution at which fluxes are estimated. The vast majority of studies conducted up to this point have attempted to identify fluxes at continental or ocean basin scales, which can be referred to as a “big regions” perspective. As such, fluxes are aggregated into a few large regions, and emission distributions over predefined regions are assumed to be perfectly well known. The result of such a setup is that the number of unknowns, i.e., the total number of fluxes to be estimated, is greatly reduced relative to the number of surface grid cells used in the transport model. The advantage of such an approach is that it typically renders the overall problem overdetermined, in the sense that the total number of available observations is greater than the number of fluxes to be estimated. Therefore, even if certain regions are less well sampled than others, they can usually be constrained to some extent. The disadvantage is that variations in fluxes at scales smaller than the selected regions cannot be estimated. In addition, aggregation errors can occur when incorrect flux patterns are assigned within regions. If measurements are sensitive to these prescribed flux patterns, the inferred total fluxes for given regions will not be representative of the actual overall fluxes for these regions [Kaminski et al., 2001; Peylin et al., 2002; Law et al., 2002; Rödenbeck et al., 2003].

[9] As a result of these issues, certain researchers have moved toward grid-scale inversions, where the fluxes are estimated at a resolution close to that of the atmospheric transport model used [Kaminski et al., 1999b; Houweling et al., 1999; Rödenbeck et al., 2003]. These studies have used resolutions as fine as 8° latitude by 10° longitude. In such a setup, the problem is strongly underdetermined, with the number of fluxes to be estimated being significantly greater than the number of available observations, and results in infinite variances on the recovered fluxes if no other information is used to constrain the problem. For this reason, grid-scale inversions have all relied on a Bayesian framework to introduce additional prior information into the solution and help constrain the estimates.

[10] Because of the underdetermined nature of the problem, most studies that have coupled prior flux information about grid-scale fluxes with atmospheric measurements have found that the reduction in uncertainty relative to the specified prior flux estimate uncertainty was small and the inversion yielded flux estimates that were similar to the prior flux estimates used to constrain the solution [Kaminski et al., 1999b; Houweling et al., 1999]. These studies have minimized dependence on prior information by modeling the uncertainties in the fluxes as fully uncorrelated between grid cells [Kaminski et al., 1999b; Houweling et al., 1999]. This is opposite to the big regions approach, where either fluxes over a region are assumed to be fully correlated (i.e., uniform), or their variation is assumed to be perfectly known, with a prescribed flux pattern within regions.

[11] It is reasonable to assume, however, that reality lies somewhere in between the two extremes of either perfectly correlated or completely uncorrelated fluxes at the grid scale and that small-scale spatial patterns exist in surface fluxes that the data themselves can help in defining. In fact, Rödenbeck et al. [2003] recently made a first attempt at introducing spatial correlations within a traditional Bayesian framework, presenting a method that required the specification of flux patterns and correlations among source strengths in addition to prior flux estimates.

[12] Spatial correlation can offer useful additional information that can be used to reduce the uncertainty of source estimates. That is precisely the goal of the geostatistical approach to inverse modeling, which uses inferred information about spatial and/or temporal correlations in the unknown function (in this case, surface fluxes of atmospheric trace gases) in addition to available measurements to constrain the estimate of the function, without specifying a prior estimate. Because prior flux estimates are not used, the inversion is strongly data-driven and sheds light on whether useful flux information can be derived from the data themselves. The feasibility of implementing such an approach for atmospheric inverse modeling, specifically for the estimation of surface fluxes of atmospheric trace gases, is the subject of this paper.

[13] The objective of this paper is twofold. First, it develops the implementation of a geostatistically based inversion method for estimating surface fluxes of atmospheric trace gases. The presented application is for the recovery of a yearly averaged global CO2 surface flux distribution using monthly averaged concentration measurements. This sample application uses pseudodata in order to isolate the behavior of the inversion algorithm from other factors, such as the accuracy of the transport model and the measurement error associated with observations [Hartley and Prinn, 1993; Plumb and Zheng, 1996; Mulquiney and Norton, 1998; Law et al., 2002]. In addition, the use of pseudodata allows for a direct comparison between the “actual” fluxes (which would be unknown in a real-data case) and the fluxes inferred from the limited available measurements. Second, this paper is also intended to describe the presented methodology in enough detail to make it possible for interested parties to implement it and use it for their specific applications. To this end, several additional references, partial derivations, and examples are provided wherever practical.

[14] The remainder of this paper is organized as follows. Section 2 discusses the geostatistical approach to inverse modeling and provides a detailed description of the methodology as applied to atmospheric problems. Section 3 presents the sample pseudodata application involving the estimation of yearly averaged CO2 surface fluxes. Section 4 presents a discussion of the results, and section 5 draws conclusions and discusses future avenues for the application of the geostatistical approach to inverse modeling.

2. Geostatistical Approach to Inverse Modeling

[15] This section describes the geostatistical approach to inverse modeling, along with an outline of its implementation for atmospheric inverse modeling.

2.1. Basic Principles

[16] The field of geostatistics, or the theory of regionalized variables, was introduced by Matheron [1963, 1971] and is an adaptation of least squares methods to quantities that are correlated in space. Geostatistical inverse modeling methods have been used extensively in groundwater systems, mainly in estimating spatial patterns of hydraulic conductivity or transmissivity based on transmissivity and hydraulic head measurements [e.g., Kitanidis and Vomvoris, 1983; Hoeksema and Kitanidis, 1984; Gelhar, 1993; Kitanidis, 1995; Yeh and Zhang, 1996; Zimmerman et al., 1998]. Similar methods have also been used for subsurface characterization using data such as ground-penetrating radar (GPR) and seismic measurements [e.g., Rea and Knight, 1998; Doyen, 1988]. Recently, geostatistically based methods have also been applied to the identification of contaminant sources in groundwater systems [Snodgrass and Kitanidis, 1997; Michalak and Kitanidis, 2002, 2003, 2004a, 2004b]. Source identification problems in groundwater contaminant hydrology typically involve the estimation of the release history from a given source or the identification of the location of sources of contamination. The problem of groundwater contaminant source identification is similar to the one being examined here. The transport of solute in groundwater is modeled as being governed by a second-order advection-dispersion equation, with optional reactive terms. The heterogeneity of the subsurface complicates analyses in a similar manner as spatially and temporally variable wind fields affect inverse modeling in atmospheric applications.

[17] The geostatistical approach to inverse modeling is a Bayesian approach, as was presented in equation (4). As such, it is based on the principle of combining prior information with information supplied by available measurements. In the geostatistical approach, however, the prior information is not an initial estimate of source fluxes for given regions or grid cells. Instead, the prior information is in the form of a spatial and/or temporal correlation. What is prescribed is the degree to which the deviations of surface fluxes from their mean behavior at two different locations or times are expected to be correlated, as a function of the distance in space or in time between the two points at which the flux is to be estimated. The correlation structure prescribed in the geostatistical approach is specified as a prior covariance matrix for deviations from the mean. The mean of the flux distribution is specified either as a constant or as a function of auxiliary variables such as time, latitude, population density, etc.

[18] The first key component of the approach is the model of the mean, Xβ, which defines the factors that are expected to affect the mean behavior of the surface fluxes. For example, if we expect land fluxes to behave differently from ocean fluxes, a separate mean can be defined for ocean and land grid cells. In addition, if the mean behavior is expected to vary with other variables, this can also be specified. The actual parameters of the model of the mean are not prescribed a priori, but are instead inferred from the data as part of the inversion, in a manner somewhat analogous to multiple linear regression. The model of the mean is therefore not equivalent to the priors used in a traditional Bayesian framework (see equation (1)). This point will be discussed further in section 2.3.2.

[19] The second key component of the geostatistical approach is the model used for the prior covariance. The prior covariance function of the surface fluxes s is

$$Q(\theta) = E\left[(s - E[s])(s - E[s])^T\right] \tag{5}$$

where Q(θ) is a known function of parameters θ, where θ can encompass parameters such as a correlation length and variance and where E[ ] designates the expected value of a variable. For example, in most applications of the traditional Bayesian approach, deviations from the prior estimates of s are assumed uncorrelated, and the prior covariance matrix is typically

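$$Q = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_m^2 \end{bmatrix} \tag{6}$$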

where $\sigma_i^2$ are prescribed variances. In geostatistical applications the prior covariance model is based on a selected covariance or generalized covariance function. Covariance models define the rate at which the correlation of the surface flux distribution's deviation from its mean behavior decays with the separation between two points (for a thorough discussion of this topic, see, e.g., Cressie [1991] and Kitanidis [1997a]). Parameters required by the selected covariance model, such as the variance and correlation length of the process being estimated, can also be estimated using the data themselves. Note that when spatial correlation is taken into account by applying a covariance model, the prior covariance matrix Q has nonzero off-diagonal elements.

2.2. Potential Advantages for Atmospheric Modeling

[20] The geostatistical approach to inverse modeling has the potential to offer additional information relative to that obtained using Bayesian inversions that rely on prior flux estimates being assigned to regions or grid cells. Also, for some applications, the geostatistical approach has distinct advantages over methods applied in the past. First, because the geostatistical approach does not require a prior estimate of fluxes, it does not suffer from the risks associated with using prior flux estimates discussed in section 1. Second, the geostatistical approach can be used to estimate a variety of parameters that have had to be specified in the past, such as the spatial correlation parameters of the surface fluxes (e.g., correlation length and variance) and the variance associated with the model-data mismatch (which encompasses measurement error and transport error). Third, because the geostatistical approach is based on a compromise between reproducing available measurements and conserving spatial correlation in the unknown function, the method can be applied at any resolution. As the grid resolution increases, the resolved correlation length, posterior best estimate, and covariance function all converge. Thus the method allows for the estimation of surface fluxes on a much finer scale than is possible with region-scale models, while still resulting in meaningful confidence intervals.

[21] Overall, the geostatistical approach minimizes the number of assumptions that go into the solution of the inverse problem. The geostatistical method maximizes the extent to which the data can “speak for themselves” by allowing each component of the inversion to be data-driven. This is not meant to imply that the geostatistical approach is the best choice for every study. If part of the goal of a project is to quantify the degree to which atmospheric data themselves can constrain surface fluxes, however, the geostatistical approach offers a unique opportunity. In this way, it clarifies the evidence for fluxes, separating what can be deduced from atmospheric observations from what rests on other lines of evidence. We consider this to be a distinct advantage.

2.3. Methodology and Algorithm

[22] The linear geostatistical inverse modeling methodology as applied to atmospheric surface flux estimation is briefly described here. The reader is referred to Kitanidis and Vomvoris [1983], Hoeksema and Kitanidis [1984], and Kitanidis [1995] for additional details and background. Note that throughout this discussion, m refers to the number of points at which the surface flux distribution is to be estimated, n refers to the number of observations to be used to constrain the problem, and p refers to the number of terms in the model of the mean.

2.3.1. Summary of Algorithm

[23] The overall method proceeds as follows:

[24] 1. Select the prior covariance model Q of the fluxes s. See section 2.3.2.

[25] 2. Optimize model parameters Φ. These can include both parameters of the covariance model Q and of the model-data mismatch covariance R. See section 2.3.6.

[26] 3. Solve the linear inverse problem to obtain $\hat{s}$, the posterior best estimate of the unknown function, and $V_{\hat{s}}$, the posterior covariance of the estimate. See sections 2.3.4 and 2.3.5.

[27] 4. If needed, generate conditional realizations, $s_{ci}$. See section 2.3.7.

[28] Note that if the covariance parameters are known or have been estimated independently, step 2 can be omitted.

2.3.2. Setup

[29] Overall, the objective is to estimate an unknown surface flux distribution. The standard estimation problem may be expressed in the following form:

$$z = h(s, r) + \nu \tag{7}$$

where z is an n × 1 vector of observations and s is an m × 1 “state vector” obtained from the discretization of the surface fluxes that we wish to estimate. The vector r contains other parameters needed by the transport model function h(s, r). The model-data mismatch is represented by the vector ν. This error encompasses both the measurement error associated with collecting the data and any random numerical or conceptual inaccuracies associated with the evaluation of the function h(s, r). When the function h(s, r) is linear in the unknown s, as is the case with linear transport models such as the one that will be used here, it can be written as

$$h(s, r) = Hs \tag{8}$$

where H is a known n × m matrix, the Jacobian representing the sensitivity of the observations z to the surface fluxes s (i.e., $H_{i,j} = \partial z_i / \partial s_j$).

[30] Following geostatistical methodology, s and ν are represented as random vectors. We assume that ν has zero mean and known covariance matrix R. The covariance of the measurement errors is most commonly modeled as

$$R = \sigma_R^2 I \tag{9}$$

where $\sigma_R^2$ is the variance of the measurement error and I is an n × n identity matrix. The variance $\sigma_R^2$ can either be derived independently (e.g., by estimating the actual measurement error as well as the variance of the error introduced by the application of the chosen transport model) or can be estimated from the data themselves, as described in section 2.3.6. Also, although the variance $\sigma_R^2$ will here be considered as constant for all measurements, a variable variance could instead be identified, distinguishing between measurements that are easier or more difficult to reproduce. For example, sites could be broken up into categories that exhibit similar properties in terms of data reproduction, yielding different $\sigma_R^2$ values for different sites. Clearly, this could not be taken to the extreme of identifying a different variance for each available measurement.

[31] We model s as a random vector with a priori expected value (i.e., mean)

$$E[s] = X\beta \tag{10}$$

where X is a known m × p matrix, β are p × 1 unknown drift coefficients, and Xβ is the model of the mean. For example, for a constant mean, p = 1,

$$X = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T \tag{11}$$

and β is the prior mean of the surface fluxes, an unknown scalar. Note that this formulation is appropriate for fluxes expressed in mass or moles per unit area and time. If total fluxes per grid cell were to be estimated, the elements of X would need to be scaled by the relative area of individual grid cells. For a system where fluxes in individual grid cells are expected to form two distinct populations, each with a constant mean (e.g., if ocean and land grid cells are expected to have different means), X and β take the form

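$$X = \begin{bmatrix} 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \tag{12}$$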

where β1 and β2 are again unknown scalars. For a system where there is a single population and the mean of the fluxes is expected to have a linear trend with an additional variable t,

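$$X = \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_m \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \tag{13}$$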

The model of the mean could also take on more complex forms to include other factors that are known to correlate with flux intensities (e.g., population, vegetation cover, patterns obtained through remote sensing). In general, the model of the mean is chosen to have the simplest form that captures the essential behavior of the mean of the unknown function, and the number p of drift parameters is very small. The general guideline is to only use components of the model of the mean that are known to be a determining factor in the function's behavior. Using a simple model may lead to higher uncertainty in the posterior fluxes, but using an erroneous, more complex model can lead to biased results, which is more of a concern. Statistical tests can be performed to test the validity of incorporating additional terms in the model of the mean [Kitanidis, 1997b]. Note that even when the model of the mean is quite simple, the resulting best estimate of the flux distribution can be quite complex because the prior covariance matrix Q prescribes a correlation structure to deviations from the mean behavior.
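The three forms of X discussed above can be assembled in a few lines. The following is a minimal sketch, assuming a hypothetical six-cell grid with an illustrative land/ocean mask `is_land` and auxiliary variable `t`; the drift coefficients in `beta` are fixed here only to show how Xβ is evaluated, whereas in the inversion they are unknown and estimated from the data.

```python
import numpy as np

m = 6                                                         # hypothetical number of grid cells
is_land = np.array([True, True, False, False, True, False])   # hypothetical land mask
t = np.linspace(0.0, 1.0, m)                                  # hypothetical auxiliary variable

# Constant mean, p = 1: a single column of ones
X_const = np.ones((m, 1))

# Two populations, p = 2: one indicator column per population (land, ocean)
X_two = np.column_stack([is_land, ~is_land]).astype(float)

# Single population with a linear trend in t, p = 2: intercept and slope columns
X_trend = np.column_stack([np.ones(m), t])

# Illustrative drift coefficients (estimated, not prescribed, in the actual method)
beta = np.array([2.0, -1.0])
mean_two = X_two @ beta        # land cells get beta[0], ocean cells get beta[1]
mean_trend = X_trend @ beta    # beta[0] + beta[1] * t
```

In each case the number of columns of X equals p, the number of drift coefficients, which is why p remains very small even when m is large.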

[32] Given the model of the mean in equation (10), the prior covariance matrix of s defined in equation (5) takes the form

$$Q(\theta) = E\left[(s - X\beta)(s - X\beta)^T\right] \tag{14}$$

This prior covariance matrix can be based on a covariance function or a generalized covariance function (GCF). GCFs extend the applicability of the methods to nonstationary functions [Matheron, 1973; Kitanidis, 1993]. Both covariance functions and GCFs are associated with corresponding variograms. A variogram defines the expected variance of the deviation of function values from their mean behavior as a function of separation distance [see, e.g., Cressie, 1991; Kitanidis, 1997a]. Various forms of the covariance matrix have been used in groundwater contaminant source identification. These include the Gaussian covariance function [Snodgrass and Kitanidis, 1997], the linear GCF [Michalak and Kitanidis, 2003], and the cubic GCF [Michalak and Kitanidis, 2002, 2004a]. For the applications presented here, the exponential covariance function will be used, which, for a set of correlated points, is defined as

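$$Q(h \mid \theta) = \sigma^2 \exp\left(-\frac{h}{l}\right) \tag{15}$$

$$Q_{i,j}(\theta) = \sigma^2 \exp\left(-\frac{h_{i,j}}{l}\right), \qquad i, j = 1, \ldots, m \tag{16}$$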

where θ = {$\sigma^2$, l}, $\sigma^2$ is a variance, l is an integral scale, h is the separation distance between two points at which s is estimated, and ∣ means “given.” The corresponding variogram is defined as

$$\gamma(h \mid \theta) = \sigma^2 \left[1 - \exp\left(-\frac{h}{l}\right)\right] \tag{17}$$

For this model, as the separation distance h between two points goes to infinity, the mean square difference between the unknown function's deviations from its mean behavior at these points approaches $\sigma^2$, and the covariance between the points approaches zero.
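The exponential model can be illustrated with a short sketch that builds a prior covariance matrix from pairwise separation distances. The one-dimensional coordinates `x` and the parameter values are hypothetical; for a global grid, h would instead be a distance between grid cell locations on the sphere.

```python
import numpy as np

def exponential_covariance(h, sigma2, l):
    """Exponential covariance: sigma2 * exp(-h / l)."""
    return sigma2 * np.exp(-np.asarray(h, dtype=float) / l)

def exponential_variogram(h, sigma2, l):
    """Corresponding variogram: sigma2 * (1 - exp(-h / l))."""
    return sigma2 * (1.0 - np.exp(-np.asarray(h, dtype=float) / l))

# Hypothetical 1-D locations at which the flux is estimated
x = np.array([0.0, 1.0, 2.5, 4.0, 7.0])
h = np.abs(x[:, None] - x[None, :])      # matrix of pairwise separation distances

# Prior covariance matrix with nonzero off-diagonal elements
Q = exponential_covariance(h, sigma2=2.0, l=3.0)
```

The diagonal of `Q` equals the variance, off-diagonal entries decay with separation, and the covariance and variogram sum to the variance at every separation distance.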

[33] The choice of covariance model should be based on our understanding of the problem and the expected behavior of the function to be estimated. An examination of the characteristics of various covariance functions is given by, for example, Cressie [1991] and Kitanidis [1993, 1997a]. The choice of covariance function used in this work will be discussed further in section 3.3.1.

[34] The parameters needed by the selected covariance model (e.g., $\sigma^2$ and l in the case of the exponential covariance) can either be known a priori or can be estimated as described in section 2.3.6.

2.3.3. Bayesian Framework

[35] Geostatistical inverse modeling follows a Bayesian approach. Bayes' theorem states that the posterior pdf of a state vector s given an observation vector z is proportional to the likelihood of the state given the data (or, conversely, the pdf of the data given the state) times the prior pdf of the state. Because we are assuming that the drift parameters β are unknown as well, they are estimated along with s. Symbolically,

$$p''(s, \beta) \propto p(z \mid s, \beta) \, p'(s, \beta) \tag{18}$$

In this context, the terms prior and posterior refer to the state of knowledge before and after the data z are used. In the geostatistical approach, the prior represents the assumed spatial or temporal structure of the unknown surface fluxes, as described by a covariance function. The likelihood of the data represents the degree to which an estimate of the unknown function s reproduces the available data z.

[36] The prior is modeled as

$$p'(s, \beta) \propto |Q|^{-1/2} \exp\left[-\frac{1}{2}(s - X\beta)^T Q^{-1} (s - X\beta)\right] \tag{19}$$

where ∣ ∣ denotes matrix determinant and the prior probability density function of β is assumed to be uniform over all values (p′(β) ∝ 1).

[37] The likelihood function is defined in the same way as in past Bayesian atmospheric studies:

$$p(z \mid s, \beta) \propto |R|^{-1/2} \exp\left[-\frac{1}{2}(z - Hs)^T R^{-1} (z - Hs)\right] \tag{20}$$

[38] The posterior probability density of the unknown flux distribution s therefore becomes

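$$p''(s, \beta) \propto |R|^{-1/2} |Q|^{-1/2} \exp\left[-\frac{1}{2}(z - Hs)^T R^{-1} (z - Hs) - \frac{1}{2}(s - X\beta)^T Q^{-1} (s - X\beta)\right] \tag{21}$$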

Its negative logarithm, which is the objective function that will be minimized in obtaining a best estimate of the flux distribution, is as presented in equation (4).
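The relationship between the posterior density and the objective function can be checked numerically. The sketch below, which uses hypothetical problem sizes and a random stand-in for H, evaluates the quadratic objective of equation (4) and a negative log posterior that also carries the log-determinant terms of Q and R. Because those terms do not depend on s or β, the two functions differ only by a constant and share the same minimizer.

```python
import numpy as np

def objective(s, beta, z, H, X, Q, R):
    """Quadratic objective of equation (4): data misfit plus deviation from X beta."""
    rz = z - H @ s
    rs = s - X @ beta
    return 0.5 * rz @ np.linalg.solve(R, rz) + 0.5 * rs @ np.linalg.solve(Q, rs)

def neg_log_posterior(s, beta, z, H, X, Q, R):
    """Negative log posterior, up to an additive normalization constant."""
    _, logdet_R = np.linalg.slogdet(R)
    _, logdet_Q = np.linalg.slogdet(Q)
    return 0.5 * (logdet_R + logdet_Q) + objective(s, beta, z, H, X, Q, R)

# Hypothetical toy problem
rng = np.random.default_rng(2)
n, m = 4, 6
H = rng.random((n, m))                   # stand-in Jacobian, not a transport model
X = np.ones((m, 1))                      # constant mean
x = np.arange(m, dtype=float)
Q = np.exp(-np.abs(x[:, None] - x[None, :]) / 2.0)
R = 0.1 * np.eye(n)
z = rng.normal(size=n)

# Two arbitrary candidate solutions
s_a, beta_a = rng.normal(size=m), np.array([0.3])
s_b, beta_b = rng.normal(size=m), np.array([-1.2])
offset_a = neg_log_posterior(s_a, beta_a, z, H, X, Q, R) - objective(s_a, beta_a, z, H, X, Q, R)
offset_b = neg_log_posterior(s_b, beta_b, z, H, X, Q, R) - objective(s_b, beta_b, z, H, X, Q, R)
```

The offsets `offset_a` and `offset_b` are identical, which is why the determinants can be dropped when solving for s and β but must be retained when the parameters of Q and R are themselves being estimated.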

2.3.4. Solution

[39] Because the geostatistical approach does not incorporate a prior estimate of the fluxes and sp is therefore not defined, equations of the form presented in equations (2) and (3) cannot be used for the solution of the problem. Instead, a linear system of equations is derived, the solution of which is then used to define the posterior estimate and covariance of s.

[40] The objective function to be minimized is the negative logarithm of the posterior probability density function of s, as defined in equation (4). Here Xβ takes the place of the prior estimate of s used in traditional Bayesian modeling. The posterior best estimates of s and β, denoted $\hat{s}$ and $\hat{\beta}$, minimize $L_{s,\beta}$. Taking the derivative of the objective function $L_{s,\beta}$ with respect to s and β and setting these equal to zero yields, respectively,

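$$\hat{s} = X\hat{\beta} + Q H^T \Psi^{-1} (z - HX\hat{\beta}) \tag{22}$$

$$\Psi = H Q H^T + R \tag{23}$$

$$\hat{\beta} = (X^T H^T \Psi^{-1} H X)^{-1} X^T H^T \Psi^{-1} z \tag{24}$$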

[41] These equations can be rearranged to define an m × n matrix of coefficients Λ according to

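$$\hat{s} = \Lambda z \tag{25}$$

$$\Lambda = Q H^T \Psi^{-1} + (X - Q H^T \Psi^{-1} H X)(X^T H^T \Psi^{-1} H X)^{-1} X^T H^T \Psi^{-1} \tag{26}$$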

In addition, we define a p × m matrix of multipliers M, where

$$M = (X^T H^T \Psi^{-1} H X)^{-1} (X^T H^T \Psi^{-1} H Q - X^T) \tag{27}$$

Manipulating the above equations and expressing the results in matrix form, we obtain

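$$\begin{bmatrix} \Psi & HX \\ (HX)^T & 0 \end{bmatrix} \begin{bmatrix} \Lambda^T \\ M \end{bmatrix} = \begin{bmatrix} HQ \\ X^T \end{bmatrix} \tag{28}$$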

Once this system is solved for Λ and M, we obtain $\hat{s}$ from equation (25). The size of the matrix to be inverted (equation (28)) is (n + p) × (n + p), whereas the inversion in the classical Bayesian approach (equations (2) and (3)) is n × n. Given that p is generally very small, the numerical cost of a geostatistical inversion is comparable to that of an equivalent classical Bayesian inversion.
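A minimal sketch of this solution step is given below for a small synthetic problem. It assumes that the linear system takes the standard block form used in geostatistical inversion, with Ψ = HQHᵀ + R bordered by HX, unknowns Λᵀ and M, and right-hand side built from HQ and Xᵀ. The problem sizes, coordinates, random stand-in for H, and covariance parameters are hypothetical. The result is cross-checked against the closed-form expressions obtained by setting the derivatives of the objective function to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 6, 10, 1                       # hypothetical problem sizes
x = np.arange(m, dtype=float)            # hypothetical 1-D cell coordinates
H = rng.random((n, m))                   # stand-in Jacobian, not a transport model
X = np.ones((m, p))                      # constant mean
Q = 1.5 * np.exp(-np.abs(x[:, None] - x[None, :]) / 3.0)   # exponential covariance
R = 0.05 * np.eye(n)                     # model-data mismatch covariance
s_true = 2.0 + np.linalg.cholesky(Q) @ rng.normal(size=m)
z = H @ s_true + rng.normal(scale=np.sqrt(0.05), size=n)   # pseudodata

Psi = H @ Q @ H.T + R
HX = H @ X

# Assemble the (n + p) x (n + p) system and solve for Lambda^T (n x m) and M (p x m)
A = np.block([[Psi, HX], [HX.T, np.zeros((p, p))]])
B = np.vstack([H @ Q, X.T])
solution = np.linalg.solve(A, B)
Lambda = solution[:n].T                  # m x n matrix of coefficients
M = solution[n:]                         # p x m matrix of multipliers
s_hat = Lambda @ z                       # posterior best estimate

# Cross-check: generalized least squares estimate of beta, then the estimate of s
Psi_inv_HX = np.linalg.solve(Psi, HX)
beta_hat = np.linalg.solve(HX.T @ Psi_inv_HX, Psi_inv_HX.T @ z)
s_check = X @ beta_hat + Q @ H.T @ np.linalg.solve(Psi, z - HX @ beta_hat)
```

Solving one bordered system yields both Λ and M, so the drift coefficients never need to be computed explicitly in order to obtain the flux estimate. The second block row of the system enforces ΛHX = X, meaning that the estimator reproduces any flux field that lies exactly in the model of the mean.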

2.3.5. Posterior Covariance

[42] The posterior covariance of $\hat{s}$ and $\hat{\beta}$ is given by the inverse of the Hessian of the objective function, which is

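$$\nabla^2 L_{s,\beta} = \begin{bmatrix} H^T R^{-1} H + Q^{-1} & -Q^{-1} X \\ -X^T Q^{-1} & X^T Q^{-1} X \end{bmatrix} \tag{29}$$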

Note that taking the inverse of the above matrix is not equivalent to taking the inverses of its parts. In fact, the uncertainty associated with the estimation of $\hat{\beta}$ is incorporated in the posterior covariance of $\hat{s}$. After taking the inverse of equation (29) analytically by using properties of partitioned matrices [e.g., Schweppe, 1973, pp. 495–496] and performing some linear algebra manipulations, the portion of the matrix defining the posterior covariance of $\hat{s}$ can be shown to be [Kitanidis, 1995]

$$V_{\hat{s}} = -XM + Q - QH^T\Lambda^T \qquad (30)$$

which does not require taking the inverse of Q or R and is therefore more numerically stable than taking the inverse of equation (29) directly. The diagonal elements of Vŝ represent the posterior variances of the individual elements of ŝ.
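The equivalence between the two routes can be checked numerically on a small example. The sketch below assumes the form Vŝ = −XM + Q − QH^TΛ^T for the inverse-free expression, and a Hessian built from the Gaussian objective function with blocks Q⁻¹ + H^TR⁻¹H, −Q⁻¹X, and X^TQ⁻¹X. All matrices are small hypothetical test inputs.

```python
import numpy as np

# Hypothetical toy inputs: m = 5 fluxes, n = 3 observations, p = 1 mean.
rng = np.random.default_rng(0)
m, n, p = 5, 3, 1
pos = np.arange(m, dtype=float)
Q = 2.0 * np.exp(-np.abs(pos[:, None] - pos[None, :]) / 1.5)
R = 0.1 * np.eye(n)
X = np.ones((m, p))
H = rng.uniform(0.1, 1.0, size=(n, m))

# Route 1: solve the block system, then use the inverse-free expression.
Psi = H @ Q @ H.T + R
HX = H @ X
lhs = np.block([[Psi, HX], [HX.T, np.zeros((p, p))]])
sol = np.linalg.solve(lhs, np.vstack([H @ Q, X.T]))
Lam_T, M = sol[:n], sol[n:]
V_s = -X @ M + Q - Q @ H.T @ Lam_T

# Route 2: invert the full Hessian and keep the block that belongs to s.
Q_inv, R_inv = np.linalg.inv(Q), np.linalg.inv(R)
hessian = np.block([
    [Q_inv + H.T @ R_inv @ H, -Q_inv @ X],
    [-X.T @ Q_inv, X.T @ Q_inv @ X],
])
V_direct = np.linalg.inv(hessian)[:m, :m]

posterior_sd = np.sqrt(np.diag(V_s))  # grid-scale posterior standard deviations
```

Route 2 is shown only for comparison. It inverts Q and R explicitly, which is what the first route avoids.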

2.3.6. Parameter Optimization

[43] This section outlines the optimization of structural parameters that can be estimated in addition to s and β. Typical parameters to be estimated in this way are the parameters θ of the covariance matrix Q (for example, σ2 and l for an exponential covariance function) and the variance(s) of the model-data mismatch σR2. These parameters will be jointly termed Φ in the discussion that follows.

[44] The approach used to obtain the structural parameters is detailed by Kitanidis [1995]. In short, the parameters are estimated by maximizing the probability of the measurements, which is defined as

equation image

where the inside of the integral is as defined in equation (21) (where Q and R are functions of Φ). By integrating out all possible values of s and β in equation (31), the marginal probability density function of the observations z with respect to the parameters Φ is defined as

equation image

where Ψ is defined in equation (23) and is a function of Φ and

equation image

The objective is to find the values of Φ that maximize equation (32) or, alternately, minimize its negative logarithm:

equation image

The number of parameters Φ is relatively small, and a number of search algorithms can be implemented to find the minimum of equation (34) with respect to Φ. Common algorithms include the Gauss-Newton and Levenberg-Marquardt methods [Gill et al., 1986, pp. 134–137]. In cases where there are insufficient data to estimate these structural parameters well, the full pdf of the parameters (equation (32)) can be used in the solution of the inverse problem [Kitanidis, 1986], thereby explicitly taking into account the uncertainty in Φ. However, such an approach can be significantly more expensive computationally. Alternatively, some or all of the structural parameters in such cases can be estimated independently from other information.
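A minimal sketch of this optimization follows. It assumes the restricted-likelihood form commonly associated with Kitanidis [1995], namely ½ ln|Ψ| + ½ ln|X^TH^TΨ⁻¹HX| + ½ z^TΞz with Ξ = Ψ⁻¹ − Ψ⁻¹HX(X^TH^TΨ⁻¹HX)⁻¹X^TH^TΨ⁻¹, an exponential covariance σ² exp(−h/l), and R = σR²I. A coarse grid search stands in for the Gauss-Newton or Levenberg-Marquardt methods mentioned above. The function name, the grid, and the toy data are hypothetical.

```python
import itertools
import numpy as np

def neg_log_likelihood(phi, H, X, z, dist):
    """Assumed objective for phi = (sigma2, ell, sigma2_R)."""
    sigma2, ell, sigma2_R = phi
    Q = sigma2 * np.exp(-dist / ell)
    Psi = H @ Q @ H.T + sigma2_R * np.eye(len(z))
    HX = H @ X
    A = HX.T @ np.linalg.solve(Psi, HX)      # X^T H^T Psi^-1 H X
    Psi_inv_z = np.linalg.solve(Psi, z)
    b = HX.T @ Psi_inv_z
    quad = z @ Psi_inv_z - b @ np.linalg.solve(A, b)   # z^T Xi z
    _, logdet_Psi = np.linalg.slogdet(Psi)
    _, logdet_A = np.linalg.slogdet(A)
    return 0.5 * (logdet_Psi + logdet_A + quad)

# Hypothetical toy data: 8 unknowns on a line, 5 observations.
rng = np.random.default_rng(1)
m, n = 8, 5
pos = np.arange(m, dtype=float)
dist = np.abs(pos[:, None] - pos[None, :])
H = rng.uniform(0.1, 1.0, size=(n, m))
X = np.ones((m, 1))
z = rng.normal(size=n)

# Coarse grid search over (sigma2, ell, sigma2_R).
grid = list(itertools.product([0.5, 1.0, 2.0], [1.0, 2.0, 4.0], [0.01, 0.1]))
best_phi = min(grid, key=lambda phi: neg_log_likelihood(phi, H, X, z, dist))
```

Because ΞHX = 0 under these assumptions, the objective is unchanged when a multiple of HX is added to z. This reflects the fact that β has been integrated out.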

2.3.7. Conditional Realizations

[45] Using geostatistical methodology, it is also possible to generate realizations of the surface fluxes that are conditional on all the observations. The procedure for generating conditional realizations is discussed by Gutjahr et al. [1994] and Kitanidis [1995]. Conditional realizations are equally likely realizations that follow the spatial correlation structure dictated by Q and also reproduce the observations z to within the estimated or specified model-data mismatch. Conditional realizations represent individual possible flux histories, given the available data. The average of a large number of such realizations would reproduce the best estimate of the function (equation (25)), which is smoother than the individual realizations. Although the best estimate represents the maximum of the posterior pdf of the fluxes, it is the conditional realizations that represent the range of possible actual flux distributions.

[46] To obtain a conditional realization, an unconditional realization must first be generated that follows the correlation statistics specified in Q. Although there are a variety of ways to do this, one of the simplest (although not necessarily computationally most efficient) approaches is to decompose the covariance matrix by Cholesky decomposition to

$$Q = CC^T \qquad (35)$$

An unconditional realization following the correlation structure defined by Q is then generated according to

$$s_{ui} = X\beta + C u_i \qquad (36)$$

where the values of β are arbitrary and can be set to zero and ui is a vector of normally distributed random numbers with zero mean and unit variance. Finally, the conditional realization is defined as

$$s_{ci} = s_{ui} + \Lambda (z + \nu - H s_{ui}) \qquad (37)$$

where ν is a normally distributed random number with zero mean and variance σR2. In other words, ν is a random sample from the model-data mismatch error covariance R.

[47] The resulting conditional realizations are equally likely realizations of the surface fluxes from the posterior pdf presented in equation (21). If a large number of conditional realizations is generated, their mean and covariance will reproduce those derived in equations (25) and (30).
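The procedure can be sketched as follows, assuming the forms Q = CC^T for the Cholesky factor, Xβ + Cu for the unconditional realization, and s_u + Λ(z + ν − Hs_u) for the conditional one, with Λ obtained from the block system of section 2.3.4. Variable names and toy values are hypothetical.

```python
import numpy as np

def conditional_realization(Lam, H, C, X, beta, z, u, nu):
    """Condition one unconditional realization on the observations z."""
    s_u = X @ beta + C @ u                 # unconditional realization
    return s_u + Lam @ (z + nu - H @ s_u)  # conditional realization

# Hypothetical toy problem: m = 5 fluxes, n = 3 observations, p = 1 mean.
rng = np.random.default_rng(2)
m, n, p = 5, 3, 1
sigma2_R = 0.01
pos = np.arange(m, dtype=float)
Q = 2.0 * np.exp(-np.abs(pos[:, None] - pos[None, :]) / 1.5)
R = sigma2_R * np.eye(n)
X = np.ones((m, p))
H = rng.uniform(0.1, 1.0, size=(n, m))
z = rng.normal(size=n)

# Lambda from the assumed block system.
Psi, HX = H @ Q @ H.T + R, H @ X
lhs = np.block([[Psi, HX], [HX.T, np.zeros((p, p))]])
Lam = np.linalg.solve(lhs, np.vstack([H @ Q, X.T]))[:n].T

C = np.linalg.cholesky(Q)                      # lower triangular, Q = C C^T
u = rng.standard_normal(m)                     # zero mean, unit variance
nu = rng.normal(0.0, np.sqrt(sigma2_R), size=n)
s_c = conditional_realization(Lam, H, C, X, np.zeros(p), z, u, nu)
```

Setting u and ν to zero recovers Λz for any β, because ΛHX = X. This is one way to see why β can be chosen arbitrarily.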

3. Sample Application

[48] The following section presents a sample application of linear geostatistical inverse modeling to the estimation of surface fluxes of CO2 on a 3.75° latitude by 5.0° longitude grid, an even finer grid than those used in past grid-scale inversion studies. Because this application is meant primarily as an illustration of the features and capabilities of the presented methodology, the inversion is kept relatively simple, with a single year of monthly averaged pseudodata being used to estimate fluxes that are constant in time over that same year. The background concentration in the atmosphere prior to the start of the year of interest is considered perfectly known, and only the component of the observed CO2 mole fraction resulting from the flux from the current year is used in the inversion.

3.1. Available Tools and Generation of Pseudodata

3.1.1. Flux Data

[49] The flux data that were used to generate the pseudodata were selected to reflect a realistic set of fluxes for CO2. The estimates used for both the fossil fuel and oceanic components of the global fluxes were the same as those applied in the Atmospheric Tracer Transport Model Intercomparison Project 3 (TransCom3), an atmospheric carbon budget inversion intercomparison study [Gurney et al., 2002]. The fossil fuel emissions were based on Brenkert [1998] and Andres et al. [1996], who assume constant fossil fuel sources throughout the year. The net oceanic carbon exchange was taken from Takahashi et al. [2002]. Monthly fluxes were averaged to obtain a yearly flux equivalent. For the net ecosystem production (NEP) component of the land fluxes, the TransCom3 estimates were based on a neutral biosphere assumption that results in a zero average yearly flux on a grid cell by grid cell basis [Randerson et al., 1997]. Because we were interested in a yearly inversion, a flux field that averages to zero in every grid cell over the year would have left no land biospheric signal to recover. Therefore we instead used yearly averaged land biospheric fluxes from McGuire et al. [2001]. These fluxes represent the average NEP as generated using the Lund-Potsdam-Jena (LPJ) terrestrial biosphere model.

[50] All flux data were defined on a 3.75° latitude by 5.0° longitude grid, which yields a 48 × 72 surface grid with a total of 3456 points at which the surface fluxes are defined and will be estimated. The fluxes used to generate the pseudodata are presented in Figure 1.

Figure 1.

Surface flux distributions used in generating pseudodata, in units of μmol/(m2s). (a) Sum of yearly averaged land fluxes (fossil fuels and NEP). (b) Yearly averaged net ecosystem production (NEP). (c) Yearly averaged oceanic exchange. Note that the color scales have been set to agree with those presented in Figures 4–7.

3.1.2. Basis Functions

[51] As in other Bayesian inversions, the geostatistical approach requires the formulation of a Jacobian matrix H, relating the unknown flux field to the available observations. This sensitivity matrix is typically obtained by sequentially running the transport model with pulses located in each region or grid cell, for each time at which a flux is to be estimated, and observing the response at times and locations where measurements are available. Recently, adjoints have been developed for certain transport models, which allow for H to be inferred from adjoint simulations, with one run required for each observation, instead of each source region [Kaminski et al., 1999a]. The adjoint formulation results in computational savings when the total number of flux values to be estimated is greater than the number of available observations. This computational savings is therefore particularly significant for grid-scale inversions.

[52] For the application presented here, results from an adjoint implementation of Tracer Model 3 (TM3) were used to define H [Kaminski et al., 1999a]. Basis functions relating monthly averaged CO2 observations at a subset of the Climate Monitoring and Diagnostics Laboratory (CMDL) observation network sites to monthly averaged grid-scale fluxes were calculated by Rödenbeck et al. [2003] for 1982–2001. The 2001 subset of these same basis functions was used for the work presented here.

[53] We are using monthly averaged data to infer fluxes that, for the application presented here, have been defined as being constant throughout the year. We are interested in inferring these fluxes on the basis of the additional contribution of these fluxes over a single year to the atmospheric mass fraction of CO2. Therefore the sensitivity matrix integrates the effect of the fluxes for all months leading up to a given observation:

$$H_{i,j} = \sum_{k=1}^{l} H_{i,j,k} \qquad (38)$$

where Hi,j,k is the sensitivity of observation i to a flux in grid cell j that occurred in month k and l is the month in which the observation is taken. In other words, we are summing the influence of the constant fluxes up to the times when observations are taken.
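The summation can be written in a few lines. The sketch below assumes the monthly sensitivities are stored as a nested list indexed as observation, grid cell, month, and that the observation month l is given as a 1-based month number. The function name and the small integer inputs are purely illustrative.

```python
def integrate_sensitivities(H_monthly, obs_month):
    """Sum monthly sensitivities up to and including each observation's month.

    H_monthly[i][j][k] : sensitivity of observation i to a flux in grid cell j
                         during month k (k = 0 is the first month of the year)
    obs_month[i]       : 1-based month l in which observation i was taken
    """
    return [
        [sum(cell[: obs_month[i]]) for cell in obs_row]
        for i, obs_row in enumerate(H_monthly)
    ]

# Two observations, two grid cells, three months (hypothetical numbers).
H_monthly = [
    [[1, 2, 3], [4, 5, 6]],   # observation 0, taken in month 2
    [[1, 1, 1], [2, 2, 2]],   # observation 1, taken in month 3
]
H_annual = integrate_sensitivities(H_monthly, obs_month=[2, 3])
```

Months after the observation contribute nothing, since fluxes that have not yet occurred cannot influence the measurement.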

3.1.3. Observation Pseudodata

[54] In an effort to generate a set of pseudodata that is consistent with the amount of data typically used in inversion studies, the basis functions generated for 2001 by Rödenbeck et al. [2003] were used to generate pseudodata for months and CMDL sites where actual CO2 data are available. Therefore, although the observational data have been numerically generated, their spatial and temporal distribution represents a subset of the CMDL Cooperative Global Air Sampling Network's data collected for 2001. Overall, the data set consists of 433 monthly averaged datapoints, collected over 12 months at a total of 39 sites. Random error was added to the pseudodata to simulate the effect of measurement and transport errors (see section 3.2). Note that not every station has data at every month. A map illustrating the sites at which data were modeled, as well as the number of months for which these sites were sampled, is presented in Figure 2. Given the 433 observations and the 3456 grid points at which the flux distribution is to be estimated, the inversion is strongly underdetermined.

Figure 2.

Locations of pseudodata measurements. The numbers indicate the number of monthly averaged measurements available at each location. Note that the two locations listing 12 × 2 measurements are areas where two observation locations are too close to one another to be resolved on the plot. This occurs for (i) St. Davids Head, Bermuda (BME), and Tudor Hill, Bermuda (BMW), and (ii) Mauna Loa, Hawaii (MLO), and Cape Kumukahi, Hawaii (KUM).

3.1.4. Calculation of Separation Distance

[55] The separation distance h needs to be defined between all points at which the fluxes are to be estimated in order to construct the covariance matrix Q. Because we are working at a global scale, h was calculated using the great circle distance between two points on the surface of the Earth:

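$$h(x_i, x_j) = r \cos^{-1}\left[\sin\phi_i \sin\phi_j + \cos\phi_i \cos\phi_j \cos(\vartheta_i - \vartheta_j)\right] \qquad (39)$$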

where the coordinates xi = (ϕi, ϑi) are the latitude and longitude, respectively, of the grid points at which the fluxes are to be estimated, r is the radius of the Earth (6378 km), and latitude, longitude, and all angles are in radians.
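A minimal sketch of this distance calculation is given below, assuming the spherical law of cosines as the great circle formula. The function name is hypothetical, and the argument of the inverse cosine is clamped to [−1, 1] to guard against rounding error for coincident or antipodal points.

```python
import math

EARTH_RADIUS_KM = 6378.0  # value quoted in the text

def great_circle_distance(lat_i, lon_i, lat_j, lon_j, r=EARTH_RADIUS_KM):
    """Great circle distance in km; all angles are in radians."""
    cos_angle = (
        math.sin(lat_i) * math.sin(lat_j)
        + math.cos(lat_i) * math.cos(lat_j) * math.cos(lon_i - lon_j)
    )
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return r * math.acos(cos_angle)

# Pole to pole is half the circumference; a quarter turn along the equator
# is a quarter of the circumference.
pole_to_pole = great_circle_distance(math.pi / 2, 0.0, -math.pi / 2, 0.0)
quarter_equator = great_circle_distance(0.0, 0.0, 0.0, math.pi / 2)
```

Evaluating this function for every pair of grid cells gives the separation distance matrix needed to construct Q.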

3.2. Test Cases

[56] A total of four test cases was examined. We were interested not only in the general behavior of the methodology but also in the effect of certain parameters on the inversion results. Two different levels of model-data mismatch were used, with standard deviations of 0.50 ppm and 0.10 ppm (σ2 = 0.25 ppm2 and 0.01 ppm2). Note that these levels of model-data mismatch are not necessarily those that one would expect to use with real data. Most surface flux estimation studies have used higher model-data mismatch variances, and these variances were also variable between observation sites. Such effects will be incorporated in future work using atmospheric data, but for the purposes of this pseudodata example, we had the option of keeping the system relatively simple. Second, the effect of recognizing that land and oceans are known to have distinctly different source characteristics was examined by treating all fluxes as correlated in one case and treating land fluxes as uncorrelated to ocean fluxes in other cases. Finally, although the method is capable of inverting for the total of all fluxes (fossil fuel, net ecosystem production, and oceanic exchange), one inversion was carried out where the fossil fuel sources were considered known because such an assumption has been made in some past CO2 inversion studies [e.g., Peylin et al., 2002; Rödenbeck et al., 2003]. The four examined scenarios are outlined in Table 1. The model-data mismatch error was numerically added to the generated pseudodata by adding to each observation a normally distributed random number with the variance specified in Table 3 (see Introduced Error).
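The error-addition step can be sketched with the Python standard library. The variances per case follow the "Introduced Error" values in Table 3. The function name, the seed handling, and the example pseudodata values are hypothetical.

```python
import math
import random

# Introduced model-data mismatch variance (ppm^2) for each case, per Table 3.
CASE_VARIANCE = {"A": 0.01, "B": 0.01, "C": 0.25, "D": 0.01}

def add_mismatch_error(pseudodata, variance, seed=0):
    """Add zero-mean normal error with the given variance to each observation."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    return [value + rng.gauss(0.0, sd) for value in pseudodata]

# Hypothetical noise-free pseudodata (ppm contribution from current-year fluxes).
clean = [0.8, 1.1, 1.5, 2.0]
noisy_case_c = add_mismatch_error(clean, CASE_VARIANCE["C"], seed=42)
```

Fixing the seed makes a given set of pseudodata reproducible, which is convenient when several scenarios are compared.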

Table 1. Examined Inversion Scenarios
                      Case A     Case B     Case C     Case D
Model-data mismatch   lower      lower      higher     lower
Number of zones       1          2          2          2
Fossil fuel fluxes    estimated  estimated  estimated  known

3.3. Results

[57] Note that for all the cases, the inferred surface fluxes will be presented separately for land and oceans, although they were estimated simultaneously, using a single inversion. We do this because, when fossil fuel sources are considered, the magnitude of fluxes over land is much greater than that over oceans and using a single scale to present results for both domains would mask much of the variability in the oceans. Also, although conditional realizations of the inferred fluxes could be generated for all examined cases, they will be presented here only for case B.

3.3.1. Covariance Model Selection

[58] The solution method follows the algorithm described in section 2. On the basis of the setup presented in Table 1, the model of the mean was selected to reflect a single zone for case A, as described in equation (11), and two zones for cases B, C, and D, as in equation (12). For cases B, C, and D, each grid cell needed to be assigned to either the land or ocean zone. An index of the land fraction on the required grid was obtained from the TM3 input files [Heimann, 1996]. For grid cells that were neither fully land nor ocean, the sum of the fossil fuel and NEP fluxes would tend to dominate the signal whenever the land fraction in a grid cell was greater than approximately 10%. Therefore, for cases B and C, grid cells with a land fraction over 10% were pooled as land, yielding 1444 land grid cells and 2012 ocean grid cells. For case D, once the fossil fuel source was assumed known, the magnitude of the remaining land sources was similar to that of the ocean sources, and grid cells were assigned to the land or ocean zone on the basis of the surface type that constituted more than 50% of the grid cell. This division yielded a total of 1138 land grid cells and 2318 ocean grid cells.
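The zone assignment described above amounts to thresholding the land fraction. The sketch below also builds a two-column indicator matrix for the model of the mean, which is an assumed form for the two-zone case (one column per zone mean). Function names and the example land fractions are hypothetical.

```python
def assign_zones(land_fraction, threshold):
    """Label a cell 'land' when its land fraction exceeds the threshold."""
    return ["land" if f > threshold else "ocean" for f in land_fraction]

def mean_design_matrix(zones):
    """Assumed two-zone form of X: column 0 marks land, column 1 marks ocean."""
    return [[1.0, 0.0] if zone == "land" else [0.0, 1.0] for zone in zones]

# Hypothetical land fractions for five grid cells.
land_fraction = [0.0, 0.05, 0.2, 0.6, 1.0]
zones_bc = assign_zones(land_fraction, threshold=0.10)  # rule for cases B and C
zones_d = assign_zones(land_fraction, threshold=0.50)   # rule for case D
X_bc = mean_design_matrix(zones_bc)
```

A coastal cell with a land fraction of 0.2 is treated as land in cases B and C but as ocean in case D, which is why the land and ocean cell counts differ between the two setups.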

[59] The prior covariance matrix was based on an exponential correlation structure, defined as in equation (15). The structure of the covariance matrix for a single zone (case A) was therefore as defined in equation (16). For the two-zone setup (cases B, C, and D), no correlation was assumed between land and ocean grid cells. Therefore the structure of the covariance matrix was

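$$Q = \begin{bmatrix} \sigma_l^2 \exp(-h_l / l_l) & 0 \\ 0 & \sigma_o^2 \exp(-h_o / l_o) \end{bmatrix} \qquad (40)$$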

where each component is itself a matrix, the subscripts l and o represent the land and ocean zones, respectively, and hl and ho are matrices containing the separation distances between all points in the land and ocean zones, respectively. The exponential covariance function model and its corresponding variogram are illustrated in Figure 3. As can be seen from this figure, the exponential model implies that there is a sill in the overall variance of the process as the separation distance increases. Functions following an exponential covariance function have a variability that can be described by a correlation length and an asymptotic variance at large separation distances, and they do not have to have continuous derivatives. These characteristics are consistent with our understanding of the surface flux distribution of greenhouse gases. Note also that the covariance approaches zero for separation distances on the order of three integral scales l, indicating a correlation length of approximately 3l.
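A sketch of the two-zone covariance construction follows, assuming the exponential model σ² exp(−h/l) for each zone and zero cross-covariance between zones. The distances and parameter values are hypothetical and are not the inferred values of Table 2.

```python
import numpy as np

def exponential_covariance(h, sigma2, ell):
    """Exponential covariance model: sigma2 * exp(-h / ell)."""
    return sigma2 * np.exp(-np.asarray(h, dtype=float) / ell)

def two_zone_covariance(h_land, h_ocean, sigma2_l, ell_l, sigma2_o, ell_o):
    """Block-diagonal Q with no correlation between land and ocean cells."""
    Q_l = exponential_covariance(h_land, sigma2_l, ell_l)
    Q_o = exponential_covariance(h_ocean, sigma2_o, ell_o)
    zeros = np.zeros((Q_l.shape[0], Q_o.shape[0]))
    return np.block([[Q_l, zeros], [zeros.T, Q_o]])

# Hypothetical separation distances (km): 2 land cells, 3 ocean cells.
h_land = np.array([[0.0, 1000.0], [1000.0, 0.0]])
h_ocean = np.array([[0.0, 500.0, 900.0], [500.0, 0.0, 400.0], [900.0, 400.0, 0.0]])
Q = two_zone_covariance(h_land, h_ocean, sigma2_l=4.0, ell_l=800.0,
                        sigma2_o=0.2, ell_o=3000.0)

# At a separation of three integral scales the covariance has dropped to
# exp(-3), about 5% of the variance.
ratio_at_3l = exponential_covariance(3 * 800.0, 4.0, 800.0) / 4.0
```

The grid cells must be ordered with all land cells first for the block layout to match the rows of H and X.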

Figure 3.

Normalized exponential covariance and variogram functions.

3.3.2. Optimization of Structural Parameters

[60] The variance of the model-data mismatch and the parameters needed by the covariance function were considered unknown and were optimized using the method described in section 2.3.6. The covariance parameters estimated for the various cases are presented in Table 2 (see Inferred Parameters). Note that, as will be discussed in section 4, the land parameters were not inferred for case D. This table also presents the same statistics, but for the actual fluxes used in generating the pseudodata (see Actual Parameters). The spatial correlation structure of the actual fluxes was determined using a method analogous to that presented in section 2.3.6 [Kitanidis and Shen, 1996]. Note that for the actual fluxes, land and ocean structural parameters are identical for cases B and C because the same fluxes were used to generate the pseudodata in both cases and the same zone definition was used to separate land from ocean. Also, the ocean flux characteristics in case D are almost identical to those in cases B and C for similar reasons, the only differences arising from the different definition of land versus ocean grid cells.

[61] The model-data mismatch variance estimated for the various cases is presented in Table 3 (see Inferred Error), along with the actual variance of the error added to the observation pseudodata in each case (see Introduced Error). In addition, the mean square error between the available measurements (which themselves contain the introduced measurement error) and the observations that would result from individual conditional realizations of the surface fluxes obtained from the inversion is also included (see Final Mismatch). Note that the information presented in the first two columns of Table 2 and the “Introduced Error” column of Table 3 would not be available if real data were used.

Table 2. Structural Parameters Calculated From Real Fluxes and Inferred From Data
CaseZoneActual ParametersInferred Parameters
σ2, image × 10−14l, km × 103σ2, image × 10−14l, km × 103
Case Aland + ocean2.
Case Bland4.
Case Cland4.
Case Dland0.0222.4
Table 3. Actual Error Variance Added to the Observation Pseudodata, Model-Data Mismatch Error Inferred From the Data, and Final Model-Data Mismatch Resulting From the Flux Best Estimate
         Model-Data Mismatch Variance σR2, ppm2
         Introduced Error   Inferred Error   Final Mismatch
Case A   1.0 × 10−2         0.98 × 10−2      1.0 × 10−2
Case B   1.0 × 10−2         0.98 × 10−2      0.99 × 10−2
Case C   25 × 10−2          23 × 10−2        23 × 10−2
Case D   1.0 × 10−2         0.91 × 10−2      0.92 × 10−2

3.3.3. Solution of Inverse Problem

[62] The inferred parameters in Tables 2 and 3 were used in the solution of the inverse problem using the methodology described in sections

[63] The recovered surface flux distributions are presented in Figures 4–7 for cases A to D, respectively. Land and ocean fluxes were obtained using a single inversion but are presented separately for visualization purposes, as described earlier. These figures also illustrate the posterior standard deviation associated with the best estimates of the surface fluxes. The actual surface fluxes were presented in Figure 1. Note that an effort was made to use consistent color scales in Figures 1, 4, 5, 6, and 7 wherever possible. The color scales for the ocean flux uncertainty for case A (Figure 4d) and the land fluxes for case D (Figures 7a and 7b), however, are different from their counterparts for the other cases.

Figure 4.

Recovered flux estimate for case A, in units of μmol/(m2s). (a) Best estimate of land fluxes (fossil fuels and NEP). (b) Posterior standard deviation of land fluxes (fossil fuels and NEP). (c) Best estimate of oceanic exchange. (d) Posterior standard deviation of oceanic exchange.

Figure 5.

Recovered flux estimate for case B, in units of μmol/(m2s). (a) Best estimate of land fluxes (fossil fuels and NEP). (b) Posterior standard deviation of land fluxes (fossil fuels and NEP). (c) Best estimate of oceanic exchange. (d) Posterior standard deviation of oceanic exchange.

Figure 6.

Recovered flux estimate for case C, in units of μmol/(m2s). (a) Best estimate of land fluxes (fossil fuels and NEP). (b) Posterior standard deviation of land fluxes (fossil fuels and NEP). (c) Best estimate of oceanic exchange. (d) Posterior standard deviation of oceanic exchange.

Figure 7.

Recovered flux estimate for case D, in units of μmol/(m2s). (a) Best estimate of land fluxes (NEP only). (b) Posterior standard deviation of land fluxes (NEP only). (c) Best estimate of oceanic exchange. (d) Posterior standard deviation of oceanic exchange.

[64] These results were also aggregated into 22 regions corresponding to those used in the TransCom3 study [Gurney et al., 2002], taking into account the area of individual grid cells to yield a total mass flux per unit time. These regions are presented in Figure 8. The uncertainty on this regional scale was determined by summing the entries (both on and off the diagonal) in the area-weighted posterior covariance matrix corresponding to grid cells belonging to each region. In this manner, the effect of inferred correlations or anticorrelations among neighboring grid cells is taken into account in the uncertainty estimate at the regional scale. The results of this analysis are presented in Figure 9, along with the actual fluxes aggregated to the same grid.
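The aggregation described here is a weighted sum for the flux and a quadratic form for its variance. The sketch below assumes the posterior covariance is available as a nested list and that each region is described by a boolean membership flag per grid cell. The function name and all numbers are hypothetical.

```python
import math

def aggregate_region(s_hat, V, area, in_region):
    """Return the regional total flux and its posterior standard deviation.

    The variance sums every on- and off-diagonal entry of the area-weighted
    posterior covariance for the cells belonging to the region.
    """
    w = [a if flag else 0.0 for a, flag in zip(area, in_region)]
    total = sum(wi * si for wi, si in zip(w, s_hat))
    variance = sum(
        w[i] * V[i][j] * w[j] for i in range(len(w)) for j in range(len(w))
    )
    return total, math.sqrt(variance)

# Three grid cells, the first two of which form one region (toy values).
s_hat = [1.0, 2.0, 3.0]
V = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 4.0]]
area = [2.0, 1.0, 1.0]
total, sd = aggregate_region(s_hat, V, area, in_region=[True, True, False])
```

In this toy case the positive covariance between the two cells raises the regional variance from 5 (diagonal terms only) to 7. An anticorrelation would lower it instead.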

Figure 8.

Definition of 22 TransCom3 regions on a 3.75° latitude by 5.0° longitude scale.

Figure 9.

Recovered flux estimates aggregated to 22 TransCom3 regions. Circles and error bars represent aggregated posterior best estimates and standard deviations. Stars represent aggregated fluxes used in generating the observation pseudodata (see Figure 1).

3.3.4. Conditional Realizations

[65] As discussed in section 2.3.7, the geostatistical approach allows for the generation of realizations of the unknown function that are conditional on all the observations. This can help in the visualization process because the conditional realizations represent individual possible scenarios of the flux distribution. The best estimates presented in Figures 4–7, on the other hand, represent an average of all possible scenarios and only include the features that tend to be common to all these possible scenarios. As a result, the best estimates are significantly smoother than the individual conditional realizations, and it is the realizations, not the best estimates, that have statistical properties that are consistent with those derived in Tables 2 and 3 (see Inferred Parameters). In addition, conditional realizations give a visual indication of the structure of the off-diagonal terms in the posterior covariance matrix Vŝ. For example, correlations and anticorrelations between cells are visible in the conditional realizations, as certain cells or regions are seen to vary jointly. Figure 10 presents three such realizations for case B.

Figure 10.

Three conditional realizations of recovered fluxes for case B. Fluxes are in units of μmol/(m2s).

4. Discussion

[66] As can be seen from Figures 4–7, the geostatistical approach is effective at identifying mesoscale variability in the surface fluxes in all cases. Variability is clearly visible at scales much smaller than those specified by large regions such as those presented in Figure 8. In addition, the recovered fluxes are in good agreement with the fluxes used in generating the pseudodata (Figure 1). We do not, in fact, expect to be able to recover the distribution perfectly, because of the information loss that inevitably results when sharp gradients are attenuated by the diffusive nature of atmospheric transport, the introduced model and measurement error, and the small number of measurements relative to unknowns. In this hypothetical case, we could easily have come as close to recovering the exact surface flux distribution as we would have wanted to, by increasing the number of measurement locations and decreasing the variance of the error vector ν added to the measurements. This was not the goal of the exercise, however. Instead, we are interested in verifying whether the method can reasonably recover surface fluxes given realistic data availability and quality. We recognize, of course, that although we add random error to the pseudodata, we are dealing with a case where the transport model has no consistent bias, whereas model bias is an additional complicating factor in real-data applications.

[67] In addition to recovering the surface fluxes, the method is effective at inferring the statistical parameters of the surface fluxes and at estimating the model-data mismatch variance. The integral scales and variances inferred from the observations are similar to those calculated from the actual surface fluxes, which would be unavailable in a real-data application (see Table 2). Because we are trying to infer the statistical properties of the fluxes from information available from the measurements, we cannot expect to recover these statistical parameters exactly. The method can also discern, using the available measurements, the fact that fluxes from land and ocean regions exhibit very different variances and integral scales from one another (see cases B and C). Note that this is a separate issue from the variance associated with individual measurement locations. As is clearly visible in Figure 1 as well as in the actual parameters listed in Table 2, the ocean fluxes exhibit a lower variance and longer correlation length relative to land regions. The method is able to discern these differences from the available data (Table 2, Inferred Parameters). The method is also set up to estimate the model-data mismatch variance, which, in this pseudodata example, is known to be the variance of the error added to the pseudodata observations. As can be seen from Table 3, the model-data mismatch is accurately inferred from the data. In addition, the model-data mismatch resulting from the transport of the conditional realizations of the surface flux distribution (see Final Mismatch) is consistent with the inferred model-data mismatch (and therefore the actual introduced error variance).

[68] In the presented application, we have chosen to solve for fluxes over the entire globe, including regions such as Antarctica and Greenland, which are known to have negligible CO2 fluxes. Solving for these regions serves as a good check to verify that the method and data can identify the fact that fluxes from these regions are near zero, and this is in fact the case when looking at Figures 4–7. Clearly, one could instead enforce the fact that these regions exhibit no CO2 fluxes by not solving for these regions, as was done, for example, in the TransCom3 study [Gurney et al., 2002] (see also TransCom3 regions in Figure 8).

4.1. Case A (Figure 4)

[69] In this simplest case, the Earth is considered to constitute a single zone, with all grid cells tending to a common single mean value, and with a single set of statistical parameters describing the deviations of the surface fluxes from this mean. In reality, it is undeniable that oceanic fluxes tend to display different statistical properties relative to land fluxes (see Figures 1a and 1c), but it is interesting that case A still captures many of the features of the actual fluxes. For example, fossil fuel sources in eastern North America, Europe, and eastern Asia are clearly identified. These flux patterns are statistically significant even at the grid scale, as can be seen by comparing the flux intensities (Figure 4a) with the posterior standard deviations (Figure 4b). Regions of CO2 uptake and release are also recovered in the oceans. The single set of statistical parameters recovered for this case tends toward the higher variance and shorter integral scale characteristic of land sources (see Table 2). Therefore the recovered oceanic fluxes tend to exhibit more variance and less correlation relative to the actual oceanic fluxes (Figures 4c and 4d). Also, because this setup assumes that all fluxes are correlated, flux patterns cross over land/ocean interfaces. This can be seen, for example, in the ocean regions adjacent to eastern North America and Southeast Asia (Figures 4a and 4c). When fluxes are aggregated over the TransCom3 regions (Figure 9), even this simple setup does a reasonable job of recovering the total regional fluxes. Furthermore, for this case as well as all the others, the posterior standard deviations (Figures 4b and 4d) are higher in the Southern Hemisphere. This is consistent with past studies [e.g., Gurney et al., 2002] and is indicative of the sparsity of the observation network in the Southern Hemisphere.

4.2. Case B (Figures 5 and 10)

[70] Case B differs from case A in that we recognize that land and ocean regions will have surface fluxes with different statistical characteristics. The magnitude of the model-data mismatch error added to the generated measurements is unchanged, with σ = 0.10 ppm (σ2 = 0.01 ppm2). The first interesting result is that the method is able to recognize, on the basis of only the sparse atmospheric measurements, that land and ocean regions display significantly different statistical properties (see Table 2, Inferred Parameters). The estimated variance for the land regions is approximately twentyfold larger than that in the oceans, which is similar to the actual ratio between these variances (see Table 2, Actual Parameters). Similarly, a greater integral scale for the ocean fluxes is also inferred from the measurements. The best estimate for land fluxes (Figure 5a) is similar to that found in case A, with somewhat more detail made possible by the slightly larger variance obtained in the parameter estimation stage. The key flux patterns are again statistically significant even at the grid scale, as can be seen by looking at the posterior standard deviations in Figure 5b. In addition, looking at the conditional realizations in Figure 10 confirms that features such as the large sources in eastern North America and western Europe are common among the realizations and therefore are essential features of the flux pattern. The oceanic fluxes look very different from those in case A. No longer bound by statistical parameters that are more representative of land regions, the best estimate of the ocean fluxes now has magnitudes and patterns very similar to the actual fluxes. Also, the land variance estimate in this case is higher than the single variance estimated in case A, resulting in a higher posterior standard deviation for land (Figure 5b). The opposite is true for oceans (Figure 5d). 
The release in the tropical South Atlantic and the drawdown farther south are not recovered to the same extent as they are present in the actual fluxes, but this is due to the large uncertainty in that region (see Figures 5c and 5d). In fact, in observing the conditional realizations (Figure 10), the release is present in certain realizations. This indicates that the available data not only do not point to a necessary release in that region but also do not rule out such a release, indicating that the region is not well constrained by the available observations. The conditional realizations in Figure 10 also reveal some other interesting patterns of uncertainty and correlation. For example, the land fluxes in the Northern Hemisphere are fairly consistent throughout the realizations, but fluxes in South America and southern Asia vary more significantly. The higher uncertainty in southern oceanic regions also results in larger variability between conditional realizations, which is consistent with the results presented in Figure 5d.

4.3. Case C (Figure 6)

[71] Case C is similar to case B, but the model-data mismatch artificially added to the generated measurements has a variance that is 25 times larger, with σ = 0.50 ppm (σ² = 0.25 ppm²). Overall, the inversion recognizes that there is less information in the data used in case C relative to case B. This can be seen most clearly in the uncertainty bounds in Figure 9, which are wider in case C than in case B. Also, the parameter optimization step was able to recognize the higher model-data mismatch error present in this scenario (Table 3). The parameter optimization routine aims to estimate the covariance parameters representing the underlying fluxes. However, as the amount of information decreases (e.g., when the model-data mismatch increases), small-scale features of the flux distribution can no longer be reliably identified, and the parameters that are estimated tend to correspond to those of larger-scale features that can still be resolved (see Table 2). This is not a problem to the extent that these parameters are still reasonable representations of those of the underlying flux patterns, as is the case here (see Table 2). The best estimates for both the land and ocean regions also reflect the increased uncertainty of this scenario, with the estimates being generally smoother relative to case B, and some of the smaller-scale features, such as the strong land flux in the western United States, being less well resolved.
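The loss of information caused by a larger model-data mismatch can be seen in the simplest possible linear Gaussian update. The sketch below is an illustration only: it assumes a single flux with a known prior mean, a single observation, and hypothetical values for the prior variance and the transport sensitivity. Only the two mismatch standard deviations (0.10 and 0.50 ppm) are taken from cases B and C.

```python
def posterior_variance(prior_var, sensitivity, mismatch_var):
    """Posterior variance of one flux s observed once as z = h*s + v,
    where v has variance mismatch_var (scalar linear Gaussian update)."""
    return prior_var * mismatch_var / (sensitivity ** 2 * prior_var + mismatch_var)


PRIOR_VAR = 1.0     # hypothetical prior flux variance
SENSITIVITY = 1.0   # hypothetical sensitivity h of the observation to the flux

var_b = posterior_variance(PRIOR_VAR, SENSITIVITY, 0.10 ** 2)  # case B mismatch
var_c = posterior_variance(PRIOR_VAR, SENSITIVITY, 0.50 ** 2)  # case C mismatch
mismatch_ratio = 0.50 ** 2 / 0.10 ** 2                         # variance ratio

print(mismatch_ratio, var_b, var_c)
```

The mismatch variance ratio is 25, as stated above, and the posterior variance grows accordingly. In the full inversion the effect is spread over many correlated fluxes and observations, which is why it appears as smoother best estimates and wider uncertainty bounds rather than as a single number.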

4.4. Case D (Figure 7)

[72] Case D, finally, assumes that fossil fuel sources are relatively well constrained from economic statistics, so only the NEP and oceanic exchange are estimated. For this case, as can be seen in Figure 1b, the only significant yearly averaged net land fluxes are in central South America and in southeastern Africa. As can already be seen from the posterior standard deviations for cases A, B, and C (Figures 4b–6b), these regions are very poorly constrained by the observation network. The result is that for case D, the optimization routine failed to converge when the statistical parameters of the model-data mismatch and the variance and integral scale of the land and ocean regions were all to be identified. In effect, this means that the measurements do not contain information about the statistical structure of the land fluxes on a yearly basis. To circumvent this problem, the land region was assigned statistical parameters (variance and integral scale) equal to those of the actual fluxes (see Table 2, case D, Land, Actual Parameters). The optimization routine was then used to identify the statistical characteristics of the ocean fluxes and the model-data mismatch based on the atmospheric measurements. In a case involving real data, the exact variance and integral scale would not have been known for the land regions but could likely have been estimated from outside information. Again as a result of the sparse sampling in the Southern Hemisphere, the best estimate of the land fluxes is much more uniform than the actual fluxes for this case. Generally, higher sources are observed in South America and South Africa, and the biggest sinks are found in Siberia, consistent with the actual fluxes. Averaged over the TransCom3 regions, in fact, this case recovers regional land flux averages well. In the oceans, case D performs very well overall.
With a lower portion of the unknown signal being attributed to land fluxes, the inversion is able to pinpoint oceanic fluxes extremely well. The patterns observed in the recovered oceanic fluxes (Figure 7c) are remarkably similar to those in the actual fluxes (Figure 1c).
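Zone-specific parameters such as the land and ocean variances and integral scales enter the inversion through the prior covariance of the fluxes. The sketch below shows one way such a covariance could be assembled. It assumes an exponential covariance model within each zone and no correlation between cells in different zones; both are assumptions made for illustration, as are the function name, the one-dimensional cell layout, and the numerical values (chosen only to echo a roughly twentyfold variance ratio and a larger oceanic integral scale).

```python
import math


def zone_covariance(distances, zones, params):
    """Build a prior covariance matrix with one exponential model per zone.

    distances[i][j]: separation between cells i and j
    zones[i]:        zone label of cell i
    params[zone]:    (variance, integral_scale) for that zone
    Cells in different zones are assumed to be uncorrelated.
    """
    n = len(zones)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if zones[i] == zones[j]:
                variance, scale = params[zones[i]]
                cov[i][j] = variance * math.exp(-distances[i][j] / scale)
    return cov


# Hypothetical layout: four cells on a line, 500 km apart.
positions = [0.0, 500.0, 1000.0, 1500.0]
distances = [[abs(a - b) for b in positions] for a in positions]
zones = ["land", "land", "ocean", "ocean"]

# Hypothetical parameters: (variance, integral scale in km).
params = {"land": (20.0, 1000.0), "ocean": (1.0, 3000.0)}

cov = zone_covariance(distances, zones, params)
for row in cov:
    print([round(value, 3) for value in row])
```

Fixing the land parameters while estimating the oceanic ones, as was done for case D, would correspond in this sketch to holding `params["land"]` constant and adjusting only `params["ocean"]` during parameter optimization.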

[73] Case D actually represents a particularly difficult set of fluxes to recover. The NEP component of the land surface fluxes is larger in any given month than its annual average. Therefore, when monthly inversions are performed, the signal is stronger than that which was used to recover the land surface fluxes in case D. This observation raises questions about the precision with which annual NEP fluxes can be inferred from monthly inversions, however, because the underlying annual signal can be very small, as was the case in this pseudodata application.

5. Conclusions

[74] This work presents the first application of a geostatistically based inverse modeling method to recovering the surface fluxes of atmospheric constituents. Although geostatistical methods were developed in the context of subsurface applications, they are in fact applicable to many problems where spatial or temporal correlation is expected in the function to be estimated. The nature of geostatistical methods makes them particularly applicable to grid-scale inversions, which have been difficult to constrain by traditional Bayesian methods unless these methods took into account spatial correlation [Rödenbeck et al., 2003]. The geostatistical approach to inverse modeling avoids certain problems associated with the application of traditional Bayesian approaches at both regional and grid scales. Because the geostatistical approach does not rely on a prior estimate of fluxes, the method allows each component of the inversion to be data-driven. On the basis of the obtained results, it appears that even a subset of the current CMDL network may be sufficient to constrain flux distributions at a scale much smaller than that allowed in typical Bayesian inversions (where the Earth is subdivided into a small number of regions) as long as spatial correlation is taken into account and the transport model errors are not biased and not overwhelmingly large.

[75] The method was applied to the recovery of surface fluxes from CO2 pseudodata. Three inversions involved the estimation of fossil fuel sources along with other land and oceanic fluxes, while one inversion considered the fossil fuel sources as known. The effect of model-data mismatch error and the use of a single zone versus separate land and ocean zones were also investigated. The method performed well in all cases, yielding best estimates consistent with the fluxes used to generate the pseudodata and confidence intervals that represented the precision of the best estimates well. In the case where the fossil fuel sources were assumed known, the remaining land fluxes (representing annually averaged net ecosystem production) were very small and not well constrained by the selected observation network. In that case, the method was not able to estimate the correlation characteristics of the flux distribution. When the correlation parameters over land were specified, however, the method performed well.

[76] The main conclusion that can be drawn from this study is that geostatistical inverse modeling methods show great promise in their application to grid-scale atmospheric inversions. The current study has focused on a pseudodata application, in an effort to isolate certain characteristics of the methodology and investigate the effect of various parameters in a setup where the surface fluxes are known. Future work will apply the presented methodology to the estimation of surface fluxes of various gases using available observations. Because geostatistical methods do not use a prior estimate of fluxes in the inversion, their application will shed light on the extent to which previous inversion studies have been affected by the choice of prior flux patterns.

[77] There are several possible extensions to the methods presented here that would make them applicable to a wider range of atmospheric problems. The presented methods can be extended to include correlations in time in addition to space, which should prove to be of particular interest in inversions on smaller timescales, such as the estimation of monthly or weekly fluxes. Also, more complex models of the mean could be applied to incorporate correlations between flux intensities and other parameters such as vegetative cover, population, seasonality, etc. In addition, these methods could potentially be merged with traditional Bayesian inverse modeling methods, allowing for the use of both a geostatistical prior and a prior specifying a first estimate of surface fluxes.

[78] Finally, as the number of observations used in inversions increases and the spatial and temporal scale at which we want to estimate fluxes continues to decrease, the numerical costs of the direct geostatistical approach will grow in the same way as those of classical Bayesian inverse modeling. As such, the geostatistical approach may eventually need to be combined with specialized numerical minimization methods that are well equipped to deal with such large systems.


[79] Funding for Anna M. Michalak was provided by a NOAA Climate and Global Change postdoctoral fellowship, a program administered by the University Corporation for Atmospheric Research (UCAR). The authors are grateful to Christian Rödenbeck for providing us with the base functions used in the analysis and to Adam Hirsch, John B. Miller, Wouter Peters, and David Baker for their input and comments on this work.