Probabilistic inversion of a terrestrial ecosystem model: Analysis of uncertainty in parameter estimation and model prediction


Abstract

[1] Bayesian probabilistic inversion and a Markov chain Monte Carlo (MCMC) technique were applied to a terrestrial ecosystem model to analyze uncertainties of estimated carbon (C) transfer coefficients and simulated C pool sizes. This study used six data sets of soil respiration, woody biomass, foliage biomass, litterfall, C content in the litter layers, and C content in mineral soil measured in both ambient CO2 (350 ppm) and elevated CO2 (550 ppm) plots from 1996 to 2000 at the Duke Forest Free-Air CO2 Experiment (FACE) site. A Metropolis-Hastings algorithm was employed to construct a posterior probability density function (PPDF) of C transfer coefficients on the basis of prior information on model parameters, model structure, and the six data sets. The constructed PPDFs indicated that the transfer coefficients from pools of nonwoody biomass, woody biomass, and structural litter were well constrained by the six data sets under both ambient and elevated CO2. The data sets also provided moderate information on the transfer coefficient from the slow soil C pool. However, the transfer coefficients from pools of metabolic litter, microbes, and passive soil C were poorly constrained. The poor constraint was attributable to either the lack of experimental data or the mismatch of timescales between the available data and the parameters to be estimated. Cumulative distribution functions were constructed for simulated C pool sizes on the basis of the six data sets, showing that on average the ecosystem would store 16,616 g C m−2 at elevated CO2 by the year 2010, significantly higher than 13,426 g C m−2 at ambient CO2 with 95% confidence. This study shows that the combination of a Bayesian approach and an MCMC inversion technique is an effective method to synthesize information from various sources for assessment of ecosystem responses to elevated CO2.

1. Introduction

[2] The prevention of dangerous anthropogenic interference with the climate system requires quantification of carbon (C) sinks in land and ocean. The latest Intergovernmental Panel on Climate Change (IPCC) report states that the terrestrial C sink will continue to sequester 5–10 Gt (×10^15 g) C per year by the end of the 21st century [Houghton et al., 2001]. This range is estimated mostly with terrestrial biosphere models, a major tool developed in the past decades to describe terrestrial C cycles [e.g., Parton et al., 1987; Luo and Reynolds, 1999; Cramer et al., 2001; McGuire et al., 2001]. Although these models are extensively used to predict C sequestration in terrestrial ecosystems, uncertainty associated with model parameters and predictions has not been carefully analyzed. If the uncertainty issue is not adequately addressed, C sink potentials cannot be fully understood. Some of the C sinks may be underestimated while others may be overestimated, even to the extent that contradictory results may appear. In such a situation, policies based on current understanding to stabilize CO2 concentrations will fall short of meeting targets of environmental mitigation.

[3] Having realized the importance of uncertainty analysis for policy making, the global change research community has recently directed considerable attention to studying the stochasticity and uncertainty in ecosystem processes and the effects of various sources of randomness on prediction of ecosystem changes [Clark, 2005; Murphy et al., 2004; Dose and Menzel, 2004; Forest et al., 2002; Wang et al., 2001]. Expert-specified probability density functions (PDFs) [e.g., Murphy et al., 2004] have been used to quantify key uncertain properties of climate change simulations. The Bayesian paradigm has been introduced to combine a priori PDFs with measurements to generate a posteriori PDFs for parameters of ecosystem models [Braswell et al., 2005; Knorr and Kattge, 2005]. With a probabilistic approach, Mastrandrea and Schneider [2004] presented a cumulative distribution function (CDF) to assess dangerous anthropogenic interference and showed its utility by applying it to analysis of uncertainty in model predictions of future changes. On a global scale, the Bayesian approach has been applied to constrain parameters in biosphere models against atmospheric CO2 concentration data and to assess the biosphere C fluxes and uncertainties [Kaminski et al., 2002; Rayner et al., 2005].

[4] This study was designed to assess uncertainty in parameter estimation and model prediction with a terrestrial ecosystem (TECOS) model. The model was applied to the Duke Forest ecosystem, where a Free-Air CO2 Experiment (FACE) has been in progress since August 1996, with a deterministic inversion for parameter estimation in a previous study [Luo et al., 2003]. In this study, we conducted probabilistic inversion within a Bayesian framework by using the same data sets and the same model to facilitate a methodological comparison with the deterministic inversion. Within the Bayesian framework, the measurements were treated as random variables with certain probability distributions. A joint probability density function (PDF) was constructed for model parameters to analyze information content of observed data sets. Samples were taken from the joint PDF using a Markov chain Monte Carlo (MCMC) technique, which is appropriate for sampling high-dimensional PDFs of model parameters and widely used in inverse problems in engineering and geosciences [e.g., Andersen et al., 2003; Dosso and Wilmut, 2002; Oh and Kwon, 2001; Geman and Geman, 1984]. The samples were used to construct marginal distributions for model parameters, to calculate parameter correlations, and to make CDFs for simulated pool sizes in forward modeling.

2. Methods

2.1. Carbon Cycling Model and Data Sources

[5] The model that we used in the study is a terrestrial ecosystem (TECOS) model that is a variation of the CENTURY model developed by Parton et al. [1987, 1988]. The TECOS model has a seven-pool compartmental structure (Figure 1) and has been applied to the Duke Forest FACE to study C sequestration process [Luo et al., 2003]. In the model, C enters the ecosystem via canopy photosynthesis and is partitioned into nonwoody and woody biomass. Dead plant material goes to metabolic and structural compartments and is decomposed by microbes. Part of the litter C is respired and the rest is converted to soil organic matter (SOM) in slow and passive soil C pools. Carbon transfer coefficients are rate variables that quantify amounts of C per unit mass leaving each of the pools per day (Table 1). The inverses of the transfer coefficients are the mean C residence times, which are the key parameters measuring the C sequestration capacity of the ecosystem when combined with primary production [Barrett, 2002; Luo et al., 2003]. Mathematically, the model is given by the following first-order ordinary differential equation:

$$\frac{dX(t)}{dt} = \xi(t)\,A\,C\,X(t) + B\,U(t), \qquad X(0) = X_0 \qquad (1)$$

where X(t) = (X1(t), X2(t), …, X7(t))^T is a 7 × 1 vector describing C pool sizes, A and C are 7 × 7 matrices given by

equation image

where diag(c) denotes a 7 × 7 diagonal matrix with diagonal entries given by vector c = (c1, c2, …, c7)^T, components cj (j = 1, 2, …, 7) represent C transfer coefficients associated with pool Xj (j = 1, 2, …, 7) (Table 1), B = (0.25 0.30 0 0 0 0 0)^T is a vector that partitions the photosynthetically fixed C to nonwoody biomass and woody biomass, ξ(.) is a scaling function accounting for temperature and moisture effects on C decomposition, U(.) is system input of photosynthetically fixed C given by a canopy photosynthetic model, and X0 is an initial condition.

Figure 1.

Diagram of the carbon process of the Duke Forest FACE site on which model equation (1) is based. The model has seven pools, and therefore the matrix A in section 2.1 is 7 × 7, vector B is 7 × 1, and vector X is 7 × 1. There are seven transfer coefficients, c1, c2, … c7, connecting the seven pools (Table 1). SOM stands for soil organic matter.

Table 1. Description of Carbon Transfer Coefficients Among Carbon Pools Shown in Figure 1

| Carbon Transfer Coefficients, g C g−1 d−1 | Description |
|---|---|
| c1 | From pool “foliage biomass” (X1) to pools “metabolic litter” (X3) and “structure litter” (X4) |
| c2 | From pool “woody biomass” (X2) to pool “structure litter” (X4) |
| c3 | From pool “metabolic litter” (X3) to pool “microbes” (X5) |
| c4 | From pool “structure litter” (X4) to pools “microbes” (X5) and “slow SOM” (X6) |
| c5 | From pool “microbes” (X5) to pools “slow SOM” (X6) and “passive SOM” (X7) |
| c6 | From pool “slow SOM” (X6) to pools “microbes” (X5) and “passive SOM” (X7) |
| c7 | From pool “passive SOM” (X7) to pool “microbes” (X5) |

[6] This study used six data sets: foliage biomass growth, woody biomass growth, litterfall, C content in the litter layers, C content in mineral soil, and soil respiration, collected from 1996 to 2000 at the Duke Forest, North Carolina, USA, where FACE has been in progress since 1996 [Luo et al., 2003]. Measurement methods for the six data sets were described by DeLucia et al. [2002], Finzi et al. [2001], Schlesinger and Lichter [2001], and Andrews and Schlesinger [2001]. The experiment was set in a 15-year-old loblolly pine plantation with six plots, each 30 m in diameter. The CO2 concentration in the three treatment plots has been maintained at 200 ppm above ambient, and the three control plots have been fumigated with ambient air [Hendrey et al., 1999]. The initial pool size X0 = (469 4100 64 694 123 1385 923)^T was based on experimental data at the start of the FACE experiment (year 1996). The photosynthetically fixed C inputs U(.) at both ambient CO2 and elevated CO2 were estimated with the mechanistic canopy model MAESTRA [Luo et al., 2001] for the period 1996–2000 (Figure 2). The cumulative C inputs simulated by the MAESTRA model from 1996 to 2000 are 6535 g C m−2 and 8823 g C m−2 for ambient and elevated CO2, respectively, a cumulative difference of about 2288 g C m−2 over the five-year period.

Figure 2.

Simulated canopy photosynthesis (carbon input to the system) using MAESTRA model under ambient and elevated CO2 from year 1996 to year 2000.

[7] In the study, equation (1) was numerically solved with a finite difference method to give C pool sizes at each time t. In line with the time steps in ξ(.) and U(.), the time step dt was set to one day. The observation mapping operator Φ = (φ1^T, φ2^T, …, φ6^T)^T maps the modeled pool sizes at time t to observations by ΦX(t), and

equation image

Each φi, i = 1, 2, …6 maps simulated values in the state space to one of the field observations as φ1X(t) for woody biomass, φ2X(t) for foliage biomass, φ3X(t) for litterfall, φ4X(t) for C in forest floor, φ5X(t) for C in forest mineral soil, and φ6X(t) for soil respiration. For example, φ5 directly maps the modeled total C content in the three soil pools to the observations.
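The daily-step numerical solution of equation (1) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a forward-Euler scheme (the text says only "a finite difference method"), a constant scalar ξ = 1, and a constant daily input equal to the five-year ambient total spread evenly over time. The transfer fractions in matrix `A` are illustrative placeholders that merely follow the pool connections of Table 1; they are not the values used in the study. The coefficients `c` are the ambient posterior means from Table 3.

```python
import numpy as np

# Transfer coefficients c (g C g^-1 d^-1): ambient posterior means from Table 3.
c = np.array([1.82e-3, 1.21e-4, 1.70e-2, 1.04e-3, 5.10e-3, 1.70e-4, 5.25e-6])

# ASSUMPTION: illustrative transfer fractions, NOT the study's matrix A.
# Column j moves C out of pool j (-1 on the diagonal) into the pools of Table 1.
A = -np.eye(7)
A[2, 0], A[3, 0] = 0.7, 0.3       # foliage -> metabolic and structural litter
A[3, 1] = 1.0                     # wood -> structural litter
A[4, 2] = 0.45                    # metabolic litter -> microbes
A[4, 3], A[5, 3] = 0.275, 0.275   # structural litter -> microbes, slow SOM
A[5, 4], A[6, 4] = 0.3, 0.004     # microbes -> slow SOM, passive SOM
A[4, 5], A[6, 5] = 0.42, 0.03     # slow SOM -> microbes, passive SOM
A[4, 6] = 0.45                    # passive SOM -> microbes

B = np.array([0.25, 0.30, 0.0, 0.0, 0.0, 0.0, 0.0])   # partitioning vector (text)
X0 = np.array([469.0, 4100.0, 64.0, 694.0, 123.0, 1385.0, 923.0])  # 1996 pools

def simulate(c, n_days, U, xi, X_init=X0, dt=1.0):
    """Forward-Euler solution of dX/dt = xi(t) A diag(c) X + B U(t)."""
    X = X_init.copy()
    AC = A @ np.diag(c)
    for t in range(n_days):
        X = X + dt * (xi(t) * (AC @ X) + B * U(t))
    return X

# ASSUMPTION: constant forcing; mean daily input from the ambient 5-year total.
U_mean = 6535.0 / (5 * 365)
X_end = simulate(c, 5 * 365, U=lambda t: U_mean, xi=lambda t: 1.0)
```

In an actual inversion, `U` and `xi` would return the daily MAESTRA input and the environmental scalar, and the observation operator Φ would be applied to the stored trajectory rather than to the final state only.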

2.2. Application of Bayes' Theorem

[8] A complete description of the Bayesian probabilistic inversion approach can be found in Appendix A. In the context of this study, Bayes' theorem states that the posterior probability density function (PPDF) p(c|Z) of C transfer coefficients (i.e., model parameters c) can be obtained from prior knowledge of parameters c, represented by a prior probability density function p(c), and information contained in the six data sets, represented by a likelihood function p(Z|c). To apply Bayes' theorem, we first specified the prior PDF p(c) by giving a set of limiting intervals for parameters c, then constructed the likelihood function p(Z|c) on the basis of the assumption that errors in the observed data followed Gaussian distributions.

[9] The prior probability density function p(c) of the parameters was specified as a uniform distribution over the following intervals:

equation image

[10] These lower and upper limits were chosen as the same parameter limits over which the cost function of Luo et al. [2003] was minimized. In the Bayesian framework of this study, these limits are our prior knowledge about the approximate ranges of the parameters. We assumed a uniform distribution p(c) for parameters c with an emphasis on the equal probability of all parameter values occurring in the limits. This may be the best prior to choose in the absence of any other knowledge regarding parameter distributions. The parameter space was defined as the product of the above intervals and denoted as Ω.
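A uniform prior over the box Ω has a particularly simple log density: a constant inside the limits and minus infinity outside. The sketch below shows that idea only. The numerical limits `c_min` and `c_max` are hypothetical placeholders chosen for illustration, since the study's actual intervals are not reproduced in the text, and the function name `log_prior` is likewise an assumption.

```python
import numpy as np

# HYPOTHETICAL limits for illustration only; not the intervals used in the study.
c_min = np.array([1e-4, 1e-5, 1e-3, 1e-4, 1e-3, 1e-5, 1e-7])
c_max = np.array([1e-2, 1e-3, 1e-1, 1e-2, 1e-2, 1e-3, 1e-5])

def log_prior(c):
    """Log of a uniform density on the box Omega; -inf outside the box."""
    c = np.asarray(c, dtype=float)
    if np.all((c >= c_min) & (c <= c_max)):
        # Each parameter contributes 1 / (upper limit - lower limit).
        return -np.sum(np.log(c_max - c_min))
    return -np.inf
```

Because the prior is constant inside Ω, it affects the posterior only by rejecting points that fall outside the limits.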

[11] The likelihood function was specified according to distributions of observation errors. The error e(t) in each observation Z(t) at time t is expressed by

$$e(t) = Z(t) - \Phi X(t) \qquad (5)$$

where ΦX(t) is the modeled value, which is a product of X(t) from equation (1) and Φ from equation (2). For the six data sets used in this study, equation (5) is expanded as

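$$e(t) = \big(e_1(t), e_2(t), \ldots, e_6(t)\big)^T = \big(Z_1(t) - \varphi_1 X(t),\; Z_2(t) - \varphi_2 X(t),\; \ldots,\; Z_6(t) - \varphi_6 X(t)\big)^T \qquad (6)$$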

Corresponding to each of the data sets, there is one random error component ei(t) = Zi(t) − φiX(t), with φi given in equation (2), t ∈ obs(Zi), and obs(Zi) being the sequence of observation times of the ith data set. We assumed that e(t) followed a multivariate Gaussian distribution with a zero mean. This assumption is commonly made in many studies [Braswell et al., 2005; Raupach et al., 2005], mostly because a Gaussian distribution can, in general, approximate errors from various sources well owing to the central limit theorem [von Mises, 1964]. With the Gaussian distribution, the probability density function of e(t) at time t is given by

$$p\big(e(t)\big) \propto \exp\left\{-\frac{1}{2}\, e(t)^T\, \mathrm{cov}^{-1}(e_t)\, e(t)\right\} \qquad (7)$$

where cov(et) is the covariance matrix of vector e(t). In the study, the off-diagonal elements of cov(et), which measure error correlations, are assumed to be zero, while the diagonal elements, which specify the variances of the components of e(t), were calculated from the normalizing factors of Luo et al. [2003], which were estimated from observations (Table 2). Assuming that each component ei(t) is independently and identically distributed over the observation times, the likelihood function p(Z|c) is the product of the distributions of ei(t), i = 1,…,6 (equation (7)) at all observation times:

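$$p(Z \mid c) \propto \exp\left\{-\sum_{i=1}^{6} \frac{1}{2\sigma_i^2} \sum_{t \in \mathrm{obs}(Z_i)} \big[Z_i(t) - \varphi_i X(t)\big]^2\right\} \qquad (8)$$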

where constants σ1^2, σ2^2, …, σ6^2 are the error variances of soil respiration, woody biomass, foliage biomass, litterfall, soil C, and mineral C, respectively (Table 2). Then, with Bayes' theorem, the PPDF of parameters c (Appendix A, equation (A2)) is given by

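$$p(c \mid Z) \propto p(Z \mid c)\, p(c) \propto \exp\left\{-\sum_{i=1}^{6} \frac{1}{2\sigma_i^2} \sum_{t \in \mathrm{obs}(Z_i)} \big[Z_i(t) - \varphi_i X(t)\big]^2\right\}, \qquad c \in \Omega \qquad (9)$$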
Table 2. Standard Deviation of Errors of the Six Data Sets and Normalizing Factors^a

| Data set | Standard deviation σ, ambient | Standard deviation σ, elevated | Normalizing factor w, ambient | Normalizing factor w, elevated |
|---|---|---|---|---|
| Soil respiration (σ1), g C m−2 d−1 | 0.84 | 0.95 | 1.40 | 1.84 |
| Woody biomass (σ2), g C m−2 | 377 | 490 | 284,304 | 481,943 |
| Foliage biomass (σ3), g C m−2 | 35.0 | 45.6 | 2,453 | 4,159 |
| Litterfall (σ4), g C m−2 yr−1 | 49.4 | 100 | 4,894 | 20,345 |
| Soil carbon (σ5), g C m−2 | 66 | 157 | 8,712 | 49,612 |
| Mineral carbon (σ6), g C m−2 | 134 | 340 | 36,180 | 231,200 |

^a Normalizing factors are from Luo et al. [2003]. Standard deviation σ is calculated from the normalizing factor w using σ = √(w/2).
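As a rough sketch of how the quantities in Table 2 enter the likelihood, the code below converts normalizing factors to standard deviations and evaluates the Gaussian log likelihood up to an additive constant. The relation σ = √(w/2) is the one that reproduces the tabulated values (for example, √(8,712/2) = 66). The function names and the layout of `residuals` (a list of six arrays, one per data set, of possibly different lengths) are assumptions made for illustration.

```python
import numpy as np

def sigma_from_w(w):
    """Standard deviation from a normalizing factor, sigma = sqrt(w / 2)."""
    return np.sqrt(np.asarray(w, dtype=float) / 2.0)

# Ambient normalizing factors from Table 2, in the order sigma_1 ... sigma_6.
w_ambient = np.array([1.40, 284304.0, 2453.0, 4894.0, 8712.0, 36180.0])
sigma_ambient = sigma_from_w(w_ambient)

def log_likelihood(residuals, sigma):
    """Gaussian log likelihood up to a constant.

    residuals: list of six 1-D arrays e_i(t) = Z_i(t) - phi_i X(t), one per
    data set, evaluated at that data set's observation times.
    """
    return -sum(np.sum(np.square(e)) / (2.0 * s ** 2)
                for e, s in zip(residuals, sigma))
```

Since 2σ^2 = w, each data set is effectively weighted by 1/w, so halving a variance doubles the weight of the corresponding data set.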

2.3. Sampling With the Metropolis-Hastings (M-H) Algorithm and Convergence Test

[12] The M-H algorithm is a Markov chain Monte Carlo (MCMC) technique that reveals high-dimensional probability density functions of random variables via a sampling procedure [Metropolis et al., 1953; Hastings, 1970; Geman and Geman, 1984; Gelfand and Smith, 1990]. To generate a Markov chain in the parameter space, we ran the M-H algorithm by repeating two steps: a proposing step and a moving step. In each proposing step, the algorithm generates a new point cnew on the basis of the previously accepted point c(k−1) with a proposal distribution q(cnew | c(k−1)). In each moving step, point cnew is tested against the Metropolis criterion to examine whether it should be accepted or rejected (see Appendix B for a detailed description of the M-H algorithm).

[13] The proposal distribution q(cnew | c(k−1)) can strongly affect the efficiency of the M-H algorithm. To find an effective proposal distribution, we first made a test run of the algorithm with 20,000 simulations, using a uniform proposal distribution centered at the currently accepted point: cnew = c(k−1) + r × (cmax − cmin)/D, where r is a random number uniformly distributed between −0.5 and +0.5, cmax and cmin are the upper and lower limits of parameter vector c, and D is a value controlling the proposing step size. This study set D = 5 so that the maximum step size is 1/10 of the range between the upper and lower limits of parameters c. Out of 20,000 simulations, the test run accepted about 1,200 updated samples. On the basis of the test run, we constructed a Gaussian distribution N(0,cov0(c)), where cov0(c) is a diagonal matrix with its diagonal set to the estimated variances of the parameters c from the initial test run and zeros elsewhere, and then we adopted the following proposal distribution to formally execute the consecutive MCMC simulations:

$$c^{new} = c^{(k-1)} + N\big(0,\ \mathrm{cov}^{0}(c)\big) \qquad (10)$$

On the basis of equation (10), in each proposing step of the M-H algorithm a new point cnew is generated from its predecessor c(k−1) from a Gaussian distribution with mean c(k−1), constant variances estimated from the test run and zero parameter covariances.
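The two-stage procedure described above (a test run with a uniform proposal, followed by runs with a Gaussian proposal whose variances come from the test run) can be sketched as below. This is a toy illustration, not the study's implementation: the target is a hypothetical two-parameter density on the unit square standing in for the PPDF, and the function and variable names are assumptions. The acceptance rule is the standard Metropolis criterion for a symmetric proposal.

```python
import numpy as np

def metropolis_hastings(log_post, c_start, propose, n_steps, rng):
    """Random-walk M-H with a symmetric proposal; returns chain and acceptance rate."""
    c_cur = np.asarray(c_start, dtype=float)
    lp_cur = log_post(c_cur)
    chain = np.empty((n_steps, c_cur.size))
    accepted = 0
    for k in range(n_steps):
        c_new = propose(c_cur, rng)              # proposing step
        lp_new = log_post(c_new)
        # Moving step: accept with probability min(1, p_new / p_cur).
        if np.log(rng.uniform()) < lp_new - lp_cur:
            c_cur, lp_cur = c_new, lp_new
            accepted += 1
        chain[k] = c_cur                         # a rejected move repeats the old point
    return chain, accepted / n_steps

# HYPOTHETICAL target on the box [0, 1]^2, standing in for the PPDF.
c_min, c_max, D = np.zeros(2), np.ones(2), 5.0

def log_post(c):
    if np.any(c < c_min) or np.any(c > c_max):
        return -np.inf                           # outside the prior limits
    return -0.5 * np.sum(((c - 0.5) / 0.1) ** 2)

rng = np.random.default_rng(0)
start = np.array([0.2, 0.8])

# Stage 1: test run with the uniform proposal c_new = c + r (c_max - c_min) / D.
def uniform_proposal(c, rng):
    return c + rng.uniform(-0.5, 0.5, c.size) * (c_max - c_min) / D

test_chain, _ = metropolis_hastings(log_post, start, uniform_proposal, 20000, rng)

# Stage 2: Gaussian proposal with a diagonal covariance estimated from the test run.
sd0 = test_chain.std(axis=0)

def gaussian_proposal(c, rng):
    return c + rng.normal(0.0, sd0)

chain, acc_rate = metropolis_hastings(log_post, start, gaussian_proposal, 15000, rng)
```

Working with log densities avoids numerical underflow when the likelihood involves many observations; the comparison `log(u) < lp_new - lp_cur` is equivalent to the usual ratio test.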

[14] We formally made five parallel runs of the M-H algorithm with the proposal distribution in equation (10). The five runs started at dispersed initial points in the parameter space Ω, and each run comprised 15,000 simulations. We monitored the trace plots of samples and calculated the running means and standard deviations of the parameters as the simulation progressed. The initial samples (about 2,500 samples in the burn-in period) were discarded once the running means and standard deviations had stabilized. The acceptance rates for the newly generated samples were about 30–40% for the five runs. For statistical analysis of the parameters, we used the union of the samples of the five runs (about 60,000 samples in total) after their burn-in periods.

[15] Theoretically, the M-H algorithm converges to a stationary distribution, as guaranteed by the ergodicity theorem in Markov chain theory. In practice, the convergence of the sampling chains is often tested by the Gelman-Rubin (G-R) diagnostic method (Appendix C). In this study, we applied the G-R test and calculated the G-R statistics to examine the convergence of the five parallel runs. Only after the G-R test indicated convergence (i.e., the G-R statistic approached 1) were the samples used for statistical inferences.
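For readers unfamiliar with the diagnostic, a minimal sketch of the G-R statistic for a single parameter is given below. It uses the commonly cited form based on within-chain and between-chain variances; Appendix C is not reproduced here, so whether it matches the authors' exact variant is an assumption. The function name `gelman_rubin` is illustrative.

```python
import numpy as np

def gelman_rubin(chains):
    """G-R statistic for one parameter.

    chains: array of shape (m, n), m parallel chains with n post-burn-in samples.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()        # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)      # between-chain variance
    var_hat = (n - 1) / n * W + B / n            # pooled estimate of the variance
    return np.sqrt(var_hat / W)
```

Values close to 1 indicate that the chains are sampling the same distribution; chains stuck in different regions of Ω inflate the between-chain term and push the statistic well above 1.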

2.4. Parameter Estimation

[16] We estimated parameter statistics of maximum likelihood estimators (MLEs), means, and correlations on the basis of the union of the five-run samples. Histograms and cumulative distributions (CDFs) were constructed from the series of samples to display distributions of parameters in the parameter space Ω. Uncertainties of estimated parameters were quantified with a 95% highest-probability density interval – the interval of the minimum width containing 95% of the area of the marginal distribution. MLEs were made by observing the parameter values corresponding to the peaks of the marginal distributions. Means of parameters ci (E(ci), i = 1, …, 7) were estimated by

$$E(c_i) = \frac{1}{k} \sum_{n=1}^{k} c_i^{(n)} \qquad (11)$$

where k is the number of samples given by the M-H algorithm. Correlations between parameters (corr(c)) were estimated by

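$$\mathrm{corr}(c) = \left[\frac{\mathrm{cov}(c_i, c_j)}{\sqrt{\mathrm{cov}(c_i, c_i)\,\mathrm{cov}(c_j, c_j)}}\right]_{i,j = 1, \ldots, 7} \qquad (12)$$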

where cov(ci,cj) is covariance between parameter ci and cj and estimated by

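$$\mathrm{cov}(c_i, c_j) = \frac{1}{k} \sum_{n=1}^{k} \big(c_i^{(n)} - E(c_i)\big)\big(c_j^{(n)} - E(c_j)\big) \qquad (13)$$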

Components in matrix corr(c) lie between −1 and +1. A value of +1 (−1) indicates perfect positive (negative) correlation, and near-zero values indicate little correlation. By definition, the diagonal components are +1.
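The summary statistics described in this section can be computed directly from the pooled samples. The sketch below is a hedged illustration with assumed function names: `hpd_interval` finds the narrowest interval containing 95% of the samples of one parameter, and `summarize` returns sample means, the correlation matrix, and the intervals for all parameters.

```python
import numpy as np

def hpd_interval(x, mass=0.95):
    """Narrowest interval containing a fraction `mass` of the samples."""
    x = np.sort(np.asarray(x, dtype=float))
    n_in = int(np.ceil(mass * x.size))           # number of samples inside
    # Width of every candidate interval that spans n_in consecutive samples.
    widths = x[n_in - 1:] - x[: x.size - n_in + 1]
    i = int(np.argmin(widths))
    return x[i], x[i + n_in - 1]

def summarize(samples):
    """samples: array (k, p) of post-burn-in draws for p parameters."""
    mean = samples.mean(axis=0)
    corr = np.corrcoef(samples, rowvar=False)    # p x p correlation matrix
    hpd = np.array([hpd_interval(samples[:, i]) for i in range(samples.shape[1])])
    return mean, corr, hpd
```

For a symmetric, unimodal marginal distribution the highest-probability interval coincides with the central 95% interval; for an edge-hitting distribution it shifts toward the boundary where the density is highest.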

2.5. Simulated Pool Sizes

[17] We used Monte Carlo simulation to propagate the parameter uncertainty, as expressed by the PPDF p(c|Z) (equation (9)), forward and constructed CDFs for simulated pool sizes using equation (1) with a time step of one day. The model simulation was made over the ten years from 2000 to 2010. The time courses of photosynthetic C input U(.) and environmental scalar ξ(.) from 1996 to 2000 were replicated twice to cover the period from 2000 to 2010. At the end of the simulation (year 2010), we collected numerical solutions of equation (1) obtained with 12,000 samples of p(c|Z) as input to the forward simulation. Cumulative distribution functions (CDFs) of X(t) were constructed from the 12,000 numerical solutions to quantify uncertainty of C pool sizes.
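The forward propagation amounts to running the model once per posterior sample and building an empirical CDF from the results. The sketch below illustrates this with hypothetical stand-ins: a single-pool model dX/dt = −cX + u with a constant input, and 12,000 draws of one coefficient from an arbitrary uniform range. None of these numbers come from the study; only the workflow is the point.

```python
import numpy as np

def empirical_cdf(values):
    """Sorted values and their cumulative probabilities."""
    x = np.sort(np.asarray(values, dtype=float))
    return x, np.arange(1, x.size + 1) / x.size

def simulate_pool(c, X0=2500.0, u=0.5, n_days=3650):
    """Daily-step solution of dX/dt = -c X + u for an array of c values at once."""
    X = np.full_like(c, X0)
    for _ in range(n_days):
        X = X + (-c * X + u)
    return X

# HYPOTHETICAL posterior draws of a single transfer coefficient.
rng = np.random.default_rng(0)
samples = rng.uniform(1e-4, 3e-4, size=12000)

pools = simulate_pool(samples)                   # one forward run per sample
x, p = empirical_cdf(pools)                      # CDF of the simulated pool size
lo, hi = np.percentile(pools, [2.5, 97.5])       # central 95% interval
```

Comparing the intervals obtained for two treatments then indicates whether the simulated pool sizes differ beyond the parameter uncertainty.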

3. Results

[18] Our inversion results are presented in Figure 3 for ambient CO2 and Figure 4 for elevated CO2. Figures 3a–3g and 4a–4g show 10,000 samples from the sampling series of the M-H simulation. Figures 3h–3n and 4h–4n show histograms of all 60,000 samples generated by the five runs since the five runs converged as indicated by the G-R statistic (Table 3). Figures 3o–3u and 4o–4u are cumulative distribution functions (CDFs) constructed from the histograms of each of the five runs for transfer coefficients c. At both ambient and elevated CO2, parameters c1, c2, and c4 were well constrained within their prespecified ranges (Figures 3, 4h, 4i, and 4k). Comparison of parameter distributions shows that parameter c1 is much higher at elevated than ambient CO2 (Figures 3 and 4h). Distributions of parameter c2 were about the same at both elevated and ambient CO2 (Figures 3 and 4i). In contrast, parameters c3, c5 and c7 were poorly constrained (Figures 3, 4j, 4l, and 4n) at both CO2 treatments. To examine their PPDFs in broader ranges, we decreased the lower limits defined in equation (4) by 1/5 and increased the upper limits in equation (4) by fivefold for parameters c3, c5 and c7 at ambient CO2. Similar to Figures 3j, 3l, and 3n, histograms in Figures 5c, 5e, and 5g still did not show statistically meaningful distributions.

Figure 3.

Inversion results under ambient CO2 showing 60,000 samples from M–H simulation, the histograms of all samples from the five runs, and the CDFs constructed from each of the five runs. The y axes represent the prespecified limits of the parameters.

Figure 4.

Inversion results under elevated CO2 showing 60,000 samples from M–H simulation, the histograms of all samples from the five runs, and the CDFs constructed from each of the five runs. The y axes represent the prespecified limits of the parameters.

Figure 5.

Marginal distributions of parameters c where the lower limit of parameters c3, c5, and c7 are reduced by 1/5 and the upper limits are increased fivefold at ambient CO2.

Table 3. Maximum Likelihood Estimates (MLEs), Mean Estimates, 95% High-Probability Intervals (Lower Limit, Upper Limit), and G-R Statistics^a

| Parameters, g C g−1 d−1 | MLE | Mean | 95% High-Probability Interval | G-R Statistics | Luo et al. [2003] |
|---|---|---|---|---|---|
| Ambient | | | | | |
| c1 (×10−3) | 1.82 | 1.82 | (1.72, 1.89) | 1.0 | 1.76 |
| c2 (×10−4) | 1.21 | 1.21 | (0.99, 1.42) | 1.0 | 1.00 |
| c3 (×10−2) | NA | 1.70 | (0.66, 2.70) | 1.0 | 2.15 |
| c4 (×10−3) | 1.04 | 1.04 | (0.80, 1.34) | 1.0 | 0.845 |
| c5 (×10−3) | NA | 5.10 | (3.10, 6.85) | 1.0 | 8.530 |
| c6 (×10−4) | 1.70 | 1.70 | (0.55, 2.65) | 1.0 | 0.898 |
| c7 (×10−6) | NA | 5.25 | (1.51, 9.00) | 1.0 | 3.1 |
| Elevated | | | | | |
| c1 (×10−3) | 2.34 | 2.34 | (2.25, 2.46) | 1.0 | 2.17 |
| c2 (×10−4) | 1.25 | 1.25 | (1.19, 1.52) | 1.0 | 1.41 |
| c3 (×10−2) | NA | 1.71 | (0.65, 2.71) | 1.0 | 2.268 |
| c4 (×10−3) | 1.03 | 1.10 | (0.50, 1.71) | 1.0 | 0.965 |
| c5 (×10−3) | NA | 4.84 | (2.90, 6.80) | 1.0 | 2.534 |
| c6 (×10−4) | 0.55 | 0.66 | (0.50, 2.40) | 1.0 | 0.558 |
| c7 (×10−6) | NA | 5.19 | (1.60, 9.00) | 1.0 | 2.700 |

^a For comparison, the results of Luo et al. [2003] are also listed. NA means not available. The G-R statistics were calculated from the five sequences after the burn-in periods.

[19] Histograms of parameter c6 (Figures 3 and 4m) appear to contain more information on parameter constraint than those of parameters c3, c5 and c7 but less than those of parameters c1, c2, and c4 at ambient and elevated CO2. The low information content on parameter c6 may be caused partly by a limited number of data points for soil C content and partly by large variation of the soil C measurements. To increase the weight of the limited data points, we decreased the original variances of C contents in the forest floor and mineral soil by half (i.e., increased the weighting factors of the two data sets in equation (8) by 100%) and reran the M-H algorithm to construct marginal distributions. Histograms of parameter c6 with the reduced variances were much more concentrated than those with the original variances at both ambient and elevated CO2 (Figure 6). No significant changes in marginal distributions of other parameters were observed (data not presented). The distribution of parameter c6 was well constrained at ambient CO2 with the reduced variances (Figure 6c) but hit the edge of its prespecified range at elevated CO2 with either the original or the reduced variances (Figures 6b and 6d). Elevated CO2 shifted the marginal distribution downward in comparison to that at ambient CO2 (Figure 6d versus Figure 6c).

Figure 6.

Sensitivity of marginal distribution to reduced error variances: (a and b) marginal distributions of c6 with original variances and (c and d) marginal distributions constructed using reduced error variances of forest floor carbon data set and mineral carbon data set.

[20] For parameters c1, c2, c4 and c6, the maximum likelihood estimates (MLEs, Table 3) were identified by observing the parameter values corresponding to the peaks of their marginal distributions (Figures 3, 4h, 4i, 4k, and 4m). There are no distinctive modes to indicate MLEs for parameters c3, c5 and c7 (Figures 3, 4j, 4l, and 4n) and thus their MLEs were not identified (Table 3). Nevertheless, we were able to give the mean estimates for all parameters ci, i = 1,…,7 by calculating the sample means. The 95% probability intervals were estimated from the CDFs (Figure 3 and 4o–4u) to quantify parameter uncertainty (Table 3). Among the parameters, c1 has the least variability relative to its range, followed by c2, c4 and c6 (mostly symmetric with distinctive modes), while parameters c3, c5 and c7 have the largest variability (widely spread marginal distributions). In general, the 95% confidence intervals cover estimated values by Luo et al. [2003] for all transfer coefficients except c1 at elevated CO2.

[21] Under both ambient and elevated CO2, our cross-correlation analysis based on equation (12) showed that the seven parameters are not significantly intercorrelated with each other (Figure 7) except for the pair c3 and c4. Parameters c3 and c4 were negatively correlated with a correlation coefficient of −0.25 at ambient CO2 and −0.15 at elevated CO2.

Figure 7.

Correlations among model parameters c1, c2, …, c7 under ambient and elevated CO2.

[22] Under both ambient and elevated CO2, the simulated and observed data sets using mean value estimates (Table 3) fitted closely with R2 generally between 0.7 and 1, but mostly more than 0.8 (Figure 8). The fittings are similar to those shown by Luo et al. [2003]. Among the comparisons between the simulated values and observed data, large deviation existed between the simulated and observed foliage biomass (Figures 8c and 8d), probably due to both model assumptions and observation errors as discussed by Luo et al. [2003].

Figure 8.

Comparison between the simulated data sets and the observed data sets under both ambient CO2 and elevated CO2. For each pair of plots, the left plot shows the matching at ambient CO2, and the right plot shows the case at elevated CO2.

[23] Simulated C pool sizes in foliage biomass, woody biomass, structure litter, slow SOM, and passive SOM have symmetric distributions in 2010 (Figures 9a, 9b, 9d, 9f, and 9g). Two C pools, metabolic litter and microbes, have left-skewed distributions (Figures 9c and 9e). The CDFs under elevated CO2 were shifted to the right except for passive SOM, suggesting that elevated CO2 increased C sequestration in the forest ecosystem. Table 4 lists means and 95% confidence intervals of C pool sizes in all seven compartments. Mean C contents increased by 1.5% in the passive soil C pool and by 39.2% in the slow soil C pool. The simulated C content in the whole forest ecosystem increased by 23.8% by 2010. The 95% confidence intervals of simulated C pool sizes were significantly shifted to the right for woody biomass and for total C in the system (Figures 9b and 9h). However, the distributions of simulated C pool sizes in several compartments overlapped statistically (Figures 9c, 9e, and 9g).

Figure 9.

CDFs of simulated carbon pool sizes of year 2010.

Table 4. Summary Statistics for Simulated Carbon Pool Sizes for Year 2010

| Pools, g C m−2 | Ambient CO2, Mean | Ambient CO2, 95% Confidence Interval | Elevated CO2, Mean | Elevated CO2, 95% Confidence Interval | Mean Increment of C Content, % |
|---|---|---|---|---|---|
| Foliage biomass (x1) | 656 | (637, 672) | 686 | (662, 701) | 4.6 |
| Woody biomass (x2) | 7800 | (7460, 8285) | 9400 | (8750, 10000) | 20.5 |
| Metabolic litter (x3) | 69 | (32, 110) | 77 | (43, 142) | 11.5 |
| Structure litter (x4) | 1250 | (900, 1460) | 1700 | (900, 2400) | 36.5 |
| Microbes (x5) | 210 | (130, 300) | 280 | (160, 400) | 33.3 |
| Slow SOM (x6) | 2500 | (1950, 3200) | 3480 | (2500, 4100) | 39.2 |
| Passive SOM (x7) | 954 | (930, 992) | 968 | (932, 997) | 1.5 |
| Total Carbon | 13426 | (12700, 14100) | 16616 | (15700, 17550) | 23.8 |

[24] In our forward simulation, we extended the input from year 2001 to year 2010 by repeating the C input to the system twice. The cumulative difference in C input between the ambient and elevated CO2 treatments over the fifteen-year period from 1996 to 2010 is 6,863 g C m−2, the simulated cumulative difference in soil respiration over the same period is 3,657 g C m−2, and the difference in pool sizes at the end of the simulation period (year 2010) is about 3,190 g C m−2 on average (Table 4). Thus the extra C that the system stored (3,190 g C m−2 on average) and released (3,657 g C m−2 via soil respiration) nearly matches the extra C input (6,863 g C m−2) over the fifteen-year period. Note that the match is not exact because the total C amounts (Table 4) are mean estimates derived from the two CDFs in Figure 9h, which were constructed from empirical data and thus may contribute estimation error.
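The budget closure can be checked with the numbers quoted above and in Table 4. The Δ symbols below are labels introduced here for illustration only.

```latex
\begin{align*}
\Delta_{\text{stored}} &= 16616 - 13426 = 3190\ \text{g C m}^{-2},\\
\Delta_{\text{stored}} + \Delta_{\text{respired}} &= 3190 + 3657 = 6847\ \text{g C m}^{-2},\\
\Delta_{\text{input}} - \left(\Delta_{\text{stored}} + \Delta_{\text{respired}}\right) &= 6863 - 6847 = 16\ \text{g C m}^{-2} \approx 0.2\%\ \text{of}\ \Delta_{\text{input}}.
\end{align*}
```

The residual of 16 g C m−2 is small relative to the widths of the 95% confidence intervals for total C in Table 4.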

4. Discussion

4.1. Probabilistic Versus Deterministic Inversion

[25] When a Gaussian type of error is used in probabilistic inversion, maximum likelihood estimates (MLEs) of parameters are equivalent to optimal estimates from deterministic inversion using the least squares (LS) method [Tarantola, 1987; Raupach et al., 2005]. Luo et al. [2003] used an LS criterion and a Levenberg-Marquardt method coupled with Quasi Monte Carlo (LMQMC) to search for the C transfer coefficients c. In LMQMC, search directions are calculated using gradient vectors and approximated Hessian matrices of a cost function, and quasi Monte Carlo steps are used to find the step size that gives the largest decrease in the cost function along the search direction. A stopping criterion is set to terminate the algorithm when the cost function can no longer be reduced significantly. With the probabilistic inversion in this study, we exploited the same parameter space and observed data sets as in the work by Luo et al. [2003] to construct a PDF for parameters c, from which we derived statistical inferences (e.g., MLEs, means, and 95% confidence intervals) of c. The MLEs of the relatively well constrained parameters c1, c2, c4 and c6 are generally in good agreement with those from the deterministic inversion of Luo et al. [2003], except for c6 at ambient CO2 (Table 3). The well-constrained parameters in the probabilistic inversion are those parameters to which the cost function in the deterministic inversion was most sensitive. The MLEs of parameters c3, c5, and c7 were not comparable to those estimated by the deterministic inversion since they could not be uniquely determined in this study. However, the PDFs of the poorly constrained parameters in the probabilistic inversion offer broad 95% confidence intervals, which cover the optimal estimates by Luo et al. [2003]. The different estimates of parameter c6 at ambient CO2 between the probabilistic and deterministic inversions most likely arose because the LMQMC had not updated its initial value significantly before the stopping criterion ended the search.

[26] The probabilistic approach employed by this study is advantageous over the deterministic approach of Luo et al. [2003] in at least three aspects: First, the probabilistic inversion constructs parameter distributions (such as in Figures 3 and 4) while the deterministic inversion provides only point estimates. The parameter distributions can be used to quantify MLEs, means, and confidence intervals, and thus offer much richer information than the point estimates by the deterministic inversion. Second, the probabilistic approach reveals whether a parameter is well constrained or not (e.g., parameter c7 versus c1) whereas the deterministic inversion could not. From the degree to which parameters are constrained by data, we can assess parameter uncertainties as measured, for example, by confidence intervals. Third, the probabilistic approach can readily analyze correlations among parameters (e.g., Figure 7) while the deterministic inversion may not be always able to reveal such information. The probabilistic approach analyzes parameter correlations from the sampling series. The deterministic optimization approach usually estimates optimal parameter values without quantifying their correlations [e.g., Barrett, 2002; Luo et al., 2003]. Although some applications estimated correlations among parameters from Hessian matrix [e.g., Wang et al., 2001] at the optimal point, analytic solutions of Hessian matrices may not be easily obtainable, especially when models are given as a set of differential equations.

4.2. Constraints of Parameters by Data Sets

[27] The nature of inverse analysis is to exploit information content contained in data, model structure, and prior knowledge on parameters [Raupach et al., 2005]. The six data sets used in this study contain enough information to constrain C transfer coefficients of nonwoody biomass, woody biomass, structural litter, and slow soil C (c1, c2, c4 and c6), but not enough for metabolic litter, microbial, and passive soil C (c3, c5 and c7). The lack of microbial biomass data in this study may cause the large uncertainty of c5. As an exercise, we used modeled microbial biomass data (i.e., a virtual data set) in the inverse analysis. Parameter c5 became well constrained (data not presented). This suggests that microbial biomass data are crucial in future inverse analysis. Parameter c3 is the transfer coefficient from the metabolic litter pool, which is small and turns over quickly. Although the concept of metabolic litter may be important in ecology [Berg and McClaugherty, 2003], we did not have data to constrain the transfer coefficient c3 from the pool. We may have to merge this pool with the structural litter pool in future inverse analysis unless data on labile litter compounds become available. Also, as seen from Figure 7, parameters c3 and c4 show posterior correlation, which indicates that the information content of the litterfall measurements is not sufficient to separate these coefficients, and therefore they cannot be constrained separately.

[28] Parameter c7 describes C transfer from the passive soil organic matter pool, which has a residence time of hundreds or thousands of years. It can hardly be constrained by short-term observations. However, this pool is critical for simulating long-term C dynamics in terrestrial ecosystems [Parton et al., 1987]. We may explore C isotope data to constrain this parameter in a future study.

[29] Parameter correlations are part of the information revealed by the inverse analysis, and they likely reflect relationships defined by model structure, correlations among data, errors, or any combination of the three. This study identified only one negative correlation, between parameters c3 and c4 (−0.25 and −0.15). The negative correlation may suggest a complementary relationship between C transfer rates of metabolic litter and structural litter and is physically reasonable since the total litter amount was partitioned into the two pools. An increase in parameter c3 is expected to be accompanied by a decrease in parameter c4 and vice versa. Unless we have data on labile versus structural components of litter, the inversion cannot independently estimate these two parameters.
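
As an illustration of how a correlation coefficient can be computed from a sampling series, the following minimal sketch calculates the Pearson correlation between two parameter series. The function name and the sample values are hypothetical and chosen only for illustration; they are not values from this study.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length sample series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical posterior samples of two coefficients (illustrative numbers only).
c3_samples = [0.010, 0.012, 0.015, 0.011, 0.014, 0.013]
c4_samples = [0.0030, 0.0027, 0.0024, 0.0029, 0.0026, 0.0025]

# A negative value means that larger c3 samples tend to pair with smaller c4 samples.
r_34 = pearson_correlation(c3_samples, c4_samples)
```

In an actual analysis the two lists would be the post-burn-in columns of the MCMC sampling series for the two parameters.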

4.3. Data Properties and Parameter Uncertainties

[30] Data properties such as error distributions, cross correlations among multiple data sets, and the evolution of self-correlations or cross correlations with time are critical for evaluation of parameter uncertainties. In this study, we reduced the error variances of the forest floor C and the mineral C data sets by half to examine the sensitivity of model parameters to error variances. As shown in Figure 6, a reduction of the error variances substantially reduced the uncertainty of c6. Our exercise demonstrated that error magnitudes in observations play an important role in determining parameter uncertainty. In general, error distributions determine the form of a likelihood function (equation (8)) and correspondingly the PPDF p(c|Z) (equation (9)). Although Gaussian error distributions are often assumed in extant work (see Raupach et al. [2005] for a general discussion), other distributions such as skewed or lognormal ones may be more realistic for particular data. The key properties of uncertainty sources associated with such non-Gaussian distributions have yet to be examined in probabilistic inversion.
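
The sensitivity of parameter uncertainty to error variance can be illustrated with a toy problem. The sketch below assumes a hypothetical one-pool decay model x(t) = x0 exp(−c t), a uniform prior on a grid of c values, and a Gaussian likelihood; the model, the synthetic observations, and all names are assumptions made for illustration and are not the model or data of this study.

```python
import math


def posterior_interval_width(obs, times, x0, sigma2, grid):
    """Width of the central 95% posterior interval of c on a uniform-prior grid."""
    log_like = []
    for c in grid:
        sse = sum((z - x0 * math.exp(-c * t)) ** 2 for z, t in zip(obs, times))
        log_like.append(-sse / (2.0 * sigma2))
    peak = max(log_like)
    weights = [math.exp(v - peak) for v in log_like]  # unnormalized posterior
    total = sum(weights)
    cumulative, lower, upper = 0.0, None, None
    for c, w in zip(grid, weights):
        cumulative += w / total
        if lower is None and cumulative >= 0.025:
            lower = c
        if upper is None and cumulative >= 0.975:
            upper = c
    return upper - lower


# Synthetic observations: true c = 0.3 plus fixed perturbations (illustrative only).
times = [0, 1, 2, 3, 4, 5]
noise = [2.0, -3.0, 1.0, -2.0, 3.0, -1.0]
obs = [100.0 * math.exp(-0.3 * t) + e for t, e in zip(times, noise)]
grid = [0.1 + 0.0005 * i for i in range(801)]  # uniform prior on [0.1, 0.5]

width_full = posterior_interval_width(obs, times, 100.0, 9.0, grid)
width_half = posterior_interval_width(obs, times, 100.0, 4.5, grid)
```

In this toy setting, halving the error variance narrows the 95% interval by a factor close to the square root of two, because the posterior is nearly Gaussian with variance proportional to the error variance.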

[31] It is a tremendously difficult task to obtain the properties of error distributions, cross correlations among multiple data sets, and the evolution of self-correlations or cross correlations with time in experimental observations. When they are not available as in most of the current studies, assumptions about uncertainty properties must be made, for example, with a constant σ across time, Gaussian distributions, or independent random errors among multiple data sets (e.g., Braswell et al. [2005] and this study) to proceed with inverse analysis. Currently there have been initial efforts toward specifying data properties related to key terrestrial C observations based on expert judgment [Raupach et al., 2005], which are helpful in quantifying parameter and prediction uncertainties. In future experimental research, it is highly desirable to quantify those data properties from measurements.

4.4. Data-Model Fitness and Simulation of Pool Sizes

[32] Our probabilistic inversion improved the data-model fitness, with higher R2 values than those obtained using the deterministic inversion (Figure 8 versus Figure 3 of Luo et al. [2003]). However, there was still considerable unexplained variance (Figure 8). In particular, the systematic variation in foliage biomass was not explained by the model with parameter values estimated by the probabilistic inversion. Luo et al. [2003] suggested that a restricted search range for c1 may be partially responsible for the systematic deviation. This study allowed the inversion to search for values of c1 in a much broader range than that used by Luo et al. [2003] and still did not improve the fitness sufficiently. The other reason for the discrepancy suggested by Luo et al. [2003] is the quality of the foliage biomass data, which were indirectly estimated from diameter at breast height (DBH). It would be ideal to make direct measurements of foliage biomass if techniques allow. The systematic deviation may also result from model structure, which may not accurately represent growth and senescence processes of foliage biomass. In addition, this inversion analysis used multiple data sets. We may have to explore various other setups to define the PPDF (e.g., varying weighting factors in equation (8) for different data sets) to improve the model's ability to match the multiple data sets simultaneously.

[33] The predicted mean pool sizes were generally larger under elevated than ambient CO2 (Table 4 and Figure 9). However, the 95% confidence intervals of most of the pools did not exhibit significant change except for the woody biomass pool. The CDFs for the passive SOM pool nearly coincided (Figure 9g), showing short-term simulation could not change the size of a long-term pool. The CDFs of simulated pool sizes were constructed by solving the model equation (1) with the sampling series. This approach incorporates information from the posterior parameter estimates, posterior correlations among parameters, and model structure into forward simulation. Even though parameters c3, c5 and c7 were poorly constrained, simulated pool sizes for X3, X5 and X7 are not uniformly distributed, suggesting meaningful information contained in the forward simulation. The information contained in the CDFs of forward simulation is derived from the model structure itself, in combination with the constrained parameters of other pools.
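
The construction of a CDF of simulated pool sizes from a sampling series can be sketched as follows. This example assumes a hypothetical one-pool model dX/dt = u − cX stepped daily, which is far simpler than the model of equation (1); the function names, input rate, initial pool size, and sample values are all illustrative assumptions.

```python
def simulate_pool(c, x0, u, days):
    """Forward-simulate a single pool with constant input u and transfer coefficient c."""
    x = x0
    for _ in range(days):
        x += u - c * x  # daily Euler step of dX/dt = u - c * X
    return x


def empirical_cdf(values):
    """Return (value, cumulative probability) pairs sorted by value."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Hypothetical posterior samples of a transfer coefficient (per day).
c_samples = [0.0020, 0.0025, 0.0030, 0.0035, 0.0040, 0.0028, 0.0032]

# Each parameter sample yields one simulated pool size after ten years.
pool_sizes = [simulate_pool(c, x0=500.0, u=2.0, days=3650) for c in c_samples]
cdf = empirical_cdf(pool_sizes)
```

Because every sample in the series is propagated through the model, the resulting CDF reflects both the posterior spread of the parameters and the constraints imposed by the model structure.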

[34] Our simulation was solely based on existing knowledge of the uniform prior distribution over the limit intervals (equation (4)), the model structure (equation (1)) and the six measured data sets with the assumed Gaussian error properties (Table 2). The same data sets and the same model structure were used in this study to facilitate comparisons with the results of Luo et al. [2003]. However, there are more and longer data sets available at the Duke site, which will improve model projections. In addition, the C input U(.) and environmental scalars ξ(.) estimated from recorded micrometeorological data during the period from 1996 to 2000 were extrapolated to the period from 2000 to 2010 by replicating the time series twice. Environmental conditions at the site are likely to be different during the two periods, resulting in different U(.) and ξ(.). Other factors such as progressively limiting nutrient availability [Finzi et al., 2006; Luo et al., 2004] and forest stand development [Hooker and Compton, 2003] may further complicate projections. Although the quantitative results given by the forward model simulation can be improved, the approach of probabilistic inversion used in our study is useful and informative for data-model integration in ecology.

5. Conclusions

[35] Using the Bayesian approach and a MCMC inversion technique in this study, we constructed probability distributions of the model parameters (Figures 3 and 4), made statistical estimates (Table 3), analyzed the correlations among the parameters (Figure 7), and developed cumulative probabilistic distributions of simulated pool sizes (Figure 9). Thus the probabilistic inversion provides much more informative outputs than the deterministic inversion. Our study showed that at both ambient and elevated CO2, the transfer coefficients from pools of nonwoody biomass (c1), woody biomass (c2), structural litter (c4), and slow soil C (c6) were well constrained by the six data sets. In contrast, the transfer coefficients from pools of metabolic litter (c3), microbe (c5), and passive soil C (c7) were poorly constrained. The simulated distributions of pool sizes indicated that elevated CO2 stimulated C sequestration in the forest ecosystem. The 95% confidence intervals were significantly higher in the woody biomass and total ecosystem at elevated than ambient CO2.

[36] The Bayesian approach offers a rigorous method to assess uncertainty of model predictions. Nevertheless, its applications to ecological research are still at an early stage and have yet to be fully developed. For example, correlations among model parameters due to model structure have to be appropriately accounted for in the probabilistic inversion. Uncertainties of the estimated parameters and model projections are sensitive to error variances (Figure 6), other data properties, and assumptions on forms of distributions. Although initial assessments have been made for properties of observational data related to inverse analysis of terrestrial C processes [Raupach et al., 2005], a comprehensive understanding of uncertainty sources and data properties is required to carry out such analyses rigorously for the global C cycle.

Appendix A: Bayes' Theorem

[37] A general description of the Bayesian probabilistic inversion is given by Bayes' theorem [e.g., Box and Tiao, 1973; Tarantola, 1987; Gill, 2002; Leonard and Hsu, 1999] in the form of

$$p(c \mid Z) = \frac{p(Z \mid c)\, p(c)}{p(Z)} \qquad \text{(A1)}$$

where p(c) is the prior probability density function (PDF) representing prior knowledge about parameters c, p(Z|c) is the conditional probability density of observations Z on c (also called the likelihood function of parameters c), p(Z) is the probability of observations Z, and p(c|Z) is the posterior probability density function (PPDF) of parameters c. The theorem states that the posterior information of model parameters c represented by p(c|Z) can be obtained from the prior information represented by p(c) and the observed information given by p(Z|c). p(c|Z) is often written in the following form:

$$p(c \mid Z) \propto p(Z \mid c)\, p(c)$$

that is, p(c|Z) is proportional to p(Z|c)p(c).

[38] From the Bayesian viewpoint, p(c|Z) in (A1) represents the solution to an inverse problem since it gives a probabilistic description of parameters c over parameter space. The interpretation of p(c|Z) leads to the following integrals:

$$E(c) = \int c\, p(c \mid Z)\, dc$$

$$\mathrm{Cov}(c) = \int [c - E(c)]\,[c - E(c)]^{T}\, p(c \mid Z)\, dc$$

$$p(c_i \mid Z) = \int p(c \mid Z) \prod_{j \neq i} dc_j$$

which are the expected value, the covariance and the marginal distributions of c, respectively. These are some of the statistics describing parameter uncertainties.
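
When samples from the PPDF are available, these integrals are commonly approximated by sample statistics. The sketch below shows one way to do this, assuming the samples are stored as a list of parameter vectors that lie within known bounds; the function names and the four toy samples are hypothetical and serve only as an illustration.

```python
def posterior_mean(samples):
    """Sample mean of each parameter component."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(dim)]


def posterior_covariance(samples):
    """Unbiased sample covariance matrix of the parameter vector."""
    n = len(samples)
    mean = posterior_mean(samples)
    dim = len(mean)
    return [
        [sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
         for j in range(dim)]
        for i in range(dim)
    ]


def marginal_histogram(samples, index, lower, upper, bins):
    """Histogram density estimate of one component; assumes lower <= value <= upper."""
    width = (upper - lower) / bins
    counts = [0] * bins
    for s in samples:
        b = min(int((s[index] - lower) / width), bins - 1)
        counts[b] += 1
    return [count / (len(samples) * width) for count in counts]


# Toy two-parameter samples (illustrative only).
samples = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]
mean = posterior_mean(samples)
cov = posterior_covariance(samples)
density = marginal_histogram(samples, index=0, lower=0.0, upper=5.0, bins=5)
```

The histogram integrates to one over the prior interval, so it can be read directly as an estimate of the marginal density of the chosen component.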

Appendix B: Metropolis-Hastings Algorithm

[39] In practice, except for situations where p(c|Z) has a very simple form, it is not always possible to draw samples easily from p(c|Z) (as is the case in this study). The sampling problem hindered applications of Bayesian theory for a long time until it was solved by MCMC techniques, following the foundational work of Metropolis et al. [1953], Hastings [1970], Geman and Geman [1984], and the synthesizing paper by Gelfand and Smith [1990]. The basic idea of MCMC sampling is to design a Markov chain with p(c|Z) as the targeted stationary distribution. Once the chain has been simulated for a sufficiently long period, samples in the chain follow the stationary distribution; one can then collect the samples from the simulation and calculate various statistics associated with the PPDF from them. One of the most widely used MCMC techniques is the Metropolis-Hastings (M-H) algorithm, which is briefly described below.

[40] For simplicity of notation, we denote the targeted stationary distribution p(c|Z) as L(c). A computer implementation of the M-H algorithm consists of the following steps [Spall, 2003].

[41] Step 1: Choose an arbitrary initial point c(0) in the parameter space.

[42] Step 2: (Proposing step). Propose a candidate point cnew according to a proposal distribution q(cnew | c(k−1)).

[43] Step 3: (Moving step). Calculate $P(c^{(k-1)}, c^{new}) = \min\left\{1, \frac{L(c^{new})\, q(c^{(k-1)} \mid c^{new})}{L(c^{(k-1)})\, q(c^{new} \mid c^{(k-1)})}\right\}$, and compare the value with a random number U drawn from the uniform distribution on the interval [0, 1]. Set c(k) = cnew if U ≤ P(c(k−1), cnew); otherwise set c(k) = c(k−1). This test criterion is also called the Metropolis criterion.

[44] Step 4: Repeat steps 2 and 3 until enough samples are obtained.

[45] In most applications, the proposal distribution q(cnew | c(k−1)) is set as either a uniform distribution or a symmetric Gaussian distribution centered at the current point. The Gaussian distribution may also take into account any prior knowledge about the parameters (e.g., estimated covariance). The efficiency of the proposal distribution affects the efficiency of the algorithm, and the proposal should therefore be designed to ensure a moderate sample-acceptance rate. Robert and Rosenthal [1998] indicated that a rate of 23% is sometimes an optimal acceptance rate. In practice, it is often desirable to make “test runs” of the algorithm and adjust parameters in the proposal distribution on the basis of the test runs until the acceptance rate is approximately 23%. In general, the acceptance rate can be adjusted to lie between 20% and 50%.
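
The four steps above can be sketched in code as follows. This is a minimal illustration, not the implementation used in this study: it assumes a symmetric uniform random-walk proposal (so the q terms cancel in the acceptance probability), a uniform prior on fixed bounds, and a hypothetical one-parameter Gaussian target; all function names, bounds, and tuning values are illustrative assumptions.

```python
import math
import random


def metropolis_hastings(log_target, c0, lower, upper, step, n_samples, seed=0):
    """Random-walk M-H sampler with a symmetric uniform proposal.

    log_target returns log L(c) up to an additive constant. Candidates outside
    [lower, upper] have zero density under the uniform prior and are rejected.
    """
    rng = random.Random(seed)
    current = list(c0)  # Step 1: initial point
    current_logp = log_target(current)
    chain = []
    accepted = 0
    for _ in range(n_samples):  # Step 4: repeat steps 2 and 3
        # Step 2 (proposing): uniform perturbation scaled by each prior range.
        candidate = [
            x + step * (hi - lo) * (2.0 * rng.random() - 1.0)
            for x, lo, hi in zip(current, lower, upper)
        ]
        if all(lo <= x <= hi for x, lo, hi in zip(candidate, lower, upper)):
            candidate_logp = log_target(candidate)
            # Step 3 (moving): symmetric proposal, so P = min(1, L(new) / L(current)).
            accept_prob = math.exp(min(0.0, candidate_logp - current_logp))
            if rng.random() <= accept_prob:
                current, current_logp = candidate, candidate_logp
                accepted += 1
        chain.append(list(current))
    return chain, accepted / n_samples


def toy_log_target(c):
    """Hypothetical one-parameter Gaussian target (mean 0.3, standard deviation 0.05)."""
    return -((c[0] - 0.3) ** 2) / (2.0 * 0.05 ** 2)


chain, acceptance_rate = metropolis_hastings(
    toy_log_target, c0=[0.9], lower=[0.0], upper=[1.0], step=0.1, n_samples=20000
)
burn_in = 1000  # discard early samples taken before the chain settles
chain_mean = sum(s[0] for s in chain[burn_in:]) / (len(chain) - burn_in)
```

The `step` argument plays the role of the proposal tuning parameter: in test runs it would be increased to lower the acceptance rate or decreased to raise it.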

Appendix C: Convergence of MCMC

[46] Since the Markov chain generated by the M-H algorithm is reversible, the standard ergodicity theorem in Markov chain theory states that if it is irreducible and aperiodic, the chain converges to a unique stationary distribution [Spall, 2003]. This means samples c(k) as k becomes sufficiently large are draws from the stationary distribution and can be used to make statistical inferences for the random variable.

[47] There are various techniques for monitoring convergence of MCMC simulation in practice, for example, running several parallel chains and visually inspecting the trace plots and autocorrelation sequences, monitoring the running means and standard deviations, or applying the Gelman-Rubin (G-R) diagnostic method. The idea of the G-R test is that if the simulated Markov chain has reached convergence, the within-run variation should be roughly equal to the between-run variation [Gelman and Rubin, 1992]. Specifically, denoting for each parameter component $c_i$ of vector c the samples from K parallel M-H runs of length N as $c_i^{n,k}$ (n = 1, 2, …, N; k = 1, 2, …, K), the between-run and within-run variances are defined as

$$B_i = \frac{N}{K-1} \sum_{k=1}^{K} \left( \bar{c}_i^{\,\cdot,k} - \bar{c}_i^{\,\cdot,\cdot} \right)^2, \qquad W_i = \frac{1}{K} \sum_{k=1}^{K} \left[ \frac{1}{N-1} \sum_{n=1}^{N} \left( c_i^{n,k} - \bar{c}_i^{\,\cdot,k} \right)^2 \right]$$

where $\bar{c}_i^{\,\cdot,k}$ is the mean of run k and $\bar{c}_i^{\,\cdot,\cdot}$ is the mean over all runs. The G-R scale reduction statistic is given by

$$GR_i = \sqrt{\frac{W_i (N-1)/N + B_i/N}{W_i}}$$

Once convergence is reached, $GR_i$ should approximately equal one.
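
A minimal sketch of the G-R calculation for one parameter component is given below. It uses the standard Gelman and Rubin [1992] definitions of the between-run and within-run variances; the function name and the synthetic chains, which are drawn from Gaussian distributions rather than from an M-H run, are assumptions made only for illustration.

```python
import math
import random


def gelman_rubin(chains):
    """G-R scale reduction statistic for K parallel chains of equal length N."""
    k = len(chains)
    n = len(chains[0])
    chain_means = [sum(chain) / n for chain in chains]
    grand_mean = sum(chain_means) / k
    # Between-run variance B and within-run variance W.
    between = n / (k - 1) * sum((m - grand_mean) ** 2 for m in chain_means)
    within = sum(
        sum((x - m) ** 2 for x in chain) / (n - 1)
        for chain, m in zip(chains, chain_means)
    ) / k
    return math.sqrt(((n - 1) / n * within + between / n) / within)


rng = random.Random(1)
# Three chains sampling the same distribution: the statistic should be close to one.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Two chains centered on different values: the statistic should be well above one.
stuck = [
    [rng.gauss(0.0, 1.0) for _ in range(2000)],
    [rng.gauss(5.0, 1.0) for _ in range(2000)],
]
gr_mixed = gelman_rubin(mixed)
gr_stuck = gelman_rubin(stuck)
```

In practice the statistic would be computed separately for each parameter component from the parallel M-H runs, and sampling would continue until all components give values close to one.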

Acknowledgments

[48] We greatly appreciate the two anonymous reviewers for their insightful comments. This research was supported by grants from the Terrestrial C Program at the Office of Science (Biological and Environmental Research, or BER), U.S. Department of Energy (DE-FG03-99ER62800), from the National Institute of Global Environmental Change South Central Regional Center, and from the National Science Foundation (DEB 0092642 and DEB 0444518). Research at the Duke Forest FACE (Free-Air Carbon Dioxide Enrichment) facility was supported by the Office of Science (BER) program, U.S. Department of Energy.
