[1] A computer program (PBUQ) that uses Monte Carlo simulations to propagate uncertainty through regression equations and the equation for the paleosol carbonate CO_{2} paleobarometer is presented. PBUQ includes options for all of the common approaches to determining values for input variables and incorporates several recent advancements relevant to determining values for soil-respired CO_{2} concentrations, δ^{13}C values of respired CO_{2}, δ^{13}C values of atmospheric CO_{2}, and temperatures of soil carbonate formation. PBUQ is intended to improve confidence in paleoatmospheric CO_{2} research by helping researchers draw statistically significant conclusions. PBUQ can also be used to attribute and partition error among various sources and thereby advance this technique. Sensitivity analysis indicates that S(z) is the largest source of uncertainty for most paleosols and that uncertainty is minimized for soils in which CO_{2} is an evenly balanced mixture between soil-derived and atmospheric components. Evenly balanced mixtures are most likely for paleosols formed in deserts and for weakly developed paleosols. Development of proxies for soil-respired CO_{2} concentrations and δ^{13}C values of soil-respired CO_{2} specifically for such soils is perhaps the most crucial next step for improving this technique. Currently, calcic paleosols are best used to test the significance of trends and/or differences among time slices in paleoatmospheric CO_{2} concentration. Application to quantifying Earth System Sensitivity will require large scale averaging of determinations from individual paleosols and/or reduced uncertainty associated with input variables.

[2] Since the early 1990s, concentrations of CO_{2} in Earth's ancient atmosphere have been calculated from the carbon isotope composition of paleosol carbonate [Cerling, 1991]. Such atmospheric CO_{2} records extend back to 420 Ma [e.g., Ekart et al., 1999; Mora et al., 1996; Royer et al., 2004] and provide valuable data for investigation of the geologic carbon cycle and the sensitivity of Earth's climate to atmospheric CO_{2}. There has been recent interest in improving the accuracy of this technique [Breecker et al., 2009, 2010; Cotton and Sheldon, 2012; Montañez, 2013; Montañez et al., 2007; Retallack, 2009], but with the exception of a few efforts [Breecker, 2010; Cerling, 1992; Montañez et al., 2007; Retallack, 2009] there has been little progress toward understanding or quantifying the uncertainty of these CO_{2} determinations. In contrast, such work has been done for most other CO_{2} proxies, including alkenones [Freeman and Pagani, 2005], stomatal index [Beerling et al., 2009], bryophyte fossils [Fletcher et al., 2008], and pedogenic gibbsite [Austin, 2011]. It is important to investigate the uncertainty associated with paleoatmospheric CO_{2} estimates for at least three reasons: (1) Momentum is gathering for the estimation of Earth's climate sensitivity based on past climates. Over the past decade, climate scientists and the IPCC have undertaken substantial efforts to quantify the uncertainty associated with predictions of future climate change. Studies that seek to inform the prediction of future climate, such as the estimation of climate sensitivity from past warm episodes, should undergo similarly rigorous uncertainty analysis [Beerling and Royer, 2011]. (2) Often the motivation for reconstructing ancient atmospheric CO_{2} concentrations involves hypotheses regarding whether or not atmospheric CO_{2} changed over a certain time interval. Quantitative error bars associated with atmospheric CO_{2} determinations would allow statistical testing of such hypotheses. (3) Quantifying uncertainty allows partitioning among the various sources of uncertainty, which leads to a better understanding of the technique and directs the research necessary for its improvement. For these reasons, I have developed a computer program that calculates the uncertainty associated with atmospheric CO_{2} concentrations determined from calcic paleosols.

2. Background

[3] Atmospheric CO_{2} concentrations can be determined from calcic paleosols because (1) pedogenic carbonate nodules record the δ^{13}C values of CO_{2} in soil pore spaces at the time of nodule formation and (2) δ^{13}C values of soil CO_{2} are controlled by mixing of soil-derived CO_{2} (CO_{2} respired by roots and heterotrophic soil organisms) and atmospheric CO_{2} [Cerling, 1984]. The paleosol carbonate CO_{2} paleobarometer is a mathematical expression for this gaseous mixing that has been rearranged such that atmospheric CO_{2} concentrations can be calculated from (1) δ^{13}C values of both end-members (soil-derived CO_{2} and atmospheric CO_{2}), (2) the δ^{13}C value of soil CO_{2}, and (3) the respiration-contributed concentration of CO_{2} in soil pore spaces. The name is somewhat of a misnomer because the technique gives CO_{2} concentration, unlike some other paleoatmospheric CO_{2} proxies such as stomatal frequency, which record the CO_{2} pressure and must therefore be corrected for paleoelevation to determine sea level pCO_{2} [Kürschner et al., 2008]. The equation used to calculate atmospheric CO_{2} concentration is [Cerling, 1999]:

[CO_{2}]_{atm} = S(z) × (δ^{13}C_{s} − 1.0044 δ^{13}C_{r} − 4.4) / (δ^{13}C_{a} − δ^{13}C_{s})    (1)

where [CO_{2}]_{atm} is the atmospheric CO_{2} concentration, S(z) is the concentration of soil CO_{2} that is contributed by soil respiration (i.e., S(z) = soil CO_{2} concentration − [CO_{2}]_{atm}), δ^{13}C is the carbon isotope composition expressed in the standard per mil notation, and the subscripts s, r, and a refer to soil CO_{2}, soil-respired CO_{2}, and atmospheric CO_{2}, respectively. The expression on the right-hand side of equation (1) can be conceptualized by considering its two terms: S(z) and the ratio by which S(z) is multiplied, henceforth referred to as R. R is equivalent to the ratio of atmospheric CO_{2} to S(z), which varies according to the mixing of these two end-members. It is a common misconception that atmospheric CO_{2} penetrates only a shallow distance into soils. This may be true for soils in which CO_{2} is consumed (e.g., by dissolution in downward percolating water) faster than the soil pore spaces can exchange with the atmosphere (water saturation slows soil-atmosphere exchange). In most soils, however, CO_{2} is not consumed, and the atmospheric component exists throughout the soil. In other words, if the rate of soil respiration were zero, pore spaces would be filled with atmospheric CO_{2}. For nonzero soil respiration rates, respired CO_{2} accumulates in the soil and mixes with the atmospheric component. The respired CO_{2} also diffuses out of the soil into the atmosphere (so does the atmospheric component, but this flux out is balanced by an equivalent flux into the soil for the atmospheric component). The translational velocity of ^{12}CO_{2} molecules is higher than that of ^{13}CO_{2} molecules, and therefore, ^{12}CO_{2} preferentially escapes from the soil into the atmosphere by diffusion, leaving the residual pore space CO_{2} relatively enriched in ^{13}CO_{2}. This process is accounted for in equation (1) by the factor 1.0044 and the constant 4.4. Respired CO_{2} does not, however, “push” atmospheric CO_{2} out of soils, as there is little pressure change that occurs with respiration (the respiratory quotient for soils is approximately one, although its precise value is poorly known). The influence of the atmospheric component on the δ^{13}C value of total soil CO_{2} decreases hyperbolically as S(z) increases, such that for a modern soil (δ^{13}C_{a} = −8.5‰, δ^{13}C_{r} = −26‰, [CO_{2}]_{atm} = 395 ppmV), the atmospheric component has an effect of 6‰ at S(z) = 500 ppmV, 1‰ at S(z) = 5000 ppmV, and 0.2‰ at S(z) = 25,000 ppmV. The barometer relies on the atmospheric CO_{2} component having a measurable influence on the δ^{13}C value of total soil CO_{2} and thus paleosol carbonate.
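The hyperbolic decrease quoted above can be checked with a short calculation. The Python sketch below is illustrative only and is not part of PBUQ. It assumes that equation (1) has the Cerling [1999] form [CO_{2}]_{atm} = S(z) × (δ^{13}C_{s} − 1.0044 δ^{13}C_{r} − 4.4) / (δ^{13}C_{a} − δ^{13}C_{s}), rearranged here to solve for δ^{13}C_{s}; the function names are hypothetical.

```python
def soil_co2_d13c(s_z, co2_atm, d13c_r, d13c_a):
    """delta13C of total soil CO2, from the assumed form of equation (1) solved for d13C_s."""
    # Respired end-member after diffusive enrichment (factor 1.0044, constant 4.4)
    respired_end_member = 1.0044 * d13c_r + 4.4
    # Concentration-weighted mixture of atmospheric and respired components
    return (co2_atm * d13c_a + s_z * respired_end_member) / (co2_atm + s_z)


def atmospheric_effect(s_z, co2_atm=395.0, d13c_r=-26.0, d13c_a=-8.5):
    """Shift (per mil) of soil CO2 d13C relative to a soil with no atmospheric component."""
    no_atmosphere = 1.0044 * d13c_r + 4.4
    return soil_co2_d13c(s_z, co2_atm, d13c_r, d13c_a) - no_atmosphere


# Modern-soil values quoted in the text
for s_z in (500.0, 5000.0, 25000.0):
    print(f"S(z) = {s_z:7.0f} ppmV: atmospheric effect = {atmospheric_effect(s_z):.2f} per mil")
```

Under these assumptions the computed shifts round to the values of about 6‰, 1‰, and 0.2‰ given in the text.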

[4] Values for the variables on the right-hand side of equation (1) have been determined using a number of different methods (Figure 1). All methods involve measuring the δ^{13}C value of paleosol carbonate as a proxy for the δ^{13}C value of soil CO_{2}. Cerling [1992] and Cotton and Sheldon [2012] emphasized the advantage of measuring the δ^{13}C values of bulk organic matter or organic matter occluded in carbonate nodules as a direct proxy for δ^{13}C_{r}. However, the most widely applied approach [e.g., Ekart et al., 1999] involves δ^{13}C_{a} and δ^{13}C_{r} determined from measured δ^{13}C values of contemporaneous marine carbonate assuming carbon isotope equilibrium between the atmosphere and the ocean, a constant carbon isotope fractionation factor between atmospheric CO_{2} and carbon in land plants, and that the value of δ^{13}C_{r} equals the δ^{13}C value of plants. Other approaches involve δ^{13}C_{a} and δ^{13}C_{r} determined from δ^{13}C values of (1) penecontemporaneous coal or other well-preserved organic matter using similar assumptions [e.g., Montañez et al., 2007] or (2) penecontemporaneous fossilized plant materials such as hackberry seeds [e.g., Retallack, 2009]. The relationship between δ^{13}C_{a} and the δ^{13}C value of C_{3} plants was quantified by Arens et al. [2000]. In virtually all studies, the temperature of soil carbonate formation is assumed to calculate the δ^{13}C value of CO_{2} with which the carbonate formed in equilibrium. Values for S(z) are typically based on measured growing season CO_{2} concentrations in surface soils [Brook et al., 1983].

[5] More recently, global relationships that account for aridity have been developed between δ^{13}C_{a} and δ^{13}C values of C_{3} leaves [e.g., Diefendorf et al., 2010; Kohn, 2010]. These relationships are particularly important for application to calcic soils [Cerling, 1992], many of which form in climates where the vegetation undergoes water stress. The temperature of paleosol carbonate formation can now be measured [Passey et al., 2010; Peters et al., 2013; Quade et al., 2013], and S(z) can be determined from depth to the paleosol Bk horizon [Retallack, 2009], paleomean annual precipitation [Cotton and Sheldon, 2012], and/or identification of the soil order that best corresponds to the paleosol of interest [Montañez, 2013]. Furthermore, Cenozoic variations in δ^{13}C_{a} have been well quantified [Passey et al., 2002; Tipple et al., 2010]. These recent advancements lay the groundwork for reducing and quantifying uncertainty associated with [CO_{2}]_{atm} determinations and motivated the work described here.

[6] PBUQ is a MATLAB^{®} code (supporting information) that propagates error through equation (1). Unless otherwise specified, the ± errors reported in this paper represent 1 standard deviation. The values of the variables on the right-hand side of equation (1) are not independent from one another; in particular, δ^{13}C_{r} is controlled by δ^{13}C_{a}. S(z) and δ^{13}C_{r} are also likely related to one another given that respiration rates [e.g., Orchard and Cook, 1983] and δ^{13}C values of vegetation [e.g., Farquhar et al., 1989] are influenced by water stress. Therefore, PBUQ uses Monte Carlo simulations to propagate uncertainty rather than Gaussian error propagation, which assumes independence of variables. PBUQ calculates probability density functions (pdfs) for each of the variables in equation (1), including [CO_{2}]_{atm}. The pdfs are generated using 10,000 iterations. PBUQ starts with measured values and their associated ± errors (assumed to be Gaussian) specified by the user and then asks the user to specify which methods and regression equations (Figure 1) to use to calculate values of S(z) and the three required δ^{13}C values. Errors associated with both measured values and regression equations (Table 1) are considered by PBUQ. During each iteration, values for measured variables are randomly selected from normal distributions defined by the user-specified means and standard deviations of replicate measurements. These randomly selected values are used in regression equations to determine values for variables in equation (1). There are two types of regression equations considered in PBUQ: “Y from X” and “X from Y” (Table 1); the former is used when values of an independent variable are known (± some error) and values of the dependent variable are needed for use in equation (1) (or are needed in order to calculate, using another regression equation, the value of a variable in equation (1)), whereas the latter is used when values of a dependent variable are known. The data used to generate each of the regression equations have been reported previously (see citations in Table 1), although some of the fitted regression lines differ from those previously reported as described later. The type of the regression equation depends on which values the user knows or can independently calculate. For instance, the regression equation relating CIA-K (chemical index of alteration without potassium) to MAP [Sheldon et al., 2002] would be “X from Y” because the dependent variable CIA-K is measured to determine the value of the independent variable MAP. The regression equation relating S(z) to mean annual precipitation (MAP) [Cotton and Sheldon, 2012] would be “Y from X” because the independent variable MAP is used to determine values of the dependent variable S(z). The two types of regression equations are treated separately in PBUQ.

[7] For regression equations in which the independent variable is measured to determine values of the dependent variable (i.e., “Y from X”), the standard error (SE) of predicted new observations is calculated following Davis [2002]. This SE and a mean value of zero are used to define a Gaussian pdf of ΔY (i.e., a pdf of errors on Y). During each iteration, a value randomly selected from this pdf is added to the value of Y that lies on the regression line at the input value of X. Multiple iterations result in normally distributed pdfs for Y. For regression equations in which X must be estimated from Y, the method described by Sokal and Rohlf [1981] is used to calculate confidence intervals around estimated values of X. During each iteration, a value randomly selected from the t-distribution is used to evaluate the expression for confidence intervals [Sokal and Rohlf, 1981]. The resulting array of confidence limits defines a pdf for X. Both the “Y from X” and “X from Y” methods assume that error is normally distributed about the regression line.

[8] Error is propagated through transfer functions for which the value of the input variable is itself uncertain (i.e., the value of the input variable varies among iterations) in two ways: (1) For “Y from X” regression equations, an error pdf with a mean value of zero is generated from the SE of new observations as described above for pdfs of Y. The value of X assigned for each iteration is used to evaluate Y (using the regression equation), to which a value randomly selected from the error pdf is then added. Therefore, multiple iterations result in a pdf for Y that incorporates uncertainty in X and uncertainty in the regression equation. (2) For “X from Y” transfer functions, the value of Y assigned for each iteration is used to evaluate the expression for confidence intervals around X.
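The “Y from X” case described above can be sketched in Python as follows. This is a minimal illustration, not the PBUQ code: it assumes the textbook expression for the standard error of a predicted new observation, s·sqrt(1 + 1/n + (x_0 − x̄)^2/Σ(x − x̄)^2), which may differ in detail from the formulation of Davis [2002], and it evaluates that SE at each iteration's value of X, which is also an assumption. The calibration data and function names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit, returning what is needed to compute prediction error."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s_resid = math.sqrt(sse / (n - 2))  # residual standard deviation
    return {"n": n, "x_mean": x_mean, "sxx": sxx,
            "slope": slope, "intercept": intercept, "s_resid": s_resid}


def se_new_observation(fit, x0):
    """Standard error of a predicted new observation of Y at X = x0."""
    return fit["s_resid"] * math.sqrt(
        1.0 + 1.0 / fit["n"] + (x0 - fit["x_mean"]) ** 2 / fit["sxx"])


def sample_y_from_x(fit, x_mean, x_sd, n_iter=10_000, seed=0):
    """Monte Carlo pdf of Y that includes uncertainty in X and in the regression."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n_iter):
        x = rng.gauss(x_mean, x_sd)                    # uncertain input value of X
        y_line = fit["intercept"] + fit["slope"] * x   # value on the regression line
        delta_y = rng.gauss(0.0, se_new_observation(fit, x))  # draw from the error pdf
        ys.append(y_line + delta_y)
    return ys


# Hypothetical calibration data (not taken from Table 1)
cal_x = [200.0, 400.0, 600.0, 800.0, 1000.0]
cal_y = [310.0, 480.0, 730.0, 890.0, 1120.0]
fit = fit_line(cal_x, cal_y)
ys = sample_y_from_x(fit, x_mean=500.0, x_sd=50.0)
```

The “X from Y” case is not sketched here because it depends on the specific confidence-interval expression of Sokal and Rohlf [1981].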

[9] PBUQ reads a text file of measured or otherwise prescribed values and their associated standard deviations. Each row of the text file is for one paleosol and each column is for a certain prescribed variable. Columns are identified by a number in the first row (see program files in supporting information). PBUQ first creates an array that contains the ages of each paleosol included in the text file. Next, PBUQ calculates δ^{13}C_{s}, then S(z), then δ^{13}C_{r} and δ^{13}C_{a} (details below), and finally [CO_{2}]_{atm} and R using equation (1). Values of [CO_{2}]_{atm} and R are calculated during each iteration. Negative [CO_{2}]_{atm} values are reassigned as zeros and a warning message is generated. The values that are calculated by PBUQ vary with user-specified method but at a minimum consist of, for each paleosol, 10,000 values of δ^{13}C_{s}, S(z), δ^{13}C_{r}, δ^{13}C_{a}, [CO_{2}]_{atm}, and R. PBUQ plots median [CO_{2}]_{atm} and R values versus age with error bars extending to the 16th and 84th percentiles (i.e., the middle 68%). PBUQ also plots histograms of S(z), 1.0044 (δ^{13}C_{r}) + 4.4, δ^{13}C_{s}, and δ^{13}C_{a} for the first paleosol listed in the input text file. Output data are matrices and can be copied and pasted for use in other applications. Output plots can be edited using MATLAB and saved in various graphics formats.
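The per-paleosol loop described above (draw inputs, evaluate equation (1), reset negative results to zero, summarize with the median and the 16th and 84th percentiles) can be sketched in Python as follows. This is a simplified illustration rather than the PBUQ implementation: it assumes the Cerling [1999] form of equation (1), it draws the four inputs as independent Gaussians (PBUQ instead derives some inputs from others so that their dependence is retained), and all input values and names are hypothetical.

```python
import random


def barometer(s_z, d13c_s, d13c_r, d13c_a):
    """Assumed form of equation (1): [CO2]atm = S(z) * R."""
    return s_z * (d13c_s - 1.0044 * d13c_r - 4.4) / (d13c_a - d13c_s)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    k = round(pct / 100.0 * (len(sorted_values) - 1))
    return sorted_values[k]


def monte_carlo_co2(inputs, n_iter=10_000, seed=0):
    """inputs maps a variable name to (mean, 1-sigma); returns summary statistics."""
    rng = random.Random(seed)
    results = []
    n_negative = 0
    for _ in range(n_iter):
        # Simplification: independent Gaussian draws for every input
        draw = {name: rng.gauss(mean, sd) for name, (mean, sd) in inputs.items()}
        co2 = barometer(draw["s_z"], draw["d13c_s"], draw["d13c_r"], draw["d13c_a"])
        if co2 < 0.0:
            co2 = 0.0          # negative concentrations are reassigned as zero
            n_negative += 1
        results.append(co2)
    results.sort()
    return {"p16": percentile(results, 16), "median": percentile(results, 50),
            "p84": percentile(results, 84), "n_negative": n_negative}


# Hypothetical paleosol: values chosen only for illustration
example = {"s_z": (1000.0, 300.0), "d13c_s": (-17.0, 0.3),
           "d13c_r": (-25.0, 0.5), "d13c_a": (-6.5, 0.3)}
summary = monte_carlo_co2(example)
if summary["n_negative"]:
    print(f"warning: {summary['n_negative']} negative values reassigned as zero")
print(summary)
```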

3.1. δ^{13}C_{s}

[10] PBUQ asks the user how to determine the temperature of carbonate formation. The options are (1) Δ_{47} measurements, (2) alkali ratio, or (3) other mean annual temperature (MAT) estimate. The mean and standard deviation of other, user-specified values (option 3 above and all user-specified values referred to below) are entered in the input text file. This option allows the user to implement a proxy that is not included in PBUQ. For option 1, PBUQ takes as input the temperatures corresponding to measured Δ_{47} values of paleosol carbonate [e.g., Ghosh et al., 2006]. For options 2 and 3, PBUQ takes as input temperatures corresponding to measured alkali ratios [Sheldon et al., 2002] or MAT determined some other way and assumes that the prescribed values of MAT equal mean annual air temperatures (MAAT). PBUQ then calculates the temperature of soil carbonate formation from MAAT using a transfer function developed here that regresses previously reported Δ_{47} temperatures on MAAT [Passey et al., 2010; Quade et al., 2013]. The equation for this regression line is Y = 0.506 * X + 17.974, where Y is carbonate formation temperature and X is MAAT. PBUQ also assigns errors to the temperatures as (1) prescribed in the appropriate column of the input text file and (2) associated with calculating carbonate formation temperature from MAAT. The SE associated with MAT calculated from alkali ratios is held constant at ± 4.4°C [Sheldon et al., 2002].
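As an illustration of options 2 and 3, the Python sketch below propagates an uncertain MAAT through the regression line given above (Y = 0.506 * X + 17.974). It is not the PBUQ code. The default MAT error of ± 4.4°C is the alkali ratio value quoted in the text; the `regression_se` argument is a hypothetical placeholder for the scatter about the regression line, whose value is not given here and must be supplied by the user.

```python
import random
import statistics


def carbonate_formation_temperature(maat_c):
    """Carbonate formation temperature (deg C) from MAAT using the regression in the text."""
    return 0.506 * maat_c + 17.974


def sample_formation_temperature(mat_mean, mat_sd=4.4, regression_se=0.0,
                                 n_iter=10_000, seed=0):
    """Monte Carlo sample of formation temperature for one paleosol.

    regression_se is a hypothetical stand-in for the regression error term;
    the default of 0.0 ignores it and therefore understates the uncertainty.
    """
    rng = random.Random(seed)
    temps = []
    for _ in range(n_iter):
        maat = rng.gauss(mat_mean, mat_sd)           # uncertain MAT, taken as MAAT
        temp = carbonate_formation_temperature(maat)
        temp += rng.gauss(0.0, regression_se)        # scatter about the regression line
        temps.append(temp)
    return temps


temps = sample_formation_temperature(mat_mean=10.0)
print(statistics.mean(temps), statistics.stdev(temps))
```

Because the slope is about 0.5, an error of ± 4.4°C on MAT maps to roughly ± 2.2°C on formation temperature before any regression error is added.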

3.2. S(z)

[11] Next, PBUQ assigns values relevant to calculating S(z). The options for determining the value of S(z) are (1) a transfer function between mean annual precipitation (MAP) and S(z) [Cotton and Sheldon, 2012], (2) a transfer function between the depth to the horizon of calcium carbonate accumulation (Bk) and S(z) [Retallack, 2009], (3) soil order-specific ranges assigned based on paleosol morphology [Montañez, 2013], and (4) another user-specified value. Six modifications were made to the S(z) values reported by Montañez [2013]. (1) The formal definition of epsilon presented by Romanek et al. [1992] was used to calculate δ^{13}C_{s} from the measured δ^{13}C values of pedogenic carbonates. (2) An atmospheric CO_{2} concentration of 280 ppmV was used for surface soils. For buried soils in the compilation that are well dated, the mean δ^{13}C values and concentrations of atmospheric CO_{2} during soil formation were estimated from Holocene and deglacial ice core records [Indermühle et al., 1999; Lourantou et al., 2010] and used in the inverted barometer equation to calculate S(z). (3) A constant 1‰ was added to the measured δ^{13}C values of SOM from surface soil A horizons to account for the Suess effect; no correction for the Suess effect was made to δ^{13}C values of B horizon SOM or buried soil SOM. (4) A constant 0.5‰ was added to the Suess effect-corrected δ^{13}C values of A horizon SOM and a constant 1‰ was subtracted from the δ^{13}C values of B horizon SOM to calculate δ^{13}C values of respired CO_{2}. PBUQ makes the same corrections to measured paleosol OM δ^{13}C values as justified later. (5) Three outlier Inceptisol S(z) values were modified from the values calculated by Montañez [2013]. The δ^{13}C value of litter, rather than C horizon SOM, reported by Laskar et al. [2010] for a Western Indian Inceptisol was used as a proxy for δ^{13}C_{r} (2‰ was added to the measured δ^{13}C value of the litter to account for the Suess effect), which resulted in S(z) values similar to those calculated for other Inceptisols. A large proportion of root respiration might explain correspondence between litter and respired CO_{2} δ^{13}C values in this soil. (6) Four of the soils studied by Kelly et al. [1991] were removed from the compilation because the measured δ^{13}C value of organic matter in these soils substantially decreased with depth below 50–75 cm, making estimates of δ^{13}C_{r} highly uncertain. The soils studied by Quade et al. [1989] were also removed from the compilation because δ^{13}C values of SOM were not measured in that study. These modifications decreased the number of negative S(z) values calculated from 25 to 6 and decreased the number of S(z) values greater than 10,000 ppmV from 10 to 1. S(z) values for “equilibrium” and “nonequilibrium” carbonate-SOM pairs (as defined by Montañez [2013]) are grouped together in the arrays of S(z) values assigned to each soil order in PBUQ. Most of the “nonequilibrium” pairs in the compilation have Δ^{13}C_{carbonate-OM} > 16‰, which suggests formation of carbonate at low soil pore space pCO_{2} [Montañez, 2013] and does not require that the carbonate formed in equilibrium with CO_{2} from a source other than the measured SOM. It may therefore be preferable to avoid applying the term “nonequilibrium” to all soils with Δ^{13}C_{carbonate-OM} > 12.5‰.

[12] The S(z) values used in the regression of S(z) on depth to Bk correspond to the late growing season “shoulder” of soil CO_{2} concentrations [Retallack, 2009], which is a reasonable choice given the current conceptual model of calcium carbonate accumulation in soils [e.g., Birkeland, 1999]. However, these late growing season shoulder values are substantially higher than the near-minimum growing season values suggested by empirical studies involving the timing of isotopic equilibrium between calcite-CO_{2} and calcite-water [Breecker et al., 2009] and inversion of equation (1) to solve for S(z) using data from Holocene soils [Montañez, 2013]. Furthermore, application of a regression of minimum growing season S(z) on MAP resulted in good agreement between paleosol carbonate-based and other paleoatmospheric CO_{2} proxies for the late Miocene [Cotton and Sheldon, 2012], a period during which the other paleoatmospheric CO_{2} proxies agree well. Therefore, I include the S(z) from depth to Bk transfer function [Retallack, 2009] as an option for completeness, but I suggest that its application results in overestimation of ancient atmospheric CO_{2} concentrations.

[13] If option 1 is chosen, PBUQ asks the user which of the following methods should be used to determine MAP: (1) chemical index of alteration without potassium (CIA-K) [Sheldon et al., 2002], (2) depth to Bk [Retallack, 2005], or (3) another estimate of MAP. For suboptions 1 and 2, PBUQ uses regressions of the dependent variable (i.e., CIA-K or depth to Bk) onto the independent variable (MAP), and therefore, the regression line equations used are “X from Y” functions and differ from the equations originally reported. The same is true for the S(z) from depth to Bk regression line equation [Retallack, 2009]. A log transformation of the depth to Bk (the dependent variable) was used for the regression of depth to Bk on MAP and a log transformation of MAP (the independent variable) was used for the regression of CIA-K on MAP. Box-Cox transformations [Box and Cox, 1964] indicated that these log transformations are nearly optimum within the family of transformations considered.

3.3. δ^{13}C_{r} and δ^{13}C_{a}

[14] PBUQ then assigns values relevant to calculating δ^{13}C_{r} and δ^{13}C_{a}. PBUQ uses several constant values for relating δ^{13}C values of carbon reservoirs and fluxes. The δ^{13}C value of soil-respired CO_{2} is typically less negative than the carbon isotope composition of leaves. PBUQ uses 2.6 ± 0.6‰ for this difference, which is the mean and standard deviation of a compilation of measured δ^{13}C values in forests [Bowling et al., 2008].

[15] It is assumed that the difference between the δ^{13}C values of bulk paleosol organic matter and the original soil organic matter equals 0.0 ± 0.5‰. The magnitude of the difference between the δ^{13}C values of soil-respired CO_{2} and the original soil organic matter (Δ^{13}C_{SRCO2-SOM}) depends on whether the organic matter was collected from the A or B horizon of the paleosol. The value of Δ^{13}C_{SRCO2-SOM} is set to 0.5 ± 0.5‰ and −1 ± 0.5‰ for organic matter collected from the A and B horizons, respectively. The former value is based on δ^{13}C values reported by Bowling et al. [2008]. The latter value of Δ^{13}C_{SRCO2-SOM} used here is based on the former and the mean difference between δ^{13}C values of A and B horizon SOM in an archived soil investigated by Torn et al. [2002]. Many other studies have shown that δ^{13}C values of organic matter increase with depth in modern, surface soils [e.g., Feng et al., 1999; Wynn and Bird, 2007; Wynn et al., 2005]; the values reported by Torn et al. [2002] are used here because they minimize complications associated with the Suess effect. More rapid decomposition of C_{4}- as compared with C_{3}-derived organic matter may result in δ^{13}C values of respired CO_{2} that are substantially higher than δ^{13}C values of organic matter in mixed C_{3}–C_{4} soils [Wynn and Bird, 2007]. PBUQ does not consider this effect, but its magnitude should be investigated with future work. Bowen and Beerling [2004] concluded, based on a numerical model coupling CO_{2} production and diffusion with soil organic matter dynamics, that soil-respired CO_{2} has lower δ^{13}C values than B horizon organic matter because most of the CO_{2} in the soil pore spaces (even in the B horizon pore spaces) is produced in the A horizon. 
However, in several central New Mexico soils during the time of soil carbonate formation, the average depth of soil respiration was substantially deeper than the A horizon and soil-respired CO_{2} was primarily from the rhizosphere as opposed to decomposition of SOM [Breecker et al., 2012]. These conditions resulted in δ^{13}C values of soil-respired CO_{2} that were 1–3‰ lower than δ^{13}C values of the SOM occurring in the top 20 cm of the New Mexico soils [Breecker et al., 2012]. Therefore, deep autotrophic respiration may be more important in controlling the δ^{13}C values of CO_{2} in soil Bk horizons than assumed here and by Bowen and Beerling [2004]. Empirical studies comparing the carbon isotope composition of soil-respired CO_{2} and SOM should be undertaken in the future to evaluate the accuracy of the values used here.
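The offsets described above can be combined into a Monte Carlo estimate of δ^{13}C_{r} from a measured paleosol organic matter value, as in the Python sketch below. It is an illustration and not the PBUQ code: the offsets are those stated in the text (0.0 ± 0.5‰ for the paleosol OM to original SOM difference; Δ^{13}C_{SRCO2-SOM} of 0.5 ± 0.5‰ for A horizons and −1 ± 0.5‰ for B horizons), while the assumption that the three terms are independent Gaussians, the example input, and the function names are mine.

```python
import random
import statistics

# Delta13C(SRCO2-SOM) by horizon, as (mean, 1-sigma) in per mil, from the text
SRCO2_SOM_OFFSET = {"A": (0.5, 0.5), "B": (-1.0, 0.5)}
# Difference between bulk paleosol OM and original SOM, (mean, 1-sigma) in per mil
PALEOSOL_OM_MINUS_SOM = (0.0, 0.5)


def sample_d13c_r(d13c_om_mean, d13c_om_sd, horizon, n_iter=10_000, seed=0):
    """Monte Carlo sample of d13C of soil-respired CO2 from paleosol organic matter."""
    offset_mean, offset_sd = SRCO2_SOM_OFFSET[horizon]
    rng = random.Random(seed)
    values = []
    for _ in range(n_iter):
        measured = rng.gauss(d13c_om_mean, d13c_om_sd)
        # Remove the (zero-mean) preservation offset to recover original SOM
        original_som = measured - rng.gauss(*PALEOSOL_OM_MINUS_SOM)
        # Respired CO2 = original SOM + Delta13C(SRCO2-SOM)
        values.append(original_som + rng.gauss(offset_mean, offset_sd))
    return values


# Hypothetical B horizon measurement: -24.0 +/- 0.1 per mil
d13c_r = sample_d13c_r(-24.0, 0.1, horizon="B")
print(statistics.mean(d13c_r), statistics.stdev(d13c_r))
```

With these inputs the two ± 0.5‰ terms dominate, giving a combined 1σ of about 0.7‰ on δ^{13}C_{r} regardless of how precisely the organic matter is measured.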

[16] PBUQ includes four options for calculating δ^{13}C_{r}: (1) from bulk paleosol organic matter, (2) from organic matter occluded in carbonate or otherwise well preserved during diagenesis, (3) from δ^{13}C_{a} (using Kohn [2012]), and (4) another estimate of δ^{13}C_{r}. For options 1 and 2, PBUQ requires the user to specify whether the organic matter was collected from the A or B horizons of the paleosols so that the appropriate Δ^{13}C_{SRCO2-SOM} can be applied. For option 3, PBUQ includes four options for calculating δ^{13}C_{a}: (1) using a published Cenozoic record derived from measured δ^{13}C values of marine benthic foraminifera carbonate [Tipple et al., 2010], (2) from contemporaneous marine carbonates using Passey et al. [2002], (3) from contemporaneous, well-preserved, humid-climate organic matter such as coal using Arens et al. [2000], and (4) another estimate of δ^{13}C_{a}. PBUQ includes an option for using an estimate of MAP, if available, to more precisely calculate δ^{13}C_{r} from δ^{13}C_{a} using Kohn [2010]. MAP can be calculated using one of the three options described above. If no MAP estimate is used, then PBUQ randomly selects from all data points compiled by Kohn [2010] for MAP < 1000 mm, the typical range in which modern soils contain carbonate-bearing B and C horizons. Use of the relationship reported by Arens et al. [2000] to determine δ^{13}C_{r} values from δ^{13}C_{a} values is not included as an option in PBUQ because moisture stress must be considered [e.g., Kohn, 2012] to accurately use δ^{13}C_{a} to calculate the δ^{13}C value of organic matter in most calcic soils. If δ^{13}C_{r} is calculated independently of δ^{13}C_{a}, PBUQ includes an option for the calculation of δ^{13}C_{a} from δ^{13}C_{r}. Otherwise, δ^{13}C_{a} is calculated using one of the four options described above.

4. Results and Discussion

[17] The most uncertain variable used to calculate [CO_{2}]_{atm} with equation (1) is still S(z), despite recently improved constraints. Errors associated with soil order-based S(z) values are skewed right and range from +31%/−23% for Inceptisols to +300%/−65% for Vertisols (for the middle 68% around the median). Once error is propagated through the transfer functions, S(z) values determined from MAP [Cotton and Sheldon, 2012] are associated with ≥ ± 50% error. For S(z) calculated from MAP, the magnitude of the percent error increases as S(z) decreases (Figure S1). Sources of error other than S(z) are also important. Uncertainties associated with [CO_{2}]_{atm} are minimized when soil CO_{2} is an evenly balanced mixture of soil-derived and atmospheric CO_{2}, that is, S(z) ≈ [CO_{2}]_{atm} such that 0.3 < R < 1.8 [Breecker, 2010]. In that study, a sensitivity analysis was conducted by holding, one at a time, the variables on the right-hand side of equation (1) constant (i.e., no error) to determine the degree to which total [CO_{2}]_{atm} uncertainty was reduced. Minimum uncertainties result when S(z) ≈ [CO_{2}]_{atm} because under that condition calculated [CO_{2}]_{atm} values are relatively insensitive to variations in δ^{13}C_{s}, δ^{13}C_{r}, and δ^{13}C_{a} (Figure 2). This sensitivity increases as R departs from unity such that for soils with unbalanced soil CO_{2} mixtures the uncertainty associated with δ^{13}C_{s}, δ^{13}C_{r}, and δ^{13}C_{a} can be more important than uncertainty associated with S(z). Unfortunately, the S(z) values that would result in the most evenly balanced soil CO_{2} mixtures for much of the Phanerozoic (S(z) <1000 ppmV) [Berner, 2008; Breecker et al., 2010] are themselves highly uncertain when calculated from MAP using Cotton and Sheldon's [2012] calibration (Figure S1). 
However, S(z) values for evenly balanced Phanerozoic, and especially Cenozoic and Carboniferous, paleosols can perhaps be more precisely determined from MAP by restricting the regression of S(z) on MAP to soils formed in climates with MAP < 300 mm (Figure S1). The development of the MAP-based and other, novel proxies for S(z) in desert soils is, therefore, likely to make substantial improvements to our understanding of Phanerozoic atmospheric CO_{2}. However, further work is required to determine if S(z) is in fact related to MAP [Cotton and Sheldon, 2012] or if S(z) is poorly related to MAP and better related to actual evapotranspiration [Montañez, 2013], some other climatic variable or to soil property variables such as organic matter content and porosity. Furthermore, future work should investigate whether S(z) varies with [CO_{2}]_{atm} (the uniformitarian approach used here assumes that it does not), which might be expected considering the effects of CO_{2} fertilization on soil respiration [Bernhardt et al., 2006].
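Why calculated [CO_{2}]_{atm} is least sensitive to the δ^{13}C inputs near R = 1 can be illustrated with a first-order calculation. The Python sketch below is my own linearized illustration, not the Monte Carlo sensitivity analysis performed with PBUQ. It assumes the Cerling [1999] form of equation (1), takes the logarithmic derivative of [CO_{2}]_{atm} with respect to each δ^{13}C value, and combines the three in quadrature under the additional assumption of equal, independent 1‰ errors; the end-member values are the modern-soil values used earlier in the text.

```python
import math


def relative_sensitivities(r, d13c_r=-26.0, d13c_a=-8.5):
    """Fractional change in [CO2]atm per 1 per mil change in each d13C input.

    r is the ratio R = [CO2]atm / S(z). Returns magnitudes for d13C_s, d13C_r,
    d13C_a and their quadrature sum (equal, independent errors assumed).
    """
    respired = 1.0044 * d13c_r + 4.4       # diffusion-enriched respired end-member
    span = d13c_a - respired               # gap between the two end-members
    x = span * r / (1.0 + r)               # d13C_s minus respired end-member
    to_s = span / (x * (span - x))         # |d ln C / d d13C_s|
    to_r = 1.0044 / x                      # |d ln C / d d13C_r|
    to_a = 1.0 / (span - x)                # |d ln C / d d13C_a|
    combined = math.sqrt(to_s ** 2 + to_r ** 2 + to_a ** 2)
    return to_s, to_r, to_a, combined


for r in (0.1, 0.3, 1.0, 1.8, 10.0):
    to_s, to_r, to_a, combined = relative_sensitivities(r)
    print(f"R = {r:4.1f}: d13C_s {to_s:.2f}, d13C_r {to_r:.2f}, "
          f"d13C_a {to_a:.2f}, combined {combined:.2f} (fraction per per mil)")
```

In this linearized view the sensitivity to δ^{13}C_{s} is smallest exactly at R = 1, and the combined sensitivity rises steeply once R falls well below or well above unity, consistent with the behavior described above.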

[18] Consideration of soil CO_{2} balance is important for two other reasons. (1) Uncertainty associated with S(z) values calculated by rearranging equation (1) and applying it to modern soils for which [CO_{2}]_{atm} is known [Breecker et al., 2010; Montañez, 2013] is minimized for evenly balanced soil CO_{2} mixtures. Therefore, S(z) values > 600 ppmV calculated with this approach are substantially more uncertain than S(z) values < 600 ppmV. This uncertainty associated with S(z) is not quantified with PBUQ. (2) S(z) values calculated from the Fe(CO_{3})OH component of paleosol goethite range from 30,000 to >100,000 ppmV [Yapp and Poths, 1996], and thus these paleosols likely had highly unbalanced CO_{2} mixtures. Therefore, the goethite technique requires highly accurate and precise determinations of δ^{13}C_{r} and may have larger overall uncertainty than the paleosol carbonate approach even though the former involves fewer assumptions.

[19] The significant differences among mean S(z) values for various soil orders (Table S1) suggest that the paleosol morphologies best suited for atmospheric CO_{2} determinations vary through geologic time. For instance, the paleosol carbonate CO_{2} proxy has long been considered to work poorly for time periods when atmospheric CO_{2} was below 1000 ppmV [e.g., Ekart et al., 1999], but the low Inceptisol S(z) values (300 ± 100 ppmV) may eliminate this restriction. In fact, protosols are promising for the reconstruction of icehouse atmospheric CO_{2} concentrations, provided that values of other variables in equation (1) can be precisely determined. Paleo-Andisols might be suited for greenhouse climates when atmospheric CO_{2} was elevated, but the Andisol S(z) values are currently based on soils located in only one region and, therefore, may not be representative. Vertisols have a relatively high mean S(z) value (1500 ppmV) but the individual S(z) values are highly variable, and the distribution is more strongly skewed toward high values than are the distributions for other soil orders (Figure S2). Aridisols, Alfisols, and Mollisols have intermediate mean S(z) values and their paleosol analogs likely work best during climate transitions and are also a good compromise throughout the Phanerozoic. Given the substantial variance of S(z) for most soil orders, many individual paleosols may not conform to these generalizations. Therefore, in advance of quantifying uncertainty with PBUQ (and without knowing S(z) or [CO_{2}]_{atm}) the value of R can and should be calculated to screen paleosols and assess their suitability for the time period of interest. However, it should also be noted that averaging among multiple paleosols, even those that are somewhat poorly balanced, may in fact reduce total uncertainty on calculated [CO_{2}]_{atm}.
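
The screening step suggested above requires only the three δ^{13}C values. A minimal sketch follows, assuming the standard form of the isotope term, R = (δ^{13}C_{s} − 1.0044δ^{13}C_{r} − 4.4)/(δ^{13}C_{a} − δ^{13}C_{s}), and the window 0.3 < R < 1.8. The function names and the two example paleosols are hypothetical.

```python
def r_ratio(d13c_s, d13c_r, d13c_a):
    """Isotope term R, assuming the standard form of the paleobarometer."""
    return (d13c_s - 1.0044 * d13c_r - 4.4) / (d13c_a - d13c_s)


def is_evenly_balanced(d13c_s, d13c_r, d13c_a, r_min=0.3, r_max=1.8):
    """True if R falls inside the window taken to indicate a balanced mixture."""
    return r_min < r_ratio(d13c_s, d13c_r, d13c_a) < r_max


# Hypothetical (d13C_s, d13C_r, d13C_a) values in per mil, for illustration only.
candidates = {
    "paleosol A": (-13.6, -25.0, -6.5),
    "paleosol B": (-19.4, -25.0, -6.5),
}

for name, values in candidates.items():
    print(name, "R =", round(r_ratio(*values), 2),
          "balanced:", is_evenly_balanced(*values))
```

Because R does not depend on S(z), this check can be made before any assumption about soil-respired CO_{2} concentration is introduced.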

[20] The pdfs of calculated [CO_{2}]_{atm} are slightly skewed toward high values but not to the same degree as determinations made using other CO_{2} proxies such as stomatal frequency, which saturates at some level above 350 ppmV [Royer et al., 2001]. The skewness results from (1) skewed S(z) distributions, (2) the multiplication of nonzero variables (i.e., S(z) times R) each with an associated error, and (3) the reassignment of all negative [CO_{2}]_{atm} values as zeros. The skewness is minimized for evenly balanced soil CO_{2} mixtures because uncertainty associated with R decreases as R approaches 1. Therefore, paleosols with evenly balanced mixtures of soil CO_{2} are desirable not only because they minimize total uncertainty but also because (1) they provide better constraints on maximum CO_{2}, which other proxies poorly constrain, and (2) normal distributions are assumed in many statistical tests that might be usefully applied to the calculated atmospheric CO_{2} concentrations.
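
The second source of skewness listed above can be illustrated in isolation. In the sketch below, S(z) and R are both given symmetric (normal) errors, yet their product is skewed toward high values. The central values and the 30% errors are hypothetical and serve only to show the effect.

```python
import random


def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)
n = 50000

# Hypothetical symmetric errors of 30% on both factors.
s_z = [rng.gauss(400.0, 120.0) for _ in range(n)]
r = [rng.gauss(1.0, 0.3) for _ in range(n)]

# Product of the two uncertain factors, then negative values set to zero.
product = [a * b for a, b in zip(s_z, r)]
truncated = [max(0.0, p) for p in product]

print("skewness of S(z):", round(skewness(s_z), 2))
print("skewness of S(z) * R:", round(skewness(product), 2))
print("skewness after zeroing negatives:", round(skewness(truncated), 2))
```

A right-skewed S(z) distribution, as reported for most soil orders, would add to the skewness produced by the multiplication alone.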

[21] For most paleosols, values of R can be calculated much more precisely than values of S(z) and [CO_{2}]_{atm}. This brings up the possibility that factor changes in R might provide the most precise constraints on factor changes in [CO_{2}]_{atm}, which would be valuable because the number of doublings between two time slices, not absolute atmospheric CO_{2} concentrations, is required to determine Earth System Sensitivity. A required assumption for this method to work is that the S(z) values for the paleosols from two time slices are equivalent, that is, F = (R_{1}/R_{2})M, where F is the factor change in [CO_{2}]_{atm}, the subscripts 1 and 2 refer to time slices 1 and 2, respectively, and the multiplier M is the ratio S(z)_{2}/S(z)_{1}, which is assumed to equal one. Averaging S(z) values from numerous paleosols has been suggested to result in such an equivalence of S(z) through time [Breecker et al., 2010]. I calculate here that to reduce uncertainty associated with M to 1.0 ± 0.1, approximately 50 randomly selected paleosols within each time slice need to be averaged together (Figure S3).
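
The effect of averaging on M can be sketched with a simple Monte Carlo experiment. This is not the calculation behind Figure S3. It assumes that both time slices draw S(z) from the same lognormal population, with a hypothetical median of 1000 ppmV and a hypothetical sigma of 0.5 in ln S(z), and it reports how the scatter of M about one shrinks as more paleosols are averaged in each slice.

```python
import math
import random
import statistics


def simulate_m(n_paleosols, sigma_ln=0.5, trials=2000, seed=0):
    """Simulated values of M = mean S(z) in slice 2 / mean S(z) in slice 1."""
    rng = random.Random(seed)
    mu_ln = math.log(1000.0)  # hypothetical median S(z) of 1000 ppmV

    def mean_sz():
        # Average S(z) over n randomly selected paleosols in one time slice.
        draws = [rng.lognormvariate(mu_ln, sigma_ln) for _ in range(n_paleosols)]
        return sum(draws) / len(draws)

    results = []
    for _ in range(trials):
        slice_1 = mean_sz()
        slice_2 = mean_sz()
        results.append(slice_2 / slice_1)
    return results


for n_paleosols in (1, 5, 20, 50):
    spread = statistics.stdev(simulate_m(n_paleosols))
    print(n_paleosols, "paleosols per slice: sd of M =", round(spread, 2))
```

The scatter of M falls roughly with the square root of the number of paleosols averaged, so the number required for a given precision depends strongly on the assumed variance of S(z).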

[22] The current theoretical minimum uncertainty associated with atmospheric CO_{2} concentrations calculated from individual paleosols is +38%/−29% (for Inceptisols with “evenly balanced” soil CO_{2} mixtures). The uncertainty, as determined using PBUQ, associated with most published [CO_{2}]_{atm} values from individual paleosols is substantially larger. In particular, uncertainty associated with CO_{2} estimates from “poorly balanced” soils made by calculating δ^{13}C_{r} from δ^{13}C values of marine carbonates without use of MAP estimates is closer to +100%/−50%. At present, calcic paleosols are primarily useful for comparison of two or more time periods within which estimates from multiple individual paleosols have been averaged (as originally suggested by Cerling [1992]) and for evaluating the significance of trends in atmospheric CO_{2} through time. Uncertainties need to be reduced by large-scale averaging and by reducing uncertainties on input variables in order to provide useful constraints on Earth System Sensitivity.

5. Conclusions

[23] This work indicates that S(z) is the largest source of uncertainty associated with atmospheric CO_{2} concentrations determined from most, but not all, calcic paleosols. Minimum uncertainty for individual paleosols will be achieved by selecting paleosols for which S(z) ≈ [CO_{2}]_{atm} such that 0.3 < R < 1.8, provided that the required δ^{13}C values can be precisely determined for such paleosols. For much of the Phanerozoic, paleosols that satisfy this condition are likely to be those formed in deserts or those that are weakly developed. Future efforts should be dedicated to developing proxies for S(z) and δ^{13}C_{r} values in such soils. The computer program PBUQ is intended to help researchers draw statistically significant conclusions regarding paleoatmospheric CO_{2} concentrations. Moreover, such uncertainty quantifications are valuable for troubleshooting and improving paleoclimate proxies.

Acknowledgments

[24] Thanks to C. Jackson for helpful discussion. Comments from Nathan Sheldon and an anonymous reviewer helped improve this work. This research was supported by NSF grant 0922131 to D. Breecker.

Footnotes

1

Additional supporting information may be found in the online version of this article.