A conceptual and practical approach to data quality and analysis procedures for high-frequency soil respiration measurements


*Correspondence author. E-mail: savage@whrc.org


  1. Understanding the mechanisms regulating the efflux of carbon dioxide (CO2) from the soil to the atmosphere via soil respiration (SR) is a critical component of understanding terrestrial carbon (C) cycle responses to climate change, but requires high-quality measurements of SR fluxes. Thus, measurements of SR have become one of the primary tools used in terrestrial C cycling research.
  2. When developing a sampling strategy for SR measurements, researchers must consider the ultimate use of the data set. A weekly or bi-weekly manual sampling strategy is likely sufficient if the desired outcome is an annual estimate of CO2 efflux. However, if modelling SR on time scales from minutes to days is the purpose of the study, automated SR measurements are advantageous.
  3. Automated SR systems produce large volumes of data that present new challenges for quality assurance and quality control. A relatively efficient protocol to analyse large SR data sets is proposed here.
  4. Analysis of two large data sets provides information about systematic sampling uncertainties as well as random measurement errors. These must be taken into account when using automated SR measurements in any data–model fusion context.


Global climate change, driven by increasing atmospheric concentrations of carbon dioxide (CO2), is a foremost environmental concern, and considerable research has been focused on quantifying the components of the global carbon (C) cycle. Soil respiration (SR), representing an aggregation of below-ground processes by both heterotrophs and autotrophs, contributes 30–80% of the total respiratory efflux in most ecosystems (Davidson et al. 2002) and is therefore a major component of the C cycle. Understanding the mechanisms of, and potential changes to, the soil–atmosphere exchange of CO2 through SR is a critical aspect of understanding ecosystem responses to climate change. Thus, measurements of SR have become a primary tool for terrestrial carbon cycling research.

SR is influenced by several factors, primarily temperature, soil moisture, root growth and substrate supply (Davidson, Janssens & Luo 2006; Liu et al. 2006). Two primary goals of measuring SR are to determine an annual SR budget and to develop improved models of SR which move beyond simple temperature functions. Additionally, these measurements have been used to inform and constrain models of decadal scale changes in soil C stocks (Gaudinski et al. 2000), and to characterize seasonal and interannual patterns of below-ground C allocation (Davidson et al. 2002, 2006).

Since SR is influenced by factors which cause variation at hourly, diel, seasonal and interannual time scales, measurements must be made at the appropriate temporal frequency in order to address research questions at each of these temporal scales. Current SR measurement techniques include manual and automated chamber systems (Savage & Davidson 2003; Liu et al. 2006; Carbone & Vargas 2008). Manual measurements of SR are made by a researcher at discrete points in time (typically weekly or less frequently) but with sampling across the landscape, whereas autochambers function without supervision and more or less continuously but at a limited number of fixed sampling points.

Due to equipment constraints, manual sampling usually occurs only on days without precipitation. Thus, the immediate and potentially large effects of soil moisture changes on SR (Lee et al. 2004; Xu, Baldocchi & Tang 2004) are missed, which may bias estimates of annual SR. The limited sampling frequency requires interpolation or modelling to calculate annual fluxes, but spatially extensive sampling is possible, which provides good characterization of site heterogeneity (Stoyan et al. 2000).

With automated SR systems, each chamber is typically sampled every half hour, 24 h a day, 7 days a week, providing an abundance of data – potentially tens of thousands of measurements per year. The continuity of measurements permits modelling and statistical analysis of SR at time scales from minutes to months. Disadvantages of an automated system include more complicated maintenance, higher initial cost and more restricted spatial sampling imposed by power constraints, infrastructure and semi-permanent installation.

Several publications have compared different designs of SR chambers and addressed their sources of error and bias (Hutchinson & Livingston 2001; Davidson et al. 2002). Protocols for evaluating data quality have not been explicitly addressed, probably because manual measurements are usually so few (e.g. a few hundred per year) that most researchers can visually inspect their entire data sets. By comparison, within the community of researchers using the eddy covariance method to measure whole-ecosystem CO2 fluxes, standardized methods for processing and quality control have been developed and are being widely adopted (Papale et al. 2006; Moffat et al. 2007). Now that large quantities of data are being generated by automated SR systems at many research sites, there is a need for similar protocols to be developed for SR data.

Information about data uncertainties is needed for basic quality control, to make statistical comparisons between models and data, and to provide confidence intervals (CIs) on C budgets. As eddy covariance and SR data are increasingly being used in a data–model fusion context by C cycle researchers, there is a need for data errors and uncertainties to be characterized so that these can be accounted for in the assimilation process (Raupach et al. 2005). While this information is available for eddy covariance measurements (Richardson et al. 2006a, 2008), there are few comparable data for SR (Richardson et al. 2006b).

The objectives of this paper are to (i) derive a data quality protocol for automated SR data, (ii) evaluate random measurement error and systematic sampling uncertainties in automated SR data, (iii) compare manual and automated SR data, and (iv) use high frequency data to provide guidelines regarding an appropriate sampling strategy for manual SR measurements.


site and plot description

SR was measured at Harvard Forest near Petersham, MA, USA (42°32′N, 72°11′W), and at Howland Forest, near Howland, Maine, USA (45°12′N, 68°44′W). Harvard is a mixed hardwood forest, c. 70 years old, growing on well-drained sandy loam soils. The dominant tree species is red oak. Soils are classified as Typic Dystrochrepts, and the soil series is Canton fine sandy loam. Mean annual temperature is +7 °C and mean annual precipitation is 1026 mm (see Compton & Boone 2000 for further information).

Howland is a mature boreal transition forest, dominated by 160-year-old red spruce and eastern hemlock stands. Soils are classified as Aquic Haplorthods, and the soil series is Skerry fine sandy loam. Mean annual temperature is +6 °C, and mean annual precipitation is 1063 mm (see Fernandez, Rustad & Lawrence 1993 for further information).

manual sampling of sr

At both Harvard and Howland forests, manual SR measurements were made with dynamic chambers and a portable infrared gas analyser (Licor 6252 IRGA; see Savage & Davidson 2003 for chamber design details). SR was measured weekly during spring and summer and once or twice per month during fall and winter. Manual SR measurements have been made across a range of plots (spanning different soil types and drainage classes) at both forests, but only the manual data from the same plot where the autochamber measurements were made are used in the present analysis. SR was sampled between the hours of 09·00 and 15·00 h. We have determined that fluxes measured during this time interval are usually representative of the daily mean flux.

For each SR measurement an opaque, vented chamber top was placed over a soil collar (previously inserted to a depth of 5 cm, and left in place between measurements). A pump circulated air from the chamber to the IRGA at a rate of 0·5 L min−1 for 5 min, and CO2 concentrations were logged every 12 s. The most linear section (across a minimum of six sample points) of the increasing CO2 concentration time series was identified and the rate of increase (slope) over time was calculated, but only accepted if the linear regression R2 > 0·90. Slope estimates were scaled for the collar cross-sectional area, and corrected for atmospheric pressure and air temperature to yield the CO2 efflux rate (eqn 1).


eqn 1

where dCO2/dt is the slope of the change in CO2 concentration over time (µL CO2 L air−1 s−1), P is atmospheric pressure (atm), V is chamber volume (L), T is air temperature (K), A is collar surface area (m2), and R is the universal gas constant (0·08206 L atm mol−1 K−1). Flux is in units of mg C m−2 h−1. Chamber volume is the sum of the collar volume and the chamber cap volume. The chamber cap is a fixed volume of 5·45 L. Collar volumes for each sampling location were calculated by taking the average collar height (n = 7) and multiplying by the collar area. The chamber volumes were measured at the beginning of the sampling season and remained fixed for the season. Chamber volumes for the automated SR system were calculated in the same manner.
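The flux calculation described by these definitions can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: it assumes the form Flux = (dCO2/dt) × P × V / (R × T × A), multiplied by a unit conversion from µmol CO2 s−1 to mg C h−1. The conversion constants (a molar mass of 12.011 g mol−1 for C and 3600 s h−1) and the function name `co2_flux` are assumptions introduced here.

```python
R_GAS = 0.08206            # universal gas constant, L atm mol^-1 K^-1 (from the text)
MG_C_PER_UMOL = 12.011e-3  # assumption: 12.011 g C per mol, expressed as mg C per umol
SECONDS_PER_HOUR = 3600.0


def co2_flux(slope_ul_per_l_s, pressure_atm, volume_l, temp_k, area_m2):
    """Return CO2 efflux in mg C m^-2 h^-1 from a headspace concentration slope."""
    # Moles of air in the chamber from the ideal gas law, n = PV / (RT).
    mol_air = pressure_atm * volume_l / (R_GAS * temp_k)
    # uL CO2 per L air equals umol CO2 per mol air, so this is umol CO2 s^-1.
    umol_co2_per_s = slope_ul_per_l_s * mol_air
    # Scale by collar area, then convert umol s^-1 to mg C h^-1.
    return umol_co2_per_s / area_m2 * MG_C_PER_UMOL * SECONDS_PER_HOUR
```

The chamber volume passed in would be the sum of the collar volume and the 5·45 L cap volume, as described above.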

Six collars were installed at Harvard and eight collars at the Howland forest, and averages of these chamber flux measurements were used for each sampling date (13 sampling days during the summer period from June to August and 12 days during the rest of the year; thus 150 measurements from Harvard and 200 from Howland). Each of the 350 individual flux traces was examined and visually assessed for data quality. Seasonal CO2 efflux was calculated by linearly interpolating between sampling dates and then integrating over time.
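Linear interpolation between sampling dates followed by integration over time is equivalent to the trapezoid rule. A minimal sketch, assuming sampling dates are given as day of year and fluxes as mg C m−2 h−1 (the function name and argument layout are hypothetical):

```python
def seasonal_efflux_kg_c(days, fluxes_mg_c_m2_h):
    """Integrate flux over time by the trapezoid rule; returns kg C m^-2."""
    pairs = list(zip(days, fluxes_mg_c_m2_h))
    total_mg = 0.0
    for (d0, f0), (d1, f1) in zip(pairs, pairs[1:]):
        hours = (d1 - d0) * 24.0
        # Area under the straight line joining two consecutive sampling dates.
        total_mg += 0.5 * (f0 + f1) * hours
    return total_mg * 1e-6  # mg -> kg
```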

automated sampling of sr

The same automated SR system and data analysis protocols were used at both Harvard and Howland forests (see Savage & Davidson 2003 for design details). During each 5-min measurement period, an opaque, vented chamber top closes (time to closing < 5 s) onto a collar, a pump circulates air from inside the chamber headspace to an IRGA (Licor 6252) and a data logger (Campbell CR10X) records the headspace CO2 concentration every 12 s, with fluxes then calculated as described above. There were six chambers per site, each of which was sampled once every 30 min between 17 May and 11 November 2003 (Harvard), and from 4 May through 3 November 2005 (Howland). Roughly 50 000 flux measurements were recorded at each site.

data quality procedure for the automated sr system

Routine inspection of the automated system is important to ensure quality data. Collars were checked for proper sealing and consistent flow rates between chamber and IRGA. Flow problems were discovered between day of year (DOY) 174–176 and 184–196 at the Harvard forest, and all measurements from these periods were removed (3456 measurements). At Howland, there were several short power outages resulting in data loss.

To determine the most linear portion of the CO2 time series, a spreadsheet was set up to calculate automatically the slope from three specific 96-s time intervals (eight sample points, Fig. 1) from each tracing. The slope of the regression with the highest R2 value among these three regressions was selected for the flux calculation (following eqn 1), provided that the R2 of this regression was at least 0·90. It should be noted that a linear fit to our flux data was appropriate for our soil type and methodology; however, this may not be the case for all soils and flux measurement systems. The combination of this R2 criterion and the routine system inspection resulted in the removal of 12% of measured fluxes from Harvard forest (leaving 40 113 SR measurements) and 13% of fluxes from Howland forest (leaving 44 390 SR measurements).
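The window-selection step can be sketched in a few lines. This is an illustration under stated assumptions rather than the spreadsheet used in the study: the three window start indices (`starts`) are hypothetical, since the actual intervals are defined only in Fig. 1, and both function names are introduced here.

```python
def linear_fit(times, values):
    """Ordinary least-squares slope and R^2 for one window of a trace."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_vv = sum((v - mean_v) ** 2 for v in values)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = s_tv / s_tt
    # A perfectly flat window has no variance to explain; treat it as R^2 = 0.
    r2 = (s_tv ** 2) / (s_tt * s_vv) if s_vv > 0 else 0.0
    return slope, r2


def best_window_slope(times, conc, starts=(5, 9, 13), n_points=8, min_r2=0.90):
    """Return (slope, r2) of the most linear window, or None if R^2 < min_r2."""
    fits = [linear_fit(times[s:s + n_points], conc[s:s + n_points]) for s in starts]
    slope, r2 = max(fits, key=lambda fit: fit[1])
    return (slope, r2) if r2 >= min_r2 else None
```

A trace for which `best_window_slope` returns `None` corresponds to a measurement rejected by the R2 criterion.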

Figure 1.

A typical tracing of increasing CO2 concentration in a chamber headspace over time in an automated chamber. Each line represents that portion of the tracing in which a linear regression was calculated. The lag before CO2 concentrations begin to increase is a function of the time it takes to close the chamber top, and the time for the air sample to move from the chamber top to the IRGA. T0 represents the beginning of ambient sampling, 30 s before the chamber top lowers.

Time series of SR fluxes were visually inspected for anomalies: for example, unusually large fluxes (occasionally exceeding 1000 mg C m−2 h−1) in chamber 3 at Harvard were obvious between DOY 228 and 232, coincident with the appearance of two very large mushrooms in that collar. Although these flux measurements were accurate, the inclusion of these SR measurements from chamber 3 would significantly skew the mean of the six chamber measurements representing this study site. This sample size is probably too small to adequately include spatially representative contributions of mushrooms across the site; therefore, the fluxes (192) from chamber 3 during this period were excluded.

SR measurements for each individual collar were plotted against the mean SR values for the other five collars for the same sampling period (Fig. 2). Although there are differences in the magnitude of SR among chambers, reflecting the spatial heterogeneity of the soils, these fluxes tended to covary temporally. Anomalies in this temporal covariance among chambers were used to identify potentially anomalous measurements. When the ratio of an individual chamber measurement divided by the mean across all other chambers was < 0·5 or > 2·0, the individual chamber measurement was flagged ‘suspect’ for subsequent investigation. For Harvard, 1209 SR measurements fit these criteria, compared with 3722 at Howland.
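The ratio screen can be expressed compactly. The sketch below is one possible implementation; the handling of missing chambers (marked `None`) and of a non-positive mean is an assumption made here, not something specified in the protocol.

```python
def flag_suspect(chamber_fluxes, low=0.5, high=2.0):
    """Return indices of chambers whose flux / mean(other chambers) is outside [low, high].

    chamber_fluxes: fluxes from one half-hour period; None marks a missing chamber.
    """
    suspects = []
    for i, flux in enumerate(chamber_fluxes):
        others = [f for j, f in enumerate(chamber_fluxes) if j != i and f is not None]
        if flux is None or not others:
            continue
        mean_others = sum(others) / len(others)
        if mean_others <= 0:
            continue  # ratio undefined; assumption: leave such cases for manual review
        ratio = flux / mean_others
        if ratio < low or ratio > high:
            suspects.append(i)
    return suspects
```

Flagged indices mark measurements for inspection only; no measurement is removed at this stage.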

Figure 2.

Individual automated soil respiration flux measurements for the Harvard forest plotted against the mean of the other five collars made during the same half-hour period. Dashed lines represent the ratio of 0·5 and 2·0 for the y-value/x-value.

CO2 concentration traces for suspect fluxes were first visually examined for unusual increases or decreases that warranted rejecting the measurement. For suspect fluxes with normal-looking traces, the fluxes measured before and after the suspect measurement were examined. If the suspect flux differed by > 100 mg C m−2 h−1 from adjacent measurements, it was removed; if the suspect flux was consistent with adjacent measurements, it was retained. For Harvard, 237 suspect flux measurements were removed, compared with 717 at Howland.
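The comparison with adjacent measurements could be automated along the following lines. The text does not state whether a suspect flux had to differ from both neighbours or from either one; this sketch assumes both, so that only isolated spikes are removed. The function name is hypothetical.

```python
def is_spike(prev_flux, suspect_flux, next_flux, max_jump=100.0):
    """True if a suspect flux differs by more than max_jump (mg C m^-2 h^-1)
    from both the preceding and the following measurement."""
    return (abs(suspect_flux - prev_flux) > max_jump
            and abs(suspect_flux - next_flux) > max_jump)
```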

In summary, roughly 13% of measurements were excluded because of equipment or power failure, or because they failed the regression R2 > 0·90 criterion. Another 5% were identified as ‘suspect’ by comparing each individual flux estimate with the mean of the other five fluxes, but only 1% of measurements were removed after manually checking suspect tracings for unusual patterns and identifying unusual, non-representative events (e.g. mushrooms within collars). The protocol for identifying and evaluating about 5000 ‘suspect’ SR measurements was simple and relatively efficient. The final data sets for Harvard forest contained 39 876 SR measurements, and for Howland forest, 43 673 SR measurements. The mean of these measurements (n = 2–6 chambers) was calculated at half-hourly intervals and is used as the basis for subsequent analyses.

random and systematic uncertainties in autochamber sr data

Measured SR is affected by random and systematic sources of error. Random errors result from instrument glitches and other stochastic events, whereas sampling uncertainty (i.e. a limited number of chambers are deployed in a spatially heterogeneous landscape) is a source of systematic error, or bias.

Following procedures proposed by Hollinger & Richardson (2005), a ‘paired observations’ approach was used to infer the statistical properties of the random error, ɛ = | SRt=0 – SRt=24 | /√2, from the difference between measurements made at the same chamber exactly 24 h apart. To ensure that differences in these paired measurements could be attributed to random error and not to recent precipitation and increases in soil or litter water content, only pairs in which there was no precipitation between paired measurements or within 24 h prior to the initial SR were used. The random error was characterized by an estimate of its standard deviation (SD), σ (ɛ).
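A minimal sketch of the paired-observations calculation for a single chamber follows. It assumes half-hourly data (so that 24 h corresponds to a lag of 48 steps) and a precomputed boolean screen for the precipitation criterion. Estimating σ (ɛ) as the root mean square of the inferred errors assumes that the random error has zero mean; that choice, and the function names, are assumptions made here.

```python
import math


def paired_random_errors(flux, rain_free, lag=48):
    """Inferred errors |SR(t) - SR(t + 24 h)| / sqrt(2) for one chamber.

    flux: half-hourly fluxes (None where missing); lag=48 steps is 24 h.
    rain_free: rain_free[i] is True when the pair starting at i passes the
    precipitation screen.
    """
    errors = []
    for i in range(len(flux) - lag):
        a, b = flux[i], flux[i + lag]
        if a is None or b is None or not rain_free[i]:
            continue
        errors.append(abs(a - b) / math.sqrt(2))
    return errors


def sigma_from_errors(errors):
    """Root mean square of the inferred errors (assumes a zero-mean error)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```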

Sampling uncertainty was characterized as the standard error of the mean (SEM) flux across the sample of six chambers, calculated first for weekly mean fluxes per chamber, and then for the seasonally integrated flux sums.


estimates of seasonal co2 efflux with manual and automated measurements

Manual SR measurements were interpolated between sampling points to estimate the seasonal CO2 efflux; a similar procedure was used to deal with the smaller gaps in the autochamber data. The estimated seasonal CO2 efflux from the manual measurements was within ±10% of that from the automated SR in both years at Howland and in one of the two years at the Harvard forest (Table 1). There was a 23% difference between manual and autochamber estimates at the Harvard forest in 2003.

Table 1.  Seasonal total CO2 efflux estimate derived from the manual sampling strategy and the automated SR measurements. Missing periods of data (due to system failure and suspect fluxes) for automated soil respiration (SR) were linearly interpolated. At Harvard there were 982 missing points, with the longest continuous period of missing data being 5 days, and at Howland there were 385 missing data points, with the longest continuous missing period being 3 days.
Year | Site | No. of days | Interpolated manual SR estimate (kg C m−2 season−1) | Interpolated automated SR estimate (kg C m−2 season−1)
2002 | Harvard Forest | 58 | 0·26 | 0·27
2003 | Harvard Forest | 180 | 0·60 | 0·78
2004 | Howland Forest | 177 | 0·62 | 0·66
2005 | Howland Forest | 185 | 0·70 | 0·68

manual sampling intensity needed for seasonal estimates

The high frequency automated SR measurements were randomly subsampled to investigate the effect of sampling frequency on seasonally integrated estimates of SR. For the Harvard forest (DOY 137–315), the total SR estimated from the autochamber data was 0·77 kg C m−2 season−1, which is considered the ‘best estimate’. Similarly for Howland (DOY 124–307) the ‘best estimate’ was 0·64 kg C m−2 season−1.

Manual SR measurements are limited by human availability and constraints imposed by time limitations and weather. For this analysis, it was assumed that a typical sampling strategy for manual measurements would be to measure on weekdays, between 09·00 and 15·00 h and when it is not raining. Using these criteria, random selections were made from the observed automated SR measurement data set using four potential sampling strategies, ranging from two flux measurements per week to one measurement per month. This procedure was repeated 100 times for each strategy, and seasonal estimates were calculated for each using linear interpolation. Table 2 shows the range of seasonal sums per strategy and the percentage of the 100 estimates which occur within 1%, 5% and 10% of the ‘best estimate’. For example, with a once per week sampling strategy there is a 39% chance of being within 1% of the ‘best estimate’, a 92% chance of being within 5% and a 99% chance of being within 10% of the ‘best estimate’. Results were comparable for both Harvard and Howland forests (Table 2): a manual sampling strategy of twice per month yields a > 90% probability of estimating the seasonal flux within 10% of the ‘best estimate’ calculated from the high frequency autochamber data.
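The subsampling experiment can be sketched as below. Several simplifications are assumptions made for illustration: the autochamber record is reduced to one value per day with no gaps, one eligible day is drawn at random from each block of `period` days, and values before the first and after the last sampled day are held constant. The function names are hypothetical.

```python
import random


def interpolated_sum(sample_days, sample_fluxes, first_day, last_day):
    """Sum daily fluxes after linear interpolation between sampled days.

    Values before the first and after the last sample are held constant
    (an assumption of this sketch).
    """
    total = 0.0
    for day in range(first_day, last_day + 1):
        if day <= sample_days[0]:
            total += sample_fluxes[0]
        elif day >= sample_days[-1]:
            total += sample_fluxes[-1]
        else:
            # Find the bracketing pair of samples and interpolate linearly.
            k = max(i for i, d in enumerate(sample_days) if d <= day)
            d0, d1 = sample_days[k], sample_days[k + 1]
            f0, f1 = sample_fluxes[k], sample_fluxes[k + 1]
            total += f0 + (f1 - f0) * (day - d0) / (d1 - d0)
    return total


def fraction_within(daily_flux, eligible_days, period, tolerance, n_trials=100, seed=0):
    """Fraction of random manual-style subsamples within `tolerance` of the full sum.

    daily_flux: dict mapping every day of the season to a flux value.
    eligible_days: days meeting the weekday, daytime and no-rain criteria
    (at least one is required).
    period: block length in days; one eligible day is drawn per block.
    """
    rng = random.Random(seed)
    days = sorted(daily_flux)
    first, last = days[0], days[-1]
    best = sum(daily_flux[d] for d in days)  # the 'best estimate'
    hits = 0
    for _ in range(n_trials):
        picks = []
        for start in range(first, last + 1, period):
            block = [d for d in eligible_days if start <= d < start + period]
            if block:
                picks.append(rng.choice(block))
        estimate = interpolated_sum(picks, [daily_flux[d] for d in picks], first, last)
        if abs(estimate - best) <= tolerance * abs(best):
            hits += 1
    return hits / n_trials
```

Calling `fraction_within` with `period=7` and `tolerance=0.05`, for example, would correspond to the once per week strategy evaluated against the 5% criterion.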

Table 2.  Probability that a particular manual sampling strategy will yield a seasonal estimate of CO2 efflux within 1%, 5% or 10% of the seasonal ‘best estimate’ based on the nearly continuous autochamber data during c. 7 months. Missing data points were not interpolated. The number of sampling dates is given in parentheses under the sampling regime (e.g. once per month sampling during this period yields 7 sampling dates).
Best estimate: Harvard 0·77 kg C m−2 per DOY 137–315
Best estimate: Howland 0·64 kg C m−2 per DOY 124–307
Harvard Forest 2003 | Range (kg C m−2) | Once per month (7) | Twice per month (13) | Once per week (26) | Twice per week (52)
1% of the best estimate | 0·76–0·78 | 18% | 26% | 39% | 73%
5% of the best estimate | 0·73–0·80 | 31% | 58% | 92% | > 99%
10% of the best estimate | 0·69–0·84 | 60% | 91% | 99% | > 99%
Howland Forest 2005 | Range (kg C m−2) | Once per month (7) | Twice per month (13) | Once per week (27) | Twice per week (53)
1% of the best estimate | 0·63–0·65 | 19% | 22% | 45% | 64%
5% of the best estimate | 0·61–0·67 | 40% | 63% | 82% | > 99%
10% of the best estimate | 0·58–0·70 | 65% | 93% | 98% | > 99%

estimates of random and systematic sr uncertainties

Using the paired observation approach to estimate the random uncertainty, the distribution of the inferred random errors for all six chambers was characterized by long tails and a pronounced central peak (Fig. 3). A Kolmogorov–Smirnov test (P = 0·0012) indicates that the probability distribution function (PDF) is not Gaussian. The strong kurtosis (2·3) suggests a double-exponential (Laplace) distribution.

Figure 3.

Frequency distribution of the inferred random error of paired soil respiration measurements. Solid black line is Laplace distribution, dashed black line is Gaussian distribution.

Data were binned according to flux magnitude and the SD of the inferred random error was calculated for each bin (Fig. 4). This analysis indicated that σ (ɛ) increases linearly with flux magnitude for each chamber in a manner which does not differ significantly among chambers or between Howland and Harvard forests. Averaged across all chambers, the relationship between σ (ɛ) and SR is σ (ɛ) = 1·96 + 0·23 × SR (SEs for parameter estimates are 1·4 and 0·01, respectively). This is the error for a single chamber; averaging across n chambers, the random error will decrease in proportion to 1/√n. A Monte Carlo procedure (see Richardson & Hollinger 2005) was used to calculate the total uncertainty due to random errors, both as these errors affect the measurements and also as they are propagated forward during interpolation of data gaps. For seasonal flux integrals, accumulated random errors are estimated to be extremely small, on the order of < 2 g C m−2 season−1 at 95% CI.
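The fitted error model is simple to apply. The sketch below assumes that SR and σ (ɛ) are both expressed in mg C m−2 h−1 (the units of the regression are not stated explicitly) and that errors are independent among chambers, so that the error on the mean scales as 1/√n.

```python
import math


def random_error_sd(sr, n_chambers=1, intercept=1.96, slope=0.23):
    """SD of the random error on the mean of n_chambers fluxes,
    using sigma = 1.96 + 0.23 * SR for a single chamber."""
    single = intercept + slope * sr
    return single / math.sqrt(n_chambers)
```

For a flux of 100 (in the assumed units), the single-chamber SD is 1·96 + 23 = 24·96, and half that for the mean of four chambers.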

Figure 4.

Random error terms σ (ɛ) binned by soil respiration flux for each of the six chambers at Harvard and Howland forests. Black circles, Harvard; open squares, Howland.

The SD of the six chamber measurements can be used as an indicator of spatial variability, which was smaller in spring and autumn and which peaked at roughly 50–60 mg C m−2 h−1 in July (Fig. 5). Generally, the SDs were about 20% of the measured flux, which is comparable in magnitude to the random measurement error for a single flux measurement. We estimated the uncertainty in the mean seasonal sum which can be attributed to sampling a limited number (n) of randomly distributed collars (C1 ... Cn) in a spatially heterogeneous plot by calculating the 95% CI based on the SEM for n = 6 chambers, and assuming a t distribution with n degrees of freedom and α = 0·05. This spatial sampling uncertainty (95% CI) was estimated to be ±75 g C m−2 season−1 at the Harvard forest and ±140 g C m−2 season−1 at the Howland forest, which is about 9% and 20% of the seasonal SR integral for the same time period at Harvard and Howland, respectively. The larger spatial variability at Howland is probably due to the hummock–hollow microtopography which characterizes this site. Based on more spatially extensive manual chamber measurements (n = 4 in each microtopographic position), SR for the 185-day measurement season of 2005 was 0·61 kg C m−2 in hummocks and 0·85 kg C m−2 in hollows. Reducing the sampling uncertainty at Howland so that it is comparable in relative magnitude to the Harvard forest (e.g. to < 10% of the seasonal SR integral) would require c. 18 chambers distributed randomly (t0·05,18 = 2·11; yielding a 95% CI on the mean of ±68 g C m−2 season−1). Fewer might suffice if the topographic variation were known and a stratified random sampling strategy were used.
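The sampling-uncertainty calculation reduces to multiplying the SEM of the per-chamber seasonal sums by a critical t value. In this sketch the critical value is supplied by the caller (for example from a statistical table), because the choice of degrees of freedom follows the text above rather than being computed here. The function name is hypothetical.

```python
import math
import statistics


def sampling_ci_half_width(chamber_sums, t_crit):
    """Half-width of the CI on the mean of per-chamber seasonal sums: t * SD / sqrt(n)."""
    n = len(chamber_sums)
    sem = statistics.stdev(chamber_sums) / math.sqrt(n)
    return t_crit * sem
```

Because the SEM scales as 1/√n, tripling the number of chambers from 6 to 18 reduces it by a factor of √3 if the spatial SD is unchanged, with a further small reduction from the lower critical t value.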

Figure 5.

Patterns of seasonal variation in the spatial variability of weekly mean soil respiration autochamber fluxes at Harvard and Howland forests.


data quality assurance

Increased use of automated SR systems has resulted in a wealth of high frequency SR data, making modelling at various temporal frequencies possible. This has led to the need for an efficient system for data management, quality assurance and analysis. The primary advantage of automated SR systems is that continuous measurements can be made, regardless of weather or time of day, and without a human operator present. However, without an operator, instrument errors and system malfunctions (e.g. inadequate sealing of chamber tops) are not recorded and must be diagnosed later. We have presented an efficient means by which data quality can be assessed, ‘suspect’ measurements identified and unreliable measurements deleted. A simple linear regression procedure is sufficient for eliminating the majority of unreliable SR measurements. At a given site, spatially distributed SR measurements tend to covary over time, which permits the comparison of one SR measurement to the mean of others measured at the same location and time interval to efficiently identify those measurements (c. 5% of total measurements in this study) requiring further investigation by the researcher. Although using a rigid R2 criterion during the measurement periods of this study was effective, this method may not be appropriate for periods of low flux, such as winter, when the SR flux (and hence the signal : noise ratio) is lower.

data uncertainties

Analysis of random errors in the automated SR data set suggested that these errors are non-Gaussian and do not have a constant variance. Rather, the PDF is more closely approximated by a double-exponential distribution and the SD of the error increases with the magnitude of the flux. These patterns are similar to what has been reported previously (Richardson et al. 2006a, 2008) for eddy flux measurements, although the autochamber error is much smaller (in absolute and relative terms) than the eddy flux error, and the main sources of the uncertainty differ. While instrument errors add to the random error for both eddy flux and autochamber measurements, the largest share of eddy flux uncertainty comes from the stochastic nature of turbulence, surface heterogeneity and a time-varying measurement footprint. Since autochambers measure SR from fixed patches of ground, estimates of the spatial heterogeneity in the mean respiratory flux can be obtained, and these provide insight into systematic sampling uncertainties. An important distinction between random errors and systematic errors is that random errors add in quadrature over time (decreasing relative error), whereas systematic errors add linearly over time (constant relative error). As a result, although random errors are potentially larger than systematic errors at short time scales (for individual measurements), these results suggest that sampling uncertainty due to spatial heterogeneity, not random error, is the dominant source of uncertainty in seasonal, spatially averaged sums. Increasing the number of chambers is the only way to reduce sampling uncertainties.
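The contrast between the two modes of error accumulation can be illustrated numerically. The sketch assumes independent random errors with a constant SD and a constant bias per observation, which is a simplification of the flux-dependent error described above; the values used in any call are hypothetical.

```python
import math


def accumulated_errors(sigma_random, bias, n_obs):
    """Absolute error on a sum of n_obs fluxes.

    Random errors add in quadrature (sigma * sqrt(N));
    systematic errors add linearly (bias * N).
    """
    random_total = sigma_random * math.sqrt(n_obs)
    systematic_total = bias * n_obs
    return random_total, systematic_total
```

With a random SD of 25 and a bias of 10 per observation, the random term dominates for a single measurement, but over 10 000 observations the accumulated terms are 2500 and 100 000, respectively.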

The finding that the autochamber measurement error is non-Gaussian and has non-constant variance is important because it affects how SR data–model mismatches are treated in any data–model fusion exercise, regardless of the complexity of the model (from a simple Q10 model to a full ecosystem model). As has been shown previously for eddy flux data, parameter estimates and their uncertainties (and hence model predictions and model prediction uncertainty) depend critically on how data uncertainties are specified (Richardson & Hollinger 2005). Raupach et al. (2005) argued that, for data assimilation problems, knowledge of data uncertainties is as important as the data values themselves. Given the error characteristics of the random SR measurement error, a weighted absolute deviations optimization scheme (rather than ordinary least squares optimization) is recommended to obtain maximum likelihood estimates of model parameters (Press et al. 1992; Richardson & Hollinger 2005). This approach, which is a form of robust regression, reduces the influence of outliers on the fitted model parameters.
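A weighted absolute deviations cost function for a simple Q10 model might look as follows. This is a sketch under stated assumptions: the weights are taken from σ (ɛ) = 1·96 + 0·23 × SR evaluated at the observed flux, the reference temperature of 10 °C is arbitrary, and the grid search is a stand-in for whatever optimizer would be used in practice. All function names are hypothetical.

```python
def q10_model(temp_c, r_ref, q10, t_ref=10.0):
    """Respiration at temp_c given the rate r_ref at the reference temperature."""
    return r_ref * q10 ** ((temp_c - t_ref) / 10.0)


def weighted_abs_cost(params, temps, observed, intercept=1.96, slope=0.23):
    """Sum of |observed - modelled| / sigma, with sigma growing with the observed flux."""
    r_ref, q10 = params
    cost = 0.0
    for temp, obs in zip(temps, observed):
        sigma = intercept + slope * obs  # assumes non-negative observed fluxes
        cost += abs(obs - q10_model(temp, r_ref, q10)) / sigma
    return cost


def fit_q10_grid(temps, observed, r_refs, q10s):
    """Return the (r_ref, q10) pair on the grid with the lowest cost."""
    candidates = ((r, q) for r in r_refs for q in q10s)
    return min(candidates, key=lambda p: weighted_abs_cost(p, temps, observed))
```

Minimizing the sum of absolute (rather than squared) deviations corresponds to maximum likelihood under a double-exponential error distribution, and the division by σ accounts for the non-constant variance.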

determining effective sampling strategies

A manual sampling strategy with a bi-weekly (twice per month) sampling frequency is sufficient to obtain a reasonable estimate of seasonal or annual CO2 efflux in the two forests studied here (> 90% probability of an estimate within 10% of the best estimate based on autochamber data). A similar study conducted by Parkin & Kaspar (2004) on agricultural soils found more frequent (every 3 days) sampling was needed to achieve the same relative confidence in annual CO2 soil efflux. These results can be used as a guideline to determine the sampling frequency for manual measurements if the data are to be used to estimate seasonal-to-annual integrals of SR.

Measurements of SR have become one of the primary tools used in terrestrial carbon cycling research, and high temporal frequency autochamber measurements provide insight into respiratory fluxes at diel, synoptic, annual and interannual scales. Along with the measurements themselves, quantification of data uncertainties, such as random instrument errors and systematic sampling uncertainties, is needed to determine CIs for carbon budgets and accounting and for data–model fusion. Finally, these large data sets can provide guidelines for efficient sampling strategies for researchers interested in determining annual C budgets from soils.


The authors would like to thank Holly Hughes for her work maintaining the automated system at Howland. This research was supported by the U.S. Department of Energy's Office of Science (BER) through the Northeastern Regional Center of the National Institute for Climatic Change Research, Grant No. DE-FC02-06ER64157 and through the Terrestrial Carbon Program Grant No. 07-DG-11242300-091.