Northern peatlands are likely to be important in future carbon cycle-climate feedbacks due to their large carbon pools and vulnerability to hydrological change. Use of non-peatland-specific models could lead to bias in modeling studies of peatland-rich regions. Here, seven ecosystem models were used to simulate CO2 fluxes at three wetland sites in Canada and the northern United States, including two nutrient-rich fens and one nutrient-poor, Sphagnum-dominated bog, over periods between 1999 and 2007. Models consistently overestimated mean annual gross ecosystem production (GEP) and ecosystem respiration (ER) at all three sites. Monthly flux residuals (simulated – observed) were correlated with measured water table for GEP and ER at the two fen sites, but were not consistently correlated with water table at the bog site. Models that inhibited soil respiration under saturated conditions had less mean bias than models that did not. Modeled diurnal cycles agreed well with eddy covariance measurements at fen sites, but overestimated fluxes at the bog site. Eddy covariance GEP and ER at fens were higher during dry periods than during wet periods, while models predicted either the opposite relationship or no significant difference. At the bog site, eddy covariance GEP did not depend on water table, while simulated GEP was higher during wet periods. Carbon cycle modeling in peatland-rich regions could be improved by incorporating wetland-specific hydrology and by inhibiting GEP and ER under saturated conditions. Bogs and fens likely require distinct plant and soil parameterizations in ecosystem models due to differences in nutrients, peat properties, and plant communities.
 Northern peatlands are an important component of the global carbon cycle due to large carbon pools resulting from the long-term accumulation of organic matter in peat soils [Gorham, 1991; Turunen et al., 2002]. These carbon pools are vulnerable to changes in hydrology, which could cause climate feedbacks. Because ecosystem respiration and productivity can have opposite responses to hydrological change, the direction of the net carbon flux response can be unclear. Lowering of the water table exposes peat soils to oxygen, resulting in higher rates of ecosystem respiration (ER) and an increase in CO2 emissions, along with decreases in CH4 emissions [Clymo, 1984]. This effect has been observed in both laboratory and field studies [e.g., Freeman et al., 1992; Jungkunst and Fiedler, 2007; Moore and Knowles, 1989; Silvola et al., 1996; Sulman et al., 2009]. However, very dry conditions can be associated with lower rates of ER due to drying of substrates [e.g., Parton et al., 1987]. In wetlands with complex topography, different water tables in different microforms can lead to offsetting responses [Dimitrov et al., 2010].
 Sensitivity of gross ecosystem production (GEP) to changes in hydrology has also been observed in northern peatlands [Strack and Waddington, 2007; Strack et al., 2006; Flanagan and Syed, 2011; Sulman et al., 2009]. Under high water table conditions, saturation of soils tends to suppress productivity due to limitation of oxygen and nutrient availability in the root zone, leading to increased productivity during drier periods. However, very dry conditions can also be associated with lower productivity due to moisture stress. As a result, moderately wet conditions lead to higher productivity than either very dry or very wet conditions.
 Fens and bogs are two dominant wetland ecosystem types in boreal regions. Fens, or minerotrophic wetlands, are fed by surface or groundwater flows in addition to precipitation, and have significant nutrient inputs, while bogs (ombrotrophic wetlands) are fed primarily by precipitation, and have lower nutrient levels and higher acidity. Plant communities in bogs tend to be dominated by shrubs, herbs, and non-vascular Sphagnum mosses, while shrubs, sedges, or flood-tolerant trees dominate typical fen plant communities [Wheeler and Proctor, 2000]. Mosses are less productive than typical wetland vascular plants, and produce litter that is more resistant to decomposition. Peat derived from vascular plants also has different structure and hydraulic conductivity than peat derived from Sphagnum mosses [Limpens et al., 2008]. Previous studies have suggested that CO2 fluxes at rich fens are more sensitive to hydrological change than fluxes at bogs, and that ER and GEP at the two wetland types may have opposite responses to hydrological change [Adkinson et al., 2011; Sulman et al., 2010]. These distinctions are therefore important for understanding wetland contributions to the carbon cycle and responses to climatic changes.
 Modeling studies incorporating hydrological effects on peatlands have predicted a substantial positive climate feedback due to future drying that cannot be ignored in studies of the evolution of the global carbon cycle under climate change [Limpens et al., 2008; Ise et al., 2008]. However, global-scale carbon cycle models do not have fine enough spatial resolution to accurately simulate conditions at peatlands, which can depend on local topography at scales from kilometers down to meters [Baird and Belyea, 2009; Dimitrov et al., 2010; Strack et al., 2006; Waddington and Roulet, 1996]. Further, some ecosystem models used in global-scale simulations may lack specific and accurate parameterizations for the various peatland types contained in their simulated regions, or may not contain wetland land cover types and plant functional types at all. Finally, land cover maps used to set up large-scale modeling studies may be based on remote sensing products or inventories that do not accurately identify peatland areas, or that cannot distinguish between peatland ecosystem types with contrasting plant communities or different sensitivities to environmental drivers [Krankina et al., 2008]. Understanding the limitations of ecosystem model simulations of different types of peatland ecosystems is thus integral to interpreting the results of large-scale ecosystem model simulations in peatland-rich regions.
 In this study, we compared eddy covariance CO2 fluxes with simulated fluxes from a group of ecosystem models for three peatlands (two in Canada and one in the northern United States). The goal was to identify potential pitfalls and areas for improvement in simulating peatland CO2 fluxes using, in general, non-peatland-specific models with limited driver data, in an analog to the likely conditions for global-scale modeling studies in peatland-rich regions. We compared model output to measured fluxes to examine the accuracy of models and explore differences between models with different architectures. We tested three central hypotheses:
1. Differences between simulated and observed CO2 fluxes will be correlated with observed hydrological conditions, since these conditions drive ecosystem responses that are not included in general ecosystem models that lack peatland-specific processes.
2. Models with more soil layers and explicit connections between hydrology and soil respiration will be better able to simulate hydrology-driven ecosystem processes, resulting in closer matches between modeled and observed fluxes.
3. Models will perform better at the fen sites than at the bog site, due to the prevalence of nonvascular plants and the low nutrient availability in bogs. These factors make bogs differ more strongly than fens from the plant communities for which general ecosystem models have been well parameterized.
2.1. Field Sites
 The three peatland sites used in this study are part of the Fluxnet-Canada and Ameriflux networks. Site characteristics are summarized in Table 1.
Temperature (T), precipitation (precip), and CO2 fluxes are annual means over the study period for each site, with summer (June–July–August) average in parentheses. Water table is summer average. Annual and summer CO2 fluxes are in g/m2/yr, and g/m2/summer, respectively.
 The Lost Creek flux tower is located in a shrub fen in northern Wisconsin, USA (46°4.9′N, 89°58.7′W). The creek and associated floodplain provide a consistent water and nutrient source. Seasonal average water table levels were significantly correlated with precipitation, and were also affected by downstream beaver (Castor canadensis) dam-building activity [Sulman et al., 2009]. Vegetation at the site is primarily alder (Alnus incana ssp. rugosa) and willow (Salix spp.), with an understory dominated by sedges (Carex spp.). The site experienced a decline in yearly average water table level of approximately 30 cm over a period from 2002 to 2006 [Sulman et al., 2009].
 The Western Peatland flux tower is located in a moderately rich, treed fen in Alberta, Canada (54.95°N, 112.47°W). Vegetation is dominated by stunted trees of Picea mariana and Larix laricina, along with an abundance of a shrub, Betula pumila. The understory is dominated by various moss species [Syed et al., 2006]. The site experienced a decline in growing-season water table of approximately 25 cm over a period from 2004 to 2007 [Flanagan and Syed, 2011].
 The Mer Bleue field station is located in a domed, ombrotrophic bog near Ottawa, Canada (45.41°N, 75.48°W). The peatland has an overstory of low-stature woody shrubs, both evergreen (Chamaedaphne calyculata, Ledum groenlandicum, Kalmia angustifolia) and deciduous (Vaccinium myrtilloides). The understory is dominated by Sphagnum mosses, with some sedges (Eriophorum vaginatum) [Moore et al., 2002]. For additional details, see Moore et al. and Roulet et al.
2.2. Measurements and Gap-Filling
 CO2 fluxes were measured at all three sites using the eddy covariance technique [Baldocchi, 2003]. In this manuscript, gross ecosystem production (GEP) is defined as negative, and ecosystem respiration (ER) is presented as positive. Net ecosystem exchange of CO2 (NEE) is defined as ER + GEP, so that negative values of NEE indicate uptake of CO2 by the ecosystem. Eddy covariance NEE was supplied by investigators at each field site, and then gap-filled and decomposed into GEP and ER using a standardized process as part of the North American Carbon Program (NACP) Site Level Interim Synthesis (http://www.nacarbon.org/nacp) [Schwalm et al., 2010]. The partitioning and gap-filling procedure is described in detail by Barr et al. Gaps resulted from equipment failure and from screening of data for outliers and periods of low turbulence. Simple empirical models were fit to screened eddy covariance observations at an annual time scale, and an additional time-varying scale parameter was applied using a moving window to account for variability within the year. ER was determined by fitting a function of soil temperature to nighttime NEE. GEP was then calculated by subtracting ER from daytime NEE and fitting the residual to a function of photosynthetically active radiation (PAR). The ER and GEP values presented in this study are therefore not strictly measured values, but result from the assumptions of the gap-filling procedure. However, since the gap-filling procedure involved fitting the simple empirical models to observed data in a short moving window, variations in these values over time do reflect real changes in the observed quantities [Desai et al., 2008].
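As a rough illustration of the sign convention and the nighttime-based partitioning described above, the following Python sketch fits a temperature response to nighttime NEE and derives GEP as the daytime residual. The exponential form of the temperature function, the function names, and the synthetic numbers are assumptions made for illustration only; the procedure of Barr et al. also includes a moving-window scale parameter and a fit of the residual to PAR, both of which are omitted here.

```python
import math

def fit_exponential_er(soil_temp, night_nee):
    """Fit ER = a * exp(b * T) to nighttime NEE by least squares on log(NEE).

    Assumed functional form (not taken from the source). Non-positive
    nighttime values cannot be log-transformed and are skipped.
    """
    pairs = [(t, math.log(f)) for t, f in zip(soil_temp, night_nee) if f > 0]
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pairs)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_t)
    return a, b

def partition_daytime(nee, soil_temp, a, b):
    """Return (ER, GEP) with ER positive and GEP = NEE - ER, so NEE = ER + GEP."""
    er = [a * math.exp(b * t) for t in soil_temp]
    gep = [f - r for f, r in zip(nee, er)]
    return er, gep

# Synthetic nighttime data generated from ER = 2.0 * exp(0.07 * T) (hypothetical)
night_t = [5.0, 10.0, 15.0, 20.0]
night_nee = [2.0 * math.exp(0.07 * t) for t in night_t]
a, b = fit_exponential_er(night_t, night_nee)

# Hypothetical daytime observations: negative NEE means net CO2 uptake
day_nee = [-4.0, -6.0]
day_t = [15.0, 18.0]
er, gep = partition_daytime(day_nee, day_t, a, b)
```

With this convention, daytime GEP is negative whenever modeled ER exceeds NEE, and ER + GEP recovers the observed NEE exactly.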
 Uncertainties in eddy covariance values were estimated based on a combination of random uncertainty, uncertainty due to the friction velocity (u*) threshold, gap filling algorithm uncertainty, and GEP partitioning uncertainty. These errors were assumed to be independent and summed in quadrature to determine total measurement uncertainty. Random uncertainty was estimated using the method of Richardson and Hollinger. Gap filling uncertainty was based on the standard deviation of multiple algorithms [Moffat et al., 2007]. Partitioning uncertainty was based on the standard deviation of multiple partitioning algorithms [Desai et al., 2008].
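Summation in quadrature of independent error terms can be sketched as follows. The function name and the numerical values are hypothetical and chosen only so that the arithmetic is easy to follow; they are not uncertainties reported in this study.

```python
import math

def total_uncertainty(random_u, ustar_u, gapfill_u, partition_u):
    """Combine independent uncertainty components by summing in quadrature."""
    return math.sqrt(random_u ** 2 + ustar_u ** 2 + gapfill_u ** 2 + partition_u ** 2)

# Hypothetical component uncertainties, all in the same flux units
total = total_uncertainty(3.0, 4.0, 0.0, 12.0)  # sqrt(9 + 16 + 0 + 144) = 13.0
```

Because the terms are squared before summing, the total is dominated by the largest component, and it is always smaller than the simple sum of the components.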
 Models were driven by meteorological data collected for each site and gap-filled according to the procedures described by Schwalm et al. [2010] and the NACP site synthesis protocol (http://nacp.ornl.gov/docs/Site_Synthesis_Protocol_v7.pdf). Briefly, tower measurements from each site were used where available. Periods with missing site data were filled using data from nearby weather stations included in the National Climatic Data Center (NCDC) Global Surface Summary of Day data set. Periods when both site and NCDC meteorology were unavailable were filled using output from the DAYMET model [Thornton et al., 1997]. Table 2 shows the meteorology data sets and the percentage of original tower measurements for each site. Additional site data were also available for model forcing, including soil properties and carbon and nutrient content, vegetation type, and biomass. Data were collected independently for each site, according to the Ameriflux biological data collection protocols [Law et al., 2008].
Numbers are percentage of original site data used. The remainder for each variable was gap-filled, as described in section 2.2. Psurf is surface atmospheric pressure; LWdown and SWdown are longwave and shortwave downwelling radiation, respectively; Qair is specific humidity; Tair is air temperature; and Precip is precipitation rate.
 This analysis incorporates hydrological measurements from each site in addition to the standardized meteorology data sets. These data sets were not available for model parameterization. Water table was measured at Lost Creek using a pressure transducer system [Sulman et al., 2009] and at Mer Bleue and Western Peatland using float and weight systems [Roulet et al., 2007; Syed et al., 2006]. In this manuscript, water table is referenced to the mean hummock surface. Negative values indicate water table below the hummock surface and positive values indicate water table above this level. Topographical relief between hummocks and hollows was on the order of 25 cm at Mer Bleue [Lafleur et al., 2005]. Detailed topographical information was not available for Lost Creek and Western Peatland. Water table values have uncertainties on the order of a few cm due to spatial variations in site topography. Multiyear declines in water table at Lost Creek and Western Peatland resulted in subsidence of the peat surface, which was subtracted from water table measurements using the method described by Sulman et al., so that water table values reflect the position relative to the peat surface over the observed time period for each site. No significant changes in peat level were observed at Mer Bleue during the study period.
 In addition to water table, volumetric soil water content was measured at the Mer Bleue and Western Peatland sites. Measurement depths at Western Peatland were 7.5, 10, and 12.5 cm below the peat surface, and measurement depths at Mer Bleue were 5 and 20 cm below the surface. Fraction of saturation rather than volumetric soil moisture content was used for comparison purposes, since some of the included models reported soil moisture only in units of fraction of saturation. A fraction of saturation of 0.0 indicates completely dry soil, and a fraction of 1.0 indicates soil with pores completely filled with water. Mer Bleue soil water content was converted to fraction of saturation by dividing by an estimated peat porosity of 0.9 (P. M. Lafleur, personal communication, 2011), and Western Peatland soil water content was divided by the maximum value observed during periods of inundation. No soil water content measurements were available at Lost Creek.
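The two conversions to fraction of saturation described above amount to dividing volumetric water content by a reference value. A minimal sketch, assuming volumetric water content is given as a fraction (m3 water per m3 soil); the function name and sample values are hypothetical:

```python
def saturation_fraction(vwc_series, reference):
    """Convert volumetric water content to fraction of saturation.

    `reference` is the water content of fully saturated soil: an estimated
    porosity (0.9 for Mer Bleue) or the maximum value observed during
    inundation (Western Peatland).
    """
    return [v / reference for v in vwc_series]

# Mer Bleue style: divide by an estimated peat porosity of 0.9
mer_bleue = saturation_fraction([0.18, 0.45, 0.9], 0.9)

# Western Peatland style: divide by the maximum observed during inundation
# (hypothetical series in which the third value was recorded while inundated)
western_vwc = [0.2, 0.4, 0.8]
western = saturation_fraction(western_vwc, max(western_vwc))
```

In both cases a value of 1.0 corresponds to completely water-filled pores, which keeps the observations comparable with models that report soil moisture only as a fraction of saturation.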
2.3. Ecosystem Models
 This study used model results from the NACP Site Level Interim Synthesis. Seven process-based models were run at all three peatland sites, representing different simulation strategies, temporal resolutions, and levels of complexity, but sharing in common the site-level meteorological driver data and investigator-provided site initial conditions described above. A summary of model characteristics is shown in Table 3. Important differences in model structure included number of soil layers and carbon pools, representations of hydrology, and calculation of the light environment for photosynthesis.
Soil layers is the number of soil layers used in model hydrology; Veg. C pools is the number of vegetation carbon pools; Psyn calculation is the model strategy for calculating photosynthesis; N cycle indicates whether the model included nitrogen cycling; Phenology indicates whether model leaf phenology was driven by internal model calculations or external satellite observations; Max soil moisture indicates whether the model was able to calculate saturated soil conditions or whether soil moisture above field capacity was directly partitioned to runoff.
 Four of the models simulated soil moisture values up to saturation, while the other three models partitioned soil water above field capacity directly to runoff and subsurface drainage, making them incapable of simulating saturated soil conditions. Of the models included in this study, only ecosys produced simulations of water table level. SiB and SiBCASA shared a soil moisture redistribution submodel based on the Richards equation. TECO included multiple soil layers, with water infiltrating from an upper to a lower layer when soil water in the upper layer was above field capacity. Ecosys explicitly calculated matric, osmotic, and gravimetric components of water potential and was the only model to include vertical variations in peat hydrological properties through the soil profile. Ecosys was also the only model to include a representation of hummock and hollow topography. The model was run for one hummock and one hollow grid point, and the results were combined in a weighted average based on observed area fractions for the sites. LPJ, DLEM, and ORCHIDEE used two-layer soil models and therefore did not produce estimates of soil moisture at defined soil depths.
 Model formulations of the light environment could be divided based on whether models included multiple canopy layers and explicitly calculated light extinction and the properties of sun and shade leaves, or used a single layer “big leaf” model for photosynthesis. Ecosys explicitly calculated carboxylation rates for leaf surfaces defined by height, inclination, and exposure to light. DLEM and TECO used a two layer approach that included sunlit and shaded leaves. SiBCASA parameterized differences between sunlit and shaded leaves using an effective leaf mass calculation that weighted leaf mass based on expected nitrogen content for sunlit and shaded leaves. SiB, ORCHIDEE, and LPJ used the single layer “big leaf” approach, without considering sunlit and shaded leaves separately.
 Since hydrology is an important driver of peatland ecosystem processes, the model processes that connect soil respiration and photosynthesis to soil moisture are another important basis of comparison. Figure 1 shows the functions that relate photosynthesis and soil respiration to soil moisture fraction for six of the models. Soil moisture is represented as a fraction of saturation, where a value of 1.0 indicates that pore spaces are full and the soil cannot accommodate additional water. In order to include the models that do not simulate soil water fractions above field capacity, soil moisture values for those models were normalized by a field capacity fraction of 0.7. All of the models included in the photosynthesis plot have similar moisture limitation functions, with photosynthesis suppressed at low soil moisture and reaching a plateau at high soil moisture. LPJ and ecosys were not included in the photosynthesis plot, because their calculations of moisture-related photosynthesis limitation could not be reduced to simple functions of soil moisture. LPJ calculates water stress on photosynthesis by first calculating non-water-stressed photosynthesis rate, and then optimizing canopy conductance based on water-limited evapotranspiration [Sitch et al., 2003]. Photosynthesis in LPJ is not limited by high-moisture conditions. Ecosys explicitly calculates water potentials and flows between soil, roots, plant tissues, and leaves, and allows for reduction of productivity as a result of saturated soils, through reductions in water and nutrient uptake by roots. Ecosys was the only model included in this study to include a process that suppresses photosynthesis at high soil moisture.
 Of the models included in the respiration plot, only SiB, SiBCASA, and DLEM suppress respiration under wet conditions. Heterotrophic respiration in ecosys involves growth and respiration of microbial communities that are limited by the availability of substrates, nutrients, and oxygen. This process could not be reduced to a simple function of soil moisture, but heterotrophic respiration rates are limited under both dry and saturated conditions [Grant et al., 2009].
 Models were initialized with a spin-up period intended to reach steady state conditions. According to the NACP synthesis activity protocol, steady state for the carbon cycle is reached when annual NEE is approximately zero when averaged over the last five years of model spin-up. Since peatlands are defined by long-term carbon accumulation and since the sites included in this study are all presently net carbon sinks of between 68 and 105 gC/m2/yr (Table 1), this steady state condition likely contributed to underestimation of CO2 uptake by models. Because peatlands contain large soil carbon pools relative to aboveground biomass pools and because northern peatland carbon accumulation is driven by low rates of soil decomposition, this bias was most likely manifested as an overestimate of soil respiration relative to photosynthesis. To estimate the magnitude of this bias, annual average GEP/ER ratios were calculated for observed and modeled fluxes and ER was multiplied by the ratio of these factors to produce an adjusted ER that matched the annual GEP/ER ratio of eddy covariance measurements. Adjusted NEE was then calculated by subtracting GEP from adjusted ER. The majority of this analysis used the original ER and NEE, and adjusted values are identified as such when they appear.
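The GEP/ER adjustment can be sketched as below. This is an interpretation of the description above rather than the study's actual code: the function name and the flux values are hypothetical, and annual totals are assumed to follow the sign convention of section 2.2 (GEP negative, ER positive). Under that convention, adding the negative GEP to adjusted ER is equivalent to subtracting the magnitude of GEP.

```python
def adjust_er(gep_model, er_model, gep_obs, er_obs):
    """Rescale modeled ER so that modeled |GEP|/ER matches the observed ratio.

    Inputs are annual totals with GEP negative and ER positive.
    Returns (adjusted ER, adjusted NEE).
    """
    ratio_model = abs(gep_model) / er_model
    ratio_obs = abs(gep_obs) / er_obs
    er_adj = er_model * ratio_model / ratio_obs
    nee_adj = er_adj + gep_model  # NEE = ER + GEP, with GEP negative
    return er_adj, nee_adj

# Hypothetical steady-state model (ratio 1.0) versus an observed sink (ratio 1.2)
er_adj, nee_adj = adjust_er(-1000.0, 1000.0, -600.0, 500.0)
adjusted_ratio = 1000.0 / er_adj
```

In this example the adjustment lowers ER from 1000 to about 833, turning a modeled NEE of zero into a net sink of about 167 in the same flux units. A model whose GEP/ER ratio already exceeds the observed ratio would instead have its ER increased.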
2.4. Statistical Analysis
 Residuals in this study were defined as simulated minus observed time series, so that positive residuals indicate an over-estimate of the time series by a model. Confidence levels for correlation coefficients were calculated using a two-tailed t test. In diurnal variation plots, error bars indicate the 95% confidence limits on the mean of each time period, based on a two-tailed t test.
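A minimal sketch of the residual definition and of the t statistic used to test a correlation coefficient follows. The helper names and sample data are hypothetical. The critical value 3.182 is the standard two-tailed 95% value of Student's t for 3 degrees of freedom, which applies only to this five-point example.

```python
import math

def residuals(simulated, observed):
    """Simulated minus observed, so positive values indicate overestimation."""
    return [s - o for s, o in zip(simulated, observed)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 d.o.f.)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical monthly water table (cm) and flux residuals
water_table = [-30.0, -20.0, -10.0, 0.0, 10.0]
flux_resid = residuals([11.0, 14.0, 15.0, 19.0, 21.0],
                       [10.0, 12.0, 13.0, 15.0, 16.0])

r = pearson_r(water_table, flux_resid)
t = t_statistic(r, len(water_table))
significant = abs(t) > 3.182  # two-tailed 95% critical value for 3 d.o.f.
```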
 For diurnal plots (Figures 8–13), eddy covariance fluxes were divided into wet and dry periods on a weekly basis, with observations from weeks in the top 30th percentile of water table shown in blue and observations from weeks in the bottom 30th percentile of water table shown in red. Simulated NEE values from each model were similarly divided, based on weeks in the top (green) and bottom (orange) 30th percentiles of simulated soil moisture in the model layer closest to 20 cm below the surface. For models with only two soil layers, the reported root zone soil moisture was used. NEE plots were calculated using only non-gap-filled eddy covariance data, and only model data points corresponding to the included eddy covariance points. ER and GEP diurnal plots were calculated using all gap-filled eddy covariance data and all model data. LPJ and DLEM produced output with daily resolution and were not included in diurnal plots.
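The wet and dry classification can be illustrated with a short sketch. It interprets "top 30th percentile" as weeks at or above the 70th percentile of weekly water table and "bottom 30th percentile" as weeks at or below the 30th percentile. That reading, the linear-interpolation percentile, the function names, and the sample values are all assumptions.

```python
def percentile(values, q):
    """Linearly interpolated percentile of `values`, with q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def split_wet_dry(weekly_values, fraction=30.0):
    """Return indices of the wettest and driest weeks by percentile thresholds."""
    dry_limit = percentile(weekly_values, fraction)
    wet_limit = percentile(weekly_values, 100.0 - fraction)
    wet = [i for i, v in enumerate(weekly_values) if v >= wet_limit]
    dry = [i for i, v in enumerate(weekly_values) if v <= dry_limit]
    return wet, dry

# Hypothetical weekly mean water table (cm relative to the hummock surface)
weekly_wt = [-50.0, -45.0, -40.0, -35.0, -30.0, -25.0,
             -20.0, -15.0, -10.0, -5.0, 0.0]
wet_weeks, dry_weeks = split_wet_dry(weekly_wt)
```

The same function could be applied to simulated soil moisture to classify model output, since only the ranking of weeks within each series matters.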
3.1. Model Simulations of Hydrology
Figure 2 shows representative ranges of simulated and observed summer soil moisture saturation fractions, as well as representative ranges of water table observations and water table simulated by the ecosys model. Ranges are bounded by the 10th and 90th percentiles of the soil moisture values for each soil layer. As in Figure 1, models with an upper soil moisture limit of field capacity were normalized by a field capacity of 0.7. The upper plots show vertical profiles for observations and models that included soil layers with explicit depths. The lower plots show the upper and lower soil layers of LPJ and the mean root zone soil moisture of ORCHIDEE, for which soil moisture in multiple layers was not available. DLEM did not provide soil moisture data for this comparison. In general, ecosys predicted wetter conditions than the other models, but with a moderate range of temporal variability. TECO predicted a wider range of soil moisture variability at each site than the other models. SiB, SiBCASA, ORCHIDEE, and LPJ predicted small ranges of variability. SiB predicted almost constantly saturated conditions at Mer Bleue, but was closely matched with SiBCASA at the other sites. Observations indicated very low soil moisture and low variability at Mer Bleue, where only LPJ predicted a similar range in the upper soil layer. Measured soil moisture at Western Peatland had a large range of variability, including very wet conditions. All models except LPJ overlapped with this range in their upper soil layers.
 If models are capturing the hydrological conditions at a site, they should simulate saturated soil moisture below the water table. Ecosys was the only model to predict saturated soil moisture below the observed water table level at any of the sites. Water table ranges predicted by ecosys (black arrows) were well matched to observations (white arrows) at Lost Creek and Mer Bleue, but predicted higher water table than observations at Western Peatland.
3.2. GEP to ER Ratios
 Comparing the ratio of GEP/ER for simulated and eddy covariance fluxes can help to assess the impact of the steady state assumption used in model setup. Models running in steady state should have annual ratios of approximately 1.0, while sites that are CO2 sinks should have ratios greater than one. Annual and summer ratios for eddy covariance fluxes and simulated fluxes for all models are shown in Table 4. SiB, TECO, and SiBCASA had annual ratios of approximately 1.0 for all three sites, indicating that they maintained steady state for CO2 fluxes. The other models all predicted a CO2 sink at Lost Creek and Mer Bleue, and all except ORCHIDEE predicted an annual sink at Western Peatland as well. While the synthesis protocol required models to reach a steady state of zero net CO2 flux during the spin-up process, NEE was not necessarily zero following spin-up due to differences in environmental drivers between spin-up and subsequent simulations. Results were mixed for summer fluxes, with no consistent bias of GEP/ER ratio relative to eddy covariance data between models. LPJ predicted a ratio slightly above 1.0 for all sites, underestimating the growing season CO2 sink. There was no consistent pattern of bias in summer ratios relative to eddy covariance ratios for the other models.
Ratio in eddy covariance data (EC) and values for each model are shown. Annual ratios include all months of the year, and summer ratios include the months of June, July, and August. The 95% confidence limits on observed ratios are shown in parentheses.
Table 4 rows: Lost Creek annual; Lost Creek summer; Western Peatland annual; Western Peatland summer; Mer Bleue annual; Mer Bleue summer.
3.3. Mean Model Bias
Figure 3 shows mean model residuals for flux components at the three sites, as well as adjusted ER and NEE. Annual average simulated GEP and ER at all three sites were significantly higher than eddy covariance values for all models. All models significantly overestimated annual NEE at Western Peatland and Mer Bleue, while three of the seven models had a significant positive bias of NEE at Lost Creek. Since negative NEE corresponds to CO2 uptake, this positive bias indicates an underestimate of the CO2 sink, which was expected as a result of the steady state assumption.
 Summer-only bias showed similar patterns to annual bias, but was somewhat less consistent between sites. All models significantly overestimated summer ER at Western Peatland and Mer Bleue. All models also overestimated summer GEP at Mer Bleue, as did the majority of models at Western Peatland. At Lost Creek, there was a larger range in model bias of summer fluxes, with some models overestimating and some models underestimating all three fluxes relative to eddy covariance values.
 The effect of steady state assumptions on ER and NEE can be estimated by comparing original values with values adjusted to match the observed ratio of GEP to ER. For the majority of models, this adjustment reduced ER, with the largest differences occurring at Western Peatland. However, some models predicted higher GEP/ER ratios than eddy covariance, so that adjusted ER was higher than original modeled ER. Applying GEP/ER adjustments to modeled NEE resulted in substantial reductions in NEE residuals, changing residuals from positive to negative for many models. These results suggest that steady state model assumptions contributed significantly to model bias in NEE predictions.
Figure 4 shows mean bias for subsets of models divided according to important differences in model structure. The most consistent difference was between models that included functions to limit soil respiration in wet conditions and those that did not. Models that included this functionality (SiB, SiBCASA, DLEM, and ecosys) had significantly lower bias of annual ER at all three sites. However, the same subset also showed decreased bias in annual GEP at all sites, suggesting that this functionality was also associated with other model differences that lead to overall improvements in performance. Models with more than two soil layers (TECO, SiBCASA, SiB, and ecosys) had significantly less bias in both annual GEP and ER at Lost Creek and Mer Bleue compared to models with two soil layers, but multiple-layer models had higher bias at Western Peatland. Big leaf models (LPJ, SiB, SiBCASA, and ORCHIDEE) had slightly higher bias in GEP at Lost Creek compared to models including sunlit and shade leaves, but showed slightly lower bias at Mer Bleue. Differences in mean summer bias between model subsets did not show consistent patterns between sites.
3.4. Simulated CO2 Flux Residual Relationships With Observed Water Table
Figure 5 shows monthly mean June, July, and August model residuals for the three sites, plotted as a function of monthly mean observed water table. Figure 6 shows the correlation coefficient (Figure 6, top) and linear regression slope (Figure 6, bottom) describing the relationships between flux residuals and water table for each individual model as well as the mean of all models.
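The regression slope and the model-mean residual series summarized in Figure 6 could be computed along the following lines. This is a sketch using ordinary least squares, with hypothetical model names and values; it is not the code used to produce the figures.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def model_mean(residuals_by_model):
    """Average the residual series of all models at each time step."""
    series = list(residuals_by_model.values())
    return [sum(vals) / len(vals) for vals in zip(*series)]

# Hypothetical monthly water table (cm) and residuals for two models
water_table = [-30.0, -20.0, -10.0, 0.0]
resid = {
    "model_a": [10.0, 20.0, 30.0, 40.0],
    "model_b": [0.0, 0.0, 20.0, 20.0],
}

slopes = {name: ols_slope(water_table, r) for name, r in resid.items()}
mean_resid = model_mean(resid)
mean_slope = ols_slope(water_table, mean_resid)
```

A positive slope means the residual grows as the water table rises, that is, the model overestimates the flux more strongly under wet conditions. The slope has units of flux per cm of water table.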
 At Lost Creek and Western Peatland, the two fen sites, residuals for GEP and ER were both positively correlated with water table for all models individually as well as the mean of all models, indicating that models overestimated both ER and GEP under high water table conditions relative to drier conditions. ER relationships were significant at the 95% level for all models at both fen sites. GEP relationships were significant for all models except DLEM at Western Peatland, and for three models as well as the model mean at Lost Creek. The slopes of the relationships were higher at Western Peatland than at Lost Creek, and slopes were consistent between models for GEP and ER for both sites. Correlations of NEE residuals with observed water table at the fen sites were positive for most, but not all, of the models, and were not significant at the 95% level for most models, indicating weaker relationships between observed water table and model-measurement mismatch in net CO2 flux. This suggests that errors in GEP and ER canceled each other.
 At Mer Bleue, the bog site, the majority of models also had positive correlations between GEP and ER residuals and observed water table while four of the models as well as the model mean showed negative relationships between NEE residuals and water table. Most of the relationships at Mer Bleue were not statistically significant at the 95% confidence level, although the mean of all models was significantly correlated with water table for ER, GEP, and NEE.
Figure 7 shows correlation coefficient and slope between observed water table and monthly residuals for model subsets, divided as described above. The only site where model subsets were associated with significant differences in correlation coefficient was Mer Bleue, where models that included high soil moisture limitation of soil respiration and models with multilayer leaf functions were both associated with lower correlations between GEP residuals and water table compared to models without those attributes. The same pattern was evident for the water table-residual slopes.
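The residual statistics reported in this section can be illustrated with a minimal sketch. It assumes monthly mean simulated and observed fluxes paired with monthly mean observed water table, and computes the Pearson correlation coefficient and the ordinary least squares slope of the residuals (simulated minus observed) against water table. The function name, variable names, and sample values are hypothetical and are not taken from the study, and the significance test at the 95% level is not reproduced here.

```python
from math import sqrt


def residual_water_table_stats(simulated, observed, water_table):
    """Return (Pearson r, OLS slope) of flux residuals versus water table.

    Residuals follow the convention used in the text: simulated minus observed.
    All three inputs are equal-length sequences of monthly means.
    """
    if not (len(simulated) == len(observed) == len(water_table)):
        raise ValueError("inputs must have equal length")
    residuals = [s - o for s, o in zip(simulated, observed)]
    n = len(residuals)
    mean_wt = sum(water_table) / n
    mean_res = sum(residuals) / n
    # Sums of squares and cross products about the means
    sxx = sum((w - mean_wt) ** 2 for w in water_table)
    syy = sum((r - mean_res) ** 2 for r in residuals)
    sxy = sum((w - mean_wt) * (r - mean_res)
              for w, r in zip(water_table, residuals))
    return sxy / sqrt(sxx * syy), sxy / sxx


# Hypothetical monthly values: water table in m relative to the surface,
# fluxes in arbitrary but consistent units.
water_table = [-0.4, -0.3, -0.2, -0.1]
observed_er = [5.0, 5.0, 5.0, 5.0]
simulated_er = [5.0, 5.5, 6.0, 6.5]
r_value, slope = residual_water_table_stats(simulated_er, observed_er, water_table)
```

A positive correlation and slope in this sketch correspond to the fen-site pattern described above, in which model overestimates grow as the water table rises.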
3.5. Simulated and Observed Diurnal Cycles of NEE
 The diurnal cycle of NEE can illuminate features of both GEP and ER, and can be produced without including gap-filled values. Mean summer diurnal cycles of measured and simulated NEE at Lost Creek, Western Peatland, and Mer Bleue are shown in Figures 8, 9, and 10, respectively, divided into dry and wet modeled and observed periods as described above. Only data from non-gap-filled periods were included in these plots. At Lost Creek, measured daytime net CO2 uptake was slightly higher during dry periods than wet periods, while nighttime CO2 emissions were higher during dry periods than wet periods. Measurements at Western Peatland also showed higher nighttime CO2 emissions during dry periods, but did not show a significant difference in daytime CO2 uptake between wet periods and dry periods. Measurements at Mer Bleue showed higher daytime CO2 uptake during wet periods than dry periods, and no difference in nighttime emissions between wet and dry periods.
 At the fen sites, most of the models slightly overestimated nighttime CO2 emissions. TECO and ecosys overestimated peak daytime uptake at Lost Creek. Ecosys and ORCHIDEE underestimated peak daytime uptake at Western Peatland, while TECO overestimated daytime uptake. TECO predicted a sharp, early peak in uptake at all three sites. All models overestimated the magnitude of the diurnal cycle at the Mer Bleue bog site, and all but ecosys substantially overestimated nighttime CO2 emissions there.
 At Lost Creek, the dependence of modeled NEE on soil moisture was either weak or in the opposite direction from observations. SiB, TECO, and ecosys showed higher daytime uptake during wetter periods, and SiBCASA showed higher nighttime emissions during wetter periods. At Western Peatland, SiB predicted higher nighttime emissions during dry periods, in agreement with observations. ORCHIDEE and TECO predicted slightly higher daytime uptake during wetter periods, while ecosys predicted lower daytime uptake during wetter periods. At Mer Bleue, ORCHIDEE predicted much higher daytime uptake during wet periods, and SiBCASA and TECO also predicted increased uptake during wet periods, but to a lesser degree. In the case of ORCHIDEE, the contrast in sensitivity is likely due to the fact that the two fen sites were modeled using a forest plant functional type, while Mer Bleue was modeled using a grassland plant functional type. ORCHIDEE and TECO predicted significantly higher nighttime emissions at Mer Bleue during wet periods as well.
3.6. Diurnal Cycles of ER and GEP
 The diurnal cycles of ER and GEP, the components of NEE, can further illuminate sources of model-observation mismatch. These are shown for Lost Creek, Western Peatland, and Mer Bleue in Figures 11, 12, and 13, respectively. Data were divided into wet and dry periods using the same process as in the NEE figures. ER values are positive, and are shown with solid lines. GEP values are negative, and are shown with dashed lines.
 Eddy covariance ER and GEP were not strictly observed quantities, but were derived from observed NEE as described above. Patterns of model bias relative to eddy covariance values as well as differences between wet and dry periods were consistent with the patterns seen in the non-gap-filled NEE data, providing confidence that these results do reflect actual ecosystem processes rather than artifacts of the gap-filling process.
 At Lost Creek and Western Peatland, eddy covariance values of both ER and GEP were higher during drier periods. These relationships offset, leading to smaller differences between wet and dry periods in the diurnal cycle of NEE at these sites. In contrast to the fen sites, eddy covariance ER at Mer Bleue was not significantly different between wet and dry periods, and GEP was slightly higher during wet periods.
 As with NEE, the majority of ecosystem models simulated either no difference between wet and dry periods, or the opposite direction of change compared to eddy covariance results. At Lost Creek, ORCHIDEE showed no difference in GEP between dry and wet periods, while the other models simulated slightly higher GEP during wet periods. SiBCASA simulated higher ER during wet periods, while the other models showed no difference. At Western Peatland, SiBCASA and TECO simulated higher GEP during wet periods. Ecosys showed higher GEP during dry periods, in agreement with eddy covariance results but with a smaller magnitude of difference. TECO and ecosys simulated higher ER during wet periods, while SiB simulated slightly higher ER during dry periods, in agreement with the direction of the relationship identified in eddy covariance data but with a smaller magnitude of difference. At Mer Bleue, ORCHIDEE and TECO predicted substantially higher ER and GEP during wet periods, and SiBCASA simulated slightly higher GEP during wet periods. The other models showed no difference between wet and dry periods, in agreement with eddy covariance results.
 The magnitude and shape of modeled diurnal cycles at the fen sites were generally in agreement with eddy covariance values, although ecosys and SiBCASA predicted somewhat higher GEP than eddy covariance values at Lost Creek and SiBCASA overestimated GEP and ER at Western Peatland. Modeled ER was closer to eddy covariance values for dry periods than for wet periods for most of the models at both fen sites. TECO predicted an earlier daytime peak GEP than eddy covariance values at both fen sites. Despite large differences in model complexity, simpler models such as SiB did not perform significantly better than models such as ecosys, which includes many soil layers and specific parameterizations for wetland hydrology.
 At Mer Bleue, all models substantially overestimated GEP and daytime NEE relative to eddy covariance values, and all models except ecosys overestimated ER. SiB, SiBCASA, and TECO all predicted peak GEP early in the day, followed by suppressed GEP in the late morning and afternoon. This could be an indicator of simulated moisture stress within these models.
4.1. Correlations Between Model Residuals and Hydrology
 Hypothesis 1 stated that model residuals would be correlated with observed water table as a result of hydrology-driven peatland processes not included in the models. This hypothesis was confirmed at the fen sites by the positive correlations of GEP and ER model residuals with observed water table, suggesting that hydrological processes were important sources of model-data mismatch. At the bog site, the relationship was not consistent between all models but there was still a significant correlation for the mean of all models. The differences in eddy covariance CO2 fluxes between high and low water table periods at the fen sites suggest that both GEP and ER are suppressed under wet conditions, which is consistent with previous peatland field studies [Flanagan and Syed, 2011; Silvola et al., 1996; Strack et al., 2006; Sulman et al., 2009]. Of the seven models included in this study, four (SiB, SiBCASA, ecosys, and DLEM) include processes that suppress ER under saturated conditions, and only ecosys includes a process that suppresses GEP under saturated conditions. Although the majority of models were capable of simulating the observed sign of the relationship between ER and soil moisture, only one predicted increased ER during dry periods at any site. Models that included processes for suppressed ER at high soil moisture had significantly lower correlations between ER residuals and water table at Mer Bleue compared to models that did not include those processes, but there was no significant difference at other sites. This was most likely a consequence of models' inability to accurately predict saturated conditions in peatland soils. Only ecosys consistently predicted saturated conditions below the water table at the three sites. Furthermore, three of the models partition moisture above the soil's field capacity directly to runoff and subsurface drainage, making them incapable of simulating saturated soil conditions at all. If they cannot successfully simulate wetland hydrological conditions, even models that include responses of respiration and photosynthesis to saturated soil cannot successfully replicate the observed relationships with hydrology.
 The fact that fens, by definition, are fed by incoming water flows makes accurately simulating hydrology in these ecosystems more difficult. For example, the Lost Creek site is fed by a stream, and the water table responds to changes in streamflow that can result from such factors as upstream precipitation, regional water management, and downstream beaver dam-building activity [Sulman et al., 2009]. The difficulties presented by local water flows are consistent with the results of Yurova et al., a modeling study at a minerotrophic mire. That study found good agreement between measured and modeled water table during periods of the year dominated by precipitation events, but poor agreement when site hydrology was dominated by snowmelt. While snowmelt was not a focus of this study, it is an example of a hydrological process that integrates lateral flows and inputs from a larger spatial area, and can contribute significantly to variations in seasonal CO2 fluxes in some ecosystems [Aurela et al., 2004; Hu et al., 2010].
Bond-Lamberty et al. addressed the issue of lateral inflow in a modeling study by including site-specific information about the modeled site's relationship with the surrounding watershed. Pietsch et al. used a similar approach, including explicit information about timing and magnitude of flood events. While these approaches do address some of the issues with modeling wetlands that are influenced by lateral inflows, they require fairly detailed information about regional hydrology. Including this local information in large-scale modeling studies would not be feasible, but a regional hydrological model combined with an accurate elevation map could be used to simulate redistribution of surface and groundwater over a region, providing a good alternative. Wetland location and fractional area could be predicted based on topographically low areas that collected water in hydrological simulations, and water table variations could be calculated based on modeled water flows. This information could then be incorporated into larger grid scales using a fractional area approach. Examples of global-scale models incorporating this type of sub-grid-scale peatland fractional area approach include Gedney and Cox, Kleinen et al., and Ringeval et al.
4.2. Effect of Model Structure on Mean Bias
 Hypothesis 2 stated that models with more complex hydrology would produce more accurate simulations of peatland CO2 fluxes. In fact, models with more than two soil layers did not consistently have less mean bias than models with two soil layers. Models that included processes for reducing soil respiration at high soil moisture did have less mean bias in both GEP and ER than models that did not include those processes. These results suggest that increased vertical resolution of soil processes is not sufficient for improving model performance at peatlands. More explicit connections between hydrology and carbon cycling are necessary.
4.3. Contrasting Results Between Bogs and Fens
 Hypothesis 3 stated that models would perform better at fens than at the bog. This hypothesis was confirmed by the relative fidelity of modeled diurnal cycles at fens compared to the large overestimates of the magnitudes of diurnal cycles at the bog site. These differences suggest that fens and bogs should be considered separately in modeling studies that include peatlands. The successful results at fens suggest that extensive model changes such as the development of fen-specific plant functional types are not necessary, and that improving modeled hydrology and effects of saturated soils on respiration and photosynthesis would be sufficient.
 Accurately representing bogs in general ecosystem models is likely more difficult. While GEP bias could be addressed by introducing bog-specific maximum photosynthesis rate parameters, the unique chemistry, nutrient levels, and plant communities of bog ecosystems require additional specific parameterizations to be added to general ecosystem models. Distinguishing between fen and bog wetlands could be problematic for large-scale studies, where spatial maps that distinguish bogs from other ecosystem types may not be available, and the spatial resolution of the model will be much larger than the scale of heterogeneity between peatland types. Fractional area approaches based either on digital elevation maps and topography-based classifications or on statistical predictions of peatland type areal coverage could provide a solution to this problem.
4.4. Aspects of Peatlands Not Included in General Ecosystem Models
4.4.1. Heterogeneity at Small Spatial Scales
 Variations in site topography at small scales, in the form of hummock and hollow landforms with sizes on the order of 1 to 100 m, contribute significantly to site hydrology, plant community composition, and carbon fluxes [Strack and Waddington, 2007; Waddington and Roulet, 1996]. Becker et al. suggested that topographical variations on scales as small as 25 cm may be important for accurately calculating carbon fluxes in wetlands with hummock-hollow topography. Sonnentag et al., in a spatially explicit modeling study at Mer Bleue, successfully simulated water table responses to precipitation at the bog, but demonstrated that lateral flows within the bog contributed significantly to the overall variations in water table. Govind et al. also used a spatially explicit model to investigate CO2 fluxes under different hydrological scenarios, and found significant differences in net CO2 flux between scenarios that did or did not include topographically driven hydrological flows within the peatland ecosystem. However, in a recent study at Mer Bleue, Wu et al. found that differences in net CO2 flux between hummocks and hollows could be successfully accounted for by using an average of parameters for each microsite, weighted by relative areas. Of the ecosystem models included in this study, only ecosys simulated hummock and hollow topography and internal lateral flows. Small-scale heterogeneity is further complicated by the formation of peatland macropores and pipes, which lead to preferential pathways for water and carbon flows that can be decoupled from the processes that drive near-surface flows [Limpens et al., 2008].
 Small-scale variations in topography lead to variations in vegetation. In this study, Western Peatland and Mer Bleue were examples of peatlands that support heterogeneous vegetation, including areas of sedges, shrubs, and small trees. This is problematic for computation of the light environment, as most of the models included in this study calculate light attenuation as a function of LAI or canopy depth, implicitly assuming a horizontally homogeneous canopy. In simulations of Mer Bleue, Sonnentag et al. determined that a multiple-layer canopy including separately mapped tree, shrub, and moss layers was necessary for an accurate simulation. Of the models included in this study, only ecosys incorporated this type of canopy heterogeneity, by separately modeling hummock and hollow areas. Failure to simulate the separate contributions of different vegetation layers likely contributed to model bias of GEP at Western Peatland and Mer Bleue.
Baird and Belyea suggested that sub-grid-scale peatland processes could be parameterized in low-resolution models through a multiscale modeling method. Peatland landscapes within a grid cell would be identified using high-resolution remote sensing and elevation data. Representative samples of each peatland type would be simulated at high resolution, including lateral flows and topography within the peatland, and the results would be scaled up to the coarse resolution grid scale.
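The scaling step shared by the fractional area and area-weighted approaches discussed in this section can be written as a short sketch: a grid cell flux is the sum of the fluxes of each land cover class, weighted by the fraction of the cell that class occupies. The class names, fluxes, and fractions below are hypothetical placeholders, not values from this study or from the cited models.

```python
def area_weighted_flux(fluxes, fractions):
    """Combine per-class fluxes into one grid-cell flux using area fractions.

    fluxes:    mapping of land cover class -> flux per unit area
    fractions: mapping of the same classes -> fraction of the grid cell
               (assumed here to sum to 1)
    """
    if set(fluxes) != set(fractions):
        raise ValueError("fluxes and fractions must cover the same classes")
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("area fractions must sum to 1")
    return sum(fractions[name] * fluxes[name] for name in fractions)


# Hypothetical grid cell: NEE per unit area (negative = uptake) by class.
nee_by_class = {"upland": -2.0, "fen": -1.0, "bog": -0.5}
area_fraction = {"upland": 0.7, "fen": 0.2, "bog": 0.1}
cell_nee = area_weighted_flux(nee_by_class, area_fraction)
```

The same weighting could in principle be applied at the microsite scale, with hummock and hollow fractions in place of land cover classes, although the weighted-parameter method attributed to Wu et al. above averages parameters rather than fluxes.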
4.4.2. GEP Under Saturated Soil Conditions
 Under the moisture limitation schemes used in the models included in Figure 1, GEP decreases under dry conditions when photosynthesis is limited by moisture stress, and moisture is not a limiting factor for GEP under wet conditions. In peatlands, high water tables can provide a consistent source of water that prevents moisture limitation except during exceptionally dry periods. During wet periods, saturated soil can cause plant stress due to reduced availability of oxygen and buildup of toxins in the root zone [Pezeshki, 2001; Mitsch and Gosselink, 2007]. Thus, ecosystem models used in peatland-rich areas could be improved by a moisture limitation parameterization that suppresses GEP under both very dry conditions and very wet conditions. Biological adaptations such as air spaces in the roots (aerenchyma) allow flood-tolerant plant species to transport oxygen below the water line, mitigating the impact of soil saturation on plant function. However, since these adaptations are limited to specific wetland plant species, including them in models would require calibration of plant functional types to match the other photosynthetic and physiological properties of wetland communities. The only model included in this study that included detrimental effects of saturated soils on plant function was ecosys, which simulated lower GEP and ER in hollows compared to hummocks. Ecosys did predict higher GEP under dry conditions at Western Peatland, but not at Lost Creek, possibly due to differences in simulated hydrology between the sites. A mechanism for including saturation stress was integrated into peatland plant functional types added to the LPJ model in a previous study [Wania et al., 2009], although they concluded that their modified model still over-estimated net primary production at peatlands.
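One possible form of a two-sided moisture limitation is sketched below as a piecewise linear scalar that multiplies potential GEP. It is an illustration only and is not the scheme used by ecosys, LPJ, or any other model in this study. The function name, the use of relative soil saturation as the driver, and all threshold values are assumptions chosen for the example.

```python
def gep_moisture_scalar(saturation, dry_limit=0.2, opt_low=0.5,
                        opt_high=0.8, saturated_factor=0.6):
    """Return a multiplier in [0, 1] applied to potential GEP.

    saturation:       relative soil saturation, 0 (dry) to 1 (saturated)
    dry_limit:        at or below this value GEP is fully suppressed (assumed)
    opt_low/opt_high: range with no moisture limitation (assumed)
    saturated_factor: multiplier reached at full saturation (assumed)
    """
    if not 0.0 <= saturation <= 1.0:
        raise ValueError("saturation must be between 0 and 1")
    if saturation <= dry_limit:
        return 0.0
    if saturation < opt_low:
        # Dry-side stress: linear ramp from 0 at dry_limit to 1 at opt_low
        return (saturation - dry_limit) / (opt_low - dry_limit)
    if saturation <= opt_high:
        return 1.0
    # Wet-side stress: linear decline from 1 at opt_high to
    # saturated_factor at full saturation
    return 1.0 - (1.0 - saturated_factor) * (saturation - opt_high) / (1.0 - opt_high)


# Hypothetical use: scale a potential GEP value for a saturated soil.
potential_gep = 10.0
limited_gep = potential_gep * gep_moisture_scalar(1.0)
```

In a plant functional type for flood-tolerant species, the wet-side decline would presumably be weaker (a `saturated_factor` closer to 1), which is one way the calibration issue raised above could enter such a scheme.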
 Changes in productivity can directly impact ER by affecting autotrophic respiration. Most of the models included in this study calculated autotrophic respiration either as a fixed fraction of productivity, or as a function of temperature and living biomass. Ecosys explicitly included oxygen limitation of root respiration, but the other models did not. Eddy covariance data could not be partitioned into autotrophic and heterotrophic components of respiration, so model predictions of autotrophic respiration could not be evaluated against measurements. Few peatland carbon cycling studies have explicitly considered the sensitivity of autotrophic respiration to hydrology, and further research is needed in this area.
 Further complicating the relationship between water table and GEP is the importance of time scale. Long-term decline of water tables can cause changes in dominant plant communities from mosses and graminoids to shrubs and trees over time scales of five to ten years [Flanagan and Syed, 2011; Strack and Waddington, 2007; Talbot et al., 2010; Weltzin et al., 2003]. This suggests that model simulations of GEP could be improved by including dynamic plant communities that shift between grassy and woody dominance depending on water table elevation. Over shorter time scales, flooding can introduce additional nutrients to ecosystems without causing long-term anoxia in soils, potentially increasing productivity.
4.4.3. Steady State Model Conditions and Non-CO2 Carbon Fluxes
 Analysis of GEP/ER ratios and adjusted ER and NEE showed that the steady state condition of model spin-up used in this study led to overestimates of ER and underestimates of net ecosystem CO2 uptake. The approach used here to estimate the amount of bias introduced depended on observed GEP/ER ratios, and thus could not be used for studies where direct observations of CO2 fluxes are not available. Accurate simulations of NEE and ER may require parameterizations informed by ecological histories and independent estimates of peat carbon pools. Estimates of typical long-term peat accumulation rates based on peat cores could be used to develop alternative steady state conditions for model initialization. Models that include the hydrological processes necessary for peat accumulation could then be spun up using a condition of constant soil carbon accumulation rate rather than constant soil carbon pool size.
 The importance of non-CO2 fluxes such as methane and dissolved organic carbon (DOC) in peatland carbon balances further complicates the application of steady state model conditions to peatlands. For example, at Mer Bleue, Roulet et al. found that DOC and methane fluxes accounted for carbon losses equivalent to 37% and 9% of NEE, respectively, over a five-year period, and that ignoring these fluxes could lead to substantial overestimates of net carbon uptake in some years, and to estimating a carbon sink instead of a source in other years. In a regional study in northern Wisconsin, USA, Buffam et al. estimated that DOC and methane fluxes accounted for 17% and 10% of peatland NEE, respectively. Billett et al. reported that C loss in drainage and downstream evasion was greater than or equal to CO2 uptake at a peatland complex in Scotland, and Hope et al. estimated that downstream evasion of CO2 and CH4 accounted for 28–70% of the net peatland C accumulation rate when divided by the watershed area. Clymo suggested that for peat-accumulating wetlands, a steady state can only be reached when carbon inputs from NEE are balanced by losses of methane and DOC from submerged peat. Based on these results, a carbon budget or steady state assumption based only on CO2 is not sufficient for characterizing the actual carbon balance of a peatland ecosystem.
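The effect of these additional fluxes on a carbon budget can be shown with a short arithmetic sketch. It assumes that DOC and methane losses are expressed as fractions of net CO2 uptake, as in the multi-year averages quoted above, and treats uptake as a positive magnitude. The function name and the uptake value of 100 are hypothetical; only the 37% and 9% fractions come from the text.

```python
def net_carbon_uptake(co2_uptake, doc_fraction, ch4_fraction):
    """Net carbon uptake after subtracting DOC and methane losses.

    co2_uptake:   net CO2 uptake from NEE, as a positive magnitude
    doc_fraction: DOC loss as a fraction of co2_uptake
    ch4_fraction: methane loss as a fraction of co2_uptake
    """
    return co2_uptake * (1.0 - doc_fraction - ch4_fraction)


# Hypothetical uptake of 100 (any consistent carbon units), with the
# Mer Bleue loss fractions quoted in the text (37% DOC, 9% methane).
co2_only_estimate = 100.0
full_budget_estimate = net_carbon_uptake(co2_only_estimate, 0.37, 0.09)
```

With these fractions, a CO2-only budget would overstate net carbon uptake by roughly a factor of two relative to a budget that includes DOC and methane. Because the fractions are multi-year averages, this sketch does not capture the years in which the additional losses change the sign of the balance.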
 These fluxes pose additional complications for including peatlands in general ecosystem models, but they can feasibly be included. Methane production has been included in models related to those included in this study. These include versions of ORCHIDEE [Ringeval et al., 2011], LPJ [Hodson et al., 2011], DLEM, and ecosys. While the transport and evasion of dissolved carbon depends on detailed hydrology and surface flow, dissolved carbon could be included in the peatland carbon budget by assuming that all dissolved carbon will ultimately be released to the atmosphere over relatively short time scales compared to other carbon accumulation processes. In that case, dissolved carbon could simply be treated as an additional source of carbon to the atmosphere, and models would only need to include processes for dissolved carbon production, which could be parameterized as an additional form of anaerobic decomposition. Ecosys does include dissolved carbon production, but this process was not included in the other models in this study.
 The consistent positive bias in model predictions of GEP and ER for all three sites suggests that ignoring peatlands could lead to systematic overestimates of productivity and respiration in modeling studies of peatland-rich regions. Therefore, it is important for modelers to consider the impact of peatland areas when designing large-scale modeling studies and interpreting their results.
 Our results did show that non-wetland-specific ecosystem models can produce fairly accurate simulations of NEE at fen wetlands, especially during relatively dry periods. Specific areas for improvement include:
1. Improved simulations of site hydrology are required for correctly simulating responses of ecosystem respiration to changes in hydrology for the majority of models included in this study. Coupling carbon cycle models with hydrological models that include regional flows and small-scale topographical variations could help with incorporating processes important to wetland hydrology, as could including explicit treatment of saturated soil conditions and a variable water table.
2. Suppression of both photosynthesis and respiration under saturated conditions should be included in models used at wetlands in order to match observed effects. Hydrology-related succession could also improve simulations.
 Models substantially overestimated both photosynthesis and respiration at the bog site, suggesting that more effort is necessary in order to successfully model bogs using general ecosystem models. Additional measurements from other bog ecosystem sites that contrast with the relatively dry Mer Bleue site are needed in order to evaluate model performance in bog ecosystems representing a broader range of environmental conditions. It may be necessary to add bog-specific plant communities or plant functional types to models that will be used for these ecosystems. Furthermore, large-scale modeling projects need to develop strategies for distinguishing between fens and bogs, since these ecosystems are too different to be treated as a single “wetland” ecosystem type.
 We would like to thank the North American Carbon Program Site-level Interim Synthesis team for data collection, organization, processing, and distribution. Data infrastructure was supported by the Oak Ridge National Laboratory Distributed Active Archive Center. Lost Creek measurements were supported by the Department of Energy (DOE) Office of Biological and Environmental Research (BER) National Institute for Climatic Change Research (NICCR) Midwestern Region Subagreement 050516Z19. Funding for research at Western Peatland, which was part of Fluxnet – Canada and the Canadian Carbon Program (CCP) research networks, was provided by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canadian Foundation for Climate and Atmospheric Sciences (CFCAS), and BIOCAP Canada. Funding for Mer Bleue was provided by the NSERC Strategic Grants Program, and CFCAS and BIOCAP – Canada network funding for Fluxnet – Canada and subsequently CCP. Analysis was partially supported by National Science Foundation (NSF) Biology Directorate grant DEB-0845166.