Corresponding author: Y. He, Department of Earth, Atmospheric and Planetary Sciences, Purdue University, 550 Stadium Mall Dr., West Lafayette, IN 47907–2051, USA. (firstname.lastname@example.org)
 Model-data fusion is a process in which field observations are used to constrain model parameters. How observations are used to constrain parameters has a direct impact on the carbon cycle dynamics simulated by ecosystem models. In this study, we present an evaluation of several options for the use of observations in modeling regional carbon dynamics and explore the implications of those options. We calibrated the Terrestrial Ecosystem Model on a hierarchy of three vegetation classification levels for the Alaskan boreal forest: species level, plant-functional-type level (PFT level), and biome level, and we examined the differences in simulated carbon dynamics. Species-specific field-based estimates were directly used to parameterize the model for species-level simulations, while weighted averages based on species percent cover were used to generate estimates for PFT- and biome-level model parameterization. We found that calibrated key ecosystem process parameters differed substantially among species and overlapped for species that are categorized into different PFTs. Our analysis of parameter sets suggests that the PFT-level parameterizations primarily reflected the dominant species and that functional information of some species was lost from the PFT-level parameterizations. The biome-level parameterization was primarily representative of the needleleaf PFT and lost information on broadleaf species or PFT function. Our results indicate that PFT-level simulations may be representative of the performance of species-level simulations, while biome-level simulations may result in biased estimates. Improved theoretical and empirical justifications for grouping species into PFTs or biomes are needed to adequately represent the dynamics of ecosystem functioning and structure.
 The northern circumpolar permafrost region was recently reported to contain 1672 Pg of organic carbon (C) in soil, which amounts to about 50% of total global belowground organic C [Tarnocai et al., 2009]. In the past three decades, Arctic and boreal regions have been warming much more rapidly than the global average, and this warming may be significantly altering terrestrial ecosystem C and nitrogen (N) cycling [Arctic Climate Impact Assessment, 2005; McGuire et al., 2006; Overland et al., 2004]. The boreal forest plays an important role in the global C budget given its large amount of C storage and sensitivity to climate change [Gower et al., 2001]. The Alaskan boreal forest occupies about 52 million hectares within the state and represents 15% of all boreal forest in the northern hemisphere [Yarie and Billings, 2002]. Although Alaska constitutes only a small portion of the boreal forest present in the world, it represents an area where anthropogenic disturbances have a limited impact. It is also considered to be sensitive to global climate change because of the fire-prone forest types in the region [e.g., black spruce; Van Cleve et al., 1983a, 1983b] and the large amount of C stored in soils, the fate of which is uncertain under changing climate conditions. The study of C and N dynamics in this region is even more significant given projected warming of the circumpolar boreal forest [Hobbie et al., 2002].
 Modeling is an integrated tool for estimating C balance at regional scales and for testing hypotheses, which could in turn help in the design and implementation of field studies. One of the pivotal links between modeling and field ecological studies is the use of field observations to constrain parameters in models. Some parameters can be determined via literature review, some can be estimated directly based on field observations, and those that are difficult to retrieve through field studies can be estimated with model-data-fusion techniques [Braswell et al., 2005; Keenan et al., 2012a; Moore et al., 2008; Sacks et al., 2007; Weng and Luo, 2011; Williams et al., 2004; Williams et al., 2009].
 To date, the manner in which models should be parameterized is still a major source of uncertainty in terrestrial ecosystem modeling. Some studies have estimated vegetation responses to climate based on biomes. For example, King et al. divided the earth's land surface into 13 biomes in a terrestrial biosphere model to explain the missing C sink in the global C cycle. There are also some studies that take a species-level approach to estimate different climatic responses among species [e.g., Clein et al., 2002, 2007; Schurgers et al., 2011; Yarie and Billings, 2002]. Many models that are applied in continental-scale ecosystem studies now use an intermediate modeling approach based on plant functional types (PFTs) (e.g., LPJ [Sitch et al., 2003], BIOME-BGC [White et al., 2000], CLM [Lawrence et al., 2011; Oleson et al., 2010], and BETHY [Kattge et al., 2009]), which are defined as discrete classes that group species with presumed similar roles in the ecosystem or observed correlations among their characteristics [Lavorel et al., 2007]. However, species-, PFT-, and biome-level approaches each have several weaknesses and challenges. The ability to develop and implement a species-level approach to calibration and application is often limited by the availability of fine-resolution remote sensing or inventory information. In turn, one perceived weakness of the aggregated PFT- and biome-level approaches is that species-specific parameters can vary by an order of magnitude among species within a forest [Condit, 2006], which brings into question the robustness of generic representations of interspecific characteristics. Furthermore, PFT- and biome-level parameterizations may be biased because they may not properly represent the functional and structural characteristics of the species within a region [Alton, 2011; Schurgers et al., 2011; Van Bodegom et al., 2012].
Field studies, eddy flux studies, and remote-sensing studies have revealed a broad range of leaf ecophysiological traits as well as key plant parameters within any given PFT and substantial overlap among PFTs [Alton, 2011; Reich et al., 2007]. The assumed distribution of parameter values and the range of values used for each PFT can cause important regional differences in modeled C dynamics [Alton, 2011]. Indeed, despite the seemingly discrete average values of plant functional traits among aggregated categories, the variation of most ecologically important traits across species is naturally continuous, with wide spread and significant overlaps [Reich et al., 1997, 1999, 2007; Wright et al., 2004]. It is worth noting that some data collection efforts (e.g., TRY, a global plant trait database, www.try-db.org/ [Kattge et al., 2011]) are amassing a tremendous amount of data that may make trait-based continuous model parameterization feasible in the future. Other studies that have identified key traits to be incorporated into vegetation classifications also provide an empirical basis for more robust modeling of global vegetation [e.g., Diaz et al., 2007].
 The manner in which vegetation is classified (i.e., the vegetation scheme) also directly influences how observational data can be used to constrain parameters in models. For example, in an ecosystem model, each PFT or biome class is represented by an “individual plant” with the average biomass, C fluxes, and nutrient availability of the class. In most cases, the average plant is represented as a weighted average of species-specific characteristics [Bonan et al., 2003; Schurgers et al., 2011], whereas in species-level modeling, the species-specific data can be used directly in model calibration.
 In this study, we present a synthesis example of several alternatives for the use of observations in modeling, and we explore the implications of these options on both modeling and observational activities. We calibrated a recent version of the Terrestrial Ecosystem Model (TEM) [Chen and Zhuang, 2013; Zhuang et al., 2003] for a hierarchy of three levels of vegetation classification of the Alaskan boreal forest: species level, PFT level, and biome level, and we examined the differences in simulated C cycling and discussed their implications. Our study also demonstrates how models can be used as a heuristic tool to help guide observational studies of ecosystem dynamics.
 We used field data for the C and N fluxes and pools from five major boreal forest species to parameterize a process-based biogeochemistry model, the TEM [Zhuang et al., 2001, 2002, 2003; Chen and Zhuang, 2013], on three hierarchical levels: species level, PFT level, and biome level. The five major tree species of the Alaskan boreal forest are white spruce (Picea glauca (Moench) Voss), black spruce (Picea mariana (Mill.) BSP), paper birch (Betula papyrifera Marsh.), quaking aspen (Populus tremuloides Michx.), and balsam poplar (Populus balsamifera L.). We then applied the model to a part of the Alaskan boreal forest dominated by the five forest species to simulate C dynamics from 1922 to 2099.
2.2 Description of the Terrestrial Ecosystem Model
 TEM is a process-based, regional to global-scale ecosystem model that is driven by spatially explicit data on climate, vegetation, soil, and elevation to estimate monthly pools and fluxes of C and N in the terrestrial biosphere. The underlying equations and parameters have been well documented [McGuire et al., 1992; Raich et al., 1991], and the model has been applied in a number of studies in high-latitude regions [e.g., Clein et al., 2000; McGuire et al., 2000a, 2000b; Zhuang et al., 2002, 2003, 2004, 2007; Balshi et al., 2007; Euskirchen et al., 2006, 2009]. In this study, we used version 5.0 of TEM, which has been described in detail by Zhuang et al.; its core C and N dynamics module has been used in several recent studies [e.g., Chen and Zhuang, 2013; Sui et al., 2013]. TEM 5.0 explicitly couples biogeochemical processes with the soil thermal dynamics of permafrost and nonpermafrost soils and therefore is applicable to simulating the dynamics of the boreal forest ecosystems that dominate this region. In TEM, net ecosystem production (NEP) is calculated as the difference between the uptake of atmospheric CO2 associated with photosynthesis (i.e., gross primary production or GPP) and the release of CO2 through (1) autotrophic respiration (RA) associated with plant growth and maintenance respiration and (2) heterotrophic respiration (RH) associated with decomposition of organic matter. The fluxes GPP, RA, and RH are influenced by changes in atmospheric CO2, climate variability and change, and the freeze-thaw status of the soil [Zhuang et al., 2003]. Net primary production (NPP) is calculated as the difference between GPP and RA.
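The flux bookkeeping described above (NPP = GPP − RA and NEP = GPP − RA − RH) can be sketched in a few lines of Python. This is only an illustration of the two identities, not TEM code: the class name `MonthlyFluxes`, the function `annual_totals`, and the units in the comments are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class MonthlyFluxes:
    """Carbon fluxes for one month (hypothetical container, e.g. in g C m-2 mo-1)."""
    gpp: float  # gross primary production
    ra: float   # autotrophic respiration
    rh: float   # heterotrophic respiration

    @property
    def npp(self) -> float:
        # Net primary production: GPP minus autotrophic respiration.
        return self.gpp - self.ra

    @property
    def nep(self) -> float:
        # Net ecosystem production: NPP minus heterotrophic respiration.
        return self.npp - self.rh


def annual_totals(months):
    """Sum a sequence of monthly fluxes into annual GPP, NPP, and NEP."""
    return {
        "GPP": sum(m.gpp for m in months),
        "NPP": sum(m.npp for m in months),
        "NEP": sum(m.nep for m in months),
    }
```

A positive annual NEP in this bookkeeping corresponds to a net C sink; an NEP near zero corresponds to the equilibrium condition used during calibration.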
2.3 Model Parameterization
 In the species-level simulation, we treated the Alaskan boreal forest as composed of the five dominant species and parameterized the model separately for each species; in the PFT-level simulation, we grouped the two spruce species into needleleaf evergreen forest and the other three species into broadleaf cold deciduous forest and parameterized the model for the two PFTs; in the biome-level simulation, the boreal forest was treated as a single ecosystem type based on an ensemble parameterization for the biome. The target fluxes and pools for the PFT and biome parameterizations were derived from area-weighted averages of the target fluxes and pools for each species (Table 1). A number of assumptions and empirical relationships were necessary to estimate the target fluxes and pools used in calibrating the model (Table 2).
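The area-weighted averaging used to derive PFT- and biome-level targets can be illustrated with a minimal sketch. The function name `area_weighted_target` is hypothetical, and the two NPP values are placeholders rather than the Table 1 targets; the area shares (51% white spruce, 35% black spruce) are those quoted in section 3.1.

```python
def area_weighted_target(values, area_fractions):
    """Area-weighted mean of per-species target values.

    values: dict mapping species name -> target pool or flux
    area_fractions: dict mapping species name -> share of the region
    Weights are renormalized over the species present in `values`.
    """
    total_area = sum(area_fractions[s] for s in values)
    if total_area <= 0:
        raise ValueError("area fractions must sum to a positive number")
    return sum(values[s] * area_fractions[s] for s in values) / total_area


# Placeholder NPP targets (NOT the values from Table 1).
npp_targets = {"white_spruce": 200.0, "black_spruce": 150.0}
# Area shares as quoted in the text for the two spruce species.
areas = {"white_spruce": 0.51, "black_spruce": 0.35}

needleleaf_npp = area_weighted_target(npp_targets, areas)
print(f"needleleaf target NPP: {needleleaf_npp:.1f}")
```

Because the weights follow area, the aggregated target sits closer to the value of the more extensive species, which is the mechanism behind the dominance effect discussed in the results.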
Table 1. Pools and Fluxes Used to Calibrate the Rate-Limiting Parameters of the Terrestrial Ecosystem Model
The pools and fluxes of PFT-level and biome-level simulations were calculated as weighted averages across the corresponding areas of each species.
Units for annual gross primary production (GPP), net primary production (NPP), and saturation response of NPP to N fertilization (NPPSAT) are g C m−2 yr−1; units for annual N uptake by vegetation (Nuptake) are g N m−2 yr−1. Units for vegetation carbon (CV) and soil carbon (CS) are g C m−2. Units for vegetation N (NV), soil N (NS), and available inorganic N (NAV) are g N m−2.
Table 2. Sources and Assumptions for Deriving the Target Pools and Fluxes Values of Five Boreal Forest Species in Table 1
Sources and Calculation Assumptions
Based on Table 7 of Ryan et al., assuming NPP/GPP = 0.2 for needleleaf coniferous forest and 0.34 for broadleaf cold deciduous forest; GPP value of black spruce is based on Table 1 of Clein et al.
Aboveground + belowground biomass [Viereck et al., 1983, Table 4], assuming belowground = 25% of aboveground.
Based on Table 4 of Viereck et al., assuming carbon = 47.5% of total biomass.
Based on Figure 2 of Powers and Van Cleve [1991], Nv = leaves + branches + trunk.
Based on Figure 2 of Powers and Van Cleve [1991], Cs = forest floor + litterfall.
Same as for Cs.
Based on Tables 2 and 3 of Weber and Van Cleve for black spruce; estimated for other species.
Assuming 50% saturation response of NPP to N fertilization.
 The calibration of rate-limiting parameters for TEM requires field-based estimates of C and N pool and flux sizes, and environmental variables (long-term climate data, soil texture, etc.) for a specific site (representing a certain species, PFT, or biome) as target values (see Clein et al. for additional details beyond those presented here). Note that in this study, we did not evaluate how uncertainty in the field-based estimates/target values influences parameter estimates, i.e., we assumed that there was no error in the target values. The rate-limiting parameters used by the model, which control vegetation and soil C and N cycles, are obtained by first adjusting parameters for the C cycle with no N feedback on GPP and then adjusting parameters for the N cycle with N feedback on GPP. The C cycle parameters are adjusted in order: first the rate-limiting parameter for GPP, then that for RA, and then that for RH. The rate-limiting parameter for GPP is then adjusted so that NPP equals its value under no N limitation of production. At this point, N feedback is implemented, and the rate-limiting parameters for N uptake by plants and N uptake by microbes are adjusted until NPP and N uptake by plants are equal to their target values. The initial value of each parameter is based on the calibration results for boreal forest in previous studies [Zhuang et al., 2003]. Each time a parameter is adjusted, the model is integrated with long-term average climate data and the initial atmospheric CO2 concentration of the simulation period (295 ppm, as in year 1922) until the modeled annual NEP converges to nearly zero [Clein et al., 2002; Zhuang et al., 2001]. The model outputs are then checked to make sure that the simulated fluxes (annual NPP, GPP, N uptake) and pool sizes (soil C and N, available N) match the field-based estimates of the calibration site within a certain tolerance (e.g., 1%).
If this criterion is not met, the parameter is adjusted up or down, based on how the parameter affects the biogeochemical process, until the target value is reproduced within the tolerance. This set of optimized rate-limiting parameters (Table 3) for a specific site is then used for regional extrapolation of the model. One of the assumptions in the parameterization process is that the selected target site functions as a mature ecosystem. In other words, the modeling system reaches equilibrium when its C and N pools do not change with time given no disturbance. Because of concerns about parameter covariance and equifinality in the model calibration [Medlyn et al., 2005], and because of the lack of sufficient data to constrain moisture-related C and N regulating parameters, we chose to calibrate only the parameters that are most influential on vegetation C and N assimilation (Cmax, Nmax), autotrophic and heterotrophic respiration (Kr, Kd), vegetation C and N turnover (Cfall, Nfall), and microbial N uptake (Nup) [Tang and Zhuang, 2009]. Note that these parameters control processes at the monthly time scale and are not equivalent to ecosystem physiology parameters that are often measured at the second to minute time scale.
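The adjust-and-check loop described above can be sketched as follows. This is a simplified stand-in rather than the TEM calibration code: it adjusts a single parameter by bisection, assumes the simulated flux increases monotonically with that parameter, and replaces the equilibrium model run with a hypothetical function `toy_equilibrium_gpp`. The function names, the bracket, and the target value are all illustrative assumptions.

```python
def calibrate(simulate, target, lower, upper, rel_tol=0.01, max_iter=100):
    """Adjust one rate-limiting parameter until the simulated value
    matches the target within a relative tolerance (1% by default).

    simulate: callable mapping a parameter value to an equilibrium flux
              or pool; assumed to increase with the parameter.
    lower, upper: bracket assumed to contain the solution.
    Returns (parameter, simulated_value).
    """
    for _ in range(max_iter):
        param = 0.5 * (lower + upper)
        simulated = simulate(param)
        if abs(simulated - target) <= rel_tol * abs(target):
            return param, simulated
        if simulated < target:
            lower = param   # simulated value too low: move the parameter up
        else:
            upper = param   # simulated value too high: move the parameter down
    raise RuntimeError("calibration did not converge within max_iter")


def toy_equilibrium_gpp(cmax):
    """Hypothetical saturating response of equilibrium GPP to Cmax.
    A placeholder for running the ecosystem model to equilibrium."""
    return 1200.0 * cmax / (cmax + 500.0)


cmax, gpp = calibrate(toy_equilibrium_gpp, target=850.0, lower=0.0, upper=5000.0)
print(f"calibrated Cmax = {cmax:.1f}, simulated GPP = {gpp:.1f}")
```

In the procedure described in the text, a loop of this kind would be applied to each parameter in turn (GPP, then RA, then RH, then the N cycle parameters), with each evaluation requiring a model run to equilibrium.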
Table 3. Rate-Limiting Parameters That Were Calibrated in This Study
WS = White Spruce; BS = Black Spruce; BR = Birch; AP = Aspen; PP = Poplar; Needle = Needleleaf evergreen forest; Broad = Broadleaf cold deciduous forest.
Vegetation C and N assimilation rates
Cmax (g m−2 mo−1): Maximum monthly rate of C assimilation
Nmax (g m−2 mo−1): Maximum monthly rate of N uptake by vegetation
Plant and soil respiration rates
Kr (g g−1 mo−1): Plant respiration rate at 0°C per gram vegetation C
Kd (g g−1 mo−1): Heterotrophic respiration rate at 0°C per gram soil organic C at optimum soil moisture
Vegetation C and N turnover and microbial N uptake
Cfall (g g−1 mo−1): Monthly proportion of vegetation C lost as litterfall
Nfall (g g−1 mo−1): Monthly proportion of vegetation N lost as litterfall
Nup: Monthly ratio between N immobilized and C respired by heterotrophs
2.4 Regional Input Data
 We used the static spatially explicit data sets of soil texture and elevation from Zhuang et al. For vegetation data, a forest type map from Ruefenacht et al. was resampled from 250 m to a 0.05° × 0.05° (longitude × latitude) resolution (Figure 1). In addition, we used daily time series data of air temperature, precipitation, and vapor pressure from the Vegetation Ecosystem Modeling and Analysis Project [Kittel et al., 2000], which we averaged to monthly temporal resolution. Specifically, we used the historical climate (1922–1996) and the future HadCM2 scenario (1997–2099) for this study. The atmospheric CO2 concentration data for the historical period (1765–1990) were developed from Enting et al. The future atmospheric CO2 concentrations (1990–2100) were estimated by the Bern global C cycle model for IS92a emission data [Joos et al., 1996].
2.5 Simulation Protocol
 All three sets of simulations were based on the same vegetation distribution map and climate data but used different vegetation classification levels, i.e., species, PFT, and biome. The total Alaskan boreal forest, which includes other forest species in addition to the five species considered in this study, occupies about 52 million hectares, of which 84% is covered by the five forest species. We simulated C dynamics from 1922 to 2099 for only that part of the Alaskan boreal forest dominated by the five species.
3.1 Comparison of Parameter Estimates Among Parameterization Methodologies
 Calibrated key plant parameters exhibited a broad range among the parameterization methodologies (Table 3). There was substantial overlap between needleleaf and broadleaf species for all of the calibrated parameters of the species-level parameterizations. The needleleaf parameterization had parameter values that were more similar to those for white spruce than to those for black spruce, which reflects the greater area of white spruce (51% of the area) than that of black spruce (35% of the area). Similarly, the broadleaf parameterization had parameter values that were more similar to those for paper birch (11% of the area) than to those for aspen and poplar (3% of the area). The most notable parameter differences between the needleleaf and broadleaf parameterizations are the estimates for the maximum rate of N assimilation (42.6 vs. 162.6 g m−2 mo−1), the proportion of vegetation N loss as litterfall (0.0059 vs. 0.017 g g−1 vegetation N mo−1), and the base gram-specific rate of heterotrophic respiration at 0°C (0.0017 vs. 0.0041 g g−1 organic matter mo−1). The comparison of these parameters indicates that broadleaf forests have a faster rate of N cycling than needleleaf forests with respect to rates of N uptake, rates of N loss in litterfall, and rates of organic N released in inorganic forms in decomposition. It is notable that the biome-level parameterization had parameter values that are more similar to those of the needleleaf parameterization than to those of the broadleaf parameterization, which suggests that it may not represent the function of broadleaf forests within the region.
3.2 Regional C Dynamics of Vegetation Classifications
 In comparison to the species-level and PFT-level simulations, the biome-level simulations produced the lowest estimates for NPP and NEP for both the 1990s and 2090s (Table 4). The total annual average NPP and NEP during the 1990s were 100.3 Tg C yr−1 and 8.0 Tg C yr−1, respectively, for the species-level simulations; 97.5 Tg C yr−1 and 8.1 Tg C yr−1 for the PFT-level simulations; and 89.8 Tg C yr−1 and 7.1 Tg C yr−1 for the biome-level simulations. White spruce was responsible for more than half of the total NEP in the species-level simulations, followed by black spruce, birch, aspen, and poplar. The rank order of NEP among the five species corresponded to the relative magnitude of their geographical coverage, with white spruce covering slightly more than half of the total area of the five species, and poplar covering the least area. Needleleaf evergreen forest (white spruce and black spruce) dominated the NEP of the species-level and PFT-level simulations because of its widespread distribution in the Alaskan boreal forest (Table 4).
Table 4. Mean Annual NPP and NEP in the Alaskan Boreal Forest Over the Period of 1990–1999 and 2090–2099 in Species Level, PFT Level, and Biome-Level Simulations
Area indicates the percentage of specific species coverage in total terrestrial area of Alaska.
 A similar order in the magnitude of C fluxes among the three parameterization strategies generally held through the whole simulation period. The biome-level simulations had the lowest heterotrophic respiration and cumulative NEP (Figure 2), whereas the species-level simulations estimated slightly higher heterotrophic respiration and cumulative NEP than the PFT-level simulations. By the end of the 21st century, the species-level simulations predicted a total C sequestration of around 1.75 Pg C, followed by the PFT-level simulations with 1.65 Pg C and the biome-level simulations with 1.50 Pg C (Figure 2). The 0.25 Pg C difference between the highest and lowest estimates is equivalent to 2.6 g C m−2 yr−1 during the 178 years from 1922 to 2099.
 In general, the species-level simulations produced estimates similar to those of the PFT-level simulations for soil and vegetation C and N pools (Figure 3). The biome-level simulations had the lowest estimates for soil organic C and N and vegetation C, but the highest estimate for vegetation N. Vegetation and soil C both increased in the three simulations, indicating an enhanced C sink for atmospheric CO2. There was a small increase in vegetation N and a small decrease in soil organic N during the simulations because of a reallocation of N from soil to vegetation.
3.3 C Dynamics Within Species, PFT, and Biome Classifications
 Several distinct differences were observed between the needleleaf and broadleaf simulations, whereas differences among the corresponding species within each PFT class were minimal. Needleleaf evergreen forest had a significantly higher GPP (g C m−2 yr−1) than broadleaf cold deciduous forest during the 1990s (Figure 4a; paired sample t test p < 0.05, t = 33.3, n = 10); the GPP of needleleaf forest was intermediate between the GPP values of the needleleaf species, but the GPP of the broadleaf cold deciduous forest was less than the GPP of the broadleaf species. Despite the high GPP in needleleaf forest, the NPP of the needleleaf forest was much lower than that of broadleaf cold deciduous forest (Figure 4b) because of a higher consumption of C in maintenance respiration (autotrophic respiration). The NEP of broadleaf cold deciduous forest was not significantly different from that of needleleaf forest (paired sample t test p = 0.27, t = 1.18, n = 10). NEP in all three sets of simulations showed large interannual variation (Figure 4c), especially in the simulations for poplar and aspen, in which the standard deviation was about three times the mean NEP, whereas for other vegetation categories, the standard deviation was about 1–1.5 times the mean. The net N mineralization rate (g N m−2 yr−1) was distinctly different between broadleaf and needleleaf forest (Figure 4d), a difference that was also confirmed by the species-specific values. This indicates clear differences in N turnover rate between the two PFTs, which reflect different degrees of N limitation between broadleaf and needleleaf forest. The net N mineralization rate in the biome-level simulation was much more similar to that estimated by the needleleaf forest simulation than to that estimated by the broadleaf simulation (Figure 4d).
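For readers who want to reproduce this kind of comparison, the paired-sample t statistic and the ratio of standard deviation to mean used above can be computed with the Python standard library, as in the sketch below. The function names are hypothetical and no simulation output is included; inputs would be two series of annual values for the same years (n = 10 for the 1990s).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """Paired-sample t statistic and degrees of freedom for two
    equal-length series observed in the same years."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the yearly differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


def relative_variability(series):
    """Sample standard deviation expressed as a multiple of the absolute mean."""
    return stdev(series) / abs(mean(series))
```

Converting the t statistic to a p value requires the t distribution with n − 1 degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) returns both in one call.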
 The differences in estimated C and N pools among species within and among PFTs during the 1990s (Figure 5) did not parallel those of the estimated C and N fluxes. Black spruce and poplar were estimated to have a very high soil organic C storage in comparison to the other three species (Figure 5a), which primarily reflects differences in the target soil C levels of the parameterizations (see Table 1). Aspen and white spruce have similar levels of soil organic N (Figure 5b), and black spruce has a much lower vegetation N content than the other species (Figure 5d). White spruce has the highest vegetation C and black spruce the lowest (Figure 5c).
4.1 Evaluation of Model Simulations and Limitations
 In all three simulations, the average NPP (170–285 g C m−2 yr−1) for boreal forest fell within the 52–868 g C m−2 range reported by Gower et al. from field studies. NPP simulated for white and black spruce (130–180 g C m−2 yr−1) and aspen (200–450 g C m−2 yr−1) in Figure 4b was close to the 200–400 g C m−2 yr−1 Alaskan estimate from the study of Keyser et al., which used the BIOME-BGC model. Our species average NEP is within the range estimated by Yarie and Billings with the CENTURY model for the Alaskan forest. They estimated that the current Alaskan boreal forest absorbs approximately 9.65 Tg C yr−1, which is comparable to the 7.1 to 8.1 Tg C yr−1 estimated during the 1990s in this study across the parameterization strategies. The increase in NEP over the course of our simulations largely occurred because NPP increased at a faster rate than heterotrophic respiration. This may be because of the effects of climate in enhancing N mineralization and plant N uptake, as evidenced by the shift in N from soils to vegetation [Gerten et al., 2008].
 Some data-oriented approaches directly extract relationships between ecosystem responses and climatic controls and thus can serve as benchmarks for process-based models [Abramowitz et al., 2007; Keenan et al., 2012b; Moffat et al., 2010]. Here we draw on the recent global-scale model-data-fusion estimates of GPP of Beer et al., which provide the median value of the mean annual GPP averaged over 1998–2005 from five model-data-fusion approaches at a resolution of 0.5° × 0.5°. We extracted the grid cells identified as boreal forest (biome level) according to our vegetation distribution map. The resulting area-weighted GPP is 440 g C m−2 yr−1. The model-data-fusion-derived GPP is much lower than our model output, which is around 850 g C m−2 yr−1 for all three vegetation schemes during 1990–1999. It is difficult to determine whether our estimates are biased because of the calibration sites we chose or whether the Beer et al. estimate is biased because it relied on only one FLUXNET site in Alaska (US-Bn2) in its analysis. The site used by Beer et al. is a 15 year old site that last burned in 1987. The crown fire killed all of the aboveground vegetation, which consisted primarily of black spruce. As of 2002, the overstory of the site was dominated by heterogeneous aspen and willow species [Chambers and Chapin, 2002; Liu et al., 2005; O'Neill, 2003; O'Neill et al., 2006], a vegetation type that represents less than 3% of the region in our vegetation map. In contrast to the 15 year old burned site, TEM relied on mature forest sites that ranged from 50 to 130 years old, with a mean of around 70 years [Viereck et al., 1983]. The stand age of forest after fire disturbance has a significant impact on production [Goulden et al., 2011].
For example, NPP measured at seven black spruce-dominated sites comprising a boreal forest chronosequence in Canada was low (5–100 g C m−2 yr−1) immediately after fire and high (332–521 g C m−2 yr−1) 12–20 years after fire [Bond-Lamberty et al., 2004]. Total NPP of boreal forest has been documented to peak in midsuccession (23 through 74 year old stands) [Goulden et al., 2011; Mack et al., 2008]. Overall, these data suggest that a model-data-fusion approach is not likely to produce a robust unbiased estimate of GPP over interior Alaska if it relies on only one early successional site in its methodology. Of course, our limited selection of five sites for model parameterization offers no guarantee of producing an unbiased estimate either. Independent validation with forest inventory and other data in interior Alaska is required to evaluate whether models are producing unbiased estimates [e.g., see Yuan et al., 2012]. Our simulation results for the regional total C storage and our sink or source conclusions did not consider changes in disturbance regime and shifts in vegetation composition. Some previous studies suggest a weakening sink in the northern high-latitude terrestrial ecosystems [Hayes et al., 2011; Denman et al., 2007] because of increased fire and a deepening active layer, which may cause previously frozen soil organic C to be released at a faster rate than increases in NPP from longer growing seasons, enhanced N availability, and CO2 fertilization. We note that our study did not consider these issues in calculating regional C dynamics, and we chose to evaluate the C dynamics of “mature” forests in the region to gain some insight into alternative ways of using data to parameterize models applied to the region.
4.2 Implications for Vegetation Classification Methodologies in Regional Modeling
 The differences in model outputs between species and aggregated vegetation schemes may be explained by the smoothing effect of species aggregation on model parameters. The suite of parameter values of the needleleaf and broadleaf parameterizations was similar to those of the dominant species, white spruce among needleleaf species and paper birch among broadleaf species. Thus, there appeared to be a loss of information about the function of black spruce and of aspen and poplar in the simulations. However, the needleleaf and the broadleaf simulations were functionally quite distinct with respect to how the parameters represented N cycling. Specifically, the broadleaf parameterization had parameters that result in greater rates of vegetation N uptake from the soil, higher rates of N lost from vegetation to the soil, and higher rates of N released from organic to inorganic N in decomposition. The availability of inorganic N is largely controlled by N mineralization, which is expected to be altered by C input to soil, litter quality, soil temperature, and soil water content [Bonan and Van Cleve, 1992; Gärdenäs et al., 2011; Reich et al., 2006]. These differences between the parameterizations represent functional differences that have been noted between needleleaf and broadleaf forests in the boreal region [Van Cleve et al., 1993]. Thus, although some information about the role of black spruce, aspen, and poplar was lost in aggregating to PFTs, the essential functional features of needleleaf vs. broadleaf forests were maintained in the calibrations. In contrast, the biome-level parameterization was very similar to the needleleaf parameterization, and therefore the functional aspects of broadleaf forests were lacking in simulations using that parameterization. In particular, the biome-level parameterization is likely to have overestimated the effects of N limitation on C assimilation [Reich and Hobbie, 2013; Vitousek and Howarth, 1991] within the Alaska region.
It is worth noting that when there is a strong nonlinearity between parameters and model outputs (i.e., the “fallacy of the averages” [Rastetter et al., 1992; Wagner, 1975]), the functional use of resources by species may not be reflected in the parameterizations for PFTs and biomes [Chapin et al., 1987; Rastetter et al., 2001].
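The averaging effect described above can be illustrated with a minimal sketch. It assumes a hypothetical saturating (Michaelis-Menten-type) response of N uptake to a half-saturation parameter. The function `uptake`, the parameter values, and the cover fractions are illustrative assumptions, not the TEM formulation or values calibrated in this study.

```python
def uptake(n_avail, vmax, k):
    """Hypothetical saturating response of N uptake (not the TEM equation)."""
    return vmax * n_avail / (k + n_avail)


def weighted_mean(values, weights):
    """Cover-weighted average, normalized by the sum of the weights."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Illustrative numbers only: a dominant and a subordinate species in one PFT.
cover = [0.7, 0.3]        # fractional cover (dominant, subordinate)
k_values = [1.0, 9.0]     # assumed half-saturation parameter per species
vmax = 10.0               # assumed maximum uptake rate, shared by both
n_avail = 3.0             # assumed available inorganic N

# Aggregated (PFT-level) parameter: pulled toward the dominant species.
k_pft = weighted_mean(k_values, cover)
dist_to_dominant = abs(k_pft - k_values[0])
dist_to_subordinate = abs(k_pft - k_values[1])

# Species-level approach: run the model per species, then average outputs.
flux_per_species = [uptake(n_avail, vmax, k) for k in k_values]
mean_of_outputs = weighted_mean(flux_per_species, cover)

# Aggregated approach: average the parameter first, then run the model once.
output_of_mean = uptake(n_avail, vmax, k_pft)

# A nonzero difference is the aggregation error caused by nonlinearity.
aggregation_bias = output_of_mean - mean_of_outputs

print(f"PFT-level k: {k_pft:.2f}")
print(f"mean of species outputs: {mean_of_outputs:.4f}")
print(f"output of mean parameter: {output_of_mean:.4f}")
print(f"aggregation bias: {aggregation_bias:.4f}")
```

With these illustrative numbers, the aggregated parameter (3.4) lies closer to the dominant species' value (1.0) than to the subordinate species' value (9.0), and running the model on the averaged parameter gives a flux of about 4.69 rather than the cover-weighted flux of 6.0. The sign and size of such a bias depend on the curvature of the response and on how far apart the species' parameters are.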
 Our analysis of parameter values suggests that the application of PFT-level and biome-level parameterization methodologies is likely to be biased relative to the application of a species-level methodology. This potential for bias may lead to further biases in models that consider changes in vegetation composition when environmental changes involving climate, N deposition, and atmospheric CO2 concentration tend to favor certain species or plant traits [Dukes and Mooney, 1999; Hellmann et al., 2008]. With the species-level simulations, we were able to differentiate the responses among species that possibly reflect functional differences among the species embodied in particular parameters that we calibrated (e.g., maximum rate of C and N assimilation). As summarized by Van Bodegom et al., vegetation attributes can differ strongly depending on climate [e.g., Moorcroft, 2006], soil fertility [Ordoñez et al., 2009], and hydrology [e.g., Wright et al., 2005], within and between PFTs. A single PFT with fixed attributes risks failing to capture various vegetation attributes, including those that are responsive to environmental changes such as adaptation [Guisan and Thuiller, 2005] or C-nutrient feedback [Gerber et al., 2010]. Our results suggest that the different ways of using observations, either grouped into ensembles of PFTs or biomes, or directly applied to models when species data are available, will have considerable impacts on regional model extrapolations and resulting estimates of C dynamics. More specifically, inappropriate classification strategies may underestimate or overestimate the exchange of C with the atmosphere.
4.3 Implications for Future Observational and Experimental Activities
 An appropriate classification of species into PFTs is important in regional to global-scale C cycle modeling, and to achieve this we need to justify classification strategies. An example is the climate change projection by Cox et al. that included both the C cycle and dynamic vegetation. The simulation exhibited a severe Amazonian forest dieback by the end of the 21st century. When the phenomenon was later investigated, it was found to be caused by interactions between drought and unrealistic vegetation dynamics simulated by the model. While the main driver of the simulated “dieback” is related to projected rainfall reductions and the subsequent severe drought [Cox et al., 2004; Gash et al., 2004; Huntingford et al., 2008], the “dieback” was also caused by the response of trees that were overestimated in the savanna regions by the TRIFFID dynamic vegetation model. Specifically, the absence of fire-disturbance processes caused the vegetation model to overestimate the response of tropical forest to climate [Cox et al., 2004]. Failing to predict responses of successional trajectories and the subsequent shift in vegetation composition could potentially lead to large errors in quantifying C dynamics. In the case of our study, black spruce is classified into the needleleaf evergreen forest PFT, but is well known as a fire-prone species [Lynch et al., 2002; Van Cleve et al., 1983a]. The parameters estimated for black and white spruce were quite different in our study, and our simulation results also showed substantial differences between the two species in soil C and N and vegetation C and N storage. Grouping black spruce with white spruce may alter the simulated responses to fire regime and the subsequent vegetation trajectories, resulting in even larger biases in simulated C dynamics. 
Future observational and experimental studies should focus on better identification of species-specific functional characteristics and provide an improved empirical basis to appropriately classify species into PFTs. It would be useful to develop a theoretical basis for classification to justify the aggregation of some species into PFTs and the representation of single species in a hybrid approach to simulate regional and global C dynamics.
 We used a process-based biogeochemistry model, the TEM, to examine a three-level hierarchy of alternative ways of using field-based estimates in the model calibration process: species-level simulations, intermediate PFT-level simulations, and biome-level simulations, for five common Alaskan boreal forest species. We found that calibrated key ecosystem parameters differed substantially among species and overlapped for species that would be categorized into different PFTs. Our analysis of parameter sets suggests that the PFT-level parameterizations primarily reflected the dominant species and that functional information of some species was lost from the PFT-level parameterizations. Furthermore, the biome-level parameterization was primarily representative of the needleleaf PFT and lost information on broadleaf species/PFT function. Species-level and PFT-level simulations from 1922 to 2099 had similar estimates of C fluxes and pools, whereas the biome-level simulations consistently produced the lowest estimates. This indicated that the PFT-level simulations were potentially representative of the performance of species-level simulations, and that biome-level modeling is most likely to produce biased results. Our results also suggested that the three options for using observations could result in differences in estimating C dynamics at the regional scale, and that improved theoretical and empirical justifications for grouping species into PFTs or biomes are needed. Future observational and experimental studies should focus on better identification of species-specific functional characteristics to provide an improved theoretical basis for appropriately classifying species into PFTs.
 We are very grateful to the Editor, Dennis D. Baldocchi, two anonymous reviewers, and David W. Kicklighter for their constructive comments that helped us to improve this paper. This research is supported by an NSF project (DEB-0919331), the NSF Carbon and Water in the Earth Program (NSF-0630319), the NASA Land Use and Land Cover Change program (NASA-NNX09AI26G), the Department of Energy (DE-FG02-08ER64599), and the NSF Division of Information and Intelligent Systems (NSF-1028291). Support was also provided by the Bonanza Creek Long-Term Ecological Research program (funded jointly by NSF and the USDA Forest Service).