Diagnosing errors in a land surface model (CABLE) in the time and frequency domains

Authors


Abstract

[1] We present an approach for diagnosing errors in land surface models in the time and frequency domains. The approach is applied to the Community Atmosphere-Biosphere-Land Exchange (CABLE). We compare modeled and observed fluxes of net ecosystem carbon exchange (NEE), latent heat (LE) and sensible heat (H) at two different forested flux station sites. Using wavelet analysis, we identify the frequencies where model errors are relatively large, and then analyze the sensitivities of model errors at those frequencies to selected model parameters, yielding significant improvements in several measures of model performance. At Harvard Forest, the predictions by CABLE using the modified parameter values reduce model errors at daily to yearly timescales for all three fluxes, but not at interannual timescales for NEE. At the Tumbarumba site, predictions by CABLE with modified parameter values reduce the variance of model errors of all three fluxes from daily to interannual timescales, but do not improve the agreement with the observed mean diurnal or daily time series for H. We conclude that analyzing both mean and variance of model errors at a range of time/frequency scales is useful for identifying and reducing errors of a land surface model.

1. Introduction

[2] Global land-surface models (LSMs) have evolved over the last three decades. Some models now include vegetation dynamics and biogeochemistry at timescales from decades to centuries [Pitman, 2003; Raupach et al., 2005]. The lack of direct observations of fluxes and other prognostic variables at spatial and temporal scales comparable to global climate model predictions makes it difficult to validate simulations by LSMs and therefore ascertain the confidence level of their predictions. Thus most model evaluations have used time series of fluxes and other observations at local spatial scales to identify possible errors in modeled processes. A recent study using ten global climate models coupled with LSMs showed that nearly all models failed to reproduce the observed energy fluxes in selected regions [Dirmeyer et al., 2006]. Model errors were partly due to inaccurate responses of the latent heat flux to soil water availability while differences in simulated precipitation also contributed to the errors. To separate the errors in atmospheric processes from those in the LSMs, Abramowitz [2005], Stöckli et al. [2008] and Randerson et al. [2009] used flux data and other observations to benchmark the performance of LSMs to identify and then reduce systematic model errors. Most previous studies analyzed model errors by comparing model simulations with observations in the time domain, using sensitivity analysis to identify the likely causes for the errors [e.g., Medlyn et al., 2005]. A deficiency of this approach is that it is unable to differentiate errors at a range of timescales, because many of the processes in the model have different response times to external inputs [Baldocchi and Wilson, 2001; Stoy et al., 2007]. Analysis of model errors at different frequencies may provide some very useful information for diagnosing errors and improving model predictions.

[3] In this paper we assess sources of error in the Community Atmosphere Biosphere Land Exchange model (CABLE) [Kowalczyk et al., 2006] in the time and frequency domains using surface flux measurements at two different forest sites. In the time domain we compare the mean fluxes of the modeled and observed net ecosystem carbon exchange (NEE), latent heat (LE) and sensible heat (H) at diurnal, seasonal and interannual timescales. In the frequency domain we use wavelet analysis [Torrence and Compo, 1998] to partition discrepancies between model predictions and observations of NEE, LE and H into multiple frequencies. This allows us to assess errors arising from short-term processes, such as photosynthetic response to absorbed radiation, and from relatively long-term responses, such as the dependence of evapotranspiration on soil moisture. Thus the objective of this study is to demonstrate what we can learn from analysis of model errors in the time and frequency domains.

[4] In what follows, section 2 describes the evolution of the treatment of biophysical processes included in CABLE, the two forest sites and measurements used to evaluate CABLE, and the methods of time series and wavelet analyses for comparing model predictions with observations; section 3 presents the results of time series and wavelet analysis; section 4 discusses the results; and finally we draw conclusions in section 5.

2. Methods

2.1. Model Description

2.1.1. Model Evolution

[5] The first global LSM developed at CSIRO in 1990 used a single soil type, constant roughness length over land and no vegetation [Kowalczyk et al., 1991]. It used the soil-moisture scheme of Deardorff [1977] and the force-restore method of Deardorff [1978] to calculate surface temperature. Explicit representation of vegetation was added using the big-leaf canopy approach [Sellers et al., 1986], and the improved LSM was incorporated into the CSIRO global climate model in 1993 [Kowalczyk et al., 1994]. In 1995, the soil module was modified to include six soil layers and three snow layers before coupling the LSM to the CSIRO Division of Atmospheric Research Limited Area Model (DARLAM). In this version, surface fluxes were calculated as the weighted mean of fluxes from bare ground and vegetation, based on the fraction of ground covered by vegetation.

[6] In the Soil Canopy Atmosphere Model (SCAM), developed by Raupach et al. [1997], the modeled canopy was located above the soil surface to allow for a more realistic aerodynamic coupling of land and atmosphere using the localized near-field dispersion theory of Raupach [1989a, 1989b] and a simple model for vegetation roughness length [Raupach, 1994]. An empirical model of leaf stomatal conductance was combined with Beer's law to calculate the amount of radiation absorbed by the canopy and ground surface without differentiating between direct beam and diffuse radiation, or sunlit and shaded leaves. SCAM coupled to DARLAM was tested using field measurements in southeast Australia by Finkele et al. [2003].

[7] Wang and Leuning [1998] developed a one-layer, two-leaf canopy model as a simplification of the multilayer model of Leuning et al. [1995]. Key features of the two-leaf canopy model are the calculation of leaf energy balances and photosynthesis separately for sunlit and shaded leaves. Photosynthesis, stomatal conductance and the leaf energy balance are fully coupled using the quasi-mechanistic leaf-level model presented by Leuning [1995]. The two-leaf model, named the CSIRO Biosphere Model (CBM), has been used to examine the information content of flux measurements when estimating model parameters [Wang et al., 2001, 2007], in estimating the carbon balance of Australia using multiple constraints [Wang and McGregor, 2003], and in designing an optimal sampling network for atmospheric inversion studies [Law et al., 2004].

[8] The first version of CABLE was developed in 2003 and it combined all features of the predecessor LSMs described above [Kowalczyk et al., 2006]. In particular, CABLE combines the two-leaf, sun-shade canopy model from CBM developed by Wang and Leuning [1998], the model for surface roughness and aerodynamic resistance from SCAM and the soil and snow model developed by Kowalczyk et al. [1994], improved later by Gordon et al. [2002]. The performance of CABLE compares favorably with other major global LSMs in simulating surface fluxes of CO2, latent and sensible heat for different vegetation types [Abramowitz et al., 2007; Wang et al., 2007]. It has also been used to study systematic model errors [Abramowitz, 2005], effects of land cover change on regional climate [Cruz et al., 2010], and regional water balances [Zhang et al., 2010]. It has been adopted as the Australian community LSM and is a key component of the Australian Community Climate Earth System Simulator (ACCESS) (see http://www.accessimulator.org.au).

2.1.2. Model Components and Integration

[9] CABLE consists of five components: (1) the radiation module describes radiation transfer and absorption by the sunlit and shaded leaves; (2) the canopy micrometeorology module describes the surface roughness length, zero-plane displacement height, and aerodynamic conductance from the reference height to the air within canopy or to the soil surface; (3) the surface flux module includes the coupled energy balance, transpiration, stomatal conductance and photosynthesis of sunlit and shaded leaves; (4) the soil module describes the heat and water fluxes within soil and snow at their respective surfaces; and (5) the ecosystem carbon module accounts for the respiration of stem, root and soil organic carbon decomposition. The integration time step of this model is either half-hourly or hourly. A detailed description of all five modules is provided in the above references and in Appendix A.

[10] To initialize the state variables in CABLE, the model is spun up by recycling the observed annual meteorological forcing until, at the same time step in two successive runs, the soil moisture in each soil layer differs by <0.001 m3 H2O/m3 soil and the soil temperature differs by <0.1 K. Forcing variables include incoming short-wave and long-wave radiation, air temperature, specific humidity, air pressure, wind speed, precipitation and ambient CO2 concentration. We then use the values of soil temperature, moisture and canopy water storage at the last time step of the spin-up as initial values for model runs. Model outputs used in this study are the fluxes of latent heat (LE), sensible heat (H) and net ecosystem carbon exchange (NEE). Fluxes are positive when directed from the surface to the atmosphere. The version of CABLE used in this study does not include the dynamics of carbon pools.
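The spin-up criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_year` is a hypothetical callable standing in for one year of CABLE driven by the recycled forcing, `toy_run_year` is an invented relaxation model used purely to exercise the loop, and the cap of 500 cycles is an arbitrary safeguard that does not come from the text.

```python
from typing import Callable, List, Tuple

# State at one time step: (soil moisture per layer, soil temperature per layer).
State = Tuple[List[float], List[float]]


def has_converged(prev_year: List[State], curr_year: List[State],
                  tol_moisture: float = 1e-3, tol_temp: float = 0.1) -> bool:
    """True if every layer differs by less than the tolerances at every matching time step."""
    for (m_prev, t_prev), (m_curr, t_curr) in zip(prev_year, curr_year):
        if any(abs(a - b) >= tol_moisture for a, b in zip(m_prev, m_curr)):
            return False
        if any(abs(a - b) >= tol_temp for a, b in zip(t_prev, t_curr)):
            return False
    return True


def spin_up(run_year: Callable[[State], List[State]], initial: State,
            max_cycles: int = 500) -> Tuple[State, int]:
    """Recycle one year of forcing until two successive runs agree.

    Returns the state at the last time step and the number of yearly cycles run.
    """
    prev_year = run_year(initial)
    for cycle in range(2, max_cycles + 1):
        curr_year = run_year(prev_year[-1])
        if has_converged(prev_year, curr_year):
            return curr_year[-1], cycle
        prev_year = curr_year
    raise RuntimeError("spin-up did not converge within max_cycles")


def toy_run_year(state: State) -> List[State]:
    """Hypothetical stand-in for the model: relax toward a fixed equilibrium in 4 steps."""
    moisture, temperature = state
    year = []
    for _ in range(4):
        moisture = [x + 0.5 * (0.3 - x) for x in moisture]
        temperature = [x + 0.5 * (285.0 - x) for x in temperature]
        year.append((moisture, temperature))
    return year


final_state, cycles = spin_up(toy_run_year, ([0.1], [270.0]))
```

The state returned at the last time step plays the role of the initial values used for the model runs.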

2.1.3. Model Parameters

[11] When CABLE is run globally, values for vegetation and soil model parameters are obtained from look-up tables [Kowalczyk et al., 2006]. CABLE uses vegetation types defined by the International Geosphere and Biosphere Program [Loveland et al., 2000] and the soil texture types of Zobler [1999]. The same look-up tables and parameter values are used in this study, except for leaf area index and canopy height that were estimated for each of the two sites.

[12] Two sets of simulations were performed using CABLE. One used default values of all model parameters as used for global modeling and the other used modified parameter values chosen by inspection after analysis of model errors in the time and frequency domains (see section 2.3). No parameter optimization is attempted here, as the objective of this study is to demonstrate the usefulness of analyzing model errors in the time and frequency domains for diagnosing model errors.

2.2. Measurements

[13] We used eight to fifteen years of measurements from two forested sites: a mixed broadleaf forest at Harvard, USA and a temperate evergreen broadleaf forest at Tumbarumba, Australia. These two sites were chosen because of the availability of relatively long time series of high-quality flux measurements (fraction of gap-filled measurements is <15% at both sites) and because they experience quite different climates, water availability, seasonal variations in the fluxes and canopy leaf area indices. Measurements from these two forest sites provide a good test of our approach for analyzing model errors. A brief description of each site is given below.

[14] The Harvard Forest site is located at 42.538°N, 72.171°W in Massachusetts, USA. The soil is a silt loam and the vegetation is dominated by deciduous hardwood trees (Quercus rubra and Acer rubrum) and evergreen conifers (Tsuga canadensis, Pinus strobus and Pinus resinosa). The IGBP classification is mixed forest. The canopy leaf area index varies between 1.5 and 5.4 seasonally (Figure 1). The mean annual rainfall was 1137 mm from 1992 to 2006. Soil water was not limiting for most years, but may have been limiting for 1997, 1998 and 2001 when the annual rainfall was below 1000 mm [see Urbanski et al., 2007]. Gap-filled measurements of all meteorological variables and surface fluxes from 1992 to 2006 used in this study were obtained from the data archive of the Carbon Dioxide Information Analysis Center of the Oak Ridge National Laboratory. Urbanski et al. [2007] give further details about the data.

Figure 1.

Observed variation of the canopy leaf area index (LAI) of the Harvard forest (gray curve) and that of the Tumbarumba forest (black curve).

[15] The Tumbarumba site is an evergreen broadleaf forest located at 35.656°S, 148.152°E in southeast Australia. The dominant species in the upper canopy layer are Eucalyptus delegatensis and E. dalrympleana while the patchy understorey consists of shrubs and grasses. The total leaf area index varies between 2.3 and 3.5, as estimated from remote sensing measurements and calibrated locally [Leuning et al., 2005; J. Verbesselt, personal communication, 2010] (see Figure 1). The Zobler soil texture type is sandy loam. All meteorological variables required by CABLE and fluxes of NEE, LE and H were measured continuously at a height of 70 m from 2001 to 2008. The forest experienced severe drought and insect infestation in 2003 [Keith et al., 2009] and the lowest summer rainfall in 2006/2007 over the period. Further details about the site and measurements conducted there can be found in the works of Leuning et al. [2005] and Keith et al. [2009].

2.3. Overall Performance of Model Predictions and Analysis of Model Errors

[16] In this study, we compare the modeled and observed fluxes of NEE, H or LE. The model error at time step n is defined as the difference between the observed (On) and modeled (Pn) flux, or On − Pn. The sum of squared model errors, U, is calculated as

$$U = \sum_{n=0}^{N-1} (O_n - P_n)^2 \tag{1}$$

The index d of Willmott [1981] is often used to assess the degree of agreement between model predictions and observations. It is defined as

$$d = 1 - \frac{\sum_{n=0}^{N-1} (P_n - O_n)^2}{\sum_{n=0}^{N-1} \left( |P_n - \bar{O}| + |O_n - \bar{O}| \right)^2} \tag{2}$$

where $\bar{O}$ is the mean observed flux, calculated as $\bar{O} = \frac{1}{N}\sum_{n=0}^{N-1} O_n$, and N is the total number of observations. A value of 0 for d means no agreement whereas a value of 1 means perfect agreement. Note that unlike r2 (see below) d is sensitive to differences between the observed and model means.
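As a small illustration of these two measures, the following Python sketch computes the sum of squared errors U and Willmott's index d for a pair of toy series. The function names and the example values are invented for illustration. The second prediction series is the observations shifted by a constant, so its correlation with the observations is perfect while d falls below 1, which is the sensitivity to the means noted above.

```python
from typing import Sequence


def sum_squared_error(obs: Sequence[float], pred: Sequence[float]) -> float:
    """U: sum over all time steps of (O_n - P_n)^2."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred))


def willmott_d(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Index of agreement d of Willmott [1981]; 1 means perfect agreement."""
    o_mean = sum(obs) / len(obs)
    denom = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in zip(obs, pred))
    if denom == 0.0:
        raise ValueError("d is undefined when all values equal the observed mean")
    return 1.0 - sum_squared_error(obs, pred) / denom


obs = [1.0, 2.0, 3.0, 4.0]
d_perfect = willmott_d(obs, obs)                     # identical series
d_shifted = willmott_d(obs, [o + 1.0 for o in obs])  # constant bias of +1
```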

[17] Linear regression of the form On = a + bPn is also used to assess model performance. The model is considered to have a significant bias when the regression coefficients a and b are significantly different from zero and one, respectively.
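The regression test can be sketched as follows. The text does not say which significance test was applied, so this example assumes ordinary least squares and reports t-statistics for the null hypotheses a = 0 and b = 1; the function name and the toy data are hypothetical.

```python
import math
from typing import Dict, Sequence


def regress_obs_on_pred(obs: Sequence[float], pred: Sequence[float]) -> Dict[str, float]:
    """Least-squares fit of O_n = a + b * P_n with t-statistics for a = 0 and b = 1."""
    n = len(obs)
    p_mean = sum(pred) / n
    o_mean = sum(obs) / n
    sxx = sum((p - p_mean) ** 2 for p in pred)
    sxy = sum((p - p_mean) * (o - o_mean) for o, p in zip(obs, pred))
    b = sxy / sxx
    a = o_mean - b * p_mean
    # Residual variance with n - 2 degrees of freedom (assumes independent errors).
    resid_ss = sum((o - a - b * p) ** 2 for o, p in zip(obs, pred))
    s2 = resid_ss / (n - 2)
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + p_mean ** 2 / sxx))

    def t_stat(diff: float, se: float) -> float:
        if se > 0.0:
            return diff / se
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)

    return {"a": a, "b": b, "t_a": t_stat(a, se_a), "t_b": t_stat(b - 1.0, se_b)}


fit = regress_obs_on_pred(obs=[2.1, 3.9, 6.2, 7.8, 10.0], pred=[1.0, 2.0, 3.0, 4.0, 5.0])
```

Large absolute values of `t_a` or `t_b` relative to the critical t value for n − 2 degrees of freedom would indicate a significant bias. Flux time series are usually autocorrelated, so the effective number of degrees of freedom may be much smaller than n − 2.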

[18] Both the agreement index (d) and linear regression are used to assess the overall performance of model predictions, but they are not very helpful in identifying the causes for errors arising from model structure or model parameter values. One of the commonly used methods for identifying the causes of model errors is to select the times when the model error squared, (Pn − On)2, is relatively large, but this approach is inefficient when analyzing long time series of data. A large amount of “detective work” is required to diagnose model errors using sensitivity analysis [Abramowitz et al., 2008] because model errors can result from many interacting processes. The problem is even more difficult for diagnosing global land surface models that typically have 20–50 parameters for any vegetation type.

[19] Because many ecosystem processes or states can have quite different response times to external forcing, we can represent the variance of model predictions, observations or their differences (model errors) in the frequency domain to identify the frequencies at which model errors are relatively large. This method complements the approach previously discussed, because we can use additional information in the frequency domain to narrow down the number of processes or model parameters we need to diagnose model errors. For example, if model errors are relatively large at monthly scale, they may result from errors in modeled soil water dynamics or seasonal variation of leaf photosynthetic capacity for deciduous trees.

[20] In the following, we provide a brief description of analysis of model errors in the time and frequency domains.

2.3.1. Variance Analysis Using Wavelet Analysis

[21] Wavelet analysis is used to estimate the variance of model errors as a function of frequency only, or of both frequency and time. By removing the mean from the model error (On − Pn), we can construct a time series, Xn, as

$$X_n = (O_n - P_n) - \frac{1}{N}\sum_{m=0}^{N-1} (O_m - P_m) \tag{3}$$

We can use continuous wavelet analysis to transform a discrete, one-dimensional time series, Xn, into a complex time series with real and imaginary parts, from which a two-dimensional surface of the variance (or power) can be constructed as a function of time step n and scale s. Following Torrence and Compo [1998], the continuous wavelet transform (CWT) of Xn is calculated as:

$$W_n(s) = \sum_{k=0}^{N-1} \hat{x}_k \, \hat{\psi}^*(s\omega_k) \, e^{i\omega_k n \delta t} \tag{4}$$

where Wn(s) is a complex number, with real and imaginary parts, representing the wavelet transform of Xn at time step n and scale s; $\hat{x}_k$ is the Fourier transform of Xn; $\hat{\psi}^*$ is the complex conjugate of the Fourier transform of the wavelet function; s is the scale (day); ωk is angular frequency (d−1), δt is the time step (day), k is the frequency index and N is the total number of hourly or half-hourly observations.

[22] In this study, we used the Morlet wavelet function [Torrence and Compo, 1998, Table 1]. For each scale s, the wavelet function was normalized to have unit energy and the wavelet power spectrum was calculated as ∣Wn(s)∣2. For nonstationary time series the CWT gives more accurate estimates of the power at all frequencies than the Fourier transform, which tends to underestimate the power at low values of s and overestimate the power at high values [see Lau and Weng, 1995].
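To make the transform concrete, here is a minimal pure-Python sketch of a Morlet CWT evaluated in Fourier space, following the form given by Torrence and Compo [1998]. It is not the software used in the study. The nondimensional frequency ω0 = 6 is an assumption (the text does not state it), the naive O(N²) Fourier sum is only practical for short series, and the toy sine wave is invented for illustration.

```python
import cmath
import math
from typing import List, Sequence

OMEGA0 = 6.0  # assumed nondimensional Morlet frequency; not stated in the text


def dft(x: Sequence[float]) -> List[complex]:
    """Normalized discrete Fourier transform: (1/N) * sum_n x_n exp(-2*pi*i*k*n/N)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]


def morlet_cwt(x: Sequence[float], scales: Sequence[float], dt: float = 1.0) -> List[List[complex]]:
    """Return W[j][n], the wavelet transform at scale scales[j] and time step n."""
    n = len(x)
    x_hat = dft(x)
    positive = range(1, n // 2 + 1)  # the Morlet wavelet has no power at omega <= 0
    transform = []
    for s in scales:
        # Unit-energy normalization of the daughter wavelet at scale s.
        norm = math.sqrt(2.0 * math.pi * s / dt) * math.pi ** -0.25
        psi_hat = {k: norm * math.exp(-0.5 * (s * 2.0 * math.pi * k / (n * dt) - OMEGA0) ** 2)
                   for k in positive}
        # psi_hat is real for the Morlet wavelet, so its complex conjugate is itself.
        row = [sum(x_hat[k] * psi_hat[k] * cmath.exp(2j * math.pi * k * m / n) for k in positive)
               for m in range(n)]
        transform.append(row)
    return transform


# Toy series: a sine wave with a period of 8 time steps.
x = [math.sin(2.0 * math.pi * m / 8.0) for m in range(64)]
W = morlet_cwt(x, scales=[2.0, 7.75, 30.0])
mean_power = [sum(abs(w) ** 2 for w in row) / len(row) for row in W]
```

For ω0 = 6 the Fourier period is about 1.03 times the scale, so the power of the toy series should be concentrated at the middle scale (7.75 time steps).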

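[23] The variance of Xn at time step n over scales s1 and s2 is computed as

$$V_n(s_1, s_2) = \frac{\delta j \, \delta n}{C_\delta} \sum_{j=j_1}^{j_2} \frac{|W_n(s_j)|^2}{s_j} \tag{5}$$

where j1 and j2 are the indices of scales s1 and s2, δj is the scale interval (0.25 day), δn is the time interval (0.042 day), Cδ is the reconstruction factor (0.776), and sj is the scale as defined by equation (9) of Torrence and Compo [1998].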

[24] The global wavelet spectrum (GWS) or wavelet power spectrum, Vg(s), is calculated by summing the wavelet power over all time steps for a given scale, s, and dividing by the number of time steps. That is

$$V_g(s) = \frac{1}{N} \sum_{n=0}^{N-1} |W_n(s)|^2 \tag{6}$$

We use equation (6) to calculate the GWS of the modeled and observed flux time series and equation (5) to calculate the variance of model errors as a function of time at different scale intervals. We tested whether the power at a given scale interval is significantly different from that of white noise (see equations (26)–(28) of Torrence and Compo [1998] for further details) at the 95% confidence level. The wavelet analysis in this study used the software freely available at the Web site http://paos.colorado.edu/research/wavelets/software.html.

[25] Using equations (5) and (6), we can partition total variance of model error, V into the contributions from four different scales. That is

$$V = V_d + V_s + V_y + V_{ny} \tag{7}$$

where Vd, Vs, Vy and Vny represent the variance at ≤daily, between daily and half-yearly or seasonal, half-yearly to yearly and >yearly scales, respectively. Partitioning V into different scales can be computed using equation (5) for each time step or for all time steps using equation (6).
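The partition into four scale bands can be sketched as below, starting from a matrix of wavelet power ∣Wn(s)∣2 indexed by scale and time step. The function names are hypothetical, the scale-averaging follows the form of equation (24) of Torrence and Compo [1998], and the treatment of the band edges (1, 182.5 and 365 days, upper edge inclusive) is an assumption, since the text does not say how scales falling exactly on a boundary are assigned.

```python
import math
from typing import Dict, List, Sequence

C_DELTA = 0.776  # reconstruction factor for the Morlet wavelet, as quoted in the text


def band_variance(power: Sequence[Sequence[float]], scales: Sequence[float],
                  s_lo: float, s_hi: float, dj: float, dt: float) -> List[float]:
    """Scale-averaged wavelet power at each time step for scales with s_lo < s <= s_hi.

    power[j][n] holds |W_n(s_j)|^2 for scale scales[j] and time step n.
    """
    selected = [j for j, s in enumerate(scales) if s_lo < s <= s_hi]
    factor = dj * dt / C_DELTA
    n_steps = len(power[0])
    return [factor * sum(power[j][n] / scales[j] for j in selected) for n in range(n_steps)]


def partition_error_variance(power: Sequence[Sequence[float]], scales: Sequence[float],
                             dj: float, dt: float) -> Dict[str, float]:
    """Time-mean variance in the daily, seasonal, yearly and interannual bands (days)."""
    bands = {
        "daily": (0.0, 1.0),
        "seasonal": (1.0, 182.5),
        "yearly": (182.5, 365.0),
        "interannual": (365.0, math.inf),
    }
    result = {}
    for name, (s_lo, s_hi) in bands.items():
        series = band_variance(power, scales, s_lo, s_hi, dj, dt)
        result[name] = sum(series) / len(series)
    return result


# Toy input: six scales (days), four time steps, unit wavelet power everywhere.
scales = [0.5, 1.0, 10.0, 200.0, 365.0, 730.0]
power = [[1.0] * 4 for _ in scales]
parts = partition_error_variance(power, scales, dj=0.25, dt=1.0 / 24.0)
all_scales = band_variance(power, scales, 0.0, math.inf, dj=0.25, dt=1.0 / 24.0)
```

Because the four bands are disjoint and cover all scales, their contributions add up to the variance obtained from all scales together.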

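[26] The sum of model error squared (U) is related to the total variance of model error, V, as

$$U = N \left[ V + (\bar{O} - \bar{P})^2 \right] \tag{8}$$

and

$$V = \frac{1}{N} \sum_{n=0}^{N-1} \left[ (O_n - P_n) - (\bar{O} - \bar{P}) \right]^2 \tag{9}$$

where $\bar{O}$ and $\bar{P}$ are the mean observed and mean modeled fluxes, respectively.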

[27] In parameter optimization, U is often used as the cost to be minimized [see Wang et al., 2009]. The optimal values of model parameters should therefore minimize the sum of the total variance of model errors (V) and the squared bias in the means, $(\bar{O} - \bar{P})^2$. To reduce overall model errors we thus need to study both the variance of model errors, using wavelet analysis, and the bias in the means, using time averages.
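The split of the cost into a variance term and a mean-bias term is an algebraic identity, which the following short Python check illustrates on invented numbers: the sum of squared errors equals N times the variance of the errors plus N times the squared difference of the means.

```python
from typing import Sequence, Tuple


def decompose_squared_error(obs: Sequence[float], pred: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U, V, bias): sum of squared errors, error variance, and mean(O) - mean(P)."""
    n = len(obs)
    errors = [o - p for o, p in zip(obs, pred)]
    bias = sum(errors) / n                           # equals mean(obs) - mean(pred)
    variance = sum((e - bias) ** 2 for e in errors) / n
    u = sum(e * e for e in errors)
    return u, variance, bias


obs = [1.0, 2.0, 3.0, 4.0]
pred = [0.5, 2.5, 2.0, 3.0]
u, v, bias = decompose_squared_error(obs, pred)
```

A parameter change can therefore lower U by reducing either term, which is why the variance and the mean bias are examined separately.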

[28] In this study, time is measured in days, frequency in cycles/day and scale in days/cycle; the term “cycle” is omitted in subsequent text from the unit for frequency or scale for simplicity.

2.3.2. Calculating Time Averages

[29] As shown by Stoy et al. [2005], surface fluxes of NEE, LE and H can have three dominant scales: daily, annual and interannual. For a discrete modeled or observed flux time series, F, where Fn is the flux at the nth time step, the three kinds of time averages are calculated as follows: (1) The conditional-mean hourly flux, 〈F〉hour, is calculated by averaging the flux Fn at the same hour of the day for each of four seasons (DJF for December, January and February; MAM for March, April and May; JJA for June, July and August; and SON for September, October and November) across all 8 or 15 years. (2) The conditional-mean daily flux, 〈F〉day, is calculated by averaging all fluxes within the same day-of-year across all 8 or 15 years. (3) The annual block-mean flux, $\bar{F}_{year}$, is calculated by arithmetically averaging all fluxes within each of the 8 or 15 years. Here we use angle brackets for conditional averages and upper bars for block averages. Interannual variation in fluxes is evident in (3) but not in (1) or (2). Comparing the modeled and observed mean trends at those frequencies allows us to examine the ability of the model to capture the mean response of simulated fluxes to meteorological forcing, soil water and model inputs, and to identify mean model errors at different timescales.
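The three kinds of averages can be sketched with the Python standard library as follows. The record layout (a list of timestamp and flux pairs), the function names and the toy data are assumptions made for illustration; day-of-year is taken directly from the calendar, so 29 February shifts later days by one in leap years.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, Hashable, Iterable, Tuple

Record = Tuple[datetime, float]  # (time stamp, flux)

SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def mean_by(records: Iterable[Record], key: Callable[[datetime], Hashable]) -> Dict[Hashable, float]:
    """Average the flux over all records that share the same key."""
    sums: Dict[Hashable, float] = defaultdict(float)
    counts: Dict[Hashable, int] = defaultdict(int)
    for time, flux in records:
        k = key(time)
        sums[k] += flux
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


def mean_diurnal_by_season(records: Iterable[Record]) -> Dict[Hashable, float]:
    """Conditional-mean hourly flux for each (season, hour of day)."""
    return mean_by(records, lambda t: (SEASON_OF_MONTH[t.month], t.hour))


def mean_by_day_of_year(records: Iterable[Record]) -> Dict[Hashable, float]:
    """Conditional-mean daily flux for each day of year, across all years."""
    return mean_by(records, lambda t: t.timetuple().tm_yday)


def annual_block_mean(records: Iterable[Record]) -> Dict[Hashable, float]:
    """Block-mean flux for each calendar year."""
    return mean_by(records, lambda t: t.year)


# Toy hourly record: two days in January of two years, flux = hour + 10 per year offset.
records = [(datetime(y, 1, d, h), float(h + 10 * (y - 2001)))
           for y in (2001, 2002) for d in (1, 2) for h in range(24)]
diurnal = mean_diurnal_by_season(records)
daily = mean_by_day_of_year(records)
yearly = annual_block_mean(records)
```

In the toy record the year-to-year offset survives only in the annual block means, which mirrors the statement that interannual variation is evident in (3) but not in (1) or (2).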

3. Results

3.1. Performance of Model Simulations Using Default Parameter Values at Two Sites

[30] Table 2 presents the regression coefficients a and b of the regression On = a + bPn. Also presented are the squared correlation coefficients, r2, and the index of agreement, d, of Willmott [1981] for flux measurements made at Harvard Forest and Tumbarumba. For simulations using default model parameter values (P1), b differs significantly from 1 and a differs significantly from 0 for all three fluxes at both sites, suggesting there are systematic model errors for all fluxes. Figure 2 shows that the model underestimates NEE, LE and H when the observed fluxes are high and positive and overestimates all three fluxes when the observed fluxes are low or negative. Results in Table 2 and Figure 2 can be used to quantify overall model performance, but are not very informative for identifying what causes the model biases.

Figure 2.

Comparison of modeled (x axis) and observed (y axis) half-hourly (a) NEE, (b) sensible heat, or (c) latent heat fluxes for Harvard forest. The dashed line is the linear regression (On = a + bPn), and the solid line represents 1:1 relation. The unit is μmol m−2 s−1 for NEE and W m−2 for latent and sensible heat fluxes.

[31] In the following sections, we shall demonstrate the approach presented in section 2.3 for identifying causes for model errors and hence improving performance by CABLE for two forest sites.

3.2. Analysis of Model Errors of CABLE at Harvard Forest

[32] Figure 3 shows that the Global Wavelet Spectra (GWS) for NEE, H and LE observed at Harvard Forest have peaks at daily and yearly scales. At daily scale, default simulation (P1) accurately reproduces the variance of NEE and LE, as shown by the ratios of the modeled to observed GWS being close to unity, but underestimates the observed variance in H by about 45%. This suggests that the diurnal variation of the modeled H is not as large as in the observations. At the yearly scale, model simulations accurately reproduce the variance in H, but underestimate the variance of NEE and LE by about 40%. Furthermore, variance of modeled NEE is higher than observed NEE at the 10 and 100 day scales but lower at 6 month to yearly scales. Similar differences are also found between modeled and observed H and LE at scales between weekly and yearly. These differences suggest that timing and amplitude of the simulated fluxes have significant errors. This is confirmed by comparison of the observed and modeled mean annual trends of NEE and LE using the default parameters (P1) (see Figure 6). This comparison will be discussed later. Use of default parameters also resulted in an overestimate of −1.1 μmol m−2 s−1 in NEE and an underestimate of H by 11 W m−2 over the 15 year period.

Figure 3.

Global wavelet spectra for NEE, H and LE observed at Harvard Forest (dashed gray curves and left y axis). The right-hand y axes are for the global wavelet spectrum from the model simulation P1 divided by the corresponding spectrum from the observations (solid black curve). The dashed black line represents the ratio of 1 (right y axis). The unit of the GWS is μmol2 m−4 s−2 for NEE and W2 m−4 for LE and H. The two vertical bars represent the daily (s = 1) and yearly (s = 365) scales.

[33] We note from Figure 3 that model errors in NEE and LE are largest at the 10 and 100 day timescales. This has helped us to identify two parameters, Tminvj and Tmaxvj, that affect the dependence of maximum leaf carboxylation rate on leaf age and therefore the modeled seasonal variation of NEE and LE for deciduous trees at Harvard Forest. Table 1 lists the original (P1) and modified values (P2, P3) of the parameters. Compared to using P1, simulations using P2 reduce the mean biases (Figures 4a, 4g, and 4m) and total variance of model errors (Figures 4b, 4h, and 4n) for each of the three fluxes. More importantly, simulation P2 reduces the variance of model errors for NEE and LE at daily to yearly scales (Figures 4c–4e and Figures 4o–4q), and H at yearly scales (Figure 4k) where there is a large mismatch between the observed and modeled NEE, LE or H. However, the simulation with P2 had little effect on model errors at interannual scale for each of the three fluxes (Figures 4f, 4l, and 4r). This is not surprising as both parameters are assumed to be constant from year to year.

Figure 4.

Sensitivities of CABLE simulations to model parameters for Harvard Forest (1992–2006). Shown are the difference in the means ($\bar{P} - \bar{O}$), and the ratio of variances of model errors (VP−O) to observations (VO) for all data combined, and for daily, seasonal, yearly and interannual scales. The unit for the mean is μmol m−2 s−1 for (a–f) NEE and W m−2 for (g–l) sensible heat or (m–r) latent heat flux.

Table 1. Default and Modified Values of Some Model Parameters Used in the Simulations^a

| Parameter | Unit | Default Value (P1) | Modified Value | Site |
| --- | --- | --- | --- | --- |
| Tminvj | K | 278 | 286 (P2, P3) | H |
| Tmaxvj | K | 288 | 293 (P2, P3) | H |
| βs | dimensionless | 1 | 0.1 (P3) | H |
| βv | dimensionless | 1 | 3 (P3) | H |
| froot,m | fraction | [0.08,0.19,0.33,0.32,0.08,0.00] | [0.01,0.04,0.2,0.2,0.25,0.2] (P3) | H |
| xp | dimensionless | 1.5 | 3.0 (P2) | T |
| xs | dimensionless | 2.6 | 3.2 (P2) | T |
| βv | dimensionless | 2 | 1 (P2) | T |
| βs | dimensionless | 1 | 0.4 (P2) | T |

^a Modified parameters: βv and βs are the slopes of the responses of stomatal conductance and soil evaporation to available soil water, respectively (Appendix A, equations (A19) and (A25)). Tminvj and Tmaxvj are used to simulate the dependence of maximum leaf carboxylation rate on leaf age (see equation (A22)), xp and xs are multipliers for scaling the respiration rates of nonleaf tissues and soil, respectively (equations (A38)–(A40)), and froot,m is the fraction of roots in each soil layer (equation (A19)). The modified values of xp and xs are used at Tumbarumba for year 2003 only. Default values are used in simulation P1, and modified values are used in P2, P3 for Harvard Forest (H) and P2 for Tumbarumba (T).

[34] Results of the wavelet analysis shown in Figure 3 indicate there is a significant mismatch between observed and modeled NEE, LE and H at weekly to 6 monthly scales. The dynamics of soil moisture strongly influence LE and H at these timescales, suggesting the need to modify two parameters, βv and βs, that characterize the responses of stomatal conductance and the rate of soil evaporation, respectively, to soil water content (equations (A19) and (A25)). The fraction of roots in each soil layer was also changed (Table 1). When compared to simulation P2, the modified values of βs, βv and froot,m in simulation P3 reduce the mean bias in H (Figure 4g) and increase the mean bias in LE (Figure 4m), but reduce the total variance of model errors of LE (Figure 4n). The modified parameter values used in simulation P3 reduce the model errors at all scales from daily to interannual for LE (Figures 4o–4r), and at annual and interannual scales for H (Figures 4k and 4l), but have little effect on the model error of NEE, as compared with simulation P2.

[35] Figure 4 and Table 3 show that modifying values of model parameters may not give consistent changes to the mean and total variance of model errors, or to model errors at different scales. For example, the errors of modeled H from simulation P3 are reduced by 60% at annual scale and by 20% at interannual scale, as compared with those from simulation P2, but the reduction of the total model error in H across all scales is less than 2% (see Figure 4h). This is because the variance of model errors at annual to interannual scales contributes less than 6% of the total variance of model error for H (Table 3). Whether to minimize total variance across all timescales or the partial variance for a selected timescale when constructing the cost function to estimate parameter values thus depends on the problem at hand. For example, different parameter values are likely to be obtained if the model is optimized for daily fluxes or annual totals. This will be discussed later.

[36] Figure 5 provides further evidence of serious errors in the seasonal variation of all three fluxes simulated by CABLE using default parameter values (P1). For the mean diurnal variation (〈F〉hour) at Harvard Forest, the agreement between simulation P1 and observation is best during the growing season (JJA). Simulation P1 generally overestimates LE, and underestimates NEE and H, for the other seasons. Biases in the modeled mean fluxes of NEE, LE and H are largest around midday in the spring (MAM), when deciduous trees start growing leaves, and in the autumn (SON), during leaf fall. Figure 6 also shows that the simulated daily carbon uptake (NEE negative) starts about 1 month earlier and continues for a month longer than is observed. These results are consistent with the bias in GWS for NEE at monthly to annual scales seen in Figure 3.

Figure 5.

Mean diurnal variation of net ecosystem carbon exchange (NEE), sensible heat flux (H) and latent heat flux (LE) as calculated from the measurements (open circles) and modeled by CABLE (black curve for simulation P1 and gray curve for simulation P3) for December, January and February (DJF); March, April and May (MAM); June, July and August (JJA); and September, October and November (SON) at the Harvard forest.

Figure 6.

Means of NEE (μmol m−2 s−1), H (W m−2) and LE (W m−2) as calculated from the observations (blue symbol), simulation P1 (black curve) and simulation P3 (red curve). (a–c) Daily means and (d–f) yearly means from 1992 to 2006 at Harvard forest.

[37] Adjusting the values of two parameters, Tminvj and Tmaxvj, that control seasonal variation of maximum leaf carboxylation rate, as in simulation P2, significantly improved the agreement with observed 〈F〉hour and 〈F〉day for NEE, but much less for LE and H. Modifying the values of βs, βv and froot,m that affect soil water dynamics improved the agreement in 〈F〉hour and 〈F〉day between observed LE and H and those modeled in simulation P3, but had little effect on NEE, particularly for the relatively dry years (1997, 1998 and 2001) at the Harvard Forest site (data not shown here). As a result, the mean trends for all three fluxes simulated in P3 agree much better with the observations than those from simulation P1 (Figures 5 and 6).

[38] These changes in the values of model parameters significantly reduced the bias in the mean NEE over the 15 year data record, but had very little effect on the variance of model errors of NEE at interannual scale (Table 3). On the other hand, these changes significantly reduced the variance of model errors of LE and H at interannual scales (Table 3), but had very little effect on the bias of the mean LE and H. This is because NEE depends on the dynamics of various carbon pools [Carvalhais et al., 2008], some of which have residence times of many decades [Kirschbaum, 2004; Wang et al., 2010]. The dynamics of slow-turnover carbon pools are not included in the version of CABLE used in this study and therefore the simulations could not explain much of the observed interannual variation of NEE. On the other hand, water stress is rare at Harvard Forest so the energy fluxes are predominantly determined by the atmospheric demand for water vapor and radiation input. The interannual variations of both these factors are relatively small and therefore CABLE simulations can explain much of the interannual variation of the observed LE and H.

[39] Table 2 compares two simulations (P1 and P3) by CABLE with observations for each of the three fluxes from 1992 to 2006. The overall performance of simulation P3 is better than that of simulation P1, as shown by higher values of the agreement index d and the squared correlation coefficient r2.

Table 2. Linear Regression (On = a + bPn) Statistics, Where On is Observation and Pn is Model Prediction, r2 is the Correlation Coefficient Squared and d is the Agreement Index [Willmott, 1981], of NEE, LE and H for Harvard Forest and Tumbarumba (see footnote a)

Harvard Forest
Flux   a (P1)   a (P3)   b (P1)   b (P3)      r2 (P1)   r2 (P3)   d (P1)   d (P3)
NEE    0.93     0.41     0.88     0.99 (ns)   0.71      0.80      0.90     0.94
LE     3.6      3.6      0.87     0.96        0.62      0.74      0.88     0.92
H      8.7      3.5      1.09     1.15        0.63      0.74      0.86     0.90

Tumbarumba
Flux   a (P1)   a (P2)   b (P1)   b (P2)      r2 (P1)   r2 (P2)   d (P1)   d (P2)
NEE    0.17     −0.07    1.04     1.03        0.73      0.80      0.91     0.94
LE     12.3     1.3      0.83     0.99 (ns)   0.60      0.77      0.87     0.93
H      11.8     7.8      0.88     1.05        0.76      0.83      0.93     0.95

Footnote a: Coefficient a is significantly different from zero, and coefficient b is significantly different from 1 for all four fluxes at both sites unless specified; “ns,” not statistically significant with a probability, p < 5%. The units are μmol m−2 s−1 for NEE and W m−2 for LE and H. Simulations using global default parameters in CABLE are in columns P1, and those with modified parameters are in columns P3 (revised) for Harvard Forest or P2 for Tumbarumba.
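
The statistics reported in Table 2 can be computed along the following lines. This is a minimal sketch rather than the authors' code: the function name `regression_and_agreement` and the synthetic data are illustrative assumptions, and the agreement index is assumed to take the original Willmott [1981] form, d = 1 − Σ(Pn − On)² / Σ(|Pn − Ō| + |On − Ō|)², where Ō is the mean of the observations.

```python
import numpy as np

def regression_and_agreement(obs, pred):
    """Return (a, b, r2, d) for the regression O = a + b * P.

    obs, pred : 1-D arrays of observed and modeled fluxes (same length).
    d is the Willmott (1981) index of agreement (assumed form, see text).
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    # Least-squares fit of observations on predictions: O = a + b * P
    b, a = np.polyfit(pred, obs, 1)
    # Squared Pearson correlation coefficient
    r2 = np.corrcoef(pred, obs)[0, 1] ** 2
    # Index of agreement: 1 means perfect agreement
    o_mean = obs.mean()
    num = np.sum((pred - obs) ** 2)
    den = np.sum((np.abs(pred - o_mean) + np.abs(obs - o_mean)) ** 2)
    d = 1.0 - num / den
    return a, b, r2, d

# Hypothetical example: a prediction with a constant offset of +2
obs_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
a_demo, b_demo, r2_demo, d_demo = regression_and_agreement(obs_demo, obs_demo + 2.0)
```

In this example the slope and r2 are both 1 even though the prediction is biased, whereas d falls below 1. This is why the table reports the intercept and d alongside r2.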

3.3. Analysis of Model Errors for the Tumbarumba Site

[40] For Tumbarumba, Figure 7 shows that CABLE faithfully reproduces the observed variance in the fluxes NEE, H and LE at the weekly timescale using the default model parameter values listed in Table 1, but that agreement is quite poor at monthly to annual scales. As for Harvard Forest, soil water dynamics and seasonal variation of maximum leaf carboxylation rate (vcmax) can significantly affect errors in modeled fluxes at those scales. Field measurements showed that leaf vcmax was quite constant throughout the year [Keith et al., 2009], leading us to reexamine values of βs and βv as well as parameters accounting for the sensitivity of stomatal conductance to environmental variables (a1, D0, see equation (A18) in Appendix A). Revised parameter values are given in Table 1.

Figure 7.

Global wavelet spectra for NEE, H and LE observed at Tumbarumba (dashed gray curves and left y axis). The right-hand y axes are for the global wavelet spectrum from the model divided by the corresponding spectrum from the observations. The solid black curve is for simulation P1, and the gray curve is for simulation P2. The dashed black line represents the ratio of 1 (right y axis). The unit of the GWS is μmol2 m−4 s−2 for NEE and W2 m−4 for LE and H. The two vertical bars represent the daily (s = 1) and yearly (s = 365) scales.

[41] Results in Table 2 show that the intercept of the linear regression (a) from simulation P2 is closer to zero and the slope (b) is closer to 1 than the corresponding values from simulation P1. The agreement index for simulation P2 is also higher for each of the three fluxes. Model errors at monthly to yearly scales are not very sensitive to a1 and D0 (results not shown), whereas using revised values of βs and βv in simulation P2 reduced the variance of model errors for LE and H, especially at monthly to yearly scales, when compared to simulation P1 (Table 3).

Table 3. Mean of the Observed or Modeled Fluxes or the Variance of Observed Flux (O) or Errors of Modeled Fluxes (P1–P3 for Harvard Forest and P1 and P2 for Tumbarumba) at Four Different Scales (Vd for Daily, Vs for Daily to Annual, Vy for Annual and Vny for Interannual) at Harvard Forest or Tumbarumba (see footnote a)
              Mean     Vd      Vs      Vy     Vny
Harvard Forest
   NEE(O)     −0.52    30.0    11.0    6.0    0.2
   NEE(P1)    −1.65    6.3     6.7     1      0.2
   NEE(P2)    −0.92    4.9     3.6     0.9    0.2
   NEE(P3)    −0.94    4.4     4.0     0.8    0.2
   H(O)       32       6920    2620    436    45
   H(P1)      21       444     409     70     11
   H(P2)      22       390     367     40     10
   H(P3)      24       363     361     15     7
   LE(O)      34       2820    1220    736    44
   LE(P1)     35       951     655     246    11
   LE(P2)     34       726     518     167    10
   LE(P3)     32       627     505     81     7
Tumbarumba
   NEE(O)     −1.29    52.7    12.6    0.4    0.2
   NEE(P1)    −1.38    13      3       1      0.2
   NEE(P2)    −1.14    10.0    2       0.8    0.1
   H(O)       47       9930    2910    905    35
   H(P1)      39       821     416     232    37
   H(P2)      37       421     141     55     4
   LE(O)      57       6560    1830    862    35
   LE(P1)     53       2670    960     284    37
   LE(P2)     56       1720    383     24     4

Footnote a: The units for the mean bias are μmol m−2 s−1 for NEE and W m−2 for LE and H. Units for variances are μmol2 m−4 s−2 (time step)−1 for NEE and W2 m−4 (time step)−1 for LE and H.

[42] To analyze the sensitivity of model errors to βs and βv further, we compared the model error at each time step at daily, seasonal or annual scales from simulations P1 and P2 (Figure 8). We used the method of Torrence and Compo [1998] to calculate the variance of model errors for each simulation that exceeds the variance of white noise at the 95% confidence level. Because the variance of model errors from simulation P2 is less than that of simulation P1, the variance of white noise is also lower (see Figure 8). At daily scales, peak values of Vd from simulation P2 are less than 20% of those from simulation P1 for all three fluxes during the growing season (SON). At seasonal scale, the peak value of Vs during the growing season from simulation P2 is about 20–80% lower for LE, about 80–90% lower for H, and 5–20% lower for NEE than those from simulation P1. At yearly scale, model errors from simulation P1 are relatively large for 2003 (drought plus insect infestation) and 2006/2007 (strong summer drought). Simulation P2 reduced model errors in all three fluxes at annual scale for those 2 years, and the errors in NEE in year 2003 were reduced even further using the modified values of the multipliers for nonleaf plant respiration rate (xp) and soil respiration rate (xs) for year 2003 only, as compared with simulation P1 (Figure 8). At interannual (>2 years) scale, simulation P2 also reproduces the variations of the observed LE or H better than simulation P1, as the ratios of the modeled and observed GWS from simulation P2 are closer to 1 than those from simulation P1 (see Figure 7), but the difference between the two simulations is quite small for NEE at interannual scale.

Figure 8.

The variance of model errors V at daily (Vd), seasonal (Vs) and annual (Vy) timescales for Tumbarumba. The solid black and red curves represent V for simulations P1 and P2, respectively. Whenever the solid lines exceed their corresponding dashed lines, V is greater than expected from white noise at the 95% confidence level estimated using the method of Torrence and Compo [1998]. The units are μmol2 m−4 s−2 for variance of NEE and W2 m−4 for variance of LE and H. For clarity, we used a 7 day moving window to smooth the estimated variance of model errors at daily scale for both simulations.
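
The time-resolved variance of model errors within a band of scales, as plotted in Figure 8, can be sketched with a Morlet wavelet transform following the general recipe of Torrence and Compo [1998]. The code below is an illustration under stated assumptions, not the authors' implementation: the function names `morlet_cwt` and `band_variance`, the scale spacing `dj`, the band limits and the synthetic error series are all hypothetical, and the Morlet wavelet with ω0 = 6 and reconstruction factor Cδ = 0.776 is assumed.

```python
import numpy as np

def morlet_cwt(x, dt=1.0, dj=0.25, s0=None, omega0=6.0):
    """Morlet continuous wavelet transform, computed in Fourier space."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the mean before transforming
    n = x.size
    s0 = 2.0 * dt if s0 is None else s0    # smallest scale considered
    n_scales = int(np.log2(n * dt / s0) / dj) + 1
    scales = s0 * 2.0 ** (dj * np.arange(n_scales))
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)   # angular frequencies
    x_hat = np.fft.fft(x)
    positive = omega > 0.0
    w = np.empty((n_scales, n), dtype=complex)
    for j, s in enumerate(scales):
        # Fourier transform of the scaled Morlet wavelet (zero for omega <= 0)
        psi_hat = np.zeros(n)
        norm = np.pi ** -0.25 * np.sqrt(2.0 * np.pi * s / dt)
        psi_hat[positive] = norm * np.exp(-0.5 * (s * omega[positive] - omega0) ** 2)
        w[j] = np.fft.ifft(x_hat * psi_hat)
    # Equivalent Fourier period of each scale for the Morlet wavelet
    periods = 4.0 * np.pi * scales / (omega0 + np.sqrt(2.0 + omega0 ** 2))
    return w, scales, periods

def band_variance(w, scales, periods, p_min, p_max, dt=1.0, dj=0.25, c_delta=0.776):
    """Scale-averaged wavelet power at each time step for p_min <= period < p_max."""
    band = (periods >= p_min) & (periods < p_max)
    power = np.abs(w[band]) ** 2
    return (dj * dt / c_delta) * np.sum(power / scales[band, None], axis=0)

# Hypothetical "model error" series: a 32-step oscillation plus weak noise
rng = np.random.default_rng(0)
t = np.arange(1024)
err = np.sin(2.0 * np.pi * t / 32.0) + 0.1 * rng.standard_normal(t.size)

w, scales, periods = morlet_cwt(err)
gws = np.mean(np.abs(w) ** 2, axis=1)                  # global wavelet spectrum
v_band = band_variance(w, scales, periods, 16.0, 64.0) # variance in the 16-64 step band
```

For a unit-amplitude sinusoid the variance is 0.5, and away from the edges of the record `v_band` should recover approximately that value. The significance test against white noise described in the text is not included in this sketch.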

[43] The mean temporal variation (〈Fhour〉, 〈Fday〉) of the fluxes calculated from simulation P2 is not always in better agreement with the observations than that from simulation P1. Figure 9 compares the simulated and observed mean diurnal variations (〈Fhour〉) of NEE, LE and H over each of four seasons at Tumbarumba. 〈Fhour〉 from simulation P2 agrees better with the observations for NEE and LE. However, the agreement with the observed mean diurnal H is better for simulation P1 for the summer (DJF) and autumn (MAM) seasons, but is poorer for the other two seasons than that from simulation P2.

Figure 9.

Mean diurnal variation of net ecosystem carbon exchange (NEE), latent heat flux (LE) and sensible heat flux (H) as calculated from the measurements (open circles) and modeled by CABLE (black curve for simulation P1 and gray curve for simulation P2) for December, January and February (DJF); March, April and May (MAM); June, July and August (JJA); and September, October and November (SON) at Tumbarumba.

[44] For 〈Fday〉, the differences in the simulated NEE are quite small between P1 and P2. Simulation P1 underestimates the daily mean LE during summer and overestimates the daily mean LE over spring (Figure 10c). Biases in the modeled 〈Fday〉 are much reduced for LE in simulation P2. However, for H, the differences between the observed and simulated 〈Fday〉 for simulation P2 are smaller for December, January and February but larger for July, August and September than those for simulation P1. As a result, there is little improvement over simulation P1 for the annual mean H (Figure 10b). For the yearly means, simulation P2 also provides better estimates of mean annual NEE for 2001–2003, and gives similar estimates to P1 for other years (Figure 10d). Simulation P2 also improves predictions of the mean annual LE for most of the 8 years (Figure 10f), but provides little improvement for H (Figure 10e).

Figure 10.

Means of NEE (μmol m−2 s−1), H (W m−2) and LE (W m−2) as calculated from the observations (blue circles), simulation P1 (black curve) and simulation P2 (red curve). (a–c) Conditional daily means and (d–f) yearly means from 2001 to 2008 at Tumbarumba.

[45] The overall agreement in the mean trends and annual means from simulation P2 is better for LE, quite similar for NEE, and slightly worse for H, as compared with simulation P1. That is quite different from the result that the variance of model errors from simulation P2 is less than that from simulation P1 at all scales. Because the bias in the mean contributes less than about 5% to the sum of model errors squared (U) for LE and NEE, and about 13% for H, the overall agreement index, d, for simulation P2 is higher than that for simulation P1 for each of the three fluxes (see Table 2) at Tumbarumba.
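
The share of the squared error that comes from the bias in the mean can be illustrated with the identity mean(e²) = (mean e)² + var(e), where e is the model error. The following is a minimal sketch with a hypothetical function name and made-up numbers. It uses the mean rather than the sum of squared errors, which leaves the fraction unchanged.

```python
import numpy as np

def error_decomposition(obs, pred):
    """Split the mean squared error into squared bias and error variance.

    Returns (mse, bias_sq, err_var, bias_fraction), with mse = bias_sq + err_var.
    """
    err = np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)
    mse = np.mean(err ** 2)          # mean of model errors squared
    bias_sq = np.mean(err) ** 2      # contribution of the bias in the mean
    err_var = np.var(err)            # variance of model errors (population form)
    return mse, bias_sq, err_var, bias_sq / mse

# Hypothetical example: errors of 1, 1, 3 and 3 flux units
mse_demo, bias_sq_demo, var_demo, frac_demo = error_decomposition(
    [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 3.0, 3.0]
)
```

When the bias fraction is small, as reported for LE and NEE at Tumbarumba, an index based on squared errors such as d is dominated by the variance of the errors. It can therefore improve even if the mean bias does not.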

4. Discussion

[46] In an earlier study, Abramowitz et al. [2008] evaluated the performance of three land surface models, including CABLE, by comparing mean-hourly and mean-monthly simulations and observations of surface fluxes at six sites. They also used a form of cluster analysis to examine where model bias is greatest under particular meteorological conditions to identify areas of model structure responsible for poor performance. The present study complements the work of Abramowitz et al. [2008] by demonstrating the usefulness of combining time series and wavelet analyses to identify the likely causes of model errors in CABLE for two quite different forests. This approach allows us to identify the likely causes of model errors through efficient sensitivity studies targeted at frequencies where errors are relatively large compared to errors at other frequencies. This is particularly useful for global land surface models with a large number of parameters for each plant functional type. For example, the mismatch in the global power spectrum between the observed and modeled fluxes at timescales between 6 and 12 months for both LE and NEE at Harvard Forest indicated that modeled photosynthetic capacity should vary differently with leaf age during the growing season from what was modeled by CABLE using default parameter values.

[47] Choosing the correct parameter values is also critical to the performance of land surface models. Parameters are usually estimated by minimizing the total of model errors squared without regard to timescale [Wang et al., 2009], but the sensitivity of model errors to parameter values may vary with timescale [Prihodko et al., 2008; Vargas et al., 2010]. High-frequency response (or short scales, daily to monthly) typically dominates the observed variations of surface fluxes, and therefore has a much greater influence on the optimized parameters than the low-frequency response (yearly and interannual) [see Braswell et al., 2005; Williams et al., 2009]. Parameters related to the low-frequency response of the ecosystem are therefore poorly constrained when the cost function is the sum of model errors squared [Fox et al., 2009; Wang et al., 2009]. If cost functions at different frequencies are minimized separately, we should be able to better constrain model parameters that have strong influences on the low-frequency response of the system. For example, we found that the variance of model errors in H, LE and NEE could be reduced at weekly to yearly scales at both Harvard Forest and Tumbarumba by changing two parameters that affect the rate of soil water loss through soil evaporation or plant transpiration. Analysis of model errors in the time and frequency domains was also helpful in showing that model errors in NEE were large at annual scale for 2003, when the forest at Tumbarumba suffered both drought and insect attack. The errors were reduced by changing two parameters related to plant and soil respiration for that year.
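
The idea of separate cost functions for different frequency bands can be sketched as follows. For brevity this example splits the error series with a Fourier band mask, which is a deliberate substitute for the wavelet decomposition used in this study. The function name `band_costs`, the band edges and the synthetic daily error series are hypothetical.

```python
import numpy as np

def band_costs(errors, dt_days, period_edges_days):
    """Sum of squared errors within each period band.

    Bands are (0, e1], (e1, e2], ..., (ek, inf]; the mean (infinite period)
    falls in the last band. The band costs add up to the total cost.
    """
    e = np.asarray(errors, dtype=float)
    n = e.size
    e_hat = np.fft.rfft(e)
    freq = np.fft.rfftfreq(n, d=dt_days)          # cycles per day
    period = np.full(freq.shape, np.inf)
    period[freq > 0] = 1.0 / freq[freq > 0]
    edges = [0.0, *period_edges_days, np.inf]
    costs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (period > lo) & (period <= hi)
        # Keep only the Fourier coefficients in this band and invert
        e_band = np.fft.irfft(np.where(mask, e_hat, 0.0), n=n)
        costs.append(float(np.sum(e_band ** 2)))
    return costs

# Hypothetical daily errors over 4 years: annual cycle, weekly cycle and a bias
t = np.arange(4 * 365)
err = 1.0 * np.sin(2 * np.pi * t / 365.0) + 0.3 * np.sin(2 * np.pi * t / 7.0) + 0.2
costs = band_costs(err, dt_days=1.0, period_edges_days=[30.0, 400.0])
```

Each element of `costs` could then be minimized on its own, or weighted before being summed, so that parameters controlling the low-frequency response are not swamped by errors in a more energetic band.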

[48] The sensitivity of modeled fluxes to parameter values also differs among the three fluxes at different timescales. Thus the sensitivity of NEE is quite different from that of LE or H at the interannual scale, because NEE depends on the dynamics of carbon pools, some of which have turnover times of decades, whereas fluxes of LE and H depend on soil water within the rooting zone that is replenished several times a year in most regions. Furthermore, the sensitivity of mean trends can differ from that of the variance of model errors. CABLE simulations using modified model parameter values for Tumbarumba reduced the variance of model errors at all scales for H, but did not reduce the bias in the daily or annual means when compared with the simulation using default parameter values. Such conclusions can only be obtained by analyzing model errors in the time and frequency domains.

[49] Finally it is important to take account of errors in the measurements when model simulations are compared with measurements. This can be formally done using the model-data fusion framework that explicitly considers the errors in the model separately from those in the measurements, as discussed by Wang et al. [2009]. This often is an iterative process, as model errors are not known beforehand. It is also important to consider both the magnitude of errors in different measurements and the correlations of errors in time of each set of measurements when model errors are analyzed using eddy flux measurements [Williams et al., 2009].

5. Conclusions

[50] Using the measurements of surface fluxes from two different forest sites, we successfully demonstrated that analysis of the variance of model errors in time at different frequencies can be used to identify the frequencies where model errors are largest and to separate the model errors at those frequencies from model errors at other frequencies. This information is useful for diagnosing the causes for model errors and improving model simulations by modifying the values of relevant model parameters.

[51] For Harvard Forest, we found that two parameters controlling the seasonal variation of maximum leaf carboxylation rate with leaf age had significant effects on the model errors of NEE and LE from daily to yearly scale, and on the model errors of H at yearly scale. Model parameters affecting soil water dynamics significantly affect model errors of all three fluxes at weekly to seasonal scales. Simulations using modified parameter values reduced model errors of all three fluxes from daily to yearly scales, and improved the performance of CABLE for this site.

[52] For Tumbarumba, we found that reducing the sensitivities of stomatal conductance and soil evaporation to soil water significantly reduced the variance of model errors of all fluxes at all frequencies, and the bias in the mean trends of NEE and LE. We also demonstrated that analyzing model errors as a function of time at different frequencies can identify periods when model errors are relatively large, such as in 2003 when the forest was subjected to drought and insect attack. These errors were reduced by modifying two parameters affecting respiration rates of plant and soil for 2003 only.

Appendix A: Description of the Community Atmosphere-Biosphere-Land Exchange Model

[53] The Community Atmosphere-Biosphere-Land Exchange (CABLE) model is quite similar to some other land surface models, such as CLM [Oleson et al., 2010] and ORCHIDEE [Krinner et al., 2005], in representing the range of biophysical processes for climate simulations. A study by Abramowitz [2005] showed that the performance of CABLE is also quite similar to that of CLM and ORCHIDEE for simulating the surface fluxes from a range of sites globally. Some of the major differences between CABLE and CLM or ORCHIDEE are as follows. CABLE uses the theory developed by Goudriaan and van Laar [1994] for simulating radiative transfer in plant canopies, whereas CLM and ORCHIDEE use the two-stream approximation [Sellers et al., 1996]. CABLE uses the Ball-Berry-Leuning stomatal model [Leuning, 1995], whereas CLM and ORCHIDEE use the Ball-Berry stomatal model [Ball et al., 1987]. CABLE as used in this study does not simulate the dynamics of carbon pools, whereas CLM and ORCHIDEE do. There are also differences in representing the land surface, such as the classification of plant functional types, the number of soil layers and the soil depth.

[54] CABLE consists of five submodels: radiation, canopy micrometeorology, surface flux, soil and snow, and ecosystem respiration. The radiation submodel computes the net diffuse and direct beam radiation absorbed by each of two big leaves and by the soil surface in the visible, near infrared and thermal wave bands, and the surface albedo for visible and near infrared radiation. The canopy micrometeorology submodel computes canopy roughness length, zero-plane displacement height and aerodynamic transfer resistance from the reference height, or the height of the lowest layer in a climate model, to the within-canopy air space or soil surface. The surface flux submodel computes fluxes of latent and sensible heat, and net canopy photosynthesis. The soil and snow submodel computes temperature and moisture at different depths in the soil, snow age, snow density and depth, and the snow-covered surface albedo when snow is present. The ecosystem respiration submodel computes the nonleaf plant tissue respiration, soil respiration and net ecosystem CO2 exchange.

[55] The structure of the CABLE model code is dictated by the relationship of inputs and outputs between different submodels. A submodel has to be executed first if its outputs are used as inputs to another submodel. In CABLE, the radiation submodel is called first, as it provides the estimates of the radiation absorbed by plant canopies and soil for the surface flux submodel. The surface flux submodel is called before the soil and snow submodel, as the surface flux submodel provides the estimates of water extraction and ground heat flux, which are required in the soil and snow submodel. The ecosystem respiration submodel is called last because soil respiration depends on soil temperature and moisture in the rooting zone.

[56] Within the surface flux submodel, there are two nested iteration loops: the stability iteration loop and the canopy (leaf) temperature iteration loop. The canopy temperature iteration loop is nested within the stability iteration loop because calculating the stability of the airflow between the reference height and the air space within the canopy requires the surface (canopy and soil) fluxes of latent and sensible heat, and those fluxes are calculated within the canopy temperature iteration loop.

[57] At the first stability iteration, neutral conditions are assumed, and the temperature and specific humidity of the air within the canopy space (Ta, qa) are set equal to their respective values at the reference height (Tref, qref). The fluxes of latent, sensible and ground heat and net canopy photosynthesis are then calculated in the canopy temperature iteration loop. Next, the stability parameter is updated using the estimated surface fluxes of latent and sensible heat, the aerodynamic resistance is calculated, and the values of Ta and qa are updated. All the surface fluxes are then recalculated with reference to the in-canopy variables in the canopy temperature iteration loop. Iterations are terminated only when the specified convergence criteria are met for both loops.
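
The control flow of the two nested loops can be sketched as below. This is a structural illustration only and not CABLE code: `canopy_step` and `stability_update` are hypothetical callables standing in for the energy balance and stability calculations. The convergence thresholds (0.05 K for canopy temperature, 1% for the stability parameter) are those quoted elsewhere in this appendix, and resetting the canopy temperature to Ta at the start of every stability iteration is an assumption.

```python
def solve_surface_fluxes(canopy_step, stability_update, t_ref, q_ref,
                         tol_tc=0.05, tol_zeta=0.01, max_iter=50):
    """Nested fixed-point iteration: canopy temperature loop inside stability loop.

    canopy_step(t_a, q_a, t_c) -> (fluxes, new_t_c)      (hypothetical)
    stability_update(fluxes)   -> (zeta, new_t_a, new_q_a)  (hypothetical)
    """
    zeta = 0.0                    # first stability iteration: neutral conditions
    t_a, q_a = t_ref, q_ref       # in-canopy air starts at reference-height values
    for _ in range(max_iter):
        t_c = t_a                 # canopy temperature starts from air temperature
        for _ in range(max_iter):
            fluxes, t_c_new = canopy_step(t_a, q_a, t_c)
            inner_done = abs(t_c_new - t_c) < tol_tc
            t_c = t_c_new
            if inner_done:
                break
        else:
            raise RuntimeError("canopy temperature loop did not converge")
        zeta_new, t_a, q_a = stability_update(fluxes)
        # Relative change in the stability parameter between successive iterations
        outer_done = abs(zeta_new - zeta) <= tol_zeta * max(abs(zeta_new), 1e-6)
        zeta = zeta_new
        if outer_done:
            return fluxes, t_c, t_a, q_a, zeta
    raise RuntimeError("stability loop did not converge")

# Toy stand-ins (not physical): canopy relaxes toward t_a + 2 K
def toy_canopy_step(t_a, q_a, t_c):
    t_c_new = 0.5 * (t_c + t_a + 2.0)
    return {"H": 10.0 * (t_c_new - t_a)}, t_c_new

def toy_stability_update(fluxes, t_ref=290.0, q_ref=0.008):
    return -0.001 * fluxes["H"], t_ref + 0.01 * fluxes["H"], q_ref

result = solve_surface_fluxes(toy_canopy_step, toy_stability_update, 290.0, 0.008)
```

The toy callables only exercise the loop structure. In a real model they would wrap the photosynthesis, stomatal conductance and energy balance calculations of the surface flux submodel.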

[58] Following the calculation of net photosynthesis, latent, sensible and ground heat fluxes and surface temperature, the only prognostic variable within the surface flux submodel, canopy water storage, is updated.

[59] The soil and snow submodel updates the soil moisture and temperature for all layers using the fluxes calculated in the surface flux submodel. If snow is present, the snow model is used to update the density and thickness of each snow layer and the ground surface albedo. Finally, the ecosystem respiration submodel is used to compute respiration by woody tissue and roots, soil respiration and net ecosystem CO2 exchange (NEE).

[60] Details of the model have been presented by Raupach et al. [1997], Wang and Leuning [1998], and Kowalczyk et al. [2006]. Only a brief description of each submodel is given below.

A1. Radiation Submodel

[61] The radiation submodel has been described in detail by Wang and Leuning [1998]; only the estimates of the radiation absorbed by the vegetation canopy and soil are presented here.

[62] Absorption of visible (j = 1) or near infrared (j = 2) radiation by sunlit (i = 1) or shaded (i = 2) leaves within the canopy, Qi,j is calculated as

equation image
equation image

where Ib,j and Id,j are the direct beam and diffuse radiation flux densities within wave band j in W m−2, αb,j and αd,j are the canopy reflectances for direct beam and diffuse radiation in wave band j (see Wang [2003] for further details), ωf,j is the leaf scattering coefficient (transmittance + reflectance) in wave band j, kb and kd are the extinction coefficients of direct beam and diffuse radiation of the canopy if all leaves are black, i.e., ωf,j = 0, and k*b,j and k*d,j are the extinction coefficients of direct beam and diffuse radiation for the canopy, calculated as kb√(1 − ωf,j) and kd√(1 − ωf,j), respectively.

[63] The function χ(x) is defined as

equation image

For thermal radiation (j = 3), the radiation absorbed by the sunlit or shaded leaves is calculated as

equation image
equation image

where σ is the Stefan-Boltzmann constant (5.67 × 10−8 W m−2 K−4), Ls is the incoming long wave radiation from the sky (W m−2) and Lc is the upwelling radiation from the land surface (W m−2) when the vegetation canopy temperature, Tc, is equal to the temperature of the air within the canopy space (Ta). ɛa and ɛf are the emissivities of the surface air and leaf, respectively.

[64] The total radiation (short-wave and long-wave radiation) absorbed by the soil, Qsoil, is

equation image

where Ts,o is the soil surface temperature in K and ɛs is the emissivity of the soil surface.

A2. Canopy Micrometeorology Submodel

[65] This submodel computes surface roughness length, zero-plane displacement height and aerodynamic resistance using the theory developed by Raupach [1994] and Raupach [1989a, 1989b] as implemented in SCAM [Raupach et al., 1997].

[66] The surface roughness length (z0 in m) and zero-plane displacement height (d in m) are two important parameters in the model for estimating the aerodynamic transfer resistance within the canopy. They are calculated as

equation image

where h is the canopy height (m), κ is the von Karman constant (=0.4), uh is the mean wind speed at the canopy height (m s−1) and u* is the friction velocity (m s−1). The ratio u*/uh is nearly constant for many natural surfaces, and is approximated as

equation image

where cs is the substrate drag coefficient, cr is element drag coefficient and L is the canopy leaf area index that is not buried by snow [see Raupach et al., 1997]. Parameter a is the maximal value of u*/uh, which is equal to 0.3 in our model. In our model, cs = 0.003, cr = 0.3.

[67] ψh in equation (A7) is the roughness-sublayer influence function, and is calculated as

equation image

where cw is an empirical constant (=2).

[68] The zero-plane displacement height in equation (A7) is calculated as

equation image

where cd is an empirical constant (=15) [see Raupach et al., 1997].
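
Because the equations themselves are not reproduced in this text, the following sketch assumes the commonly cited Raupach [1994] forms: u*/uh = min(√(cs + cr L/2), a), d/h = 1 − [1 − exp(−√(cd L))]/√(cd L), ψh = ln(cw) − 1 + 1/cw and z0 = (h − d) exp(−κ uh/u* + ψh). These forms, the sign convention for ψh and the function name `roughness_parameters` are assumptions and may differ in detail from the expressions used in CABLE. The constants are the values quoted above.

```python
import math

def roughness_parameters(h, lai, cs=0.003, cr=0.3, a=0.3, cw=2.0, cd=15.0, kappa=0.4):
    """Return (u*/uh, d, z0) for canopy height h (m) and leaf area index lai (> 0).

    Assumed Raupach (1994)-style expressions; see the lead-in text.
    """
    if h <= 0.0 or lai <= 0.0:
        raise ValueError("h and lai must be positive")
    # Ratio of friction velocity to wind speed at canopy height, capped at a
    usuh = min(math.sqrt(cs + cr * lai / 2.0), a)
    # Zero-plane displacement height
    x = math.sqrt(cd * lai)
    d = h * (1.0 - (1.0 - math.exp(-x)) / x)
    # Roughness-sublayer influence function
    psi_h = math.log(cw) - 1.0 + 1.0 / cw
    # Roughness length from matching the log profile at the canopy top
    z0 = (h - d) * math.exp(-kappa / usuh + psi_h)
    return usuh, d, z0

# Hypothetical forest: 20 m tall with a leaf area index of 4
usuh_demo, d_demo, z0_demo = roughness_parameters(h=20.0, lai=4.0)
```

For this dense canopy the ratio u*/uh reaches its cap of 0.3, the displacement height is a large fraction of the canopy height, and z0 is a few percent of h.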

[69] The friction velocity, u* is related to the wind speed (uref) at the reference height (zref) as

equation image

where z0 and d are the surface roughness length (m) and zero-plane displacement height (m), ΨM is the integral stability function for momentum, and LMO is the Monin-Obukhov length (m). The stability parameter, ζ, can be estimated as

equation image

The stability function, ΨM, is calculated using the Businger-Dyer form for unstable cases and the Webb form for stable cases [see Garratt, 1992]. The stability loop is considered to have converged when the estimates of ζ between two successive iterations differ by less than 1%.
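
A stability-corrected log-law estimate of the friction velocity can be sketched as follows. The exact expressions used in CABLE are not reproduced in this text, so the code assumes the textbook forms: ζ = (z − d)/LMO, the Businger-Dyer function with coefficient 16 for unstable conditions, the Webb function ΨM = −5ζ for stable conditions, and u* = κ uref / [ln((zref − d)/z0) − ΨM(ζref) + ΨM(z0/LMO)]. The function names and example values are hypothetical.

```python
import math

def psi_m(zeta):
    """Integral stability function for momentum (assumed textbook forms)."""
    if zeta < 0.0:
        # Unstable: Businger-Dyer form
        x = (1.0 - 16.0 * zeta) ** 0.25
        return (2.0 * math.log((1.0 + x) / 2.0)
                + math.log((1.0 + x * x) / 2.0)
                - 2.0 * math.atan(x) + math.pi / 2.0)
    # Neutral or stable: Webb form
    return -5.0 * zeta

def friction_velocity(u_ref, z_ref, z0, d, l_mo=math.inf, kappa=0.4):
    """Friction velocity (m/s) from wind speed u_ref at height z_ref.

    l_mo is the Monin-Obukhov length (m); infinity means neutral conditions.
    """
    zeta_ref = (z_ref - d) / l_mo
    zeta_0 = z0 / l_mo
    denom = math.log((z_ref - d) / z0) - psi_m(zeta_ref) + psi_m(zeta_0)
    return kappa * u_ref / denom

# Hypothetical values: 3 m/s at 30 m above a canopy with d = 17 m, z0 = 0.8 m
u_neutral = friction_velocity(3.0, 30.0, 0.8, 17.0)
u_unstable = friction_velocity(3.0, 30.0, 0.8, 17.0, l_mo=-100.0)
u_stable = friction_velocity(3.0, 30.0, 0.8, 17.0, l_mo=100.0)
```

For the same wind speed, unstable conditions enhance mixing and give a larger friction velocity than the neutral case, while stable conditions give a smaller one.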

[70] CABLE uses the Localized Near Field (LNF) theory to describe the turbulent transfer within and above the canopy [see Raupach, 1989a, 1989b]. LNF accounts for the fact that eddies responsible for most scalar transfer in a canopy have a vertical length scale close to the canopy height. The scalar (water, heat and CO2) concentration profile as a result of turbulent transfer at height z in the air, C(z), is composed of the “far-field” and “near-field” contributions, i.e., C(z) = Cf + Cn. Two turbulence properties, the vertical velocity standard deviation σw(z) in m s−1, and the Lagrangian timescale, TL(z) in s are used to describe compliance of the “far-field” component with a gradient diffusion relationship between flux and concentration. The aerodynamic transfer resistance from the air space within the canopy to the reference level zref, ra, is derived as:

equation image

The aerodynamic resistance from the soil surface to the air space within the canopy, rg (in s m−1), is given by

equation image

where z0s is the soil roughness length in m and fsp is the sparseness factor, varying from 0 for bare ground to 1 for a canopy of medium to high density (L > 1.1) [see Raupach et al., 1997].

[71] The aerodynamic transfer resistance, ra, can be used to estimate the temperature (Ta) and specific humidity (qa) of the air within the canopy from the values at the reference height (Tref, qref) if the surface fluxes required for estimating the stability function are given (see Raupach et al. [1997] for further details).

A3. Surface Flux Submodel

[72] This submodel computes the latent and sensible heat fluxes from canopy (λEc, Hc) and soil (λEs, Hs), ground heat flux (Hg), net canopy photosynthesis (Ac) and updates canopy water storage (Wc) at each time step.

[73] Based on the principles of energy and mass conservation, we set up two sets of equations, one for the canopy net photosynthesis (Ac,i) and stomatal conductance (Gs,i), and the other for the canopy energy fluxes (λEc, Hc). Further details about the equations are provided by Wang and Leuning [1998]. These two sets of equations are solved numerically by iteration to estimate the surface fluxes for given values of the temperature (Ta) and specific humidity (qa) of the air within the canopy [see Wang and Leuning, 1998; Kowalczyk et al., 2006]. At the first iteration, the canopy temperature (Tc) is set to the value of Ta, and the value of Tc is used to calculate all the temperature-dependent photosynthetic parameters, such as the maximum carboxylation rate (vcmax) and the potential electron transport rate (jmax). Those parameter values are used in solving the first set of equations for net canopy photosynthesis (Ac,i) and stomatal conductance (Gs,i). The estimate of Gs,i is used to solve the second set of equations for the energy fluxes and to update Tc. The iteration terminates when the difference between the estimates of Tc from two successive iterations is <0.05 K [see Kowalczyk et al., 2006].

[74] The latent and sensible heat fluxes are calculated as a linear combination of the fluxes from the dry canopy and the wet canopy, i.e.,

equation image
equation image

where λ is the latent heat of vaporization (J kg−1), λEdry and Hdry are the latent and sensible heat flux of the dry canopy in W m−2. The corresponding λEwet and Hwet are for the wet canopy, all in W m−2. The canopy wet fraction, fwet, is calculated as

equation image

where Wc is canopy water storage (mm), and Wcmax is the maximal canopy water storage and is calculated as 0.1L.
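
The linear combination of dry and wet canopy fluxes can be written as a short sketch. The weighting by fwet and (1 − fwet) follows the description above, and the maximal storage of 0.1L is taken from the text. The function names are hypothetical, and fwet is supplied as an input because its defining equation is not reproduced here.

```python
def canopy_fluxes(f_wet, le_dry, h_dry, le_wet, h_wet):
    """Combine dry- and wet-canopy fluxes (W m-2) using the wet fraction f_wet."""
    if not 0.0 <= f_wet <= 1.0:
        raise ValueError("f_wet must lie between 0 and 1")
    le_c = (1.0 - f_wet) * le_dry + f_wet * le_wet   # canopy latent heat flux
    h_c = (1.0 - f_wet) * h_dry + f_wet * h_wet      # canopy sensible heat flux
    return le_c, h_c

def max_canopy_storage(lai):
    """Maximal canopy water storage (mm), calculated as 0.1 * L."""
    return 0.1 * lai

# Hypothetical example: a quarter of the canopy is wet
le_demo, h_demo = canopy_fluxes(0.25, le_dry=100.0, h_dry=40.0, le_wet=200.0, h_wet=0.0)
```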

[75] Canopy photosynthesis and transpiration are coupled through stomatal conductance, which is described by the following model [Ball et al., 1987; Leuning, 1990]:

equation image

where G0,i is the residual or cuticular conductance in mol m−2 s−1; Ds,i, Cs,i and Ac,i are the water vapor pressure deficit at the leaf surface (Pa), the CO2 concentration at the leaf surface in mol mol−1 and the net photosynthesis of leaf i in mol m−2 s−1, respectively; Γ is the CO2 compensation point of photosynthesis in mol mol−1 and is a function of canopy temperature (Tc) [see Leuning, 1990]; a1 and D0 are two model parameters (a1 = 4 for C4 plants and 9 for C3 plants; D0 = 1500 Pa); and fwsoil is the influence of soil water limitation on stomatal conductance, calculated as

equation image

where βv is a model parameter, froot,m is the fraction of root mass in soil layer m, θm is the volumetric soil water content of soil layer m, and θwilt and θfc are the volumetric soil water contents at wilting point and field capacity, respectively.
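
A minimal sketch of the stomatal conductance model and the soil water limitation factor is given below. The equations are not reproduced in this text, so the code assumes the Leuning form Gs = G0 + a1 fwsoil Ac / [(Cs − Γ)(1 + Ds/D0)] and a root-weighted relative soil water content scaled by βv and limited to the range 0 to 1. Both forms, the function names and the default value of `g0` are assumptions for illustration.

```python
def soil_water_factor(beta_v, f_root, theta, theta_wilt, theta_fc):
    """Soil water limitation on stomatal conductance (0 to 1), assumed form.

    f_root : root mass fraction per soil layer (should sum to 1)
    theta  : volumetric soil water content per layer
    """
    available = sum(
        fr * (th - theta_wilt) / (theta_fc - theta_wilt)
        for fr, th in zip(f_root, theta)
    )
    return min(1.0, max(0.0, beta_v * available))

def stomatal_conductance(a_c, c_s, d_s, gamma, f_wsoil, a1=9.0, d0=1500.0, g0=0.01):
    """Stomatal conductance (mol m-2 s-1), assumed Ball-Berry-Leuning form.

    a_c   : net photosynthesis (mol m-2 s-1)
    c_s   : CO2 concentration at the leaf surface (mol mol-1)
    d_s   : vapor pressure deficit at the leaf surface (Pa)
    gamma : CO2 compensation point (mol mol-1)
    g0    : residual conductance (hypothetical default value)
    """
    return g0 + a1 * f_wsoil * a_c / ((c_s - gamma) * (1.0 + d_s / d0))

# Hypothetical two-layer soil and a C3 leaf
f_w = soil_water_factor(1.0, [0.5, 0.5], [0.2, 0.3], theta_wilt=0.1, theta_fc=0.3)
g_s = stomatal_conductance(a_c=10e-6, c_s=380e-6, d_s=1500.0, gamma=40e-6, f_wsoil=1.0)
```

In this form, lowering βv or drying the root zone reduces fwsoil and hence the conductance. This is the pathway through which βv affects both transpiration and photosynthesis in the sensitivity experiments.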

[76] For deciduous forest, the maximal carboxylation rate (vcmax) and the maximal potential electron transport rate of a leaf at the top of the canopy at a leaf temperature of 298 K also depend on leaf phenology that is modeled as a function of soil temperature at 25 cm depth [see Wang et al., 2007]. That is

equation image
equation image

and

equation image

where vcmax and jmax are maximum carboxylation rate and maximal potential electron transport rate of a mature leaf at the canopy top during the middle of growing season, both in μmol m−2 s−1, Ts,25 is the soil temperature at 25 cm depth from the soil surface (K), and Tminvj and Tmaxvj are two model parameters (K).

[77] The latent (Es), sensible (Hs) and ground (Hg) heat fluxes from the soil are calculated as follows:

equation image
equation image

where ws is the soil wet factor, Δqs is the difference between the specific humidity at the soil surface and that of the air within the canopy (kg kg−1), ρa is the density of air (kg m−3), Ts,1 is the surface soil layer temperature (K), and Hg is the ground heat flux in W m−2, which is calculated using the equations described by Bonan [1996].

[78] The soil wet factor ws is calculated as

equation image

where βs is an empirical model parameter and θ1 is the volumetric soil water content of the surface soil layer.

[79] For the wet canopy, the amount of water stored on the canopy, Wc in mm, is calculated as

equation image

The term min(0, (1 − fwet)Edry) in equation (A26) represents the amount of dew formed on the canopy surface, and PI is the canopy interception of atmospheric precipitation (mm s−1), calculated as

equation image

where P is the liquid rainfall (mm/Δt) and Δt is the time step (s).

A4. Soil Submodel

[80] The soil is a heterogeneous system composed of three constituent phases, namely the solid, water and air [Hill, 1982]. Water and air compete for the same pore space and change their volume fractions due to precipitation, evapotranspiration, snowmelt and drainage. Soil hydraulic and thermal characteristics depend on the soil type as well as frozen and unfrozen soil moisture content. In this model, soil moisture is assumed to be at ground temperature, so there is no heat exchange between the moisture and the soil due to the vertical movement of water. Volumetric soil moisture, θm, is considered in terms of liquid and ice components, i.e., θm = θm,l + θm,i. Ice decreases soil porosity but liquid moisture can move through remaining unfrozen soil pores.

[81] Each soil type in our model is described by its saturation content θsat, wilting content θwilt, and field capacity (θfc). θsat is equal to the volume of all the soil pores. Here, an additional variable, actual saturation θas, is used. Actual saturation excludes the pores filled with ice, θas = θsat − θi.

[82] The one-dimensional conservation equation for soil moisture in the absence of ice is described by

equation image

where θ is the volumetric soil moisture content (m3 m−3), and q is the kinematic moisture flux (m s−1, positive downward), Fw(z) is the water uptake by plants from depth z (m s−1), z is depth into the soil (m). From Darcy's law, the moisture flux is given by:

equation image

where K is the hydraulic conductivity (m s−1) and D is the soil moisture diffusivity (m2 s−1), equal to −K ∂ψ/∂θ, where ψ is the water potential of the soil (m).

[83] Combining equations (A28) and (A29), we have

equation image

with the two boundary conditions of

equation image
equation image

where qin is the water flux infiltrating downward through the soil surface (m s−1), Es is the soil evaporation (m s−1), cdrain is the soil drainage coefficient, and Z is the depth of the bottom layer (m). Fw(z) is the contribution from soil depth z to dry canopy transpiration (Edry) by root uptake, and is proportional to froot,m(θ − θwilt)dz. When the dry canopy transpiration (Edry) cannot be met by root uptake, Edry is reduced within the leaf temperature iteration loop of the surface flux submodel, and the leaf energy balance is recalculated until Edry equals the integral of Fw(z) over depth from 0 to Z.
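The partitioning of transpiration demand among soil layers can be sketched as follows. The weighting by froot,m(θ − θwilt)dz follows the text. The per-layer cap (extractable water divided by the time step) and the function name are assumptions made for illustration, and the sketch does not reproduce the leaf temperature iteration in CABLE.

```python
def root_uptake(e_dry, f_root, theta, theta_wilt, dz, dt):
    """Distribute transpiration demand e_dry (m s-1) over soil layers.

    Layer weights are proportional to f_root[m] * (theta[m] - theta_wilt) * dz[m].
    Assumption: a layer cannot supply more than its extractable water
    (theta - theta_wilt) * dz within one time step dt (s).
    Returns (uptake per layer, total supplied transpiration), both in m s-1.
    """
    avail = [max(t - theta_wilt, 0.0) * d for t, d in zip(theta, dz)]  # m of water
    weights = [f * a for f, a in zip(f_root, avail)]
    total = sum(weights)
    if total <= 0.0 or e_dry <= 0.0:
        return [0.0] * len(dz), 0.0
    uptake = [min(e_dry * w / total, a / dt) for w, a in zip(weights, avail)]
    return uptake, sum(uptake)


# Hypothetical two-layer example.
uptake, supplied = root_uptake(
    e_dry=3e-8, f_root=[0.5, 0.5], theta=[0.3, 0.2],
    theta_wilt=0.1, dz=[0.1, 0.1], dt=1800.0,
)
```

If the supplied total is smaller than the demand, Edry would be lowered to that total and the leaf energy balance recomputed, which is the role of the iteration loop described above.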

[84] Hydraulic conductivity and soil water potential are related to the volumetric soil moisture content by

equation image
equation image

Parameter b varies with soil texture [see Campbell, 1974; Clapp and Hornberger, 1978].
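The equations themselves are not reproduced in this text. As a hedged illustration, the sketch below uses the power-law forms commonly associated with Campbell [1974] and Clapp and Hornberger [1978], K = Ksat(θ/θsat)^(2b+3) and ψ = ψsat(θ/θsat)^(−b). Whether equations (A33) and (A34) take exactly this form is an assumption, and all parameter values are hypothetical.

```python
def hydraulic_conductivity(theta, theta_sat, k_sat, b):
    """Assumed Campbell-type form: K = K_sat * (theta / theta_sat) ** (2b + 3)."""
    return k_sat * (theta / theta_sat) ** (2.0 * b + 3.0)


def water_potential(theta, theta_sat, psi_sat, b):
    """Assumed Campbell-type form: psi = psi_sat * (theta / theta_sat) ** (-b)."""
    return psi_sat * (theta / theta_sat) ** (-b)


# Hypothetical soil parameters, for illustration only.
theta_sat = 0.4   # m3 m-3
k_sat = 1e-5      # m s-1
psi_sat = -0.2    # m
b = 5.0           # texture-dependent exponent

k_half = hydraulic_conductivity(0.2, theta_sat, k_sat, b)
psi_half = water_potential(0.2, theta_sat, psi_sat, b)
```

With these forms, halving the moisture content reduces K by a factor of 2^(2b+3) and makes ψ more negative by a factor of 2^b, which is why the value of b matters so much for drying soils.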

[85] The equation governing heat transport in soil is

equation image

where ρs is the density (kg m−3), cs is the specific heat (J kg−1 K−1) and κs is the thermal conductivity (W m−1 K−1) of the soil. The volumetric heat capacity (ρscs) is calculated as the weighted sum of the heat capacities of dry soil, liquid water and ice. The boundary conditions for equation (A35) are:

equation image
equation image

κs plays a crucial role in determining the depth of freezing/thawing as it varies by about one order of magnitude as the soil approaches saturation point and can increase further when soil ice is present [see Johansen, 1975].
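The weighted sum used for the volumetric heat capacity can be sketched as below. The choice of volume fractions as weights, with (1 − θsat) for the solid fraction, is an assumption, as are the approximate physical constants and the dry-soil properties.

```python
RHO_WATER = 1000.0  # kg m-3
C_WATER = 4186.0    # J kg-1 K-1
RHO_ICE = 917.0     # kg m-3 (approximate)
C_ICE = 2100.0      # J kg-1 K-1 (approximate)


def volumetric_heat_capacity(theta_sat, theta_liquid, theta_ice, rho_dry, c_dry):
    """Weighted sum of dry soil, liquid water and ice (J m-3 K-1).

    Assumption: each component is weighted by its volume fraction, with the
    solid fraction taken as (1 - theta_sat).
    """
    solid = (1.0 - theta_sat) * rho_dry * c_dry
    liquid = theta_liquid * RHO_WATER * C_WATER
    ice = theta_ice * RHO_ICE * C_ICE
    return solid + liquid + ice


# Hypothetical soil: porosity 0.4, solid density 2600 kg m-3, specific heat 850 J kg-1 K-1.
c_dry_soil = volumetric_heat_capacity(0.4, 0.0, 0.0, 2600.0, 850.0)
c_moist_soil = volumetric_heat_capacity(0.4, 0.2, 0.0, 2600.0, 850.0)
```

Because the result depends on the current liquid and ice contents, it has to be re-evaluated whenever the soil moisture state changes.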

[86] The presence of water or ice in the soil can alter the soil's thermal properties and thus modify soil temperature by several degrees. To take this into account, we calculate soil thermal properties at each time step. The snow model has been described by Kowalczyk et al. [1994].

[87] To solve equations (A30) and (A35) numerically, the soil is divided into six layers, with thicknesses from the top layer downward of 0.022 m, 0.058 m, 0.154 m, 0.409 m, 1.085 m and 2.872 m. Only the top layer contributes to soil evaporation, whereas plant roots can extract water from all layers, depending on the amount of available soil water and the fraction of plant roots in each layer.
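The layer geometry can be made explicit with a short sketch that converts layer thicknesses into interface and mid-point depths. The six thicknesses used here, including 0.409 m for the fourth layer, are treated as an assumption about the model configuration, and the function name is hypothetical.

```python
from itertools import accumulate

# Assumed layer thicknesses (m), from the top layer downward.
LAYER_THICKNESS = [0.022, 0.058, 0.154, 0.409, 1.085, 2.872]


def layer_depths(thickness):
    """Return (bottom interface depths, mid-point depths) for each layer in m."""
    bottoms = list(accumulate(thickness))
    tops = [0.0] + bottoms[:-1]
    mids = [0.5 * (top + bottom) for top, bottom in zip(tops, bottoms)]
    return bottoms, mids


bottoms, mids = layer_depths(LAYER_THICKNESS)
total_depth = bottoms[-1]
```

The thicknesses increase with depth, so the thin surface layers resolve the fast diurnal variations while the deep layers respond over seasonal timescales.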

A5. Ecosystem Respiration Submodel

[88] The ecosystem respiration submodel calculates the respiration of wood, roots and soil as

equation image
equation image
equation image

where Cwood and Croot are the amounts of carbon in wood and roots (g C m−2), respectively; rwood, rroot and rsoil are the respiration rates of wood, root and soil (at Ta = 20°C for wood, or at equation images = 285 K for root and soil), in μmol m−2 s−1 (g C)−1 for wood and root and in μmol m−2 s−1 for soil; they are biome-specific model parameters [see Kowalczyk et al., 2006]. equation images is the root-mass weighted mean of soil temperature (K) and equation images is the root-mass weighted mean of soil water content, which is calculated as

equation image

where froot,m is the fraction of root mass in soil layer m, θm is the volumetric soil water content of soil layer m, and θwilt and θfc are the volumetric water content at wilting point and field capacity, respectively. The functions f1 and f2 are calculated as

equation image
equation image

The function f1 is based on the work of Tjoelker et al. [2001] and f2 is based on the work of Reichstein et al. [2002], where b1 is a biome-dependent model parameter (μmol C m−2 s−1), and the empirical constants b2, b3, Ts0 and θs0 are equal to 52.4 (K), 285 (K), 227.2 (K) and 0.16, respectively [see Reichstein et al., 2002]. Tavg is the annual mean soil temperature (K), and is assumed to be equal to the temperature of the deepest soil layer (Ts,6).
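The root-mass weighting used for the soil temperature and moisture inputs can be sketched as follows. The plain weighted mean follows from the definitions in the text. The normalization of soil moisture between wilting point and field capacity is only one plausible reading of the moisture equation and is labeled here as an assumption, and all names and values are hypothetical.

```python
def root_weighted_mean(values, f_root):
    """Mean of per-layer values weighted by the root-mass fraction of each layer."""
    total = sum(f_root)
    if total <= 0.0:
        raise ValueError("root fractions must sum to a positive number")
    return sum(f * v for f, v in zip(f_root, values)) / total


def relative_soil_water(theta, theta_wilt, theta_fc):
    """Assumed normalization: (theta - theta_wilt) / (theta_fc - theta_wilt), clipped to [0, 1]."""
    return [min(max((t - theta_wilt) / (theta_fc - theta_wilt), 0.0), 1.0) for t in theta]


# Hypothetical two-layer example.
f_root = [0.75, 0.25]
t_mean = root_weighted_mean([280.0, 290.0], f_root)
w_mean = root_weighted_mean(relative_soil_water([0.2, 0.3], 0.1, 0.3), f_root)
```

Weighting by root mass means that the layers holding most of the roots dominate the temperature and moisture seen by the root and soil respiration functions.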

[89] The net ecosystem exchange of CO2, NEE, is then calculated as

equation image

Inputs to this submodel are the temperature of the air within the canopy (Ta), the root-mass weighted mean temperature and moisture of the soil (equation images, equation images), and the amounts of carbon in roots (Croot, g C m−2) and wood (Cwood, g C m−2); the output is NEE.

Acknowledgments

[90] We are most grateful to CSIRO and the Australian Greenhouse Office for their financial support of this work for over a decade. We thank Jan Verbesselt and Darius Culvenor of CSIRO for providing their unpublished NDVI measurements for the Tumbarumba forest, all scientists involved in collecting the measurements at the two forest sites, including the Harvard Forest flux measurements, for making their data available to us, and the editors and two reviewers for their constructive comments.
