A global reanalysis of vegetation phenology



[1] Simulations of the global water and carbon cycle are sensitive to the model representation of vegetation phenology. Current phenology models are empirical, and few predict both phenological timing and leaf state. Our previous study demonstrated how satellite data assimilation employing an Ensemble Kalman Filter yields realistic phenological model parameters for several ecosystem types. In this study the data assimilation framework is extended to global scales using a subgrid-scale representation of plant functional types (PFTs) and elevation classes. A reanalysis of vegetation phenology for 256 globally distributed regions is performed using 10 years of Moderate Resolution Imaging Spectroradiometer (MODIS) fraction of photosynthetically active radiation (FPAR) absorbed by vegetation and leaf area index (LAI) data. The 9 × 10^8 quality screened observations (corresponding to <1% of the globally available MODIS data) successfully constrain a posterior PFT-dependent phenological parameter set. This parameter set reduces the global FPAR and LAI prediction error to 20.6% and 14.8%, respectively, compared to the prior prediction error. A 50 year long (1960–2009) daily 1° × 1° global phenology data set with a mean FPAR and LAI prediction error of 0.065 (−) and 0.34 (m2 m−2) is generated. Temperate phenology is best explained by a combination of light and temperature. Tropical evergreen phenology is found to be largely insensitive to moisture and light variations. Boreal phenology can be accurately predicted from local to global scales, while temperate and mediterranean landscapes might benefit from a better subgrid-scale PFT classification or from a more complex canopy radiative transfer model.

1. Introduction

[2] Land surface vegetation is an interactive part of the climate system. Leaf transpiration influences cloudiness, temperature and moisture patterns of the atmosphere on the synoptic to climatological timescale [Heck et al., 1999; Tsvetsinskaya et al., 2001; Lu et al., 2001; Kim and Wang, 2005; Betts and Viterbo, 2005; Betts et al., 2007]. Vegetation biomass acts as a sink (or source) for the atmospheric carbon budget on a seasonal to centennial timescale [Keeling et al., 1996; Kramer et al., 2000; Schaefer et al., 2005; Piao et al., 2007; Körner, 2003]. The two processes regulating water loss and carbon uptake are coupled [Schimel et al., 1997; Sellers et al., 1997] and both depend on leaf stomatal opening and leaf presence. Leaf physiology controls stomates and is largely driven by local scale and short term weather events like the diurnal variability of temperature and radiation [Jarvis, 1976; Law et al., 2002; Larcher, 2003]. Leaf phenology on the other hand describes the timing of leaf appearance, presence and senescence and can be linked to the large scale seasonal to interannual climatic variability [Scheifinger et al., 2002; Menzel et al., 2006; Penuelas et al., 2009; Körner and Basler, 2010].

[3] Leaf physiology and leaf phenology are treated separately in most land surface models (LSMs) used to simulate the terrestrial water and carbon cycle. While several mechanistic formulations of plant physiological processes have been developed during the last three decades [Jarvis, 1976; Farquhar et al., 1980], highly empirical representations of plant phenology are used in LSMs [Cox, 2001; Foley et al., 1996; Levis and Bonan, 2004; Jolly et al., 2005]. In several LSMs, phenology is used as a means to scale leaf level physiological processes to the canopy level [Sellers et al., 1996b, 1997]. Phenology models used in LSMs simulate a continuous biophysical state of vegetation at the landscape scale rather than the timing of species-specific and local-scale events like flowering or bud burst. The latter information is available from long term phenological observations that are mostly confined to temperate climate zones [van Vliet et al., 2003; Rutishauser et al., 2007].

[4] However, the largest phenological model deficiencies are found for subtropical and mediterranean vegetation because model parameters are often generalized from temperate vegetation to global scales [Stöckli et al., 2008b]. Models often simulate a temporal mismatch in spring green up on the order of 1–2 months and show unrealistic drought responses of LAI that have adverse effects on the predicted terrestrial water and carbon fluxes [Kucharik et al., 2006; Randerson et al., 2009]. The ultimate way to overcome such deficiencies is to further develop LSMs with mechanistic terrestrial carbon-nitrogen cycling. Such models allow the coupling of leaf phenology and leaf physiology through, for instance, a prognostic carbon gain-loss formulation [Thornton et al., 2002; Arora and Boer, 2005].

[5] Satellite-based data assimilation can serve as an intermediate step to constrain unrealistic parameters of empirical phenology models and it might be used to augment the realism of terrestrial biosphere models [Demarty et al., 2007; Mahadevan et al., 2008; Rüdiger et al., 2010; Knorr et al., 2010; Rayner, 2010]. In the work of Stöckli et al. [2008b] we presented a local-scale data assimilation framework based on the Ensemble Kalman Filter (EnKF) [Evensen, 2003, 2009] that was able to mitigate several phenology model deficiencies by conditioning empirical model parameters with satellite-based phenological observations.

[6] Our local-scale data assimilation framework is, however, not suited for prediction at the regional scale because of the increase in landscape heterogeneity. A single set of parameters representing a mixed vegetation signal of a specific location cannot be used at another location with a different vegetation composition. A global-scale prediction would hence require a cumbersome parameterization procedure for each grid point. In order to be useful at the global scale, our previous framework needs to be extended. The main question is then how to select the bins needed to disaggregate global phenology into a discrete set of functional classes. We chose to split the mixed landscape into a discrete set of plant functional types (PFTs) and elevation classes (HGTs) for the following reasons. In earth system models the terrestrial biochemical cycle is often decomposed on the subgrid-scale by using PFTs [Sitch et al., 2003; Kucharik et al., 2006; Thornton et al., 2007]. In contrast to biomes, PFTs group plant species with similar physiological, structural and phenological traits. Satellite remote sensing data can be used to derive PFTs globally [Bonan, 2002; Lawrence and Chase, 2007]. However, any satellite-based classification is ultimately constrained by an incomplete set of functional traits [Ustin and Gamon, 2010] that only account for optical vegetation properties. Elevation classes are used since recent findings show that, for instance, a 100 m elevation difference can shift the leaf-out date by several days [Fisher et al., 2006]. This requires a subgrid-scale treatment of the forcing weather data in a global prediction where grid cells can include substantial variability in elevation.

[7] The aim of this study is to create a global MODIS-based reanalysis data set of vegetation phenology. It should provide earth system modelers with a data assimilation and modeling framework that can assimilate and predict FPAR and LAI of natural vegetation types. We first evaluate whether the chosen data assimilation scheme can constrain a PFT-dependent parameter set with 10 years of assimilated MODIS data. We then test whether the chosen phenology model, the PFT and HGT classification and the final satellite-constrained parameter set are suited to yield realistic global-scale phenological predictions. In section 2 the prognostic phenology model is presented, followed by a description of the data assimilation system. Global-scale data assimilation experiments are then performed to constrain a PFT-dependent phenological parameter set. This parameter set is used to predict global, regional and local FPAR and LAI. A global phenological reanalysis data set covering 50 years (1960–2009) is finally presented. An analysis of observed and predicted FPAR and LAI, followed by a thematic discussion, then evaluates the soundness of our method and data set.

2. Methods

2.1. Phenology Model

[8] The GSI (Growing Season Index) by Jolly et al. [2005] diagnoses the state of vegetation by use of three major climatic drivers serving as surrogates for the underlying controls on vegetation phenology: low temperatures, evaporative demand, and photoperiod. Stöckli et al. [2008b] and this study extended the GSI model into a prognostic phenology model that predicts the biophysical vegetation states FPAR and LAI.

2.1.1. Theory

[9] The GSI (−) is the product of three environmental factors f(T), f(L) and 1 − f(W),

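$$\mathrm{GSI} = f(T)\, f(L)\, \left[1 - f(W)\right] \qquad (1)$$

$$f(x) = \max\left(0,\ \min\left(1,\ \frac{x - x_{\min}}{x_{\max} - x_{\min}}\right)\right) \qquad (2)$$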

where x = {T, L, W} are multiday running mean averages of the minimum daily temperature Tm (K), the mean daily global radiation Rg (W m−2) and the mean daily vapor pressure deficit vpd (mb), using averaging times τT, τL and τW (days). Tmax, Tmin, Lmax, Lmin, Wmax and Wmin are maximum and minimum T, L and W, respectively. L can alternatively be driven by photoperiod (day length) instead of global radiation as suggested by Jolly et al. [2005] and scientifically outlined by Körner [2006].
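As a rough illustration of the index described above, the following Python sketch evaluates GSI for a single day. It assumes that each factor f(x) is a linear ramp between x_min and x_max, clamped to the range 0 to 1 in the spirit of Jolly et al. [2005]. The function names `ramp` and `gsi` and the dictionary `DBF_TEM` are hypothetical; the thresholds are the posterior temperate deciduous broadleaf values from Table 3 and serve only as sample inputs.

```python
def ramp(x, x_min, x_max):
    """Clamped linear ramp: 0 at or below x_min, 1 at or above x_max (assumed form)."""
    return max(0.0, min(1.0, (x - x_min) / (x_max - x_min)))


def gsi(T, L, W, p):
    """Growing Season Index as the product f(T) * f(L) * (1 - f(W)).

    T, L, W are the time-averaged minimum temperature (K), global radiation
    (W m-2) and vapor pressure deficit (mb); p holds the six thresholds.
    """
    f_T = ramp(T, p["Tmin"], p["Tmax"])
    f_L = ramp(L, p["Lmin"], p["Lmax"])
    f_W = ramp(W, p["Wmin"], p["Wmax"])
    return f_T * f_L * (1.0 - f_W)


# Sample thresholds (temperate deciduous broadleaf row of Table 3).
DBF_TEM = {"Tmin": 269.7, "Tmax": 291.5,
           "Wmin": 5.1, "Wmax": 25.4,
           "Lmin": 44.3, "Lmax": 203.0}

# Warm, bright and moist conditions give an index close to 1;
# a cold spell or a large vapor pressure deficit pulls it toward 0.
print(gsi(291.5, 203.0, 5.1, DBF_TEM))
print(gsi(260.0, 203.0, 5.1, DBF_TEM))
```

Because the three factors are multiplied, a single limiting driver is enough to suppress the index regardless of the other two.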

[10] The prognostic phenological state P (−) can be related to the biophysical state FPAR (−) by use of a linear relationship [Sellers et al., 1996a; Los et al., 2000],

equation image

where f(x) is given in equation (2), FPARmin and FPARmax are the minimum and maximum FPAR corresponding to the least and most developed state of vegetation. The growth vector ∂GSI/∂t (−) then gives the direction and rate of leaf growth or decay used to calculate the change in FPAR with a logistic growth model,

equation image
equation image

As presented by Dickinson et al. [2008] growth and senescence can be modeled as two separate processes. We choose a different maximum rate for leaf growth γg (day−1) and leaf senescence (γd) instead,

equation image

According to Sellers et al. [1996a] and Los et al. [2000] the biophysical state LAI (m2 m−2) can be related to FPAR by use of the Monsi-Saeki light interception model, which is based on Beer's law [Monsi and Saeki, 2005],

equation image
equation image

where fv (−) is the vegetation fraction and FPARsat (−) is the FPAR value reached at the maximum leaf area index LAImax (m2 m−2).
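The FPAR to LAI relation can be sketched as follows. This is a minimal example that assumes the commonly used logarithmic (Beer's law) form LAI = fv · LAImax · ln(1 − FPAR/fv) / ln(1 − FPARsat); it is not guaranteed to match the paper's equations term by term. The function names and the small lower bound on the logarithm argument are hypothetical choices, while the bounds on FPARsat follow the numerical constraints given in this section.

```python
import math


def _bounded_fpar_sat(fpar_sat):
    """Keep FPARsat inside (0, 1), as in the numerical constraints of the text."""
    return min(max(fpar_sat, 0.001), 0.999)


def lai_from_fpar(fpar, lai_max, fpar_sat, fv=1.0):
    """Invert Beer's law: LAI reaches lai_max when FPAR reaches fpar_sat."""
    fpar_sat = _bounded_fpar_sat(fpar_sat)
    # Hypothetical guard: keep the logarithm argument strictly positive.
    arg = min(max(1.0 - fpar / fv, 1e-6), 1.0)
    return fv * lai_max * math.log(arg) / math.log(1.0 - fpar_sat)


def fpar_from_lai(lai, lai_max, fpar_sat, fv=1.0):
    """Forward Beer's law with an extinction factor implied by (lai_max, fpar_sat)."""
    fpar_sat = _bounded_fpar_sat(fpar_sat)
    k = -math.log(1.0 - fpar_sat) / lai_max
    return fv * (1.0 - math.exp(-k * lai / fv))


# Round trip with sample values (not taken from the parameter tables).
lai = lai_from_fpar(0.6, lai_max=6.0, fpar_sat=0.95)
print(lai, fpar_from_lai(lai, lai_max=6.0, fpar_sat=0.95))
```

The logarithmic form makes LAI grow quickly as FPAR approaches saturation, which is why small FPAR errors near FPARsat translate into large LAI errors.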

2.1.2. Implementation

[11] A semi-implicit numerical scheme is used for the time integration. In comparison to Stöckli et al. [2008b] each grid-scale FPAR and LAI prediction is composed of subgrid-scale predictions covering h = 1 … nHGT elevation classes (HGT) and p = 1 … nPFT plant functional type (PFT) classes. Meteorological forcing is downscaled by HGT. Phenological model parameters are decomposed by PFT. The prognostic states are therefore decomposed by both HGT and PFT. They can be identified by their superscript time indices t and t + 1 in the following equations. GSI is diagnosed at every time step,

equation image

with new prognostic values of xt+1 = {T, L, W} that depend on their previous values xt, on the current elevation-dependent weather forcing y = {Tm, Rg, vpd} and on the PFT-specific time averaging parameters z = {τT, τL,τW},

equation image
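One simple way to realize such a running mean without storing the forcing history is a first-order recursive filter. The sketch below assumes an implicit (backward Euler) discretization of dx/dt = (y − x)/τ with a daily time step; the function name `update_running_mean` is hypothetical and the exact update used in the model may differ.

```python
def update_running_mean(x_prev, y, tau, dt=1.0):
    """Advance the averaged driver x by one time step.

    x_prev: previous averaged value, y: current forcing value,
    tau: averaging time (days), dt: time step (days).
    The implicit form stays stable for any positive tau.
    """
    return (tau * x_prev + dt * y) / (tau + dt)


# A step change in the forcing is followed with a lag of roughly tau days.
x = 270.0
for day in range(21):
    x = update_running_mean(x, 280.0, tau=21.0)
print(x)
```

A short τ lets the averaged driver track individual weather events, whereas a long τ retains only the seasonal signal.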

Leaf growth ΔFPAR depends both on the new phenological state GSI and the previous biophysical state FPAR,

equation image
equation image
equation image
equation image
equation image

Compared to Stöckli et al. [2008b] LAI is a diagnostic variable derived from the prognostic state FPAR at each time step,

equation image

[12] Grid-scale FPAR and LAI are calculated by area weighted summation (ap and ah are fractional areas for each PFT and HGT class) of the PFT- and HGT-specific FPAR and LAI states:

equation image
equation image
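The area-weighted summation can be illustrated with a few lines of Python. The sketch assumes that the weight of each subgrid tile is the product ah · ap, that is, that the PFT and elevation distributions are treated as independent within a grid cell; the function name `grid_mean` and the sample numbers are hypothetical.

```python
def grid_mean(values, a_h, a_p):
    """Aggregate subgrid states to the grid scale.

    values[h][p] is the state (FPAR or LAI) for elevation class h and PFT p;
    a_h and a_p are fractional areas that each sum to 1.
    """
    return sum(a_h[h] * a_p[p] * values[h][p]
               for h in range(len(a_h))
               for p in range(len(a_p)))


# Two elevation classes and two PFTs with made-up FPAR values.
fpar_tiles = [[0.2, 0.6],
              [0.4, 0.8]]
print(grid_mean(fpar_tiles, a_h=[0.5, 0.5], a_p=[0.25, 0.75]))
```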

[13] The following numerical constraints are used: P(1 − P) = max(P(1 − P), 0.01); fv = 1.0, since the vegetation fraction is represented by the fractional areas ap of each PFT; FPARsat = min(max(FPARsat, 0.001), 0.999). The arguments of the natural logarithms in equation (16) are constrained to be larger than 0.0 and lower than 1.0.

2.2. Data Assimilation Model

[14] Ensemble data assimilation is the key method of this study. It makes it possible to find realistic values, as well as their uncertainties, for a large set of unknown PFT-specific model parameters in the above equations by use of a global set of satellite observations.

2.2.1. Theory

[15] The Ensemble Kalman Filter (EnKF) after Evensen [1994, 2003] is applied in this study with modifications for joint state and parameter estimation following Moradkhani et al. [2005] and Evensen [2009]. The EnKF conditions N prior model states and parameter ensemble members with m observations yielding a posterior model state and parameter ensemble,

$$A^{a} = A^{f} + K\left(D - H A^{f}\right) \qquad (19)$$

where Af is the ensemble matrix containing the prior model states and parameters. They are updated to Aa when new observations D become available. H is the operator relating observed to model states and parameters, D − HAf is the innovation matrix and K is the Kalman gain (for details, see Evensen [2003]). A is a matrix holding N ensemble members of the vector ψ with n states x and parameters θ. D is the matrix holding N ensemble members of the vector d with m observations,

equation image
equation image
equation image
equation image

[16] The state and parameter ensemble members ψi0 are perturbed at the beginning of the model integration by use of a Gaussian distribution with mean 0 and initial variance Vψ0. The observation ensemble members di are perturbed with mean 0 and with the observation variance Vd at each analysis time step.
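The joint state and parameter update can be sketched with NumPy. The example below is the basic perturbed-observation form of the EnKF analysis described by Evensen [2003], not the square root scheme used in this study, and it is applied to a toy model in which a single state equals twice a single parameter. The function name `enkf_analysis`, the toy model and all numbers are assumptions made for illustration.

```python
import numpy as np


def enkf_analysis(A, HA, d, obs_var, rng):
    """Perturbed-observation EnKF analysis.

    A: (n, N) prior ensemble of states and parameters,
    HA: (m, N) prior ensemble mapped to observation space,
    d: (m,) observations, obs_var: (m,) observation variances.
    """
    N = A.shape[1]
    # Observation ensemble D: each member perturbed with variance obs_var.
    D = d[:, None] + rng.normal(0.0, np.sqrt(obs_var)[:, None], size=(d.size, N))
    A_pert = A - A.mean(axis=1, keepdims=True)
    S = HA - HA.mean(axis=1, keepdims=True)
    # Innovation covariance and Kalman gain from ensemble statistics.
    C = S @ S.T / (N - 1) + np.diag(obs_var)
    K = (A_pert @ S.T / (N - 1)) @ np.linalg.pinv(C)
    return A + K @ (D - HA)


# Toy problem: state x = 2 * theta, true theta = 3, prior theta ~ N(1, 1).
rng = np.random.default_rng(0)
theta = rng.normal(1.0, 1.0, size=500)
prior = np.vstack([2.0 * theta, theta])   # rows: state x, parameter theta
HA = prior[:1, :]                          # only the state is observed
posterior = enkf_analysis(prior, HA, np.array([6.0]), np.array([0.01]), rng)
print(posterior[1].mean(), posterior[1].var())
```

Although only the state is observed, the parameter is updated through its ensemble covariance with the state, which is the mechanism that allows satellite FPAR and LAI to constrain the phenological parameters.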

2.2.2. Implementation

[17] States and parameters making up the matrix A are defined in Table 1 with initial (prior) values similar to those given by Jolly et al. [2005] and variances encompassing the orders of magnitude found in the global climate system.

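Table 1. State and Parameter Vector ψ, Initial Values ψ0, Initial Variances Vψ0, Minimum and Maximum Bounds for the Ensemble Mean

| Name | ψ0 | Vψ0 | Minimum | Maximum | Unit |
|---|---|---|---|---|---|
| States x | | | | | |
| LAI | 2.5 | 1.0 | 0 | 10 | m2 m−2 |
| T | | 0.25 | 200 | 350 | K |
| L | | 10 | 0 | 1000 | W m−2 |
| W | | 0.01 | 0 | 100 | mb |
| Parameters θ | | | | | |
| Lmax | 150 | 1000 | −100 | 500 | W m−2 |
| Lmin | 50 | 1000 | −100 | 500 | W m−2 |
| LAImax | 7.0 | 0.5 | 0 | 10 | m2 m−2 |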

[18] Directly assimilating all global 1 km MODIS FPAR and LAI observations would yield a matrix D with dimensions of O(10^9) observations × O(10^3) ensemble members, which is computationally very expensive to solve with the EnKF framework. Therefore, superobservations equation image for each model grid cell are created from observations do with o = 1 … nobs:

equation image
equation image
equation image
equation image
equation image
equation image

where equation image and equation imaged are the grid-scale superobservation and its uncertainty, and ah and ap are the grid-scale fractional HGT and PFT areas of the superobservation. By use of the weighting scheme wo superobservations contain the highest quality satellite data within each grid cell.

[19] The observation operator HA is created by linearly aggregating modeled FPAR and LAI weighted by observed elevation distribution equation image and PFT distribution equation image for each superobservation:

equation image
equation image
equation image

where x = [FPAR, LAI] is predicted by the prognostic phenology model. Ensemble perturbations HA′ are rescaled with the state variance because the weighted addition of ensemble members by definition deflates the ensemble variance when not all weights are equal.

[20] Aa is calculated by use of the square root implementation of the EnKF scheme as presented by Evensen [2004, section 7.3, equations (69)–(93)] using the low-rank pseudoinverse calculation because the observation count in our analysis will always exceed the ensemble size. Overdispersal, overconfidence and nonphysical drift of the posterior state and parameter ensemble are controlled by applying:

equation image
equation image
equation image
equation image

where α = 1.0 is the upper limit for the ensemble dispersal, relative to the prior ensemble variance, β = 0.1 is the lower limit for the ensemble shrinkage, relative to the prior ensemble variance, and Amin and Amax are the lower and upper bounds for the ensemble mean as given in Table 1. It is important to note that the latter physical limits only shift the ensemble mean without modifying the ensemble variance.
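A possible reading of these limits is sketched below with NumPy: the posterior perturbations are rescaled so that the ensemble variance stays between β and α times the prior variance, and the ensemble mean is then clipped to the physical bounds without touching the perturbations. The function name `limit_ensemble` is hypothetical, and both the order of the two steps and the use of the sample variance are assumptions rather than details taken from the paper.

```python
import numpy as np


def limit_ensemble(A_post, var_prior, a_min, a_max, alpha=1.0, beta=0.1):
    """Bound ensemble variance and mean, row by row.

    A_post: (n, N) posterior ensemble, var_prior: (n,) prior variances,
    a_min, a_max: (n,) bounds for the ensemble mean.
    """
    mean = A_post.mean(axis=1, keepdims=True)
    pert = A_post - mean
    var_post = pert.var(axis=1, ddof=1)
    # Keep the variance inside [beta, alpha] times the prior variance.
    var_target = np.clip(var_post, beta * var_prior, alpha * var_prior)
    scale = np.sqrt(var_target / np.maximum(var_post, 1e-30))
    # Physical limits shift only the mean; perturbations are unchanged.
    mean = np.clip(mean, a_min[:, None], a_max[:, None])
    return mean + pert * scale[:, None]


# A collapsed ensemble whose mean has drifted above its upper bound.
rng = np.random.default_rng(1)
collapsed = 12.0 + 0.01 * rng.standard_normal((1, 200))
limited = limit_ensemble(collapsed, np.array([1.0]),
                         np.array([0.0]), np.array([10.0]))
print(limited.mean(), limited.var(ddof=1))
```

In this example the mean is pulled back to the bound of 10 and the variance is inflated to 0.1, one tenth of the prior variance.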

[21] Aa is a global solution that updates all local states and parameters with a single global analysis. This is needed to estimate a single global set of parameters. The presented assimilation scheme can also be used for state estimation. Each local analysis then uses a spatial influence function that updates only states close to the observations. Such a local analysis for state estimation could follow the global analysis for parameter estimation as for instance outlined by equations (80) and (81) in the work of Evensen [2003].

2.3. Data

2.3.1. Meteorological Forcing Data

[22] Daily minimum temperature Tm, daily mean global radiation Rg and daily mean vapor pressure deficit vpd serve as forcing weather data for the prognostic phenology model. 1° × 1° gridded ECMWF ERA 40 [Uppala et al., 2005] are used during 1958–1989 and ERA Interim [Berrisford et al., 2009] are used during 1990–2006. The ensemble members of Tm, Rg and vpd are stochastically perturbed at each grid point and at each time step with a variance of 0.025 K, 1.0 W m−2 and 0.005 mb, respectively.

[23] The grid-scale 1° × 1° ERA 40 and ERA Interim data serve as the starting values for calculating Tm, vpd and Rg for each subgrid-scale elevation class. Subgrid-scale Tm is derived from grid-scale Tm by use of a lapse rate of −0.6 K 100 m−1; subgrid-scale vpd is calculated by keeping the mixing ratio constant with height and applying the subgrid-scale Tm to the vpd calculation. Subgrid-scale Rg increases by 0.3 W m−2 100 m−1 (mainly due to decreased atmospheric optical thickness at greater elevation). The local-scale experiments carried out at the four FLUXNET sites are driven by the grid-scale 1° × 1° ERA 40 and ERA Interim weather forcing downscaled to the single elevation class of the respective FLUXNET site.
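The elevation adjustment of temperature and radiation can be sketched as follows. The example assumes that the offsets are applied relative to the mean elevation of the reanalysis grid cell; the function and constant names are hypothetical, and the vpd adjustment is left out because it requires a saturation vapor pressure formulation that is not specified here.

```python
LAPSE_RATE_K_PER_M = -0.6 / 100.0       # -0.6 K per 100 m
RG_GRADIENT_W_M2_PER_M = 0.3 / 100.0    # +0.3 W m-2 per 100 m


def downscale_forcing(tm_grid, rg_grid, z_grid, z_class):
    """Return (Tm, Rg) for an elevation class at height z_class (m).

    tm_grid (K) and rg_grid (W m-2) are valid at the grid elevation z_grid (m).
    """
    dz = z_class - z_grid
    tm_class = tm_grid + LAPSE_RATE_K_PER_M * dz
    rg_class = rg_grid + RG_GRADIENT_W_M2_PER_M * dz
    return tm_class, rg_class


# An elevation class 500 m above the grid mean is 3 K colder
# and receives 1.5 W m-2 more radiation.
print(downscale_forcing(280.0, 150.0, z_grid=1000.0, z_class=1500.0))
```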

2.3.2. Satellite Observation Data

[24] TERRA MODIS FPAR and LAI (MOD15A2, Collection 5 [Myneni et al., 2002]) fill the observation vector d in the assimilation experiments. They are also used as comparison data in section 3. Observations are quality screened and used only if their values are inside the valid range, and if none of the following MOD15A2 quality flag bits are set:

  • FparLai bit 2 (dead detectors)
  • FparLai bits 3 or 4 (clouds present or unclear)
  • FparLai bit 7 (failed retrieval)
  • FparExtra bit 0 or 1 (pixel not on land)
  • FparExtra bit 2 (snow or ice)
  • FparExtra bit 5 (internal cloud mask)
  • FparExtra bit 6 (cloud shadow detected)

[25] Observation uncertainty Vd for valid observations is calculated by multiplying the minimum uncertainty by the summed “severity factor” s and adding the result to the minimum uncertainty, i.e., Vd equals the minimum uncertainty times (1 + s). Minimum uncertainty is defined as 0.05 (−) for FPAR and 1.0 (m2 m−2) for LAI. The severity factor starts at s = 0 and is incremented as follows:

  • if FparLai bit 0 (backup algorithm) is set: s = s + 1
  • if FparLai bit 5 (saturated retrieval) is set: s = s + 2
  • if FparLai bit 6 (empirical method used) is set: s = s + 4
  • if FparExtra bit 3 (aerosols present) is set: s = s + 3
  • if FparExtra bit 4 (cirrus clouds detected) is set: s = s + 8
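The screening and weighting rules of this section can be condensed into a short Python sketch. It assumes that bit 0 is the least significant bit of each quality word and that the two quality layers are available as plain integers; the function and constant names are hypothetical, and the valid-range check on the data values is omitted.

```python
# Bits that reject an observation outright (as listed in the text).
REJECT_FPARLAI = (2, 3, 4, 7)
REJECT_FPAREXTRA = (0, 1, 2, 5, 6)
# Bits that add to the severity factor s, with their increments.
SEVERITY_FPARLAI = {0: 1, 5: 2, 6: 4}
SEVERITY_FPAREXTRA = {3: 3, 4: 8}
# Minimum uncertainty: FPAR (-) and LAI (m2 m-2).
V_MIN = {"FPAR": 0.05, "LAI": 1.0}


def bit_set(word, bit):
    """True if the given bit (0 = least significant, assumed) is set."""
    return bool((word >> bit) & 1)


def observation_uncertainty(fparlai_qc, fparextra_qc, variable):
    """Return the observation uncertainty, or None if the pixel is screened out."""
    if any(bit_set(fparlai_qc, b) for b in REJECT_FPARLAI):
        return None
    if any(bit_set(fparextra_qc, b) for b in REJECT_FPAREXTRA):
        return None
    s = sum(w for b, w in SEVERITY_FPARLAI.items() if bit_set(fparlai_qc, b))
    s += sum(w for b, w in SEVERITY_FPAREXTRA.items() if bit_set(fparextra_qc, b))
    return V_MIN[variable] * (1 + s)


# Backup algorithm plus empirical method plus cirrus: s = 1 + 4 + 8 = 13.
print(observation_uncertainty(0b1000001, 0b10000, "LAI"))
```

A clean retrieval keeps the minimum uncertainty, while a retrieval with several warning flags is retained but weighted far less in the analysis.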

2.3.3. Elevation Data

[26] The subgrid-scale distribution of elevation classes is derived from the gap-filled CGIAR-CSI SRTM global elevation data set version 4 (A. Jarvis et al., Hole-filled seamless SRTM data v4, http://srtm.csi.cgiar.org, 2008), extended to the polar areas with GTOPO30 elevation data [U.S. Geological Survey, 1996]. The nHGT elevation classes are equally distributed over two standard deviations of the elevation range in the assimilation area. Elevations below the lowest class or above the highest class are assigned to the lowest and highest class, respectively. The area fraction ah for each elevation class is calculated by grid cell.

2.3.4. Plant Functional Type Data

[27] The subgrid-scale distribution of 35 plant functional type classes is derived from MOD12Q1 Collection 4 Land Cover [Friedl et al., 2002], MOD44B Collection 3 Vegetation Continuous Fields [Hansen et al., 2003], AVHRR Tree Cover Continuous Fields [Defries et al., 2000], MOD15A2 Collection 5 [Myneni et al., 2002], global crop data [Leff et al., 2004], global temperature (Version 2.02) and precipitation (Version 2.01) data [Wilmott and Matsuura, 2007], following the method described by Lawrence and Chase [2007] and Bonan et al. [2002]. The resulting PFT data set contains the area fraction ap for each of the 35 PFTs by grid cell. The PFT processing is described in Appendix A, and a list of PFTs is given in Table 2. In this publication only the 15 natural PFTs are analyzed even though all 35 PFTs were included in the data assimilation.

Table 2. List of PFTs Including Their Abbreviations^a

| PFT Number | PFT Name | PFT Abbreviation |
|---|---|---|
| 1 | Bare soil, rock, ice, permanent snow | bar all |
| 2 | Trees: temperate evergreen needleleaf | enf tem |
| 3 | Trees: boreal evergreen needleleaf | enf bor |
| 4 | Trees: boreal deciduous needleleaf | dnf bor |
| 5 | Trees: tropical evergreen broadleaf | ebf tro |
| 6 | Trees: temperate evergreen broadleaf | ebf tem |
| 7 | Trees: tropical deciduous broadleaf | dbf tro |
| 8 | Trees: temperate deciduous broadleaf | dbf tem |
| 9 | Trees: boreal deciduous broadleaf | dbf bor |
| 10 | Shrubs: evergreen broadleaf | ebs all |
| 11 | Shrubs: temperate deciduous broadleaf | dbs tem |
| 12 | Shrubs: boreal deciduous broadleaf | dbs bor |
| 13 | Grass: Arctic c3 | c3g arc |
| 14 | Grass: non-Arctic c3 | c3g nar |
| 15 | Grass: c4 | c4g all |

^a Only the PFTs of natural vegetation types are given.

2.4. Experimental Setup

[28] The data assimilation experiments constrain a set of model parameters. The parameters are then used in global prediction experiments. Figure 1a displays the geographic location of the 256 manually selected regions used for the data assimilation experiments. In order to start where our previous study has ended, the 4 region selection (red squares) includes a temperate, mediterranean, boreal and tropical ecosystem at four FLUXNET sites that are identical to the ones used by Stöckli et al. [2008b]. Figure 1b then shows how the 256 region selection finally becomes representative for the full range of climatic conditions needed in a global prediction. The technical details on both the data assimilation and the prediction model are given in Appendix B.

Figure 1.

(a) Geographic and (b) climatic distribution of the regions used for the data assimilation: experiments with 4 regions (red large squares); 16 regions (red squares + blue triangles); 64 regions (red squares + blue triangles + green diamonds); 256 regions (red squares + blue triangles + green diamonds + violet circles). The colors of the climatic distribution (Figure 1b) qualitatively show the relative probability of occurrence for the given climatic zone (bright yellow, low probability; dark red, high probability).

2.4.1. Data Assimilation

[29] The four data assimilation experiments span 4, 16, 64 and 256 regions with 0.5° × 0.5° spatial coverage per region (subsequently labeled as 4, 16, 64 and 256). Each region is subdivided into 25 grid cells of 0.1° × 0.1°, where each grid cell has a subgrid-scale representation of 10 HGT classes and 35 PFT classes. 1000 ensemble members are integrated in time. Prior model parameters and states are initialized and perturbed as given in Table 1. The phenology model is integrated for 30 years by cycling the 10 year observation period (2000–2009) three times. The 8 day MOD15A2 observations are read at the center of their compositing period (days 4, 12, etc.) since the exact compositing day is not given in the MOD15A2 data set. In contrast to Stöckli et al. [2008b], the EnKF analysis is carried out at the end of each year and not after each observation period in order to avoid convergence to local minima. This modification further results in a less overconfident parameter ensemble and is based on the assumption that a yearly constant set of model parameters simulates the seasonal variation of vegetation states.
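The bookkeeping behind this setup can be checked with a few lines of Python. The sketch assumes that the 8 day composites start on day 1 of each year, so that the compositing centers fall on days 4, 12 and so on; the function names are hypothetical.

```python
def composite_center_days(period=8, days_in_year=365):
    """Day of year at the center of each compositing period (assumed start: day 1)."""
    return [start + period // 2 - 1
            for start in range(1, days_in_year + 1, period)]


def subgrid_tiles(n_regions, cells_per_region=25, n_hgt=10, n_pft=35):
    """Number of subgrid tiles carried by one ensemble member."""
    return n_regions * cells_per_region * n_hgt * n_pft


print(composite_center_days()[:3])   # first three observation days
print(subgrid_tiles(256))            # tiles in the 256 region experiment
```

With 1000 ensemble members, the 256 region experiment therefore integrates on the order of 10^9 tile states per time step, which illustrates why the analysis is restricted to a sample of regions rather than the full globe.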

2.4.2. Global Prediction

[30] Global predictions are carried out on a 1° × 1° global grid with the prior parameter set (subsequently labeled as “prior”) and with the posterior parameter sets obtained by the above described data assimilation experiments (subsequently labeled as 4, 16, 64 and 256). The integrations employ 10 ensemble members spanning the parameter uncertainty and they are integrated forward in time during 1959–2009. 1959 is used as spin-up year. The prognostic states FPAR and LAI are generally within 1% of their spun-up values after 3 months. A final 50 year long global “reanalysis” data set covers the period 1960–2009 and uses the 256 region parameter set.

3. Results

[31] The data assimilation framework is used to estimate a new (posterior) set of global phenological parameters that should yield a better prediction of phenological states. In this section the prior and posterior parameter uncertainties are firstly analyzed. Secondly, the effect of the posterior parameter set on the global, regional and local-scale prediction of phenological states is evaluated.

3.1. Global Parameter Estimation

[32] A total number of 510 empirical model parameters were estimated (Tables 3–5). They can be separated into 6 climate control parameters, 6 structural parameters and 3 time averaging parameters that are estimated for each of the 15 natural PFTs (the water PFT and the 19 crop PFT parameters are excluded from the analysis).

Table 3. Climate Control Parameters (Mean and Standard Deviation) by PFT Constrained by the Assimilation Using 256 Regions^a

| PFT | Tmin (K) | Tmax (K) | Wmin (mb) | Wmax (mb) | Lmin (W m−2) | Lmax (W m−2) |
|---|---|---|---|---|---|---|
| bar all | 270.6 ± 0.7 | 290.9 ± 0.8 | 12.5 ± 0.7 | 23.6 ± 0.4 | 102.7 ± 10.3 | 149.4 ± 6.5 |
| enf tem | 263.1 ± 0.5 | 276.4 ± 0.3 | 6.9 ± 0.3 | 47.9 ± 1.3 | −68.3 ± 7.3 | 216.7 ± 2.5 |
| enf bor | 263.8 ± 0.6 | 290.0 ± 0.7 | 7.6 ± 0.4 | 21.4 ± 2.4 | −82.8 ± 10.0 | 197.4 ± 4.4 |
| dnf bor | 262.2 ± 0.9 | 275.6 ± 0.7 | 18.8 ± 3.0 | 27.9 ± 3.8 | 103.9 ± 5.9 | 208.0 ± 2.7 |
| ebf tro | 271.3 ± 1.8 | 292.8 ± 0.3 | 21.9 ± 0.6 | −1.4 ± 2.2 | 82.3 ± 9.4 | 168.9 ± 2.6 |
| ebf tem | 259.1 ± 1.0 | 285.9 ± 0.3 | 10.1 ± 0.4 | 20.9 ± 3.0 | 14.1 ± 10.7 | 35.0 ± 6.0 |
| dbf tro | 278.0 ± 0.4 | 299.1 ± 0.1 | 9.9 ± 0.2 | 43.9 ± 0.6 | 44.0 ± 13.8 | 81.4 ± 7.6 |
| dbf tem | 269.7 ± 0.3 | 291.5 ± 0.2 | 5.1 ± 0.2 | 25.4 ± 0.3 | 44.3 ± 3.9 | 203.0 ± 1.8 |
| dbf bor | 271.0 ± 0.6 | 279.8 ± 0.3 | 7.0 ± 1.0 | 46.9 ± 3.5 | 110.1 ± 3.7 | 223.4 ± 2.2 |
| ebs all | 265.5 ± 2.2 | 281.7 ± 0.8 | 3.4 ± 0.7 | 14.4 ± 0.4 | −7.0 ± 7.1 | 242.4 ± 6.0 |
| dbs tem | 256.9 ± 0.6 | 298.0 ± 0.2 | 1.6 ± 0.4 | 44.5 ± 0.5 | −4.7 ± 9.2 | 69.3 ± 3.8 |
| dbs bor | 273.5 ± 0.3 | 287.8 ± 0.5 | 17.5 ± 1.0 | 11.7 ± 2.9 | 60.8 ± 11.2 | 68.0 ± 8.1 |
| c3g arc | 267.8 ± 0.4 | 282.0 ± 0.4 | 2.3 ± 0.3 | 13.5 ± 0.5 | 19.9 ± 7.1 | 198.2 ± 3.2 |
| c3g nar | 267.1 ± 0.2 | 298.2 ± 0.5 | 1.5 ± 0.2 | 15.4 ± 0.1 | −21.4 ± 6.6 | 63.0 ± 3.3 |
| c4g all | 268.6 ± 0.4 | 279.2 ± 0.3 | 4.1 ± 0.2 | 23.3 ± 0.2 | −9.0 ± 5.1 | 217.7 ± 1.4 |

^a PFT abbreviations are explained in Table 2.

Table 4. Structural Parameters (Mean and Standard Deviation) by PFT Constrained by the Assimilation Using 256 Regions^a

| PFT | FPARmin | FPARmax | γg (days−1) | γd (days−1) | LAIsat (m2 m−2) | FPARsat^b |
|---|---|---|---|---|---|---|
| bar all | 0.11 ± 0.00 | 0.05 ± 0.00 | 0.19 ± 0.03 | 0.05 ± 0.01 | 9.70 ± 0.31 | 1.00 ± 0.01 |
| enf tem | 0.52 ± 0.01 | 0.98 ± 0.00 | 0.19 ± 0.02 | 0.19 ± 0.02 | 5.93 ± 0.25 | 0.98 ± 0.00 |
| enf bor | 0.52 ± 0.01 | 1.00 ± 0.01 | 0.34 ± 0.03 | 0.39 ± 0.04 | 6.23 ± 0.28 | 0.98 ± 0.00 |
| dnf bor | 0.33 ± 0.02 | 1.00 ± 0.00 | 0.49 ± 0.04 | 0.37 ± 0.04 | 6.69 ± 0.28 | 1.00 ± 0.00 |
| ebf tro | 0.16 ± 0.04 | 0.99 ± 0.00 | 0.57 ± 0.05 | 0.05 ± 0.01 | 7.07 ± 0.02 | 0.93 ± 0.01 |
| ebf tem | 0.01 ± 0.03 | 1.00 ± 0.01 | 0.45 ± 0.04 | 0.05 ± 0.00 | 6.91 ± 0.07 | 0.96 ± 0.00 |
| dbf tro | 0.27 ± 0.01 | 1.00 ± 0.01 | 0.37 ± 0.03 | 0.21 ± 0.02 | 6.85 ± 0.07 | 0.93 ± 0.00 |
| dbf tem | 0.29 ± 0.00 | 1.00 ± 0.01 | 0.57 ± 0.03 | 0.42 ± 0.02 | 6.01 ± 0.12 | 0.92 ± 0.00 |
| dbf bor | 0.23 ± 0.01 | 1.00 ± 0.01 | 0.60 ± 0.04 | 0.49 ± 0.04 | 6.85 ± 0.23 | 0.94 ± 0.01 |
| ebs all | 0.39 ± 0.01 | 0.85 ± 0.02 | 0.36 ± 0.04 | 0.31 ± 0.04 | 6.02 ± 0.31 | 0.97 ± 0.01 |
| dbs tem | −0.00 ± 0.01 | 0.77 ± 0.01 | 0.49 ± 0.04 | 0.43 ± 0.03 | 3.74 ± 0.10 | 0.84 ± 0.01 |
| dbs bor | 0.33 ± 0.01 | 0.84 ± 0.02 | 0.42 ± 0.03 | 0.47 ± 0.04 | 7.50 ± 0.31 | 0.99 ± 0.00 |
| c3g arc | 0.12 ± 0.00 | 0.63 ± 0.01 | 0.30 ± 0.03 | 0.15 ± 0.02 | 6.80 ± 0.27 | 0.99 ± 0.00 |
| c3g nar | 0.24 ± 0.00 | 0.94 ± 0.01 | 0.47 ± 0.02 | 0.37 ± 0.02 | 8.36 ± 0.10 | 1.00 ± 0.00 |
| c4g all | 0.19 ± 0.00 | 0.51 ± 0.00 | 0.55 ± 0.03 | 0.13 ± 0.01 | 7.87 ± 0.31 | 1.00 ± 0.00 |

^a PFT abbreviations are explained in Table 2.
^b Units are dimensionless.

Table 5. Time Averaging Parameters (Mean and Standard Deviation) by PFT Constrained by the Assimilation Using 256 Regions^a

| PFT | τT (days) | τW (days) | τL (days) |
|---|---|---|---|
| bar all | 34.1 ± 1.3 | 43.0 ± 1.1 | 21.4 ± 1.5 |
| enf tem | 25.9 ± 1.1 | 30.5 ± 1.4 | 12.2 ± 1.0 |
| enf bor | 5.3 ± 0.8 | 19.1 ± 1.6 | 5.4 ± 1.1 |
| dnf bor | 17.0 ± 1.3 | 9.6 ± 1.7 | 10.7 ± 1.1 |
| ebf tro | 22.9 ± 1.7 | 36.1 ± 1.6 | 5.0 ± 1.2 |
| ebf tem | 12.4 ± 1.2 | 13.0 ± 1.6 | 23.4 ± 1.6 |
| dbf tro | 21.4 ± 1.0 | 11.3 ± 0.7 | 24.1 ± 1.6 |
| dbf tem | 5.5 ± 0.5 | 33.7 ± 1.0 | 15.4 ± 0.8 |
| dbf bor | 12.9 ± 0.9 | 17.6 ± 1.6 | 16.5 ± 1.1 |
| ebs all | 19.1 ± 1.7 | 29.1 ± 1.4 | 13.5 ± 1.2 |
| dbs tem | 16.4 ± 0.9 | 10.7 ± 0.6 | 19.9 ± 1.4 |
| dbs bor | 5.4 ± 0.6 | 18.7 ± 1.6 | 30.9 ± 1.5 |
| c3g arc | 5.2 ± 0.8 | 23.8 ± 1.3 | 25.4 ± 1.2 |
| c3g nar | 5.0 ± 0.5 | 18.3 ± 0.6 | 20.3 ± 1.2 |
| c4g all | 17.8 ± 1.0 | 7.2 ± 0.3 | 27.3 ± 0.9 |

^a PFT abbreviations are explained in Table 2.

3.1.1. Climate Control Parameters

[33] The climate control parameters given in Table 3 serve as environmental triggers that primarily determine leaf onset and senescence. The data assimilation is able to reduce the posterior uncertainty to <30% of the prior uncertainty (the latter is the square root of the initial parameter variances found in Table 1) for 82% of the temperature, 76% of the moisture and 72% of the light control parameters, respectively. Table 3 reveals negative values for the light control parameter Lmin. This seems unphysical since the meteorological forcing Rg cannot become negative. However, the employed phenology model is highly empirical without real physical constraints. The negative Lmin values thus allow the evergreen needleleaf species to keep needles and therefore maintain LAI during winter, when light can be absent, especially in boreal regions.

3.1.2. Structural Parameters

[34] Structural parameters are given in Table 4. They determine the upper and lower bounds of the leaf state. For tropical evergreen broadleaf forests, the FPARmax (the upper bound of FPAR) is better constrained than FPARmin since a total absence of leaves can hardly ever be observed. The lowest FPAR values can further be contaminated by clouds and aerosols and get a larger observation error by the employed observation quality screening method. The EnKF then creates larger posterior parameter uncertainties when few observations concur with larger observation errors. The EnKF estimates each parameter independently from the others. The bare soil FPARmax (0.05) for instance has a lower posterior estimate than the FPARmin (0.11). The prediction of bare soil FPAR will therefore have no seasonal cycle. The predicted FPAR will remain at FPARmax. The posterior uncertainty of FPARmax and LAIsat is well below 5% of the prior parameter uncertainty for several boreal and temperate PFTs such as temperate deciduous broadleaf forest, boreal evergreen needleleaf forests or non-Arctic grasslands. Leaf growth rate γg values are higher than the leaf decay rate γd for most PFTs. Leaf onset is a faster process than leaf senescence for most natural species. Posterior uncertainties of both γg and γd are between 10 and 20% of their initial uncertainty.

3.1.3. Time Averaging Parameters

[35] Time averaging parameters are given in Table 5. The posterior uncertainties for the time averaging parameters are in the range of 20–50% of the prior uncertainty. Jolly et al. [2005] use 21 days for these parameters, which was our initial (prior) value. For the temperate deciduous broadleaf forest PFT the averaging times needed for temperature and light decrease from 21 to 5.5 days and from 21 to 15.4 days, respectively, while the averaging time needed for moisture increases from 21 to 33.7 days.

3.2. Global Prediction

[36] A 50 year long FPAR and LAI reanalysis data set is generated by running the prognostic phenology model with the 256 region parameter set over the whole ERA Interim and ERA 40 period (1960–2009). The prior and posterior global FPAR and LAI prediction uncertainties and errors are analyzed in this section. The prediction uncertainty is caused by the model's parameter uncertainty. It can be calculated as the ensemble variance of the predicted FPAR and LAI. The prediction error on the other hand is defined as the mean absolute deviation (MAD) between the ensemble mean of predicted FPAR and LAI and the quality screened observations (2000–2009).
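The two diagnostics defined above can be sketched in a few lines. This is a minimal illustration, not the code used for the reanalysis: the function names, the three-member ensemble and the LAI values are hypothetical, and the use of the sample variance (division by N − 1) is an assumption.

```python
from statistics import mean, variance


def prediction_uncertainty(members):
    """Ensemble (sample) variance of one predicted value across members."""
    return variance(members)


def prediction_error(ensemble_by_time, observations):
    """Mean absolute deviation between ensemble mean and observations.

    Time steps whose observation is None (rejected by the quality
    screening) are skipped.
    """
    deviations = [
        abs(mean(members) - obs)
        for members, obs in zip(ensemble_by_time, observations)
        if obs is not None
    ]
    return mean(deviations)


# Hypothetical LAI ensemble (3 members) at three time steps, in m2 m-2.
lai_ensemble = [[1.0, 1.2, 1.4], [2.0, 2.5, 3.0], [3.0, 3.0, 3.0]]
lai_observed = [1.0, None, 3.5]  # None = observation failed screening

uncertainty = [prediction_uncertainty(m) for m in lai_ensemble]
error = prediction_error(lai_ensemble, lai_observed)
```

The uncertainty depends only on the spread of the ensemble, whereas the error also requires observations. The two quantities can therefore evolve differently as more observations are assimilated.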

[37] Figure 2a summarizes the mean leaf state and its seasonal variability for the 50 year long reanalysis data set. It shows that the highest annual mean LAI of above 5 m2 m−2 is found in tropical climates and that the largest seasonal LAI amplitude (contour lines in Figure 2a) occurs in subtropical and temperate climate zones. Figure 2b shows that northern hemisphere temperate climate zones green up during April and May while northern hemisphere boreal and Arctic climate zones green up during May and June.

Figure 2.

The 50 year long (1960–2009) global reanalysis data set of vegetation phenology. (a) Annual mean LAI and seasonal amplitude (contours with 1 and 2 m2 m−2). (b) Mean spring date for grid points that have a seasonal amplitude above 1 m2 m−2.

[38] Figure 3 shows that the highest prediction errors with the prior parameter set occur in subtropical, Mediterranean and temperate areas. The prior mean absolute deviation (MAD) of predicted versus observed LAI is on the order of 3.0 m2 m−2 for these regions. The MAD of LAI decreases to below 1.5 m2 m−2 in the 4 region experiment. The largest improvements between the 4 and the 256 region experiments are confined to crop-intensive areas such as India, central USA and Europe, but also occur in semiarid areas such as the Sahel in Africa and central Australia.

Figure 3.

Global maps with the Mean Absolute Deviation (MAD) of the predicted FPAR and LAI versus MODIS FPAR and LAI using the prior parameter set and the parameters constrained by 4 and 256 regions during 2000–2009.

[39] Figure 4 displays the evolution of the global mean prediction error and prediction uncertainty with the increasing number of assimilated observations. The global LAI and FPAR prediction error (solid lines) successively decreases with the increasing number of assimilated observations. More than 50% of the prior prediction error is removed by the 4 region experiment, while the 256 region experiment further reduces the global FPAR and LAI prediction error to 20.6% and 14.8% of the prior prediction error, respectively. The prediction uncertainties (dashed lines in Figure 4 and Table 6) decrease to 3.4% and 3.9% of their prior values (0.326 and 2.79 m2 m−2) for FPAR and LAI. Even the 4 region experiment, which covers only 0.007% of the global land area, reduces the FPAR and LAI uncertainty to 17.8% and 16.1% of their prior uncertainty.

Figure 4.

Relationship between the global FPAR and LAI prediction error (blue and red solid lines) and prediction uncertainty (blue and red dashed lines), respectively, and the number of observations in the assimilation experiments using 4, 16, 64 and 256 regions.

Table 6. Number and Percentage of Assimilated Observations (Relative to Available Non-QA Screened Observations and Relative to Total Global Land Area) as Well as Resulting FPAR and LAI Posterior Uncertainties and Prediction Errors
             Observations                                         FPAR(a)                LAI (m2 m−2)
Experiment   Number   Percent QA Passed   Percent Global Land    Uncertainty   Error    Uncertainty   Error

(a) Units are dimensionless.


[40] Figure 4 also reveals that the prediction error decreases less rapidly than the prediction uncertainty (solid versus dashed lines). The model ensemble members converge slightly faster than expected from the remaining model-observation differences. Parameter and state covariance underestimation is a common feature in ensemble data assimilation [Li et al., 2009]. In our experiments it happens despite the employed ensemble inflation (equation (36)) and despite the large number of chosen ensemble members.

3.3. Regional Prediction

[41] The global land area is screened by PFT class, where only grid points with at least 25% coverage for a given PFT are included in each respective area. In Figure 5, Taylor diagrams [Taylor, 2001] document the statistical performance of FPAR and LAI predictions by simultaneously drawing the correlation coefficient R between the model and the observations and the normalized standard deviation (the standard deviation of the model divided by the standard deviation of the observation). Table 7 provides the mean bias (bias) and root mean square error (rmse) values for each PFT.
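The statistics named above (correlation coefficient R, normalized standard deviation, bias and rmse) can be computed as in the following sketch. The function name and the example series are hypothetical, and the choice of the population standard deviation is an assumption. R is returned as NaN when one of the series has no variance.

```python
import math
from statistics import mean, pstdev


def taylor_statistics(model, obs):
    """Return R, normalized standard deviation, bias and rmse."""
    m_mean, o_mean = mean(model), mean(obs)
    m_std, o_std = pstdev(model), pstdev(obs)
    covariance = mean(
        [(m - m_mean) * (o - o_mean) for m, o in zip(model, obs)]
    )
    # R is undefined when either series is constant.
    if m_std == 0.0 or o_std == 0.0:
        r = math.nan
    else:
        r = covariance / (m_std * o_std)
    normalized_std = m_std / o_std if o_std != 0.0 else math.nan
    bias = m_mean - o_mean
    rmse = math.sqrt(mean([(m - o) ** 2 for m, o in zip(model, obs)]))
    return {"R": r, "normalized_std": normalized_std,
            "bias": bias, "rmse": rmse}


# Hypothetical example: the model has the right phase but half the amplitude.
stats = taylor_statistics([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

# A seasonally constant (evergreen-like) series has no defined correlation.
flat = taylor_statistics([5.0, 5.0, 5.0], [5.0, 5.0, 5.0])
```

In the first example R equals 1 although bias and rmse are large, which shows why the Taylor diagram is complemented by the bias and rmse values. The second example shows why correlation is of limited use for time series with almost constant values.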

Figure 5.

Performance of regional FPAR and LAI predictions by PFT with the prior parameter set and the parameters constrained by the 256 region experiment. Modeled FPAR and LAI have triangle symbols when they match observations accurately (both bias and rmse <5% of FPAR and LAI range according to Table 7).

Table 7. Bias and RMSE (in Parentheses) of Regional FPAR and LAI Predictions by PFT With the Prior Parameter Set and the Parameters Constrained by the 256 Region Experiment(a)
           FPAR                             LAI (m2 m−2)
PFT        Prior           256 Regions      Prior           256 Regions
bar all    0.30 (0.31)     0.01 (0.02)      1.62 (1.63)     0.02 (0.04)
enf tem    0.02 (0.10)     0.02 (0.03)      0.93 (1.11)     0.17 (0.33)
enf bor    −0.12 (0.22)    0.00 (0.05)      0.32 (1.14)     0.03 (0.16)
dnf bor    −0.06 (0.21)    0.00 (0.08)      0.56 (1.32)     0.03 (0.23)
ebf tro    0.06 (0.06)     0.01 (0.02)      0.01 (0.23)     0.19 (0.27)
ebf tem    0.04 (0.12)     0.00 (0.02)      0.62 (0.93)     0.25 (0.41)
dbf tro    0.16 (0.17)     0.00 (0.01)      1.36 (1.39)     0.08 (0.15)
dbf tem    0.07 (0.12)     0.03 (0.04)      1.07 (1.26)     0.09 (0.38)
dbf bor    0.00 (0.21)     0.00 (0.07)      0.68 (1.39)     0.19 (0.42)
ebs all    0.01 (0.16)     0.04 (0.07)      0.79 (1.07)     0.10 (0.27)
dbs tem    0.33 (0.33)     0.04 (0.04)      2.14 (2.17)     0.25 (0.26)
dbs bor    0.04 (0.18)     0.01 (0.07)      0.61 (1.18)     0.04 (0.15)
c3g arc    0.02 (0.13)     0.00 (0.03)      0.76 (1.16)     0.03 (0.09)
c3g nar    0.27 (0.28)     0.02 (0.03)      1.95 (1.99)     0.11 (0.13)
c4g all    0.26 (0.27)     0.01 (0.02)      2.09 (2.10)     0.10 (0.12)

(a) The accuracy of bold values is better than 5% of the full FPAR or LAI range (1.0 and 8.0, respectively).

[42] The temperate deciduous broadleaf forest (PFT class 8) FPAR and LAI predictions in the 256 region experiment have a high R > 0.98 but slightly underestimate phenological variability when compared to observations (normalized standard deviation <1). Bias and rmse for both FPAR and LAI substantially decrease (rmse to around 30% of its prior value and the bias in LAI from >1 m2 m−2 to −0.09 m2 m−2, Table 7). For this PFT it would be interesting to evaluate the interannual variability of the spring leaf-out date. However, while the modeled FPAR and LAI output is daily, the spring date cannot be accurately diagnosed from the MODIS observations since they have an 8 day compositing period (and the actual compositing day is not given in the data). The boreal evergreen needleleaf forest (PFT class 3) LAI in the 256 region experiment reaches a very high prediction accuracy with R = 0.97, a bias of 0.03 m2 m−2 and an rmse value which is <15% of its prior value. A similar gain in accuracy is achieved for the boreal deciduous needleleaf forest (PFT class 4) and for the Arctic c3 grass (PFT class 13).

[43] Figure 5 demonstrates that the prediction using the prior parameters overestimates phenological variability for most PFTs (normalized standard deviation > 1) and that the highest R values are at 0.9. In the 256 region experiment most PFTs cluster in the same “high prediction accuracy” area and the highest R values reach 0.99. The 256 region experiment slightly underestimates phenological variability for most PFTs, which might be a result of residual observation noise (and thus exaggerated observation variability) despite the restrictive quality screening employed. The correlation coefficient R of evergreen species such as the tropical and temperate evergreen broadleaf forest (PFT classes 5 and 6) or the evergreen broadleaf shrub (PFT class 10) remains low for both FPAR and LAI in the 256 region experiment. However, their bias and rmse values substantially improve. Correlation is not a suitable statistical measure in the case of time series with almost constant (evergreen) values. The bias and rmse values given in Table 7 clearly demonstrate that both magnitude (bias) and phase (rmse) significantly gain in realism. Bold values document that for all PFTs the FPAR and LAI biases fall below 5% of the FPAR and LAI range, while most rmse values reach this threshold in the 256 region experiment.

3.4. Local Prediction

[44] Any scientific application that is applied to global scales should be reevaluated at the local scale if possible in order to gain a better process-based understanding and reveal missing model components [see, e.g., Stöckli et al., 2008a; Oleson et al., 2008]. The phenology model using the global parameter set has therefore been tested at the same four FLUXNET tower sites as in our local-scale data assimilation study [Stöckli et al., 2008b]. The aim of this section is to evaluate to what degree the model using the above estimated global parameter set is still able to represent local-scale phenology at specific sites.

3.4.1. Morgan Monroe State Forest

[45] The Morgan Monroe State Forest site (USA) is a temperate deciduous forest interleaved by grassland and crops. The site-level simulation and the 4 region experiment simulate a realistic seasonal cycle (Figure 6a). The 4 region experiment should always be closer to the site-level experiment than for instance the 256 region experiment since the former uses parameters that are constrained over exactly the four regions covering the four sites, whereas the latter uses parameters that minimize the prediction error for a global area. A two-stage green-up successively appears in the 16, 64 and 256 region experiments. This two-stage green-up is likely due to an unrealistic green-up timing of nonnatural PFTs present in this grid cell. The PFT parameters for maize (14% of the area) and soy (12% of the area) are constrained with information from globally distributed croplands by the 16, 64 and 256 region experiments, but their values do not seem to be valid at this particular site or for this particular year. Firstly, the employed static PFT map would not be suitable in areas where crop rotation is practiced, and secondly a crop phenology model might be required to realistically simulate the phenological stages of different crops in a global prediction. Senescence is realistic in the 4, 16, 64 and 256 region experiments but is delayed in the site-level experiment. The 64 and 256 region experiments further reveal an underestimation of summer LAI magnitude. It might be related to the negative bias of the temperate deciduous broadleaf forest LAI prediction found in the regional analysis above (Table 7).

Figure 6.

Predicted versus observed site-level (0.5° × 0.5°) LAI using the site-level, the prior parameter set and the parameter sets constrained by 4, 16, 64 and 256 regions during 2003.

3.4.2. BOREAS Old Black Spruce

[46] The high prediction skill at the boreal forest site BOREAS Old Black Spruce (Canada) appears to be independent of whether a site-level or global parameter set is used (Figure 6b). This result firstly demonstrates that the regions where boreal evergreen needleleaf forest occurs are spatially more homogeneous than for instance the patchy landscapes encountered in temperate climate zones. The PFT distribution at BOREAS for instance consists of around 50% evergreen needleleaf trees, 20% deciduous shrubs and 20% Arctic grasslands. Secondly, phenological timing for this PFT is controlled by a well defined set of environmental triggers (defined by the climate control parameters in Table 3) that are valid from local to global scales. This result is underlined by the high prediction performance of the boreal evergreen needleleaf forest PFT found in the regional analysis above.

3.4.3. Santarem KM83

[47] The prior parameter set at the tropical evergreen broadleaf site Santarem KM83 (Brazil) creates an unrealistic light-limited leaf loss of around 2 m2 m−2 at the end of the wet season (April–June) while both quality screened observations and all posterior parameter sets show a constant LAI throughout the year. Figure 6c demonstrates that the employed observation quality control is working well and that cloud affected (wet season) and aerosol contaminated (dry season) observations at the site are properly screened and do not affect the data assimilation process.

3.4.4. Tonzi Ranch

[48] Although the Tonzi site (USA) has the same mean monthly precipitation and mean temperature as Morgan Monroe State Forest (the two red boxes that coincide in the center of Figure 1b), it is a mediterranean savanna-type ecosystem with a rather dry late summer and a wet winter season. Figure 6d shows that both the magnitude and the timing of the drought response between May and September (see also Figures 3 and 4 in the work of Stöckli et al. [2008b]) are simulated very realistically by the site-level and the 4 region experiments compared to the prior experiment. The timing is still accurate in the 16, 64 and 256 region experiments, but the peak LAI during April and May is severely underestimated in these experiments. The result demonstrates that global parameter sets can become inaccurate at the local scale for ecosystems with a complex canopy. The site-level experiment yields July/August LAI values that are comparable to ground measurements [Ryu et al., 2010b]. However, our simplified canopy radiative transfer neglects the contribution of vegetation structural aspects like leaf clumping, while ground measurements often neglect the contribution of the understory LAI that is also measured by the satellite. Currently the comparison of satellite- and ground-observed phenology is best achieved through the analysis of phenological timing [Studer et al., 2007; Stöckli et al., 2008b; Liang et al., 2011]. Newly developed near-surface remote sensing methods show promise for also comparing phenological magnitude [Ahrends et al., 2008; Richardson et al., 2009; Ryu et al., 2010a].

4. Discussion

4.1. Data Assimilation

[49] By running the data assimilation over less than 1% of the global land surface the global FPAR and LAI prediction errors could be reduced to 20.6% and 14.8% of their initial values, respectively. The key for this success is most likely the wide climatic and biogeographic range spanned by the chosen subset of assimilation regions (Figure 1). The 4 region experiment already includes a tropical, a temperate, a boreal and a mediterranean climatic environment to constrain a set of parameters that then show substantial skill in a global prediction (Figure 3). Figure 4 suggests that little improvement can be expected when extending the assimilation area beyond the 0.4% of global land area covered by the 256 region experiment. Research on the optimal location of assimilation regions might further reduce the computational resources needed for a global data assimilation of vegetation phenology.

[50] The parameter uncertainties seem to converge much faster than the prediction errors. While the EnKF allows in theory a perfect estimation of the combined posterior model and parameter error by analysis of both the prior model uncertainty and the observation uncertainty, there are many assumptions to be made for the practical implementation of the EnKF in a prediction system. Each of the following assumptions could be the cause for the observed parameter overconfidence:

[51] 1. The ensemble size N for the EnKF should be as large as possible since the sampling error decreases proportionally to 1/√N. An ensemble of 1000 members is likely too small since around 10000 states and 510 parameters are estimated for each region. With the available computational resources for this project there is little that can be done regarding ensemble size.

[52] 2. If the measurement size exceeds the ensemble size, rank problems can occur because the measurement error covariance needs to be compressed into the ensemble space. We however make use of the inversion presented by Evensen [2004] that uses a measurement operator covering the full rank of measurements to avoid the problem of rank loss reported in the literature [Kepert, 2004].

[53] 3. If the measurement uncertainty is poorly chosen in a Bayesian method, the posterior model uncertainty will likely be wrong. The measurement uncertainty is derived from MODIS quality flags that are themselves based on semiempirical detection algorithms for clouds, shadows and aerosols and that reflect an incomplete set of retrieval errors [Justice et al., 2002]. Further, arbitrary scaling factors are used to transfer the quality flags into a quantitative set of observation uncertainties. The superobservations derived in equation (29) neglect any spatially correlated measurement errors that are likely to occur with cloud contamination or snow cover.

[54] 4. The EnKF solver is chosen to avoid local minima since the full nonlinear prediction model is integrated without the need to create first order derivatives as needed for instance in the Extended Kalman Filter or in variational data assimilation techniques like 3D or 4D VAR. A yearly analysis guarantees that parameters do not satisfy individual observations but are consistent with the entire seasonal cycle of the leaf state.

[55] 5. By estimating a set of 15 parameters for a total of 34 PFTs several solutions in the parameter space might produce a similar prediction. However, even though equifinality might generate wrong parameters it should to the best of our knowledge not lead to parameter overconfidence.
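The dependence of the sampling error on ensemble size mentioned in the first item of the list above can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration and not part of the assimilation framework. It assumes a single zero-mean, unit-variance Gaussian variable, and the function name, trial count and random seed are hypothetical choices.

```python
import math
import random


def ensemble_mean_rmse(ensemble_size, trials, rng):
    """RMS error of the ensemble mean of a zero-mean, unit-variance variable."""
    squared_errors = []
    for _ in range(trials):
        draws = [rng.gauss(0.0, 1.0) for _ in range(ensemble_size)]
        ensemble_mean = sum(draws) / ensemble_size
        squared_errors.append(ensemble_mean ** 2)  # true mean is 0
    return math.sqrt(sum(squared_errors) / trials)


rng = random.Random(42)  # fixed seed so that the sketch is reproducible
rmse_by_size = {n: ensemble_mean_rmse(n, 200, rng) for n in (10, 100, 1000)}

# Theoretical sampling error 1/sqrt(N) for comparison.
expected = {n: 1.0 / math.sqrt(n) for n in rmse_by_size}
```

Each tenfold increase in ensemble size reduces the sampling error only by a factor of about 3.2, which is why a substantially larger ensemble would be needed to noticeably reduce it.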

4.2. Phenology Model

[56] We chose a rather empirical phenology model with a large set of climate control parameters, structural vegetation parameters and time averaging parameters.

[57] It was demonstrated how both climate control and structural vegetation parameters can be thoroughly constrained by the 10 years of MODIS data while time averaging parameters are left with a substantial posterior uncertainty. There is nevertheless evidence that the averaging times needed for temperature and light are likely shorter than 21 days and that the averaging time for moisture is longer than 21 days. This result contrasts with most temperature-based phenology models that work with growing degree days since they often integrate temperature history over several months [Chuine, 2000]. The long averaging time for moisture further demonstrates that tall trees in temperate climate zones can sustain greenness for prolonged periods of drought. For short natural vegetation like grasslands and deciduous shrubs the moisture averaging times are well below 21 days, most likely related to their shallow rooting depths and higher susceptibility to drought.

[58] The tropical evergreen broadleaf forest PFT requires the longest moisture averaging times. Our model yields a seasonally largely constant FPAR and LAI for this PFT. Tropical trees are known to be resistant to the yearly recurring dry periods [Lee et al., 2005]. Recent studies however demonstrate that tropical plant physiology and phenology are very complex and that both can be sensitive to extreme drought periods [Saleska et al., 2007; Myneni et al., 2007; Phillips et al., 2009; Zhao and Running, 2010]. These studies are based on satellite-based EVI, modeled GPP or on field measurements of biochemical fluxes and carbon stocks and not on satellite-based FPAR or LAI. Further, drought-induced changes in tropical phenology may not be detectable with spectroradiometers like MODIS but only with hyperspectral radiometers like Hyperion [Asner et al., 2004]. These open questions should motivate follow-up research in both modeling and observation of tropical phenology.

[59] Our model entirely depends on a multiplicative set of linearized and time integrated temperature, light and moisture controls. The model therefore excludes several known biophysical and abiotic controls such as chilling requirements, insect pests, harvest, irrigation, nutrient limitations, tree aging, biodiversity effects or frost events. The high prediction skill on the seasonal and interannual timescale spanning local to global spatial scales demonstrates that the main drivers of phenological variability have been included in the model. However, the short observation period of 10 years by definition excludes most climatological extreme events required to exploit the full range of seasonal to decadal phenological variability. Especially the climate control parameters of subtropical and tropical drought-deciduous PFTs might benefit from a longer observation period.
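A multiplicative combination of linearized, time-averaged controls of this kind can be sketched as follows. The sketch is a generic illustration in the spirit of the growing season index of Jolly et al. [2005] and is not the model formulation of this study, whose equations are given in section 2. The function names, the trailing running mean, the lower and upper limits per control and all numerical values are assumptions made for the example.

```python
def running_mean(series, window):
    """Trailing mean over the last `window` values (fewer at the start)."""
    result = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def ramp(value, lower, upper):
    """Linear control factor: 0 at or below `lower`, 1 at or above `upper`."""
    return min(1.0, max(0.0, (value - lower) / (upper - lower)))


def growth_factor(temperature, light, moisture, limits, windows):
    """Product of the three time-averaged, linearized controls on the last day."""
    factor = 1.0
    forcings = (("temperature", temperature),
                ("light", light),
                ("moisture", moisture))
    for name, series in forcings:
        averaged = running_mean(series, windows[name])[-1]
        lower, upper = limits[name]
        factor *= ramp(averaged, lower, upper)
    return factor


# Hypothetical limits (lower, upper) and averaging windows in days.
limits = {"temperature": (0.0, 20.0), "light": (50.0, 250.0),
          "moisture": (0.0, 1.0)}
windows = {"temperature": 5, "light": 15, "moisture": 3}

factor = growth_factor([10.0] * 5, [150.0] * 5, [0.8] * 5, limits, windows)

# A negative lower light limit keeps the light control above zero in darkness.
dark_light_control = ramp(0.0, -50.0, 100.0)
```

Because the controls are multiplied, a single limiting factor close to zero suppresses leaf growth regardless of the other two. The last line illustrates how a negative lower light limit, such as the negative Lmin values discussed for evergreen needleleaf species, keeps the light control positive when no light is available.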

[60] Plant physiological research suggests that bud burst of temperate deciduous species is driven by photoperiod (but not necessarily the light intensity Rg as used in this study). Photoperiod can serve as trigger for temperature sensitivity [Körner, 2006]. Our results demonstrate that the best empirical prediction of temperate deciduous broadleaf forest phenology is simulated by a combined temperature-light forcing. Figure 4a of Stöckli et al. [2008b] visualizes that a light trigger (green curve, crosses) precedes the temperature trigger (red curve, stars). However, it is currently debated whether light, temperature or both control bud burst. These relationships also vary by species [Körner and Basler, 2010] and cannot be generalized [Cleland et al., 2007].

4.3. Plant Functional Type Data

[61] Plant functional types [Bonan et al., 2002] are chosen instead of the often used biomes or land cover classes [Hansen et al., 2000] because they are better in line with the separation needed for phenological predictions. The single savanna biome at, for instance, the Tonzi Ranch is composed of an evergreen broadleaf tree PFT (with maximum LAI in late summer) and of a drought-deciduous c3 grass PFT (with maximum LAI in early spring). The two PFTs display very different phenological cycles and there is no single parameter set that would enable a realistic simulation of the single savanna biome. The regional analysis however suggests that several PFT classes like the temperate deciduous broadleaf forest PFT, the evergreen broadleaf shrub PFT or the temperate deciduous broadleaf shrub PFT might still be too heterogeneous in terms of phenological behavior and could be separated into sub-PFT classes. Phenological predictions would surely benefit from consistent global PFT maps based on new remote sensing technologies as for instance presented by Ustin and Gamon [2010].

[62] The focus of this study is the estimation of phenology parameters for natural vegetation. However, crop PFTs were also included in the data assimilation. Satellite pixels contain a mixed signal from both natural and managed vegetation that needs to be decomposed in order to estimate parameters for the natural vegetation PFTs. Figure 3 demonstrates that the FPAR and LAI of regions with heavy crop cover are well predicted without the explicit use of a crop phenology model. This shows that even managed vegetation phenology is dominantly weather and climate driven. However, for climate model applications a dedicated crop phenology model should be used since especially the carbon uptake of crops differs from natural vegetation [Gervois et al., 2004; Lokupitiya et al., 2009].

4.4. Satellite Data

[63] The MODIS FPAR and LAI data are derived from MODIS surface reflectances by inversion of a canopy radiative transfer model [Myneni et al., 1999, 2002]. They are more accurate in low biomass areas and generally exaggerate LAI for broadleaf and needleleaf forests [Wang et al., 2004; Cohen et al., 2006]. The LAI retrieval from visible and near-infrared surface reflectances is underdetermined for intermediate and high LAI values, which can yield errors on the order of 50% [Garrigues et al., 2008]. The FPAR and LAI data set presented in this study will inherit such errors. We further use a very simplified representation of the canopy light interception that only fits 4 canopy structural parameters per PFT (FPARmin, FPARmax, FPARsat and LAImax). Compared to the MODIS retrieval algorithm it does not include the effects of foliage and canopy clumping, nongreen canopy elements, soil background reflectance, shading or vertical canopy structure [Myneni et al., 1999; Shabanov et al., 2003]. These differences can introduce inconsistencies between the assimilated and predicted FPAR and LAI values. They might be responsible for some of the scaling issues found at Morgan Monroe and Tonzi Ranch.
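To illustrate how few parameters such a simplified light interception scheme needs, the following sketch converts FPAR to LAI with a logarithmic saturation relation, LAI = LAImax · ln(1 − FPAR) / ln(1 − FPARsat). This functional form is an assumption made for the example, since the relation actually used is defined in the model description and is not reproduced in this section. The function name and all parameter values are hypothetical.

```python
import math


def lai_from_fpar(fpar, fpar_min, fpar_max, fpar_sat, lai_max):
    """Convert FPAR to LAI with an assumed logarithmic saturation relation.

    FPAR is first bounded to [fpar_min, fpar_max]; LAI reaches lai_max
    when FPAR equals fpar_sat. Requires fpar_max < 1 and 0 < fpar_sat < 1.
    """
    bounded = min(fpar_max, max(fpar_min, fpar))
    return lai_max * math.log(1.0 - bounded) / math.log(1.0 - fpar_sat)


# Hypothetical structural parameters for a single PFT.
params = {"fpar_min": 0.01, "fpar_max": 0.95, "fpar_sat": 0.95, "lai_max": 6.0}

lai_mid = lai_from_fpar(0.5, **params)   # intermediate canopy
lai_full = lai_from_fpar(0.99, **params)  # bounded to fpar_max before conversion
```

With this relation a given FPAR change near saturation corresponds to a much larger LAI change than the same FPAR change at low values. None of the structural effects listed above, such as clumping or soil background reflectance, enter the conversion.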

[64] The restrictive quality screening of MODIS observations employed in this study eliminates the majority of cloud, aerosol, snow and cloud shadow contamination that usually complicates the generation of climate quality biophysical satellite parameters in tropical or high-latitude regions [Los et al., 2000; Poulter and Cramer, 2009]. On global average, 40–50% of all valid observations pass quality screening (Table 6). In tropical areas only 5–10% (not shown) pass the quality screening. Neglecting quality screening can for instance lead to misleading conclusions on the drought response of tropical trees [Saleska et al., 2007] as shown by Samanta et al. [2010].

[65] Remote sensing data assimilation in combination with a predictive model has the capability to complement the classical data-only gap filling procedures such as maximum value compositing or Fourier time series fitting employed in most current satellite-based land surface data sets [Los et al., 2000; Jonsson and Eklundh, 2002; Stöckli and Vidale, 2004; Tucker et al., 2005; Fang et al., 2008].

4.5. Weather Forcing Data

[66] The model parameter set and therefore the phenological prediction will be sensitive to the choice of weather forcing data since predicted states are empirically and not mechanistically linked to the meteorological predictors. Potential biases in the ECMWF ERA Interim data might therefore have created unrealistic posterior parameter sets during the data assimilation. We have perturbed the weather forcing data with uncertainties as given in section 2, but the perturbation does not correct for biases in the weather forcing data. Also, a new estimation of model parameters might be required if a new weather forcing data set with a different spatial scale or with a different climatology is used or if the phenology model is applied in coupled mode as part of a climate model.

5. Conclusions and Outlook

[67] Our study demonstrates how remote sensing data assimilation can be used to reduce uncertainties in a global phenology model. The assimilation of MODIS data covering less than 1% of the global land surface successfully reduced the global FPAR and LAI prediction errors to 20.6% and 14.8% of their respective prior errors. The excessive variance reduction in the posterior parameter set could be mitigated by use of a more quantitative observation uncertainty estimation. Novel data assimilation methods such as the Maximum Likelihood Ensemble Filter (MLEF) [Zupanski, 2005] employing Hessian preconditioning and a gradient search method might yield more realistic globally applicable parameter sets.

[68] Our study suggests that PFTs are a suitable means to disaggregate mixed satellite pixels at the global scale and that they allow the creation of a PFT-specific parameterization of a globally applicable phenology model. The boreal evergreen needleleaf forest PFT and the tropical evergreen broadleaf forest PFT perform realistically over a large range of spatial scales. However, local-scale predictions using a global parameter set can become unreliable in both magnitude and timing as for instance demonstrated for the mixed natural-agricultural temperate landscape (Morgan Monroe) and the savanna landscape (Tonzi Ranch). The phenological data assimilation experiment could now be repeated with a variety of globally applicable phenology models and PFT data sets. In order to increase the compatibility between assimilated and predicted vegetation states the MODIS canopy radiative transfer model could be employed in the prediction of FPAR and LAI. A more complex treatment of leaf and canopy clumping, leaf orientation, shadowing or nongreen canopy elements would further broaden the applicability of our methods and data sets. As a first step global maps of foliage clumping [Chen et al., 2005; Pisek et al., 2010] could enhance our simplified LAI calculation with geometric information on canopy structure.

[69] Our study is a first step to mitigate some deficiencies of current phenological models. As already shown by Stöckli et al. [2008b] the parameterized phenology model can be useful to disentangle the influence of meteorological drivers on the observed phenological variability. It could be a contribution to the currently ongoing discussion on how temperature and light (or photoperiod) govern the timing of phenological spring events [Körner and Basler, 2010]. The 50 year long global phenological reanalysis data set (1960–2009) should be suitable for climate analysis studies. It might for instance contain evidence on whether the light trigger is the hard limit for the currently observed (temperature-related) negative trends for phenological spring events [Cleland et al., 2007; Rutishauser et al., 2007].

[70] Future research should combine process-based knowledge from hydrology, plant physiology and canopy radiative transfer modeling with the highly empirical world of plant phenology. This is needed to better understand and simulate the response of the terrestrial water and carbon cycle to climate variability and change and to quantify the resulting impacts on the other earth system components [Penuelas et al., 2009]. We would therefore like to motivate earth system modelers to experiment with data assimilation and to bring forward a new generation of phenology and land surface models. In order to facilitate this, the presented data set, all program codes, parameters, documentation and simple hands-on experiments are publicly available at http://phenoanalysis.sourceforge.net.

Appendix A: Plant Functional Type Data Generation

[71] The following modifications are made to the PFT processing by Lawrence and Chase [2007] and Bonan et al. [2002]:

[72] 1. The single crop class is decomposed into 19 individual crop classes according to Leff et al. [2004].

[73] 2. This yields 35 PFT classes in total: 15 natural types, 19 crop classes and water.

[74] 3. The processing is performed at 30″ spatial resolution instead of 0.05°.

[75] 4. The monthly temperature climatology [Wilmott and Matsuura, 2007] is downscaled to 30″ by use of a lapse rate of 0.5 K 100 m−1 applied to the above described topography data set.

[76] 5. MOD15A2 LAI is quality screened as described above in order to evaluate the c4 grass fraction. Following Still et al. [2003] the c4 grass fraction is the sum of LAI for those months that satisfy the c4 growth criteria (temperature >22°C and precipitation >25 mm) over the sum of LAI for all months. Since they have used NDVI instead of LAI, we apply the square-root to the LAI-derived c4 grass fraction in order to account for the almost exponential relationship between NDVI and LAI.

[77] 6. The processing merges 7 sometimes inconsistent data sets into a single continuous plant functional type cover data set. The inconsistencies (e.g. MOD44B indicates 25% tree cover but the AVHRR VCF shows 0% tree cover) are overcome by inverse distance filling where the MODIS data set served as the reference data set.
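Two of the processing steps listed above, the lapse rate downscaling of temperature and the LAI-based c4 grass fraction, can be sketched as follows. This is a minimal illustration under stated assumptions and not the processing code itself: the function names are hypothetical, the downscaling is assumed to be applied relative to the mean elevation of the coarse temperature grid cell, and the monthly values in the example are invented.

```python
import math

LAPSE_RATE_K_PER_M = 0.5 / 100.0  # 0.5 K per 100 m, as stated in the text


def downscale_temperature(coarse_temp, coarse_elevation_m, fine_elevation_m):
    """Shift a coarse-grid temperature to the elevation of a 30-arcsec pixel.

    Assumption: temperature decreases with height at the fixed lapse rate,
    relative to the mean elevation of the coarse grid cell.
    """
    height_difference = fine_elevation_m - coarse_elevation_m
    return coarse_temp - LAPSE_RATE_K_PER_M * height_difference


def c4_grass_fraction(monthly_lai, monthly_temp_c, monthly_precip_mm):
    """c4 fraction: LAI sum of c4-favourable months over total LAI sum,
    followed by the square root described in the text."""
    total = sum(monthly_lai)
    if total <= 0.0:
        return 0.0  # no vegetation signal, so no c4 grass is assigned
    favourable = sum(
        lai
        for lai, temp, precip in zip(monthly_lai, monthly_temp_c,
                                     monthly_precip_mm)
        if temp > 22.0 and precip > 25.0
    )
    return math.sqrt(favourable / total)


# Hypothetical pixel: three warm and wet months out of twelve, constant LAI.
fraction = c4_grass_fraction([1.0] * 12, [25.0] * 3 + [10.0] * 9, [30.0] * 12)

# Hypothetical pixel located 500 m above the coarse grid cell elevation.
pixel_temp = downscale_temperature(15.0, 500.0, 1000.0)
```

In the example the raw LAI ratio of 0.25 becomes a c4 fraction of 0.5 after the square root is applied, which shows that this step raises all fractions between 0 and 1.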

Appendix B: Technical Set Up

[78] The data assimilation framework is parallelized by using Version 1.2 of the MPI standard [Message P Forum, 1994] with a one- or two-dimensional process topology (multiple regions and one region per process, or single region distributed along longitude and latitude range). Model state prediction, I/O, observation QA screening, gridding of superobservations and the calculation of the HA and D matrices are carried out on separate processes by assigning one region per process and one process per logical CPU unit. The prior parameters are perturbed once and distributed to all processes in order to end up with a global analysis parameter set. Model states and weather forcing are perturbed by process. One process is reserved for the global analysis, where all regional HA and D are collected at the end of each simulation year and the global analysis is performed. The global analysis matrix (X5 of Evensen [2003]) is finally redistributed to all processes, where the computationally intensive final ensemble update of states and parameters is performed.

[79] The bottlenecks for this framework are its heavy memory usage, the size of the observational data and the global EnKF analysis. The parallelization of the EnKF solver would be an important next step in order to increase data assimilation performance. The state matrix has 7 dimensions (ens × lon × lat × PFT × HGT × state × days), the parameter matrix has 3 dimensions (ens × PFT × parameter), the forcing data has 5 dimensions (ens × lon × lat × HGT × forcing), which exceeds per-process memory availability on today's supercomputers. In order to increase memory efficiency, a subset of HGT and PFT classes for states is integrated in each region, where only HGT and PFT classes are selected that cover more than 2.5% of the area in each region. Water areas (PFT number 35) are screened and not used during the analysis. Furthermore the upper bound of superobservations to be used in the global analysis was set to 50000. The analysis then is within around 1 GB per process (with a maximum of 4–8 GB per node on, e.g., NCCS Discover with 8 CPUs per node and 16–32 GB per node on, e.g., NCAR Bluefire with 32 CPUs per node).


[80] The NASA Energy and Water Cycle Study (NEWS) grant NNG06CG42G, NASA grant NNX11AB87G and MeteoSwiss provided funding for this study. The MODIS Science Team and the MODIS Science Data Support Team provided the MOD15A2, MOD44B and MOD12Q1 data. The NASA Science Mission Directorate (SMD) is acknowledged for granting the SMD-08-0810 and SMD-09-1256 requests with 250,000 and 460,000 CPU hours, respectively, on the NASA Center for Computational Sciences (NCCS) High-End Computing (HEC) Discover system. The National Center for Atmospheric Research (NCAR) Terrestrial Sciences Section (TSS) is acknowledged for providing computational resources on the Bluefire system for testing purposes. The Swiss Center for Scientific Computing (CSCS) is acknowledged for providing computational resources on the Rigi system for testing purposes.