Development of land surface albedo parameterization based on Moderate Resolution Imaging Spectroradiometer (MODIS) data



[1] A new dynamic-statistical parameterization of snow-free land surface albedo is developed using the Moderate Resolution Imaging Spectroradiometer (MODIS) products of broadband black-sky and white-sky reflectance and vegetation and the North American and Global Land Data Assimilation System (LDAS) outputs of soil moisture during 2000–2003. The dynamic component represents the predictable albedo dependences on solar zenith angle, surface soil moisture, fractional vegetation cover, leaf plus stem area index, and greenness, while the statistical part represents the correction for static effects that are specific to local surface characteristics. All parameters of the dynamic and statistical components are determined by solving nonlinear constrained optimization problems of a physically based conceptual model for the minimization of the bulk variances between simulations and observations. They all depend on direct beam or diffuse radiation and visible or near-infrared band. The dynamic parameters are also functions of land cover category, while the statistical factors are specific to geographic location. The new parameterization realistically represents surface albedo variations, including the mean, shape, and distribution, around each dependent parameter. For composites of all temporal and spatial samples of the same land cover category over North America, correlation coefficients between the dynamic component of the new parameterization and the MODIS data range from 0.39 to 0.88, while relative errors vary within 8–42%. The gross (i.e., integrated over all categories) correlations and errors are 0.57–0.71 and 17–26%, changing with direct beam or diffuse radiation and visible or near-infrared band. The static local correction results in a further reduction in relative errors, producing gross values of 11–21%. 
The new parameterization is a marked improvement over the existing albedo scheme of the state-of-the-art Common Land Model (CLM), which has correlation coefficients from −0.57 to 0.71 and relative errors of 18–140% for individual land cover categories, and gross values of 0.03–0.32 and 37–71%, respectively.

1. Introduction

[2] Surface albedo greatly influences the surface energy budget and partitioning, which in turn regulate circulation patterns, change hydrological processes, and modify the absorption of photosynthetically active radiation (PAR) and thus determine the productivity of the Earth's ecosystem [Charney, 1975; Dickinson, 1983; Mintz, 1984; Sellers, 1985]. It also links climate changes to human activities through land cover/use alterations [Henderson-Sellers and Wilson, 1983; Xue and Shukla, 1993; Betts, 2000; Govindasamy et al., 2001]. As such, surface albedo is a crucial parameter in land surface models (LSMs). Modeling studies have shown complex interactions among albedo, climate, and the biosphere, which are nonlinear with both positive and negative feedbacks [Charney et al., 1977; Cess, 1978; Dickinson and Hanson, 1984; Rowntree and Sangster, 1986; Dirmeyer and Shukla, 1994; Lofgren, 1995], causing responses of strong regional dependence [Liang et al., 2003; Berbet and Costa, 2003; Hales et al., 2004] and substantial uncertainties [Intergovernmental Panel on Climate Change (IPCC), 2001; Myhre and Myhre, 2003].

[3] Albedo is determined by the surface characteristics, depending on the angular and spectral distributions of incident solar radiation. Observations [Nkemdirim, 1972; Kriebel, 1979; Pinker et al., 1980; Irons et al., 1988; Duynkerke, 1992; Grant et al., 2000] and radiative transfer modeling studies [Dickinson, 1983; Kimes et al., 1987] have indicated that surface albedo, over both bare soil and the plant canopy, depends on solar zenith angle. This dependence exists only for direct beam radiation and varies with land cover types. In addition, albedos of different land cover types exhibit distinct dependences on the solar radiation spectrum, all of which, however, show a sudden jump near 0.7 μm [Li et al., 2002]. Green canopies are extremely effective absorbers of solar radiation in the visible interval (0.4–0.7 μm) to drive photosynthesis, whereas they reflect and transmit most of the incident radiation in the near-infrared band (0.7–4.0 μm) due to relatively high leaf scattering coefficients [Dorman and Sellers, 1989]. Thus it is necessary for LSMs to distinguish direct or diffuse and visible or near-infrared surface albedos.

[4] Bare soil albedo also depends on the material texture caused by soil mineral composition and organic deposition [Irons et al., 1988], surface roughness [Matthias et al., 2000], and most evidently, is a decreasing function of surface moisture content [Idso et al., 1975; Ishiyama et al., 1996; Duke and Guérif, 1998; Muller and Décamps, 2001; Lobell and Asner, 2002]. Meanwhile, plant canopy albedo is determined by various biophysical and biochemical factors, especially leaf area index (LAI), leaf angle distribution, leaf transmittance and reflectance, and in sparse canopies, soil albedo, and vegetation coverage [Dickinson, 1983; Sellers, 1985; Sellers and Dorman, 1987; Kimes et al., 1987; Bonan, 1996; Asner, 1998; Wang, 2003; Hales et al., 2004]. Although snow causes large temporal and spatial variations of surface albedo over both bare soil and the plant canopy [Zhou et al., 2003], the lack of accurate measurements of snow characteristics (amount, coverage, age, pollutant levels) inhibits a rigorous evaluation and further improvement for parameterization of the snow-albedo effect. In this study, we focus on snow-free albedo over all land surfaces.

[5] A complete physical representation of all preceding albedo dependences is not possible. However, current land surface albedo models are oversimplified and/or contain substantial biases compared to observations. For example, in the latest release version 2.0 of the next-generation mesoscale Weather Research and Forecasting model (WRF; http://www.wrf–), the snow-free surface albedos of all the implemented LSMs [see Liang et al., 2004] are prescribed with tabular values depending only on vegetation types without any dynamic variation. In contrast, more comprehensive albedo treatments have been incorporated into the state-of-the-art Common Land Model (CLM) and its predecessors and variations [Dai et al., 2003, 2004; Dickinson et al., 1993; Bonan, 1996; Bonan et al., 2002; Zeng et al., 2002] (see also Y. Dai et al., The Common Land Model (CLM): Technical Documentation and User's Guide, 2001) (hereinafter referred to as Dai et al., online paper, 2001). Numerous studies, however, have found serious discrepancies in these models compared to satellite measurements [Wei et al., 2001; Zhou et al., 2003; Oleson et al., 2003; Wang et al., 2004].

[6] The recent increasing availability of high-quality, fine-resolution satellite data provides an unprecedented opportunity to develop more realistic dynamic-statistical land surface albedo parameterizations. In particular, the Moderate Resolution Imaging Spectroradiometer (MODIS) measurements facilitate the accurate retrieval of direct and diffuse albedos for visible, near-infrared, and total solar bands using a semi-empirical kernel-driven Bidirectional Reflectance Distribution Function (BRDF) model with multidate, multispectral, cloud-free, atmosphere-corrected surface reflectance [Lucht et al., 2000; Schaaf et al., 2002]. The data are currently available over the global land surfaces at 1-km resolution for every 16-day composite period. In comparison with these satellite data, the preceding diagnostic studies have identified major model problem areas, but none has yet developed improved parameterizations. To a certain extent, exceptions include Tian et al. [2004], who improved the CLM and MODIS agreement on diffuse albedo through the use of more realistic land surface data (LAI, plant functional type, and bare soil fraction) over the globe, and Tsvetsinskaya et al. [2002], who linked MODIS surface albedo statistics with soil classifications and rock types over the arid areas of Northern Africa and the Arabian peninsula.

[7] In this study, we use the MODIS and other supplementary data to develop an improved dynamic-statistical parameterization for snow-free land surface albedo. The parameterization is initially designed for U.S. mesoscale modeling applications in the framework of the CLM coupled with the climate extension of the WRF (CWRF) [Liang et al., 2004]. The conceptual model and processing procedures, however, are described in detail and can be generally applied to develop improved schemes for other LSMs and over the globe.

2. Data

[8] The data used in this study consist of the MODIS surface albedos [Schaaf et al., 2002], the Land Data Assimilation System (LDAS) [Cosgrove et al., 2003; Mitchell et al., 2004; Rodell et al., 2004b] surface soil moisture, and other CWRF surface boundary conditions, including land cover category (LCC), fractional vegetation cover (FVC), and leaf and stem area indices (LAI, SAI) [Liang et al., 2004]. The last three parameters are also based on MODIS products, but necessary adjustments are made to ensure consistency with other conventional data sets (see below). All data, except for the static LCC and FVC, are time-varying samples from February 2000 to December 2003. Given that the MODIS albedo data are available for every 16-day composite period, other variables are processed to the same interval by time averaging.
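The time averaging onto the 16-day MODIS interval can be sketched in Python as follows. This is a minimal illustration only: the function name `composite_means`, the flat list input, and the treatment of a trailing partial window are assumptions, not details given in the text.

```python
def composite_means(values, period=16):
    """Average consecutive, non-overlapping windows of `period` samples.

    A trailing window shorter than `period` is averaged over the samples
    it contains (an assumption; the handling of partial periods is not
    described in the source).
    """
    if period <= 0:
        raise ValueError("period must be positive")
    means = []
    for start in range(0, len(values), period):
        window = values[start:start + period]
        means.append(sum(window) / len(window))
    return means


# Hypothetical daily soil moisture values covering two 16-day periods.
daily = [0.25] * 16 + [0.5] * 16
print(composite_means(daily))
```

The same helper could be applied to 8-day fields with `period=2` to reach the 16-day interval.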

[9] For U.S. applications, the CWRF domain is centered at (37.5°N, 95.5°W) using the Lambert Conformal Conic map projection and 30-km horizontal grid spacing, with total grid points of 196 (west–east) × 139 (south–north). The domain covers the whole continental United States and represents the regional climate that results from interactions between the planetary circulation and North American surface processes, including orography, vegetation, soil, and coastal oceans [Liang et al., 2004]. All variable data are processed onto this CWRF grid mesh before their use in development of the new land surface albedo parameterization.

[10] Given the various data resolutions and map projections, the Geographic Information System (GIS) software application tools, Arc/Info and Arc/Map, from Environmental Systems Research Institute, Inc., are employed for horizontal data remapping [Liang et al., 2004]. In particular, the GIS tools are used to first determine the geographic conversion information from the specific map projection of each raw data set to the identical CWRF grid system. The information includes location indices, geometric distances, or fractional areas of all input cells contributing to each CWRF grid. The remapping is completed by a bilinear interpolation method in terms of the geometric distances if the raw data resolution is low, or otherwise by a mass conservative approach weighted by the fractional areas. For the categorical field LCC, the total fractional area of each distinct surface category contributing to a given CWRF grid is first calculated, and then the dominant one that occupies the largest fraction of the grid is chosen.
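The two area-based rules described above (the conservative weighted average for continuous fields and the dominant-category choice for LCC) can be illustrated with a short Python sketch. The function names and the input layout, a list of (value, fractional area) pairs for the raw cells overlapping one CWRF grid cell, are assumptions made for illustration; the actual remapping was done with the GIS tools.

```python
from collections import defaultdict


def area_weighted_mean(values_and_areas):
    """Conservative average of a continuous field over one grid cell."""
    total_area = sum(area for _, area in values_and_areas)
    if total_area <= 0.0:
        raise ValueError("no contributing area")
    return sum(value * area for value, area in values_and_areas) / total_area


def dominant_category(categories_and_areas):
    """Return the category occupying the largest total fractional area."""
    totals = defaultdict(float)
    for category, area in categories_and_areas:
        totals[category] += area
    if not totals:
        raise ValueError("no contributing cells")
    return max(totals, key=totals.get)


# Hypothetical raw cells overlapping one grid cell: (LCC type, fraction).
cells = [(2, 0.30), (5, 0.25), (2, 0.15), (14, 0.30)]
print(dominant_category(cells))
```

In this example category 2 is selected because its two contributions together cover the largest fraction of the cell.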

2.1. MODIS Surface Albedos

[11] The MODIS BRDF/Albedo product (MOD43B1) [Schaaf et al., 2002] includes directional hemispherical (black-sky) and bihemispherical (white-sky) reflectance (albedo) for three broad bands: visible (0.3–0.7 μm), near-infrared (0.7–5.0 μm), and total (0.3–5.0 μm). They are obtained through spectral (seven measured bands) to broadband conversions [Liang et al., 1999; Liang, 2001] and can be well reproduced by polynomials [Lucht et al., 2000],

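$$
\alpha_{s,b}(\theta,\lambda) = \sum_{k} f_k(\lambda)\left(g^{bs}_{0k} + g^{bs}_{1k}\,\theta^{2} + g^{bs}_{2k}\,\theta^{3}\right), \qquad
\alpha_{s,d}(\lambda) = \sum_{k} f_k(\lambda)\, g^{ws}_{k} \tag{1}
$$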

where $\alpha_s$ denotes satellite-derived albedo; the subscripts b and d (superscripts bs and ws) denote the black-sky and white-sky components, respectively; $\lambda$ is the spectral band; and $\theta$ is the solar zenith angle (radians). The fitting coefficients $g^{bs}_{jk}$ and $g^{ws}_{k}$ are given in Table I of Lucht et al. [2000]. The BRDF model kernel weights $f_k$ depend on spectral bands and vary with time and location. They are provided by the MOD43B1 product at 1-km resolution over the globe for every 16-day composite period.
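Equation (1) is straightforward to evaluate once the kernel weights are known. The following Python sketch assumes the three-kernel (isotropic, volumetric, geometric) polynomial form of Lucht et al. [2000]. The coefficient values are those commonly quoted for the MODIS BRDF/Albedo product and should be verified against Table I of that paper; the function and dictionary names are illustrative.

```python
# Polynomial coefficients (g0, g1, g2) for the black-sky albedo and the
# single coefficient for the white-sky albedo, per kernel. Values are as
# commonly quoted for the MODIS BRDF/Albedo product; treat them as an
# assumption and check them against Table I of Lucht et al. [2000].
G_BS = {
    "iso": (1.0, 0.0, 0.0),
    "vol": (-0.007574, -0.070987, 0.307588),
    "geo": (-1.284909, -0.166314, 0.041840),
}
G_WS = {"iso": 1.0, "vol": 0.189184, "geo": -1.377622}


def black_sky_albedo(f, theta):
    """Black-sky albedo for kernel weights f and zenith angle theta (rad)."""
    return sum(
        f[k] * (g0 + g1 * theta**2 + g2 * theta**3)
        for k, (g0, g1, g2) in G_BS.items()
    )


def white_sky_albedo(f):
    """White-sky albedo for kernel weights f (no zenith angle dependence)."""
    return sum(f[k] * G_WS[k] for k in G_WS)


# Hypothetical kernel weights for one pixel and one band.
weights = {"iso": 0.20, "vol": 0.10, "geo": 0.02}
print(black_sky_albedo(weights, 0.5), white_sky_albedo(weights))
```

With only an isotropic weight, both albedos reduce to that weight, which gives a quick sanity check on the coefficients.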

[12] For the purpose of this study, the reprocessed version 004 data for the BRDF model kernel weights are adopted for their improved quality control. Only the snow-free pixels that have a mandatory quality flag of “processed” (QA = 0 or 1) are selected. As an additional constraint, the ratio of near-infrared to visible albedos must exceed 1.2; otherwise the data are discarded. This eliminates potentially poor quality data caused by clouds leaking through the MODIS cloud detection algorithm (i.e., when both bands are bright in the presence of clouds, the ratio is close to 1). For every 16-day composite, we calculate, using equation (1), a single sample for white-sky albedo and 24 hourly samples for black-sky albedo corresponding to the actual local solar zenith angle. These $\alpha_{s,b}$ and $\alpha_{s,d}$ represent the direct beam contribution and the entire diffuse portion, respectively. They form the ground truth for developing the new parameterization of snow-free land surface albedo for direct and diffuse solar radiation, visible and near-infrared.
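The pixel screening described above can be expressed as a small predicate. This is a sketch under the stated criteria only; the function name, argument names, and the handling of non-positive visible albedo are assumptions.

```python
def passes_quality_screen(qa_flag, snow_free, alb_vis, alb_nir, min_ratio=1.2):
    """Return True if a pixel satisfies the screening rules in the text.

    qa_flag   : mandatory quality flag (0 or 1 means "processed")
    snow_free : True if the pixel is flagged snow-free
    alb_vis   : visible broadband albedo
    alb_nir   : near-infrared broadband albedo
    """
    if not snow_free or qa_flag not in (0, 1):
        return False
    if alb_vis <= 0.0:
        # Assumption: a non-positive visible albedo is treated as invalid.
        return False
    # Cloud-contaminated pixels are bright in both bands (ratio near 1).
    return alb_nir / alb_vis > min_ratio


print(passes_quality_screen(0, True, 0.05, 0.25))   # vegetated, clear pixel
print(passes_quality_screen(1, True, 0.30, 0.32))   # likely cloud leak
```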

2.2. LDAS Surface Soil Moisture

[13] The multiinstitutional LDAS project is designed to provide enhanced soil moisture and temperature conditions for numerical weather/climate prediction models. The North American LDAS (NLDAS) [Mitchell et al., 2004] is an uncoupled data assimilation system, where a suite of LSMs are driven by the same realistic atmospheric forcing data and initialized at the same time with the same relative soil wetness. No feedback from the land surface to atmosphere is included. The forcing data represent the best available proxy of observations, including the near-surface meteorological conditions (such as surface wind, temperature, humidity, and pressure), precipitation analyses (merged gauge, satellite, and model data), and surface radiation budgets [Cosgrove et al., 2003; Pinker et al., 2003]. A similar procedure is followed in the Global LDAS (GLDAS) [Rodell et al., 2004b]. The NLDAS covers the continental United States at 1/8° (∼15 km) resolution while the GLDAS extends over the globe north of 60°S at 1° (∼110 km) resolution.

[14] Currently, hourly outputs from the NLDAS Mosaic [Koster and Suarez, 1992] and GLDAS Noah [Ek et al., 2003] LSMs are available from the Hydrological Sciences Branch of NASA Goddard Space Flight Center. The Mosaic LSM's physics are based on those of the Simple Biosphere Model (SiB) [Sellers et al., 1986] that accounts for the sub-grid heterogeneity of vegetation and soil moisture. Energy and hydrology balances are computed at each mosaic tile of a distinct vegetation type [Avissar and Pielke, 1989]. Within each grid, up to 10 tiles can be specified and, for each variable, the grid value equals the area average of all tiles. The Noah LSM [Chen et al., 1996; Koren et al., 1999] has been used operationally in National Centers for Environmental Prediction (NCEP) models with continuous improvements [Betts et al., 1997; Ek et al., 2003].

[15] Several studies have evaluated and compared soil moisture performance between the LSMs. Comparisons with in situ measurements in Illinois and Oklahoma indicated that the NLDAS/Mosaic total column soil moisture is strongly correlated with observations [Robock et al., 2003; Schaake et al., 2004]. Since soil moisture exhibits a high degree of spatial variability, it must be cautioned that comparing grid model outputs with point measurements is not a dependable method of validation. Comparisons with gravity-based terrestrial water storage estimates from the Gravity Recovery and Climate Experiment (GRACE) satellite mission showed that the GLDAS/Noah simulates seasonal, large-scale variations with a reasonable degree of accuracy, and better than other existing global products examined [Tapley et al., 2004; Rodell et al., 2004a].

[16] This study uses the top layer (0–10 cm) data for surface volumetric soil moisture from the two LDAS products. Although albedo is only directly a consequence of conditions at the very topsoil layer, vertical diffusion maintains a link with soil moisture conditions over a deeper layer. This is shown by Idso et al. [1975], who demonstrated a good relationship between albedo and 0–10 cm soil moisture, justifying the use of this layer in the present study. The NLDAS/Mosaic gives a better resolution over the United States and is used as the major data set for developing the soil albedo parameterization, while the GLDAS/Noah provides data coverage outside of the NLDAS domain. Given the existence of systematic differences between NLDAS and GLDAS, discontinuities occur along the southern and northern borders between the two moisture data sets. Along each border, the belt of five west-east grid rows inward from the NLDAS data border line is considered as a transition zone. For all grid rows outward from the border line along the same south-north grid column, the ratio between the GLDAS and NLDAS averages within the transition zone is calculated, and its running mean of 21 west-east points is used to scale the GLDAS soil moisture. This scaling effectively removes the discontinuity (see below). To account for the diurnal cycle in correspondence with the direct albedo dependence on the solar zenith angle, hourly composites are obtained by averaging in each 16-day period of the MODIS data.
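The border scaling can be sketched as follows. This is one possible reading of the procedure, not the authors' code: the text does not state how the running mean is treated near the domain edges or whether the GLDAS values are divided or multiplied by the ratio. Here the ratio is taken as GLDAS/NLDAS, the running mean is truncated at the edges, and GLDAS is divided by the smoothed ratio so that it matches the NLDAS level in the transition zone. All function names are hypothetical.

```python
def column_ratios(gldas_zone, nldas_zone):
    """GLDAS/NLDAS ratio of transition-zone averages for each grid column.

    Both inputs are lists of rows (e.g., five west-east rows), each row
    being a list of soil moisture values over the same columns.
    """
    ncol = len(gldas_zone[0])
    ratios = []
    for i in range(ncol):
        gldas_mean = sum(row[i] for row in gldas_zone) / len(gldas_zone)
        nldas_mean = sum(row[i] for row in nldas_zone) / len(nldas_zone)
        ratios.append(gldas_mean / nldas_mean)
    return ratios


def running_mean(values, window=21):
    """Centered running mean; the window is truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def scale_gldas_row(gldas_row, smoothed_ratio):
    """Scale one GLDAS row outside the NLDAS domain toward the NLDAS level."""
    return [g / r for g, r in zip(gldas_row, smoothed_ratio)]


# Hypothetical transition zone: GLDAS is uniformly 1.5 times wetter.
gldas_zone = [[0.30, 0.30, 0.30] for _ in range(5)]
nldas_zone = [[0.20, 0.20, 0.20] for _ in range(5)]
ratio = running_mean(column_ratios(gldas_zone, nldas_zone))
print(scale_gldas_row([0.30, 0.45, 0.15], ratio))
```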

2.3. CWRF Surface Boundary Conditions

[17] The LCC adopts the U.S. Geological Survey (USGS) land cover classification, which consists of 24 categories. The USGS land cover data were developed using the Advanced Very High Resolution Radiometer (AVHRR) satellite-derived Normalized Difference Vegetation Index (NDVI) composites from April 1992 through March 1993. Table 1 lists the percent area covered by each LCC type over the globe and the CWRF U.S. domain as well as the total number of quality-controlled data samples (grids plus records) actually used in the new parameterization development. Note that the USGS raw data do not contain land cover categories 4 and 20 over the globe, and additionally 12, 17, and 23 within the present CWRF domain. Moreover, categories 22 and 24 are not chosen as the majority type for LCC. Therefore the final LCC includes only 17 land cover categories over this CWRF domain [Liang et al., 2004]. The water body category (16) is also discarded in this study.
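The category bookkeeping in this paragraph can be checked with a few set operations. The variable names below are illustrative; the category numbers are taken directly from the text.

```python
ALL_USGS = set(range(1, 25))       # the 24 USGS categories
ABSENT_GLOBAL = {4, 20}            # not present in the raw data over the globe
ABSENT_IN_DOMAIN = {12, 17, 23}    # not present within the CWRF domain
NEVER_DOMINANT = {22, 24}          # never the majority type in a grid cell
WATER = {16}                       # discarded in this study

final_lcc = ALL_USGS - ABSENT_GLOBAL - ABSENT_IN_DOMAIN - NEVER_DOMINANT
land_lcc = final_lcc - WATER

print(len(final_lcc))   # 17 categories in the final LCC
print(len(land_lcc))    # 16 land categories used for the albedo development
```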

Table 1. USGS Land Cover Categories and Corresponding Area and Data Coverages

LCC   USGS Land Use/Land Cover                          Coverage, %             Number of Data Samples
Type                                                    United States  Global   Direct Beam  Diffuse Radiation
1     Urban and built-up land                           0.34           0.12     33955        2788
2     Dryland cropland and pasture                      5.49           5.64     1360282      109361
3     Irrigated cropland and pasture                    0.48           1.52     67353        5504
4     Mixed dryland/irrigated cropland and pasture (a)
5     Cropland/grassland mosaic                         4.42           2.05     996429       79816
6     Cropland/woodland mosaic                          2.26           3.27     521817       42640
9     Mixed shrubland/grassland                         0.11           1.02     27646        2288
11    Deciduous broadleaf forest                        5.46           2.55     971407       79324
12    Deciduous needleleaf forest (b)                   0.00           0.91
13    Evergreen broadleaf forest                        0.08           5.75     3184         264
14    Evergreen needleleaf forest                       10.32          2.26     2034863      163861
15    Mixed forest                                      7.59           3.59     849764       60535
16    Water bodies                                      43.09          38.90
17    Herbaceous wetland (b)                            0.00           0.03
18    Wooded wetland                                    0.84           0.43     101777       8144
19    Barren or sparsely vegetated                      0.46           7.56     46098        3828
20    Herbaceous tundra (a)
21    Wooded tundra                                     3.35           2.99     83332        5716
22    Mixed tundra                                      0.37           0.97
23    Bare ground tundra (b)                            0.00           0.02
24    Snow or ice                                       0.03           1.23

(a) Land cover type that is absent over the globe.
(b) Land cover type that is absent in the U.S. CWRF domain.

[18] The FVC is derived, following Zeng et al. [2000, 2002], from the MODIS NDVI product. Given its good agreement with field surveys and observational studies and small interannual variability, the FVC derived from the AVHRR NDVI was believed to be robust [Zeng et al., 2002, 2003]. On the other hand, there exist significant differences between the NDVI from the MODIS and AVHRR sensors. The MODIS has generally larger values than the AVHRR [Gallo et al., 2004], causing an overestimation of FVC. Thus the MODIS NDVI is first scaled toward the AVHRR data to remove the systematic difference between the two for each USGS land cover category, and then used to derive the final FVC [Liang et al., 2004].

[19] The LAI and SAI are defined, respectively, as the total one-sided area of all green canopy elements and of stems plus dead leaves over vegetated ground area. The operational MODIS product, MOD15A2 in reprocessed version 004, provides LAI data at 1-km resolution for every 8-day composite period [Knyazikhin et al., 1999; Myneni et al., 2002]. We select only the cloud-free pixels that have a quality check flag of “best” or “ok” (QC = 0 or 1). Note that this product exhibits significant differences from that based on the AVHRR [Zhou et al., 2001; Buermann et al., 2002]. Apparent discontinuities exist between the two data sets, where the MODIS values are systematically smaller, especially for the Midwest cropland. As compared with measurements at a central Illinois soybean/corn site (by courtesy of Steven Hollinger of Illinois State Water Survey), the AVHRR-based LAI estimates are in good agreement and capture well the peak values during the growing season. Thus the MODIS LAI is corrected to have the same monthly mean climatology as the AVHRR for the cropland-related LCC categories 2–6 [Liang et al., 2004]. The corrected LAI samples are averaged for 16-day composites corresponding to the MODIS albedo data. For each LCC, SAI is then approximated from LAI following Zeng et al. [2002].

3. Methodology

[20] Our main purpose is to develop an improved dynamic-statistical parameterization for snow-free land surface albedo that most realistically simulates the MODIS measurements. Since the parameterization is developed for intrinsic surface albedos that are not affected by the prevailing atmospheric conditions, we can directly compare the modeled values with the equivalent intrinsic albedos (equation (1)) provided by the MODIS product. Our starting point is the existing CLM albedo scheme, which represents the state-of-the-art in current climate modeling. It specifies separate albedos for bare soil and vegetation and then determines total surface albedo as an area-weighted mixture of the two. We follow this approach to facilitate the new parameterization.

3.1. Old CLM Albedo Scheme

[21] The complete CLM albedo scheme, mainly based on the work of Dickinson et al. [1990], Verstraete et al. [1990], and Schluessel et al. [1994], is described in detail by Dai et al. (online paper, 2001). A summary of the snow-free land surface albedo portion is given here. For bare soil, albedo is specified as a decreasing linear function of soil moisture,

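$$
\alpha_{g,vis} = \alpha_{sat} + \min\left\{\alpha_{sat},\ \max\left[0.01\,(11 - 40\,\vartheta),\ 0\right]\right\}, \qquad
\alpha_{g,nir} = 2\,\alpha_{g,vis} \tag{2}
$$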

where subscript g denotes bare soil or ground; vis and nir are the visible and near-infrared bands; and $\vartheta$ is the surface volumetric soil moisture (m³/m³). The saturated soil albedo $\alpha_{sat}$ depends on local soil color that varies from light to dark according to the global 1° soil classification distribution of Wilson and Henderson-Sellers [1985]. Note that no distinction is made between direct and diffuse albedos, and solar zenith angle dependence is not considered. These simplifications are not realistic, as discussed in section 1. In addition, the ratio between near-infrared and visible albedos is fixed at 2, which disagrees with the MODIS data [Zhou et al., 2003]. The specification of $\alpha_{sat}$ as a function of soil color is also problematic, since no credible (high-quality and fine-resolution) color data exist. Currently, the CLM prescribes $\alpha_{sat}$ as a tabular function [e.g., Zhou et al., 2003, Table 2a].

[22] The vegetation albedo is defined by a simplified two-stream solution under the asymptotic constraint that it approaches the underlying soil albedo or the thick canopy albedo (specific to vegetation type) when the local LAI is close to zero or infinity, respectively,

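$$
\alpha^{v,b}_{\lambda} = \alpha^{c}_{\lambda}\left[1 - \exp\left(-\frac{\omega_{\lambda}\,\beta\,L_{sai}}{\mu\,\alpha^{c}_{\lambda}}\right)\right] + \alpha_{g,\lambda}\exp\left[-\left(1 + \frac{0.5}{\mu}\right)L_{sai}\right] \tag{3}
$$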

where superscripts v, c, b, and d denote vegetation, thick canopy, direct beam, and diffuse radiation, respectively; $\mu = \cos\theta$; $L_{sai}$ = LAI + SAI; the single scattering albedo $\omega_{\lambda}$ is prescribed as 0.15 (vis) and 0.85 (nir), and the coefficient $\beta$ is set to 0.5. Note that $\alpha^{v,d} = \alpha^{v,b}|_{\mu=0.5}$, i.e., the diffuse vegetation albedo equals the direct beam value evaluated at $\mu = 0.5$. By default, the CLM prescribes the canopy albedo $\alpha^{c}$ as tabular functions of vegetation types [e.g., Zhou et al., 2003, Table 2b]. A major concern is that the dependence on $L_{sai}$ is identical for all vegetation types and no distinction is made for dense canopy albedos between direct beam and diffuse radiation. This is apparently not realistic [Asner, 1998].

[23] The bulk snow-free surface albedo, either direct or diffuse, is determined by

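$$
\alpha^{\eta}_{\lambda} = f_v\,\alpha^{v,\eta}_{\lambda} + (1 - f_v)\,\alpha_{g,\lambda}, \qquad \eta = b,\ d \tag{4}
$$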

Here fv = FVC.
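The old CLM vegetation and bulk albedos can be sketched in Python. The sketch assumes that equation (3) has the simplified two-stream form commonly given for the CLM, $\alpha^{c}[1 - \exp(-\omega\beta L_{sai}/(\mu\alpha^{c}))] + \alpha_{g}\exp[-(1 + 0.5/\mu)L_{sai}]$, and that equation (4) is the area-weighted mixture described in the text. Function names and the sample inputs are hypothetical.

```python
import math

OMEGA = {"vis": 0.15, "nir": 0.85}  # single scattering albedo (from the text)
BETA = 0.5                           # scattering coefficient (from the text)


def clm_vegetation_albedo(alb_canopy, alb_soil, lsai, mu, band):
    """Direct beam vegetation albedo, assumed form of equation (3)."""
    omega = OMEGA[band]
    canopy_term = alb_canopy * (
        1.0 - math.exp(-omega * BETA * lsai / (mu * alb_canopy))
    )
    soil_term = alb_soil * math.exp(-(1.0 + 0.5 / mu) * lsai)
    return canopy_term + soil_term


def clm_diffuse_vegetation_albedo(alb_canopy, alb_soil, lsai, band):
    """Diffuse value: the direct beam expression evaluated at mu = 0.5."""
    return clm_vegetation_albedo(alb_canopy, alb_soil, lsai, 0.5, band)


def bulk_albedo(alb_veg, alb_soil, fvc):
    """Area-weighted mixture of vegetation and bare soil albedo."""
    return fvc * alb_veg + (1.0 - fvc) * alb_soil


# Hypothetical inputs: canopy 0.08, soil 0.12, Lsai 3, mu 0.7, FVC 0.6.
veg = clm_vegetation_albedo(0.08, 0.12, 3.0, 0.7, "vis")
print(bulk_albedo(veg, 0.12, 0.6))
```

The asymptotic constraint is visible directly: with `lsai = 0` the function returns the soil albedo, and for very large `lsai` it approaches the thick canopy albedo.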

[24] The recent CLM treats the radiative transfer within vegetation canopies by the full two-stream solution [Dai et al., 2004], which is computationally expensive. This introduces additional input parameters: the leaf angle distribution factor, and reflectance and transmittance specified separately for live and dead leaves and for the visible and near-infrared bands. All of these parameters must be specified, currently subjectively, for each of the 24 USGS vegetation categories. They have presently been tuned to approximate the simplified scheme of equation (3). Since it is much more difficult to realistically estimate these extra parameters, we choose equation (3) as the baseline for new parameterization development.

3.2. New Conceptual Model

[25] Since the structure among surface types varies considerably, it is difficult to develop a universal scheme to model the surface albedo zenith angle dependence [Briegleb et al., 1986]. For a semi-infinite plant canopy consisting of randomly oriented leaves, Dickinson [1983] described the dependence by α(μ) = α0(1 + d)(1 + 2dμ)−1, where α0 is the albedo for μ = 0.5, depending on surface types; d is a fitting parameter, 0.4 for arable land, grassland, and desert, and 0.1 for all other types. In contrast, Xue et al. [1991] found that the shape of the diurnal variation of surface albedo resulting from the two-stream solution of the SiB [Sellers et al., 1986] is very regular and can be adequately simulated by a quadratic fit. The MODIS albedo dependence on solar zenith angle is also approximated by a polynomial in equation (1) [Lucht et al., 2000]. Thus we choose the following shape function to depict the μ dependence:

equation image

where Cjμ,λ are coefficients depending on spectral bands. For now, our only concern is the shape, which is assumed to be the same for all bare soil with or without any type of vegetation canopy.
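For comparison, the Dickinson [1983] zenith angle dependence quoted in paragraph [25], α(μ) = α0(1 + d)/(1 + 2dμ), can be evaluated with a few lines of Python. The function name and the sample albedo value are illustrative only.

```python
def dickinson_albedo(alpha0, mu, d):
    """Albedo at cosine zenith angle mu, following Dickinson [1983].

    alpha0 : albedo at mu = 0.5
    d      : fitting parameter (0.4 for arable land, grassland, and
             desert; 0.1 for all other types, as quoted in the text)
    """
    return alpha0 * (1.0 + d) / (1.0 + 2.0 * d * mu)


# Hypothetical grassland with alpha0 = 0.2: albedo rises toward low sun.
for mu in (1.0, 0.5, 0.1):
    print(mu, dickinson_albedo(0.2, mu, 0.4))
```

By construction the expression returns α0 at μ = 0.5, and it increases monotonically as the sun approaches the horizon.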

[26] Masson et al. [2003] formulated bare soil albedo as a linear function of the sand fraction, representing soil mineral composition, and the relative area of woody and herbaceous vegetation types, depicting soil organic deposition. However, we found very little correspondence between these factors and the MODIS data. As noted earlier, the CLM saturated soil albedo varies with soil color, for which there is a lack of credible data. In addition, Tsvetsinskaya et al. [2002] linked surface albedo with soil classifications and rock types over arid areas using 1-km resolution data. This relationship was not evident in our comparison over the CWRF 30-km grid mesh between the MODIS albedo and the soil characteristic factors described by Liang et al. [2004]. The coarser data resolution may contribute to this lack of correspondence. Hence these factors are not considered in this study.

[27] On the other hand, the dependence on surface soil moisture has been clearly demonstrated in both observational and modeling studies (see the introduction). The CLM adopts a decreasing linear function in equation (2), which may trace back to Idso et al. [1975], who observed such a relationship for a very thin surface layer (<0.2 cm thick) but much sharper slopes for thicker layers. (Note that the top CLM layer is 1.75 cm thick [Liang et al., 2004].) More recent studies indicated a decreasing exponential function [Duke and Guérif, 1998], which is similar for different soil types [Lobell and Asner, 2002]. We therefore choose the following shape function to depict the ϑ dependence:

equation image

where Cjϑ,η,λ are coefficients depending on direct beam or diffuse radiation (η = b, d) as well as spectral bands. Again, this applies for all bare soil with or without any type of vegetation canopy.

[28] It is interesting to note that decreasing albedo-soil moisture relationships were also identified from Amazonian forest data, where no soil is visible from above the canopy [Culf et al., 1995]. This feature cannot be explained by a color change of the exposed soil when moist, as it can for short vegetation canopies. Rather, the decreasing albedo likely results from a canopy water content increase in response to a greater soil moisture availability, since leaf dehydration generally increases reflectance [Mooney et al., 1977]. Given that no data for canopy water content are available, we approximate this relationship, if any, by the soil moisture dependence. This simplification is implied by assuming the background soil albedo (defined below) as a function of LCC type.

[29] We can now parameterize the bare soil albedo by the product of Rμ,λ and Rϑ,η,λ,

equation image

where the coefficients are redefined as

equation image

By this definition, $\alpha^{0}_{g,\eta,\lambda} = \alpha_{g,\eta,\lambda}|_{\vartheta=0,\,\mu=0}$, which is referred to as the maximum background soil albedo; $C_{\vartheta,\eta,\lambda}$ is introduced to account for the varying range of soil albedo [e.g., Duke and Guérif, 1998]; both may depend on LCC types. As such, the μ or ϑ dependence is normalized to a fractional shape varying between 0 and 1. Later we will show that this normalization provides a more meaningful depiction of the shape functions. Note that a distinction is made for the $C_{\vartheta,\eta,\lambda}$ correction between soils of tall trees and of short vegetation. For those of tall trees (wooded or forested LCC types 6, 10–15, 18, 21), the correction is applied only in the vegetated area $f_v$ while setting $C_{\vartheta,\eta,\lambda} = 0$ over the bare soil portion $(1 - f_v)$. This is to account for likely differences in soil characteristics under trees, including leaf coverage on the ground (shading the soil from exposure) and water intake from deep root zones (weakening canopy sensitivity to the skin soil water), both of which reduce the albedo dependence on surface soil moisture as sensed by the MODIS. For the other LCC types, the $C_{\vartheta,\eta,\lambda}$ correction is assumed uniform over the whole grid.

[30] Canopy albedo tends to decrease with increasing vegetation amount due to the high PAR absorption by plants. Hales et al. [2004] found a decreasing exponential function of LAI that reproduces the annual mean observations well. They, however, considered only albedo for total radiation. Duke and Guérif [1998] showed that, as LAI increases, vegetation albedo exponentially decreases (increases) due to enhanced absorption (reflection) of the visible (near-infrared) radiation. This lends some support to the use of equation (3), where vegetation albedo can increase or decrease with $L_{sai}$ depending on whether the first or second term on the right side of the equation is larger. It is also important to distinguish characteristic structure differences between vegetation types. We choose to use the following general form of vegetation albedo:

equation image

where the canopy albedo $\alpha_{c,\eta,\lambda}$ and the upward scattering coefficient $\Lambda_{\eta,\lambda}$ are distinguished between direct beam and diffuse radiation; both also depend on spectral bands and vegetation types. The factor $m_{\lambda} = \sqrt{1 - \omega_{\lambda}}$ accounts for the reduced absorption (and thus higher transmittance) of intercepted and scattered radiation by non-black leaves [Goudriaan, 1977]. Our sensitivity experiment (not shown) suggests that $\omega_{vis} = 0.2$ and $\omega_{nir} = 0.8$ yield an overall good fit for most LCC types. Accordingly, we choose $m_{vis} = 0.894$ and $m_{nir} = 0.447$. The transmittance or extinction coefficients $\Gamma_{\eta}$ are defined as

equation image

where G(μ) is the projected area of phytoelements in direction μ. We follow Goudriaan [1977] to fit this factor by

G(μ) = ϕ1 + ϕ2μ    (11)

[31] Goudriaan [1977] specified ϕ2 = 0.877(1 − 2ϕ1) and ϕ1 = 0.5 − 0.633χ − 0.33χ², where χ is an empirical parameter representing the departure of leaf angles from a spherical or random distribution (χ = 0), varying from −1 for vertical leaves to +1 for horizontal ones. We, however, choose ϕ1 and ϕ2 as free parameters to fit the MODIS data. Equation (10) can now be expressed by

equation image

[32] We can see that the CLM equation (3) is a special case of the general form equation (9), obtained when ϕ1 = 0.5, ϕ2 = 0 (i.e., χ = 0, indicating spherically distributed leaves), and αc,η,λ = αc, αg,η,λ = αg, Λη,λ = ωλβ/αc (ignoring the distinction between direct beam and diffuse radiation).
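The numerical values quoted in this subsection can be checked with a short sketch. It assumes mλ = √(1 − ωλ), a form inferred here only because it reproduces the quoted mvis and mnir, and the linear fit G(μ) = ϕ1 + ϕ2μ with the Goudriaan [1977] coefficients given above. The function names are illustrative and are not part of the parameterization itself.

```python
import math
from typing import Tuple


def non_black_leaf_factor(omega: float) -> float:
    """Return m = sqrt(1 - omega); this functional form is an assumption
    chosen because it matches the quoted m_vis and m_nir."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return math.sqrt(1.0 - omega)


def goudriaan_phi(chi: float) -> Tuple[float, float]:
    """Leaf-angle coefficients (phi1, phi2) as specified by Goudriaan [1977]."""
    phi1 = 0.5 - 0.633 * chi - 0.33 * chi ** 2
    phi2 = 0.877 * (1.0 - 2.0 * phi1)
    return phi1, phi2


def projected_area(mu: float, phi1: float, phi2: float) -> float:
    """Projected area of phytoelements, assumed linear in mu."""
    return phi1 + phi2 * mu


m_vis = non_black_leaf_factor(0.2)   # omega_vis = 0.2
m_nir = non_black_leaf_factor(0.8)   # omega_nir = 0.8
phi1, phi2 = goudriaan_phi(0.0)      # chi = 0: spherical leaf distribution

print(round(m_vis, 3), round(m_nir, 3))
print(phi1, phi2, projected_area(0.3, phi1, phi2))
```

For χ = 0 the projected area is 0.5 at every solar angle, which is the CLM special case noted above. In the new parameterization ϕ1 and ϕ2 are fitted freely rather than derived from χ.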

[33] Initial diagnosis of MODIS data for FVC greater than 0.6 indicates that there exists a tendency for visible (near-infrared) albedo to decrease (increase) with LAI and that such dependence varies with LCC types. This is consistent with the dominant effect of plant enhancement on absorption (reflection) of visible (near-infrared) radiation. The second term in equation (9) simulates the absorption effect through the combination of the Lsai exponential decay of transmittance and ground reflection, while the first term is supposed to depict the plant reflection effect. The Λη,λ term, however, considers only the scattering increase as radiation passes through the depth of canopy. The direct amplification of reflection by canopy greenness must be represented in αc,η,λ. We choose a linear shape function to describe the canopy albedo dependence on greenness,

equation image

where GRN is the canopy greenness and is currently approximated with local LAI divided by the maximum over all locations having the same LCC type. This approximation may not represent the actual greenness but a simple numerical factor as a normalized LAI that is effectively linked with canopy albedo. For simplicity, we ignore the dependence in the visible band (λ = vis), where the scattering effect is small. By definition, for the near-infrared band (λ = nir), Cc,η,λ > 0 and α0c,η,λ = αc,η,λ at GRN = 1, which is referred to as the maximum background canopy albedo. Both Cc,η,λ and α0c,η,λ may depend on LCC types.
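The greenness approximation described above amounts to a per-category normalization of LAI. The following is a minimal sketch under the assumption that LAI values and LCC types are supplied as parallel lists, one entry per sample; the function name and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def greenness(lai: Sequence[float], lcc: Sequence[Hashable]) -> List[float]:
    """Normalize each LAI value by the maximum LAI found among all samples
    that share the same land cover category (LCC)."""
    max_lai = defaultdict(float)
    for value, category in zip(lai, lcc):
        max_lai[category] = max(max_lai[category], value)
    # Categories with no vegetation at all (maximum LAI of zero) get GRN = 0.
    return [
        value / max_lai[category] if max_lai[category] > 0.0 else 0.0
        for value, category in zip(lai, lcc)
    ]


# Hypothetical samples: two grids of LCC type 2 and two of LCC type 10.
grn = greenness([1.0, 2.0, 4.0, 3.0], [2, 2, 10, 10])
print(grn)
```

By construction GRN lies between 0 and 1 and equals 1 where LAI reaches the category maximum, which is where αc,η,λ equals the maximum background canopy albedo.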

[34] Our main goal is to search for the best set of parameters Cjμ,λ, Cjϑ,η,λ, Cϑ,η,λ, Cc,η,λ, α0g,η,λ, α0c,η,λ, Λη,λ, and ϕj that most realistically capture the predictable dynamic variations of snow-free land surface albedo inherent in the MODIS data. These parameters also vary with LCC types (except Cjμ,λ, Cjϑ,η,λ) but are not a direct function of location. Many other factors, however, are currently not measurable or predictable. In particular, local soil characteristics (e.g., soil color, surface roughness) and canopy structures (e.g., mosaic distribution of multiple vegetation categories) have great impacts on surface albedo and are not yet represented in LSMs. We therefore introduce a soil albedo localization factor (SALF) to depict the static portion of albedo that is geographically dependent. Equation (4) now becomes

equation image

where γη,λ = SALF varies with geographic locations and spectral bands, and differs between direct beam and diffuse radiation. The αη,λ is the dynamic component of the new parameterization that represents the predictable albedo dependencies on solar zenith angle, surface soil moisture, land cover category, fractional vegetation cover, leaf plus stem area index, and greenness. The final albedo α′η,λ incorporates a statistical correction for other unexplained but accountable static effects specific to local surface characteristics.

[35] Considerable spatial variability in surface albedo over deserts and semideserts has been observed [Pinty et al., 2000; Strugnell and Lucht, 2001; Tsvetsinskaya et al., 2002; Zhou et al., 2003]. The spatial variability is more readily represented by the SALF than by soil color or texture types, which lack global measurements. The SALF is a regression fit that minimizes the difference between the modeled and measured albedo statistics, and thus provides a direct and most realistic way to depict the static albedo effect of surface characteristics.

3.3. Solving the Nonlinear Constrained Optimization Problem

[36] The new conceptual model, as described in equations (7)–(14), requires specification of eight groups of parameters (Cjμ,λ, Cjϑ,η,λ, Cϑ,η,λ, Cc,η,λ, α0g,η,λ, α0c,η,λ, Λη,λ, and ϕj) to define the dynamic component of albedo temporal and spatial variations, which is corrected by local γη,λ to account for other unresolved effects that are specific to each geographic location. Given 16 LCC types with vegetation canopies, there are 304 independent parameters of the dynamic component and four geographic distributions of localization factors for the static contribution. It is impossible for any optimization solver to faithfully estimate all these unknowns at once. Our strategy is to use a subset of the data that pertain to the physical regime dominating a specific group of parameters and solve the optimization problem group by group in a pre-sorted sequential order.

[37] The optimization solver used in this study is the FORTRAN Feasible Sequential Quadratic Programming (FFSQP) [Zhou et al., 1997]. The FFSQP is designed to find the optimal solution for the minimization of the maximum of a set of smooth objective functions subject to equality and inequality constraints, linear or nonlinear, and simple bounds on the variables. It requires the accurate definition of the objective functions and constraint functions, and, from our own experience, the gradients of these functions to achieve a robust solution. Most of the parameters to be estimated have distinct physical meanings and thus must be objectively constrained. The FFSQP solver finds the shortest path in the multiparameter space. This path is only one of many numerical solutions satisfying the specified functions and by itself preserves no physical meaning of the parameters except their specified ranges, gradients, and other mathematical constraints. As such, the physical representation of each parameter must be carefully thought through in advance and translated into mathematical constraints to increase the likelihood that the resulting solution reflects a large part of the true dynamical processes. We take the following steps to solve the system.

3.3.1. Bare Soil Albedo μ Dependence

[38] This is the simplest problem, involving only the MODIS black-sky albedo data. We assume that all grids having FVC · LAI smaller than 0.15 represent the group of samples for this problem's solution. The objective function for each spectral band is defined as

equation image

where αs,b and Rμ,λ are calculated by equations (1) and (5), and the summation is over hourly data of all the grids selected, with Ng the total number of samples. The gradients of the objective function with respect to the individual Cjμ,λ (j = 0, 1, 2) can be readily derived. Given these functions, the FFSQP provides the optimization solution of Cjμ,λ separately for visible and near-infrared bands.

3.3.2. Bare Soil Albedo ϑ Dependence

[39] This problem, still relatively simple, involves the MODIS black-sky and white-sky albedo data as well as NLDAS soil moisture outputs. The data grids are the same as in section 3.3.1. The objective function for each spectral band, direct beam or diffuse radiation, is defined as

equation image

where αs,η,λ, Rμ,λ, and Rϑ,η,λ are calculated by equations (1) and (5)–(6). Note that αs,b is first normalized by the known Rμ,λ via equation (5) to remove the μ dependence. The gradient of the objective function with respect to each individual Cjϑ,η,λ (j = 0, 1, 2) can be easily derived. Given these functions, the FFSQP provides the optimization solution of Cjϑ,η,λ separately for direct beam or diffuse radiation and visible or near-infrared band.

[40] At this point, the normalized shape functions for μ and ϑ dependence can be constructed by solving equations (7)–(8) through minimization of equations (15)–(16). These functions (Figure 1) are assumed to be identical for all LCC types, and the corresponding coefficients are listed in Table 2. Although equation (5) was solved separately for visible and near-infrared bands, the resulting normalized μ dependence shape functions (see equation (7)) are almost identical between the two. This is consistent with the fact that the soil scattering property changes little between spectral bands [Bänninger and Flühler, 2004]. Similarly, equation (6) was solved individually for direct beam or diffuse radiation and visible or near-infrared band, and the resulting normalized ϑ dependence shape functions (see equation (7) with Cϑ,η,λ = 0) closely resemble each other between direct beam and diffuse radiation for either visible or near-infrared band. This agrees with the fact that the albedo dependence on soil moisture is determined by the absorption property of water, which differs little between direct beam and diffuse radiation. These results indicate that the optimization solutions are robust. For consistency, we therefore remove the dependence of Cjμ,λ on λ and that of Cjϑ,η,λ and Cϑ,η,λ on η in the subsequent solutions. Note that α0g,η,λ and Cϑ,η,λ remain to be estimated later as a function of LCC type.

Figure 1.

The shape functions define the normalized soil albedo dependence on (a) solar zenith angle and (b) surface moisture.

Table 2. Coefficients of the Normalized Shape Functions for Soil Albedo μ and ϑ Dependence
λ | η | μ Dependence | ϑ Dependence
 diffuse  0.68614.588
 diffuse  0.65011.857

[41] We realize that the MODIS data are unreliable near dusk or dawn (θ > 70°) when atmospheric correction of the input data degrades and the BRDF models themselves grow weak. To derive the complete shape function for the μ dependence, we have used all daytime data with μ > 0.01. As such, some MODIS data used may not be realistic. This, however, may have little impact on the result, since the shape function is not sensitive for small μ where, for example, the difference between our scheme and that of Dickinson [1983] is negligible (Figure 1a).

3.3.3. Vegetation Dependence

[42] This is the most complicated problem, involving all data and all parameters of the dynamic component. To simplify the system, we solve the problem for each LCC type. All hourly samples over all grids that have the same LCC type are collected as a single group for the FFSQP to solve the corresponding vegetation-dependent parameters. Since ϕj (j = 1, 2) depend only on canopy structure and are identical for direct beam or diffuse radiation and visible or near-infrared band, we choose to solve them through the diffuse albedo because of its μ independence. To ensure consistency between the visible and near-infrared bands, a joint objective function for each LCC group is defined as

equation image

where αs,d and αd are calculated by equations (1) and (14), respectively; Nv is the total number of samples for this LCC group (see Table 1). The gradients of the objective function with respect to all individual parameters listed in the function can be derived. Important constraints include 0.001 ≤ Λd,vis ≤ 1.0, 0.001 ≤ Λd,nir ≤ 4.0, α0g,d,nir ≥ 0.15 and α0g,d,nir ≥ 1.2α0g,d,vis, α0c,d,nir ≥ 0.20 and α0c,d,nir ≥ 1.2α0c,d,vis, and 0.022 ≤ α0c,d,vis ≤ 0.07. These limits are chosen somewhat subjectively. It is reasonable to assume that the upward scattering effect is smaller in the visible than in the near-infrared band, so a larger Λ upper bound is set for the latter (values greater than 4.0 make no difference because of the exponential decay), and that canopy reflects more near-infrared radiation than bare ground, so a larger lower bound is assigned to the former. The ratio factor 1.2 is adopted from the raw data screening procedure (section 2). Given these conditions, the FFSQP provides the optimization solution of ϕj (j = 1, 2), together with initial estimates of the other parameters listed.
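The constraint set for the joint diffuse-radiation problem can be written as a simple feasibility check. The sketch below reads each two-sided near-infrared condition as a lower bound equal to the larger of the fixed floor and 1.2 times the visible value, which is an interpretation of the notation above; the dictionary keys and the candidate values are hypothetical and are not solver output.

```python
from typing import Dict


def diffuse_constraints_ok(p: Dict[str, float]) -> bool:
    """Check a candidate parameter set against the bounds quoted for the
    joint diffuse-radiation objective function."""
    checks = [
        0.001 <= p["lambda_vis"] <= 1.0,    # upward scattering, visible
        0.001 <= p["lambda_nir"] <= 4.0,    # upward scattering, near-infrared
        # Near-infrared soil albedo: at least 0.15 and at least 1.2 x visible.
        p["soil_nir"] >= max(0.15, 1.2 * p["soil_vis"]),
        # Near-infrared canopy albedo: at least 0.20 and at least 1.2 x visible.
        p["canopy_nir"] >= max(0.20, 1.2 * p["canopy_vis"]),
        0.022 <= p["canopy_vis"] <= 0.07,   # visible canopy albedo
    ]
    return all(checks)


# Hypothetical candidate that satisfies every bound.
candidate = {
    "lambda_vis": 0.5, "lambda_nir": 2.0,
    "soil_vis": 0.15, "soil_nir": 0.30,
    "canopy_vis": 0.05, "canopy_nir": 0.35,
}
print(diffuse_constraints_ok(candidate))

# Lowering the near-infrared soil albedo below 1.2 x the visible value fails.
violating = dict(candidate, soil_nir=0.16)
print(diffuse_constraints_ok(violating))
```

A check of this kind only verifies feasibility; the FFSQP enforces the same bounds internally while it minimizes the objective function.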

[43] Given the known ϕj, the final estimates of all other parameters can be obtained by solving the optimization problem separately for direct beam or diffuse radiation and visible or near-infrared band. The objective function for each LCC group is now defined as

equation image

For both direct beam and diffuse radiation, the solution begins with the visible band, where we require Cc,η,vis = 0, 0.001 ≤ Λη,vis ≤ 1.0, α0g,η,vis ≥ 0.125, 0.022 ≤ α0c,d,vis ≤ 0.07, and 0.022 ≤ α0c,b,vis ≤ 0.08. The near-infrared band is then solved with additional constraints Cc,η,nir > 0, 0.001 ≤ Λη,nir ≤ 4.0, α0g,η,nir ≥ 0.15 and α0g,η,nir ≥ 1.2α0g,η,vis, and 0.20 ≤ α0c,η,nir ≤ 0.42. Note that the upper bounds for α0c,η,λ are subjective due to the lack of observations. Sensitivity tests show that decreasing these bounds by 15% does not significantly change the outcome. Direct measurements are needed to specify realistic α0c,η,λ and their bounds.

[44] Table 3 lists all parameters that depend on LCC types. Note that the solution for α0c,η,λ is not well defined, especially for diffuse radiation in both visible and near-infrared bands where many LCC types have values at the respective numerical bounds. Several Λη,λ values for diffuse radiation are at the upper limits. The Cϑ,η,λ solution reaches the upper bound for soils beneath tall trees except Savanna. This indicates the lack of albedo dependence on surface soil moisture, as effectively sensed by the MODIS. Recall that no such correction is made over the bare soil portion for these LCC types. The dependence is also negligible for the urban and built-up land category, which seems to agree with our physical expectation since the surface materials are dominated by impervious man-made concrete and asphalt (roads, plazas, buildings). In addition, this category has soil albedos among the lowest values. Brest [1987] found that rural vegetation has higher near-infrared albedos than most urban surface materials. Our time series analysis of MODIS data further shows that urban visible albedos have smaller peaks and flatter variations than surrounding vegetated categories, although the averages are larger in the former. It is thus reasonable for urban and built-up areas to have smaller maximum background soil albedos than vegetated ones. (Here, for convenience, “soil” refers to the mixture of all ground surface (natural and impervious man-made) materials.) Other parameters are well within the numerical bounds.

Table 3. Parameters of the Dynamic Component of the New Albedo Parameterization That Depend on LCC Types
LCC | Background Soil Albedo α0g,η,λ (Dir, Vis; Dif, Vis; Dir, Nir; Dif, Nir) | Background Canopy Albedo α0c,η,λ (Dir, Vis; Dif, Vis; Dir, Nir; Dif, Nir) | Upward Scattering Coefficient Λη,λ (Dir, Vis; Dif, Vis; Dir, Nir; Dif, Nir) | Canopy Structure (ϕ1; ϕ2) | Cc,η,λ (Dir, Nir; Dif, Nir) | Cϑ,η,λ (Vis; Nir)

a Tall tree categories where Cϑ,η,λ correction is applied only in the vegetated portion of a grid; for others, the correction is applied uniformly over the whole grid. The boldface values are those equal to the imposed numerical bounds for the optimization solver.


[45] Although the model was carefully developed with full consideration of important physical processes, caution is needed against automatically drawing physical interpretations from the solution of individual parameters. As emphasized earlier, the FFSQP solution only gives the shortest path in the multiparameter space to minimize the objective function in equation (18). For each LCC type, it is the whole set of parameters listed in Table 3 that makes the model equation (9) produce the minimum variance between αη,λ and αs,η,λ as integrated over the entire group. A bias in one parameter may be compensated for by changes in others to maintain the minimization. This is an expected consequence of the optimization solution for an overdetermined multiparameter model. Furthermore, equation (9) is still a simplified model of the complex reality. Individual model components may compensate for each other in representing certain dependencies. For example, when the dependence of albedo variation range on the LCC type is removed (Cϑ,η,λ = 0), the resulting Λη,λ of the visible, direct beam reaches the upper limit for all LCC types. This indicates that Cϑ,η,λ can mimic the scattering effect of the Λη,λ term, albeit only in the case of the visible, direct beam and with a lower correlation score.

3.3.4. Localization Factor

[46] This is the final step to incorporate the remaining local contribution from distinct surface characteristics. The objective function is defined at each CWRF grid as

equation image

where Nt is the total number of samples or records for a specific grid. One constraint is 0.8 ≤ γη,λ ≤ 1.2. Similar to the preceding steps, it is important to explicitly provide the correct gradient of the objective function with respect to each parameter, here at every grid, for the FFSQP to give a robust solution.
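For intuition about this step, the single-parameter problem at one grid can be sketched in closed form. The example assumes that the localization factor multiplies the dynamic albedo and that the objective is a sum of squared differences over the temporal records; neither assumption is confirmed by the text shown here, and the closed-form solution is a stand-in for the FFSQP solver used in this study. For a one-dimensional convex quadratic, clipping the unconstrained minimizer to the bounds gives the constrained optimum.

```python
from typing import Sequence


def fit_localization_factor(
    model: Sequence[float],
    observed: Sequence[float],
    lower: float = 0.8,
    upper: float = 1.2,
) -> float:
    """Least-squares gamma minimizing sum((gamma * model - observed)**2),
    clipped to [lower, upper]. The multiplicative form is an assumption."""
    numerator = sum(m * o for m, o in zip(model, observed))
    denominator = sum(m * m for m in model)
    if denominator == 0.0:
        raise ValueError("model albedo must not be identically zero")
    gamma = numerator / denominator
    return min(max(gamma, lower), upper)


# Hypothetical records at one grid: observations are 10% above the model.
model = [0.10, 0.12, 0.15]
observed = [0.11, 0.132, 0.165]
print(fit_localization_factor(model, observed))

# Observations at twice the model value hit the upper bound.
print(fit_localization_factor(model, [2.0 * m for m in model]))
```

Grids where the clipped value equals 0.8 or 1.2 correspond to locations where the factor reaches its imposed bounds.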

[47] Figure 2 shows the γη,λ geographic distributions. A striking feature is that the localization factors closely resemble each other between direct beam and diffuse radiation for either visible or near-infrared band. The spatial pattern correlation coefficients over land areas of the entire domain are over 0.97 for both bands. Even between the visible and near-infrared bands, the general patterns are similar, except over the far northwest corner of the domain (Washington, British Columbia) dominated by evergreen needleleaf forests, where adjustments are upward in the former and downward in the latter. The result indicates that the corrections tend to be systematic. The spatial variability, however, is not trivial. Upward adjustments are applied in the southeast United States, an area of predominantly mixed forests that contains large fractions of evergreen forests. Upward adjustments also prevail over the north-central United States, covered by croplands mosaic. In contrast, downward adjustments are made in the western United States, primarily in the dry intermountain basin areas where shrublands are most prevalent but coexist with other greener vegetation types. Reductions also occur in Ontario-Quebec along the belt of evergreen needleleaf forests, which include fractions of the mixed forests that dominate on both sides. In all these regions, the respective dominant LCC type covers only a small fraction (often less than 15%) of a grid, while the remaining large portion is a mixture of other vegetation types. The result suggests that the localization factor arises partly from the use of a single dominant LCC type in a grid, so that it accounts for contributions from other types. Another significant contributor to the geographic distribution of the localization factor is spatial variability of soil moisture.

Figure 2.

Geographic distributions of the localization factor γη,λ for (a) direct beam visible band (dir, vis), (b) direct beam in the near infrared band (dir, nir), (c) diffuse radiation in the visible band (dif, vis), and (d) diffuse radiation in the near infrared band (dif, nir).

[48] Only the NLDAS/Mosaic soil moisture (mostly over the United States) is used for developing the dynamic component of the new parameterization (αη,λ). Hence the corrections over northern Canada and southern Mexico (except the U.S. borders) are made to provide the U.S.-based dynamic component with local surface characteristics, where soil moisture is replaced by the GLDAS/Noah. The GLDAS soil moisture scaling based on the two transition zones (section 2) seems effective such that discontinuity is not noticeable in Figure 2. This applies to other diagnostic fields to be discussed in the next section. In addition, γη,λ reaches either the lower (0.8) or the upper (1.2) bounds in approximately 7–9% of the total land area, most often over the Rockies and Appalachians.

4. Comparison Between Simulations and Observations

[49] Figure 3 shows correlation coefficients and relative errors of direct and diffuse albedos at the visible and near-infrared bands simulated by the old CLM albedo scheme equations (2)–(4) as compared with MODIS data for each LCC type. The correlation and bias are calculated by

equation image

where N is the total number of samples in the summation. Here N = Nv, using all samples for each LCC group. Clearly, the old scheme produces very poor results, with group correlation coefficients from −0.57 to 0.71 and relative errors of 18–140%. The skill is especially poor in the near-infrared band for both direct beam and diffuse radiation, where correlation coefficients are strongly negative or near zero for most land cover categories; the best correlation is 0.45 for shrubland diffuse albedo. Correlations generally increase in the visible band, except for wooded wetland diffuse albedo. On the other hand, relative errors are substantially greater in the visible band than in the near-infrared band. They are extremely large for wooded tundra, mixed forest, and mixed shrubland/grassland, consistently in all four albedo components.

Figure 3.

Bulk correlation coefficients (open bars) and relative errors (solid bars) of the four albedos (see Figure 2 for the convention of abbreviations) simulated by the old CLM albedo scheme as compared with MODIS data for each LCC type. A number is shown if the value exceeds the scale.

[50] Figure 4 compares the dynamic component (i.e., before the localization) of the new albedo parameterization equations (7)–(14) with MODIS data. Relative to the old scheme, the new model significantly reduces relative errors in all four albedo components for all LCC types. This is expected as a direct result of the optimization solution that minimizes the model-data deviation. Relative errors now range from 8 to 42%. The largest improvement is obtained for wooded tundra, mixed forest, and mixed shrubland/grassland, whose errors for the old scheme are excessive. More importantly, correlation coefficients are greatly enhanced for all four albedo components, especially in the near-infrared band. Correlations are now in the range of 0.39–0.88. The lowest correlations are identified with diffuse albedo in the near-infrared band, although its relative errors are among the smallest. Note that relative errors measure the systematic model bias, which could be subjectively removed by simpler methods without considering the temporal or dynamic structure. These methods would not attain high correlations, nor does the optimization solution by design ensure that this will occur. The significant correlation gains imply that the new conceptual model does capture the dominant variables that control the dynamic variations of surface albedos. This becomes more obvious below.

Figure 4.

Same as Figure 3 but simulated by the dynamic component of the new parameterization (before the localization).

[51] The localization further reduces relative errors of the new model across the board (Figure 5). The range now falls within 5–37%. The reduction is most remarkable for wooded tundra, barren or sparsely vegetated, and shrubland categories, indicating that these types contain substantial inhomogeneity. In addition, correlation coefficients are uniformly high in the range of 0.69–0.94. The improvement is more pronounced in the near-infrared band than in the visible band, suggesting more dominant local control in the former. Given that the static γη,λ does not improve the temporal or dynamic structure, the correlation increase after the localization is attributed purely to the enhanced spatial coherence of the albedo climatological means between the model and data.

Figure 5.

Same as Figure 3 but simulated by the new parameterization (after the localization).

[52] The bulk (temporal plus spatial) correlation before the localization is a composite representation of the pointwise correlation as integrated over all grids of the same LCC type. The localization reduces the overall pointwise deviation of the model values from measurements, and consequently increases the bulk correlation. The pointwise (temporal) correlation is a direct indication of the predictive skill of the dynamic component. The pointwise correlation coefficient and relative error are calculated by equation (20) using only temporal records at each grid, i.e., N = Nt. Given the data quality control, Nt varies geographically and differs between the four albedo components. Each MODIS black-sky αs,b is the direct beam albedo at each single solar angle while the white-sky αs,d is the integral of all the black-sky possibilities and thus represents the diffuse albedo under an isotropic illumination. Thus, direct albedos have approximately 12 times more samples than diffuse albedos. We assume that each 16-day composite is independent, i.e., having 1 degree of freedom. During the entire 2000–2003 analysis period, almost every grid has more than 45 composites available with good quality data. Assuming this minimum number of degrees of freedom (43), the correlation threshold for statistical significance at the 95% confidence level is 0.26. Below, we consider correlation coefficients greater than 0.26 to be statistically significant for all four albedo components.

[53] Figure 6 depicts the geographic distributions of pointwise correlation coefficients between albedos observed by MODIS and simulated by the new parameterization. Recall that the distributions are identical before and after the localization since γη,λ are static. Strikingly, correlations for direct beam albedos are greater than 0.50 almost everywhere and mostly higher than 0.75, especially in the near-infrared band. Correlations generally decrease for diffuse radiation albedos. In the visible band, large reductions occur over the western United States, primarily in the Rockies and intermountain basins, and over the southeast United States. Similar changes are found in the near-infrared band, where reductions are more severe and extensive over the western United States while much smaller in the southeast United States. This band also has substantial reductions over southern Canada. In a large portion of these areas, correlations drop below the statistical threshold. Unfortunately, we cannot at this time offer a reasonable explanation why these reductions occur and whether they are caused by input data problems or incomplete model processes. For example, we may argue that low correlations over mountainous regions could result from poor NLDAS soil moisture simulations and MODIS albedo retrieval qualities associated with the orographic shadow effect (causing biases in rainfall and solar radiation, respectively). Such an effect should, however, be seen consistently for both direct beam and diffuse radiation in the visible (mostly absorption) or near-infrared (largely reflection) band. This is not the case, since reductions are identified from direct beam to diffuse radiation.
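The statement that the pointwise correlations are identical before and after the localization follows from the invariance of the correlation coefficient under multiplication by a positive constant. The sketch below illustrates this with hypothetical records at a single grid, assuming a Pearson correlation and a multiplicative static factor; both are assumptions, since equations (14) and (20) are not reproduced here.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical temporal records at one grid.
model = [0.10, 0.14, 0.12, 0.18]
observed = [0.11, 0.13, 0.14, 0.17]

gamma = 1.15  # hypothetical static factor within the 0.8-1.2 bounds
localized = [gamma * value for value in model]

r_before = pearson(model, observed)
r_after = pearson(localized, observed)
print(r_before, r_after)
```

The two coefficients agree to rounding error, so a static factor can change the pointwise relative error but not the pointwise temporal correlation.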

Figure 6.

Geographic distributions of pointwise correlation coefficients (×100) between albedos observed by MODIS and simulated by the new parameterization (see Figure 2 for the convention of abbreviations). Values below the statistical significance threshold (26) are distinguished by the gray color.

[54] Figure 7 presents the geographic distributions of pointwise relative errors between the dynamic component of the new albedo parameterization and MODIS data. One striking feature is that the patterns of relative errors are remarkably similar between direct and diffuse albedo in either the visible or near-infrared band. The spatial pattern correlation coefficients over land areas of the entire domain are over 0.95 for both bands. In the near-infrared band, relative biases are below 30% almost everywhere and mostly smaller than 15%. In the visible band, relative errors of greater than 30% occur over most areas of southern Canada (except north of Montana-North Dakota) and the northwest United States. For all four albedo components, the localization further reduces errors as designed (Figure 8). The near-infrared band has errors under 15% almost everywhere. Although general reductions also happen in the visible band, relative errors remain high over southern Canada and northwest United States. We speculate that these large errors are likely caused by snow contamination on MODIS retrieval. Note that Canadian and/or winter data are used in the diagnosis but the majority of them are excluded from the model development. Since the MODIS data quality control is based on the dominance in a 16-day window, some “snow-free” pixels may still be contaminated. Because vegetation albedo is much smaller in the visible band than in the near-infrared band while snow albedo has a relatively small difference between the two, any snow contamination may greatly enlarge relative errors in the visible band but less so for the near-infrared band where vegetation and snow albedos are more similar. Removal of winter data from the diagnosis does largely reduce these errors.

Figure 7.

Geographic distributions of pointwise relative errors (%) between the dynamic component of the new parameterization before the localization and MODIS data (see Figure 2 for the convention of abbreviations).

Figure 8.

Same as Figure 7 but for those after the localization.

5. Parameter Sensitivity

[55] Given the excellent performance of the new model discussed above, a question arises whether all parameters are statistically significant or some could be removed to have a more stable but equally skillful scheme [Schluessel et al., 1994]. We have taken the bottom-up approach, starting from the simplest formulation with the least number of free parameters and gradually increasing the complexity by inclusion of more physically important variables, to build the new model equations (7)–(14). During the course of the development, numerous sensitivity experiments were conducted to ensure the uniqueness of each variable and parameter. Rather than comparing the full model with a simplified version (by eliminating some variables so that all parameters must be re-derived), we believe it is more important to determine how much each parameter contributes to the overall model skill. We therefore apply the complete model to conduct diagnostic analyses, each incorporating one single change of the following: (1) α0c,η,λ is set to the mean value of all LCC types; (2) α0g,η,λ is set to the mean value of all LCC types; (3) Λη,λ is set to the mean value of all LCC types; (4) Cjμ,λ = 0, no soil albedo dependence on solar zenith angle; (5) Cϑ,η,λ = 0, no LCC type correction to soil albedo dependence on surface moisture; (6) C2ϑ,η,λ = 0, no soil albedo dependence on surface moisture; (7) Cc,η,λ = 0, no canopy albedo dependence on greenness; (8) ϕ1 = 0.5, ϕ2 = 0, assuming spherically distributed leaves as in the old CLM scheme; (9) soil albedo dependence on daytime-mean cosine of solar zenith angle; and (10) soil albedo dependence on time-averaged surface moisture. Except for the specified change, all other parameters are identical and are defined in Table 3. Hereinafter, each diagnostic analysis is referred to as “E” followed by the corresponding exception number listed above. For simplicity, we adopt the gross correlation coefficient and relative error as the measure of importance. They are calculated by equation (20), where N equals the sum of Nv over all LCC types. Note that the bulk mean of each group having the same LCC type is first removed from each sample of the group. These departure values are then used in the calculation of equation (20).
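The group-mean removal that precedes the gross correlation can be sketched as follows. The example assumes a Pearson correlation and assumes that each field (model and observation) has its own LCC group mean removed; the function names and sample values are hypothetical.

```python
import math
from collections import defaultdict
from typing import Hashable, List, Sequence


def remove_group_means(values: Sequence[float],
                       groups: Sequence[Hashable]) -> List[float]:
    """Subtract from each sample the mean of its own LCC group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return [value - sums[group] / counts[group]
            for value, group in zip(values, groups)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gross_correlation(model, observed, lcc):
    """Correlation of departures from LCC group means, pooled over all groups."""
    return pearson(remove_group_means(model, lcc),
                   remove_group_means(observed, lcc))


# Hypothetical samples: group means agree, within-group variations oppose.
model = [0.1, 0.2, 0.5, 0.6]
observed = [0.2, 0.1, 0.6, 0.5]
lcc = [1, 1, 2, 2]

print(pearson(model, observed))                 # dominated by group means
print(gross_correlation(model, observed, lcc))  # within-group skill only
```

In this constructed case the raw pooled correlation is high only because the two categories have different mean albedos, while the departure-based value exposes the opposite within-group variations. Removing the group means therefore keeps differences between LCC means from inflating the gross score.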

[56] Figure 9 summarizes the diagnostic results using the dynamic component of the new parameterization (i.e., before the localization). Those of the full model and the old CLM scheme are also shown for comparison. Without a change, the new model has gross correlation coefficients of 0.57–0.71 and relative errors of 17–26%, varying with the four albedo components. This is a substantial improvement over the old CLM albedo scheme, which has gross values of 0.03–0.32 and 37–71%, respectively.

Figure 9.

Comparison of gross correlation coefficients (×100) and relative errors (%) as simulated by the old scheme (CLM) and the dynamic component of the new parameterization using the full set of the parameters (NEW) as well as sensitivity diagnoses with exceptions (see text for detail).

[57] The relative contribution or sensitivity of an individual parameter differs among the four albedo components. For direct beam albedo in the visible band, the biggest contribution to the dynamic component (as measured by the correlation score) comes from assuming nonspherically distributed leaves. If otherwise (E8), the correlation drops by 0.20 while the relative error rises by 19%. The second largest contributor is the inclusion of soil albedo dependence on solar zenith angle. Without this (E4), the correlation is reduced by 0.11 and the error is increased by 23%. The correlation difference between E9 and E4 is tiny, indicating that the dynamic albedo dependence on solar zenith angle is dominated by the diurnal cycle while the annual cycle contributes little. The inclusion of the annual cycle, however, yields a noticeable reduction in the relative error (21%), suggesting its large role in albedo magnitude. The third important factor is to account for the LCC type correction in soil albedo dependence on surface moisture. Excluding this correction (E5) causes a correlation decrease of 0.10 and a small error increase of 7%. If soil albedo dependence on surface moisture is totally removed (E6), the relative error jumps substantially by 51%, although the correlation decrease remains about the same. This indicates the importance of soil moisture in accounting for albedo spatial inhomogeneity (see below for further discussion along with the E10 result). Similarly, the Λη,λ effect is mainly on albedo inhomogeneity; the use of a uniform value independent of LCC types (E3) produces a 12% larger relative error. The least sensitivity is identified with α0c,η,λ, primarily because of its relatively small variation across the LCC types.

[58] For diffuse beam albedo in the visible band, the sensitivity to individual parameters is quite similar to that for direct albedo. One exception is that Cjμ,λ (E4) and the removal of the solar zenith angle diurnal cycle (E9) have no effect as designed. Note that for both direct and diffuse albedos, Cc,η,λ = 0 (E7) is prescribed and hence has no impact in the visible band. For direct beam albedo in the near-infrared band, parameters Cjμ,λ, Λη,λ, and C2ϑ,η,λ are among the largest contributors and have close sensitivities. If changed from the original design (E4, E3, E6), they decrease the correlation by 0.13 and increase the relative error by 16%. The remaining most sensitive parameter is ϕj; a spherical distribution assumption (E8) leads to a correlation drop of 0.08 and a relative error rise of 15%.

[59] As noted previously, the dynamic component of the new parameterization has the lowest skill for diffuse albedo in the near-infrared band, as compared with the other three albedos. The sensitivity to certain parameters is also greater. The biggest contribution is to include canopy albedo dependence on greenness. Without this (E7), the correlation is reduced by 0.37 and the error is increased by 39%. The second largest contributor is to incorporate soil albedo dependence on surface moisture. If removed (E6), the correlation drops significantly by 0.25 while the relative error rises by 23%. The result is in marked contrast to the visible band, where this soil moisture dependence contributes mostly to the relative error rather than the correlation. The third important factor is assuming nonspherically distributed leaves. If otherwise (E8), the correlation decreases by 0.07 while the relative error increases by 11%. Parameters α0g,η,λ, Λη,λ, and Cϑ,η,λ are the remaining large contributors and have close sensitivities. If changed from the original design (E2, E3, E5), they decrease the correlation by 0.05 and increase the relative error by 8%.

[60] The localization yields an overall reduction in the relative error, which falls within 11–21% for the new model using the full set of parameters (Figure 10). Not surprisingly, the localization may compensate for the loss caused by removing a dynamic factor. A large portion of the dynamic component can be mimicked by applying statistical corrections. Our long-term goal, however, is to develop the dynamic component to capture, as much as possible, the dominant physical processes that govern most temporal and spatial variations as observed, while minimizing any unexplained statistical corrections. When accomplished, such a dynamic component can be applied alone without losing noticeable skill. The existence of large differences between the results before and after the localization (Figures 9 and 10) implies that there still remains much unknown and the dynamic component of the new model can be further improved. Note that we have carefully constructed the conceptual basis of the model with full consideration of important physics to increase the likelihood that the FFSQP solution represents a large part of the true dynamic processes. The results of Figures 9 and 10 can be used to formulate hypotheses for further refinement of the dynamics to be tested through model sensitivity studies, field experiments, and more refined data analyses.

Figure 10.

Same as Figure 9 but for the new parameterization after the localization.

[61] The gross correlation coefficient and relative error are not effective measures for differentiating the relative contributions of spatial versus temporal variability of surface moisture to albedo variations. Both Figures 9 and 10 show that the differences between E10 and the full new model are very small, suggesting that the impact of soil moisture temporal variability may be negligible. This is actually not the case, but results from a cancellation of substantial soil moisture variability between all grids of the same LCC type. Figure 11 compares the spatial frequency distributions of pointwise temporal correlation coefficients and relative errors between E10, E6, and the full new model before the localization. The removal of soil moisture temporal variability (E10) has little effect on the frequency distribution of relative errors for all four albedo components. This removal, however, has pronounced impacts on correlation coefficients. For direct albedos, the shape of the frequency distribution is shifted toward smaller correlations, a manifestation of systematic reductions in temporal correlations. For diffuse albedos, the actual frequencies are generally reduced in the correlation range between 0.40 and 0.80, indicating that a greater number of points result in lower temporal correlations. In both cases, the result clearly demonstrates the importance of soil moisture temporal variability in determining dynamic albedo variations, especially over sparsely vegetated areas such as the Rockies and intermountain basins. On the other hand, a comparison between E10 and E6 shows that soil moisture spatial variability, mainly determined by local soil texture, has more profound impacts on albedo relative errors (in magnitude) than correlation coefficients (in temporal sequence). The large correlation frequency distribution changes for diffuse albedos between E10 and E6 also result from the magnitude effect caused by the use of spatially varying (in terms of surface moisture) versus constant background soil albedo. In summary, both temporal and spatial variability of soil moisture must be accounted for to adequately represent albedo variations.
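The pointwise diagnostic behind such frequency distributions can be sketched as follows. This is an illustrative outline only, assuming a Pearson correlation computed separately for the time series at each grid point and an arbitrary bin width of 0.2; the functions `pearson` and `correlation_frequencies` and the sample series are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Temporal Pearson correlation of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a series has no variability
    return sxy / math.sqrt(sxx * syy)


def correlation_frequencies(sim_by_point, obs_by_point, bin_width=0.2):
    """Fraction of grid points falling in each correlation bin on [-1, 1]."""
    nbins = round(2 / bin_width)
    counts = [0] * nbins
    valid = 0
    for sim, obs in zip(sim_by_point, obs_by_point):
        r = pearson(sim, obs)
        if math.isnan(r):
            continue  # skip points where the correlation is undefined
        index = min(int((r + 1) / bin_width), nbins - 1)
        counts[index] += 1
        valid += 1
    return [c / valid for c in counts] if valid else counts


# Two made-up grid points: one tracked well in time, one tracked in reverse.
obs_points = [[0.20, 0.24, 0.28], [0.30, 0.26, 0.22]]
sim_points = [[0.21, 0.25, 0.29], [0.22, 0.26, 0.30]]
print(correlation_frequencies(sim_points, obs_points))
```

Because each grid point contributes its own temporal correlation, opposite-signed behavior at different points shows up as separate bins in the distribution rather than canceling in a single pooled score.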

Figure 11.

Spatial frequency distributions of pointwise temporal (a–d) correlation coefficients and (e–h) relative errors for the full new model (thick solid line), E10 (thin solid line), and E6 (dotted line). See Figure 2 for other legends.

6. Discussion

[62] The new land surface albedo parameterization so developed represents a marked advance in the integration of a physically based conceptual model with satellite measurements. As compared with the MODIS data, it results in high gross correlation coefficients of 0.57–0.71 and low relative errors within 11–21%. This is a significant improvement over the existing CLM albedo scheme, which has gross values of 0.03–0.32 and 37–71%, respectively. The result, however, contains several important caveats, each of which must be rigorously assessed for its impact before the true predictive skill of the model is established.

[63] First, substantial biases in LDAS soil moisture, MODIS albedo, and other input data are expected to directly influence the result, including the validity of the derived parameters and the credibility of the model. The dynamic range of soil moisture differs greatly among LSMs, and thus its absolute value is less credible than its temporal variation. Soil albedo may also have important impacts on soil moisture simulation. Thus the albedo dependence on soil moisture may have to be redefined for coupling with different LSMs. An appealing approach is to couple the new albedo parameterization with the target LSM, run the entire host model (such as LDAS or CWRF), and repeat the optimization procedure to account for albedo-soil moisture feedbacks. This iterative inverse modeling approach will provide an optimal solution of albedo dependence on soil moisture that is most suitable for the LSM and host model chosen. This will be a focus of our future research. In addition, we have used MODIS data with quality flags of both QA = 0 and 1 to achieve sufficient sample size. The latter, having poorer quality, may contain certain snow, cloud, or other contaminations that may partially explain the poor model skill in winter and mountainous regions. The model is built upon the bold assumption that all these data are accurate. We plan to revisit the issue using improved data when available, such as the upcoming MODIS version 005 that includes both Aqua and Terra measurements for fuller retrievals and thus provides sufficient samples with QA = 0 for model development, as well as the satellite retrievals of soil moisture from the European Space Agency's SMOS mission in 2006 and NASA's Hydros mission in 2010 [Wigneron et al., 2000; Entekhabi et al., 2004; Crow et al., 2005; X. Zhan et al., Retrieving surface soil moisture from coarse resolution radiometer and fine resolution radar observations using the Kalman filter, submitted to IEEE Transactions on Geoscience and Remote Sensing, 2005].

[64] Second, the optimization solution is not unique, especially for individual model parameters. The only assurance is that it produces a set of parameters that minimizes the integrated model-data variance. It is valuable to conduct sensitivity studies by using multiple data sources (especially for soil moisture having great variability and uncertainty) as well as different data periods and optimization procedures. The statistical stability and physical robustness of the solution can be evaluated by cross examination of the prediction for one period using the model developed from another and vice versa. Similarly, the dynamic component built from one region can be evaluated for application in different regions. These tests are in progress.
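The cross examination between periods can be sketched as a two-way split-sample test. The sketch below is a simplified stand-in, not the procedure used in this study: a least-squares straight line relating albedo to soil moisture replaces the full FFSQP-optimized model, and the functions `fit_line`, `rmse`, and `cross_period_check` and all sample values are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares (intercept, slope); a stand-in for the full model fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(params, x, y):
    """Root-mean-square error of the fitted line against the samples."""
    intercept, slope = params
    squared = sum((intercept + slope * a - b) ** 2 for a, b in zip(x, y))
    return math.sqrt(squared / len(x))


def cross_period_check(period_a, period_b):
    """Fit on one period, predict the other, and vice versa."""
    fit_a = fit_line(*period_a)
    fit_b = fit_line(*period_b)
    return {
        "fit_A_predict_B": rmse(fit_a, *period_b),
        "fit_B_predict_A": rmse(fit_b, *period_a),
    }


# Made-up (soil moisture, albedo) samples for two hypothetical periods.
period_a = ([0.10, 0.20, 0.30], [0.28, 0.24, 0.20])
period_b = ([0.15, 0.25, 0.35], [0.26, 0.22, 0.18])
print(cross_period_check(period_a, period_b))
```

Errors of similar size in both directions would suggest a stable solution, whereas a large asymmetry would point to parameters that are tuned to one period.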

[65] Third, the conceptual model physics is far from complete. For example, the effects of LAI and SAI are currently treated equally. Dead leaves have negligible photosynthesis, and thus very different single scattering albedos from those of green leaves [Asner, 1998]. As highlighted by Zhou et al. [2003], this equal treatment exaggerates the SAI contribution to surface albedo, especially in winter. In addition, the model is developed for the specific CWRF domain at a 30-km grid spacing where a dominant LCC type is assumed. Thus the parameters derived for each LCC type may contain contributions from other vegetation types. This problem can be resolved when consistent, fine-resolution, and high-quality input data are available in the future.

[66] Given these and other caveats, our conceptual model solution as integrated provides the best fit to the data, with the highest correlation and lowest deviation between the two. The real advantage of this model lies in its cost-effectiveness for applications in global and regional climate models. We concur, however, with Schluessel et al. [1994] that the model is not adequate at this stage for ecological applications seeking faithful representation of plant and soil characteristic parameters. More field or laboratory measurements are needed to bound or specify some of the model parameters on physical grounds, reducing the degrees of freedom so that the optimization solver can give a more robust and meaningful estimate of the remaining parameters. Nonetheless, the conceptual model so developed and the objective procedure to solve the parameters can be more generally applied.


[67] We are grateful to two anonymous reviewers for their constructive comments. We thank AEM Design, Inc., for making the FFSQP solver freely available. This research was partially supported by the United States Department of Agriculture (USDA) UV-B Monitoring and Research Program grant to University of Illinois at Urbana-Champaign (AG CSU G-1502-5) and the China National 973 Key Project Award G19990435. Yongjiu Dai was supported through the NSFC of China under grant 40225013. The data processing was mainly conducted on the UIUC/NCSA supercomputing facility. The views expressed are those of the authors and do not necessarily reflect those of the sponsoring agencies or the Illinois State Water Survey.