Developing a spatially continuous 1 km surface albedo data set over North America from Terra MODIS products



[1] Surface albedo is an important factor governing the surface radiation budget and is critical in modeling the exchange of energy, water and carbon between the Earth's surface and the atmosphere. Global satellite observation systems have been providing surface albedo products periodically. However, due to factors associated with weather, sensors and algorithms, albedo products from satellite observations often have many gaps and low quality pixels. In the North American Moderate Resolution Imaging Spectroradiometer (MODIS) albedo products (MOD43B3) from 2000–2004, for example, only 31.3% of pixels were retrieved with fully data-driven inversions and labeled with the highest possible mandatory quality flag of “good”. Conversely, 13.3% did not contain any valid retrieval, and the remaining values vary in quality from moderate full inversions to poor retrievals utilizing a backup algorithm. This indicates considerable potential for product improvement through the use of gap-filling algorithms. Our objective is to generate spatially and temporally continuous albedo products through gap filling and filtering based on multiyear observations and high quality neighboring pixels. The resultant albedo data set substantially improves the time series of surface albedo, especially in the winter, when for some areas the MODIS albedo products may have no retrievals. A comparison with field measurements shows that the filtered albedo correlates well with the measured surface albedo, with root mean squared errors (RMSEs) of around 0.064 for snow-free pixels and 0.078 overall. The generated spectral albedo, broadband albedo and monthly albedo can be used for climate modeling and data assimilation applications.

1. Introduction

[2] Land surface albedo, the ratio of reflected radiation to incoming solar radiation, is an important biophysical descriptor of the Earth's surface. It is indispensable for understanding basic land surface processes at all scales, and as such is crucial both for biophysically based land surface modeling and for the atmospheric general circulation models used in climate simulation and weather forecasting.

[3] Satellite remote sensing has been providing useful estimates of land surface albedo at regular temporal intervals. For example, the Moderate Resolution Imaging Spectroradiometer (MODIS) provides a suite of standard albedo data sets, including the bihemispherical (white-sky albedo) and directional hemispherical (black-sky albedo) albedo (noted as MOD43B3) [Schaaf et al., 2002]. The black-sky and white-sky albedos are provided for seven spectral bands (0.47–2.1 μm) and three broad bands (0.3–0.7, 0.7–3.0, and 0.3–5.0 μm) at 1 km resolution every 16 days. The black-sky albedos are computed for the local noon solar zenith angle for each location.

[4] The standard MODIS albedo data have generally been used as a reference data set to evaluate the results from climate models [Roesch and Roeckner, 2006; Wang et al., 2004; Zhou, 2003]. Zhou et al. [2005] have used the MODIS albedo data to derive a soil albedo data set in Northern Africa and the Arabian Peninsula. However, one problem with the current albedo product is the extensive data gaps caused by cloud cover, seasonal snow and instrument problems (e.g., non-functional or noisy detectors). Roughly half of the global land surface can be obscured by cloud and snow cover on a yearly equal-angle basis [Moody et al., 2005]. Moreover, the quality of the retrieved albedo data is variable: only 31.3% of the pixels are fully data-driven inversion retrievals flagged with the highest possible mandatory quality assessment of “good quality” for the 2000–2004 North America data (Figure 5). Such variability can limit the application of albedo in land surface process simulation, climate modeling and global climate change research.

[5] One way to improve the continuity of albedo data is to use temporal or spatial filters. Mathematical filters such as simple linear interpolation, Fourier wave adjustment [Sellers et al., 1994], polynomial fitting [Cihlar et al., 1997; Karnieli et al., 2002], the asymmetric Gaussian filter [Jönsson and Eklundh, 2002], cubic-spline capping [Chen et al., 2006] and piecewise logistic function fitting [Zhang et al., 2003] can be applied to discontinuous albedo data, although the remote sensing community has mainly used them to restore vegetation index profiles. Spatial filters, e.g., the co-kriging method, tend to derive the target information using pixel-level or regional ecosystem statistical data [Dungan, 1998].

[6] A temporal interpolation technique was described by Moody et al. [2005] to fill missing or seasonally snow covered albedo data. Their method assumes that any missing pixel's temporal profile can be represented with the local regional ecosystem-dependent phenological profile. We build on this work and show that the missing pixel value can also be represented with its multiyear average. This has been demonstrated with the multiyear leaf area index (LAI) products and is the basis for the temporal and spatial filtering (TSF) technique used to produce continuous LAI data [Fang et al., 2006]. The same characteristic holds for the multiyear albedo products (see Section 3.3 below).

[7] However, the TSF algorithm cannot be used directly to process the MODIS albedo products because the albedo and LAI products differ in temporal resolution, data quality, and cloud and snow effects. Albedo is fundamentally different from LAI in several respects: (1) LAI is mostly used to represent vegetation characteristics, while albedo is a general property of the surface, covering both vegetation and the ground. (2) Albedo is more affected by ground conditions than LAI. One related problem is the difference between snow covered and snow-free surfaces: snow covered and snow-free pixels have to be separated when the filters are applied to albedo. (3) The MODIS albedo products have seven spectral bands and three broad bands for both the black-sky albedo and the white-sky albedo. Each albedo band therefore needs to be processed separately based on the band-dependent quality control (QC) layer (Table 1). (4) The MODIS albedo products have been validated against field data and relatively good quality has been reported [Jin et al., 2003; Liang et al., 2002], while algorithm improvement and validation studies are still necessary for the MODIS LAI data [Fang and Liang, 2005; Shabanov et al., 2005]. Together, these differences make the approach to generating improved albedo products different from the one developed for LAI.

Table 1. The Overall and Band Dependent Quality Control (QC) Flags of the MOD43B3 Albedo Data
| QC word | QC bits (binary = decimal) | Description |
|---|---|---|
| QC_Word1 (band independent) | 00 = 0 | Processed, good quality |
| | 01 = 1 | Processed, see other quality assessment (QA) |
| | 10 = 2 | Not processed due to cloud effect |
| | 11 = 3 | Not processed due to other effects |
| QC_Word2 (band dependent) | 0000 = 0 | RMSE good, WoD(NBAR) good, WoD(WSA) good |
| | 0001 = 1 | RMSE good, WoD(NBAR) good, WoD(WSA) moderate |
| | 0010 = 2 | RMSE good, WoD(NBAR) moderate, WoD(WSA) good |
| | 0011 = 3 | RMSE good, WoD(NBAR) moderate, WoD(WSA) moderate |
| | 0100 = 4 | RMSE moderate, WoD(NBAR) good, WoD(WSA) good |
| | 0101 = 5 | RMSE moderate, WoD(NBAR) good, WoD(WSA) moderate |
| | 0110 = 6 | RMSE moderate, WoD(NBAR) moderate, WoD(WSA) good |
| | 0111 = 7 | RMSE moderate, WoD(NBAR) moderate, WoD(WSA) moderate |
| | 1000 = 8 | Magnitude inversion (numobs >= 7) |
| | 1001 = 9 | Magnitude inversion (numobs > 3 and < 7) |
| | 1010 = 10 | Magnitude inversion (numobs <= 3) |
| | 1011 = 11 | Bus-in DB parameters (not currently used) |
| | 1111 = 15 | Fill value |
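
As an illustration of how the band dependent flags in Table 1 might be used in practice, the following Python sketch groups a 4-bit QC_Word2 value into the three retrieval classes discussed in this paper (full inversion, backup inversion, no valid retrieval). The names `Retrieval`, `classify_band_qc` and `unpack_band_qc` are hypothetical, the treatment of the unused codes 11–14 as "no valid retrieval" is an assumption, and the bit layout used to unpack a band from a packed word (band 0 in the lowest four bits) is assumed for illustration rather than taken from the product specification.

```python
from enum import Enum


class Retrieval(Enum):
    FULL = "full inversion"
    BACKUP = "backup (magnitude) inversion"
    NONE = "no valid retrieval"


def classify_band_qc(qc_word2: int) -> Retrieval:
    """Map a 4-bit band dependent QC value (Table 1) to a retrieval class."""
    if not 0 <= qc_word2 <= 15:
        raise ValueError("QC_Word2 is a 4-bit value (0-15)")
    if qc_word2 <= 7:       # 0000-0111: full inversions
        return Retrieval.FULL
    if qc_word2 <= 10:      # 1000-1010: magnitude inversions (backup algorithm)
        return Retrieval.BACKUP
    return Retrieval.NONE   # 1111 is the fill value; 11-14 grouped here by assumption


def unpack_band_qc(word: int, band: int, bits_per_band: int = 4) -> int:
    """Extract one band's flag from a packed integer.

    Assumed layout: band 0 occupies the lowest bits_per_band bits.
    """
    return (word >> (band * bits_per_band)) & ((1 << bits_per_band) - 1)
```

Keeping the classification separate from the bit unpacking means the grouping rule can be reused unchanged if the actual bit layout differs from the one assumed here.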

[8] On the basis of the above rationale, the central objective of this paper is to develop an algorithm for providing spatially and temporally continuous albedo products appropriate for data assimilation studies [Barlage et al., 2005]. The algorithm exploits and integrates both the spatial and temporal dimensions. The new data set is compared with the original MODIS products. The performance of this new method is also evaluated with field data obtained from the AmeriFlux, the SURface RADiation (SURFRAD) and the Greenland Climate Network (GC-Net) networks.

2. Overview of the Temporal and Spatial Filtering (TSF) Method

[9] The new filtering method derives value-added products based on multiyear observations. The basic principle of the TSF method is that a pixel's biophysical characteristics are relatively stable over multiple years compared with the spatial variations among different pixels. Thus a pixel's multiyear average is considered first to fill the missing gaps.

[10] In practice, the TSF method combines a background value and an instantaneous observation to generate the improved products. The background value represents a pixel's average amplitude over several years or its similarity to other pixels of the same plant functional type (PFT). The following equation was used to calculate albedo αa at point ri [Liang, 2004],

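$$\alpha_a(r_i) = \alpha_b(r_i) + \frac{\sum_{j=i-n}^{i+n} w(r_i, r_j)\,\left[\alpha_o(r_j) - \alpha_b(r_j)\right]}{\sum_{j=i-n}^{i+n} w(r_i, r_j)} \tag{1}$$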

where αb is the first guess (background), αo is a set of estimates around ri, n is the size of the influence window, and w(ri, rj) is the weighting function, which depends on the distance di,j between points ri and rj. The weighting function was designed to give more importance to the values closer to ri and takes the form

$$w(r_i, r_j) = \max\left(0,\ \frac{R^2 - d_{i,j}^2}{R^2 + d_{i,j}^2}\right) \tag{2}$$

where R is a predefined radius of influence. For the albedo application, points ri and rj represent the Julian day of year (DOY). A low value of R (e.g., R = 16) includes only a limited number of neighboring observations, none of which may be valid. A high value of R includes more points but also increases the error of the estimate. We used R = 48 to calculate w(ri, rj) and n = 2, i.e., two MODIS albedo data cycles on each side of the center, in Equation (1).
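
The combination of background and observations can be sketched in Python as follows. This is a minimal illustration, not the operational code: it assumes a Cressman-type weight, max(0, (R² − d²)/(R² + d²)), which equals 1 at zero distance and falls to 0 at d = R, and it assumes that missing observations are marked with NaN. The function names `cressman_weight` and `tsf_estimate` are hypothetical.

```python
import numpy as np


def cressman_weight(d, R=48.0):
    """Distance weight: 1 at d = 0, falling to 0 for d >= R (assumed form)."""
    d = np.asarray(d, dtype=float)
    return np.maximum(0.0, (R**2 - d**2) / (R**2 + d**2))


def tsf_estimate(doy, alpha_o, alpha_b, i, n=2, R=48.0):
    """Blend the background with observations in a window around index i.

    doy, alpha_o and alpha_b are 1-D sequences on the same 16-day grid.
    NaN in alpha_o marks a missing observation and is skipped.
    """
    doy = np.asarray(doy, dtype=float)
    alpha_o = np.asarray(alpha_o, dtype=float)
    alpha_b = np.asarray(alpha_b, dtype=float)
    lo, hi = max(0, i - n), min(len(doy), i + n + 1)
    w = cressman_weight(np.abs(doy[lo:hi] - doy[i]), R)
    resid = alpha_o[lo:hi] - alpha_b[lo:hi]     # observation minus background
    ok = np.isfinite(resid) & (w > 0)
    if not ok.any():
        return float(alpha_b[i])                # fall back to the background alone
    return float(alpha_b[i] + np.sum(w[ok] * resid[ok]) / np.sum(w[ok]))
```

With R = 48 and 16-day cycles, this weight gives 1.0 to the center, 0.8 to the adjacent cycles and about 0.38 to the cycles two steps away. With R = 16 the adjacent cycles already receive zero weight, which matches the remark that a low R includes very few neighbors.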

[11] The general process for the new filter involves several steps (Figure 1):

Figure 1. Flowchart of the albedo reprocessing and filtering algorithm.

[12] Step 1: For pixels labeled with the highest quality (e.g., from the full retrieval algorithm), their values are used as references to calculate the background multiyear average, αb, and are retained in the final product. The algorithm only processes pixels with low quality or missing values.

[13] Step 2: Pixels marked with lower QC (e.g., from the backup retrieval algorithm) are treated as observational values (αo) with uncertainties. The pixel's multiyear average calculated from the full retrieval data is treated as the background value. At least one full retrieval is necessary to calculate the pixel's “true” value with Equation (1).

[14] Step 3: If a pixel has no full retrieval over years, an ecosystem curve fitting (ECF) algorithm is applied. The background values are accumulated from the full retrieval pixels for each 1° latitude zone in a tile (1200 km × 1200 km) with the vegetation stratified according to the percentage of trees, herbaceous and bare areas for each pixel using 10% intervals of the MODIS vegetation continuous field (VCF) product [Hansen et al., 2003]. Background values are used for the pixel's missing gaps on the condition that both are of the same PFT and within the exact VCF range.

[15] Step 4: For no retrieval pixels a temporal filter is used to calculate the observational value from surrounding days having either a full or backup retrieval. In case there is no valid retrieval in the surrounding days, the corresponding background values are used in the temporal filtering process. The Savitzky-Golay (SG) filter [Chen et al., 2004; Savitzky and Golay, 1964] was used for this purpose [Fang et al., 2006], but special attention needs to be paid to days when there is snow cover or partial snow cover. Similar to Equation (1), we have considered two neighboring observations on each side (n = 2) for the SG temporal filter. Figure 2 shows an example of calculating albedo for a certain day (113). Days 17 ∼ 97 are snow covered and days 129 ∼ 209 are snow-free. If snow is present on day 113, the albedo is extrapolated from the albedos of days 81 and 97, otherwise from days 129 and 145.

Figure 2. Scheme to estimate the albedo of day 113 based on observations from neighboring days. Days 17 ∼ 97 are snow covered, while days 129 ∼ 209 are snow-free. Points A and B are the filtered values based on snow or snow-free conditions on day 113, respectively. In this case, point B is the estimated observational value since the quality control (QC) is marked as snow-free. The original MODIS albedo is shown with a plus sign.

[16] Step 5: The TSF algorithm combines the multiseasonal background and the annual observation within the synthesis window of size R (Equation 1). The process is repeated by shifting the composition window by the temporal resolution of the albedo data, i.e., 16 days. The result is a set of values centered on ri.
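
The stratification used in Step 3 can be illustrated with a short Python sketch. It groups full retrieval pixels by 1° latitude zone, PFT and 10% VCF intervals, and then averages albedo within each group. The per-group mean is a deliberate simplification standing in for the ecosystem curve fitting, and the function names `ecf_stratum` and `build_background` are hypothetical.

```python
import math
from collections import defaultdict


def ecf_stratum(lat_deg, pft, tree_pct, herb_pct, bare_pct):
    """Key that groups pixels by 1-degree latitude zone, PFT and VCF class."""
    def vcf_bin(pct):
        # Bins 0-9 for 10% intervals; a value of exactly 100% joins the top bin.
        return min(int(pct // 10), 9)
    return (math.floor(lat_deg), pft,
            vcf_bin(tree_pct), vcf_bin(herb_pct), vcf_bin(bare_pct))


def build_background(pixels):
    """pixels: iterable of (stratum_key, albedo) taken from full retrievals only.

    Returns the mean albedo per stratum (a stand-in for the fitted curve value).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for key, albedo in pixels:
        sums[key][0] += albedo
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

A pixel without any full retrieval would then look up its own stratum key in the returned dictionary, so a background value is only borrowed from pixels of the same PFT and the same VCF range.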

[17] The TSF algorithm integrates both spatial and temporal characteristics for different PFTs in order to generate improved products. A similar algorithm has been applied successfully to process the MODIS LAI products [Fang et al., 2006]. The algorithm for albedo bears some similarity to that used for LAI, but the tactics are different in applying the TSF to albedo due to the different biophysical characteristics, snow effect, QC flags, and temporal coverage between albedo and LAI.

3. Analysis of the MODIS Albedo Product

[18] Surface ecosystems are represented as patches of plant functional types (PFTs) provided in the MODIS standard products (layer 5 of MOD12). In total, there are 11 primary PFTs: evergreen needleleaf and broadleaf trees, deciduous needleleaf and broadleaf trees, shrub, grass, cereal and broadleaf crops, urban and built-up, snow and ice, barren or sparsely vegetated (Table 2). The three major plant functional types in North America are shrubs, evergreen needleleaf trees, and grasses, occupying 27.85%, 20.06% and 14.91% of the area, respectively (Table 2).

Table 2. Major Plant Functional Types (PFTs) Over North America and the MODIS Albedo Data (MOD43B3) Quality for Years 2000–2004 [a]

| PFT | Area (%) | QC_Word2 (red band): full inversion (%) | QC_Word2 (red band): backup algorithm (%) | QC_Word1: processed and good quality (%) | QC_Word1: processed with reference to other QA (%) | Average cloud coverage (%) | Average snow coverage (%) |
|---|---|---|---|---|---|---|---|
| Evergreen needleleaf tree | 20.06 | 23.1/16.5 | 65.7/11.8 | 26/18 | 63.8/12.2 | 10.1/9.9 | 40.0/34.3 |
| Evergreen broadleaf tree | 3.59 | 19.6/11 | 65.1/7.5 | 22.3/12.1 | 63.5/7.7 | 14.2/8.8 | 15.5/7.8 |
| Deciduous needleleaf tree | 0.15 | 25/21.9 | 65.5/18.1 | 27.9/23.4 | 63.7/18.9 | 8.4/10.1 | 44.3/39.4 |
| Deciduous broadleaf tree | 6.28 | 24.6/15.3 | 65.6/14.4 | 27.3/15.9 | 63.5/14.5 | 9.2/9 | 15.6/14.8 |
| Cereal crop | 6.37 | 30.1/22.4 | 63.1/18.7 | 31.6/22.7 | 62.1/19.1 | 6.3/6.8 | 23.8/27.7 |
| Broadleaf crop | 8.59 | 24.9/15.7 | 67.8/13.8 | 26.2/15.9 | 66.9/14 | 6.9/4.5 | 13.1/11.8 |
| Urban and built-up | 0.62 | 25.1/15.3 | 64/13.1 | 26.1/15.5 | 63.4/13.3 | 10.5/8.3 | 14.1/10.1 |
| Snow and ice | 6.41 | 26.9/28.3 | 32.1/19.6 | 38.1/37.7 | 27.4/21.1 | 34.5/37 | 99.4/1.5 |
| Barren or sparsely vegetated | 5.16 | 24/17.7 | 44.8/18.3 | 28.8/20.6 | 43.9/17 | 27.4/26.2 | 64.8/26.2 |
| Overall | | 28.0 | 56.7 | 31.3 | 55.4 | 13.3 | 39.3 |

[a] The table shows the percentage of pixels labeled with different Quality Control (QC) words and average cloud and snow coverage. QC_Word2 lists the red band only. The numbers separated by “/” are mean values and standard deviations, respectively.

3.1. Band Dependent Quality Assessment

[19] The operational MODIS albedo algorithm for the MOD43B3 product makes use of a kernel-driven, linear Bidirectional Reflectance Distribution Function (BRDF) model, which relies on the weighted sum of an isotropic parameter and two functions of viewing and illumination geometry to determine reflectance [Roujean et al., 1992]. The snow-free MODIS surface reflectance product (MOD09) is used to estimate the BRDF model parameters if the majority of MOD09 observations during the period are snow-free. A full retrieval of the parameters for the BRDF model is attempted if seven or more observations survive the screening process. A backup algorithm retrieval is used if fewer than seven observations survive the screening or if a robust full retrieval cannot be made. The backup algorithm uses a priori estimates of the BRDF shape for each pixel around the globe, and fits these predetermined shapes to any MOD09 observations in order to estimate the model parameters. A fill value is flagged if no observations are retrieved during a 16-day interval [Schaaf et al., 2002].

[20] The MODIS QC layer provides information about the pixel's processing status, data quality, cloud effects, and snow condition. The MODIS albedo data quality is indicated by two 32-bit words (Table 1). The first word, QC_Word1 (band independent), shows the overall albedo quality assessment. Quality value 0 indicates the highest quality, while 2 and 3 are for pixels not processed due to cloud and other effects, respectively. Quality 1 means that the pixel is processed but with reference to other quality assessment parameters (denoted as variable quality hereafter). QC_Word1 also indicates the cloud and snow conditions in the retrieval process. The second word, QC_Word2 (band dependent), specifies the processing status for each spectral albedo band. Qualities 0 ∼ 7 are all full inversions and are high quality data. Qualities 8, 9 and 10 (magnitude inversion) are from the backup algorithm, which has to specify a first guess BRDF rather than let the data entirely govern the retrieval.

[21] An example of the distribution of the band dependent QC for each plant functional type in North America is shown in Figure 3. This figure depicts the QC characteristics for the red band from the Collection 4 MODIS albedo data from 2000 to 2004. Overall, the number of pixels retrieved by the backup algorithm (56.7%) is double the number retrieved by the full inversion (28.0%), and 15.2% have no valid retrieval. Among all the PFTs, grass has the most pixels retrieved by the full inversion (36.5%) and evergreen broadleaf trees have the fewest (19.6%; Table 2). Only 32.1% of the snow and ice areas are retrieved by the backup algorithm, but this is offset by the large share of fill values (41.0%) in this PFT. Seasonally, the full inversion algorithm is applied more frequently in summer than in winter, mostly due to cloud and snow effects. In the summer, the snow and ice surfaces usually have a high percentage of full inversions, at around 79.1%. These observations motivate the development of new algorithms to improve the original albedo products. The reprocessed data will improve not only the data continuity but also the data quality.

Figure 3. MODIS albedo band dependent quality assessment for the red band in 2000–2004. The percentage of full inversion, backup algorithm, and filled values are shown with dark solid, dark dashed, and pink lines, respectively.

[22] The background value (αb) in Equation (1) is calculated for each spectral band based on the band dependent QC. Over the multiyear periods, at least one full retrieval value is necessary to calculate the background value for a pixel. Figure 4 shows the number of 16-day periods (Ψ) across North America when a background value can be calculated during 2000–2004. The number Ψ (0 ∼ 23) is calculated by:

$$\Psi = \sum_{i=1}^{23} \xi_i \tag{3}$$

where ξi = 1 if at least one full retrieval value exists for 16-day period i during the study period, and zero otherwise. For the red band, 2.80% of pixels have Ψ = 0, 24.31% have Ψ < 10 and only 4.53% have Ψ = 23. For periods with ξi = 0, the background value is derived with the ecosystem fitting algorithm (Step 3 in the above section). Full retrieval pixels (higher Ψ) are mostly concentrated in the southwest of the continent and in low latitude regions, while in other areas, such as the northeast of the continent, the Ψ value is relatively low. For other bands, the values and spatial characteristics of Ψ are similar to those of the red band.
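
The count Ψ is straightforward to compute from a boolean mask of full retrievals. The Python sketch below assumes the mask is stored as an array of shape (years, 23, rows, cols); this layout and the function name `count_background_periods` are illustrative assumptions.

```python
import numpy as np


def count_background_periods(full_retrieval):
    """full_retrieval: bool array of shape (years, 23, rows, cols).

    Returns Psi per pixel: the number of 16-day periods for which at least
    one year supplies a full retrieval.
    """
    xi = full_retrieval.any(axis=0)   # xi_i for each period and pixel
    return xi.sum(axis=0)             # Psi, between 0 and 23
```

Pixels with Ψ = 0 are the ones that would need the ecosystem-based background for every period of the year.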

Figure 4. The number of 16-day periods (out of 23) when at least one full retrieval value exists and a multiyear background value can be calculated for band 1 (657 nm) during 2000–2004. The blue dots show pixels having no full retrieval values over the study period.

3.2. Overall Quality, Cloud and Snow Coverage

[23] In the MODIS albedo Collection 4 products (MOD43B3), the band independent QC_Word1 draws on the band dependent QAs to assign an overall QA to the pixel. If all bands have full retrievals with quality flags of 0–5, QC_Word1 is set to 0; otherwise it is set to 1 (processed, see other QA). Some full retrievals come only from moderately good sampling; these are also counted as QA = 1 in QC_Word1, as are those from the backup algorithm [Schaaf et al., 2002]. Because some bands can be better than others, a pixel may be flagged as processed even though there is no retrieval for a given band.

[24] The overall quality of the Collection 4 MODIS albedo products for North America was examined from 2000 to 2004 (Figure 5). Among all the PFTs, the percentage of good quality ranges from 22.3% for evergreen broadleaf trees to 38.1% for grass and snow/ice surfaces (Table 2). Overall, only 31.3% of pixels are retrieved with processed good quality in the study period. Seasonally, the overall quality is much higher in summer than in winter, similar to the findings based on the band dependent QC. The highest percentage of good quality data was observed for snow and ice surfaces (95.2%) in the summer.

Figure 5. MODIS albedo overall quality assessment for years 2000–2004. The percentage of good quality, “no retrieval” due to cloud or other effects, and snow coverage are shown, respectively, for different plant functional types (PFT).

[25] On the basis of the overall quality assessment, about 13.3% of the data are impaired by cloud coverage and have no valid retrieval (Table 2). The evergreen needleleaf tree class is a major PFT in North America, but 10.1% of its data are not valid due to clouds. The barren or sparsely vegetated area is also frequently influenced by cloud (27.4%), second only to the snow and ice region (34.5%). In summary, about 70% of the albedo data have either variable quality or no valid retrieval. This is especially significant in the winter, when a substantial number of gaps persist.

[26] The TSF algorithm mainly targets the pixels with variable quality or covered by seasonal snow. For MODIS albedo data, a pixel is labeled as ‘snow’ if snow or ice background is present for more than half of any given 16-day period [Schaaf et al., 2002]. Figure 5 shows the snow coverage (pink line) for each PFT in North America. The snow covered surface usually results in a lower quality. The snow coverage varies between PFTs (Table 2). The broadleaf crops and urban areas are least covered by snow. The sparsely vegetated area and the shrub areas have more days with snow cover than any other vegetation type since they are mainly distributed at high latitudes in North America.

[27] The presence of snow can significantly alter surface albedo values and thus the global and local climates. Climate, weather-forecast and hydrological modelers incorporate increasingly realistic surface schemes into their models. In land dynamics models, surface albedo over snow covered regions is usually parameterized as an average of snow-free albedo and snow albedo [Milly and Shmakin, 2002] or as a combination of soil and snow albedos weighted by snow cover [Oleson et al., 2004]. For global applications, models also require a maximum snow covered albedo for all land pixels [Barlage et al., 2005]. A global maximum albedo for snow covered land has been developed from the MODIS albedo products and is used in the North American Land Data Assimilation System [Barlage et al., 2005]. Our goal is not just to improve the continuity of the albedo products, but also to include the snow days and provide researchers with both snow covered and snow-free albedo data sets. This should improve the representation of surface albedo in data assimilation studies. Albedo products from this study can be compared with snow modeling results, including both mean conditions and year-to-year variability.

[28] To conform to the ground snow condition, an additional QC constraint was imposed such that snow and snow-free pixels are treated separately over different years. The albedo snow mask was examined regardless of the pixel's quality. If a pixel is marked as “snow” or as “snow-free” throughout the MODIS observation period, it is treated as permanently snow covered or permanently snow-free, respectively. Permanent snow and ice are mainly distributed in high latitudes or mountainous areas. For seasonal snow, precautions were taken to deal with the transition period from snow to snow-free, or the reverse, in order to calculate the observation values in gaps (αo in Equation 1). A temporal filter was used to restore the phenological curve for both the snow and snow-free periods (Figure 2). Depending on the snow flag of the target day, either point A (snow covered scenario) or point B (snow-free scenario) is chosen as the observation value (Figure 2).
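
The choice between points A and B can be illustrated with a small Python sketch. For clarity it replaces the Savitzky-Golay filter with a straight line through the two nearest neighbors that share the target day's snow state, which is a simplification and not the filter used in this study. The function names `extrapolate_linear` and `estimate_gap` and the albedo values in the usage example are hypothetical.

```python
def extrapolate_linear(x0, y0, x1, y1, x):
    """Straight line through (x0, y0) and (x1, y1), evaluated at x."""
    slope = (y1 - y0) / (x1 - x0)
    return y1 + slope * (x - x1)


def estimate_gap(target_doy, target_is_snow, series):
    """Estimate albedo on target_doy from neighbors in the same snow state.

    series: list of (doy, albedo, is_snow) tuples with valid retrievals.
    Returns None when fewer than two same-state neighbors exist, in which
    case the caller would fall back to the background value.
    """
    same = [(d, a) for d, a, s in series
            if s == target_is_snow and d != target_doy]
    if len(same) < 2:
        return None
    same.sort(key=lambda p: abs(p[0] - target_doy))   # nearest first
    (xa, ya), (xb, yb) = sorted(same[:2])             # order the pair by day
    return extrapolate_linear(xa, ya, xb, yb, target_doy)


# Usage with made-up values: snow on days 81 and 97, snow-free on 129 and 145.
series = [(81, 0.60, True), (97, 0.56, True), (129, 0.20, False), (145, 0.22, False)]
point_a = estimate_gap(113, True, series)    # snow covered scenario
point_b = estimate_gap(113, False, series)   # snow-free scenario
```

Filtering on the snow state before selecting neighbors is what keeps a snow-free estimate from being pulled upward by the much brighter snow covered observations on the other side of the transition.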

3.3. Spatial and Interannual Variability

[29] The land surface albedo varies spatially by the ground cover type and temporally by the season. There is a great deal of interest in investigating both the spatial and interannual variability of albedo for major plant functional types. MODIS albedo data show considerable spatial variability related to surface characteristics [Barnsley et al., 2000; Tsvetsinskaya et al., 2002]. In contrast, Gao et al. [2005] pointed out that the interannual differences between minimum and maximum shortwave white-sky snow-free albedo are mostly less than 0.01. They also revealed that the broadband albedos averaged over 20° latitude belts vary little between consecutive years [Gao et al., 2005].

[30] The standard deviation (STD) and coefficient of variation (CV) at both temporal and spatial dimensions were calculated for different PFTs from the full inversion data. The interannual variation at the pixel level and its variation within an ecosystem at the tile level were compared (Figure 6). At the pixel level, the STD and CV were first calculated over multiple years for a single pixel and then averaged for each PFT. At the tile level, the STD and CV were first calculated within a tile for each PFT and then averaged over the study area for multiple years. The CV shows a strong seasonal character. It is higher in winter and spring and decreases during the growing season. For trees, shrubs, grasses, and crops, their CVs vary around 0.1 for most of the season but they are around 0.3 in the first quarter of the year. The highest variation in the beginning of a year coincides with the winter snow and spring thaw in North America. For most PFTs, the multiyear STD is consistently lower than the tile STD and so is the CV. The only exception is for snow and ice that have a higher CV in January and February, possibly due to different thaw dates among different years. The lower STD and CV indicate stable interannual albedo variations which are similar to those observed by Gao et al. [2005]. These findings support our assumption that the multiyear average is more representative and stable than the ecosystem average. The STD and CV were also calculated based on the overall QC (QC_Word1) and compared for the visible, near infrared and total shortwave albedos. Although they are not shown here, the results are the same as the above for the spectral bands. Therefore the pixel's multiyear average is considered first to fill the missing value of a given year.
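
The two ways of aggregating the statistics can be made concrete with a short Python sketch. It assumes the albedo values for one PFT, one band and one 16-day period are stored as an array of shape (years, rows, cols); this layout and the function names `pixel_level_cv` and `tile_level_cv` are illustrative assumptions.

```python
import numpy as np


def pixel_level_cv(albedo):
    """CV over years for each pixel, then averaged across pixels.

    albedo: array of shape (years, rows, cols); NaN marks missing values.
    """
    std = np.nanstd(albedo, axis=0)
    mean = np.nanmean(albedo, axis=0)
    return float(np.nanmean(std / mean))


def tile_level_cv(albedo):
    """CV across pixels within each year, then averaged over years."""
    flat = albedo.reshape(albedo.shape[0], -1)
    std = np.nanstd(flat, axis=1)
    mean = np.nanmean(flat, axis=1)
    return float(np.mean(std / mean))
```

For a tile whose pixels differ strongly from one another but hardly change from year to year, the pixel-level CV is close to zero while the tile-level CV is large, which is the situation that favors the multiyear average as a background.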

Figure 6. Comparison of standard deviation (STD, solid lines) and coefficient of variation (CV, dashed lines) for different pixels (pink) and PFTs. The statistics of different PFT were calculated at tile level (dark). “Pixel” are multiyear statistics for each pixel and are clustered into different PFTs for display, and “_Tile” is for each PFT. The red band (657 nm) spectral albedos with full inversion are shown. Other bands show similar characteristics.

4. Implementation of the TSF Algorithm

[31] In practice, the generic TSF algorithm implements a two-step procedure where the seven spectral albedo bands are processed first based on the band dependent QC, and second, combined together to estimate the broadband albedos in the visible, near infrared and total shortwave regions, respectively. The most recent versions of the formulae and coefficients to perform narrow band to broadband albedo transformation are given by Liang [2001] for the generic surfaces and Stroeve et al. [2004] for high albedo snow surface.
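
The second step, converting filtered spectral albedos to a broadband albedo, is a linear combination of the seven bands plus an offset. The Python sketch below shows the total shortwave case. The coefficients are those commonly quoted from Liang [2001] for MODIS and should be verified against the original publication before use; the function name `narrow_to_broadband` is hypothetical, and the snow-specific conversion of Stroeve et al. [2004] is not covered.

```python
import numpy as np

# Total shortwave coefficients for MODIS bands 1-7 as commonly quoted from
# Liang [2001]; band 6 is not used. Verify against the original before use.
SHORTWAVE_COEFFS = np.array([0.160, 0.291, 0.243, 0.116, 0.112, 0.0, 0.081])
SHORTWAVE_OFFSET = -0.0015


def narrow_to_broadband(spectral, coeffs=SHORTWAVE_COEFFS, offset=SHORTWAVE_OFFSET):
    """Linear narrowband-to-broadband conversion.

    spectral: array whose last axis holds the albedos of bands 1-7.
    """
    spectral = np.asarray(spectral, dtype=float)
    return spectral @ coeffs + offset
```

Passing the coefficients as arguments lets the same function serve the visible and near infrared broad bands once their coefficient sets are supplied.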

[32] The background albedo values were generated in advance. The multiyear average values were computed separately for snow covered and snow-free conditions, based on the pixel's snow flag, so that the final product represents realistic ground conditions, either snow or snow-free. As discussed earlier, if a pixel has no full retrieval over the study period, an ecosystem fitting algorithm is used to calculate the background value of the pixel. Once the background value is prepared, the next step is fast and easy to implement. This computational efficiency makes the calculation attractive for continental and global processing. In an operational environment, the multiyear average would be updated periodically, e.g., every five years, to take into account land surface changes such as those resulting from disturbance (e.g., fire, insects, droughts, and flooding).
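
A minimal Python sketch of the background calculation for a single pixel and band is given below. It assumes three arrays of shape (years, periods) holding the albedo, a full retrieval mask and a snow flag; the array layout and the function name `multiyear_background` are assumptions made for illustration.

```python
import numpy as np


def multiyear_background(albedo, is_full, is_snow):
    """Multiyear mean of full retrievals, split by snow state.

    All inputs have shape (years, periods) for one pixel and one band.
    Returns (snow_bg, snow_free_bg), each of shape (periods,), with NaN
    where no full retrieval exists for that period and snow state.
    """
    def mean_where(mask):
        total = np.where(mask, albedo, 0.0).sum(axis=0)
        count = mask.sum(axis=0)
        return np.divide(total, count,
                         out=np.full(total.shape, np.nan), where=count > 0)

    return mean_where(is_full & is_snow), mean_where(is_full & ~is_snow)
```

The NaN entries mark the periods for which the ecosystem-based background would have to be substituted.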

[33] Applying the TSF to albedo presents difficulties because of the temporal resolution, data quality, snow effect and large data volume. In this framework, we implemented several modifications to the original TSF algorithm in order to deal with the problems posed by albedo data. Some measures were taken to improve the computational efficiency and handle the huge data volumes. For example, the algorithm was tuned to process terrestrial land pixels only, reducing the data volume by more than 50% for North America. The black-sky albedos were calculated at this stage. Moreover, the original hierarchical data format (HDF) files were transformed into general binary format for fast processing.

[34] The treatment of snow and snow-free pixels in TSF is significantly different (Figure 2). In the TSF algorithm, the MODIS snow mask was used to identify the surface snow cover, but some bright surface and cloud pixels may be misclassified as clear-sky snow covered pixels under some circumstances. In addition to the MODIS QC flag, some other criteria were therefore used to properly separate snow and snow-free pixels. For example, urban and built-up areas in the low latitudes (<20°) and in the midlatitude (<50°) summer were treated as snow-free even if they were labeled as snow covered.
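
The urban override described above can be written as a small rule. In the Python sketch below, the summer window (DOY 152–243, roughly June to August) is an assumption, since the exact definition of summer is not given here, and the function name `effective_snow_flag` and the PFT label string are hypothetical.

```python
URBAN = "urban and built-up"


def effective_snow_flag(snow_flag, pft, lat_deg, doy, summer=(152, 243)):
    """Override the MODIS snow flag where snow is implausible.

    summer is an assumed (first_doy, last_doy) window for North America.
    """
    in_summer = summer[0] <= doy <= summer[1]
    if pft == URBAN and (lat_deg < 20 or (lat_deg < 50 and in_summer)):
        return False          # treat as snow-free regardless of the label
    return bool(snow_flag)
```

For example, an urban pixel at 40°N labeled as snow on DOY 200 would be treated as snow-free, while the same pixel on DOY 17 would keep its snow label.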

[35] For the broadband albedos, the overall QC was originally implemented in the TSF algorithm since there is no specific QC for the three broadband albedos. The TSF algorithm provided the capability to generate continuous albedo products directly based on the overall QC. Nevertheless, the analytical relationship between the narrow and broad band albedos allows us to avoid using the overall QC for the present exercise. In fact, the broadband albedos generated based on either the band dependent or the overall QC are very similar.

5. Results

5.1. Spectral Albedo and Climatology

[36] The new filter was applied to process each pixel of the black-sky albedo data sets and resulted in a spatially continuous land surface albedo data set. The original MODIS seven spectral and three broadband albedo products were processed separately. Figure 7 shows an example of the original MODIS spectral albedo data and the improved products for band 2 (864 nm) and 5 (1243 nm) on days 209–224, 2001. The spectral albedo values are compressed to 0 ∼ 0.5. Values outside this range are set to white, for example for Greenland, where the spectral albedo could be higher than 0.9 and 0.6 for the 864 nm and 1243 nm, respectively, because of permanent snow and ice. The pixels processed with the new filter are shown in Figure 7e with grey or dark colors. The white color (70.78%) identifies the pixels with the full inversion which were not processed with the TSF filter, and are very similar for both band 2 and 5 in this case. Grey pixels (20.07%) indicate that the values in Figures 7a and 7c are from the backup algorithm. Dark pixels (8.55%) denote that there is no valid inversion.

Figure 7.

Comparison of MODIS spectral albedo before and after filtering (DOY 209–224, 2001). (a) Band 2 (864 nm), before; (b) Band 2, after; (c) Band 5 (1243 nm), before; (d) Band 5, after; and (e) MODIS albedo band 2 QA mask, with white, grey and dark representing full, backup and no retrievals, respectively. The band dependent QCs for band 2 and 5 are nearly the same in this case.

[37] Our new value-added products (Figures 7b and 7d) are spatially complete and greatly improve the data quality and utility, especially in areas where the majority of pixels are derived from the backup algorithm or have no valid retrievals. Improvements are observed in Alaska and the southeastern United States, with more continuous values corresponding with landscapes. Over the whole land surface area, the mean albedos for 864 nm and 1243 nm are 0.292 and 0.282, respectively.

[38] Figure 8 displays the albedo climatology of two spectral bands (657 nm and 864 nm) before and after filtering for the year 2001. For the original MODIS albedo, pixels with the full inversion were computed (Figure 7). To generate the 2001 product, the original MODIS albedo at the end of 2000 (DOY 305, 321, 337, and 353) and the beginning of 2002 (DOY 001, 017, 033, and 049) were also included in the temporal filtering in order to get a better description of the winter conditions whether snow covered or snow-free. After the filtering, the new products have improved the quality of the original data products by filling gaps and smoothing spikes (Figure 8). For vegetated areas, the albedo climatology displays a clear seasonal characteristic, where the spring thaw and leaf emergence reduce (increase) the red (NIR) spectral albedo, and the leaf senescence and the beginning and end of the season result in an albedo increase (decrease) in the red (NIR) band. Winter snow adds to the spectral albedos in both red and NIR bands. The deciduous broadleaf trees are less affected by snow because they are predominantly distributed at lower latitudes and have a lower (higher) value in the summer for the red (NIR) band. The evergreen broadleaf trees in North America have a relatively stable value over the season with little increase in summer. Shrubs are mainly located at latitudes higher than 50°N. The albedo time series for shrubs tends to descend in early May, have a relatively short constant period, and rise higher than the original MODIS data in winter due to snow cover. The profile for barren areas is similar to that of shrubs. The albedos for permanent snow and ice are very high throughout the year.
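The padded input series described above (four composites from the end of the previous year and four from the start of the following year) can be assembled as follows. This is a sketch under the assumption that composites start every 16 days from DOY 1; the function names are illustrative.

```python
def composite_doys():
    """Start days of the 16-day composites in one year: 1, 17, ..., 353."""
    return list(range(1, 366, 16))


def extended_series(year, pad=4):
    """List (year, doy) pairs for a target year padded on both sides."""
    doys = composite_doys()
    before = [(year - 1, d) for d in doys[-pad:]]  # end of the previous year
    current = [(year, d) for d in doys]
    after = [(year + 1, d) for d in doys[:pad]]    # start of the next year
    return before + current + after
```

For 2001 this gives 31 dates, beginning at DOY 305 of 2000 and ending at DOY 049 of 2002, so the winter composites of the target year have neighbors on both sides in the temporal filter.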

Figure 8.

MODIS band 1 (657 nm) and 2 (864 nm) spectral albedo climatology in 2001. The dark and pink lines are before and after the filtering processing, respectively.

5.2. Reprocessed Broadband Albedo and Climatology

[39] The MODIS broadband visible, near-infrared and total shortwave albedo data were calculated with the coefficients given by Liang [2001] and Stroeve et al. [2004]. Figure 9 shows an example of the total shortwave albedo on days 97–112, 2001 before and after filtering. Since there is no specific QA for the broadband albedos, the overall QA was explored. For this period, about 51.88% of the pixels that were processed had a variable quality (QC_Word1 = 1) and 2.4% of the pixels had no valid inversion (dark areas in Figure 9c). During this period, large areas in the high latitudes are still covered by snow, which is reflected in the higher total shortwave (TSW) albedo (>0.3) in these areas. The dark pixels in Figure 9a correspond to the areas with no valid retrievals. Compared with the original MODIS albedo map, the improvement in spatial coverage is obvious, especially in the high latitude regions.
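The narrowband-to-broadband conversion is a linear combination of the spectral albedos. The sketch below is illustrative only: the shortwave coefficients are those commonly quoted from Liang [2001] for MODIS bands 1–5 and 7 and should be checked against the cited paper before use; the snow-specific coefficients of Stroeve et al. [2004] are not reproduced here, and the function name is an assumption.

```python
# Shortwave coefficients as commonly quoted from Liang [2001];
# verify against the original paper before use.
LIANG_SW = {1: 0.160, 2: 0.291, 3: 0.243, 4: 0.116, 5: 0.112, 7: 0.081}
LIANG_SW_OFFSET = -0.0015


def narrow_to_broadband(spectral, coeffs, offset):
    """Linear combination of spectral albedos.

    spectral : dict mapping MODIS band number to spectral albedo
    coeffs   : dict mapping band number to its weight
    offset   : constant term of the conversion
    """
    return offset + sum(c * spectral[band] for band, c in coeffs.items())
```

Because the conversion is analytical, a broadband value inherits the quality of the spectral bands that enter the sum, which is why band dependent QC can stand in for the overall QC.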

Figure 9.

North America total shortwave (TSW) black-sky albedo (DOY 97–112, 2001). (a) The original MODIS albedo; (b) Derived with the new filter in this study; and (c) The albedo overall QA mask. White, grey, and dark colors represent the good, variable, and no retrieval areas, respectively.

[40] The climatology of the broadband albedo products was calculated based on different PFTs. Figure 10 shows the climatology of total shortwave albedo for 2001. The albedo profiles show a strong seasonal response with albedo reaching the highest values around the winter and spring seasons due to snow coverage during this period. Evergreen broadleaf trees, nearly not affected by snow, show the smallest seasonal variation from 0.133 to 0.193.
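A climatology by plant functional type amounts to averaging, for each composite date, the albedo of all pixels that share a PFT label. A minimal sketch, assuming per-pixel lists and `None` for missing values (both are illustrative choices, not the data layout used in this study):

```python
from collections import defaultdict


def pft_climatology(albedo_by_date, pft):
    """Mean albedo per PFT for each composite date.

    albedo_by_date : dict mapping DOY to a list of per-pixel albedos
                     (None marks a missing value)
    pft            : list of per-pixel PFT labels, same pixel order
    """
    result = {}
    for doy, values in albedo_by_date.items():
        sums = defaultdict(float)
        counts = defaultdict(int)
        for label, value in zip(pft, values):
            if value is not None:
                sums[label] += value
                counts[label] += 1
        result[doy] = {label: sums[label] / counts[label] for label in counts}
    return result
```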

Figure 10.

MODIS total shortwave (TSW) albedo climatology before (dark lines) and after (pink lines) filtering process. The solid and dashed dark lines represent the broadband albedos with good and variable quality flags, respectively. Year, 2001.

[41] Some obvious differences are observed before and after filtering (Figure 10). For illustrative purposes, Figure 10 also shows the mean values of the original albedo that are marked with a “variable” quality (QC_Word1 = 1, dark dashed line in the figure). The filtered TSW albedo tends to converge with the good MODIS albedo values in summer and fall, but converges with the “variable” quality values in winter and spring. This characteristic demonstrates the unique nature of our “true” albedo signature over the year. The higher values in the beginning and end of the year are mainly caused by the original “variable” data that are covered by seasonal snow. These data represent the real condition of the ground and are treated as the “true” ground albedo.

5.3. Monthly Spectral Albedo and Climatology

[42] The monthly spectral albedo was also calculated for year 2001. Examples of the 657 nm and 864 nm spectral albedos for February and July, 2001 are depicted in Figure 11. The February figure is calculated from the mean spectral albedo values on days 33 and 49, while the July figure is calculated from days 193 and 209. With monthly compositing, the MODIS albedo images are visually more continuous than the individual 16-day ones. An example of the MODIS albedo in February is shown in Figure 11a and the TSF-processed counterpart in Figure 11b. In February (Figure 11b), the 657 nm spectral albedo ranges from 0 to 1.0 for most of the continent, with a mean value around 0.33. The geographic gradient of the spectral albedo is clear, with lower values in the southern part of the continent and higher values in the northern part. The yellow and red colors represent the pixels that are mostly covered by snow. At very high latitudes, there is no albedo data due to the polar night. In this month, the original MODIS data (Figure 11a) are higher than the TSF results in the middle of the continent and in the sparsely vegetated high latitude zones (by about 0.1), while the TSF results are higher sporadically, e.g., in the southern part of the Great Lakes.
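The monthly value is the pixel-wise mean of the 16-day composites assigned to that month (for example days 33 and 49 for February). A minimal sketch; skipping missing values marked as `None` is an assumption made here so that the function also works on unfilled data.

```python
def monthly_mean(composites):
    """Pixel-wise mean of several 16-day composites.

    composites : list of equally long per-pixel albedo lists
                 (None marks a missing value)
    """
    out = []
    for pixel_values in zip(*composites):
        valid = [v for v in pixel_values if v is not None]
        out.append(sum(valid) / len(valid) if valid else None)
    return out
```

A pixel that is missing in every composite of the month, as during the polar night, stays missing in the monthly product.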

Figure 11.

Monthly spectral albedo (a) from the MODIS products and (b) after TSF processing, both for the red band (657 nm) in February 2001. (c) The TSF spectral albedo in the near infrared (864 nm) band in July 2001. (d) MODIS red band QA mask of day 49, 2001, with white, grey and dark representing full, backup and no retrievals, respectively.

[43] In July, the MODIS and TSF results are alike, illustrating that the data quality of the original products is very good. The spectral albedo in the near infrared (NIR) band is shown in Figure 11c. In this month, the spectral albedo for the NIR band is more homogeneous spatially than in the winter. The values range from 0.2 to 0.4 with a mean value around 0.28 for most of the continent except in Greenland. In the Midwest US, the yellow and red patches represent albedo values around 0.35 to 0.50, corresponding to continuous crop areas in this region.

[44] The climatology of the monthly spectral albedo for the first four spectral bands is shown in Figure 12. The filtered spectral albedo displays a typical albedo phenology. The red, blue and green bands show very similar phenology for all plant functional types. In contrast, the near infrared band (864 nm) exhibits a different pattern, which is higher than the red, blue and green bands in the summer for all vegetation types. The winter and spring seasons are marked by higher albedo, except for the evergreen broadleaf trees, due to vegetation phenology and seasonal snow coverage. The shrublands, barren or sparsely vegetated areas are the most variable PFT classes for albedo. The permanent snow and ice show a constant high albedo over the season.

Figure 12.

Climatology of the integrated monthly albedo for North America, 2001. The first four spectral bands (657 nm, 864 nm, 460 nm, and 550 nm, respectively) are shown.

[45] The monthly spectral albedos were compared with those calculated from the original MODIS data (day 273 was used for September and October) at different latitude bands (Figure 13). Similar to the 16-day products, a vivid phenological cycle is revealed by all four spectral bands shown in the figure. The spectral albedos are generally higher in the winter than in the summer because of snow cover, vegetation senescence and defoliation. The lowest values are usually observed in July and August. The TSF curves show a similar but smoother pattern compared with the MODIS monthly profiles. Their values are comparable with each other, but some differences can be noticed in high latitude bands (e.g., 60°–70°). In the winter of the 60°–70° zone, the TSF values are higher than the MODIS ones, for example, by about 0.05 in the NIR band and 0.14 in the red band in December. Leaf senescence and winter snow usually result in an albedo increase. This has been revealed by the TSF processing, which removes the low quality pixels and fills in values according to adjacent snowy pixels. In the 30°–40° latitude band, the spectral albedo is more stable over the year than in the other latitude bands. In the 40°–50° zone, the TSF albedo is lower than the original MODIS data by about 0.1–0.2 in both January and February. The difference is explained by the large snow coverage and the backup retrieval algorithm used in the original products. For example, 69.1% of pixels are retrieved by the backup algorithm and 10.9% have no valid retrieval on day 49, 2001 (Figure 11d). It can also be attributed to the higher interannual variability in the spring season (Figure 6), which leads to larger uncertainties in the estimation of multiyear background values.

Figure 13.

Comparison of the mean monthly spectral albedo between the TSF results (pink) and the original MODIS products (dark) for different latitude bands in 2001. (a)∼(d) are for the first four spectral bands centered at 657 nm, 864 nm, 460 nm, and 550 nm, respectively.

5.4. Comparison With Field Measurements

[46] Seventeen field measurement sites in North America were selected for comparison (Table 3). The first seven sites are from the AmeriFlux network, part of FLUXNET, which offers regional and global energy exchange observations at ecosystem level. The next six sites are from the SURface RADiation budget observing network (SURFRAD), which measures downward and upward components of broadband solar irradiance [Augustine et al., 2000]. The last four sites (CP1, JAR1, JAR2, and Swiss Camp) provide field measurements carried out on the Greenland permanent snow and ice surfaces [Steffen and Box, 2001; Stroeve et al., 2004]. We derived the surface total shortwave albedos (0.4–2.5 μm) from the measured downward and upward radiation at local solar noon for each AmeriFlux and SURFRAD site. For a better comparison with the MODIS albedos, the median values of every 16-day measurement period were calculated and compared. Following data quality screening, a time series of the original MODIS albedo from the single pixel overlying each measurement point was extracted throughout the study period.
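The field albedo processing described above (ratio of upward to downward shortwave radiation at local solar noon, then a median over each 16-day period) can be sketched as follows. The function names, the dictionary layout and the alignment of the windows to DOY 1 are assumptions for illustration.

```python
from statistics import median


def noon_albedo(upward, downward):
    """Albedo as the ratio of upward to downward shortwave radiation."""
    if upward is None or downward is None or downward <= 0:
        return None  # no usable measurement
    return upward / downward


def sixteen_day_medians(daily_albedo, start_doy=1, window=16):
    """Median albedo for each 16-day window.

    daily_albedo : dict mapping DOY to the noon albedo (None if missing)
    Returns a dict mapping the first DOY of each window to the median.
    """
    out = {}
    for start in range(start_doy, 366, window):
        values = [
            daily_albedo[d]
            for d in range(start, start + window)
            if daily_albedo.get(d) is not None
        ]
        if values:
            out[start] = median(values)
    return out
```

The median keeps a single anomalous day, for example one day of fresh snow, from dominating the value compared with the 16-day satellite composite.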

Table 3. Albedo Field Measurement Points Used in the Comparison of MODIS Albedo and New Filtered Albedo Data Sets
| No. | Site | Latitude | Longitude | Plant functional type |
|---|---|---|---|---|
| 1 | Blackhills | 44.16 | −103.65 | Evergreen needleleaf tree |
| 2 | Lost Creek | 46.08 | −89.98 | Deciduous broadleaf tree |
| 3 | Mead (irrigated) | 41.10 | −96.29 | Broadleaf crop |
| 4 | Mead (rainfed) | 41.40 | −96.44 | Broadleaf crop |
| 5 | MMSF | 39.32 | −86.41 | Deciduous broadleaf tree |
| 6 | Niwot | 40.03 | −105.55 | Evergreen needleleaf tree |
| 7 | Willow Creek | 45.91 | −90.08 | Deciduous broadleaf tree |
| 8 | Bondville | 40.05 | −88.37 | Broadleaf crop |
| 10 | Desert Rock | 36.63 | −116.02 | Shrub |
| 11 | Fort Peck | 48.31 | −105.10 | Grass |
| 12 | Goodwin Creek | 34.25 | −89.87 | Broadleaf crop |
| 13 | Penn State | 40.72 | −77.93 | Deciduous broadleaf tree |
| 14 | CP1 | 69.88 | −46.98 | Snow and ice |
| 15 | JAR1 | 69.50 | −49.68 | Snow and ice |
| 16 | JAR2 | 69.42 | −50.06 | Snow and ice |
| 17 | Swiss Camp | 69.57 | −49.30 | Snow and ice |

[47] Figure 14 plots the filtered albedo (TSF) against the annual time series of the ground observed albedos in the shortwave band for the Morgan Monroe State Forest (MMSF) and the Penn State (PS) sites. The original MODIS black-sky albedo (MOD43B3) is also plotted with hollow triangles. The original MODIS albedo data and the reprocessed data are consistent with field measurements. Seasonal variation in albedo is well characterized by the remotely sensed albedos. Timing of green-up and leaf senescence is consistent between the MODIS and TSF albedos. There are some deviations at the two sites during the growing season between field and satellite observations; however, the magnitude of the differences is very small (<0.02).

Figure 14.

Surface total shortwave albedos observed from two stations in year 2001: (a) the Morgan Monroe State Forest; and (b) Penn State, PA. The dashed and solid lines represent the measured daily and 16-day median values, respectively. The solid circle and hollow triangle refer to the TSF and the original MODIS albedos, respectively.

[48] The TSF algorithm provides winter albedos when there is no MODIS data. Some discrepancies are noted for these two sites in the winter: the mean differences between the TSF albedo and the 16-day median values in January are 0.02 and 0.25 for the MMSF and the PS sites, respectively. In this month, the MMSF and the PS sites are snow-free and snow covered, respectively, as indicated by the MODIS QC mask. Spatial variability and the scale difference between point measurements and the 1 km MODIS pixels may be the cause of the deviation.

[49] The MODIS albedo products and the TSF results were acquired for comparison over all the field measurement sites (Figure 15). In total, 343 measurement points were used in the comparison. Overall, the field-measured total shortwave albedo ranged from 0.077 to 0.965 over the year. The corresponding MODIS albedo data ranged from zero to 0.84, and were moderately well correlated with the field estimates (R2 = 0.825, RMSE = 0.082, Figures 15e and 15f). The TSF results had an RMSE against the field measurements (0.089) very similar to that of the MODIS albedo, but a stronger correlation with the field measurements (R2 = 0.874). If only snow-free pixels are considered, the TSF results improved significantly over the original MODIS products (Figures 15a and 15b): the R2 increased from 0.525 to 0.614 and the RMSE decreased from 0.074 to 0.066. For the permanent and seasonal snow and ice surfaces, the TSF algorithm gives slightly better results than the MODIS albedo (Figure 15c). The TSF algorithm has also successfully retrieved the albedo for those snow and ice pixels where MODIS has no retrieval (Figure 15d).
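The two agreement statistics used in this comparison can be computed as below. This is a sketch: R2 is taken here as the squared Pearson correlation coefficient between satellite and field albedo, which is an assumption about the definition used.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between satellite and field albedo."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def r_squared(predicted, observed):
    """Squared Pearson correlation coefficient (assumed definition of R2)."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)
```

Under this definition a constant bias leaves R2 unchanged but raises the RMSE, so a data set can show a higher R2 together with a similar or slightly larger RMSE.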

Figure 15.

Comparison of the new filtered albedo and MODIS albedo with field measured shortwave albedo. (a) and (b) compare MODIS and TSF albedo, respectively, for the snow free pixels. (c) and (d) are for the snow and ice surfaces including permanent snow and ice in Greenland and seasonal snow covered pixels. (c) compares the paired MODIS and TSF albedo. (d) is for the TSF albedo only, where MODIS has no retrieval. (e) and (f) are for all snow free and snow covered pixels. See Table 3 for the sites.

[50] The performance of the TSF algorithm can also be examined under different MODIS band independent QCs. Table 4 compares the MODIS and TSF albedos with field data for both good and variable quality pixels. In general, the TSF results agree much better with field measurements than the original MODIS data. For example, the R2 improved from 0.831 to 0.893 for all pixels after the TSF processing. The improvement is even more obvious for the variable quality pixels (QC_Word1 = 1), for which the R2 improved from 0.595 to 0.734. For the snow free pixels, the TSF performs very well for the variable quality pixels. However, there is no obvious quality improvement for the snow and ice covered surfaces based on QC_Word1. This is mainly attributed to the spatial variability within a pixel and the scale mismatch between satellite and field observations. Nevertheless, as noted above, the TSF data have filled pixels that have no valid MODIS retrieval.

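Table 4. Comparison of Field Measured Total Shortwave Albedo With the Original MODIS Data and the TSF-Processed Data (a)

| QC | Data | Snow free R2 | Snow free RMSE | Snow covered R2 | Snow covered RMSE | All data R2 | All data RMSE |
|---|---|---|---|---|---|---|---|
| QC_Word1 = 1 | MODIS | 0.474 | 0.085 | 0.69 | 0.171 | 0.595 | 0.092 |
| QC_Word1 = 1 | TSF data | 0.614 | 0.073 | 0.605 | 0.198 | 0.734 | 0.09 |
| QC_Word1 = 0 or 1 | MODIS | 0.547 | 0.072 | 0.71 | 0.12 | 0.831 | 0.081 |
| QC_Word1 = 0 or 1 | TSF data | 0.647 | 0.064 | 0.695 | 0.117 | 0.893 | 0.078 |

(a) See Table 3 for the field measurement sites.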

6. Summary

[51] A sophisticated temporal and spatial filtering (TSF) algorithm was developed to fill the gaps and provide continuous “true” albedo products. This algorithm makes use of the multiyear average to represent the background value of the missing pixels, since a pixel's interannual variation is usually less than the variation within an ecosystem.
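The background idea alone, a per-date multiyear mean used where a retrieval is missing, can be illustrated with a short sketch. It is not the TSF algorithm, which also applies quality screening and spatial and temporal filtering; the function names and the use of `None` for gaps are assumptions.

```python
def multiyear_background(series_by_year):
    """Per-date mean across years for one pixel.

    series_by_year : dict mapping year to a list of albedos, one per
                     composite date (None marks a missing retrieval)
    """
    n_dates = len(next(iter(series_by_year.values())))
    background = []
    for i in range(n_dates):
        values = [s[i] for s in series_by_year.values() if s[i] is not None]
        background.append(sum(values) / len(values) if values else None)
    return background


def fill_gaps(series, background):
    """Replace missing values in one year with the multiyear background."""
    return [b if v is None else v for v, b in zip(series, background)]
```

A date with no valid retrieval in any year stays empty after this step, which is where information from high quality neighboring pixels is needed.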

[52] Value-added spectral and broadband albedo products were generated from the standard MODIS products over North America. Both low quality data and missing days were filled with the new products. The reconstructed albedo data set improved the MODIS albedo in both temporal and spatial coverage. The continuous data set would be appropriate for data assimilation studies. The climatologies were calculated for both spectral and broadband albedos based on different plant functional types. The climatology shows seasonal albedo dynamics and would be useful for climatic modeling.

[53] Validation results using field measurements show that the filtered albedo captures the full extent of seasonal albedo variation, especially in the winter season. In addition, the TSF albedo results are also quantitatively sound. Compared with field measurements, the R2 and RMSEs are 0.614 and 0.066 for snow-free pixels and 0.874 and 0.089 for all pixels. While these results use Terra data, this algorithm can be equally applied to MODIS albedo products from both Terra and Aqua globally.


[54] The work was partially supported by NASA under NNG04GL85G. We thank the FLUXNET and SURFRAD networks, which made the surface flux measurements available, and the Greenland Climate Network (GC-Net), which provided the albedo data for Greenland. We also thank the support team at the Land Processes Distributed Active Archive Center (DAAC), who helped set up the Machine-to-Machine Search and Order Gateway (MTMGW), which greatly facilitated the MODIS data downloading from the EROS Data Center (EDC). The filtered albedo data sets are available from the University of Maryland Global Land Cover Facility. The anonymous reviewers all made valuable comments that improved this manuscript.