2.1. MODIS and Model Albedo
 We use two years of the MODIS broadband albedo data (Collection 4) for the visible (VIS, 0.4–0.7 μm) and near-infrared (NIR, 0.7–5.0 μm) bands, from September 2000 to August 2002, at 0.05° resolution. The MODIS albedo was generated by a semiempirical, kernel-driven, linear bidirectional reflectance distribution function model [Schaaf et al., 2002]. This model relies on the weighted sum of three parameters retrieved from the multidate, multiangular, cloud-free, atmospherically corrected surface reflectances at 1-km resolution acquired by MODIS over a 16-day period. The MODIS albedos represent the best quality retrieval possible over each 16-day period and consist of local noon black-sky (direct) and white-sky (wholly diffuse) albedos. Because the white-sky albedos vary spatially in the same way as the black-sky albedos, only white-sky results are shown in this paper.
 Model VIS and NIR albedos were produced from the latest version of CLM2 (version 2.02) coupled to the Community Atmosphere Model, using observed sea surface temperature from 1979 to 1989. CLM2 is the land surface parameterization used with the Community Climate System Model at about 2.8° × 2.8° resolution [Blackmon et al., 2001]. Each model grid cell is divided into four primary land cover types: glacier, lake, wetland, and vegetation. The vegetated portion of a grid cell is further divided into patches of up to 4 of the model's 15 PFTs, each with its own leaf and stem area index and leaf optical properties. Albedo at each grid cell is calculated as the sum of the albedos of its land cover types, weighted by their fractions. Details about the model albedo can be found in Oleson et al. Here we use only the model diffuse albedos (comparable to the MODIS white-sky albedos).
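 The fraction-weighted sum described above can be sketched as follows. This is only an illustration, not the CLM2 code: the function name `grid_cell_albedo` and all numerical values are hypothetical, and the sketch assumes the four land cover fractions of a grid cell sum to 1.

```python
def grid_cell_albedo(fractions, albedos):
    """Fraction-weighted sum of land cover albedos for one grid cell.

    fractions, albedos: dicts keyed by land cover type.
    Assumption (not from the paper): fractions must sum to 1.
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("land cover fractions must sum to 1")
    return sum(fractions[k] * albedos[k] for k in fractions)


# Hypothetical fractions and albedos, for illustration only.
fractions = {"glacier": 0.0, "lake": 0.1, "wetland": 0.1, "vegetation": 0.8}
albedos = {"glacier": 0.60, "lake": 0.06, "wetland": 0.10, "vegetation": 0.20}
cell_albedo = grid_cell_albedo(fractions, albedos)
```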
 A climatology of monthly albedo was produced for both the model and MODIS. The model albedos are from the last 10 years of the 11-year simulation. The MODIS albedos were first aggregated spatially to the model grid using area weighting and then aggregated temporally to monthly data.
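 A minimal sketch of the two aggregation steps is given below. Several details are assumptions rather than the procedure actually used: pixel area on a regular latitude-longitude grid is approximated by the cosine of latitude, missing retrievals are represented by `None` and skipped, and the temporal step is a plain mean of the 16-day values assigned to a month. The function names are hypothetical.

```python
import math


def area_weighted_mean(values, lats_deg):
    """Average fine-resolution values inside one model grid cell.

    Weights are cos(latitude), an assumed proxy for pixel area on a
    regular latitude-longitude grid. None marks a missing value.
    """
    num = 0.0
    den = 0.0
    for value, lat in zip(values, lats_deg):
        if value is None:
            continue
        weight = math.cos(math.radians(lat))
        num += weight * value
        den += weight
    return num / den if den > 0 else None


def monthly_mean(period_values):
    """Plain mean of the valid 16-day values assigned to one month."""
    valid = [v for v in period_values if v is not None]
    return sum(valid) / len(valid) if valid else None
```

 For example, `area_weighted_mean([0.2, 0.4], [0.0, 60.0])` gives the pixel at 60° half the weight of the pixel at the equator.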
2.2. A New Land Surface Data Set
 To assess the accuracy of the land surface dataset currently used in CLM2 (referred to as the old data hereafter), we create a new land surface dataset from the latest MODIS land products, using procedures similar to those described by Bonan et al. [2002b].
 We aggregate the MODIS 500-m Collection 3 Global Vegetation Continuous Fields (VCF) [DeFries et al., 1999] from 2000–2001 to generate 1-km fractional vegetation cover (FVC) data. The VCF data contain the percentages of tree cover (tall trees), herbaceous cover (shrubs and grasses), and bare ground. These three components sum to 100% ground cover. FVC is calculated as the sum of the tree cover and herbaceous cover percentages.
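 The FVC calculation can be illustrated with the short sketch below. The function names are hypothetical, and the 500-m to 1-km step is assumed here to be a simple mean over non-overlapping 2 × 2 blocks, which the text does not specify.

```python
def fvc_percent(tree, herbaceous, bare):
    """FVC (%) as tree cover plus herbaceous cover.

    Checks that the three VCF components sum to 100%.
    """
    if abs(tree + herbaceous + bare - 100.0) > 1e-6:
        raise ValueError("VCF components must sum to 100%")
    return tree + herbaceous


def aggregate_2x2(grid):
    """Average non-overlapping 2x2 blocks (assumed 500-m to 1-km step)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1]
             + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]
```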
 We generate a 15-PFT dataset at 0.5° × 0.5° resolution from the MODIS 1-km PFT and IGBP land cover maps. The MODIS PFT map consists of 7 primary PFTs: needleleaf evergreen or deciduous tree, broadleaf evergreen or deciduous tree, shrub, grass, and crop. It is expanded to 15 PFTs based on climate rules [Bonan et al., 2002b]. Since the current VCF data do not distinguish between evergreen and deciduous or between broadleaf and needleleaf for the tree cover, or between shrub and grass for the herbaceous cover, we assume that each 1-km pixel has only one PFT and that its abundance equals its FVC. The bare fraction is 1 - FVC. The old PFT data were derived without access to consistent FVC data and so assumed that the non-tree-covered land was covered by grasses in forests, savanna, and grasslands, by shrubs in shrub lands, and by crops in croplands. The new PFT data define a pixel as grass, shrub, or crop only if the PFT map classifies it as such, and set its fraction to the FVC. This is a major difference between the old and new PFT data. The 1-km data are aggregated to grid cells at 0.5° resolution by averaging the 1-km percentages per 0.5° grid cell, which normalizes the percentage of each grid cell covered by a particular PFT by the vegetated area [Bonan et al., 2002b]. The bare ground in each grid cell is always considered to be the cumulative canopy opening.
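 The pixel-to-grid-cell aggregation above can be sketched as follows, under the stated assumption of one PFT per 1-km pixel with abundance equal to FVC. The function name `aggregate_pft`, the equal weighting of pixels, and the sample values are hypothetical simplifications, not the authors' processing code.

```python
from collections import defaultdict


def aggregate_pft(pixels):
    """Aggregate 1-km pixels to one 0.5-degree grid cell.

    pixels: list of (pft_name, fvc) pairs with fvc in [0, 1]; each pixel
    holds a single PFT whose abundance equals its FVC.
    Returns (cover, bare, share):
      cover: fraction of the grid cell covered by each PFT
      bare:  bare fraction of the grid cell (1 - total vegetated cover)
      share: each PFT's cover normalized by the vegetated area
    """
    n = len(pixels)
    cover = defaultdict(float)
    for pft, fvc in pixels:
        cover[pft] += fvc / n  # equal pixel weights (an assumption)
    vegetated = sum(cover.values())
    bare = 1.0 - vegetated
    share = {p: c / vegetated for p, c in cover.items()} if vegetated > 0 else {}
    return dict(cover), bare, share


# Hypothetical pixels, for illustration only.
pixels = [("grass", 0.5), ("grass", 0.3), ("shrub", 0.2), ("crop", 0.6)]
cover, bare, share = aggregate_pft(pixels)
```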
 We generate an LAI dataset at 0.5° resolution from two and a half years of Collection 4 MODIS 1-km LAI data [Myneni et al., 2002] in 2000, 2001, and January through June in 2003, with 8-day compositing periods. These data are further composited over 4 (or 3) consecutive 8-day periods to produce monthly data. To minimize cloud and snow contamination, the 2.5-year data with the best quality are further composited to produce a climatology of monthly LAI. These data are used to derive the seasonal course of LAI for every PFT in each 0.5° grid cell. First, LAI at 1 km is divided by the FVC to produce LAI with respect to vegetated area only. Second, the LAI of evergreen needleleaf PFTs is adjusted to be no less than 70% of its maximum value to correct the MODIS bias toward lower winter LAI values in the presence of snow [Tian et al., 2004]. It is no longer necessary to adjust the evergreen broadleaf LAI, since the current MODIS data have substantially improved LAI retrievals in tropical and subtropical regions compared to AVHRR. Third, for each PFT, a pure PFT LAI is estimated in each 0.5° grid cell by averaging the LAIs over only those 1-km pixels whose abundance of the PFT is greater than 60%. The old model LAI was derived from only one year of AVHRR data (April 1992 to March 1993) based on an NDVI-LAI relationship [Bonan et al., 2002b]. Thus, improvements with the present analysis should be expected.
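 The three LAI steps can be sketched as below. This is a simplified illustration with hypothetical function names and data layout; details such as quality screening and the handling of pixels with zero FVC are assumptions.

```python
def lai_vegetated(lai, fvc):
    """Step 1: LAI with respect to vegetated area only (LAI / FVC)."""
    return lai / fvc if fvc > 0 else None


def apply_winter_floor(monthly_lai, floor_fraction=0.7):
    """Step 2: keep evergreen needleleaf LAI at or above 70% of its maximum."""
    floor = floor_fraction * max(monthly_lai)
    return [max(value, floor) for value in monthly_lai]


def pure_pft_lai(pixels, pft, threshold=0.6):
    """Step 3: mean LAI over pixels whose abundance of `pft` exceeds 60%.

    pixels: list of (pft_name, abundance, lai) tuples for one grid cell.
    """
    selected = [
        lai for name, abundance, lai in pixels
        if name == pft and abundance > threshold and lai is not None
    ]
    return sum(selected) / len(selected) if selected else None
```

 A grid cell with no pixel above the abundance threshold returns `None` in this sketch; how such cells were actually filled is not stated in the text.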
 Finally, for use in CLM2, the 0.5° data are aggregated to the model grid. The surface data for the water, wetland, lake, and snow-ice land cover types are unchanged and are not considered here. We also use the MODIS Collection 4 snow product to examine differences in snow cover between the model and MODIS.