To determine the agreement between the two data sets, the forest area of each data set is calculated for every land cover type, and the pixels with forest area greater than zero are counted. The mean forest percentage per pixel, the difference in mean percentage, and the difference in total area are also calculated for each land cover type. The results are shown in Table 3. The table shows that 25.3% of the forest in the UMD data set and 21.4% in the NLCD data set is distributed in the grass class, with 1.8% (UMD) and 1.7% (NLCD) in the nonvegetation class. The estimates of total forest area from the two data sets are in good agreement for evergreen needleleaf forest, nonvegetation, evergreen broadleaf forest, and grass, with relative differences of 4.8%, 4.4%, 0%, and 6.3%, respectively. However, the disagreement is relatively large for deciduous broadleaf forest, shrub, deciduous needleleaf forest, and mixed forest, with relative differences of 31.4%, 24.9%, 22.8%, and 16.7%, respectively. For these land cover types, the forest area estimated from the NLCD data set is larger than that estimated from the UMD data set. The pixel counts are very close for deciduous broadleaf forest, deciduous needleleaf forest, shrub, evergreen broadleaf forest, mixed forest, and evergreen needleleaf forest, with relative differences of 0%, 0%, 1.3%, 1.4%, 1.8%, and 1.9%, respectively. However, for Alpine forest, grass, and nonvegetation, the forest pixel counts differ markedly, with relative differences of 30.5%, 63.7%, and 85.6%, respectively. The mean percentage of tree cover is very close between the two data sets for evergreen broadleaf forest and evergreen needleleaf forest, with relative differences of −1.4% and −6.6%. However, for deciduous broadleaf forest, nonvegetation, grass, shrub, deciduous needleleaf forest, mixed forest, and Alpine forest, there are large disagreements, with relative differences of −31.3%, −89.4%, −58.5%, −26.0%, −22.7%, −18.7%, and −18.4%, respectively. In particular, the nonvegetation and grass classes show the largest discrepancies, which may be explained partly by the different definitions of forest used in the two data sets, and which also indicate that the UMD data set tends to contain more sparse forest than the NLCD data set.
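The per-class statistics described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used to produce Table 3: the function names, the `pixel_area_km2` parameter, the choice to average the forest percentage over forest-containing pixels only, and the definition of relative difference as (a − b)/b are all hypothetical.

```python
from collections import defaultdict


def zonal_forest_stats(land_cover, tree_pct, pixel_area_km2=1.0):
    """Summarize forest per land cover class.

    land_cover: sequence of class labels, one per pixel.
    tree_pct:   sequence of tree cover percentages (0-100), same order.
    pixel_area_km2: area of one pixel (hypothetical default).
    """
    acc = defaultdict(lambda: {"pixels": 0, "area": 0.0, "pct_sum": 0.0})
    for cls, pct in zip(land_cover, tree_pct):
        if pct > 0:  # count only pixels with forest area greater than zero
            a = acc[cls]
            a["pixels"] += 1
            a["area"] += pixel_area_km2 * pct / 100.0
            a["pct_sum"] += pct
    return {
        cls: {
            "pixels": a["pixels"],
            "area": a["area"],
            # Assumption: the mean is taken over forest-containing pixels.
            "mean_pct": a["pct_sum"] / a["pixels"],
        }
        for cls, a in acc.items()
    }


def relative_difference(a, b):
    """Relative difference of a with respect to b, in percent.

    Assumed definition; the source does not state its formula.
    """
    return 100.0 * (a - b) / b


# Toy inputs for illustration only; these are not values from Table 3.
classes = ["grass", "grass", "shrub", "grass"]
umd = zonal_forest_stats(classes, [10, 0, 50, 30])
nlcd = zonal_forest_stats(classes, [0, 0, 60, 40])
for cls in umd:
    if cls in nlcd:
        diff = relative_difference(umd[cls]["area"], nlcd[cls]["area"])
        print(cls, round(diff, 1))
```

Keeping the pixel count, total area, and mean percentage as separate outputs matters here, because the comparison shows that two data sets can agree on one measure while disagreeing on another.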
 Figure 5 compares the pixel frequency counts of different tree canopy percentages in the nine land cover types. The histograms show that, for forest percentages above 70%, the NLCD data set has more pixels than the UMD data set in all seven forest land cover types. For tree canopy percentages below 10%, however, the pixel counts of the UMD data set are much larger than those of the NLCD data set for all land cover types, especially for grass and nonvegetation, where the counts differ by a factor of about 4.
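A frequency comparison of this kind can be sketched as below. The 10% bin width, the function names, and the treatment of a value of exactly 70% as part of the high bins are assumptions made for illustration; the binning actually used for Figure 5 is not stated in the text.

```python
def canopy_histogram(tree_pct, bin_width=10):
    """Count forest pixels (pct > 0) per canopy-percentage bin.

    Bin i covers [i * bin_width, (i + 1) * bin_width); a value of 100
    is placed in the last bin.
    """
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for pct in tree_pct:
        if pct <= 0:
            continue  # pixels without forest are not counted
        idx = min(int(pct // bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


def count_ratio(hist_a, hist_b, bins):
    """Ratio of pixel counts of hist_a to hist_b over the given bin indices.

    Returns None when hist_b has no pixels in those bins.
    """
    total_a = sum(hist_a[i] for i in bins)
    total_b = sum(hist_b[i] for i in bins)
    return total_a / total_b if total_b else None


# Toy inputs for illustration only; these are not values from Figure 5.
umd_hist = canopy_histogram([3, 5, 8, 9, 40, 75])
nlcd_hist = canopy_histogram([6, 45, 80, 85, 90, 95])
low = count_ratio(umd_hist, nlcd_hist, bins=[0])          # below 10%
high = count_ratio(nlcd_hist, umd_hist, bins=[7, 8, 9])   # 70% and above
print(low, high)
```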
 According to these comparisons, the tree canopy percentages are very consistent for evergreen needleleaf forest and evergreen broadleaf forest in terms of pixel counts, total forest area, and mean percent forest. For deciduous broadleaf forest and deciduous needleleaf forest, the pixel counts are almost identical, but the total forest area differs considerably between the two data sets: the forest area of the NLCD data set is larger than that of the UMD data set, with relative differences of −31.3% and −22.7%. This may indicate that the UMD data set underestimates the forest area in these two land cover types. For nonvegetation and grass, both of which contain sparse forest, the total forest area is close in the two data sets, but the pixel counts vary greatly, which suggests that the UMD data set contains a larger proportion of sparse forest. The area agreement is favorable for shrub and mixed forest types, but there are relatively high pixel variations between the two data sets. The statistical analysis of the pixel frequency suggests that the NLCD data set has higher pixel counts at high forest percentages, while the UMD data set has higher pixel counts at low forest percentages.
 Both data sets show that more than 30% of Chinese forests are distributed in shrublands and more than 20% in grasslands. These statistics suggest that many forests are shrubs from secondary or man-made forest. Many forests are also dispersed in grassland or agricultural regions; although they are sparse, their total area is large because the grassland area is large. If hard classification data were used, these forests would not be accounted for.
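The effect of a hard classification can be illustrated with a small sketch that computes the share of continuous-field forest area lying in pixels whose hard class is not a forest class. The function name, the class labels, and the set of classes treated as forest are hypothetical, and forest area is expressed in pixel units for simplicity.

```python
def share_missed_by_hard_classification(land_cover, tree_pct, forest_classes):
    """Fraction of total forest area located in non-forest classes.

    land_cover:     sequence of hard class labels, one per pixel.
    tree_pct:       sequence of tree cover percentages (0-100), same order.
    forest_classes: set of labels counted as forest (assumed by the caller).
    """
    total = 0.0
    missed = 0.0
    for cls, pct in zip(land_cover, tree_pct):
        area = pct / 100.0  # forest area of this pixel, in pixel units
        total += area
        if cls not in forest_classes:
            missed += area  # forest that a hard classification would ignore
    return missed / total if total else 0.0


# Toy inputs for illustration only.
labels = ["mixed forest", "grass", "grass", "shrub"]
share = share_missed_by_hard_classification(
    labels, [80, 10, 10, 0], forest_classes={"mixed forest"}
)
print(round(share, 3))
```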