Fourteen grid data sets of different cell resolutions were generated, from 0.5 × 0.5 km to 64 × 64 km, to estimate CH4 and N2O emissions from paddy soils in the Tai Lake region of China using the Denitrification-Decomposition (DNDC) model. The grids were derived from a polygon-based data set (1:50,000 digital soil map/database), which was the most detailed soil database for the region. Comparison of the CH4 and N2O emissions simulated with the 14 grid data sets as input against those simulated with the original polygon data demonstrated (1) no distinct variability (relative errors <5%) of the results when grid data sets of cell size ≤2 km were used as input for the DNDC model; (2) slight variability (<10%) in the results when grid data sets with cell size in the range of 2 to 8 km were used as input; and (3) distinct variability (>10%) in the results when grid data sets with cell size of >8 km were applied as input. A grid data set with a cell size of 8 km was found to be optimal based on accuracy and computational efficiency of DNDC simulations. The results can be used as a guideline for optimizing field sampling strategies for locations where soil data are lacking or insufficient, whereby soil data can be collected through sampling in cell centers of designed grid frames.
 Rice paddies and other agricultural soils have been identified as major sources of atmospheric CH4 and N2O, contributing about 19% and 42%, respectively, of global anthropogenic CH4 and N2O emissions [Intergovernmental Panel on Climate Change (IPCC), 2007]. China has approximately 22% of the world's rice paddies, and the associated emissions account for roughly 28% of the total CH4 emission from the world's rice fields [Jiang et al., 2004] and 3% of the total N2O emission from the world's agricultural soils [Xiong et al., 2007; IPCC, 2007].
 Over the past 2 decades, midseason drainage has been adopted throughout China as an alternative water management approach to reduce CH4 emissions [Li et al., 2006; Cai et al., 1999; Zhang et al., 2007], and usage of chemical N fertilizer has been controlled through application reduction to decrease N2O emission [Xiong et al., 2007]. Field measurements at the plot scale indicate that midseason drainage can significantly reduce CH4 and increase N2O emission in China [Cai et al., 1999; Li et al., 2002, 2005; Xiong et al., 2007]. However, the CH4 and N2O emission trends need to be validated at the regional scale. Therefore, accurately estimating CH4 and N2O emissions from rice fields over large areas (region or country) is vitally important for evaluating the impacts of midseason drainage, and possibly other management options, in China [Xiong et al., 2007; Ma et al., 2009].
 Much research has been carried out to estimate CH4 and N2O emissions from agroecosystems using the Denitrification-Decomposition (DNDC) model [Tang et al., 2006; Levy et al., 2007]. The DNDC model, developed by Li et al. [1992a, 1992b, 1994], simulates the biogeochemical C and N cycles of agricultural soils based on land use and activity data (crops, tillage, fertilization, manure application, grazing), soil parameters (texture, soil organic matter, pH, bulk density, hydraulic properties), and daily temperatures and precipitation. It has been validated through many applications at the plot scale at many sites in North America, Europe, Asia, etc. [Li et al., 1996, 2002; Cai et al., 2003; Pathak et al., 2005; Jagadeesh Babu et al., 2006; Levy et al., 2007], and is one of the most widely accepted biogeochemical models in the world [Li, 2000, 2007].
 At the regional scale, the DNDC model has also been utilized to assess CH4 and N2O emissions from agricultural fields. To date, most of the DNDC modeling conducted at the regional scale has used counties as the basic assessment unit [Li et al., 2005; Zheng et al., 2006; Tang et al., 2006], where minimum and maximum soil parameter values for each county were derived from soil maps to simulate an upper and a lower estimate of several C and N pools, including soil CH4 and N2O fluxes. Soil polygons at different map scales (1:50,000, 1:14,000,000, etc.) or soil grids at different grid resolutions (1 × 1 km, 50 × 50 km, etc.) also have been used as assessment units in modeling [Zhang et al., 2009a, 2009b; Levy et al., 2007; Xu et al., 2006; Neufeldt and Schäfer, 2008], where all physical parameters within any unit are assumed to be homogeneous.
 The homogeneity assumption is a possible major source of error when extending DNDC modeling from plot to regional scale [Li et al., 2004]. As the area of the basic modeling unit increases, so does soil property variability, calling into question how accurately that variability is captured in terms of resolution and attribution, especially in comparison with typical plot scale measurement precision [Sass et al., 1991; Smith and Dobbie, 2001; Bouwman et al., 2002; Xu et al., 2006]. For example, the uncertainty in upscaled N2O emissions modeled by DNDC was 63.6% for the spatial scaling effect and 86.4% for the partitioning of a sensitive model parameter (SOC). Spatial heterogeneity of soil factors resulted in lower regional N2O emission estimates [Xu et al., 2006]. In the Tai Lake region of China, DNDC model simulation was conducted with two databases using polygons and counties as the basic units for quantifying CH4 emissions from rice fields [Zhang et al., 2009a]. The county level database contained soil information coarser than the 1:50,000 polygon soil database. The total CH4 emissions generated from the polygon-based database are 2.6 times the minimum and 0.98 times the maximum CH4 emissions generated from the county-based database. The average value of the relative deviation ranged from −20% to 98% for most counties, which indicates that a more precise soil database, e.g., the 1:50,000 soil database, is necessary to better simulate CH4 emissions using the DNDC model at the regional scale [Zhang et al., 2009a]. Thus, the spatial heterogeneity of the soil parameters indicates that soil databases of a higher resolution or larger map scale are necessary to better simulate CH4 and N2O fluxes using the DNDC model [Li et al., 2004, 2005].
 Although soil data sets with high spatial resolution and rich attributes improve simulation accuracy, they require more labor, material resources and financial support for collecting and analyzing soil samples and for editing and preparing the soil data sets, and they also require more processing time during modeling. For example, one case study ran a total of nearly 75,000 polygon simulations per scenario when applying the DNDC model to an area of 35,752 km2 in Baden-Württemberg, Germany, where the regional polygons were prepared from a detailed soil map, the Soil Map of Baden-Württemberg [Neufeldt and Schäfer, 2008]. Another case study centered on a 36,500 km2 area in the Tai Lake region of China, where 52,034 polygon simulations were executed using the DNDC model after 3 years of soil data set preparation, compared with 37 county simulations when the county-based data set was applied [Zhang et al., 2009a, 2009b].
 Given the variety of data sets and number of simulations, in combination with data accuracy and computational costs [Schmidt et al., 2008], important questions are raised. How sensitive is DNDC modeling to data sets of different resolutions? What data set resolution is optimal for DNDC simulation in terms of error control at the regional scale? The objectives of this study were to (1) quantify CH4 and N2O emissions from rice paddy fields in the Tai Lake region of China, using the DNDC model with different resolution data sets derived from the 1:50,000 soil database (polygon-based soil data set); (2) assess the variability of CH4 and N2O emissions associated with the different resolution data sets; and (3) identify an optimal resolution data set for DNDC simulation for this region based on error control, so as to reduce the number of simulations and the computational cost. The results will indicate an optimal compromise between accuracy and computational efficiency for DNDC model application.
2. Materials and Methods
2.1. Study Area
 The Tai Lake region (118°50′–121°54′E, 29°56′–32°16′N) is located in the middle and lower reaches of the Yangtze River of China. It includes the entire Shanghai City administrative area and parts of Jiangsu and Zhejiang provinces, with 37 counties and a total area of 36,500 km2. Annual rainfall is 1,100–1,400 mm, with a mean temperature of 16°C, average annual sunshine of 1,870–2,225 h, and over 230 frost-free days [Zhang et al., 2009a].
 As one of the oldest agricultural regions in China, it has a long history of rice cultivation spanning several centuries. Approximately 66% of the total area is covered with Paddy soils [Xu et al., 1980]. The Paddy soils are derived mostly from loess, alluvium, and lacustrine deposits and are described by six soil subgroups, 137 soil families and 622 soil species according to the Genetic Soil Classification of China (GSCC). The reference names of the six GSCC subgroups in U.S. Soil Taxonomy (ST) are Typic Epiaquepts (Bleached, Percogenic, Hydromorphic) and Typic Endoaquepts (Gleyed, Degleyed, Submergenic) [Soil Survey Staff, 1994; Shi et al., 2006]. Most of the cropland in the region is managed with a rice and winter wheat rotation system. Rice is planted in June and harvested in October, with winter wheat planted in November and harvested in May [Zhang et al., 2009a].
2.2. DNDC Model and its Validation
 The Denitrification-Decomposition (DNDC) model is a process-oriented simulation tool for modeling soil carbon and nitrogen biogeochemical cycling [Li et al., 1992a, 1992b, 1994, 1996]. The model contains six interacting submodels (soil and climate, nitrification, denitrification, SOC decomposition, plant growth, and fermentation), which describe the generation, decomposition, and transformation of organic matter and output the dynamic components of SOC and greenhouse gas fluxes [Li, 2007; Zhang et al., 2009a, 2009b]. The DNDC model can simulate C and N biogeochemical cycles in paddy rice ecosystems because it has been supplemented with a series of anaerobic processes [Li et al., 2002, 2004; Li, 2007; Cai et al., 2003].
 Data sets describing the soil properties, daily weather, cropping systems, and agricultural management practices of rice paddy fields are required to initialize and run the DNDC model at the regional scale. In the present study, 14 grid data sets of different resolutions (Table 1) were converted from a 1:50,000 polygon database of the Tai Lake region, which includes descriptive attributes of soils, crops, agricultural management, and climate, using the Spatial Analyst Tools (SAT, a GIS extension module contained in the ESRI ArcGIS 9.0 software). Each polygon in the polygon database has its own data record and serves as a unit in the DNDC modeling of CH4 and N2O emissions [Zhang et al., 2009a, 2009b]. Taking into account the maximum county land area among the 37 counties, which reaches 3,246 km2, the maximum grid cell size was set to 64 × 64 km. All converted grid data sets have the same projection and map analysis extent as the polygon coverage. The SAT conversion method assigns to each cell the attributes of the polygon found at the center of that cell.
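 The center-of-cell assignment described above can be illustrated with a minimal Python sketch. It is not the SAT implementation: the function names (point_in_ring, polygons_to_grid), the ray-casting point-in-polygon test, and the three demo polygons are assumptions made for illustration only, and polygon holes and map projections are ignored.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Polygon = Tuple[str, Sequence[Point]]  # (data record, outer ring in km)


def point_in_ring(point: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the closed ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygons_to_grid(polygons: List[Polygon],
                     extent: Tuple[float, float, float, float],
                     cell_km: float) -> Dict[Tuple[int, int], str]:
    """Give each cell the record of the polygon that contains its center.

    Cells whose center falls in no polygon are left out of the result.
    """
    xmin, ymin, xmax, ymax = extent
    ncols = math.ceil((xmax - xmin) / cell_km)
    nrows = math.ceil((ymax - ymin) / cell_km)
    grid: Dict[Tuple[int, int], str] = {}
    for row in range(nrows):
        for col in range(ncols):
            center = (xmin + (col + 0.5) * cell_km,
                      ymin + (row + 0.5) * cell_km)
            for record, ring in polygons:
                if point_in_ring(center, ring):
                    grid[(row, col)] = record
                    break
    return grid


# Hypothetical soil polygons (coordinates in km), not taken from the study.
demo_polygons: List[Polygon] = [
    ("A", [(0, 0), (5, 0), (5, 4), (0, 4)]),
    ("B", [(5, 0), (7, 0), (7, 4), (5, 4)]),
    ("C", [(7, 0), (8, 0), (8, 4), (7, 4)]),
]
demo_extent = (0.0, 0.0, 8.0, 4.0)

fine = polygons_to_grid(demo_polygons, demo_extent, 1.0)
coarse = polygons_to_grid(demo_polygons, demo_extent, 4.0)
print(len(fine), sorted(set(fine.values())))
print(len(coarse), sorted(set(coarse.values())))
```

 With these demo polygons the 1 km grid keeps all three records in 32 cells, whereas the 4 km grid has only 2 cells and no cell center falls inside the narrow polygon "C", so that record drops out of the coarse data set.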
Table 1. Grid or Polygon Numbers of Paddy Soils Contained in Different Resolution Soil Data Sets in the Tai Lake Region of China
 For all simulations, farming management scenarios were compiled based on five assumptions from Zhang et al. [2009a, 2009b]. The DNDC modeling runs span the period 1982 to 2000, a duration of 19 years. Nearly 45,000 grid cell simulations were executed, in addition to 52,034 polygon simulations [Zhang et al., 2009a, 2009b].
2.4. Data Comparison and Analysis
 The CH4 and N2O emissions quantified by DNDC modeling with the original polygon-based database serve as the benchmark for comparison with the results of the DNDC model runs that use the 14 different resolution grid data sets as input. The former are thought to be more accurate than the latter, because all grid data sets extracted from the polygon-based database are coarser than the original database applied in the previous studies [Wang et al., 2001; Zou et al., 2005; Li, 2007; Zhang et al., 2009a, 2009b]. Area of paddy soils (APS, ha), annual mean emission of CH4 or N2O (AME, Gg yr−1) and rate of their emission (RGE, kg ha−1 yr−1) are selected as three indices for the comparisons:
where APSi is area of a polygon or a grid cell of paddy soil; MRGEi (kg ha−1 yr−1) is annual mean rate of CH4 or N2O emission from the polygon or grid cell from 1982 to 2000, as estimated by the DNDC modeling; i is the polygon or grid cell number.
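 A minimal Python sketch of how the three indices could be computed from per-unit values is given here. It assumes that APS is the sum of the unit areas, that AME is the area-weighted sum of MRGE converted from kg to Gg, and that RGE is AME expressed per hectare of APS; this reading follows the definitions and units above but is an assumption, and the function name emission_indices and the two-unit sample are illustrative.

```python
from typing import Iterable, Tuple

KG_PER_GG = 1.0e6  # 1 Gg = 10^6 kg


def emission_indices(units: Iterable[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Return (APS in ha, AME in Gg/yr, RGE in kg/ha/yr).

    Each unit is (APS_i in ha, MRGE_i in kg/ha/yr) for one polygon or grid cell.
    """
    units = list(units)
    aps = sum(area for area, _ in units)                   # total paddy area
    total_kg = sum(area * rate for area, rate in units)    # area-weighted emission
    ame = total_kg / KG_PER_GG                             # kg/yr -> Gg/yr
    rge = total_kg / aps if aps > 0 else 0.0               # mean rate over all units
    return aps, ame, rge


# Illustrative units only (not data from the study).
aps, ame, rge = emission_indices([(100.0, 120.0), (300.0, 140.0)])
print(aps, ame, rge)
```

 Under these assumptions the regional values reported for the polygon-based data set are mutually consistent: 2.32 M ha at 127.6 kg ha−1 yr−1 gives about 296 Gg yr−1.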
 Variation of an index value (VIV, %) of all paddy soils or of a soil subgroup in the region is calculated using equation (4). The index values were quantified for the polygon-based database (IV-polygon) and for the grid data sets (IV-grid) as a further point of comparison. An absolute mean value of VIV (AMVIV, %) over the soil subgroups is calculated using equation (5). The VIV and AMVIV are estimates of the relative errors associated with DNDC modeling using the grid data sets as input. Subjective classification is based on the following assumptions: (1) if VIV or AMVIV is less than 1%, the modeling is considered to be error free; (2) if it is greater than 1% and less than 5% or between 5% and 10%, the modeling is satisfactory or acceptable, respectively; otherwise, it is unacceptable [Whitmore et al., 1997].
where ABS is the absolute value of VIV and i is the soil subgroup number (there are a total of six soil subgroups in the region).
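 The error measures and the classification can be sketched in Python as follows. The sketch assumes that VIV is the difference between the grid-based and polygon-based index values relative to the polygon-based value, and that AMVIV is the mean of the absolute VIVs over the soil subgroups; the sign convention, the treatment of values exactly equal to 1%, 5% or 10%, and the function names are assumptions.

```python
from typing import Iterable


def viv(iv_grid: float, iv_polygon: float) -> float:
    """Variation of an index value (%), relative to the polygon benchmark."""
    return (iv_grid - iv_polygon) / iv_polygon * 100.0


def amviv(viv_by_subgroup: Iterable[float]) -> float:
    """Absolute mean value of VIV (%) over the soil subgroups."""
    values = list(viv_by_subgroup)
    return sum(abs(v) for v in values) / len(values)


def classify(error_pct: float) -> str:
    """Map a VIV or AMVIV to the four classes used in the text.

    Boundary handling (exactly 1%, 5%, 10%) is an assumption of this sketch.
    """
    e = abs(error_pct)
    if e < 1.0:
        return "error free"
    if e < 5.0:
        return "satisfactory"
    if e <= 10.0:
        return "acceptable"
    return "unacceptable"


# Illustrative values only: six subgroup VIVs with mixed signs.
example_vivs = [2.0, -4.0, 6.0, -8.0, 10.0, -12.0]
print(amviv(example_vivs), classify(amviv(example_vivs)))
```

 Because AMVIV averages absolute values, positive and negative subgroup errors cannot cancel, which is why a data set can look satisfactory for all paddy soils combined and still be unacceptable at the subgroup level.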
 The study focuses primarily on the CH4 emission quantified by the DNDC model with different resolution data sets. Owing to limits on publication length, the estimated N2O emissions are used only as evidence to support the conclusion obtained from the CH4 emission estimates, and further details on N2O emission are omitted from the article.
3. Results and Discussion
3.1. Variations of Soil Features Among Different Resolution Data Sets
 Converting the polygon format data set to grid data sets of different resolutions changed the spatial distribution of soil features. First, the number of cells in a grid data set differs from the number of polygons and decreases with increasing cell size, i.e., with decreasing grid resolution (Table 1). Computationally, the number of DNDC executions shrinks as grid resolution decreases, reflecting the reduced number of cells and implying a lower computational cost. The computational load is a significant challenge to DNDC model application [Neufeldt and Schäfer, 2008; Zhang et al., 2009a, 2009b].
 Second, soil type diversity changes with the grid cell size. When cell size is bigger than 12 × 12 km, one soil subgroup (Submergenic paddy soil) does not get mapped; when cell size is bigger than 32 × 32 km, three soil subgroups are no longer found in the data set. At the soil family and species level, the numbers of soil types contained in the grid data sets decrease significantly at the 0.001 level with increasing grid cell size (Figure 1). Similarly, the area of different soils varies with the resolution of these grid data sets (Table 1). The missing soil types and the soil area variation are a source of the variability in DNDC estimated CH4 and N2O emissions associated with grid resolution, which reflects the differing CH4 and N2O emission potentials of different soils [Cai et al., 2003; Zhang et al., 2009a, 2009b].
 Third, the spatial distribution characteristics of soil properties depicted by these different resolution data sets of the region differ from each other. DNDC simulation of CH4 and N2O emissions is more sensitive to soil organic matter and clay content than to other soil parameters (pH, bulk density, hydraulic properties, etc.) [Li et al., 2002; Cai et al., 2003; Levy et al., 2007; Zhang et al., 2009a, 2009b]. For instance, increasing the grid cell size changed the soil clay content represented in the data set as the grid became coarser (Table 2), with a corresponding change in the simulation of CH4 and N2O emissions [Zhang et al., 2009a, 2009b; Rüth and Lennartz, 2008]. Spatial variation of soil organic matter is omitted here due to limited publication length.
Table 2. Statistics of Soil Clay Content (%) for Different Resolution Soil Data Sets in the Tai Lake Region of China
 Variations in weather data (precipitation, maximum and minimum air temperature) and farming management scenarios (sowing method, nitrogen fertilizer application rates, livestock, planting and harvest dates, etc.) among the 14 grid data sets are slight and can be neglected, because all these data were compiled at the county level [Zhang et al., 2009a, 2009b], and grid cell attributes are assigned from these data according to the county in which the cell center is located. Changes in soil types and their attributes are therefore the main source of DNDC estimated CH4 and N2O emission variability associated with soil grid resolution.
3.2. Comparison of CH4 Emission Quantified From All Paddy Soils by DNDC Model With Different Resolution Data Sets
 The DNDC modeling results obtained with the complete range of grid data sets as input are displayed in Figure 2. The AME and RGE of CH4 emission from paddy soils in the Tai Lake region of China, quantified by the DNDC model with the polygon-based data set as input, are 296.1 Gg yr−1 and 127.6 kg ha−1 yr−1, respectively. The area of all paddy soils (APS) in the region is 2.32 M ha according to the polygon-based data set [Zhang et al., 2009a].
 CH4 emission estimates derived from the grid data sets all vary with resolution, as assessed by the three indices APS (ha), AME (Gg yr−1) and RGE (kg ha−1 yr−1) and by their VIVs, when compared with the results derived from the polygon-based data set (Figure 3). For 7 of the 14 different resolution grid data sets (e.g., grid data sets with a cell side length of 0.5 to 8 km and of 16 km), the absolute VIVs of the three indices are less than 5%. The results obtained from DNDC modeling with input from those 7 grid data sets are all satisfactory, according to the assumptions and the evaluation criteria described earlier [Whitmore et al., 1997].
 Considering the minimum number of DNDC modeling runs, the grid data set with a cell size of 16 × 16 km is the most efficient under the error assumptions mentioned above. The number of DNDC runs executed with the 16 × 16 km grid data set is only 0.2% of that with the polygon-based data set (Table 1), which, in combination with a low VIV, indicates that it is the most appropriate polygon to grid conversion for DNDC modeling [Neufeldt and Schäfer, 2008; Zhang et al., 2009a, 2009b].
3.3. Comparison of CH4 Emission From Paddy Soil Subgroups Simulated by DNDC Model With Different Resolution Data Sets
 The APS, AME and RGE of CH4 quantified for the six subgroups of Paddy Soil from the different resolution grid data sets were compared with those derived from the polygon-based data set [Zhang et al., 2009a], and the differences were found to vary with the resolution of the grid data sets. Absolute VIVs of the three indices for the six subgroups are all less than 10% when soil properties are represented by grid data sets whose cell sizes are equal to or smaller than 1 × 1 km. That is to say, only these grid data sets are acceptable for the simulation of CH4 emission using the DNDC model, according to the evaluation criteria [Whitmore et al., 1997]. In the case of the 1 × 1 km grid data set, the relative errors (VIV) of CH4 APS, AME and RGE from all paddy soils in the Tai Lake region are 0.5%, 0.8% and 0.3%, respectively; being less than 1%, they are considered error free. This data set requires 11,427 DNDC executions, the minimum among these grid data sets, which is only 22% of the number required with the polygon-based data set as input (Table 1).
 Further, it was found that the AMVIVs of the three indices of all subgroups are less than 10% for those grid data sets with cell size of ≤8 × 8 km (Figure 4). For the 8 × 8 km grid data set, the relative errors (VIV) of APS, AME and RGE associated with CH4 emission from paddy soils in the Tai Lake region are 3.2%, 0.6% and 2.7%, respectively (Figure 3), and so satisfy the <10% criterion for acceptable performance [Whitmore et al., 1997]. The 324 DNDC executions required with the 8 × 8 km resolution grid, the minimum among these data sets, amount to about 0.6% of the total executions when using the polygon-based data set as input (Table 1). Reducing the resolution thus offers a reasonable solution to the problem of large numbers of DNDC executions, which many researchers have attempted to address [Neufeldt and Schäfer, 2008; Zhang et al., 2009a, 2009b].
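 The execution shares quoted above can be checked with a short Python calculation. The run counts are those given in the text (52,034 polygon runs, 11,427 runs for the 1 × 1 km grid, 324 runs for the 8 × 8 km grid); the function name run_share is illustrative.

```python
POLYGON_RUNS = 52_034  # DNDC executions with the polygon-based data set


def run_share(grid_runs: int, baseline_runs: int = POLYGON_RUNS) -> float:
    """Grid executions as a percentage of the polygon-based executions."""
    return grid_runs / baseline_runs * 100.0


for label, runs in (("1 x 1 km", 11_427), ("8 x 8 km", 324)):
    print(f"{label}: {runs} runs, {run_share(runs):.1f}% of polygon runs")
```

 The calculation gives about 22% for the 1 × 1 km grid and about 0.6% for the 8 × 8 km grid, in line with the percentages reported.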
3.4. Optimal Grid Cell Size of DNDC Modeling in the Tai Lake Region of China
 The three VIV indices and the AMVIV of all paddy soil subgroups in the Tai Lake region of China vary with the grid data set resolution in the form of a wave whose distance from the balance position grows with increasing grid cell size (Figures 3 and 4). Three wave patterns are distinguishable in Figures 3 and 4 along the x axis. The first pattern shows no obvious variation, with VIV and AMVIV <5%, and is associated with grid data sets of cell size ≤2 km. The second pattern shows slight variation, with VIV and AMVIV <10%, and is attributable to grid data sets with cell size in the range of 2 to 8 km. The final pattern shows serious variation, characterized by VIV and AMVIV >10%, and is associated with grid data sets of cell size >8 km.
 Considering the minimum number of DNDC executions and the relative error of the results obtained by DNDC simulation with the different resolution grid data sets, the grid data set with 16 × 16 km resolution would appear to be the best compromise for DNDC simulation, since the VIVs of the three indices derived from all paddy soils are less than 5% (Figure 3). However, the AMVIVs of the three indices derived from the individual paddy soil subgroups are all over 10% (Figure 4), which places the 16 km resolution data set in the error group and makes it an unacceptable substitute for the polygon data set. The grid data set with a cell size of 8 km, in contrast, is acceptable for simulating CH4 emission with the DNDC model, as the VIVs of all paddy soils are less than 5% and the AMVIVs of all paddy soil subgroups are less than 10%.
 N2O emission quantified from paddy soils by DNDC modeling with the polygon-based data set and with the grid data set with an 8 km cell size (Table 3) demonstrates that the VIVs of N2O emission (AME) and its rate (RGE) for all paddy soils are less than 5%, and that their AMVIVs for all paddy soil subgroups are less than 10%. These results provide additional support that the grid data set with a cell size of 8 km is optimal for simulating N2O emission with the DNDC model, as it is associated with low error rates and a relatively fast processing time. Therefore, in terms of DNDC modeling of both CH4 and N2O emission, the grid cell size of 8 × 8 km is optimal for application to the Tai Lake region.
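 The selection logic in this paragraph can be written as a small rule: among the candidate grids, keep those whose VIVs for all paddy soils are below 5% and whose subgroup AMVIVs are below 10%, then take the coarsest. The Python sketch below is one way to express it. The function name and data layout are assumptions, and the input values are placeholders: only the VIVs listed for the 1 km and 8 km grids are those quoted in the text, while every AMVIV and the 16 km VIVs are invented to match the qualitative statements (below 10%, or above 10% for 16 km) and are not results of the study.

```python
from typing import Dict, List, Optional

GridErrors = Dict[str, List[float]]  # {"viv": [APS, AME, RGE], "amviv": [APS, AME, RGE]}


def coarsest_acceptable(results: Dict[float, GridErrors],
                        viv_limit: float = 5.0,
                        amviv_limit: float = 10.0) -> Optional[float]:
    """Return the largest cell size (km) that meets both error limits."""
    acceptable = [
        size for size, errors in results.items()
        if all(abs(v) < viv_limit for v in errors["viv"])
        and all(a < amviv_limit for a in errors["amviv"])
    ]
    return max(acceptable) if acceptable else None


# PLACEHOLDER inputs: AMVIVs and the 16 km VIVs are hypothetical.
example = {
    1.0: {"viv": [0.5, 0.8, 0.3], "amviv": [2.0, 3.0, 2.5]},
    8.0: {"viv": [3.2, 0.6, 2.7], "amviv": [8.0, 9.0, 7.5]},
    16.0: {"viv": [4.0, 3.0, 2.0], "amviv": [12.0, 15.0, 11.0]},
}
print(coarsest_acceptable(example))
```

 With these placeholder inputs the rule returns 8.0, because the 16 km entry passes the VIV limit but fails the AMVIV limit, mirroring the argument made in the text.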
Table 3. Comparison of N2O Emission Quantified From Paddy Soils by DNDC Model With Polygon-Based and 8 × 8 km Grid Data Sets, Respectively, in the Tai Lake Region of China^a
Column headings: Subgroups of Paddy Soil | N2O Emission (AME): Polygon (Gg N yr−1), 8 km Grid (Gg N yr−1) | N2O Emission Rate (RGE): Polygon (kg N ha−1 yr−1), 8 km Grid (kg N ha−1 yr−1)
Row label: All paddy soils
^a B, bleached paddy soil; G, gleyed paddy soil; P, percogenic paddy soil; D, degleyed paddy soil; S, submergenic paddy soil; H, hydromorphic paddy soil; AME, annual mean emission of N2O; RGE, rate of N2O emission; VIV, variation of AME or RGE obtained by DNDC model with 8 × 8 km grid data set, contrasting to that with the polygon-based data set.
 The effect of grid cell size on DNDC model simulation at the regional scale depends on the variability of the input soil properties, which results from differences in attribution as spatial resolution changes. In essence, it is an issue of model uncertainty caused by variation of soils and their attributes across spatial resolutions. Errors in simulated CH4 and N2O emissions were quantitatively validated prior to the multiscale modeling [Cai et al., 2003; Zhang et al., 2009a, 2009b]. Although error exists in all DNDC simulations, with both polygon and gridded inputs, it resembles a systematic error that stems from limitations of the model mechanism rather than from variations in external inputs such as soil, weather and farming management data. It therefore does not severely limit the analysis of the variability in estimated CH4 and N2O emissions that these simulations generate through the sensitivity of the DNDC model to external inputs, especially soil parameters. The spatially dependent generation of input soil attributes is the main source of DNDC estimated CH4 and N2O emission variability associated with soil grid resolution.
 In the present study, all grid data sets were extracted from an original polygon format data set, with each grid cell given the value (data record) of the polygon found at the center of that cell. In many regions, soil data are obtained by collecting soil samples at the centers of cells in a grid frame designed for the region [Lark, 2000], without considering the spatial distribution of soils. This center of cell method is similar to the SAT polygon to grid conversion method used to parameterize soil data for DNDC modeling in the present study. Smaller cell sizes for a given region imply higher-resolution soil data and improved model accuracy. However, smaller grid cells also increase the cost and time of soil sample collection, physical and chemical analysis, and model execution. Considering both data accuracy and computational cost, simulation units with a grid cell size of 8 km for DNDC modeling are a valuable reference for regions with limited or no soil information.
 In southern and eastern China, complicated natural conditions, e.g., topography, climate, vegetation and land use, result in a rich diversity of soil types that can vary dramatically across the landscape. County soil maps in southern and eastern China were generally compiled at a scale of 1:50,000, whereas in northwestern China they were compiled at a scale of 1:100,000 or even smaller because of simpler natural conditions and lower spatial variability. County soil maps are the most detailed and reliable soil data in China [Shi et al., 2006]. The present investigation was based on the 1:50,000 county soil map/database. Quantitative assessment determined that the 8 × 8 km grid cell is optimal in terms of estimation accuracy and processing cost. The 8 × 8 km grid cell should be suitable for modeling CH4 and N2O emissions with DNDC, at least in the southern and eastern regions of China, which contain 95% of the paddy soils in China. For northwestern China, we may conclude with some confidence that the optimal grid cell size can be larger than 8 × 8 km, owing to the relatively lower complexity of natural conditions and the lower spatial variability, so a grid cell size of 8 km for parameterizing the DNDC model could meet the required simulation accuracy. Although the optimal grid cell size was obtained from a specific case study and would vary with the region investigated, the knowledge can be used as a guideline for optimizing field sampling strategies [see Wu et al., 2008] to support DNDC modeling of CH4 and N2O emissions from agricultural soils in China.
 We gratefully acknowledge support for the research from the Natural Science Foundation of China (40921061), the National Basic Research Program of China (2010CB950702), and the Grouping Projects of the Chinese Academy of Sciences (KZCX2-YW-Q1-07, KSCX1-YW-09-01&02).