To What Extent Did We Change Our Soils? A Global Comparison of Natural and Current Conditions

The food security–climate change nexus rapidly gains momentum. Soil degradation plays an important role in this context while dealing with, for example, the productive capacity of our soil resources or carbon sequestration for climate change mitigation. However, little has been done to assess the pristine soil conditions despite the fact that these provide the basis to put changes into context. Various methodologies have been developed to assess the global distribution of current soil conditions. We used the S‐World methodology that was developed to generate global soil property maps for environmental modelling studies. Up till now, the S‐World methodology assessed current soil conditions by disaggregating the Harmonized World Soil Database using detailed information on climate, topography, land cover, and land use. This study used the S‐World methodology to derive global soil conditions under natural vegetation. A large number of natural areas around the globe were identified for which land cover, expressed by the Normalized Difference Vegetation Index, could be successfully correlated to environmental conditions such as temperature, rainfall, and topography. Using this relation in regression kriging, the vegetation index under natural conditions was derived for the entire globe. Subsequently, the S‐World methodology was used to calculate the soil properties under natural land cover and absence of human land use. Soil property maps for natural and current conditions were compared and showed large local differences. The results indicate that there are major changes due to land cover and land use change and that these changes are concentrated on the globe. The results are the basis for future assessments on, for example, land degradation, food security, or the sustainable development goals. © 2017 The Authors. Land Degradation & Development Published by John Wiley & Sons Ltd.


INTRODUCTION
The human presence over the last millennia has influenced ecosystems globally. The impact has been substantial, with global deforestation and climate change receiving considerable attention. There has been less attention for changes in our soil resources despite their key role in food securityclimate change nexus that was recently re-iterated at the 21st Conference of the Parties of the UNFCCC (CoP21). Nevertheless, soil quality and its deterioration could hamper many of the global targets that the global community agreed upon in the Rio conventions and the sustainable development goals such as increasing food security, reducing poverty, and conserving biodiversity. Soils play an important role in the sustainable development goals (Bouma & Montanarella, 2016). However, major efforts to assess soil degradation at the global level show significant discrepancies. Gibbs & Salmon (2015) discuss differences between four studies by Oldeman et al. (1991), Bai et al. (2008), Campbell et al. (2008), and Cai et al. (2011) and come to a shocking conclusion: "our uncertain knowledge of degraded lands is not surprising given the significant measurement challenges of capturing a dynamic and subjective condition." A proper discussion requires the assessment of current soil degradation rates and restoration potential. However, data availability and the lack of proper definitions hamper the development of these databases as illustrated by the differences in the various existing attempts. An estimate of the current status of soil conditions and an estimate of soil conditions under natural conditions may help to frame the discussion and provide an objective indication on the extent of the problem and reveal where changes are most serious.
Currently, there are different global soil databases that provide a general description of the present state of our soil resources. Examples include the Harmonized World Soil Database (HWSD;FAO et al., 2012), SoilGrids (Hengl et al., 2014), and S-World (Stoorvogel et al., 2017). However, little is known about the world's natural soil conditions. The ultimate question is how these could be reconstructed? Recently, new developments in soil science changed the basic paradigm of the soil survey where soil observations are related to auxiliary data either statistically (e.g., digital soil mapping, McBratney et al., 2003) or more mechanistically (Finke, 2012). This paper aims to take stock of these developments and to estimate, besides the current conditions, the natural soil conditions in such a way that they can be compared with current estimations. We will first explain one of the recently developed methodologies for the assessment of global soil properties. This methodology, denominated S-World (Stoorvogel et al., 2017), disaggregates the HWSD using concepts of digital soil mapping. Land use and land cover play a central role in the estimation of global soil conditions by S-World. This study takes as its starting point that (i) we will get a proper assessment of current soil conditions if we apply the S-World methodology using current land use and land cover, and (ii) we will get a proper assessment of natural soil conditions if the S-World methodology is applied using natural land cover. Therefore, the S-World methodology was applied using land cover and land use for current and natural conditions to predict the current and natural soil conditions (Figure 1).
In the Materials and Methods section, we describe the S-World methodology, provide details on how we derived current and natural land cover and land use, and the strategy on the data analysis. In the Results and Discussion sections, we describe (i) the derivation of natural land cover, (ii) look at the changes in land cover and land use, and (iii) study in detail the changes in soil organic matter contents depth. The general discussion focuses on the methodological elements of the study but also at the potential contribution of the assessment in the international debate.

The S-World Methodology
The S-World methodology was developed to create global, high-resolution (30 arc-second) grids of soil properties for environmental modelling on the basis of available auxiliary and legacy data (Stoorvogel et al., 2017). The methodology disaggregates the HWSD (FAO et al., 2012). Just like other exploratory soil surveys, the HWSD has many complex mapping units described by multiple soil types. These complex mapping units are disaggregated by defining toposequences and linking soil types to topographic positions in the landscape. By coupling the soil map with a digital elevation (GMTED2010: the Global Multi-resolution Terrain Elevation Data 2010; Danielson & Gesch, 2011), a spatial disaggregation can take place resulting in mapping units that are described by single soil types. The HWSD makes use of the Revised Legend of the Soil Map of the World (FAO et al., 1990) with 148 different soil types. For each of the soil types, the ranges in soil properties were characterized in terms of the first and ninth decile using the World Inventory of Soil Emission Potentials database that contains soil profile descriptions from 10,253 locations (Batjes, 2009). The possible range for variable i for soil type where v is1 corresponds to the first decile and v is9 to the ninth decile. The soil properties included the thickness of the topsoil, soil depth, soil organic matter (in the topsoil and subsoil), and soil texture. S-World builds, in line with the recent developments in digital soil mapping, on the hypothesis that the range in soil properties can be explained by various landscape properties. If the landscape properties are correctly chosen, they correlate to the soil-forming factors (climate, topography, land use, parent material, and time) and are indeed underlying the variation within the soil types. For S-World, average annual temperature and annual rainfall (from the WorldClim database, Hijmans et al., 2005) were selected to represent climate, the local variation in elevation (GMTED2010) was selected to represent topography, and the land use intensity (derived from GlobCov; Bontemps et al., 2010) and vegetative cover (from SPOT NDVI images, http://free.vgt. vito.be/) were selected to represent land use. It is assumed that the soil type itself embeds the information on parent material and time (in soil formation interpreted as the age of the soil). Conceptually, S-World uses a model to downscale a certain soil property v for a particular soil type with a known range based on various landscape properties. Before landscape properties can be used to explain the spatial variation for each of the soil properties, the ranges in landscape properties need to be known per soil type. The occurring landscape conditions for each of the soil types are characterized by the first and ninth decile for each of the landscape properties. The observed range for landscape property j on soil s is defined as [p js1 .. p js9 ] in which p js1 corresponds to the first decile and p js9 to the ninth decile. S-World estimates v i at location x where we find soil s as . Different explanatory factors represented by the landscape properties determine w. Weight w at location x for soil property i is calculated as a linear combination of weights for the individual landscape properties: The weight w ijx for landscape property j is calculated as Case : in which c ij is a constant that indicates the relative importance of the landscape property j for soil property i. Landscape property j on location x is described by p jx . The sign of c ij indicates whether there is a positive or negative relationship between the landscape property and the soil

Current and Natural Land Cover and Land Use
Land cover and land use are important input data for S-World. Land cover provides insight in the protective cover against erosion and is related to the total amount of organic matter that may be added to the soil profile. Land use plays a role as biomass may be removed as part of the harvest and tillage may lead to increased mineralization. The global change in land cover and land use is the main cause behind differences in soil properties between the current and natural conditions (Smith et al., 2016).

Land cover
A good and commonly used proxy for land cover is the Normalized Difference Vegetation Index (NDVI) as a measure to describe vegetative cover. One of the reasons why NDVI has become so popular for global modelers is that it can easily be derived from remote sensing images using the reflectance of green and near infrared bands or even with specific sensors. Good examples are provided by Miao et al. (2010) and Lanfredi et al. (2003) who used NDVI data for the assessment of vegetation cover dynamics and land degradation. We used 5 years (2006-2010) of monthly 30 arc-second resolution SPOT NDVI images (http://free.vgt.vito.be/) from which an average NDVI was calculated. This average NDVI is used as a proxy for the current vegetative cover of the soil. Unfortunately, observations are limited and trends in observed NDVI are too short to backcast towards natural conditions. Differences in land cover have been analyzed in terms of the 5-year mean NDVI. One could consider including seasonal or interannual variability in the analysis. However, the data to derive the relationships with soil properties are lacking and other studies have shown that the use of annual NDVI is a good approximation for global application like primary productivity (Schut et al., 2015). While the current NDVI is known for the entire globe on the bases of remote sensing, the natural NDVI is only known for those areas where natural vegetation is found. With a proper relation between known ecological conditions, the NDVI in the natural areas can be extrapolated to the entire globe. A search for environmental factors that were available at a 30 arc-second resolution and with a global coverage included (i) a range of different climatic factors from the WorldClim database and (ii) various topographic indicators from GMTED2012. Although soil conditions obviously play a role in the vegetative cover, they were left out from the analysis to prevent any circular reasoning, because the final aim is to predict soil conditions. The vegetative cover in natural conditions is to a certain extent independent from the soil conditions as vigorous vegetation can still develop under relatively poor soil conditions with sufficient rainfall because of the internal recycling of nutrients. However, there are also cases where very thin soils will still have an impact on the vegetation. The methodology to derive a global grid of the NDVI under natural conditions is illustrated in Figure 2. First, a large database was created with locations that can be considered to be "natural" (Step 1). Two databases of natural areas were used: ( . After randomly locating a very large (125,000) number of points in the natural areas indicated on the two maps, a sub-sample was taken that guaranteed a proper geographic distribution over the world with locations in each 500 × 500-km block and at least 500 locations in each of the 14 global biomes (Olson et al., 2001). This yielded a database of 10,000 locations that were considered to be natural, evenly distributed over the globe and over the global biomes. The current NDVI for each of the locations was derived from the 5-year average (Step 2). Regression kriging (Hengl et al., 2004) was used to predict the NDVI under natural conditions for all other locations on the globe. First, the current NDVI and environmental conditions (including altitude, topography, temperature, and rainfall) were extracted for each of the locations. Multiple step-wise linear regression was used to describe the relationship between these environmental conditions and the current, natural NDVI ( Step 3). However, the spatial patterns in the residuals revealed that there were areas with significant overestimation and underestimation. Therefore, the residuals were interpolated by ordinary kriging (Step 4) and subsequently used to correct for the NDVI values estimated by the regression ( Step 5). The estimates of 2,000 well-distributed independent "natural" locations not included in the regression were used to validate the aforementioned procedures.

Land use
Because high-resolution land use maps are lacking, land use intensity was used as a proxy. The GlobCov database provides insight on this land use intensity at a 10 arc-second resolution. The classes of the GlobCov database were reclassified into land use intensity ranging from 0% in natural areas with little to no human influence to 100% in areas with very intensive management under arable farming where soil may be tilled, organic matter (crop product and, possibly, crop residues) may be removed, and the soil may be bare for part of the year. The intermediate classes include less intensive forms of land use (e.g., pasture) or mixed cropping. Details on the reclassification are provided in Table I. After reclassification, the global grid was aggregated to a resolution of 30 arc seconds by averaging the land use intensities. The aforementioned procedure yields a map representing the variation in the current land use intensity at the global scale. The map of land use intensity under natural conditions is a single value map with 0% throughout the global land surface.

Current and Natural Soil Property Maps
To assess the differences in soil properties between current and natural conditions, we used current and natural land use and vegetative cover. It is assumed that soil type, topography, and climate did not change. S-World was run for (i) the current conditions making use of the GlobCovderived land use intensity and the 5-year average of global NDVI, and (ii) the natural conditions in which land use intensity is set at zero and the map with the predictions of natural NDVI was applied. Although S-World has been used to evaluate a range of soil properties (Stoorvogel et al., 2017), the methodology is illustrated for soil organic carbon in this manuscript. Other variables are available in the database.

Current and Natural Land Cover
In a stratified random sampling procedure, 10,000 locations were identified that were well distributed over the natural areas in the various biomes (Step 1, Figure 3a). The natural NDVI (NDVI nat ) and the environmental characteristics were derived for the locations ( Step 2) and correlated to each other in a stepwise regression ( Step 3). The regression equation predicted the natural NDVI nat as NDVI nat ¼ 3:23Á10 1 þ 3:45Á10 À1 D þ 1:53Á10 À1 R À4:28Á10 À5 R 2 À 8:16Á10 À2 R CV þ 7:39Á10 À2 T þ6:57Á10 À2 T CV þ 6:97Á10 À5 RT in which D is an indicator for topography based on the standard deviation in altitude within each grid cell (in m), R is the average total annual rainfall (in mm yr À1 ), R CV is the coefficient of variation of monthly rainfall throughout the year (in mm yr À1 ), T is the mean annual  temperature (in°C), and T CV is the coefficient of variation of monthly average temperature throughout the year (in°C). Key factors that played a role in the description of the natural NDVI included the average climatic conditions, the level of seasonality in the year (described by R CV and T CV ), and topography. The r 2 of the regression was 0.84 with a RMSE of 13 (Table II). The validation results of the regression equation with 2,000 independent locations gave very similar results as illustrated in Table II. Nevertheless, the regression results differed between the various biomes ( Figure 3b). Also, when looking at the residuals of the regression analysis, clear spatial patterns were observed with regions where the NDVI was overestimated and regions where the NDVI was underestimated. These regions were clearly related to the global biomes. This would suggest that the inclusion of biomes as predictors could further improve the regression. However, biomes do not change abruptly, but continuously (through ecotones). In addition, the map of the biomes is not available at the 30 arc-second resolution. Also, surface geology is a possible predictor that can be considered in future studies. The residuals presented a strong spatial correlation as described by the semi-variogram with a nugget-to-sill ratio of 0.063 and a range of 1331 km. For this study, the residuals were interpolated with ordinary kriging to create a map of the residuals with which the NDVI map estimated by the regression could be corrected (Step 4, Figure 3c). The interpolated residuals were finally superimposed over the map with the regression results (Step 5) to create a final estimate of the NDVI under natural conditions (Figure 4b). The correction for the overestimations and underestimations further improved the regression as shown by the validation with the 2,000 independent locations as shown in Table II. Figure 4c presents an overview of the global changes in NDVI.
In general, we observe changes in NDVI and land use intensity on places with arable agriculture, but also in urban areas. However, despite a general positive correlation between the current land use intensity and NDVI changes, there is not a direct relationship between the two. Six different situations can be identified: (i) A low current land use intensity with small changes in the NDVI. This is a typical situation for (semi-) natural areas where changes have been minimal. (ii) A low current land use intensity with an increase in NDVI. Although this situation is rare, temperate regions that were originally covered with deciduous forests have been replanted with evergreen forests resulting in an increase in its NDVI. (iii) A low current land use intensity with a decrease in NDVI. This either occurs in situations where the original NDVI is very high (e.g., tropical rainforest) and minor changes already result into considerable changes (e.g., the conversion of tropical rainforest into tree plantations) but also in areas that were more intensively used and degraded in the past leading to a low land use intensity under the current situation. (iv) A high current land use intensity but with minor changes in NDVI. This situation is mostly found in temperate regions where mixtures of agricultural land use and pasture areas have similar NDVI as in the pristine natural land cover. (v) A high natural land use intensity with an increase in NDVI. This is most obvious in irrigated areas that are typically located in drier regions with a low natural land cover and where the supply of irrigation water and fertilizer leads to a significant increase in the vegetative cover (e.g., the Nile basin). (vi) A high current land use intensity with a decline in NDVI. This is a logical combination that can be expected as intensive management typically results in a decline in vegetative cover. It is most obvious in regions where the natural vegetative cover is relatively high and where, as a result of annual cropping with a fallow period, the vegetative cover can drop considerably (e.g., the soybean production areas in Brazil and Argentina).

Soil Property Changes
Using the current and natural NDVI maps in combination with the map on land use intensity, the S-World methodology yielded maps of a range of different soil properties for current conditions as well as natural conditions (following the procedures described in Figure 1). Figure 5 provides an illustration of the results for one of the soil properties: soil organic carbon in the topsoil (0-50 cm). The only drivers behind changes in soil properties are the differences in NDVI and land use intensity. The impact that these drivers have on the soil properties depends on the soil type and more precisely on the range of the soil property [v is1 .. v is9 ]. Some soil types are per definition very shallow (e.g., Regosols) or have a low soil organic carbon content (e.g., Arenosols). As a result, there is little to lose for those soil types. This is also true for soil types where the variation in the soil type is very limited. Other soil types present very large ranges in the soil properties and hence have more to lose. As a result, we see the larger absolute changes in areas with intensive agriculture. The main exceptions are found, for example, in the Sahel region, where the difference in soil organic matter content between intensive agriculture and natural are limited, because the soils are shallow and have a low soil organic matter content.

DISCUSSION
The methodology that was applied aimed to assess and map the impacts of human land use on the global soil resources. By expressing the impacts in terms of tangible changes in soil properties, we provide an objective assessment and avoid discussions on the term "soil degradation" being a subjective condition (Gibbs & Salmon, 2015). The results of the study will be available through the S-world website and through the principle author of this paper.
However, the impacts of human-induced changes in soil properties still need to be interpreted to assess the impact on, for example, the Sustainable Development Goals. A relatively small decline on the poor Ferrasols in West Africa may have large impacts for food production whereas the same change on Andisols in Indonesia may have no impact at all. Stoorvogel et al. (2017) discuss the strengths and weaknesses of the S-World methodology. The map of the natural NDVI was essential for evaluating soil properties under natural conditions. Multiple efforts were tried out to develop a map of the natural vegetative cover including regression in which current NDVI is explained by environmental factors and land use, neighborhood statistics where natural NDVI was assumed to correspond to natural areas in the proximity, and frontier analyses where the highest NDVI values under specific ecological conditions were considered to represent the natural conditions (Silva et al., 2017). These efforts failed for a number of reasons: (i) Natural areas do not necessarily represent the highest NDVI values. NDVI may increase due to intensive management practices (e.g., fertilization and irrigation) or with the anthropogenic change of natural deciduous forests into evergreen pine forests. (ii) Many areas are considered to be natural in many databases whereas in reality they have been managed or converted into another land cover. In addition, human-induced badlands may be classified as natural. (ii) There is a very uneven distribution of true natural conditions. As a result, some environmental conditions may have a lower weight in the regression analysis leading to poor prediction in agro-ecosystems in areas without much true natural conditions such as in Europe. Ultimately, the regression kriging that was followed yielded very satisfactory results in the validation (Table II). A question that remains is that only changes in land use and land cover have been evaluated. Changes in climatic conditions and topography may also have taken place but it is unlikely that these changes, given their order of magnitude, would have impacted this particular assessment. An unknown factor is the potential change of soil types. The current assessment hypothesizes that the changes from natural conditions to current conditions did not lead to a change in soil types. Data on changes in soil types are however scarce and do not provide the basis for a global analysis. A good example of such a change in soil type is the transformation of Mollisols into Alfisols as observed in Iowa (Fenton, 2012). The difference between the natural and current maps can be interpreted as an estimate for human-induced "degradation" or change. However, degradation is a dynamic process. It matters whether conversions have taken place recently or a long time ago. The current conceptual model assumes equilibrium in the soil properties. This will not always be the case. It depends on the soil property and the time and extent of change in environmental conditions how valid the assumption is. Changes in soil depth, for example, may take a long time. Changes in soil organic matter (after, e.g., deforestation) may happen faster. This is particularly true for the major changes after deforestation for agriculture. A new equilibrium may take several decennia. The database provides the modelling community with a first estimate of soil properties under natural conditions. Because of the global extent, the high resolution, its quantitative character and its availability, it allows for the modelling community to evaluate the human-induced impact on various soil functions like, for example, food production or carbon sequestration.
The database that is provided here gives essential information for policy makers. Reports by international organizations like the Food and Agriculture Organization who aim to assess the state of the world's land resources (FAO, 2011) need to put the current soil conditions and observed changed in a context. The increasing call to assess impacts of global land use on a range of different ecosystem services (Newbold, 2015) is increasingly being recognized. The effect on soil resources through, for example, soil degradation plays a pivotal role in this discussion (Gibbs & Salmon, 2015). Looking back is essential, but is increasingly recognized that similar information is required to look forward and to see what can be expected with or without conservation policies (Montanarella, 2015;Bouma & Montanarella, 2016). The scientific community needs to provide insight in the opportunity space and support the definition of true targets set in, for example, the Sustainable Development Goals (ICSU & ISSC, 2015). A recent example is the discussion about the recent initiative where soils play a crucial role in mitigating climate change through carbon sequestration (Rogelj & Knutti, 2016). The establishment of realistic targets (such as the 4‰ carbon sequestration target) requires global soil databases. As such, this research aimed to contribute in mobilizing the data revolution for sustainable development (UN-IEAG, 2014).

CONCLUSION
We presented a global quantitative assessment of soil properties under current and natural conditions. The database was established through the assessment of natural land cover and an analysis with the S-World methodology. The difference between current and natural soil conditions is revealed and can be used as an estimate of the influence of land use and land cover changes on the world's soil resources. The methodological framework also enables analyses of future possible trajectories.