Improved prediction of quasi-global vegetation conditions using remotely-sensed surface soil moisture


Corresponding author: J. D. Bolten, Hydrological Sciences Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD 20705, USA.


[1] The added value of satellite-based surface soil moisture retrievals for agricultural drought monitoring is assessed by calculating the lagged rank correlation between remotely-sensed vegetation indices (VI) and soil moisture estimates obtained both before and after the assimilation of surface soil moisture retrievals derived from the Advanced Microwave Scanning Radiometer-EOS (AMSR-E) into a soil water balance model. Higher soil moisture/VI lag correlations imply an enhanced ability to predict future vegetation conditions using estimates of current soil moisture. Results demonstrate that the assimilation of AMSR-E surface soil moisture retrievals substantially improves the performance of a global drought monitoring system, particularly in sparsely-instrumented areas of the world where high-quality rainfall observations are unavailable.

1. Introduction

[2] Variations in soil moisture availability can provide a leading signal for subsequent anomalies in vegetative health and productivity [Adegoke and Carleton, 2002; Ji and Peters, 2003; Musyim, 2011]. As a result, soil moisture information is a key input into many large-scale drought monitoring systems [Mo et al., 2010]. For example, the United States Department of Agriculture (USDA) Foreign Agricultural Service (FAS) attempts to anticipate the impact of drought on regional agricultural productivity by monitoring soil moisture conditions using a quasi-global soil water balance model [Bolten et al., 2009]. However, the accuracy of such models is dependent on the quality of their required meteorological inputs and is thus questionable over data-poor regions of the globe.

[3] With the onset of data availability from the ESA Soil Moisture and Ocean Salinity (SMOS) and NASA Soil Moisture Active and Passive (SMAP) L-band missions [Kerr and Levine, 2008; Entekhabi et al., 2010], the next five years should see a significant expansion in our ability to retrieve surface soil moisture using satellite remote sensing. However, the added value of soil moisture remote sensing, above and beyond current water balance modeling approaches, has not yet been objectively quantified. Here, we evaluate the utility contributed by existing remotely-sensed surface soil moisture products for quasi-global agricultural drought monitoring. Following Peled et al. [2010] and Crow et al. [2012], our approach is based on sampling the lagged correlation between root-zone soil moisture anomalies obtained from a water balance model and subsequent anomalies in vegetation conditions (as captured by satellite-based visible/near-infrared vegetation indices). Since this approach measures the ability of current soil moisture estimates to anticipate future variations in vegetation health, it provides a direct valuation of soil moisture products in an agricultural drought context. In addition, by comparing correlations obtained before and after the assimilation of remotely-sensed surface soil moisture retrievals into the water balance model, we can quantify the added utility associated with assimilating remote sensing observations.

2. Methodology

2.1. Two-Layer Palmer Model

[4] Model estimates of surface and root-zone soil moisture are derived from the 2-Layer Palmer water balance model currently used operationally by USDA FAS. The model is based on a bucket-type modeling approach as described in Palmer [1965]. The available water capacity (AWC) of the top model layer is assumed to be 2.54 cm at field capacity, and the AWC of the second layer (i.e., root-zone layer) is calculated using soil texture, depth to bedrock, and soil type derived from the Food and Agriculture Organization (FAO) Digital Soil Map of the World, available from the FAO. In this fashion, water holding capacity for both layers (incorporating near-surface soil moisture and groundwater) ranges between 2.54 cm and 30 cm according to soil texture and soil depth. Vertical coupling between the two model layers is calculated using a simple linear diffusion equation based on the soil moisture content of each layer and an assigned diffusion coefficient [Bolten et al., 2010]. A confining layer (i.e., bedrock) is assumed for the bottom of the second model layer and is treated as a “no flow” boundary. Evapotranspiration is calculated from the modified FAO Penman-Monteith method [Allen et al., 1998] and observations of daily minimum/maximum temperature. Further modeling details are available in Bolten et al. [2010]. Required daily rainfall accumulation and air temperature datasets are obtained from the U.S. Air Force Weather Agency (AFWA) Agriculture Meteorological (AGRMET) system, which derives a daily rainfall accumulation product based on microwave sensors on various polar-orbiting satellites, infrared sensors on geostationary satellites, a model-based cloud analysis, and World Meteorological Organization (WMO) surface gauge observations.

[5] Despite its continued operational use, the 2-Layer Palmer model is obviously less complex than many more modern land surface models. However, using the same evaluation system applied here, Crow et al. [2012] found that modern land surface models generally offered only marginal increases in agricultural drought monitoring skill relative to simplistic soil water accounting models, suggesting that the 2-Layer Palmer model remains a reasonable baseline for evaluating the added impact of assimilating new remote sensing products.

[6] All modeling is performed on a quasi-global (60°S to 60°N) domain and 0.25° resolution mesh using a daily time step from June 1, 2002 to December 31, 2010. Soil moisture conditions are initialized using climatologically-averaged values (2002 to 2010) for June 1, 2002 and spun up until the start of the analysis on July 1, 2002. Soil moisture predictions obtained from the model alone will be referred to as open loop (OL) results.

2.2. Remotely-Sensed Soil Moisture

[7] Surface soil moisture retrievals are obtained from gridded 0.25° Land Parameter Retrieval Model (LPRM) products provided by VU University Amsterdam based on Advanced Microwave Scanning Radiometer-EOS (AMSR-E) brightness temperature products [Njoku et al., 2003] between June 2002 and December 2010 [de Jeu, 2003; Owe et al., 2008]. The effective measurement depth of LPRM surface soil moisture retrievals is estimated to be 1–2 cm. For the purposes of this analysis, we assume these retrievals reflect the equivalent soil moisture estimated in the surface layer of the 2-Layer Palmer model. Only descending (1:30 AM local time) overpasses are used since they appear to be more useful for soil moisture retrieval than ascending overpasses [de Jeu, 2003]. LPRM gridded products are screened to mask areas with frozen soil, snow cover, and/or excessive vegetation using a surface temperature algorithm based on 37 GHz AMSR-E brightness temperature observations and retrieved canopy optical depth [Owe et al., 2008].

2.3. The Ensemble Kalman Filter

[8] Prior to assimilation, systematic biases between modeled and observed soil moisture datasets must be removed [Kumar et al., 2012]. To eliminate these differences, raw LPRM surface soil moisture retrievals (θLPRM) are rescaled such that their inter-annual (2002 to 2010) mean (μ) and standard deviation (σ) obtained for a 31-day moving window centered on a given day-of-year (DOY) matches the mean and standard deviation sampled from the top-layer of an open-loop realization (OL1) of the model over the same time period:

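$$\theta^{*}_{\mathrm{LPRM}} = \mu^{\mathrm{DOY}}_{\mathrm{OL1}} + \left(\theta_{\mathrm{LPRM}} - \mu^{\mathrm{DOY}}_{\mathrm{LPRM}}\right)\frac{\sigma^{\mathrm{DOY}}_{\mathrm{OL1}}}{\sigma^{\mathrm{DOY}}_{\mathrm{LPRM}}} \tag{1}$$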

Note that all soil moisture variables in (1) and below are given in volumetric units [m3 m−3].
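The day-of-year rescaling described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the operational code: the function name `rescale_to_model`, the array layout (one daily series per grid cell), and the wrap-around handling of the 31-day window at the turn of the year are all assumptions made for the example.

```python
import numpy as np

def rescale_to_model(theta_obs, theta_model, doy, half_window=15):
    """Match the day-of-year mean and standard deviation of retrievals to the model.

    theta_obs, theta_model: 1-D daily arrays of equal length (NaN where missing).
    doy: integer day-of-year for each sample.
    half_window=15 gives the 31-day window centered on each day-of-year.
    """
    theta_obs = np.asarray(theta_obs, dtype=float)
    theta_model = np.asarray(theta_model, dtype=float)
    doy = np.asarray(doy)
    rescaled = np.full(theta_obs.shape, np.nan)
    for d in np.unique(doy):
        # Circular distance in days so the window wraps around the new year
        # (assumed behavior; the text does not specify edge handling).
        dist = np.abs(doy - d)
        dist = np.minimum(dist, 366 - dist)
        in_window = dist <= half_window
        obs_w = theta_obs[in_window]
        mod_w = theta_model[in_window]
        if np.count_nonzero(np.isfinite(obs_w)) < 2:
            continue  # too few retrievals to estimate a standard deviation
        if np.count_nonzero(np.isfinite(mod_w)) < 2:
            continue
        mu_obs, sd_obs = np.nanmean(obs_w), np.nanstd(obs_w)
        mu_mod, sd_mod = np.nanmean(mod_w), np.nanstd(mod_w)
        if sd_obs > 0:
            today = doy == d
            # Linear rescaling: remove the retrieval climatology, apply the model's.
            rescaled[today] = mu_mod + (theta_obs[today] - mu_obs) * sd_mod / sd_obs
    return rescaled
```

Because the statistics are pooled across all years within each window, the rescaling removes climatological differences in mean and variance while leaving inter-annual anomalies in the retrievals intact.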

[9] The assimilation of θ*LPRM from (1) into the USDA FAS 2-Layer Palmer model is based on an Ensemble Kalman Filter (EnKF). A 30-member ensemble of two-element soil moisture state vectors θj containing model surface and root-zone predictions is created via the direct perturbation of model soil-water balance calculations. These additive, mean-zero Gaussian perturbations are applied during each daily time step of the model and have covariance:

display math

where α is the ratio of surface layer AWC to root-zone AWC. Upon acquisition of θ*LPRM at time i via the rescaling step in (1), each ensemble replicate θj is updated following:

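$$\theta^{j+}_{i} = \theta^{j}_{i} + K\left(\theta^{*}_{\mathrm{LPRM},i} + \varepsilon^{j} - H\,\theta^{j}_{i}\right) \tag{3}$$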

where the observation operator H = [1 0]; ε is mean-zero, Gaussian noise with variance R [m6 m−6]; and j is an ensemble number index. The Kalman gain vector K in (3) is:

$$K = P H^{T}\left(H P H^{T} + R\right)^{-1} \tag{4}$$

with P representing the 2 × 2 state covariance matrix sampled directly from the θij ensemble (created, in part, by the introduction of noise with covariance Q) and R the scalar error variance of θ*LPRM retrievals. While θ*LPRM is assumed to directly reflect surface layer conditions, the covariance information in P is used to update both surface and root-zone forecasts contained in θij. After updating, each θij+ replicate is propagated in time by the 2-Layer Palmer model (and further perturbed via Q) until the next available θ*LPRM observation, at which point (3) is re-applied using a newly sampled P. Daily EnKF state predictions for the surface and root-zone layers, θEnKF1 and θEnKF2 respectively, are then obtained by averaging across the resulting θij+ ensemble.
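The analysis step can be illustrated with a short NumPy sketch of a standard perturbed-observation EnKF update for a two-element [surface, root-zone] state. The function name `enkf_update` and its argument layout are hypothetical, and the model propagation and Q perturbation steps are deliberately left out; only the update with H = [1 0] and a sample covariance P is shown.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_var, rng):
    """Perturbed-observation EnKF update for a [surface, root-zone] state.

    ensemble: array of shape (n_members, 2) holding the forecast states.
    obs: scalar rescaled surface soil moisture retrieval.
    obs_var: scalar observation error variance R.
    rng: numpy.random.Generator used to draw the observation noise.
    """
    H = np.array([[1.0, 0.0]])              # only the surface layer is observed
    P = np.cov(ensemble, rowvar=False)      # 2 x 2 covariance sampled from the ensemble
    S = H @ P @ H.T + obs_var               # 1 x 1 innovation variance
    K = (P @ H.T) / S                       # 2 x 1 Kalman gain
    # One independent mean-zero Gaussian perturbation per ensemble member.
    eps = rng.normal(0.0, np.sqrt(obs_var), size=ensemble.shape[0])
    innovation = obs + eps - ensemble @ H.ravel()
    # The gain spreads the surface innovation into both layers via P.
    return ensemble + innovation[:, None] * K.ravel()[None, :]
```

In a full cycle each member would be advanced by the water balance model and perturbed between calls, and the daily estimate would be taken as `updated.mean(axis=0)`. The root-zone layer is corrected only through the off-diagonal term of P, which is why the ensemble perturbations need to generate realistic cross-layer covariance.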

[10] The parameter R represents the error variance in θ*LPRM retrievals for a given land surface type. The skill of retrieved soil moisture decreases significantly over areas of dense vegetation [Njoku and Chan, 2006]. Therefore, following Bolten et al. [2010], we calculate R as:

display math

where φ is the AMSR-E incidence angle; b [m2 kg−1] is a vegetation structure coefficient (set equal to 0.30 for wooded grasslands and shrubs, grasslands, and croplands and 0.28 for closed bushlands, open shrublands, and bare soil); ωc [kg m−2] is canopy vegetation water content; and Ro is a constant set equal to 0.15² m6 m−6. A monthly climatology of Advanced Very High Resolution Radiometer-derived Normalized Difference Vegetation Index (NDVI) retrievals is used to estimate ωc following Bindlish et al. [2003]. While (5) has already been applied successfully for use in a similar data assimilation system [Bolten et al., 2010], it should be noted that more complex error estimates for θLPRM retrievals are also available [Parinussa et al., 2011].

[11] Likewise, Q captures the added uncertainty incurred when the 2-Layer Palmer model advances soil moisture estimates ahead by one day. Here we assume Q is driven primarily by the accuracy of daily rainfall accumulation products used to force the model. Since this accuracy is known to vary geographically according to the density of available rain gauges for the correction of satellite-based rainfall estimates [Gebremichael et al., 2003], Q is specified as a function of the average distance D [km] to the three closest WMO rain gauges:

display math

For the case D ≥ 200 km: Q = 0.08² m6 m−6 and R = 0.

2.4. MODIS NDVI and Land Cover Data

[12] Monthly NDVI products for evaluation are obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD13C2 product [NASA Land Processes Distributed Active Archive Center, 2011]. Monthly NDVI composite products categorized as “fully reliable” in the MOD13C2 reliability flag file are aggregated from their native 0.05° resolution to match the 0.25° resolution modeling grid. In order to focus on water-limited ecosystems, sub-grid fractions of barren, tundra, forest cover, and open water surfaces from the MODIS MCD12C1 land cover classification [NASA Land Processes Distributed Active Archive Center, 2010] are summed within each global 0.25° pixel, and pixels where the sum of these areas constitutes more than a 50% areal fraction are masked.
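The aggregation and masking steps can be sketched as below. This is a hedged illustration: the text does not state how reliable 0.05° pixels are combined, so a simple average over each 5 × 5 block is assumed, and the function names `aggregate_ndvi` and `land_cover_mask` are hypothetical.

```python
import numpy as np

def aggregate_ndvi(ndvi_fine, reliable, factor=5):
    """Average reliable 0.05-degree NDVI pixels into 0.25-degree cells.

    ndvi_fine: 2-D array whose dimensions are multiples of `factor`.
    reliable: boolean array, True where the reliability flag is "fully reliable".
    Cells with no reliable pixels are returned as NaN.
    """
    ndvi = np.where(reliable, ndvi_fine, np.nan)
    rows, cols = ndvi.shape
    # Split the grid into (factor x factor) blocks, one per coarse cell.
    blocks = ndvi.reshape(rows // factor, factor, cols // factor, factor)
    valid = np.isfinite(blocks)
    count = valid.sum(axis=(1, 3))
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

def land_cover_mask(excluded_fraction_sum, threshold=0.5):
    """True where barren + tundra + forest + open water exceed the areal threshold."""
    return np.asarray(excluded_fraction_sum) > threshold
```

Pixels flagged by `land_cover_mask` would then be excluded from all subsequent correlation sampling.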

3. Analysis

3.1. Rank Correlation Sampling

[13] Daily root-zone soil moisture estimates obtained from the 2-Layer Palmer model open loop (θOL2) and analysis products produced by the EnKF-based assimilation of θ*LPRM into the 2-Layer Palmer model (θEnKF2) are separately aggregated in time between July 2002 and December 2010 to obtain monthly-averaged θOL2 and θEnKF2 time series. Any months containing five or fewer θLPRM retrievals (e.g., due to snow cover and/or excessive vegetation) are masked from this monthly product. Each product is then grouped by month-of-year and ranked according to soil moisture value within these groupings. The resulting rank time series of Rank(θOL2) and Rank(θEnKF2) describe the wetness of any particular month relative to the same month in all other years of the 2002 to 2010 data record.

[14] An analogous aggregation and ranking procedure is applied to monthly MODIS-based NDVI to obtain Rank(NDVI), and the sample correlation coefficient Rs(L) is calculated between Rank(NDVI) for month k and both Rank(θOL2) and Rank(θEnKF2) for month k + L. Much of our analysis will focus on the specific case Rs(−1) where L = −1, and rank correlation is calculated between monthly soil moisture and NDVI when soil moisture precedes NDVI by one month. Note that since neither the 2-Layer Palmer model nor LPRM utilizes NDVI information (and the EnKF uses only climatological NDVI information which will not impact inter-annual ranks), sampled Rs should not be spuriously impacted by the presence of cross-correlated errors. To focus on periods of the annual cycle prone to water (and not energy) limitation, only months with an average daily high air temperature above 5°C are included in such sampling. A minimum threshold of at least 30 monthly NDVI and θOL2 (or θEnKF2) pairs is then required to sample a reliable estimate of Rs.
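The month-of-year ranking and lagged correlation can be sketched for a single grid cell as follows. The function names are hypothetical, the input is assumed to be a gap-free sequence of calendar months (with NaN marking masked months), ties are broken by order of occurrence rather than averaged, and the 5°C temperature screen is assumed to have been applied beforehand by setting excluded months to NaN.

```python
import numpy as np

def rank_within_month(values, month):
    """Rank each value among the same calendar month in all years (NaN stays NaN)."""
    values = np.asarray(values, dtype=float)
    month = np.asarray(month)
    ranks = np.full(values.shape, np.nan)
    for m in np.unique(month):
        idx = np.flatnonzero((month == m) & np.isfinite(values))
        order = np.argsort(values[idx], kind="stable")
        ranks[idx[order]] = np.arange(1, idx.size + 1)
    return ranks

def lagged_rank_correlation(soil_moisture, ndvi, month, lag=-1, min_pairs=30):
    """Correlate Rank(NDVI) in month k with Rank(soil moisture) in month k + lag."""
    sm_rank = rank_within_month(soil_moisture, month)
    vi_rank = rank_within_month(ndvi, month)
    n = sm_rank.size
    k = np.arange(n)
    src = k + lag                      # index of the soil moisture month paired with k
    ok = (src >= 0) & (src < n)
    x = sm_rank[src[ok]]
    y = vi_rank[k[ok]]
    paired = np.isfinite(x) & np.isfinite(y)
    if paired.sum() < min_pairs:
        return np.nan                  # too few pairs for a reliable estimate
    return np.corrcoef(x[paired], y[paired])[0, 1]
```

With `lag=-1`, soil moisture from the preceding month is paired with NDVI from the current month, which corresponds to the Rs(−1) case emphasized in the text.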

3.2. Results

[15] Figure 1 shows Rs(−1) on a global 0.25° grid for the model-only θOL2 product (Rs(−1)OL2; Figure 1a) and the θEnKF2 analysis (Rs(−1)EnKF2; Figure 1b). Figure 1c plots the difference obtained by subtracting Figure 1a from Figure 1b (i.e., Rs(−1)EnKF2 minus Rs(−1)OL2). By applying a Fisher transformation, Z-scores for this difference are calculated as:

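$$Z = \frac{F\left(R_s^{\mathrm{EnKF2}}\right) - F\left(R_s^{\mathrm{OL2}}\right)}{\sqrt{2\,(1.06)/(n-3)}} \tag{7}$$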

where F(Rs) = 0.5 ln[(1 + Rs)/(1 − Rs)]; n is the number of monthly soil moisture/NDVI pairs sampled to obtain Rs; and the factor 1.06 corrects for the non-Gaussian distribution of sampled Rs [Fieller et al., 1957]. Resulting Z-scores are plotted in Figure 1d. Note that since (7) neglects both the presence of auto-correlation in Rank(θOL2) and Rank(θEnKF2) and potential cross-correlation in sampling error, care should be exercised in formally interpreting Figure 1d. Blank areas in Figure 1 are due to pixels failing the land-cover masking criteria described in Section 2.4 or land pixels where fewer than 30 pairs of values are available for estimating Rs(−1) (see Section 3.1).
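A Z-score of this kind can be computed as sketched below. The exact form is an assumption consistent with the definitions in the text: each Fisher-transformed rank correlation is given a sampling variance of 1.06/(n − 3), and the two correlations are treated as independent, which is the simplification the text cautions about. The function name `fisher_z_difference` is hypothetical.

```python
import numpy as np

def fisher_z_difference(rs_enkf, rs_ol, n):
    """Z-score for the difference between two rank correlations.

    rs_enkf, rs_ol: rank correlations (scalars or arrays) with |Rs| < 1.
    n: number of soil moisture/NDVI pairs used to sample each correlation.
    Assumes independent sampling errors, each with variance 1.06 / (n - 3).
    """
    # np.arctanh(r) equals the Fisher transform 0.5 * ln((1 + r) / (1 - r)).
    diff = np.arctanh(rs_enkf) - np.arctanh(rs_ol)
    return diff / np.sqrt(2.0 * 1.06 / (n - 3))
```

Positive values indicate that the assimilation case has the stronger lagged correlation with NDVI; because auto-correlation and cross-correlated sampling errors are ignored, the scores are better read as a relative indicator than as a formal significance test.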

Figure 1.

Global analysis of Rs(−1) (i.e., the rank correlation between monthly soil moisture and NDVI when soil moisture precedes NDVI by one month) for (a) model-only (root-zone) soil moisture [Rs(−1)OL2] and (b) EnKF (root-zone) soil moisture created by assimilating LPRM soil moisture into the model [Rs(−1)EnKF2]. Also plotted are (c) Rs(−1)EnKF2 minus Rs(−1)OL2 (reflecting the net impact of assimilating LPRM retrievals) and (d) Z-scores for Rs(−1)EnKF2 minus Rs(−1)OL2 given by (7).

[16] Arid regions (e.g., the Western United States, Southern Africa, and Australia) generally demonstrate the highest coupling between open loop root-zone soil moisture and future NDVI (Figure 1a). However, a sharp jump in Rs(−1) is noted when θ*LPRM is assimilated into the model (Figures 1b and 1c). The benefit of surface soil moisture data assimilation is especially large in areas of the world where poor rain-gauge coverage degrades the quality of model-only θOL2 estimates (e.g., Africa, Central Australia, and Central Asia). In these areas, the assimilation of θ*LPRM improves monthly-scale soil moisture estimates by filtering random modeling errors due to poorly-observed rainfall. In addition to the L = −1 case shown in Figure 1, qualitatively similar results (not shown) are also found for the cases L = −2 and −3 [months]. Finally, Figure 1d demonstrates that spatially continuous areas of enhanced Rs(−1) are statistically significant at a 1σ level, and only sporadic areas of significantly degraded Rs(−1) are found.

[17] As seen in Figure 1, the impact of θ*LPRM assimilation is especially large in data-poor areas of the world lacking sufficient ground-based rain gauge instrumentation for adequate correction of satellite-based rainfall products. A number of these notably data-poor countries also face considerable food security challenges. Figure 2 shows Rs(L)OL2 and Rs(L)EnKF2 results averaged within six countries in Africa and Southern Asia with moderate to severe food security issues. Relative to the model-only case, the EnKF data assimilation case demonstrates consistently stronger rank correlation with future NDVI in these countries.

Figure 2.

Comparisons between Rs(L)OL2 and Rs(L)EnKF2 over a range of L (i.e., 0 to 6 months) for sparsely-instrumented countries with moderate-to-severe food security issues. Plotted variable Rs(L) is the rank correlation between monthly soil moisture for month i + L and NDVI for month i.

[18] It is also useful to examine seasonal trends in Rs(−1). For both the Extra-Tropical Northern Hemisphere (ETNH; to 60°N) and Southern Hemisphere (ETSH; to 60°S), spatially-averaged Rs(−1)OL2 and Rs(−1)EnKF2 are plotted in Figure 3 as a function of month-of-year. The seasonal time series in Figure 3 demonstrates an intuitive pattern with the highest soil moisture/NDVI coupling, and thus the largest Rs(−1), occurring during the middle/end of ETNH and ETSH summers when soil moisture storage tends to be at a yearly minimum. An increase in spatially-averaged Rs(−1)EnKF2 (relative to Rs(−1)OL2) is apparent throughout the annual cycle. In particular, despite relatively high levels of biomass, and thus reduced accuracy in remotely-sensed surface soil moisture retrievals [Njoku and Chan, 2006] during the ETNH and ETSH summers, the positive impact of soil moisture data assimilation persists throughout the growing season. This ability to add skill in the middle portion of the growing season is critical since crop yield sensitivity to water stress is maximized during this period.

Figure 3.

Seasonal cycles of Rs(−1)OL2 and Rs(−1)EnKF2 averaged within the Extra-tropical Northern (ETNH; to 60°N) and Southern (ETSH; to 60°S) Hemispheres. Plotted variable Rs(−1) is the rank correlation between monthly soil moisture and NDVI when soil moisture precedes NDVI by 1 month.

4. Conclusions

[19] Agricultural drought is commonly defined as an extended period of lower than normal root-zone soil moisture characterized by a reduction in plant biomass and ecologic productivity. Here, we quantify the added value of remotely-sensed surface soil moisture retrievals for improving our ability to accurately predict agricultural drought impacts on regional vegetation productivity. Following the general approach of Peled et al. [2010], our evaluation is based on sampling the rank correlation between current monthly root-zone soil moisture and future NDVI conditions.

[20] The assimilation of surface soil moisture retrievals into a quasi-global soil water balance model is shown to significantly improve the value of model-based, root-zone soil moisture estimates as a leading indicator of agricultural drought (Figure 1). Such improvement is especially clear over data-poor regions of the world where modeled soil moisture estimates are derived from poor-quality rainfall observations (Figure 2). Value is added even during the middle portion of the growing season when both vegetation biomass and crop yield sensitivity to drought are maximized (Figure 3). Overall, results provide an important new verification of the potential of remotely-sensed surface soil moisture for regional-scale agricultural and ecological prediction activities, particularly in water-limited and/or data-poor regions of the world prone to food insecurity. While the use of vegetation indices like NDVI as a proxy variable for yield is well-established [Becker-Reshef et al., 2010], a better characterization of agricultural productivity can be obtained from comparison against actual crop yield data. In addition, to fully characterize their utility for rapid drought and famine response, soil moisture estimates should ideally be evaluated at finer temporal scales (e.g., weekly rather than monthly). Finally, at least two more years of SMOS data collection are required to apply a comparable analysis to L-band satellite soil moisture products. Consequently, results in this paper should be interpreted only as a feasibility analysis and not as the description of a finalized agricultural drought product.


[21] Research was funded by a grant from the NASA Applied Sciences Program.

[22] The Editor thanks Wouter Dorigo and an anonymous reviewer for assistance in evaluating this paper.