Estimating precipitation errors using spaceborne surface soil moisture retrievals



[1] Limitations in the availability of ground-based rain gauge data currently hamper our ability to quantify errors in global precipitation products over data-poor areas of the world. Over land, these limitations may be eased by approaches based on interpreting the degree of dynamic consistency existing between precipitation estimates and remotely-sensed surface soil moisture retrievals. This paper demonstrates how such an approach can be implemented using a Kalman filter tuning procedure to reliably estimate daily rainfall errors in global precipitation products without reliance on ground-based rainfall observations.

1. Introduction

[2] The validation of satellite-based precipitation products with ground-based resources represents a notable challenge - particularly in global areas lacking adequate ground-based rain radar and rain gauge coverage [Amitai et al., 2005]. Over land, these difficulties may be eased by techniques that evaluate the degree of hydrologic consistency existing between rainfall and other hydrologic variables [McCabe et al., 2007]. In particular, the simultaneous spaceborne retrieval of both global precipitation and surface soil moisture products provides an opportunity to evaluate both products based on their mutual dynamic consistency. Here we develop and apply a tuned Kalman filtering strategy to the assimilation of spaceborne surface soil moisture retrievals into a simple water balance model forced by a range of individual precipitation products. Modeling uncertainties derived via tuning of Kalman filter error parameters - based on the statistical analysis of filtering innovations - are compared with actual errors in precipitation forcing products to evaluate whether the approach is capable of reliably estimating the accuracy of global precipitation products in the absence of ground-based rainfall observations. As an initial exercise, the technique is applied to a data-rich area (the southern contiguous United States) where extensive ground-based rain gauge observations are available to validate the approach. However, the potential for exploiting remotely sensed soil moisture retrievals to provide robust estimates of daily precipitation errors in data-poor areas is emphasized.

2. Kalman Filtering

[3] Our technique is based on using uncertain daily precipitation totals (P′) to drive a simple daily antecedent precipitation index (API) model

$$\mathrm{API}_i = \gamma_i\,\mathrm{API}_{i-1} + P'_i \tag{1}$$

where γ is the API coefficient and i is a daily time index. Following Crow and Zhan [2007], evapotranspiration seasonality is captured by varying γ according to day-of-year (d)

$$\gamma_i = \alpha + \beta\cos\left(\frac{2\pi d_i}{365}\right) \tag{2}$$

Parameters α and β are constants and set equal to 0.85 and 0.10, respectively.
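As a minimal sketch of the API model described above, the Python below propagates API forward in time from a daily rainfall series. The cosine form of the seasonal coefficient with a 365-day period is an assumption made for illustration, as are the function names `api_coefficient` and `run_api` and the initial value `api0`; only the constants 0.85 and 0.10 are taken from the text.

```python
import math

ALPHA = 0.85  # constant alpha from the text
BETA = 0.10   # constant beta from the text


def api_coefficient(day_of_year, alpha=ALPHA, beta=BETA):
    """Seasonal API coefficient gamma (assumed cosine form, 365-day period)."""
    return alpha + beta * math.cos(2.0 * math.pi * day_of_year / 365.0)


def run_api(precip, first_day_of_year=1, api0=0.0):
    """Propagate API_i = gamma_i * API_{i-1} + P'_i over a daily rainfall series."""
    api = api0
    series = []
    for step, p in enumerate(precip):
        # Wrap the day-of-year index so multi-year series can be handled.
        d = (first_day_of_year - 1 + step) % 365 + 1
        api = api_coefficient(d) * api + p
        series.append(api)
    return series
```

Under this assumed form, γ is largest near the start of the calendar year, so modeled moisture is lost more slowly in (Northern Hemisphere) winter than in summer.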

[4] When available, remotely-sensed soil moisture estimates θ are used to update equation (1) using a Kalman filter

$$\mathrm{API}_i^{+} = \mathrm{API}_i^{-} + K_i\left(\theta_i - \mathrm{API}_i^{-}\right) \tag{3}$$

Here “−” and “+” denote API values before and after Kalman filter updating. Following Reichle and Koster [2005], daily θ estimates are obtained by linearly rescaling a time series of raw volumetric soil moisture retrievals such that their long-term mean and variance match those derived from a multi-year API time series. See section 4 for additional details.

[5] The Kalman gain K in equation (3) is given by

$$K_i = \frac{T_i^{-}}{T_i^{-} + S} \tag{4}$$

where T is the scalar error variance in API forecasts and S the scalar error variance in θ retrievals. At measurement times, T is updated via

$$T_i^{+} = (1 - K_i)\,T_i^{-} \tag{5}$$

[6] Between soil moisture measurements (and the associated adjustment of API and T via equations (3) and (5)), the API model is propagated forward in time using observed P′ and equation (1). In parallel, the updated forecast error variance $T^{+}$ is propagated in time following

$$T_i^{-} = \gamma_i^{2}\,T_{i-1}^{+} + Q \tag{6}$$

where Q is the scalar error variance added to an API forecast as it propagates from time i − 1 to i.
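As a minimal sketch of the forecast/update cycle described in this section, the Python function below assumes the standard scalar Kalman filter form: the gain is the forecast error variance divided by the sum of forecast and observation error variances, and the forecast error variance is propagated as γ² times the updated variance plus Q. The function name `kalman_api`, its arguments, the use of `None` to mark days without a retrieval, and the initial values `api0` and `T0` are illustrative assumptions rather than the implementation used here.

```python
import math


def kalman_api(precip, theta, gamma, Q, S, api0=0.0, T0=1.0):
    """Run a scalar Kalman filter on the API model.

    theta[i] is None on days without a soil moisture retrieval.
    Returns (analysis API series, normalized innovations).
    """
    api, T = api0, T0
    analysis, innovations = [], []
    for p, obs, g in zip(precip, theta, gamma):
        # Forecast step: propagate the state and its error variance.
        api = g * api + p
        T = g * g * T + Q
        if obs is not None:
            # Update step: weight the innovation by the Kalman gain.
            innov = obs - api
            innovations.append(innov / math.sqrt(T + S))
            K = T / (T + S)
            api = api + K * innov
            T = (1.0 - K) * T
        analysis.append(api)
    return analysis, innovations
```

On days without a retrieval only the forecast step runs, so the error variance T grows until the next observation pulls it back down.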

[7] Of particular interest here are the model and observation noise variance parameters Q and S. Proper choices for these parameters lead to a sequence of normalized filter innovations (ν), defined as

$$\nu_i = \frac{\theta_i - \mathrm{API}_i^{-}}{\sqrt{T_i^{-} + S}} \tag{7}$$

that is both serially uncorrelated ($\rho_\nu(1) = 0$) and has a temporal second moment of one ($E[\nu^2] = 1$) [Mehra, 1971]. Here, the entire modeling time period is repeatedly simulated until constant values of Q and S are found which approximately satisfy the $\rho_\nu(1) = 0$ and $E[\nu^2] = 1$ innovation constraints. As described in section 3, modeling noise variance estimates obtained in this manner (denoted $\hat{Q}$) can potentially be used to evaluate the accuracy of P′ rainfall forcing in equation (1).
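The two innovation constraints can be checked with a few lines of Python. This is a sketch only: the function name `innovation_diagnostics` is hypothetical, and the lag-1 autocorrelation uses the common biased estimator (autocovariance and variance both divided by the sample size), which is one of several reasonable choices.

```python
def innovation_diagnostics(nu):
    """Return (lag-1 autocorrelation, temporal second moment) of innovations."""
    n = len(nu)
    mean = sum(nu) / n
    var = sum((x - mean) ** 2 for x in nu) / n
    # Lag-1 autocovariance, normalized by n like the variance above.
    cov = sum((nu[i] - mean) * (nu[i + 1] - mean) for i in range(n - 1)) / n
    second_moment = sum(x * x for x in nu) / n
    return cov / var, second_moment
```

A well-tuned filter should return a first value near zero and a second value near one; departures from either target indicate that the assumed Q and S need adjusting.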

3. Tuned Kalman Filtering

[8] As a preliminary test, the filter calibration approach is applied to the assimilation of surface soil moisture estimates into equation (1) using a synthetic twin experimental methodology. The experiment is based on: the generation of a 10-year “truth” soil moisture data set using an (assumed) error-free precipitation product (P) and equation (1) for a 1° latitude/longitude box in the south-central United States, the perturbation of these values via additive Gaussian random error, and their subsequent re-assimilation into equation (1) - now driven by a perturbed precipitation product (P′ = αP), where α is a log-normal random variable (distinct from the constant α of section 2) with mean $\mu_\alpha$, standard deviation $\sigma_\alpha$ and serial correlation $\rho_\alpha(1)$. Figure 1 shows results for the case of $\mu_\alpha = 1$, $\rho_\alpha(1) = 0$, and $\sigma_\alpha$ equal to both 0.25 (rain product A) and 1.0 (rain product B). The plotted lines in Figure 1 are constructed from a large number of synthetic experimental runs in which S is fixed at a range of values and Q is modified until various innovation constraints are met. For instance, upon assimilation of rain product A, the first innovation constraint (innovation whiteness or $\rho_\nu(1) = 0$) is satisfied by the combinations of assumed Q and S defined by the dark dashed black line and the second constraint (innovation second moment of one or $E[\nu^2] = 1$) by the solid dark black line. The single combination of assumed Q and S magnitudes ($\hat{Q}_A$ and $\hat{S}$) satisfying both constraints represents the model and observation noise variances predicted by the tuned filter. Repeating the experiment for rain product B (see lighter dashed and solid lines in Figure 1) yields the same $\hat{S}$ estimate and a larger estimate of $\hat{Q}$ - accurately reflecting the increased amount of model noise associated with utilizing the lower accuracy rainfall product. Taken as a whole, Figure 1 suggests that the square root of retrieved $\hat{Q}$ magnitudes obtained from the tuned filter may provide an accurate estimate of the actual root-mean-square (RMS) accuracy of precipitation products A and B.

Figure 1.

Utilizing a statistical analysis of filtering innovations (ν) in order to simultaneously estimate both the API model noise variance Q and observation error variance S (for an assimilated soil moisture product). Results are shown for synthetic twin experiments based on forcing the API model with a high (product A) and low (product B) accuracy rainfall product.
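The synthetic twin experiment of section 3 can be sketched as follows. This is an illustration under stated assumptions, not the experiment used to produce Figure 1: the "true" rainfall is a hypothetical generator (rain on roughly 30% of days with exponentially distributed depths), the seasonal coefficient uses an assumed cosine form with the constants 0.85 and 0.10, and the names `lognormal_multiplier` and `twin_experiment`, the 30-day spin-up and the observation error `obs_std` are all choices made for this example.

```python
import numpy as np


def lognormal_multiplier(rng, mean, std, size):
    """Log-normal samples with the requested arithmetic mean and std."""
    sigma2 = np.log(1.0 + (std / mean) ** 2)
    return rng.lognormal(np.log(mean) - 0.5 * sigma2, np.sqrt(sigma2), size)


def twin_experiment(sigma_alpha, obs_std, Q, S, n_days=3650, seed=0):
    """Return (lag-1 innovation autocorrelation, E[nu^2]) for one (Q, S) pair."""
    rng = np.random.default_rng(seed)
    days = np.arange(n_days) % 365 + 1
    gamma = 0.85 + 0.10 * np.cos(2.0 * np.pi * days / 365.0)  # assumed form
    # Hypothetical error-free rainfall P (mm/day) and perturbed product P'.
    p_true = rng.exponential(8.0, n_days) * (rng.random(n_days) < 0.3)
    p_pert = p_true * lognormal_multiplier(rng, 1.0, sigma_alpha, n_days)

    # Truth run, then synthetic observations with additive Gaussian error.
    truth = np.empty(n_days)
    api = 0.0
    for i in range(n_days):
        api = gamma[i] * api + p_true[i]
        truth[i] = api
    theta = truth + rng.normal(0.0, obs_std, n_days)

    # Re-assimilate the observations into the model driven by P'.
    api, T = 0.0, Q
    nu = np.empty(n_days)
    for i in range(n_days):
        api = gamma[i] * api + p_pert[i]
        T = gamma[i] ** 2 * T + Q
        nu[i] = (theta[i] - api) / np.sqrt(T + S)
        K = T / (T + S)
        api += K * (theta[i] - api)
        T *= 1.0 - K
    nu = nu[30:]  # discard a short spin-up period
    rho1 = np.corrcoef(nu[:-1], nu[1:])[0, 1]
    return rho1, np.mean(nu ** 2)
```

Calling `twin_experiment` over a grid of assumed Q values at fixed S, and recording where each statistic crosses its target, traces out curves of the kind described for Figure 1.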

[9] The ability of tuned filtering approaches to accurately capture model forcing errors has been well-studied for several decades [Mehra, 1971]. In the absence of any Kalman updating, white noise in P′ will produce persistent drifting in uncorrected API predictions obtained via equation (1) and positively-correlated serial forecast errors. During application of the Kalman filter, placing too much weight on the API forecasts (via either an underestimation of Q or an overestimation of S) will cause the filter to diverge from an optimal estimate and produce subsequent filtering innovations ν with a positive serial correlation ($\rho_\nu(1) > 0$). Conversely, assuming the presence of temporally uncorrelated observation errors, overreacting to θ observations (due to an overestimation of Q or an underestimation of S) produces spurious, random corrections to API. When compared to the true auto-regressive model for API, such perturbations will manifest themselves as negatively-correlated serial forecast errors and filter innovations (i.e., $\rho_\nu(1) < 0$). Calibrating assumed Q and S magnitudes such that $\rho_\nu(1) = 0$ ensures a correct partitioning between forecasting and observation errors. Since $\rho_\nu(1)$ is sensitive only to the ratio between Q and S, a second constraint ($E[\nu^2] = 1$) is required to fix an absolute value for Q (Figure 1).

[10] Initial results in Figure 1 are based on a highly idealized synthetic experiment. Additional difficulties will arise when the statistical properties of actual precipitation errors differ from those required for the optimal performance of the Kalman filter. For instance, replicating the experiment for biased ($\mu_\alpha \neq 1$) and/or serially-correlated ($\rho_\alpha(1) > 0$) rainfall error can result in large differences between the square root of $\hat{Q}$ and actual RMS rainfall errors. In addition, the ability of the approach in Figure 1 to accurately estimate rainfall errors is based on an implicit assumption that such errors are the dominant source of model forecast uncertainty. This assumption is questionable given the potential inadequacy of the simple soil water loss parameterization in equation (1). The remainder of this paper will clarify the actual value of $\hat{Q}$ for rainfall product evaluation by applying the approach in Figure 1 to a number of real data cases.

4. Data and Approach

[11] Real data results are based on a single remotely-sensed soil moisture data set and a range of global and remotely-sensed precipitation products. Remotely-sensed surface soil moisture estimates θ° are obtained from application of the single-polarization Jackson [1993] algorithm to X-band Advanced Microwave Scanning Radiometer (AMSR-E) brightness temperature ($T_B$) data by Thomas Jackson and Xiwu Zhan (USDA Hydrology and Remote Sensing Laboratory). Moderate Resolution Imaging Spectroradiometer (MODIS) 16-day normalized difference vegetation index (NDVI) composite products and the vegetation water content (VWC)/NDVI regression relationship from Jackson et al. [1999] are used to estimate VWC. Surface soil moisture retrievals are acquired with a spatial resolution of about $40^2$ km$^2$ and at a repeat time of 1 to 2 days at mid-latitudes. Screening is performed to mask potentially corrupted retrievals obtained during precipitation events. Using simple spatial averaging, the soil moisture fields are re-sampled onto a 1° latitude/longitude grid.

[12] Benchmark precipitation values are obtained from the high-density, gauge-based National Centers for Environmental Prediction (NCEP) Climate Prediction Center (CPC) retrospective rainfall product within the contiguous United States. Hourly CPC data, processed as part of the North American Land Data Assimilation System (NLDAS) project [see Cosgrove et al., 2003], are aggregated in time and space into daily (0 UTC to 0 UTC) accumulation observations on a 1° spatial grid. In addition, eight separate global and/or remotely-sensed precipitation products are acquired to represent P′ in equation (1). They include: (1) the 1° daily (1DD) Global Precipitation Climatology Project (GPCP) multi-satellite rainfall product [Huffman et al., 2001], (2) the Tropical Rainfall Measuring Mission (TRMM) multi-sensor “3B42” product [Huffman et al., 2007], (3) the real-time, microwave-only TRMM “3B40RT” product [Huffman et al., 2007], (4) the Air Force Weather Agency (AFWA) multi-satellite product generated by the Agricultural Meteorology (AGRMET) modeling system, (5) the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) product [Sorooshian et al., 2000], (6) the NCEP Stage IV rain gauge/ground-based radar product, (7) a single World Meteorological Organization rain gauge at 34.08°N, 102.37°W (WMO1), and (8) the average of this gauge and a second WMO gauge located at 35.38°N, 102.16°W (WMO2). The last two solely gauge-based products are included to represent rainfall measurement errors typical of low-density rain gauge networks within data-poor areas. To match the reprocessed CPC data, all precipitation products are resampled in time and space to produce daily (0 to 0 UTC) rainfall totals on a 1° grid.

[13] The tuned filtering approach is run over the entire southern tier of the United States between 2002 July 1 and 2005 December 31 with a special focus on the southern Great Plains (SGP) area (33 to 40°N and 105 to 100°W). Prior to any assimilation, domain-scale API statistics are obtained by averaging API temporal mean and standard deviation for each 1° box in the appropriate domain (e.g., the SGP region in Figure 2). For these calculations, daily API precipitation forcing is calculated by averaging across the daily precipitation totals obtained from all the satellite-based precipitation products. Raw volumetric θ° retrievals are then linearly rescaled such that their domain-averaged temporal mean and standard deviation - now in water depth dimensions - match these API statistics. Since the API model is run on a daily time-step, sub-daily variations in modeled soil moisture are explicitly neglected.
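The linear rescaling step can be illustrated with a short Python sketch. For simplicity it rescales a single retrieval time series against a target API mean and standard deviation, whereas the procedure described above matches domain-averaged statistics; the function name `rescale_to_api` and the use of NaN to mark masked retrievals are assumptions made for this example.

```python
import numpy as np


def rescale_to_api(theta_raw, api_mean, api_std):
    """Linearly rescale raw retrievals to a target mean and standard deviation.

    NaN entries (masked retrievals) are ignored in the statistics and
    remain NaN in the output.
    """
    theta_raw = np.asarray(theta_raw, dtype=float)
    valid = ~np.isnan(theta_raw)
    mu = theta_raw[valid].mean()
    sd = theta_raw[valid].std()
    # Shift to zero mean, scale to the target spread, then add the target mean.
    return (theta_raw - mu) * (api_std / sd) + api_mean
```

Because the transformation is linear, it changes the units of the retrievals (volumetric content to water depth) without altering their temporal correlation with the model.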

Figure 2.

The square root of $\hat{Q}$ for the GPCP-1DD product over the southern United States, derived using the Kalman filter tuning procedure in Figure 1. The box represents the southern Great Plains (SGP) domain examined in Figure 3.

[14] Our overall approach will be to derive $\hat{Q}$ values for all eight global/remotely-sensed rainfall products listed above using AMSR-E surface soil moisture retrievals. Numerical optimization is performed using a two-step approach in which the set of $\hat{Q}$ and $\hat{S}$ combinations satisfying the innovation second-moment constraint is reduced to a single combination producing the whitest innovations. Derived $\hat{Q}$ values will be compared to the daily RMS error of rainfall products calculated versus the benchmark NCEP CPC precipitation product to determine what value, if any, $\hat{Q}$ estimates have for evaluating the accuracy of daily precipitation products in land regions of the world lacking sufficient ground resources for traditional gauge-based validation.
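One way to realize the two-step optimization is sketched below in Python. It is an illustration under stated assumptions, not the optimizer used for the results that follow: step 1 uses a log-space bisection on Q at each fixed S (assuming the innovation second moment decreases monotonically as Q increases), and step 2 picks, from a user-supplied grid of S values, the pair with the smallest absolute lag-1 innovation autocorrelation. The function names, the search bounds and the NaN convention for missing retrievals are hypothetical.

```python
import math

import numpy as np


def innovations(precip, theta, gamma, Q, S):
    """Normalized innovations of the scalar API filter (NaN = no retrieval)."""
    api, T, nu = 0.0, Q, []
    for p, obs, g in zip(precip, theta, gamma):
        api, T = g * api + p, g * g * T + Q
        if not math.isnan(obs):
            nu.append((obs - api) / math.sqrt(T + S))
            K = T / (T + S)
            api, T = api + K * (obs - api), (1.0 - K) * T
    return np.array(nu)


def solve_q(precip, theta, gamma, S, lo=1e-4, hi=1e4, iters=30):
    """Step 1: bisect in log space for the Q giving E[nu^2] = 1 at fixed S."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if np.mean(innovations(precip, theta, gamma, mid, S) ** 2) > 1.0:
            lo = mid  # innovations too large: assumed Q is too small
        else:
            hi = mid
    return math.sqrt(lo * hi)


def tune(precip, theta, gamma, s_grid):
    """Step 2: among step-1 (Q, S) pairs, keep the whitest innovations."""
    best = None
    for S in s_grid:
        Q = solve_q(precip, theta, gamma, S)
        nu = innovations(precip, theta, gamma, Q, S)
        rho1 = abs(np.corrcoef(nu[:-1], nu[1:])[0, 1])
        if best is None or rho1 < best[0]:
            best = (rho1, Q, S)
    return best[1], best[2]
```

If the second-moment constraint cannot be met inside the search bounds for a given S, the bisection simply returns a value at the edge of the bounds, so the bounds and the S grid should be chosen to bracket plausible error variances.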

5. Results

[15] Figure 2 plots the square root of $\hat{Q}$ results for the GPCP-1DD product over the southern tier of the United States. Analogous plots were created for all eight global and/or remotely-sensed precipitation products listed in section 4 (not shown). Figure 3 summarizes the relationship between $\hat{Q}$ and the actual daily RMS accuracy of the rain products - approximated here by calculating the daily RMS difference between each product and benchmark rainfall totals derived from the gauge-based NCEP CPC rainfall data set. Each grey circle in Figure 3 represents results in a single 1° box within the SGP domain for a given precipitation product. All error values are calculated between 2002 July 1 and 2005 December 31. The well-defined linear relationship between the square root of $\hat{Q}$ and actual rainfall RMS error suggests that much of the validation information derived from dense CPC rain gauge observations in the SGP region can be replicated using $\hat{Q}$ estimates based solely on the tuned Kalman filtering of spaceborne soil moisture retrievals. Black symbols plotted in Figure 3 are based on averaging the square root of $\hat{Q}$ and rainfall RMS error for each product across the SGP domain and highlight the ability of $\hat{Q}$ retrievals to resolve variations in accuracy between products. At this coarser spatial scale, trends in product RMS accuracy are reproduced extremely well ($R^2$ = 0.99) - demonstrating that the approach can make robust distinctions concerning the relative accuracy of competing rainfall products in the absence of ground-based rainfall observations. However, the relationship in Figure 3 is not exactly one-to-one: the least-squares regression slope is slightly less than one (0.90) and the y-intercept is non-zero (2.05 mm day$^{-1}$). The positive y-intercept represents a kernel of uncertainty in API forecasts attributable to factors other than imperfect rainfall forcing. Likewise, the slight reduction in regression slope below one (i.e., the small loss in sensitivity of $\hat{Q}$ to rainfall RMS error in Figure 3) may be caused by a lack of soil moisture retrievals during and immediately after intense rainfall events, when rainfall errors have the strongest impact on surface moisture conditions and thus on estimated $\hat{Q}$.
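The comparison summarized in Figure 3 rests on two simple calculations: a daily RMS difference between each product and the benchmark, and a least-squares line relating filter-derived noise estimates to those RMS errors. A minimal Python sketch is given below; the function names `daily_rmse` and `fit_line` are hypothetical, and `numpy.polyfit` is used as one convenient way to obtain the slope and intercept.

```python
import numpy as np


def daily_rmse(product, benchmark):
    """Root-mean-square difference between two daily rainfall series."""
    product = np.asarray(product, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)
    return float(np.sqrt(np.mean((product - benchmark) ** 2)))


def fit_line(rmse, sqrt_q):
    """Least-squares slope, intercept and R^2 of sqrt_q regressed on rmse."""
    slope, intercept = np.polyfit(rmse, sqrt_q, 1)
    r2 = np.corrcoef(rmse, sqrt_q)[0, 1] ** 2
    return slope, intercept, r2
```

Here `rmse` and `sqrt_q` would each hold one value per rainfall product (or per 1° box), so that the fitted slope and intercept can be compared against an ideal one-to-one relationship.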

Figure 3.

The square root of model noise variance estimates derived via Kalman filtering ($\hat{Q}$) versus actual daily RMS error in rainfall products (calculated against NCEP CPC precipitation observations). Grey circles represent results for individual 1° boxes within the SGP domain and all eight rainfall products. Black symbols represent SGP domain averages of grey circles for each product.

[16] It should be stressed that surface soil moisture retrievals are likely to be considerably less accurate over heavily vegetated surfaces, and results presented in Figure 3 are derived from a lightly vegetated portion of the United States known to be particularly well-suited for the remote sensing of surface soil moisture [Jackson et al., 1999]. Nevertheless, expanding the approach from the regional SGP domain to the entire contiguous United States (CONUS) south of 40°N leads to only a small reduction in the strength of the domain-averaged correlation observed between the square root of $\hat{Q}$ and actual rainfall error (i.e., a reduction in $R^2$ from 0.99 to 0.96 for the black symbols in Figure 3). Note that, due to their exceptionally sparse spatial support at this expanded scale, the WMO gauge-based rainfall products were not included in these Southern CONUS results.

[17] An additional concern is that some of the satellite-based rainfall products used to establish the API climatology, and rescale θ° retrievals, are partially corrected using rain gauge information. These corrections could undermine the value of the approach by establishing a potential dependence of $\hat{Q}$ results on the availability of ground-based observations. However, results in Figure 3 demonstrate relatively little sensitivity to the manner in which this rescaling is performed. In fact, completely neglecting potential climatological differences and directly assimilating raw θ° retrievals into equation (1) produces almost identical results.

6. Summary

[18] The application of a tuned Kalman filter (Figure 1) to the assimilation of a remotely-sensed surface soil moisture product into a simple API model is demonstrated to provide robust information concerning the relative level of RMS error in daily, 1° latitude/longitude precipitation products (Figures 2 and 3) within the SGP area of the United States. Since $\hat{Q}$ estimates are obtained without extensive ground-based rainfall validation resources, they provide a potentially valuable tool for evaluating (and/or calibrating) global precipitation products in data-poor land areas and underscore the potential synergy arising from concurrent spaceborne retrievals of both rainfall and surface soil moisture [Crow et al., 2006].

[19] While the presence of bias and/or auto-correlated error can potentially affect the reliability of tuned filter error estimates (section 3), initial results over data-rich areas of the Southern United States appear robust. In addition, non-random error sources could conceivably be filtered using existing techniques for long-term bias reduction in global precipitation products [Smith et al., 2006]. Future work will focus on these potential limitations and the wider application of the approach to evaluate the accuracy of competing global rainfall products over data-poor agricultural production regions of the globe. In addition, the expected future availability of higher-accuracy soil moisture retrievals from planned L-band satellite missions [Kerr et al., 2001] will likely enhance the performance of the technique over vegetated surfaces.


[20] Data support from Xiwu Zhan (NOAA NESDIS) and Thomas Jackson (USDA ARS) is gratefully acknowledged. This research was partially supported by the NASA Applied Sciences and Terrestrial Hydrology Programs.