De-noising of passive and active microwave satellite soil moisture time series



[1] Satellite microwave retrievals and in situ measurements of surface soil moisture are usually compared in the time domain. This paper examines their differences in the conjugate frequency domain to develop a spectral description of the satellite data, which suggests the presence of stochastic random errors and systematic periodic errors. Based on a semiempirical model of the observed power spectral density, we describe systematic designs of causal and noncausal filters to remove these erroneous signals. The filters are applied to the retrievals from active and passive satellite sensors and evaluated against field data from the Murrumbidgee Basin, southeast Australia, where they yield a substantive increase in linear correlations.

1 Introduction

[2] Soil moisture (SM) plays a key role in hydrological and meteorological processes. Spaceborne microwave remote-sensing technology provides the most practical means to achieve real-time global mapping of surface SM over a range of land surface and meteorological conditions. Microwave technologies have underpinned specialized satellite missions: the operational Soil Moisture and Ocean Salinity (SMOS) mission by the European Space Agency and the Soil Moisture Active/Passive mission planned for 2014 by the National Aeronautics and Space Administration (NASA). Significant efforts have been made to evaluate SM retrieval accuracies in field campaigns [e.g., Albergel et al., 2012]. These evaluations generally show high root-mean-square differences, caused largely by high bias, but the moderate to fair correlations give cause for optimism. While various renormalization methods for matching statistical moments against reference data may circumvent systematic errors in bias and variance, Su et al. [2013] found evidence that they are limited in improving correlations because of significant short time scale fluctuations.

[3] Most SM retrieval algorithms are based on instantaneous observations and have not fully exploited historical observations. This has motivated Du [2012] to improve correlations by combining high-frequency spectral components from direct sensor observations with the retrieved SM using Fourier filters. The rationale is that short time scale SM signals are better preserved in direct observations. Here we follow an alternative line of attack to improve the retrieval accuracy of current products by addressing the question: How can erroneous signals be removed from these products by using information in past satellite observations? Our starting point is to develop a semiempirical model of satellite data in the conjugate frequency domain and combine it with methods in digital signal processing to develop near-optimal de-noising filter designs. The utility of the filters is demonstrated by evaluating de-noised SM against in situ measurements from southeastern Australia.

2 Data Sets

[4] We consider SM retrievals from AMSR-E (Advanced Microwave Scanning Radiometer-Earth Observing System), ASCAT (advanced scatterometer), and SMOS for their distinctive characteristics. AMSR-E used C- and X-band radiance observations to retrieve near-surface SM. The host satellite Aqua provided daily scans of Australia during the midday ascending (1330 h local time) and nighttime descending (0130 h) orbits. We use the version 5 C/X-band 0.25° resolution product described by Owe et al. [2008] for the period July 2002 to October 2011. In contrast, SMOS provides L-band observations at 0600 h and at 1800 h. We use the most recently re-processed (RE01, January 2010 to March 2012) daily global SM product provided by Centre Aval de Traitement des Données [Jacquette et al., 2010]. The ASCAT sensor onboard MetOp-A is a C-band scatterometer that provides observations over Australia about twice a day, at about 2130 h and 0830 h. Robust retrievals are achieved using a change-detection algorithm in which the moisture content is retrieved as relative wetness. We convert the version 5.4 data set (January 2007 to August 2011) produced by Vienna University of Technology to volumetric measures using soil water retention data reported by McKenzie et al. [2000]. Finally, field measurements of SM at 0–5/8 cm depth are obtained from the 82,000 km² Murrumbidgee Basin in southeast Australia. Smith et al. [2012] provide details of the regional climatology and the monitoring network OzNet, and we follow the data selection and re-sampling methods described in Su et al. [2013] for quality assurance. For comparison with the AMSR-E and SMOS SM, arithmetic averages of the in situ measurements colocated in their satellite footprint are used. For ASCAT SM, measurements from each station are compared with the nearest retrievals. The ground data are subsampled to match the different observation times of the respective satellites.

3 Spectral Analysis

[5] Katul et al. [2007] presented a spectral description of in situ soil moisture dynamics. The near-surface SM time series can be regarded as a mixture of a long-term trend consisting of seasonally varying cycles (SS), originating from the seasonal fluctuations in rainfall and solar radiation, and irregular fluctuations (SE) from individual rainfall wetting and drying events. The longer-term trends can also result from instrument drift and longer-term climate cycles and trends, but in this study, these are ignored since only decade-long observations were considered and instrument drift was not observed. The dynamics {SS, SE} can be described by a one-dimensional (1-D) water balance equation:

$$\mu \frac{d\theta(t)}{dt} = p(t) - l(t) \qquad (1)$$

where θ is the topsoil water content, p is the throughfall precipitation rate, l is the total loss rate from the surface layer, and μ is the layer thickness. In Fourier space, the combined power spectrum of SS and SE is |Θ(ω)|² = ω⁻²|P(ω) − L(ω)|², where Θ(ω), P(ω), and L(ω) are the Fourier transforms (FTs) of θ(t), p(t)/μ, and l(t)/μ, respectively, and ω = 2π/T is the angular frequency of a spectral component with period T. This spectrum describes the distribution of total power over the different frequency components of the stochastic soil moisture process. Since typical storms last for hours while losses have a relatively slow response, the spectral amplitudes of P and L decrease with increasing ω, so that the small time scale SM dynamics mimic Brownian motion and have a brown-like spectrum, |Θ(ω)|² ∝ ω^(−2−|κ|) [Katul et al., 2007].
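The Fourier-space relation quoted above follows in one step from the water balance equation. A short derivation, assuming the usual convention that differentiation in time maps to multiplication by iω:

```latex
\begin{aligned}
\frac{d\theta}{dt} &= \frac{p(t)}{\mu} - \frac{l(t)}{\mu}
\;\;\xrightarrow{\ \mathrm{FT}\ }\;\;
i\omega\,\Theta(\omega) = P(\omega) - L(\omega), \\
|\Theta(\omega)|^{2} &= \omega^{-2}\,|P(\omega) - L(\omega)|^{2}, \qquad \omega \neq 0 .
\end{aligned}
```

The factor ω⁻² is the signature of the soil acting as an integrator of its forcing; any additional decay of |P − L| with frequency steepens the slope by the extra exponent |κ|.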

[6] The discrete Fourier transform (DFT) of the averaged 30 min ground data from six monitoring stations in a 25 × 25 km² area in Kyeamba Creek provides experimental support to this model. In order to perform the spectral analysis, we apply a 1-D gap-filling algorithm based on the discrete cosine transform (DCT) [Wang et al., 2012] to infill missing values in the data. As the FT of an individual realization of a random process does not yield an ordinary, well-behaved function of frequency, its power distribution can be better represented by the power spectral density (PSD, in units of m⁶ m⁻⁶ h) calculated using Welch's method of averaging the periodograms of the time series segmented with a moving Hamming window. However, a trade-off exists: The use of longer windows enables identification of longer-period components and better resolutions, while shorter windows allow more samples for averaging to reduce the variance of the estimation. The results are shown in Figure 1a, depicting an overall power scaling of ω^(−2.25), and spectral peaks (black arrows) corresponding to the diurnal and half-daily cycles due in part to daily temperature variations, and seasonal cycles due to large-scale temporal variations of vegetation and climate.
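As an illustration of the estimator described above, the following is a minimal pure-Python sketch of Welch's method with a Hamming window. It is not the authors' implementation: the function names, the 50% overlap, the per-segment mean removal, and the synthetic diurnal test series are all assumptions made for the example, and the naive DFT is only practical for short windows.

```python
import cmath
import math


def hamming(m):
    """Hamming window of length m (m >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (m - 1)) for k in range(m)]


def periodogram(segment, window, dt):
    """One-sided windowed periodogram of one segment (naive O(m^2) DFT)."""
    m = len(segment)
    mean = sum(segment) / m
    x = [(s - mean) * w for s, w in zip(segment, window)]  # remove mean, then taper
    norm = dt / sum(w * w for w in window)  # density scaling: units^2 per unit frequency
    psd = []
    for k in range(m // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / m) for n in range(m))
        # Double every bin except DC and (for even m) Nyquist to fold negative frequencies.
        scale = 1.0 if k == 0 or (m % 2 == 0 and k == m // 2) else 2.0
        psd.append(scale * norm * abs(acc) ** 2)
    return psd


def welch_psd(series, dt, window_len, overlap=0.5):
    """Average the periodograms of overlapping Hamming-windowed segments."""
    step = max(1, int(window_len * (1.0 - overlap)))
    window = hamming(window_len)
    starts = range(0, len(series) - window_len + 1, step)
    spectra = [periodogram(series[s:s + window_len], window, dt) for s in starts]
    if not spectra:
        raise ValueError("series is shorter than one window")
    psd = [sum(col) / len(spectra) for col in zip(*spectra)]
    freqs = [k / (window_len * dt) for k in range(window_len // 2 + 1)]  # cycles per hour
    return freqs, psd


# Synthetic 30 min series (dt = 0.5 h) containing only a diurnal cycle.
dt = 0.5
series = [0.2 + 0.05 * math.sin(2.0 * math.pi * n * dt / 24.0) for n in range(480)]
freqs, psd = welch_psd(series, dt, window_len=192)
peak_freq = freqs[max(range(1, len(psd)), key=psd.__getitem__)]
```

With a 4 day window the diurnal cycle falls exactly on a frequency bin, so `peak_freq` is 1/24 cycles per hour. Lengthening `window_len` sharpens the frequency resolution but leaves fewer segments to average, which is the trade-off noted in the text.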

Figure 1.

(a) PSD of 30 min in situ SM time series (inset) at Kyeamba, estimated using Welch's method with different Hamming window sizes W. Same but for (b) AMSR-E, (c) SMOS, and (d) ASCAT half-daily SM. Dashed blue curves are model fits with equation (2). Red arrows indicate false resonances in AMSR-E and SMOS data. All subplots follow the legend of Figure 1a unless indicated otherwise.

[7] Fourier analysis of the satellite product can also provide insights into the observed dynamics and, more importantly, the retrieval error characteristics. While most evaluations of satellite SM [e.g., Su et al., 2013, and references therein] have focused on comparing satellite SM against in situ measurements in the time domain, our analysis represents a complementary framework in the Fourier domain. To apply the conventional DFT, we assume periodic temporal sampling by the satellite sensors at approximately 12 h intervals. For ASCAT, where there are multiple overpasses in one morning or night, their averages are used as single measurements. The DCT-based infilling is then applied to provide estimates for the missing data points (34–55%) of the satellite time series, where the intermittent gaps are mainly 0.5–1 day long.
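The regridding step described above can be sketched as follows. This is only an illustration: the function names, the slot convention (slot k covers hours 12k to 12k + 12), and the sample overpass times are hypothetical, and the DCT-based infilling of the resulting gaps is not reproduced here.

```python
import math


def to_half_daily(times_h, values, n_slots, slot_h=12.0):
    """Average all retrievals falling in each 12 h slot; empty slots become None."""
    sums = [0.0] * n_slots
    counts = [0] * n_slots
    for t, v in zip(times_h, values):
        slot = int(math.floor(t / slot_h))
        if 0 <= slot < n_slots:
            sums[slot] += v
            counts[slot] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def missing_fraction(grid):
    """Fraction of slots without any retrieval."""
    return sum(v is None for v in grid) / len(grid)


# Hypothetical overpass times (hours since start) and retrievals (m3/m3).
times_h = [8.5, 10.1, 21.5, 45.5, 56.5]
values = [0.20, 0.22, 0.25, 0.18, 0.30]
grid = to_half_daily(times_h, values, n_slots=6)
```

Here the two overpasses in the first morning are merged into one value, and two of the six slots remain empty for a subsequent gap-filling step.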

[8] Figures 1b–1d plot the PSDs of AMSR-E, SMOS, and ASCAT SM time series over Kyeamba. They are representative of the spectra obtained at other locations in the catchment. Their spectra are significantly different from the in situ spectra, especially at high frequencies, suggesting that two forms of retrieval error exist in the satellite data. First, the appearance of relatively flat power distributions for T < 10³ h suggests a stochastic noise component. A wide range of short-term stochastic processes such as background contributions to the sensor field of view, coarse spatial sampling, and sensor uncertainties can combine to introduce such retrieval errors. Second, the spectra contain additional resonant peaks, which manifest as extraneous cycles in the time domain, suggesting a form of systematic retrieval error. The AMSR-E spectrum contains 10 false resonances (red arrows) with periods (T) in the range of 1–8 days, and SMOS has about 12 false resonances. For ASCAT, the false resonances are spectrally broad or absent, and there is a noticeable gap in the T = 1.5–2 day spectral window. All three products also contain energetic components with T = 10–40 days, although the repeat cycles of satellite orbits in the range of 19–26 days may be partly responsible for this. This form of systematic error is distinct from the consistent differences in means and spread and, among other reasons, might be attributed to temporally varying biased sampling of a land surface with high spatial variability. From synthetic experiments with the ground data (not shown), we found that these features do not arise from the irregular temporal sampling by the satellite sensors.

4 De-noising Methods

[9] In this section, we build upon the water balance model in equation (1) to devise two de-noising strategies, namely, a low-pass Wiener filter and a bandstop filter, to remove the identifiable stochastic and systematic errors in satellite data in the high-frequency regime. The details of their mathematical derivations and Matlab pseudocode are provided in the supporting information.

4.1 Wiener Low-Pass Filter

[10] To the zeroth order, the stochastic noise error can be assumed to be additive white noise so that the transform of an erroneous satellite retrieval is FT[θsat] ≡ Θsat(ω) = χΘ(ω) + E, where Θ(ω) is the solution of equation (1), χ is some constant scaling factor, and E is the constant amplitude of the noise. Given the linearity of the FT, this statement assumes that the Brownian process characteristics of point-scale SM and the water balance equation extend to the satellite footprint scale. Some evidence supporting this includes linear associations between the FTs of in situ SM at six sites colocated within the Kyeamba area (not shown) and temporal stability between point-scale dynamics [e.g., Grayson and Western, 1998].

[11] Focusing on the short time scales, we invoke first-order approximations by linearizing equation (1) with l(t) = ημθ(t), where η is a constant, and assume a Poisson model of precipitation arrival to obtain the model spectrum of noisy satellite SM time series:

$$|\Theta_{\mathrm{sat}}(\omega)|^{2} = \frac{A^{2}}{\eta^{2} + \omega^{2}} + E^{2} \qquad (2)$$

for ω ≠ 0, where |P(ω)| ≈ P is the spectral amplitude of the rainfall forcing and A ≡ χP is introduced. Since the noise is typically weaker than the signal, three regimes can be identified, namely, |Θsat(ω)|² ≈ A²/η² in the low-frequency regime for time scales over a month, where ω ≪ η; |Θsat(ω)|² ≈ A²/ω² in the brown spectrum regime, where the soil acts as an integrator of random rainfall forcing; and |Θsat(ω)|² ≈ E², where E² > AE/ω > (A/ω)², in the noise regime where the stochastic noise dominates. This model produces similar spectra to the in situ and satellite spectra, as shown by the model fits in Figure 1.
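The three regimes can be checked numerically. The sketch below assumes the model spectrum takes the form |Θsat(ω)|² = A²/(η² + ω²) + E², which is the form consistent with the limiting cases listed above and with the definition of γ used for the Wiener filter. The parameter values are illustrative rather than fitted, and are chosen only to separate the regimes clearly.

```python
def model_psd(omega, A, eta, E):
    """Assumed model: A^2 / (eta^2 + omega^2) + E^2 (brown signal plus white noise)."""
    return A ** 2 / (eta ** 2 + omega ** 2) + E ** 2


# Illustrative (not fitted) parameters; omega and eta in rad/h.
A, eta, E = 0.003, 0.001, 0.01

low = model_psd(1e-5, A, eta, E)    # omega << eta: plateau near A^2 / eta^2
brown = model_psd(0.03, A, eta, E)  # eta << omega << A/E: close to A^2 / omega^2
noise = model_psd(3.0, A, eta, E)   # omega >> A/E: close to E^2
crossover = A / E                   # frequency at which A^2 / omega^2 equals E^2
```

The crossover frequency A/E marks where the brown spectrum sinks into the noise floor; for a real product it would be read off the fitted parameters rather than chosen by hand.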

[12] We apply the standard treatment of signal processing to design a filter with a transfer function HL(ω) that eliminates the stochastic error and is optimal, conditional on the validity of the signal and noise models. Optimality in the least squares sense [Wiener, 1949] requires a filter to produce a filtered signal θfilt(t), or Θfilt(ω) = HL(ω)Θsat(ω), that minimizes the L2-norm of Θfilt(ω) − Θ(ω). Consequently, the transfer function for such a noncausal discrete-time (DT) filter is

display math(3)

where z ≡ exp(iω) and γ² = (A² + E²η²)/E². For a given DT input signal {θsat[n]; n = 0, 1, 2, …} sampled at times t = nΔt, the filtering can be implemented in the time domain with the following input-output equation

display math(4)
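Independently of the discrete-time form, the continuous-frequency Wiener gain implied by the signal and noise models can be sketched directly. The example assumes a signal PSD of A²/(η² + ω²) and a white-noise PSD of E², and shows that the classical ratio S/(S + N) is the same expression as one written in terms of γ² = (A² + E²η²)/E². The function names are hypothetical and the discrete-time transfer function itself is not reproduced.

```python
def wiener_gain(omega, A, eta, E):
    """Wiener gain S / (S + N) for signal PSD A^2/(eta^2 + omega^2) and noise PSD E^2."""
    signal = A ** 2 / (eta ** 2 + omega ** 2)
    return signal / (signal + E ** 2)


def wiener_gain_gamma(omega, A, eta, E):
    """The same gain written with gamma^2 = (A^2 + E^2 eta^2) / E^2."""
    gamma_sq = (A ** 2 + E ** 2 * eta ** 2) / E ** 2
    return (A ** 2 / E ** 2) / (gamma_sq + omega ** 2)


# Illustrative (not fitted) parameters; omega and eta in rad/h.
A, eta, E = 0.003, 0.001, 0.01
gains = [wiener_gain(w, A, eta, E) for w in (0.0, 0.01, 0.3, 3.0)]
```

In this form γ acts as the cutoff frequency: the gain is close to one well below γ and falls to about half of its low-frequency value at ω = γ.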

[13] This noncausal filter design that requires pre-recorded data can be extended to a causal filter that is suitable for real-time operations,

display math(5)

and in the time domain,

display math(6)

[14] The gain response |HL(ω)| ∝ (cosh γ − cos ω)⁻² ≈ (cosh γ − 1 + ω²/2)⁻² describes the level of attenuation to be applied to the spectral component at frequency ω. The Wiener filter constitutes a low-pass filter that leaves the low-frequency regime T > 10³ h of the spectra unchanged with a unit gain but suppresses the components at high frequencies with decreasing gain |HL(ω)| ~ O(1/ω), resulting in a smoothed time series in the time domain and a brown spectrum in the Fourier domain. However, as the high-frequency components are also responsible for describing rapid SM changes following intense rainfall events, a trade-off exists between removing noise and retaining this information. It is also important to note that a filtering problem is equivalent to a signal estimation problem, where the Wiener filter provides the best estimation of SM based solely on imperfect observations. While designed primarily for its gain response, the phase-shift response of a causal filter can induce undesirable signal distortions [Hamming, 1998], i.e., different spectral components within the filter bandwidth undergo temporal delays or advances of τ(ω) = ∠HL,nc(ω)/ω. These delays are most significant for components with short periodicities, where τ/T can be up to 0.21 for the T = 1 day component. To overcome this, the data can be processed twice with the same filter in forward and backward directions to negate the phase shifts, but this procedure can be done only on pre-recorded data.
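To make the gain and phase-delay discussion concrete, the sketch below evaluates both for a generic first-order recursive low-pass filter, y[n] = (1 − α)y[n − 1] + αx[n]. This filter is a stand-in chosen for illustration, not the paper's transfer function, and α = 0.3 is a hypothetical value. The delay is reported with the convention that a positive value means the output lags the input.

```python
import cmath
import math


def first_order_response(omega, alpha):
    """H(e^{i omega}) for y[n] = (1 - alpha) y[n-1] + alpha x[n]; omega in rad/sample."""
    return alpha / (1.0 - (1.0 - alpha) * cmath.exp(-1j * omega))


def gain_and_delay(omega, alpha):
    """Return (|H|, phase delay in samples); a positive delay means the output lags."""
    h = first_order_response(omega, alpha)
    return abs(h), -cmath.phase(h) / omega


alpha = 0.3                    # hypothetical smoothing constant
omega = 2.0 * math.pi / 20.0   # a component with a period of 20 samples
gain, delay = gain_and_delay(omega, alpha)
zero_phase_gain = gain ** 2    # forward + backward pass: phases cancel, gains multiply
```

A forward pass followed by a backward pass applies the gain twice while the two phase shifts cancel, which is why the two-pass procedure removes the delay at the cost of stronger attenuation and of requiring pre-recorded data.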

[15] The Wiener filter design is mathematically similar to the existing exponential filter [Wagner et al., 1999], θfilt[n] = (1 − α)θfilt[n − 1] + αθsat[n], for estimating profile soil water from topsoil retrievals. To first order, and with η ≪ γ < 1 (see later), equation (6) becomes θfilt[n] ≈ (1 − γ)θfilt[n − 1] + γθsat[n]. There is, however, a notable contextual difference: α is related to the characteristic time of the percolation process, whereas γ is related to the signal-to-noise ratio.
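The first-order recursion quoted above is simple to apply in the time domain. The following sketch implements it, together with a forward and backward pass that cancels the phase shift on pre-recorded data. The initial condition (output started at the first input value), the value α = 0.5, and the sample series are assumptions made for the example; in the Wiener context γ would take the place of α.

```python
def recursive_lowpass(x, alpha):
    """Causal filter y[n] = (1 - alpha) * y[n-1] + alpha * x[n], started at y[0] = x[0]."""
    y = [x[0]]
    for value in x[1:]:
        y.append((1.0 - alpha) * y[-1] + alpha * value)
    return y


def forward_backward(x, alpha):
    """Run the causal filter forward, then backward over the result (zero net phase)."""
    forward = recursive_lowpass(x, alpha)
    backward = recursive_lowpass(forward[::-1], alpha)
    return backward[::-1]


# Hypothetical half-daily retrievals (m3/m3) with a single spike at index 5.
sat = [0.10] * 5 + [0.30] + [0.10] * 5
causal = recursive_lowpass(sat, alpha=0.5)
smoothed = forward_backward(sat, alpha=0.5)
```

The causal output decays only after the spike, so its response is skewed toward later times, whereas the two-pass output is nearly symmetric about the spike. The same smoothing that suppresses an isolated noisy retrieval would also flatten a genuine wetting event, which is the trade-off discussed in the preceding paragraph.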

[16] To utilize these filters, we require estimates of the model/filter parameters {A,E,η}. As shown in Figure 1, one approach is to fit the model (equation (2)) to the power spectrum of the historical SM time series via minimization of the L1-norm of |Θdata|² − |Θmodel|². This objective function is chosen over least squares so that the routine is less sensitive to spectral peaks, and fitting to Welch's PSD is preferred over a power spectrum because the fitted parameters are then independent of the data length and Hamming window size. Finally, we note in the supporting information that solving for the amplitude E provides a standalone method to estimate the variance of the stochastic random error in the satellite data.
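The L1 fit can be illustrated with a small exhaustive search. This is a sketch under stated assumptions: the model form A²/(η² + ω²) + E² is assumed, the coarse log-spaced grid search is a stand-in for whatever optimizer is actually used, and the synthetic spectrum with one artificial spike is made up for the example.

```python
def model_psd(omega, A, eta, E):
    """Assumed model spectrum: A^2 / (eta^2 + omega^2) + E^2."""
    return A ** 2 / (eta ** 2 + omega ** 2) + E ** 2


def l1_misfit(params, omegas, psd):
    """L1-norm of the difference between the data PSD and the model PSD."""
    A, eta, E = params
    return sum(abs(p - model_psd(w, A, eta, E)) for w, p in zip(omegas, psd))


def log_grid(lo, hi, n):
    """n log-spaced values from lo to hi inclusive."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** k for k in range(n)]


def fit_by_grid_search(omegas, psd, grids):
    """Exhaustive search for the (A, eta, E) triple with the smallest L1 misfit."""
    best, best_cost = None, float("inf")
    for A in grids["A"]:
        for eta in grids["eta"]:
            for E in grids["E"]:
                cost = l1_misfit((A, eta, E), omegas, psd)
                if cost < best_cost:
                    best, best_cost = (A, eta, E), cost
    return best, best_cost


# Synthetic test: a spectrum generated from one grid point, plus a narrow spike.
grids = {
    "A": log_grid(1e-3, 1e-2, 5),
    "eta": log_grid(1e-4, 1e-2, 5),
    "E": log_grid(1e-3, 1e-1, 5),
}
truth = (grids["A"][2], grids["eta"][2], grids["E"][2])
omegas = log_grid(1e-4, 1.0, 40)
psd = [model_psd(w, *truth) for w in omegas]
psd[30] *= 5.0  # a spectral peak that the L1 misfit should largely ignore
best, cost = fit_by_grid_search(omegas, psd, grids)
```

In this synthetic case the search returns the generating parameters despite the spike, because an absolute-value misfit penalizes one large outlier less heavily than a squared misfit would. A practical fit would refine the grid or hand the same objective to a continuous optimizer.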

4.2 Bandstop Filter

[17] Our empirical studies of satellite data have revealed the presence of extraneous resonant peaks. Owing to the linearity of the FT, we can extend the model with another additive term, ∑k δ(ω − ωk), where ωk are the frequencies of these modes. A series of bandstop filters centered at {ωk} can be used to systematically remove these resonant modes, and the simplest filter design is

display math(7)

where math formula when ω = ωk produces the desired gain response. The constant H0 is determined by setting a boundary value math formula, and the parameter r (r < 1) controls the bandwidth of the filter that increases with 1 − r, which can be optimized based on the width of the resonant mode. Finally, we implement the causal filter in the time domain with

display math(8)
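For illustration, a common textbook second-order notch filter has the properties described above: zero gain at the resonance ωk and a bandwidth that widens as r decreases from one. The sketch below uses that design, which may differ in detail from the paper's equations. The normalization to unit gain at ω = 0, the function names, and the chosen ωk and r are assumptions, and ωk must be nonzero for the normalization to be defined.

```python
import cmath
import math


def notch_coefficients(omega_k, r):
    """Second-order notch: zeros on the unit circle at +/-omega_k, poles at radius r < 1.

    Returns (b, a), scaled so that the gain at omega = 0 equals one (omega_k != 0).
    """
    c = math.cos(omega_k)
    b = [1.0, -2.0 * c, 1.0]
    a = [1.0, -2.0 * r * c, r * r]
    h0 = sum(a) / sum(b)  # enforces H(z = 1) = 1
    return [h0 * v for v in b], a


def notch_gain(omega, b, a):
    """Magnitude of the frequency response at omega (rad/sample)."""
    z_inv = cmath.exp(-1j * omega)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(num / den)


def apply_notch(x, b, a):
    """Causal recursion y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(b[0] * x[n] + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2)
    return y


omega_k = 2.0 * math.pi / 8.0  # hypothetical resonance with a period of 8 samples
b, a = notch_coefficients(omega_k, r=0.9)
signal = [math.sin(omega_k * n) for n in range(400)]
filtered = apply_notch(signal, b, a)
residual = max(abs(v) for v in filtered[200:])  # what remains after the start-up transient
```

Several resonances can be removed by applying one such filter per ωk in sequence. Values of r closer to one give a narrower stop band but a longer start-up transient.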

5 Method Evaluation

[18] The proposed method is applied to the three satellite products to improve their agreement with the ground data. Its greatest impact is on reducing the mismatch in timing and shape of the satellite and in situ time series. By focusing on attenuating the high-frequency components, the de-noising leaves the temporal mean unchanged and variance largely unaffected. Therefore, Pearson's correlation R is used as the performance metric in the evaluations and its 95% confidence interval (CI) is calculated using Fisher's transform. Correlations for SM anomalies, where the seasonal trend is removed from the satellite and ground data using a 5 week moving window [Albergel et al., 2012], are also considered.
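The performance metric can be sketched as follows: Pearson's R and an approximate 95% confidence interval from Fisher's z-transform, using the standard error 1/√(n − 3). The function names and the sample size in the example are hypothetical, and the formula treats the samples as independent, which serially correlated time series only approximate.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for r via Fisher's z = atanh(r), with SE = 1 / sqrt(n - 3)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# n = 1000 is a hypothetical sample size used only to show the call.
lower, upper = fisher_ci(0.727, n=1000)
```

Because R is unchanged by shifting or rescaling either series, it isolates the agreement in timing and shape that the de-noising targets, independent of bias and variance mismatches.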

[19] For a given site, four steps are taken to apply de-noising. (1) Estimate the PSD of the historical SM time series with Welch's method and identify the frequencies ωk of the false resonances with a peak-finding algorithm. (2) Apply the bandstop filters to the target data using equation (8). (3) Calibrate the Wiener filter parameters {A,E,η} by fitting the model (equation (2)) to the historical data. (4) Apply the calibrated filter to the pre-recorded or real-time data using equation (4) or (6) for reanalysis or real-time applications, respectively. In effect, a low-pass filter and multiple bandstop filters are combined in a cascade to yield an output spectrum, Θfilt(ω) = HL(A,E,η)[∏k HB(ωk)]Θsat(ω). No ancillary or in situ data are required in the process.
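The peak-finding part of Step 1 is not specified in detail. One simple possibility, sketched here under assumptions, is to flag local maxima of the PSD that exceed a smooth baseline by a chosen factor. The function name, the factor of 3, the baseline (a model-like spectrum), and the synthetic spikes are all hypothetical.

```python
def find_false_resonances(omegas, psd, baseline, threshold=3.0):
    """Return frequencies of local maxima whose PSD exceeds threshold x baseline."""
    peaks = []
    for k in range(1, len(psd) - 1):
        is_local_max = psd[k] > psd[k - 1] and psd[k] >= psd[k + 1]
        if is_local_max and psd[k] > threshold * baseline[k]:
            peaks.append(omegas[k])
    return peaks


# Synthetic spectrum: a smooth brown-plus-white baseline with two artificial spikes.
omegas = [0.01 * (k + 1) for k in range(50)]
baseline = [1e-5 / w ** 2 + 1e-4 for w in omegas]
psd = list(baseline)
psd[20] *= 10.0
psd[35] *= 8.0
resonances = find_false_resonances(omegas, psd, baseline)
```

The returned frequencies would then serve as the centers ωk of the bandstop filters in Step 2. On a real Welch estimate the threshold would need tuning against the estimator's variance.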

[20] We first discuss the de-noising of the satellite data over the Kyeamba site. In Figure 2, the red curves show the reduction of the false resonances with T < 200 h by the bandstop filters (Step 2). In this case, the bandstop filter was not applied to ASCAT due to the absence of pronounced false resonances. Step 3 yields the calibrated Wiener filter parameters listed in Table 1. With a long historical record of >4 years, AMSR-E and ASCAT show good agreement with the site-specific parameters A and η of the in situ data. By contrast, the fitted A value for SMOS (~16 months) is overestimated, highlighting the limitation of using shorter records to estimate the low-frequency end of the PSD. In other words, while the estimates of parameters E and η converge with a relatively short historical record (<2 years), an accurate estimation of A requires a longer time series. To overcome this, it might be reasonable to use calibrated parameters from another data set (e.g., AMSR-E) for filtering SMOS data, given their similar satellite footprints; however, further evaluation is needed given the differences in penetration depths at L and C/X-bands.

Figure 2.

Changes to the original satellite PSD (grey) de-noised with bandstop filters (red), causal (blue), and/or noncausal Wiener filter (green). They are compared with the PSD of the in situ data from Kyeamba (black). Insets show smoothing of the satellite data by the filters in time domain. (a) AMSR-E, (b) SMOS, and (c) ASCAT.

Table 1. Fitted Parameters of Equation (2) to the Estimated PSD of In Situ and Satellite SM in Figure 1ᵃ

Data Set        W      η        A        E    γ
In situ (K*)    1095   0.0010   0.0028   0    -
In situ (K1)    1095   0.0009   0.0025   0    -

ᵃ The calculation is performed on the average of six Kyeamba stations (K*) and on single station K1. The Hamming window sizes W (in units of days) used in PSD estimation are chosen as the minimum of 3 years or the total data length. The parameters η and γ are in units of rad/h, and A and E are in units of m³ m⁻³ h^(1/2).

[21] Finally, in Step 4, the low-pass filters reduce the amplitude of the high-frequency components and smooth the time series (insets). We note that the mismatch between the fitted model and the ground data (Figure 1a) highlights a tendency for oversmoothing such that the filtered spectra fall below the in situ spectrum. This could potentially be improved with a better rainfall forcing model that accounts for nonzero κ in the ω^(−2−|κ|) scaling of typical SM spectra. Nevertheless, we observe statistically significant improvements (at 95% confidence level) in correlations between AMSR-E and in situ data, from R0 = 0.727[12] (i.e., ±0.012) to Rfilt = 0.776[10] when bandstop filters are applied and to Rfilt = 0.808[9] following the low-pass filtering. Similar levels of improvements are found with the SMOS data where correlations increased from 0.783[25] to 0.825[21] and 0.863[17] after Steps 2 and 4, respectively. By comparison, the benefit of low-pass filtering the ASCAT data is more modest with R improving from 0.654[21] to 0.691[19].

[22] Extending the evaluation to other monitoring stations within the catchment, Figures 3a–3c show that, on average across all the evaluation sites, the bandstop filters yield modest but consistent improvements of 〈Rfilt − R0〉 = 0.021 and 0.031 for AMSR-E and SMOS, respectively. Further improvement is achieved through the causal low-pass filtering, by a further 0.054 (AMSR-E), 0.099 (SMOS), and 0.050 (ASCAT) on average. The small temporal delay of SM dynamics induced by the causal filter does not result in worse performance than its noncausal counterpart. Finally, we eliminate seasonal influences on the reported R scores by repeating the analyses with SM anomalies in Figures 3d–3f. With preprocessed R0 = 0.45–0.51 on average, the bandstop filtering improves the correlations of the AMSR-E and SMOS data by 0.040–0.045, and combines with the low-pass filtering to yield an overall increase of 0.154–0.175. In general, improvements in correlation are greatest where the pre-processed correlation is low and modest where it is already strong.

Figure 3.

Change in correlation between satellite and in situ data after the treatment of bandstop filters and/or Wiener filters. (a–c) Evaluation of the original time series. (d–f) Evaluation of the SM anomaly time series. The error bars denote the 95% CIs, calculated using Fisher's transform and summation in quadrature.
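The summation in quadrature used for the error bars can be illustrated with the AMSR-E values quoted for Kyeamba, 0.727[12] before filtering and 0.808[9] after low-pass filtering. The sketch treats the two confidence intervals as symmetric and independent, which is an assumption of the example, and the function name is hypothetical.

```python
import math


def correlation_change(r0, ci0, r1, ci1):
    """Change in correlation and its CI half-width from summation in quadrature."""
    return r1 - r0, math.sqrt(ci0 ** 2 + ci1 ** 2)


# AMSR-E at Kyeamba, values quoted in the text: 0.727[12] -> 0.808[9].
delta, half_width = correlation_change(0.727, 0.012, 0.808, 0.009)
significant = abs(delta) > half_width
```

For these values the change is 0.081 with a combined half-width of 0.015, so the improvement exceeds its uncertainty by a wide margin.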

[23] The methodology and these results should, however, be interpreted with care. Since our analyses and filter designs are based on the physical water balance model (equation (1)), any data periods during which SM dynamics deviate from the model (e.g., frozen soils) should be removed prior to Step 1. The formalism also treats extraneous signals from vegetation and background emission, and perturbations to soil emission due to vegetation masking, as noise; thus, applying the de-noising schemes renders the filtered time series more representative of an in situ SM and less representative of a remotely sensed surface layer. Two interpretations of the results are therefore possible. First, the filters remove erroneous signals in the data sets, and non-SM information of the surface has been treated as random noise. Alternatively, one can interpret that information has been added and the filters create a new SM product with the Brownian-process property expected of the water balance model.

6 Conclusions

[24] Spectral analysis is a promising method for studying the error characteristics of long global SM data sets. This study provides evidence of the presence of false resonances and stochastic noise that degrade the correlations between the SM retrievals and in situ observations. Based on a semiempirical water balance model for erroneous satellite observations in the Fourier domain, designs of near-optimal (Wiener) filters and bandstop filters are proposed and the variance of the stochastic error can also be estimated. While the characterization of the model relies on long-term records, data sets from other sensors can offer alternatives to aid in defining model parameters pertaining to on-site input forcing and moisture loss. By removing extraneous noise and errors, the proposed de-noising filters are capable of improving the correlations between three distinct satellite products and in situ data. While the model/filters might be improved with more realistic spectral descriptions of precipitation and loss processes, our results give cause for optimism that the current scheme is relevant to a broader range of SM retrieval records dating back to 1978 and to future observations in operational environments.


[25] We are grateful to two anonymous reviewers and Gabrielle De Lannoy for their valuable comments on the manuscript. This research was conducted with financial support from the Australian Research Council (Linkage project LP110200520).

[26] The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.