Towards a Kalman Filter based soil moisture analysis system for the operational ECMWF Integrated Forecast System



[1] This paper presents the future European Centre for Medium-Range Weather Forecasts (ECMWF) soil moisture analysis system based on a point-wise Extended Kalman Filter (EKF). The performance of the system is evaluated against the current operational Optimal Interpolation (OI) system. Both systems use proxy observations, i.e., 2 m air temperature and relative humidity. The spatial structures of the analysis increments obtained from the two analyses are comparable. However, the EKF-based increments are generally larger for the top soil layers than for the bottom layer. This gradient better reflects the underlying hydrological processes, in that the strongest interaction between soil moisture and bare soil evaporation and transpiration through vegetation should occur in the top layers, where most of the roots are located. The impact on the forecast skill, e.g., air temperature at 2 m and 500 hPa height, is neutral. The new EKF surface analysis system offers a range of further development options for the exploitation of satellite observations for the initialization of the land surface in Numerical Weather Prediction.

1. Introduction

[2] Current operational soil moisture analysis systems in Numerical Weather Prediction (NWP) are based on analyzed or observed screen-level variables, namely 2 m temperature (T2m) and relative humidity (RH2m). Optimal Interpolation (OI) algorithms are used operationally at Météo France [Giard and Bazile, 2000], the European Centre for Medium-Range Weather Forecasts (ECMWF) [Douville et al., 2000], the Canadian Meteorological Centre [Bélair et al., 2003] and in the High Resolution Limited Area Model (HIRLAM) [Rodriguez et al., 2003]. The German Weather Service (Deutscher Wetterdienst) adopted a “simplified” Extended Kalman Filter [Hess, 2001].

[3] Since 2 m temperature and relative humidity are only weakly related to soil moisture, the hydrological soil modules of the forecasting systems are hardly constrained in the analysis through these proxy observations. Earlier studies confirmed the characteristics of analysis schemes using screen-level variables [Drusch and Viterbo, 2007]: The observations are efficient in improving the turbulent surface fluxes and consequently the weather forecast on large geographical domains. The quality of the resulting soil moisture fields, however, is often not improved.

[4] However, there is an increasing demand for accurate information on water availability, e.g., from the carbon community, water management authorities, or for agricultural applications. At the same time, new space-borne observation techniques have been developed to provide more direct and accurate observations of soil moisture, e.g., ASCAT (Advanced Scatterometer) [Bartalis et al., 2007], SMOS (Soil Moisture and Ocean Salinity) [Kerr et al., 2001], and the Soil Moisture Active Passive (SMAP) mission.

[5] Operational data assimilation systems need to be modified to make optimal use of satellite-based land surface information. In this paper we outline the strategy for developing and implementing an advanced analysis system. ECMWF's future Extended Kalman Filter (EKF) based analysis system is introduced and the performance is compared against the current OI system.

2. Motivation and Implementation Strategy

[6] At ECMWF, the surface analyses are disconnected from the atmospheric 4D-Var analysis to ensure that the spatial resolution and the assimilation window can be optimized for land applications. For the current OI two main limitations have been identified [e.g., Mahfouf et al., 2009]: (i) The analysis is performed for fixed times at 00, 06, 12, and 18 UTC when the synoptic observations are obtained. Including data from satellites requires a data assimilation system that can take variable observation times into account; (ii) The optimal coefficients and empirical correction functions were derived from a limited set of simulations using a single-column version of a coupled land-surface-atmosphere model. They will not be optimal for different meteorological conditions or areas with different soil and/or vegetation types.

[7] The proposed EKF offers the possibility to include measurements at observation time and uses an explicit observation function via the Jacobians. The transition to a new EKF-based soil moisture analysis system can be performed in several steps:

[8] 1. Development of a 1D prototype analysis system and initial performance checks. This work has been performed within the E-LDAS (European Land Data Assimilation System) project [Seuffert et al., 2003, 2004; van den Hurk et al., 2008] using 2 m temperature and relative humidity observations and L-band brightness temperatures from an airborne sensor.

[9] 2. Implementation of the new data assimilation system into the forecasting system and evaluation against the operational system.

[10] 3. Operational implementation after system optimization.

[11] 4. Introduction of the satellite observations, including data pre-processing, development of a bias correction scheme, and separate evaluation.

[12] In this paper we report on step 2. It should be noted that other advanced data assimilation methods, e.g., an Ensemble Kalman Filter, may offer similar advantages and could be used in NWP applications as well.

3. ECMWF Soil Moisture Analysis

[13] Within ECMWF's Integrated Forecast System (IFS), the surface analysis provides the initial conditions at the forecast model's lower boundary. The surface analysis is independent from the 4D-Var atmospheric analysis.

[14] The operational IFS soil moisture analyses are produced daily at 00, 06, 12, and 18 UTC, using an OI method. They are scheduled to run immediately after the two main 4D-Var atmospheric analyses that cover the periods from 09 to 21 UTC and 21 to 09 UTC. Short-range forecasts from the most recent analyses provide the background fields for both the atmosphere and the surface. More details on the OI system, the empirically derived quality checks and its implementation are given by Douville et al. [2000] and Drusch and Viterbo [2007].

[15] In order to optimally combine conventional observations with satellite measurements, an advanced surface data assimilation system has been implemented in the IFS. The core of the system is a simplified EKF, which is based on the minimization of a cost function J, as in variational methods, under a linear approximation [Seuffert et al., 2004]. The simulated soil moisture in the three root zone layers (state vector x) is adjusted by minimizing J, which optimally combines the information from the model forecast of x with the observed parameters (T2m, RH2m, and/or satellite-derived soil moisture and/or brightness temperatures) that form the observation vector y:

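$$J(\mathbf{x}) = \frac{1}{2}(\mathbf{x}-\mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x}-\mathbf{x}_b) + \frac{1}{2}\left[\mathbf{y}-\mathcal{H}(\mathbf{x})\right]^T \mathbf{R}^{-1} \left[\mathbf{y}-\mathcal{H}(\mathbf{x})\right] \qquad (1)$$

where $\mathcal{H}$ is the observation operator, $\mathbf{x}_b$ is the background state vector, and $\mathbf{B}$ and $\mathbf{R}$ are the background and observation error covariance matrices, respectively. For small dimension estimation problems, and under the tangent linear hypothesis, the minimum of J (∇J = 0) can be algebraically obtained through matrix manipulation. The solution for the analyzed state at time t is:

$$\mathbf{x}_a(t) = \mathbf{x}_b(t) + \mathbf{K}\left[\mathbf{y}(t) - \mathcal{H}(\mathbf{x}_b)\right] \qquad (2)$$

with $\mathbf{H}$ the Jacobian of the observation operator and $\mathbf{K}$ the gain matrix:

$$\mathbf{K} = \left[\mathbf{B}^{-1} + \mathbf{H}^T \mathbf{R}^{-1} \mathbf{H}\right]^{-1} \mathbf{H}^T \mathbf{R}^{-1} \qquad (3)$$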

In the proposed simplified EKF, the Jacobian H can be approximated by a one-sided finite difference, assuming a quasi-linear problem close to the background state. Perturbed model runs, as described in section 4.1, are used to determine H. Since the observation operator includes the model propagation, the Kalman gain evolves even when the background error covariance matrix is set to fixed values. The EKF is applied point-wise, i.e., for each model grid point separately.
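The point-wise update can be illustrated with a minimal sketch. It assumes the standard gain form K = B Hᵀ (H B Hᵀ + R)⁻¹, which is algebraically equivalent to K = [B⁻¹ + Hᵀ R⁻¹ H]⁻¹ Hᵀ R⁻¹. The function name `ekf_point_update` is hypothetical, and the background state, Jacobian, and observations are made-up values chosen only for illustration; they are not taken from the experiments and are not meant to reproduce the gain magnitudes reported here.

```python
import numpy as np

def ekf_point_update(x_b, y_obs, y_b, H, B, R):
    """One simplified-EKF analysis step at a single grid point (hypothetical helper).

    x_b   : background soil moisture in the three layers [m3 m-3]
    y_obs : observed T2m [K] and RH2m [%]
    y_b   : model equivalents of the observations from the background run
    H     : Jacobian of the observation operator, shape (2, 3)
    B, R  : background and observation error covariance matrices
    """
    S = H @ B @ H.T + R                 # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)      # gain matrix, shape (3, 2)
    x_a = x_b + K @ (y_obs - y_b)       # analysed state
    return x_a, K

# Fixed error settings: sigma_B = 0.01 m3 m-3, sigma_T = 2 K, sigma_RH = 10 %.
B = np.diag([0.01**2] * 3)
R = np.diag([2.0**2, 10.0**2])

# Illustrative (made-up) background, Jacobian and observations.
x_b = np.array([0.25, 0.27, 0.30])
H = np.array([[-20.0, -10.0, -5.0],    # dT2m/dx  [K / (m3 m-3)]
              [4.0, 2.0, 1.0]])        # dRH2m/dx [% / (m3 m-3)]
y_b = np.array([288.0, 60.0])
y_obs = np.array([289.0, 60.0])        # observation 1 K warmer than the background

x_a, K = ekf_point_update(x_b, y_obs, y_b, H, B, R)
print(x_a - x_b)   # soil moisture increment in each layer
```

In this toy case the T2m Jacobian is negative, so a warm innovation dries the soil, and the increment is largest in the top layer because the assumed Jacobian is largest there.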

4. Numerical Experiments and Results

[16] In order to evaluate the new surface analysis scheme we set up two data assimilation experiments using the OI and the EKF surface analyses, respectively. Both experiments use the screen-level parameters to correct the soil moisture forecast. They are based on ECMWF's IFS version 33R1 including the atmospheric 4D-Var analysis system. The horizontal resolution for the experiments is set to T159 (∼125 km) with 60 vertical levels in the atmosphere. The experiments start from the same initial (operational) model state on 1 May 2007, 0000 UTC and run for a 31 day period.

[17] A 6-hour assimilation window is used for the surface analysis to allow a direct comparison between the OI and the EKF experiments. In both experiments the standard deviations of the observation errors are set to σT = 2 K and σRH = 10%. The EKF opens the possibility to update the background error and to account for errors in soil and/or vegetation properties in B. However, in order to perform a rigorous comparison between the OI and the EKF soil moisture analyses using simulations for a relatively short period of one month, this option is not activated and the background error covariance matrix B is held constant, with standard deviations of σB = 0.01 m3m−3, in both experiments. The soil moisture analyses are only performed in snow-free areas.

4.1. Jacobians of the Observation Operator

[18] The Jacobian H of the observation operator is expressed as:

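$$\mathbf{H} = \frac{\partial \mathbf{y}(t_{obs})}{\partial \mathbf{x}(t_0)} \qquad (4)$$

with tobs being the time of observation and t0 the forecast basetime. The elements of the Jacobian matrix are estimated by finite differences, perturbing each component xj of the control vector x individually by a small amount δxj. A perturbed model integration allows the j-th column of this matrix to be constructed from the simulated observation equivalents y as:

$$\mathbf{H}_j = \frac{\mathbf{y}(\mathbf{x} + \delta x_j) - \mathbf{y}(\mathbf{x})}{\delta x_j} \qquad (5)$$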

In a well-behaved deterministic system the Jacobians H+ (associated with a positive perturbation) and H− (associated with a negative perturbation) should have similar values, with H = (H+ + H−)/2 being independent of δx and with small values of ∣H+ − H−∣.
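The finite-difference construction and the H+/H− consistency check can be sketched as follows. This is only an illustration: `toy_obs_operator` is a hypothetical smooth function standing in for the perturbed model integration from t0 to tobs, and its coefficients are invented. Because the toy function is smooth, it does not reproduce the numerical noise that a real land surface model shows for very small perturbations.

```python
import numpy as np

def toy_obs_operator(x):
    """Hypothetical stand-in for a model run: maps soil moisture (3 layers)
    to simulated [T2m in K, RH2m in %]. Coefficients are invented."""
    s = np.array([0.6, 0.3, 0.1]) @ x          # root-weighted soil moisture
    t2m = 300.0 - 25.0 * s - 20.0 * s**2       # wetter soil -> cooler T2m
    rh2m = 40.0 + 4.0 * s - 2.0 * s**2         # wetter soil -> higher RH2m
    return np.array([t2m, rh2m])

def fd_jacobians(obs_op, x, dx):
    """One-sided finite-difference Jacobians for positive and negative
    perturbations of size dx applied to each state component in turn."""
    y0 = obs_op(x)
    h_plus = np.empty((y0.size, x.size))
    h_minus = np.empty((y0.size, x.size))
    for j in range(x.size):
        pert = np.zeros(x.size)
        pert[j] = dx
        h_plus[:, j] = (obs_op(x + pert) - y0) / dx     # positive perturbation
        h_minus[:, j] = (y0 - obs_op(x - pert)) / dx    # negative perturbation
    return h_plus, h_minus

x = np.array([0.25, 0.27, 0.30])
for dx in (0.001, 0.01, 0.05):
    h_plus, h_minus = fd_jacobians(toy_obs_operator, x, dx)
    h_mean = 0.5 * (h_plus + h_minus)          # H = (H+ + H-)/2
    h_diff = np.abs(h_plus - h_minus)          # |H+ - H-|
    print(dx, h_mean[:, 0], h_diff[:, 0])
```

In this sketch ∣H+ − H−∣ grows with the perturbation size purely because of the quadratic terms, which mimics the nonlinear behaviour expected for large perturbations.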

[19] The sensitivity of the Jacobians of T2m and RH2m to the amplitude of the soil moisture perturbations (δxj) in the three top soil layers is shown in Figure 1. The sensitivity of T2m with respect to soil moisture is mostly negative. This means that an increase in soil moisture will reduce the value of the screen level temperature. Soil moisture and RH2m are positively correlated. The mean values of the Jacobians for T2m and RH2m are larger for the surface layer than for the deeper layers, indicating that the assimilation will be more effective in modifying the surface layer in order to fit the observations. We obtain non-zero values of ∣H+ − H−∣ for very small perturbations because of numerical instabilities in the land surface model. Because of model nonlinearities, large perturbations should also lead to high values of ∣H+ − H−∣; however, this expected trend is reduced by the impact of the ad hoc quality checks (see section 4.2), which eliminate excessive Jacobians associated with large perturbations. Figure 1 indicates that soil moisture perturbations of 0.01 m3m−3 result in the lowest values of ∣H+ − H−∣ and are therefore most appropriate.

Figure 1.

Elements of the Jacobian matrix of the observation operator for the global average (H = (H+ + H−)/2) and the absolute difference (∣H+ − H−∣) of T2m and RH2m components, calculated for positive (H+) and negative (H−) soil moisture perturbations for the uppermost three soil layers.

4.2. Gain Components

[20] The gain matrix elements Ki,j are the most relevant quantities for assessing differences between the OI and the EKF surface analyses. Prior to the gain element computation, two quality checks have been introduced for the EKF: Analysis increments are set to 0 (i) if the Jacobian (equation (4)) becomes larger than 50 K/m3m−3 or 5%/m3m−3 for the T2m or RH2m components, respectively, or (ii) if the resulting soil moisture increment for any layer is larger than 0.1 m3m−3. The first case occasionally occurs in dry areas (e.g., in the Northern Sahara and Central Australia) when very low observed values of RH2m and very low modelled first-guess soil moisture lead to a high sensitivity of the corresponding Jacobians.
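The two quality checks can be sketched as a simple filter. The thresholds are the ones quoted above; everything else is an assumption made for illustration: the function name `apply_quality_checks` is hypothetical, the comparison is made on absolute values, row 0 of the Jacobian is taken to be the T2m component and row 1 the RH2m component, and a failed check zeroes the increments of all layers at that grid point.

```python
import numpy as np

def apply_quality_checks(H, increment,
                         max_h_t2m=50.0,      # K / (m3 m-3)
                         max_h_rh2m=5.0,      # % / (m3 m-3)
                         max_increment=0.1):  # m3 m-3
    """Return the soil moisture increment, or zeros if a check fails.

    H         : Jacobian, shape (2, n_layers); row 0 = T2m, row 1 = RH2m (assumed)
    increment : analysis increment per layer [m3 m-3]
    """
    jacobian_too_large = (np.any(np.abs(H[0]) > max_h_t2m)
                          or np.any(np.abs(H[1]) > max_h_rh2m))
    increment_too_large = np.any(np.abs(increment) > max_increment)
    if jacobian_too_large or increment_too_large:
        return np.zeros_like(increment)   # reject: no soil moisture correction
    return increment

# Made-up examples: one accepted case, one rejected for an excessive T2m Jacobian.
H_ok = np.array([[-20.0, -10.0, -5.0], [4.0, 2.0, 1.0]])
H_dry = np.array([[-80.0, -10.0, -5.0], [4.0, 2.0, 1.0]])
inc = np.array([0.004, 0.002, 0.001])
print(apply_quality_checks(H_ok, inc))
print(apply_quality_checks(H_dry, inc))
```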

[21] Figure 2 shows the T2m and RH2m gain components for the 12 UTC analyses (1 May 2007) from the OI and EKF experiments for the top layer. Results are presented for one day to ensure comparable initial conditions and to avoid compensating effects through temporal averaging. As expected, gain coefficients are about one order of magnitude larger for RH2m than for T2m components for both the OI and the EKF experiments. They exhibit similar patterns, with low values over mountainous areas, in the presence of snow, under freezing temperatures, and in desert areas. The OI and EKF coefficients also exhibit a strong diurnal cycle. In the OI formulation this dependency is controlled by accounting for the solar zenith angle in the gain function and is clearly visible in the OI gain component, which is limited to the daytime region between 20°W and 130°E. In the EKF there is no explicit relationship between the gain and the solar zenith angle or other ad hoc rejection criteria. The link between the screen level and soil variables is provided through turbulent surface fluxes, which follow the diurnal cycle. In contrast to the OI experiment, the assimilation is not switched off at night in the EKF, and for the 12 UTC analysis low values of gain are computed over America and Australia.

Figure 2.

Gain components for the first soil layer as obtained from (a and c) the OI and (b and d) the EKF experiments. The T2m components are shown in Figures 2a and 2b in % m3m−3K−1; the RH2m components are shown in Figures 2c and 2d in % m3m−3 %−1.

[22] Low values of the EKF gain coefficients over the tropical rain forest and in high latitudes highlight the ability of the EKF to automatically mask situations where soil moisture is only weakly linked to the near-surface atmosphere (e.g., when soil moisture is close to or above field capacity). Here the OI coefficients, having no dependency on the soil moisture state, are significantly larger. In agreement with Mahfouf et al. [2009], the EKF gain coefficients are lower than the OI coefficients by a factor of 2 to 4 for both the T2m and the RH2m components. In the OI analysis the gain coefficients are almost identical for all layers.

4.3. Soil Moisture Increments

[23] The amplitude and evolution of the analysis increments are a useful diagnostic of the forecast system. A well calibrated, physically sound model is characterized by small, randomly distributed analysis increments. Large and/or persistent increments indicate systematic model and/or observation errors. Considering that the analysis of the OI experiment effectively corrects model deficiencies and that we use the same forecast model in both data assimilation experiments, we expect similar spatial and temporal patterns in the mean analysis increments. Figure 3 shows analysis increments from the OI and EKF experiments for 1 May 2007 at 12 UTC. In general, increments from the EKF experiment show spatial patterns similar to those from the OI runs. In contrast to the OI, substantial EKF increments are allowed over the American and Australian continents, as already indicated by the gain coefficients. While the analysis increments in the OI are almost constant for the three top layers, they become smaller with depth in the EKF. Since the thickness of the soil layers increases with the soil depth (0 − 0.07 m, 0.07 − 0.21 m, and 0.21 − 1.0 m), the OI computes much larger (unrealistic) increments for the deep soil layers than for the surface layers when the vertically integrated soil moisture is considered. With the EKF surface analysis, relatively large coefficients and increments are found for the uppermost soil layer, and they decrease with the soil layer depth. Since 60 to 73% of the roots are located in the top two model layers, the reduction of EKF increments at depth better reflects vertical variations in the sensitivity of transpiration and evaporation to soil moisture.
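The effect of layer thickness on vertically integrated increments can be made concrete with a short calculation. The sketch below uses the layer boundaries given above and a hypothetical, vertically uniform volumetric increment of 0.01 m3m−3; the helper name `increment_mm` is an illustrative choice.

```python
# Layer boundaries in metres (top, bottom) for the three uppermost soil layers.
layer_bounds_m = [(0.0, 0.07), (0.07, 0.21), (0.21, 1.0)]

def increment_mm(dtheta, top, bottom):
    """Convert a volumetric increment [m3 m-3] into a water depth [mm]
    for a layer extending from `top` to `bottom` (metres)."""
    return dtheta * (bottom - top) * 1000.0

dtheta = 0.01  # hypothetical increment, identical in all layers
depths_mm = [increment_mm(dtheta, top, bottom) for top, bottom in layer_bounds_m]
for (top, bottom), mm in zip(layer_bounds_m, depths_mm):
    print(f"{top:.2f}-{bottom:.2f} m: {mm:.1f} mm")
```

Under this assumption the same volumetric increment corresponds to roughly 0.7 mm of water in the top layer, 1.4 mm in the second layer, and 7.9 mm in the third layer, i.e., more than ten times as much water is added or removed at depth as at the surface.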

Figure 3.

Soil moisture analysis increments (in mm) from (a, c, and e) the OI and (b, d, and f) the EKF. Results for the top soil layer, the second layer, and the third layer are shown in Figures 3a and 3b, Figures 3c and 3d, and Figures 3e and 3f, respectively.

4.4. Forecast System Skill

[24] The impact of the surface analysis on the forecast skill has been investigated. The global mean difference between observations and analysis of T2m is 1.635 K for May 2007 in both experiments. Differences between observations and first guess, i.e., 6- or 12-hourly forecasts, are also very close for the EKF (2.35 K) and the OI (2.36 K) experiments. They indicate a neutral effect of the EKF system compared to the OI when the two systems use the screen level observations with the same observation errors and a fixed B matrix. The impact of the new analysis system on the forecast skill of the standard meteorological parameters, e.g., 500 hPa height and temperature, is also neutral for forecast ranges from one to 10 days.

5. Discussion and Conclusions

[25] In order to take full advantage of synoptic data and satellite observations, an EKF-based surface analysis system has been developed and implemented in ECMWF's IFS. The performance and results of both the current operational OI and the future EKF have been analysed. It has been found that the EKF and the OI gain coefficients and soil moisture increments present similar patterns. Generally, the values are lower for the EKF than for the OI. In contrast to the OI, the EKF surface analysis results in different amplitudes of the gain for the different soil layers, which is in agreement with hydrological processes. The impact of the EKF on the forecast skill scores is neutral when compared to the OI. Since both systems use screen level parameters to adjust soil moisture, we do not expect substantial changes in the accuracy of the analysed soil moisture fields. However, the EKF offers the possibility to better constrain the water content of the soil through satellite-derived surface soil moisture estimates in the future, so that NWP systems will eventually provide more accurate soil moisture estimates.

[26] The computational costs for the EKF are approximately 1000 times higher (in CPU time) than for the OI analysis, which uses 2.6 s of CPU time for each analysis cycle at T159. Even with ECMWF's high performance computer systems it will not be possible to use the EKF in its current form operationally at the full model resolution. The perturbed forecasts for the derivation of the Jacobians have been identified as the main cost drivers. For the operational implementation only the first forecast, which produces the model first guess, will be based on the fully coupled land surface-atmosphere model. The Jacobians will then be derived using the uncoupled land surface model driven with the meteorological forcings from this initial run. The accuracy of this approach was found to be acceptable [Balsamo et al., 2007] and will result in a substantial cost reduction.

[27] Although the analysis system has been designed to analyse soil moisture, it also provides the opportunity to include other surface variables. Snow mass, fractional snow coverage, snow, surface, and soil temperatures, and the vegetation leaf area index are geophysical parameters that can be observed from space and used through the Extended Kalman Filter analysis system. Since only one perturbed model run with the decoupled land surface model is required to analyse an additional state variable, the proposed analysis system presents a very efficient approach with respect to computational costs and maintenance.


[28] The authors thank Anton Beljaars and Jean-François Mahfouf for their helpful discussions on the EKF surface analysis results. Rob Hine prepared the final versions of the figures.