## 1. Introduction

### 1.1. Soil Moisture in the Climate System

[2] Surface soil moisture is a key state variable which integrates much of the land surface hydrology and exerts considerable control on several land-atmosphere exchanges. It is the fastest component of the continental water cycle with a residence time of just a few days. Root zone (5–100 cm) soil moisture determines how much water is available to vegetation, thereby influencing the latent heat flux and hence the surface energy balance.

[3] A consistent data set of soil moisture, ground temperature and surface fluxes would enable a detailed study of land-atmosphere interactions and the role that they play in the climate system. Global or regional in situ measurements at the scales required to study hydrometeorology (10 km) and hydroclimatology (30–50 km) would require networks that are logistically and economically infeasible. Remote sensing, on the other hand, is ideal for obtaining data at these scales and globally.

### 1.2. Remote Sensing of Soil Moisture

[4] Passive microwave radiometry has long been recognized as having the potential to measure soil moisture on regional and global scales. Low-frequency passive microwave radiation (1–3 GHz or L band) is particularly suitable as there is a sharp contrast in the dielectric constants for water and soil in this region of the spectrum [*Ulaby et al.*, 1986; *Wang and Schmugge*, 1980]. Furthermore, measurements in this frequency range are relatively unaffected by clouds and can penetrate light to moderate vegetation.

[5] As early as the late 1960s and early 1970s small studies were undertaken to determine the feasibility of using microwave brightness temperatures to estimate soil moisture. L band remote sensing of soil moisture can be used to estimate volumetric water content in the top 5 cm of the soil column with a precision of a few percent [*Jackson et al.*, 1995; *Jackson*, 1997; *Jackson et al.*, 1999]. Future pathfinder missions such as NASA's Hydrosphere State (Hydros) and ESA's Soil Moisture and Ocean Salinity (SMOS) will provide global L band observations from which a global soil moisture data set can be obtained [*Entekhabi et al.*, 2004].

### 1.3. Land Data Assimilation

[6] While remote sensing offers the advantage of global coverage, the temporal resolution of observations is limited by the revisit time. The Hydros satellite will revisit a given location on the Earth's surface just once every 2–3 days. Furthermore, the L band brightness temperature relates to the soil moisture at the surface (top 5 cm) and yields no information on the root zone. Forcing a land surface model with meteorological data can produce soil moisture and temperature estimates, along with the associated fluxes, at the temporal resolution of the model, thereby yielding information on the diurnal cycle. However, such unconstrained simulations are subject to errors in model structure and to uncertainty in the forcing. Data assimilation offers a means to combine the advantages of modeling with those of remote sensing.

[7] Data assimilation techniques have been used in meteorology and oceanography for decades. A comparison of the various techniques is provided by *Ghil and Malanotte-Rizzoli* [1991]. *Courtier et al.* [1993] compiled a list of significant papers in the application of data assimilation techniques to meteorology problems. Data assimilation techniques can be roughly divided into two categories: variational techniques and those derived from the classic Kalman filter. Both methods have been applied to hydrological research in recent years.

[8] The central concept in variational data assimilation is the adjoint model. It is obtained by linearizing the forward model along a trajectory, which produces the tangent-linear model, and then taking the adjoint of that linearized model. Thus variational techniques require that the model be differentiable. *Lorenc* [1986] describes various variational techniques which have been applied in meteorology. Several applications in oceanography and meteorology are discussed by *Ghil and Malanotte-Rizzoli* [1991] and *Wunsch* [1996]. Variational techniques have been successfully applied to hydrological problems in recent years [*Castelli et al.*, 1999; *Boni et al.*, 2001; *Reichle*, 2000; *Reichle et al.*, 2001a, 2001b; *Margulis*, 2002]. 4DVAR, in which observations distributed in space and time are used with knowledge of the temporal evolution of the state, is particularly suited to our problem as demonstrated by *Reichle* [2000], but it requires development of the adjoint. While automatic adjoint compilers are available [*Giering and Kaminski*, 1998], they can prove difficult to use with large and intricate numerical models, and typically involve extensive tuning and sensitivity studies to validate the adjoint model generated. A means is sought by which temporally distributed observations may be used in a smoothing approach like 4DVAR without resorting to a simplified land surface model.

[9] The classic Kalman filter as discussed by *Gelb* [1974] provides the optimal state estimate for linear systems. It is therefore of little use in hydrological applications where the physical model equations are often nonlinear and contain thresholds. In the extended Kalman filter for nonlinear systems [*Gelb*, 1974; *Jazwinski*, 1970], approximate expressions are found for the propagation of the conditional mean and its associated covariance matrix. The structure of the propagation equations is similar to that of the classic Kalman filter for a linear system, as they are linearized about the conditional mean. The extended Kalman filter has been successfully applied to the land data assimilation problem [*Entekhabi et al.*, 1994; *Galantowicz et al.*, 1999; *Walker et al.*, 2001; *Walker and Houser*, 2001; *Crosson et al.*, 2002], but its use in this application would require derivation of a tangent-linear model to approximate the land surface model, as well as techniques to treat the instabilities which might arise from such an approximation. *Ljung* [1979] performed a convergence analysis of the extended Kalman filter and demonstrated the potential for divergence or bias in estimates in nonlinear systems. *Nakamura et al.* [1994] encountered such instability in their application of the extended Kalman filter to soil moisture estimation.

[10] An alternative sequential estimation technique for nonlinear problems was proposed by *Evensen* [1994]. In the ensemble Kalman filter (EnKF) an ensemble of model states is integrated forward in time using the nonlinear forward model with replicates of system noise. At update times, the error covariance is calculated from the ensemble. The traditional update equation from the classical Kalman filter is used, with the Kalman gain calculated from the error covariances provided by the ensemble. The EnKF has been successfully implemented by *Evensen and Van Leeuwen* [1996], *Houtekamer and Mitchell* [1998], and *Houtekamer and Mitchell* [2001] and has already been used to merge L band observations with model output to estimate soil moisture [*Reichle et al.*, 2002; *Margulis et al.*, 2002; *Crow*, 2003; *Crow and Wood*, 2003]. Research in ensemble techniques has yielded innovative methods of improving estimates and reducing the computational burden [*Segers et al.*, 2000; *Heemink et al.*, 2001; *Verlaan*, 1998; *Verlaan and Heemink*, 2001]. The advantages and disadvantages of the EnKF are compared to those of variational techniques in Table 1.

| | Ensemble-Based Filters | Variational Techniques |
|---|---|---|
| Advantages | Any model can be used. Model does not need to be differentiable. Noise can be placed anywhere, for example, on uncertain parameters and forcing. Noise can be non-Gaussian and nonadditive. | Uses all data in a batch window to estimate the state. |
| Disadvantages | Estimates are conditioned on past measurements only. | Model must be differentiable to obtain tangent-linear model. Process noise can only be additive and Gaussian. Changes to model require that adjoint be obtained again. |
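
The EnKF update described above (error covariances computed from the ensemble, then the classical Kalman update applied to each replicate) can be illustrated with a minimal sketch. It is not the implementation used in any of the cited studies. It assumes a linear observation operator `H`, whereas a brightness temperature observation would pass through a nonlinear radiative transfer model, and it uses the perturbed-observation variant of the filter. The function name `enkf_update`, the two-layer state, and all numerical values are hypothetical.

```python
import numpy as np


def enkf_update(X, y, H, R, rng):
    """Perturbed-observation EnKF update (illustrative sketch).

    X   : (n_state, n_ens) forecast ensemble, one replicate per column
    y   : (n_obs,) observation vector
    H   : (n_obs, n_state) observation operator, assumed linear here
    R   : (n_obs, n_obs) observation error covariance
    rng : numpy.random.Generator used to perturb the observations
    """
    n_ens = X.shape[1]
    HX = H @ X                                  # predicted observations
    A = X - X.mean(axis=1, keepdims=True)       # state anomalies
    HA = HX - HX.mean(axis=1, keepdims=True)    # predicted-observation anomalies

    # Sample covariances taken from the ensemble itself.
    P_xy = A @ HA.T / (n_ens - 1)
    P_yy = HA @ HA.T / (n_ens - 1) + R

    # Kalman gain K = P_xy P_yy^{-1}; P_yy is symmetric, so solve() suffices.
    K = np.linalg.solve(P_yy, P_xy.T).T

    # Each replicate is updated against its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(y.size), R, size=n_ens).T
    return X + K @ (y[:, None] + noise - HX)


# Hypothetical two-layer example: row 0 is surface moisture, row 1 is root zone.
rng = np.random.default_rng(0)
n_ens = 500
z1 = rng.standard_normal(n_ens)
z2 = rng.standard_normal(n_ens)
surface = 0.30 + 0.05 * z1
root = 0.25 + 0.03 * (0.8 * z1 + 0.6 * z2)     # correlated with the surface
prior = np.vstack([surface, root])

H = np.array([[1.0, 0.0]])                      # only the surface is observed
R = np.array([[0.02 ** 2]])                     # assumed observation error variance
y = np.array([0.20])                            # assumed surface observation

posterior = enkf_update(prior, y, H, R, rng)
```

In this toy setting the root zone row is never observed directly, yet it is still adjusted because the ensemble carries a sample covariance between the two layers. No tangent-linear or adjoint model is needed at any point, which is the property listed under the advantages of ensemble-based filters in Table 1.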

[11] In the past, soil moisture observations have typically been gathered during field experiments such as the Southern Great Plains Field Experiments (SGP97 and SGP99) and Soil Moisture Experiments in 2002 (SMEX02) and 2003 (SMEX03). Smoothing is ideal for analyzing historic data or data which are not available in real time, as is the case with data from field experiments or exploratory missions such as Hydros and SMOS. Smoothing involves using all measurements in an interval **T** = [0, *T*] to estimate the state of the system at some time *t*, where 0 ≤ *t* ≤ *T*, so that the state estimate at a given time is determined by including information from subsequent observations. It will be argued that an ensemble-based smoothing (or batch estimation) approach is most suited to the soil moisture estimation problem.
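
The distinction between filtering and smoothing can be stated compactly in probabilistic terms. The notation below is introduced here for illustration only and is an assumption, not the original text's: $x_t$ denotes the state at time $t$, and $y_{0:s}$ denotes all measurements collected in $[0, s]$.

$$
\text{filtering: } \; p\left(x_t \mid y_{0:t}\right),
\qquad
\text{smoothing: } \; p\left(x_t \mid y_{0:T}\right), \quad 0 \le t \le T .
$$

The two estimates coincide at $t = T$. For $t < T$, the smoother additionally conditions on the measurements collected after the estimation time.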

[12] Results from the EnKF experiment [*Margulis et al.*, 2002] suggest that the estimate could be improved through the inclusion of subsequent observations. Precipitation events divide the study interval into a series of dry-down events. In estimating soil moisture at a given time, one is estimating a single point value in a series. It is intuitive that the manner in which that series evolves in the future is related to the state at the estimation time. Future observations provide information on the shape of this series in the future and so contain useful information on the current state. Correlation between the states and the observations decreases with depth as the observations relate to the surface conditions. Consequently the impact of the observations is lessened with increasing depth. This means that it takes longer to correct for spurious initial conditions at depth than close to the surface. As the impact of the observations eventually penetrates the deeper layers, the latent heat flux estimate is seen to approach the observed values. Difficulty in estimating the root zone soil moisture results in poor initial estimates of the latent heat flux [*Margulis et al.*, 2002]. If including subsequent observations can improve on the initial conditions at depth, it would result in improved latent heat flux estimates.

[13] In the following section an ensemble-based smoother will be developed as an extension of the conventional EnKF which, by including information on how the state evolves beyond the estimation time, should yield improved estimates of the soil moisture at the surface and at depth.