Optimal multiscale Kalman filter for assimilation of near-surface soil moisture into land surface models

Abstract

[1] We undertake an alternative and novel approach to assimilation of near-surface soil moisture into land surface models by means of an extension of multiscale Kalman filtering (MKF). While most data assimilation studies rely on the assumption of spatially independent near-surface soil moisture observations to attain computational tractability in large-scale problems, MKF allows us to explicitly and very efficiently model the spatial dependence and scaling properties of near-surface soil moisture fields. Furthermore, MKF has the appealing ability to cope with model predictions and observations made at different spatial scales. Yet another essential feature of our approach is that we resort to the use of the expectation maximization (EM) algorithm in conjunction with MKF so that the statistical parameters inherent to MKF may be optimally determined directly from the data at hand and allowed to vary over time. This constitutes a significant advantage since these parameters (e.g., observation and model error noise variances) essentially determine the performance of the assimilation approach and have so far been most commonly prescribed heuristically and not allowed to evolve in time. We test our approach by assimilating the near-surface soil moisture fields derived from the electronically scanned thinned array radiometer (ESTAR) during the Southern Great Plains Hydrology experiment of 1997 (SGP97) into the three-layer variable infiltration capacity (VIC-3L) land surface model. The results show that assimilation significantly improves the short-term predictions of soil moisture and energy fluxes from VIC-3L, especially with regard to capturing the spatial structure of these state variables. Additionally, we find that letting the statistical parameters of the assimilation algorithm evolve in time yields an adequate representation of the time-varying uncertainties in land surface model predictions.

1. Introduction

[2] Understanding the complex spatial and temporal variability of land surface-atmosphere interactions associated with the hydrologic cycle and the exchange of energy fluxes at the land surface is vital to capturing and predicting the overall functioning of atmospheric and climate processes. Currently, land data assimilation systems are being developed for North America and the globe as attempts to improve the accuracy of reanalysis and forecast simulations by numerical weather prediction models (http://ldas.gsfc.nasa.gov/). The crux of these efforts is the reduction of errors associated with the prediction of soil moisture storage and energy fluxes from land surface models. In this context, several recent studies have focused on assimilation of radiobrightness temperature or near-surface soil moisture into land surface schemes [e.g., Houser et al., 1998; Reichle et al., 2002a, 2002b; Walker and Houser, 2001; Margulis et al., 2002; Crow and Wood, 2003; Montaldo and Albertson, 2003; Parada and Liang, 2003c, 2003d; S. Chintalapati and P. Kumar, Assimilated soil moisture fields: 1. Spatial variability, submitted to the Journal of Geophysical Research, 2003 (hereinafter referred to as Chintalapati and Kumar, submitted manuscript, 2003); P. Kumar and S. Chintalapati, Assimilated soil moisture fields: 2. Multiscale error propagation, submitted to the Journal of Geophysical Research, 2003 (hereinafter referred to as Kumar and Chintalapati, submitted manuscript, 2003)]. Most commonly, Kalman filtering or one of its variants has been used for this purpose.

[3] While significant progress has been made in evaluating several techniques for assimilation of near-surface soil moisture or radiobrightness temperature into land surface models and in assessing the potential gains associated with them, there still remain several fundamental challenges. Specifically, the need to evaluate inverse covariances within Kalman filtering imposes the assumption of spatially independent near-surface soil moisture fields to attain computational tractability in large-scale problems. Additionally, the statistical parameters within data assimilation methodologies have so far been determined a priori or heuristically and held constant in time. This poses a significant restraint on any assimilation methodology as these parameters (e.g., observation and model error noise variances) essentially determine how much uncertainty is attributed to the remote sensing retrievals and the land surface model predictions when computing the optimal soil moisture estimates. In an operational setting, these parameters should ideally be determined automatically through some optimal metric and allowed to evolve in time to capture the fluctuating nature of the uncertainties in land surface model predictions and possibly in the observations. Yet another challenge arises when having to deal with remote sensing retrievals and land surface model predictions at dissimilar spatial scales within a framework that describes how the variability of near-surface soil moisture evolves with resolution.

[4] This study formalizes previous work by the authors that pertains to the assimilation of remote sensing retrievals of near-surface soil moisture available at scales different from the resolution at which the land surface model predictions are made while being consistent with the spatial structure of the data [Parada and Liang, 2003c, 2003d]. An independent and concurrent study by Kumar and Chintalapati (submitted manuscript, 2003) also addresses the scale discrepancy issue by utilizing multiscale Kalman filtering [Chou et al., 1994] offline from the assimilation algorithm to upscale near-surface soil moisture retrievals to the resolution of the land surface model and then using extended Kalman filtering (EKF) to perform assimilation with the scaled data. Nonetheless, the approach by Kumar and Chintalapati (submitted manuscript, 2003) does not explicitly consider the spatial dependence of the scaled near-surface soil moisture retrievals and the land surface model predictions, respectively, when deriving the optimal soil moisture states through EKF. Moreover, while Chintalapati and Kumar (submitted manuscript, 2003) show the importance of being able to estimate time-varying optimal statistical parameters when performing assimilation with EKF, the work by Chintalapati and Kumar (submitted manuscript, 2003) and Kumar and Chintalapati (submitted manuscript, 2003) does not treat the issue of how this may be accomplished.

[5] In this article, we demonstrate the use and appealing advantages of an extension of multiscale Kalman filtering (MKF) [Chou et al., 1994; Fieguth et al., 1995; Luettgen and Willsky, 1995; Kumar, 1999] for assimilation of near-surface soil moisture data into the three-layer variable infiltration capacity (VIC-3L) hydrologically based land surface model [Liang et al., 1994, 1996, 1999; Liang and Xie, 2001; Cherkauer and Lettenmaier, 1999]. We pay particular attention to deriving optimal statistical parameters for the MKF framework by means of the expectation maximization (EM) algorithm [Kannan et al., 2000]. Moreover, we explicitly consider the spatial dependence and scaling properties reported to exist for near-surface soil moisture fields [e.g., Rodriguez-Iturbe et al., 1995; Oldak et al., 2002]. The key advantages of the proposed assimilation methodology are that it is amenable to estimation of statistical parameters (e.g., observation and model error noise variances) that vary in time as new observations become available, it explicitly represents the spatial dependence of observations and model predictions and constrains the optimal soil moisture estimates to comply with the existing spatial structure, it is also capable of coping with observations being at distinct spatial scales from model predictions, and it is extremely efficient computationally.

[6] The rest of this paper is organized as follows. Section 2 provides some background on the VIC-3L land surface model and describes our study region and data sources. Section 3 presents the proposed assimilation methodology and provides some background and contrasts to commonly used Kalman filtering techniques. In section 4 we thoroughly evaluate the impacts of assimilation on the predictions of soil moisture and energy fluxes from VIC-3L. We further demonstrate the need and importance of allowing the statistical parameters of the assimilation approach to change in time to capture the fluctuating nature of the uncertainties in land surface model predictions. Last, section 5 summarizes our key findings and presents our main conclusions.

2. Study Region, Data Sources, and VIC-3L Land Surface Model

[7] Our study region is a quarter degree by quarter degree area (34.75° to 35°N, 97.9375° to 98.1875°W) encompassing the Little Washita watershed in Oklahoma, which has been the subject of several intensive field measurement campaigns. During the Southern Great Plains Hydrology experiment of 1997 (SGP97), 16 surrogate images of volumetric soil moisture content for the top 5 cm of soil were derived at 800-m resolution by inverting L band microwave radiobrightness temperature imagery retrieved with the electronically scanned thinned array radiometer (ESTAR) on different days within the period ranging from 18 June through 16 July 1997 [Jackson et al., 1995, 1999]. These 16 near-surface soil moisture images were regridded to 1/128° (∼780 m) and assimilated into VIC-3L at a nominal local time of 1000 to 1100 LT on the dates the observations were collected.

[8] A thorough description of the VIC-3L land surface model is given by Liang et al. [1994, 1996, 1999], Liang and Xie [2001], and Cherkauer and Lettenmaier [1999]. Here it suffices to mention that VIC-3L jointly solves the energy and water budgets at the land surface, and that it characterizes the soil column as consisting of three soil layers, which we denote as layers 1, 2, and 3. In setting up the VIC-3L model for this investigation, our intent was to rely solely on data that is commonly available at sites less instrumented than the Little Washita watershed so that we may get a more realistic assessment of the benefits that may be obtained by performing assimilation of near-surface soil moisture into land surface models. The most superficial layer (layer 1) of VIC-3L has been set to a depth of 5 cm for this study to match the approximate penetration of the L band microwave near-surface soil moisture imagery that is assimilated into VIC-3L. One routing and six soil parameters of VIC-3L, which include the depths of soil layers 2 and 3, were automatically calibrated at 1/8° resolution to match daily streamflow observations from 12 July 1994 to 30 April 1997 available from U.S. Geological Survey gauge 07327550 by using the approach described by Parada et al. [2003]. The resulting depths of layers 2 and 3 are 99 cm and 4.97 m, respectively. It is worth stressing that these depths are treated as calibration parameters to allow VIC-3L to better reproduce historical streamflow records and may not reflect the true and unknown field conditions.

[9] The assimilation simulation on VIC-3L was performed at 1/32° resolution (approximately 3.2 km) and at the hourly time step from 1 April 1997 to 31 March 1998. The soil and vegetation information used in this study to run VIC-3L is described by Maurer et al. [2002]. Precipitation, air temperature, wind speed, and atmospheric pressure were taken from data at two observation sites located at (34.9604°N, 97.9789°W) and (34.8830°N, 98.2050°W), respectively, and reported in the hourly surface composite available online at http://www.joss.ucar.edu/gcip/nesob/. The forcing data were uniformly imposed over the entire domain for the assimilation simulation. Hence the effects of the spatial variability of the forcing data, and particularly of precipitation, in determining the spatial structure of near-surface soil moisture are not reflected in this study.

3. Optimal Multiscale Data Assimilation Algorithm

[10] In this section we summarize the mathematical and statistical constituents of the proposed data assimilation paradigm. To facilitate the understanding of our methodology and to ease comparisons to previously used data assimilation techniques, we start by providing a brief sketch of the Kalman filtering framework, which we then contrast to the approach undertaken in this study. The notation section provides a convenient summary of the notation employed in the description of the proposed data assimilation framework.

3.1. Kalman Filter

[11] In Kalman filtering, equation (1a) is the dynamic equation that describes the temporal evolution of the hidden or unobservable state, $x_t$, and equation (1b) is the so-called observation equation, which relates observations, $y_t$, to the hidden state as shown below [Digalakis et al., 1993]:

$$x_{t+1} = F_t x_t + G_t + w_t \tag{1a}$$
$$y_t = C_t x_t + D_t + v_t \tag{1b}$$

where subindex t denotes time. We additionally assume that $w_t$ and $v_t$ are uncorrelated zero-mean Gaussian vectors with covariances

$$E[w_t w_{t'}^T] = Q_t \, \delta_{t,t'} \tag{2a}$$
$$E[v_t v_{t'}^T] = R_t \, \delta_{t,t'} \tag{2b}$$

where $\delta_{t,t'}$ is the Kronecker delta, E[·] is the expectation operator, superscript T connotes the transpose of a matrix, and subindices t and t′ connote time instances. To initialize the Kalman filter, we further specify the distribution of the hidden state at time t = 0 as $x_0 \sim N(\mu_0, P_0)$, where N(μ, P) is used to refer to a normally distributed random variable with mean μ and covariance P.

[12] In the context of data assimilation of soil moisture or radiobrightness temperature into land surface models, $x_t$ may be taken as a vector of the true unknown soil moisture states at time t for various depths along the soil profile. In such a scenario, $F_t$ and $G_t$ serve as piece-wise linear approximations to the underlying physics describing the evolution of soil moisture within a given land surface parameterization, and $w_t$ is a Gaussian noise term intended to capture the uncertainties inherent to the land surface model predictions. Two approaches have so far been taken to specify the observation equation in studies concerned with assimilation of near-surface soil moisture. The first of these, denoted here as the indirect approach, involves the use of a radiation transfer scheme to relate observations of radiobrightness temperature to near-surface soil moisture [e.g., Reichle et al., 2002a; Margulis et al., 2002; Crow and Wood, 2003]. In this case, $C_t$ and $D_t$ constitute piece-wise linear representations of the physics inherent to the radiation transfer scheme being utilized. Alternatively, several researchers have adopted what we refer to as a direct approach, which amounts to assimilating the surrogate near-surface soil moisture products derived offline from the assimilation scheme by inverting remotely sensed radiobrightness temperature imagery [e.g., Walker and Houser, 2001; Reichle et al., 2002b; Chintalapati and Kumar, submitted manuscript, 2003]. In both approaches to assimilation, the observations are assumed to be noisy and their uncertainty is represented through the observation noise term $v_t$.

[13] The Kalman filtering equations provide a framework for obtaining the best linear or least squares estimate of the hidden state $x_t$ conditional on all observations up to time t. Moreover, a significant advantage of the filter over direct insertion or other heuristic methodologies is that it also yields an estimate of the uncertainty or variance of its optimal estimate. Let us write the mean and covariance of $x_t$ conditional on all observations up to an arbitrary time t′ as

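$$\hat{x}_{t|t'} = E[x_t \mid \{y_\tau : \tau \le t'\}] \tag{3a}$$
$$P_{t|t'} = E[(x_t - \hat{x}_{t|t'})(x_t - \hat{x}_{t|t'})^T \mid \{y_\tau : \tau \le t'\}] \tag{3b}$$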

Then, $\hat{x}_{t|t}$ constitutes the optimal linear estimate of $x_t$ given all observations up to time t′ = t. Likewise, $P_{t|t}$ is the covariance of the hidden state about the optimal estimate $\hat{x}_{t|t}$, and hence it represents the uncertainty of $\hat{x}_{t|t}$ as the optimal estimate for $x_t$. Both $\hat{x}_{t|t}$ and $P_{t|t}$ may be obtained recursively by applying the Kalman filtering equations [e.g., Digalakis et al., 1993], which essentially perform a weighted average of the soil moisture states obtained from two sources: the piece-wise linear representation of the land surface model physics given in equation (1a) and the observations available through equation (1b). The weight given to each source is inversely proportional to its uncertainty.
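To make the recursion concrete, the following is a minimal scalar sketch of one predict/update cycle of a Kalman filter with an affine dynamic equation and an affine observation equation. It is an illustration only: the class `ScalarKalmanParams` and the function `kalman_step` are hypothetical names introduced here, the state is reduced to a single number, and nothing below is taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalmanParams:
    """Hypothetical container for scalar analogues of F, G, C, D, Q, R."""
    F: float  # state transition coefficient
    G: float  # additive term of the dynamic equation
    C: float  # observation coefficient
    D: float  # additive offset of the observation equation
    Q: float  # variance of the model noise w
    R: float  # variance of the observation noise v


def kalman_step(x_est, p_est, y, prm):
    """One predict/update cycle; returns the updated mean and variance."""
    # Predict with the (linearized) model dynamics.
    x_pred = prm.F * x_est + prm.G
    p_pred = prm.F * p_est * prm.F + prm.Q
    # Gain: close to 1/C when the prediction is much more uncertain than
    # the observation, close to 0 in the opposite case.
    gain = p_pred * prm.C / (prm.C * p_pred * prm.C + prm.R)
    # Update with the innovation (observation minus its predicted value).
    x_new = x_pred + gain * (y - (prm.C * x_pred + prm.D))
    p_new = (1.0 - gain * prm.C) * p_pred
    return x_new, p_new
```

With C = 1 and D = 0 the update reduces to a weighted average of the prediction and the observation, with weights R/(P + R) and P/(P + R), which is the inverse-uncertainty weighting described above.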

[14] Let us now examine and list some of the assumptions and limitations that must be borne in mind when applying the Kalman filter for assimilation of near-surface soil moisture into land surface models. It is important to stress that the points to be made in the following discussion apply to traditional Kalman filtering, extended Kalman filtering (EKF), and ensemble Kalman filtering (EnKF).

[15] First, the Kalman filter has several statistical parameters that must be specified in a sensible manner to obtain meaningful results. These include $\mu_0$, $P_0$, $Q_t$, and $R_t$. Since $w_t$ and $v_t$ represent the uncertainties in the soil moisture estimates obtained from the land surface model physics and the observations, respectively, their variances ($Q_t$ and $R_t$) essentially determine how much weight the Kalman filter places on each of these two sources when obtaining the optimal soil moisture estimates and their corresponding uncertainties. This clearly highlights two key points. First, it is crucial to obtain reliable estimates of $Q_t$ and $R_t$. Second, we expect the degree of uncertainty in the predictions from land surface models to vary over time (see section 4). Hence we would ideally like to avail ourselves of an objective and automatic procedure to optimally adjust the statistical parameters inherent to the Kalman filter to capture how the accuracy of land surface models evolves in time. Unfortunately, estimation of time-varying parameters for the traditional Kalman filter and some of its variants (e.g., EKF and EnKF), all of which rely on the temporal propagation of hidden state covariances, introduces complex challenges [Digalakis et al., 1993]. Thus, in studies concerning the assimilation of near-surface soil moisture into land surface models it is common practice to heuristically prescribe these parameters and to keep them constant for all times. When doing this, one obtains the least squares estimates $\hat{x}_{t|t}$ and $P_{t|t}$ for a fixed parameter set, but not necessarily the optimal estimates over all possible values of Kalman filter parameters.

[16] Second, because of the need to evaluate inverse covariances for the hidden state, the computational complexity of the Kalman filter increases significantly as the spatial dependence of the hidden state becomes stronger. Thus, to attain computational tractability in large-scale problems, it is common practice to assume that the near-surface soil moisture fields are spatially uncorrelated. Nonetheless, there is significant evidence suggesting that near-surface soil moisture and radiobrightness temperature imagery are characterized by persistent spatial dependence over large distances [e.g., Houser et al., 1998; Parada and Liang, 2003a, 2003b]. Being able to capture such dependence within our assimilation algorithms would allow us to further and better constrain the optimal estimates for the soil moisture states.

[17] Third, since $v_t$ is specified to be a zero-mean process, the Kalman filter assumes that $E[y_t] = C_t E[x_t] + D_t$. This amounts to requiring that there be no systematic biases in the land surface predictions and the observations, respectively. Hence the Kalman filter equations presume that the average soil moisture content (or radiobrightness temperature) for a given area at a given assimilation time is the same for the land surface model predictions and the observations. Unfortunately, it is highly probable that the land surface model predictions and the remote sensing retrievals show potentially large discrepancies in mean area values at the short timescales (e.g., hourly) that are of interest when performing assimilation of near-surface soil moisture. This inevitably induces biases in the optimal estimates obtained from the Kalman filter, which lead to changes in the mean area soil moisture content that are unaccounted for in the Kalman filtering equations and beyond those imposed by the water budget.
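A short worked equation may help to see how such a discrepancy propagates. Suppose, purely as an illustrative assumption, that the observations carry an unmodeled constant offset $b$ (a symbol introduced here, not in the original text), and that the model prediction $\hat{x}_{t|t-1}$ is itself unbiased. Writing $K_t$ for the Kalman gain, the standard update then yields an estimate whose mean is shifted by $K_t b$:

```latex
\begin{align*}
y_t &= C_t x_t + D_t + b + v_t
  && \text{(observation with an unmodeled offset } b\text{)} \\
\hat{x}_{t|t} &= \hat{x}_{t|t-1} + K_t \left( y_t - C_t \hat{x}_{t|t-1} - D_t \right)
  && \text{(standard Kalman update)} \\
E[\hat{x}_{t|t}] - E[x_t] &= K_t \left( C_t E[x_t] + D_t + b - C_t E[x_t] - D_t \right) = K_t\, b
  && \text{(using } E[\hat{x}_{t|t-1}] = E[x_t]\text{)}
\end{align*}
```

Under these assumptions the shift grows with the gain, so the more the filter trusts the observations, the larger the unaccounted change in mean soil moisture content.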

[18] Fourth, the Kalman filter equations do not explicitly consider situations in which the land surface model predictions and the observations may be at distinct spatial scales nor do they provide means of representing the scaling properties that have been reported to exist in retrievals of radiobrightness temperature and the associated near-surface soil moisture products [e.g., Rodriguez-Iturbe et al., 1995; Oldak et al., 2002; Parada and Liang, 2003a, 2003b].

3.2. Optimal Multiscale Kalman Filter

[19] Here, we briefly outline the fundamental equations that constitute our proposed data assimilation framework, which is based on an extension to the multiscale Kalman filter (MKF) algorithm. We also provide the intuition behind the expectation maximization (EM) algorithm for maximum likelihood estimation of the statistical parameters inherent to our data assimilation protocol. For a more complete treatment of MKF, the reader is referred to Chou et al. [1994], Fieguth et al. [1995], and Luettgen and Willsky [1995]. Further details and motivation for the use of the EM algorithm for MKF are also given by Kannan et al. [2000]. For completeness, Appendices A and B provide a full synopsis of the proposed data assimilation paradigm and its implementation for the VIC-3L land surface model.

3.2.1. MKF-Based Data Assimilation

[20] In data assimilation, the objective is to update the predictions for one or more state variables from a computational model by considering one or more observational data sources. In this context, we may have two kinds of state variables that we wish to update simultaneously. The first of these corresponds to variables for which (1) the observations and the land surface model predictions may potentially be at different spatial scales and/or (2) empirical evidence suggests that a given statistical model may appropriately describe the spatial dependence (and/or scaling properties) of these variables. We refer to these as multiscale states. Conversely, variables of the second kind are associated with a single spatial resolution and are designated as single-scale states.

[21] In the context of assimilation of radiobrightness temperature or near-surface soil moisture into land surface models, we may represent the soil moisture content in the near surface (top 5 cm) as a multiscale hidden state for which we have two independent sources of estimates at possibly incongruent spatial resolutions: the land surface model predictions and the remote sensing observations. Furthermore, a significant body of literature has documented that scaling properties exist for remotely sensed L band microwave radiobrightness temperature and near-surface soil moisture imagery [e.g., Rodriguez-Iturbe et al., 1995; Oldak et al., 2002; Parada and Liang, 2003a, 2003b]. On the other hand, the soil moisture contents for deeper soil layers may be thought of as single-scale hidden states for which the predictions from a land surface model constitute our only direct source of information at present and scaling properties cannot be assumed to hold due to the lack of empirical evidence.

[22] We may formalize the above description by letting x(s, ℓ = 1) designate multiscale hidden states, and x(s, 2 ≤ ℓ ≤ L) denote L − 1 different types of single-scale hidden states as formally defined further on. In the application at hand, x(s, ℓ) corresponds to the soil moisture contents for the different soil layers that make up the soil column in VIC-3L. We may schematically represent the 2-D random field x(s, ℓ) as shown in Figure 1, which depicts a multiscale tree. Each level in the tree corresponds to a spatial resolution for the process of interest. In particular, the root node corresponds to the coarsest scale or average over the entire field and the lowest level to the finest resolution for which there are observations. Let us define the abstract index s = (m, i, j) to denote a given node or position in the multiscale tree, where m indexes the resolution and (i, j) the spatial location. We specify m such that it equals M at the coarsest scale or root node (which we also designate as s = root) and decreases to m = 0 at the finest scale. Additionally, let us formally define the index ℓ as being 1 for the multiscale components of x(s, ℓ), such as the moisture content at the near-surface soil layer. On the other hand, let ℓ > 1 for other hidden state variables that we wish to characterize as single-scale processes associated with the scale with index m′, which contains the nodes with indices s′ = {s : s is at scale m′}. In our application, m′ connotes the scale at which the VIC-3L model is run (i.e., 3.2 km), and there is a one-to-one correspondence between the values assigned to ℓ and the indices used to denote the soil layers of VIC-3L. For instance, the true hidden soil moisture contents in layers one, two, and three for grid cell s ∈ s′ are designated as x(s, ℓ = 1), x(s, ℓ = 2), and x(s, ℓ = 3), respectively.

Figure 1. MKF schematic. The abstract index s = (m, i, j) denotes a scale (m) and position (i, j) in the multiscale tree. The coarser-scale node containing s is connoted γs and termed the parent of s. The children of s are denoted by $\alpha_h s$ with h = 1, …, H(s). In our assimilation application a node or grid cell at scale m has spatial dimensions of $0.78 \cdot 2^m$ km by $0.78 \cdot 2^m$ km. The VIC-3L predictions are made at scale m = 2, and the ESTAR retrievals of near-surface soil moisture are available at scale m = 0.
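The indexing described in the caption can be sketched in a few lines. The code below assumes a regular quadtree (four children per node, so H(s) = 4) and zero-based integer positions (i, j). Both are assumptions made here for illustration, as are all function names. Under these assumptions a quarter-degree domain gridded at 1/128° has 32 finest-scale cells per side, which places the root at M = 5.

```python
BASE_CELL_KM = 0.78  # finest-scale (m = 0) cell size quoted in the caption


def cell_size_km(m):
    """Side length of a grid cell at scale m, i.e. 0.78 * 2**m km."""
    return BASE_CELL_KM * 2 ** m


def parent(s):
    """Parent (gamma s) of node s = (m, i, j), assuming a regular quadtree."""
    m, i, j = s
    return (m + 1, i // 2, j // 2)


def children(s):
    """The four children of node s, assuming H(s) = 4 at every coarse node."""
    m, i, j = s
    if m == 0:
        return []  # finest scale: no children
    return [(m - 1, 2 * i + di, 2 * j + dj) for di in (0, 1) for dj in (0, 1)]


def root_scale(cells_per_side):
    """Scale index M of the root for a square domain whose side holds
    2**M finest-scale cells (cells_per_side must be a power of two)."""
    big_m = cells_per_side.bit_length() - 1
    if 2 ** big_m != cells_per_side:
        raise ValueError("cells_per_side must be a power of two")
    return big_m
```

For example, `cell_size_km(2)` gives the roughly 3.1-km cell of the VIC-3L scale, and each such cell covers 16 finest-scale cells two levels below it.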

[23] We first define the model describing the evolution of the multiscale process (e.g., near-surface soil moisture) from coarser to finer scales as

$$x(s, \ell = 1) = A(s, \ell = 1)\, x(\gamma s, \ell = 1) + w(s, \ell = 1) \tag{4}$$

where x(s, ℓ = 1) is a vector of dimension $d_x(s, \ell = 1)$ and γs denotes the coarser-scale node containing node s, which is referred to as the parent of s. The process w(s, ℓ = 1) is white, independent of x(s = root, ℓ = 1), and has distribution N(0, Q(s, ℓ = 1)). It is possible to select A(s, ℓ = 1) and Q(s, ℓ = 1) so that x(s, ℓ = 1) has fractal properties, such as those reported to exist for near-surface soil moisture and radiobrightness temperature retrievals obtained from ESTAR [e.g., Rodriguez-Iturbe et al., 1995; Oldak et al., 2002; Parada and Liang, 2003a, 2003b], and power law scaling of its variance can be captured. Specifically, letting

equation image
equation image

with

equation image

corresponds to a long-memory 1/f process displaying a linear decay with slope −η + 1 in a plot of log periodogram versus log frequency [Fieguth et al., 1995]. However, scaling properties need not be imposed on x(s, ℓ = 1) to apply the MKF-based data assimilation methodology advocated here. We may instead obtain optimal estimates of each individual Q(m, ℓ = 1) as shown in Appendix B. Regardless of whether or not we wish to impose fractal properties on the multiscale field, x(s, ℓ = 1), mass conservation dictates that A(s, ℓ = 1) = 1 so that the mean near-surface soil moisture content is preserved from scale to scale.
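As a rough illustration of the coarse-to-fine model with A(s, ℓ = 1) = 1, the sketch below draws one realization of a scalar field on a quadtree by adding zero-mean Gaussian noise to each parent value. The quadtree layout, the function name, and the callable `q_of_scale` (which stands in for the per-scale variances Q(m, ℓ = 1)) are assumptions made here; no particular scaling law is implied.

```python
import random


def simulate_multiscale_field(root_value, root_scale, q_of_scale, seed=0):
    """Sample x(s) = x(parent of s) + w(s), w(s) ~ N(0, Q(m)), on a quadtree.

    q_of_scale is a hypothetical callable returning the noise variance at
    scale m. Returns a dict mapping each scale m to a 2-D list of values.
    """
    rng = random.Random(seed)
    fields = {root_scale: [[root_value]]}
    for m in range(root_scale - 1, -1, -1):
        coarse = fields[m + 1]
        n = 2 * len(coarse)
        sigma = q_of_scale(m) ** 0.5
        # A = 1: each child inherits its parent's value plus zero-mean noise.
        fields[m] = [[coarse[i // 2][j // 2] + rng.gauss(0.0, sigma)
                      for j in range(n)] for i in range(n)]
    return fields
```

Because the noise has zero mean, the expected value of every node equals the root value. The sample mean of a single realization is preserved only on average, not exactly.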

[24] Next, we establish how single-scale hidden states (e.g., the soil moisture content in deeper soil layers 2 and 3 of VIC-3L) relate to the multiscale process x(s, ℓ = 1) and to one another through equation (6) so that the single-scale states may also be updated by conditioning on all observation sources available:

equation image

for all s ∈ s′, and where x(s, ℓ) is a vector of dimension $d_x(s, \ell)$. In the context of assimilation of near-surface soil moisture or radiobrightness temperature into land surface models, L denotes the number of soil layers and it is the index for the deepest such layer. The processes, w(s, ℓ = 2), …, w(s, ℓ = L), are white, independent of x(s = root, ℓ = 1), and have distributions N(0, Q(s, ℓ = 2)), …, N(0, Q(s, ℓ = L)), respectively. F(s, ℓ) and G(s, ℓ) may be specified based on the underlying physics describing soil moisture dynamics in the VIC-3L land surface model. Let $\theta_\ell^t$ denote the volumetric soil moisture content in layer ℓ at time t expressed as a fraction. Additionally, let $z_\ell$ connote the depth of soil at which layer ℓ concludes (i.e., the depth for the interface between layers ℓ and ℓ + 1). By writing mass balance equations consistent with those used in VIC-3L to determine the soil moisture contents in layers 1, 2 and 3 in explicit finite difference form we may obtain relations

equation image
equation image

where Δt is the time step. The scalars $h_{1,2}^t$ and $h_{2,3}^t$ depend on the estimates for fluxes and soil moisture states evaluated at time t. Note that if we let x(s, ℓ) = $\theta_\ell^{t+1}$ in equations (7a) and (7b), we immediately get expressions having the same form as equation (6).

[25] We may have noisy or uncertain observations y(s, ℓ) that can be related to either the multiscale (e.g., near-surface soil moisture) or single-scale (e.g., soil moisture in deeper layers) hidden states through the following equation:

$$y(s, \ell) = C(s, \ell)\, x(s, \ell) + D(s, \ell) + v(s, \ell) \tag{8}$$

where y(s, ℓ) is a vector of dimension $d_y(s, \ell)$, and the observation noise, v(s, ℓ), is white, independent of x(s = root, ℓ = 1) and w(s, ℓ), and has distribution N(0, R(s, ℓ)). A distinguishing feature of the proposed MKF-based data assimilation approach is that we treat both the land surface model predictions for near-surface soil moisture and the remotely sensed imagery as observation sources. Hence, in this study we have two observation sources for near-surface soil moisture available at each assimilation time: (1) the near-surface soil moisture as predicted from VIC-3L at 3.2-km resolution, denoted as $y_{VIC}(s, \ell = 1)$; and (2) the near-surface soil moisture observations derived from ESTAR at 800-m resolution, designated as $y_{ESTAR}(s, \ell = 1)$. The variances of the observation noise terms associated with $y_{VIC}(s, \ell = 1)$ and $y_{ESTAR}(s, \ell = 1)$ are connoted as $R_{VIC}(s, \ell = 1)$ and $R_{ESTAR}(s, \ell = 1)$, respectively. The need for treating land surface model predictions for near-surface soil moisture and remotely sensed imagery as observation sources arises because we do not make use of a dynamic equation such as equation (1a) to describe the temporal evolution of the near-surface soil moisture states. We rather focus on capturing the spatial dependence of this field through equation (4). By doing this, we gain the ability to efficiently cope with predictions and observations being at incongruent spatial scales, to fully describe the spatial dependence of near-surface soil moisture within our assimilation scheme, and to obtain time-varying statistical parameters to describe how the degrees of uncertainty in the remotely sensed imagery and/or the land surface model predictions may change over time.
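The idea of treating a coarse model prediction and fine-scale retrievals as two observation sources of one multiscale state can be illustrated with a deliberately small scalar example. The sketch below collapses the tree to a single parent-child level and sets C = 1 and D = 0. The function name and all arguments are hypothetical, and this is not the upward and downward sweep of Appendix A, only the Gaussian fusion at one coarse node.

```python
def fuse_parent(prior_mean, prior_var, y_coarse, r_coarse,
                y_fine, r_fine, q_fine):
    """Posterior mean and variance of a coarse-node state given one coarse
    observation and several fine-scale observations of its children.

    Model assumed here (scalar, C = 1, D = 0):
        x_child   = x_parent + w,     w   ~ N(0, q_fine)
        y_coarse  = x_parent + v,     v   ~ N(0, r_coarse)
        y_fine[k] = x_child[k] + v_k, v_k ~ N(0, r_fine)
    """
    # Seen from the parent, each fine observation has variance q_fine + r_fine.
    fine_var = q_fine + r_fine
    # Independent Gaussian sources combine by adding precisions.
    precision = 1.0 / prior_var + 1.0 / r_coarse + len(y_fine) / fine_var
    weighted_sum = (prior_mean / prior_var + y_coarse / r_coarse
                    + sum(y_fine) / fine_var)
    post_var = 1.0 / precision
    return post_var * weighted_sum, post_var
```

In this toy setting the relative sizes of `r_coarse` and `r_fine` play the roles of the two observation noise variances: raising one of them shifts the fused estimate toward the other source.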

[26] Note that equation (8) may be used for either direct assimilation of surrogate near-surface soil moisture fields or as a piece-wise linear representation of a radiation transfer scheme for assimilation of radiobrightness temperature. Since we directly assimilate near-surface soil moisture imagery into VIC-3L, we set C(s, ℓ = 1) = 1. Additionally, we allow for biases in the land surface model predictions and the remote sensing imagery for near-surface soil moisture to be captured by the constant D(s, ℓ = 1) as described in section 3.2.2. Last, we should clarify that since the physics underlying the VIC-3L land surface model is used to specify equation (6), which relates the moisture content in deeper layers to the near-surface soil moisture content, the land surface model predictions for the moisture contents in layers 2 and 3 need not be treated as observations.

[27] At each assimilation time, the MKF-based assimilation methodology proposed in this study yields the best linear or least squares estimates (together with their uncertainties) for the hidden soil moisture states in all soil layers of a given VIC-3L grid cell conditional on the observations available at that time only in any spatial location or grid cell, at any depth along the soil column, and at any spatial resolution. Details on how this is accomplished are provided in Appendix A. By contrast, the optimal estimates from the Kalman filter (as described in section 3.1) for a given land surface model grid cell and assimilation time would take into consideration all the observations available up to the time of assimilation, but for that grid cell only and at a single spatial resolution. Moreover, estimation of time-varying statistical parameters may be accomplished with ease for the proposed MKF-based assimilation algorithm as discussed in section 3.2.2. On the other hand, this may prove a much more difficult task in the Kalman filtering framework and has not yet been rigorously attempted in the context of assimilation of near-surface soil moisture or radiobrightness temperature into land surface models.

3.2.2. Expectation Maximization Algorithm for Parameter Estimation

[28] To make parameter estimation possible, it is necessary to impose some notion of stationarity or homogeneity of parameters, so that a sample of reasonable size can be defined. Here we make the following two assumptions. First, the noise term w(s, ℓ) is assumed to have a spatially homogeneous covariance for nodes s associated with a given scale m < M and index ℓ. Second, the observation noise v(s, ℓ) is also assumed to have a stationary covariance for all Y(m, ℓ) = {y(s, ℓ) : s is at scale m}. These assumptions may be summarized as follows

equation image

for all s at scale m < M

equation image

for all y(s, ℓ) at scale m. At each time of assimilation, we must specify the following parameters for the MKF-based data assimilation algorithm:

equation image

Appendix B provides the set of equations that make up the EM algorithm for estimation of the parameters in Θ, with the exception of equation image(s = root, ℓ = 1).

[29] Recall that equation image(s = root, ℓ = 1) is the mean of x(s = root, ℓ = 1). Since we have set A(s, ℓ = 1) = 1 to impose conservation of the mean near-surface soil moisture content across scales and the noise terms w(s, ℓ = 1) have zero mean, equation image(s = root, ℓ = 1) is also the mean of x(s, ℓ = 1) at all other spatial resolutions. If we assumed that the spatial averages for the land surface model predictions of near-surface soil moisture and the corresponding remotely sensed soil moisture imagery are identical (i.e., no bias exists), equation image(s = root, ℓ = 1) could be estimated simply as the average of the near-surface soil moisture predictions from VIC-3L and the remote sensing imagery at each assimilation time, namely, equation image(s = root, ℓ = 1) = E[{YVIC, YESTAR}]. Unfortunately, it is unlikely that no discrepancies or biases exist in the mean area near-surface soil moisture contents from these two sources. In the presence of bias, equation image(s = root, ℓ = 1) is undetermined. This is true both for the Kalman filter and the MKF-based assimilation approach introduced in this paper. Thus a sensible way to examine the impacts of the existing bias is to perform several assimilation runs setting equation image(s = root, ℓ = 1) to distinct and likely values such as

equation image
equation image
equation image

Once a value is chosen for equation image(s = root, ℓ = 1), the bias that is present cannot be ignored. Doing so would cause the mean of the estimated near-surface soil moisture content not to be preserved across scales. A simple way to prevent this from occurring is to account for the bias through the constant D(s, ℓ) in equation (8) as

equation image

A similar approach could be taken when using the Kalman filter and its variants (e.g., EKF and EnKF) to prevent the introduction of errors in the assimilated soil moisture fields that result from the presence of bias.
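The bookkeeping behind this bias correction can be illustrated with a minimal sketch. It assumes that equation (8) has the affine form y = C x + D + v with C = 1, so that the offset for each source is simply that source's spatial mean minus the value chosen for the root mean. The function name `bias_offsets`, the labels passed as `choice`, and the pooled-sample reading of E[{YVIC, YESTAR}] are illustrative assumptions, not the formulation used in this study.

```python
from statistics import fmean


def bias_offsets(y_vic, y_estar, choice="average"):
    """Return (root_mean, d_vic, d_estar) for one assimilation time.

    Assumes the observation model y = x + D + v (C = 1), so that
    D = E[y] - root_mean for each observation source.
    `choice` is a hypothetical label selecting how the root mean is set.
    """
    mean_vic = fmean(y_vic)
    mean_estar = fmean(y_estar)
    if choice == "average":
        # Mean of the pooled sample {Y_VIC, Y_ESTAR} (assumed reading).
        root_mean = fmean(list(y_vic) + list(y_estar))
    elif choice == "estar":
        root_mean = mean_estar
    elif choice == "vic":
        root_mean = mean_vic
    else:
        raise ValueError("choice must be 'average', 'estar', or 'vic'")
    # Offsets that keep the estimated mean consistent across scales.
    return root_mean, mean_vic - root_mean, mean_estar - root_mean
```

With this convention, choosing the ESTAR mean assigns the whole discrepancy to the VIC-3L offset, and vice versa.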

[30] We may also write a reduced parameter set for the application at hand. Since we have only two observation sources, we have to optimize two observation noise variances, namely, RVIC and RESTAR. For the sake of parsimony and motivated by an extensive amount of empirical evidence, we assume that near-surface soil moisture can be characterized as a fractal field [e.g., Rodriguez-Iturbe et al., 1995; Oldak et al., 2002] and make use of equation (5b). Additionally, we neglect the noise terms w(s, ℓ = 2) and w(s, ℓ = 3) in equation (6) since no observations other than the land surface model predictions are available for the soil moisture contents in layers 2 and 3 to make estimation of Q(m, ℓ = 2) and Q(m, ℓ = 3) possible. If observations were available for these deeper soil layers, Q(m, ℓ = 2) and Q(m, ℓ = 3) could be estimated as shown in Appendix B. Given these application-specific constraints, we may rewrite the parameter set as

equation image

[31] At each assimilation time, the EM algorithm iteratively maximizes the true likelihood of the data (i.e., the VIC-3L predictions and ESTAR retrievals of near-surface soil moisture) for the current time step by updating the MKF-based assimilation algorithm parameters in two steps: (1) the expectation step, or E step, where necessary conditional expectations or so-called sufficient statistics are computed and (2) the maximization step, or M step, where the parameter values are updated. This procedure is illustrated schematically in Figure 2. The EM algorithm is known to converge, but it may do so to a local optimum. For a more in-depth coverage and further motivation on the use of the EM algorithm the reader may consult Digalakis et al. [1993], Kannan et al. [2000], and references therein.

Figure 2.

EM algorithm schematic. At each assimilation instance the EM algorithm iteratively maximizes the likelihood of the present data. At iteration k the EM algorithm updates the parameter set Θk in two steps. Sufficient statistics are evaluated in the expectation step, or E step, and improved parameter values are computed in the maximization step, or M step. The procedure is initiated with an initial estimate for the parameter set, Θo and terminates upon convergence of the parameter values.
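The E step and M step loop can be illustrated on a deliberately simplified problem. The sketch below is not the multiscale EM of Appendix B: it assumes a single hidden scalar state x ~ N(m, q) observed by two sources, y1 = x + v1 and y2 = x + v2, with noise variances r1 and r2 (loosely analogous to RVIC and RESTAR). The mean m is fixed outside EM as the pooled sample mean, and all names are hypothetical.

```python
def em_two_sources(y1, y2, theta0=(1.0, 1.0, 1.0), n_iter=2000, tol=1e-10):
    """EM estimates (m, q, r1, r2) for y1 = x + v1, y2 = x + v2, x ~ N(m, q)."""
    n = len(y1)
    m = (sum(y1) + sum(y2)) / (2 * n)  # mean fixed outside EM (assumption)
    # Second moments about m are the only data summaries the loop needs.
    s11 = sum((a - m) ** 2 for a in y1) / n
    s22 = sum((b - m) ** 2 for b in y2) / n
    s12 = sum((a - m) * (b - m) for a, b in zip(y1, y2)) / n
    q, r1, r2 = theta0  # initial parameter guess
    for _ in range(n_iter):
        # E step: posterior variance p and gains of x given (y1, y2),
        # then the expected squared deviations (sufficient statistics).
        p = 1.0 / (1.0 / q + 1.0 / r1 + 1.0 / r2)
        k1, k2 = p / r1, p / r2
        e_xx = k1 * k1 * s11 + k2 * k2 * s22 + 2 * k1 * k2 * s12 + p
        e_v1 = (1 - k1) ** 2 * s11 + k2 * k2 * s22 - 2 * (1 - k1) * k2 * s12 + p
        e_v2 = k1 * k1 * s11 + (1 - k2) ** 2 * s22 - 2 * k1 * (1 - k2) * s12 + p
        # M step: the updated variances equal the expected squared deviations.
        new = (e_xx, e_v1, e_v2)
        converged = max(abs(a - b) for a, b in zip(new, (q, r1, r2))) < tol
        q, r1, r2 = new
        if converged:  # terminate once the parameter values stop changing
            break
    return m, q, r1, r2
```

As in the full algorithm, the iteration starts from an initial guess and stops when the parameter values stop changing; in this toy setting the three variances are identifiable because the two sources observe the same hidden state.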

4. Results

[32] In this section we test the proposed MKF-based data assimilation methodology and assess the impacts of assimilation on the predictions of soil moisture and energy fluxes from VIC-3L. We further demonstrate the need to allow the statistical parameters of the assimilation approach to change in time to capture the fluctuating nature of the uncertainties in land surface model predictions.

4.1. Impacts of Assimilation on VIC-3L Predictions

[33] We first examine the effects that assimilation of near-surface soil moisture has on the predictions of soil moisture states and energy fluxes from the VIC-3L land surface model. We start by looking at spatially averaged results and then proceed to analyze the changes in spatial variability resulting from assimilation. We also examine the effects that discrepancies in the spatially averaged near-surface soil moisture fields predicted from VIC-3L and retrieved from ESTAR have on the results by taking into account the presence of bias within our assimilation formulation as described in section 3.2.2. In particular, we perform three assimilation simulations corresponding to the three distinct values for the true spatial mean of the near-surface soil moisture field given in equations (9a)–(9c). In all of these, the VIC-3L predictions are made at a resolution of 3.2 km whereas the ESTAR retrievals are available at 800-m resolution. Throughout this section, we refer to the VIC-3L results in the absence of assimilation as control run predictions.

[34] Figure 3 depicts the temporal evolution of the soil moisture states for the three soil layers in VIC-3L from 18 June to 17 July 1997. During this period, the VIC-3L control run predictions for spatially averaged volumetric near-surface soil moisture content deviate by less than 1% to at most 6% from the ESTAR observations at the times of assimilation. This constitutes both an encouraging result from the point of view of the VIC-3L model performance and clear evidence of the presence of biases in the land surface model predictions of near-surface soil moisture with respect to the ESTAR retrievals at the short timescales (e.g., hourly or subhourly) at which assimilation may be performed given the instantaneous nature of the remote sensing observations. In particular, the effects of biases are clearly seen following the precipitation event on 15 July, after which the assimilation simulations for the three different specifications of mean area near-surface soil moisture content diverge from one another. In general, however, we find that the changes in the spatially averaged soil moisture states in all three soil layers resulting from assimilation of near-surface soil moisture are small since the VIC-3L model does well at capturing the mean area near-surface soil moisture content. Nonetheless, we observe a small decrease in soil moisture content in the second soil layer toward the end of the SGP97 experiment.

Figure 3.

Temporal evolution of spatially averaged soil moisture contents in layers 1, 2, and 3 of VIC-3L. Assimilation results are shown for the three distinct specifications of the mean area soil moisture content, namely, as the average mean from ESTAR and VIC-3L predictions (equation (9a), “Avge. Mean”), the mean from ESTAR retrievals (equation (9b), “ESTAR Mean”), and the mean from VIC-3L predictions (equation (9c), “VIC Mean”). Tick marks for a given date are placed at 0000 LT. Assimilation was performed at 1000 LT.

[35] In Figure 4 we compare the VIC-3L predictions of latent and sensible heat fluxes to observations taken at a flux tower located at (34.9604°N, 97.9789°W) for the month of the experiment. The VIC-3L predictions correspond to the 3.2 by 3.2 km grid cell that contains the flux tower. Since all assimilation cases yield very similar results at this particular location, Figure 4 only shows results for the assimilation case in which we specify the spatial mean of the near-surface soil moisture through equation (9a), namely, as the average of the means corresponding to the VIC-3L predictions and the ESTAR retrievals. It is important to stress that we may only establish a qualitative assessment of the similarities in pattern and overall structure of variability between the observations and VIC-3L energy flux predictions. A point-wise match in the flux predictions and the observations need not exist since there is a scale discrepancy in their corresponding representative areas. While the flux tower observations constitute point measurements, the VIC-3L predictions represent averages over a 3.2 km by 3.2 km area. With this in mind, we find that the VIC-3L model does well overall at capturing the diurnal cycle of latent and sensible heat fluxes at this location both in the control run and assimilation simulations.

Figure 4.

Comparison to observed fluxes at (34.9604°N, 97.9789°W). VIC-3L results correspond to the 3.2 km by 3.2 km grid cell encompassing the flux tower. Assimilation results correspond to the specification of the mean-area near-surface soil moisture content as in equation (9a), namely, as the average mean from ESTAR retrievals and VIC-3L predictions. Tick marks for a given date are placed at 0000 LT. Assimilation was performed at 1000 LT.

[36] Figure 5 portrays the improvements in spatial structure of near-surface soil moisture predictions for all 16 dates at which assimilation is performed. Figure 5 displays the soil moisture content for the 5-cm soil layer near the surface corresponding to (1) the ESTAR observations at 800-m resolution, (2) the VIC-3L predictions at 3.2-km resolution for the time of assimilation, given that assimilation has taken place in the past but not yet at the present time, which we designate as the priors, and (3) the optimal near-surface soil moisture fields for the current assimilation time at 3.2-km resolution obtained from the proposed framework after performing assimilation, which we call the posteriors. The assimilation results presented in Figure 5 correspond to the specification of the spatial average of near-surface soil moisture as in equation (9a). Note that since 18 June is the first date at which assimilation is performed, the model prior for this date is identical to the control run prediction.

Figure 5.

Spatial structure of percent volumetric near-surface soil moisture content (θ1) corresponding to ESTAR retrievals, VIC-3L priors, and VIC-3L optimal fields for all assimilation dates. Results correspond to the specification of the mean area near-surface soil moisture content as in equation (9a), namely, as the average mean from ESTAR retrievals and VIC-3L predictions.

[37] Figure 5 presents us with two clear indications of the benefits that may result from assimilation of near-surface soil moisture. The first of these is the considerable improvement in the spatial structure of the optimal near-surface soil moisture fields obtained from the proposed MKF-based assimilation paradigm. The second encouraging finding is that the improvements in spatial structure of near-surface soil moisture predictions tend to persist for time periods ranging from 24 hours to several days. For instance, the prior images for 25 June and 2 July, among others, show good agreement with the ESTAR near-surface soil moisture imagery for these dates. This is important in practice as we expect lapses of 3 days to a week between successive operational remote sensing retrievals of near-surface soil moisture for any given location.

[38] A striking feature of Figure 5 is the rather blocky structure in the near-surface soil moisture predictions for the control run (e.g., 18 June) and in the priors at times following a long period of no assimilation (e.g., 1 July). We also see a lack of resemblance in spatial features that arises following precipitation events (see Figure 3), after which the improvements in spatial structure of near-surface soil moisture predictions from VIC-3L due to assimilation deteriorate noticeably (e.g., 16 July). We conjecture that this is likely a consequence of two commonly encountered limitations in land surface modeling. First, the strong emphasis on vertical fluxes typical of land surface parameterizations and the weak horizontal connectivity among model grid cells allow sharp spatial transitions in the near-surface soil moisture predictions to develop. Second, soil properties (such as saturated hydraulic conductivity) are much more heterogeneous in reality than our soil parameters can capture. As we shall see, the implications of these findings are that assimilation may only yield (potentially significant) improvements in the short-term prediction of soil moisture states. Harnessing the full power of assimilation for improving long-term predictions may require that we make further use of the available remote sensing imagery to revise and improve the structure and specification of parameters in land surface models.

[39] In Figure 6 we evaluate the impact that assimilation of the 16 near-surface soil moisture fields had on prediction of soil moisture content in the three soil layers of VIC-3L. We do this for each VIC-3L grid cell by computing the difference between the soil moisture states in the model runs influenced by assimilation and the control run for every hour from 18 June to 17 July 1997. We then divide by the total number of hours to obtain a mean hourly change in volumetric soil moisture content for each of the VIC-3L grid cells. This objective measure may be written as:

$$\Delta Z = \frac{1}{T}\sum_{t=1}^{T}\left(Z_{\mathrm{ASSIM},t} - Z_{\mathrm{CONTROL},t}\right) \qquad (11)$$

where ZASSIM,t and ZCONTROL,t denote the predictions for a given state variable (such as soil moisture content) in the assimilation and control runs, respectively, and T denotes the number of hours in the time period of interest. Figure 6 demonstrates that assimilation of near-surface soil moisture induced a noticeable spatial redistribution in the predictions of moisture content in all three soil layers of VIC-3L. In assessing the significance of the adjustments made to layers 2 and 3, we must bear in mind that these layers are deeper than the thin surface soil layer. Thus a seemingly smaller correction in the volumetric moisture content in these soil layers may represent a larger amount of water, which is what ultimately plays a role in solving the water budget. For instance, the maximum absolute mean hourly changes in moisture content for layers 1, 2, and 3 are respectively 1.3, 6, and 10 mm. It is also worth noting that all assimilation simulations yielded similar changes in the spatial structure of soil moisture predictions, which can be explained by the good performance of the VIC-3L land surface model with regard to capturing the spatially averaged near-surface soil moisture content. Nonetheless, it is possible to distinguish differences that result from the different specifications of the mean soil moisture content in all three assimilation cases. These could potentially become far more pronounced in situations in which a given land surface parameterization incurs larger biases.
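As a minimal sketch of this measure for a single grid cell, the first function below averages the hourly differences between the assimilation and control runs, and the second converts a change in percent volumetric moisture content into an equivalent depth of water. The function names are hypothetical, and the layer thickness passed in the usage comment is illustrative only (the text states only that the surface layer is 5 cm thick).

```python
def mean_hourly_change(z_assim, z_control):
    """Mean over all hours of (assimilation run - control run) for one cell."""
    if len(z_assim) != len(z_control) or not z_assim:
        raise ValueError("series must be non-empty and of equal length")
    return sum(a - c for a, c in zip(z_assim, z_control)) / len(z_assim)


def volumetric_to_depth_mm(delta_percent, layer_thickness_m):
    """Convert a change in percent volumetric water content to mm of water."""
    return delta_percent / 100.0 * layer_thickness_m * 1000.0


# A 1% volumetric change in a 5-cm layer corresponds to 0.5 mm of water,
# whereas the same change in a (hypothetical) 1-m layer corresponds to 10 mm.
```

The conversion makes explicit why a small volumetric correction in a thick layer can outweigh a larger correction in the thin surface layer.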

Figure 6.

Mean hourly changes in percent volumetric soil moisture content in layers 1 (Δθ1), 2 (Δθ2), and 3 (Δθ3) of VIC-3L from 18 June to 17 July 1997 resulting from assimilation of near-surface soil moisture. Results are shown for the three distinct specifications of the mean area soil moisture content, namely, as (top) the average mean from ESTAR and VIC-3L predictions (equation (9a)), (middle) the mean from ESTAR retrievals (equation (9b)), and (bottom) the mean from VIC-3L predictions (equation (9c)).

[40] Figure 7 presents the mean hourly changes in latent and sensible heat fluxes for all VIC-3L grid cells during the month of the SGP97 experiment computed by means of equation (11). Clearly, assimilation of near-surface soil moisture has had a substantial impact on the way VIC-3L partitions its energy predictions into latent and sensible heat fluxes. Specifically, the mean adjustments in the predictions of these fluxes are on the order of 10 W/m2 for each hour during the SGP97 experiment for all assimilation cases in spite of the ability of VIC-3L to capture the spatially averaged near-surface soil moisture content. This clearly highlights the need to better represent the spatial features of soil moisture states within land surface parameterizations and points to the potential for using remote sensing retrievals toward this purpose. By analyzing the results presented thus far, we recognize a clear and physically sound correspondence between changes in the flux predictions and the spatial redistribution of soil moisture states depicted in Figure 6. Additionally, we find that assimilation prompts increases in latent heat flux predictions that are congruent with the mean area decrease in water content (see Figure 3) in the second model soil layer, which contains most of the vegetation roots, due to increased evapotranspiration.

Figure 7.

Mean hourly changes in latent heat (ΔLE) and sensible heat (ΔSH) fluxes from 18 June to 17 July 1997 resulting from assimilation of near-surface soil moisture. Results are shown for the three distinct specifications of the mean area soil moisture content, namely, as (top) the average mean from ESTAR and VIC-3L predictions (equation (9a)), (middle) the mean from ESTAR retrievals (equation (9b)), and (bottom) the mean from VIC-3L predictions (equation (9c)).

[41] We now return to the crucial issue of determining the temporal persistence of improvements in the spatial structure of near-surface soil moisture predictions induced by assimilation. Let θ1,tk denote the 2-D field of near-surface soil moisture predicted by VIC-3L k hours after the last assimilation of ESTAR retrievals at time t. Adhering to this notation, θ1,tk=0 corresponds to the last optimal near-surface soil moisture field assimilated into VIC-3L. Also, let θ1,tk(i, j) designate the near-surface soil moisture corresponding to the VIC-3L grid cell with location indexed by i and j. We can compute cross correlations as shown in equation (12) to determine the degree of similarity in the spatial distribution of near-surface soil moisture fields predicted at times t and t + k:

equation image

where NVIC denotes the number of VIC-3L pixels and SD(·) denotes standard deviation. We expect that as time progresses, the cross correlation between near-surface soil moisture predictions θ1,tk=0 and θ1,tk deteriorates in response to two distinct agents: (1) the underlying physical processes driving the temporal dynamics of near-surface soil moisture; and (2) the decreased fidelity with which a land surface parameterization such as VIC-3L reproduces the spatial features of near-surface soil moisture as we move away from the last instance of assimilation at time t. At times in which ESTAR retrievals are available, it becomes possible to distinguish between the two forces that cause the decay of equation (12) by computing the cross correlation between ESTAR retrievals. This is done through equation (12) by letting θ1,tk=0 be the ESTAR retrieval of near-surface soil moisture corresponding to time t, allowing θ1,tk to be the near-surface soil moisture field derived from ESTAR k hours after time t, and replacing NVIC by the number of pixels in the ESTAR imagery.
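A minimal sketch of such a spatial cross correlation is given below. It assumes that equation (12) is the usual sample correlation coefficient, that is, the average over pixels of the product of deviations from the spatial means, divided by the product of the two standard deviations, with the number of pixels as the normalizing count. The function name and the nested-list representation of a 2-D field are illustrative choices.

```python
from statistics import fmean, pstdev


def spatial_cross_correlation(field_a, field_b):
    """Cross correlation between two 2-D fields defined on the same grid."""
    a = [value for row in field_a for value in row]
    b = [value for row in field_b for value in row]
    if len(a) != len(b) or not a:
        raise ValueError("fields must be non-empty and have the same size")
    mean_a, mean_b = fmean(a), fmean(b)
    # Covariance normalized by the number of pixels.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    # pstdev uses the same normalization; a spatially constant field
    # has zero standard deviation and raises ZeroDivisionError here.
    return cov / (pstdev(a) * pstdev(b))
```

The same function applies to pairs of VIC-3L fields or pairs of ESTAR fields, since only the number of pixels changes.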

[42] Figure 8 depicts cross correlations obtained per equation (12) for four instances in which the last assimilation takes place on 18 June, 30 June, 11 July, and 12 July, respectively. These encompass the three different scenarios observed by performing a similar analysis for all assimilation dates. The days following 30 June correspond to slowly varying near-surface soil moisture dynamics (see Figure 3) in which VIC-3L effectively preserves the improved spatial structure of near-surface soil moisture. By contrast, the assimilation on 11 July closely follows a precipitation event that causes rapid changes in the near-surface soil moisture content, which cannot be adequately reproduced by VIC-3L. Last, the periods following 18 June and 12 July initially coincide with slowly to moderately evolving near-surface soil moisture dynamics up to the occurrence of precipitation on 23 June and 15 July, respectively, which induces rapid variations in the near-surface soil moisture and causes the improvements in its spatial structure due to assimilation to deteriorate. Figure 8 shows that the time persistence of assimilation improvements to the spatial structure of near-surface soil moisture predictions depends strongly on the rate of change of the soil moisture dynamics in the near-surface and is considerably degraded or fully obliterated by the occurrence of precipitation. The findings reported here are case-specific and indicate that the VIC-3L model setup and/or parameterization cannot reproduce the initial rate of drainage following strong rainfall events accurately enough to preserve the spatial features of near-surface soil moisture content. However, similar findings have recently been reported by Chintalapati and Kumar (submitted manuscript, 2003), and the difficulties encountered here are likely to be common to other land surface schemes due to the complications that arise when specifying the highly variable and spatially heterogeneous parameters that control gravity drainage, such as hydraulic conductivity [e.g., Montaldo and Albertson, 2003 and references therein]. These findings indicate that the benefits of assimilation may be restricted to improving short-term predictions unless we further its use as a diagnostic and corrective tool for improving the structure of land surface models and the specification of relevant parameters.

Figure 8.

Temporal persistence of improvements in the spatial structure of VIC-3L predictions of near-surface soil moisture content computed per equation (12). Results correspond to the specification of the mean area near-surface soil moisture content as in equation (9a), namely, as the average mean from ESTAR retrievals and VIC-3L predictions.

4.2. Impacts of MKF Parameters on Uncertainty Estimates and VIC-3L Predictions

[43] Just as in linear regression we can analyze the residuals to determine whether the assumption of a linear trend is acceptable, in any parametric assimilation framework we can examine the residuals to establish the adequacy of the approach and the quality of the predictions obtained. Let equation image(s, ℓ = 1∣Y(s = root, ℓ = 1)) and P(s, ℓ = 1∣Y(s = root, ℓ = 1)) denote the optimal near-surface soil moisture estimate and its associated variance or uncertainty obtained at a given assimilation time for node s as in Appendix A. The residuals are by definition the difference between the observations and what our assimilation paradigm tells us they should be, namely, from equation (8)

equation image
equation image

where equation imageVIC(s, ℓ = 1) and equation imageESTAR(s, ℓ = 1) designate the residuals with respect to the VIC-3L predictions of near-surface soil moisture and the ESTAR retrievals, respectively. If the assumptions ingrained in equations (4) and (8) are appropriate, the residuals in equations (13a) and (13b) should be zero-mean Gaussians with variances given by [Luettgen and Willsky, 1995]

equation image
equation image

Equations (14a) and (14b) constitute estimates for what the variance of the residuals should be and are mainly dependent on the noise variances used in assimilation to describe the errors in land surface model predictions and ESTAR retrievals, namely, RVIC and RESTAR. Hence we can objectively verify whether our characterization of uncertainties for the VIC-3L predictions and remotely sensed near-surface soil moisture is suitable by comparing equations (14a) and (14b) to the sample variances of the corresponding residuals.

[44] We first resort to the Kolmogorov-Smirnov test [DeGroot, 1991] to verify whether the VIC-3L and ESTAR residuals for all 16 dates on which assimilation was performed can be characterized as Gaussian random variables at the 95% level of confidence. We find that the assumption of normality cannot be rejected for the VIC-3L residuals corresponding to all assimilation dates. Similarly, we find that the ESTAR residuals for 15 out of the 16 dates of assimilation can be adequately represented as being normally distributed. We additionally examine the adequacy of the assessment of uncertainties provided by the MKF-based assimilation paradigm presented in this article. Toward this purpose, Figure 9 depicts the temporal evolution of the sample and estimated standard deviations for equation imageVIC and equation imageESTAR, respectively. We find that the uncertainty associated with the ESTAR retrievals is nearly constant in time, which may reflect the high quality of the SGP97 data set. By contrast, the uncertainty associated with VIC-3L predictions varies considerably over time. In particular, it tends to be high after prolonged periods of no assimilation and in the days following storm events, which is consistent with the findings presented in section 4.1. This highlights the need to obtain the statistical parameters inherent to the assimilation algorithm directly from the data at hand and to allow them to change over time so as to capture any fluctuations in the performance of land surface models. Most significantly, Figure 9 demonstrates that the proposed MKF-based assimilation algorithm is capable of accurately tracking the standard deviation of residuals, and thus it most likely provides a suitable characterization of the uncertainties in ESTAR retrievals and VIC-3L predictions.
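The two residual checks described above can be sketched as follows. The sketch assumes that the residuals are tested against a fully specified zero-mean Gaussian whose standard deviation is the one predicted by the assimilation algorithm, and it uses the large-sample 5% critical value 1.36/√n of the one-sample Kolmogorov-Smirnov test. Whether the test in this study was set up in exactly this way is not stated, and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def ks_statistic(residuals, predicted_sd):
    """One-sample KS distance between residuals and N(0, predicted_sd**2)."""
    reference = NormalDist(0.0, predicted_sd)
    ordered = sorted(residuals)
    n = len(ordered)
    distance = 0.0
    for i, value in enumerate(ordered):
        cdf = reference.cdf(value)
        # The empirical CDF jumps from i/n to (i + 1)/n at each sample.
        distance = max(distance, (i + 1) / n - cdf, cdf - i / n)
    return distance


def normality_not_rejected(residuals, predicted_sd):
    """True if the KS test does not reject at the 5% level (large-n value)."""
    return ks_statistic(residuals, predicted_sd) <= 1.36 / sqrt(len(residuals))


def sample_sd_about_zero(residuals):
    """Sample standard deviation of residuals taken about a zero mean."""
    return sqrt(fmean(r * r for r in residuals))
```

Comparing `sample_sd_about_zero` with the predicted standard deviation at each assimilation date gives the kind of comparison plotted in Figure 9.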

Figure 9.

Sample and predicted standard deviation of VIC-3L and ESTAR assimilation residuals (see equations (14a) and (14b)). Results correspond to the specification of the mean area near-surface soil moisture content as in equation (9a), namely, as the average mean from ESTAR retrievals and VIC-3L predictions. Tick marks for a given date are placed at 0000 LT. Assimilation was performed at 1000 LT.

[45] To further study and exemplify the importance of providing a time-varying description of the uncertainty in land surface model predictions within the algorithm used for assimilation, we conduct three additional assimilation simulations. In them, we utilize the optimal estimates for all statistical parameters with the exception of RVIC, which we prescribe and hold constant at three distinct values. Specifically, we fix RVIC to the lowest, median, and highest values of its previously obtained optimal estimates for the 16 different dates at which assimilation takes place. Figure 10 displays the sample and predicted standard deviation of VIC-3L residuals for these three different assimilation simulations. When RVIC is set to its minimum optimal value we approach the case of no assimilation, namely, the absence of improvement in the spatial structure of near-surface soil moisture predictions. Hence the sample and predicted standard deviation of VIC-3L residuals go to zero. Conversely, when RVIC is set to its maximum optimal value we approach the case of direct insertion and we place an unjustifiably small weight on the land surface model physics. In doing so, we become unable to track the sample standard deviation of residuals and thus to provide accurate uncertainty estimates for the optimal near-surface soil moisture fields obtained from the assimilation algorithm. This is a significant drawback, as one of the key benefits of adopting a parametric assimilation methodology is precisely the availability of uncertainty measures. Last, upon setting RVIC to its median optimal value, we underestimate the uncertainty in VIC-3L predictions after lengthy periods with no assimilation and following storm events. Once again, we confirm the need for the error noise variance characterizing the uncertainty in land surface model predictions to be determined in an optimal manner and to be allowed to evolve in time. Unfortunately, the time span of the SGP97 experiment and the extent of the available data do not permit an evaluation of the long-term effects of utilizing heuristically prescribed and time-invariant statistical parameters when performing assimilation. However, it is plausible that as we sample the seasonal variability and encounter further instances of under- or overestimation of the uncertainty in land surface model predictions, we incur increasingly larger assimilation biases.
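The two limiting cases can be illustrated with a single-node, precision-weighted update. This is a deliberately reduced sketch that ignores the multiscale tree, the deeper soil layers, and the bias terms: it combines a Gaussian prior with one VIC-3L value and one ESTAR value at the same location. The function name and the numbers in the comments are hypothetical.

```python
def fuse(prior_mean, prior_var, y_vic, r_vic, y_estar, r_estar):
    """Posterior mean and variance from a prior and two noisy observations."""
    w_prior, w_vic, w_estar = 1.0 / prior_var, 1.0 / r_vic, 1.0 / r_estar
    post_var = 1.0 / (w_prior + w_vic + w_estar)
    post_mean = post_var * (w_prior * prior_mean + w_vic * y_vic + w_estar * y_estar)
    return post_mean, post_var


# Very small R_VIC: the estimate collapses onto the model prediction, so the
# VIC residual (y_vic - estimate) and its predicted spread both approach zero.
# Very large R_VIC: the model prediction carries almost no weight and the
# estimate is driven by the prior and the ESTAR value alone.
```

In this reduced setting the weight given to the model prediction is proportional to 1/RVIC, which is why holding RVIC fixed at an extreme value forces one of the two limiting behaviors regardless of how well the model is actually performing on a given date.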

Figure 10.

Sample and predicted standard deviation of VIC-3L assimilation residuals for three distinct time-invariant values of RVIC, namely, the (top) lowest, (middle) median, and (bottom) highest of the optimal values from all 16 assimilation dates. Results correspond to the specification of the mean area near-surface soil moisture content as in equation (9a), namely, as the average mean from ESTAR retrievals and VIC-3L predictions. Tick marks for a given date are placed at 0000 LT. Assimilation was performed at 1000 LT.

5. Conclusions

[46] In this study we have introduced a novel approach to assimilation of near-surface soil moisture into land surface models that fuses the multiscale Kalman filter (MKF) and the expectation maximization (EM) algorithms [Chou et al., 1994; Fieguth et al., 1995; Luettgen and Willsky, 1995; Kannan et al., 2000]. With the proposed MKF-based assimilation paradigm, it is not necessary to impose the notion of spatially independent near-surface soil moisture observations to attain a computationally tractable problem. On the contrary, this methodology explicitly and very efficiently considers the strong spatial dependence and scaling properties of near-surface soil moisture fields [e.g., Rodriguez-Iturbe et al., 1995; Houser et al., 1998; Oldak et al., 2002] to constrain its optimal estimates. Moreover, MKF-based assimilation has the appealing ability to cope with land surface model predictions and remote sensing retrievals available at incongruent spatial scales. Additionally, the EM algorithm allows us to objectively and automatically derive maximum likelihood estimates for the statistical parameters inherent to our assimilation approach, which are determined directly from the data at hand and allowed to evolve in time. This constitutes a considerable asset as these parameters (e.g., the noise variances describing the uncertainties in land surface model predictions and remote sensing retrievals) largely determine the outcome and benefits from assimilation.

[47] We have tested the proposed MKF-based assimilation framework by assimilating near-surface soil moisture retrievals obtained during SGP97 into the VIC-3L land surface model. The results show that assimilation improves the VIC-3L predictions of soil moisture states, particularly with regard to capturing the spatial structure of these fields. Significant impacts on the way VIC-3L partitions latent and sensible heat fluxes are also observed to result from assimilation of near-surface soil moisture into the model. Upon examining the time persistence of improvements resulting from assimilation, we find that these tend to carry on for time spans ranging from 24 hours to several days. This is of practical importance as we expect lapses of 3 days to a week between successive operational remote sensing retrievals of near-surface soil moisture for any given location. Furthermore, our analysis suggests that assimilation may only yield (potentially significant) improvements in the short-term prediction of soil moisture states and energy fluxes. Improving long-term predictions through assimilation may require that we further use the available remote sensing imagery to revise and improve the structure and specification of parameters in land surface models.

[48] Last, the analysis of the optimal statistical parameters describing the uncertainty in VIC-3L predictions of near-surface soil moisture and the examination of residuals have confirmed the need to obtain the statistical parameters inherent to the assimilation algorithm directly from the data at hand and to allow them to change over time so as to capture any fluctuations in the performance of land surface models. Moreover, it has been demonstrated that the proposed MKF-based assimilation algorithm is capable of accurately tracking the standard deviation of residuals, and thus it most likely provides a suitable characterization of the uncertainties in ESTAR retrievals and VIC-3L predictions.

Appendix A: MKF-Based Data Assimilation Algorithm

[49] Consider the 2-D random field x(s, ℓ) consisting of multiscale and single-scale hidden states as described in section 3.2.1. Equation (4) in the main text describes the evolution of the multiscale process x(s, ℓ = 1) from coarser to finer scales. Equation (6) establishes how single-scale hidden states x(s, ℓ > 1) relate to the multiscale process and to one another. Last, equation (8) allows for noisy or uncertain observations y(s, ℓ) and relates them to either the multiscale or single-scale hidden states.

A1. Inference of Optimal States

[50] Let us introduce the following notation for sets of observations in a given assimilation time:

equation image
equation image
equation image

It is important to stress that the logical operator "and" used in equation (A1a) must be evaluated strictly. For instance, consider the set Y(s, ℓ+) with ℓ = 1. This set is nonempty if and only if s ∈ s′ or s is a descendant of a node in s′. We may also write conditional means and covariances of the hidden state as

equation image
equation image

We will further make use of the following cross covariances between hidden states

equation image
equation image

with equation (A2d) being valid for s ∈ s′ and 1 ≤ ℓ ≤ L − 1.

[51] The MKF-based assimilation algorithm proposed in this article yields the best linear or least squares estimates for the hidden soil moisture states conditional on all observations available at each time of assimilation (at any spatial scale and depth along the soil column). Using the notation introduced in equations (A1) and (A2), we may write the optimal soil moisture states and their uncertainty estimates conditional on all observations available at a given assimilation time as equation image(s, ℓ∣Y(s = root, ℓ = 1)) and P(s, ℓ∣Y(s = root, ℓ = 1)), respectively. The assimilation algorithm consists of an upward and a downward sweep. In the upward sweep, we convey the information gathered from observations starting from deeper to shallower soil layers and from the leaves of the multiscale tree up to the root (i.e., from finer to coarser scales). In the downward sweep, we proceed in the reverse order, carrying information from observations starting at the root and moving toward the leaves of the multiscale tree (from coarse to fine scales) and from soil layers closer to the surface down to deeper ones.

A2. Upward Sweep

[52] We start by initializing the means and covariances for all multiscale and single-scale hidden states with their corresponding unconditional values, which we denote as equation image(s, ℓ) and P(s, ℓ), respectively. First, we specify the distribution of the multiscale hidden state at the root node as N(equation image(s = root, ℓ = 1), P(s = root, ℓ = 1)). Then, we propagate the unconditional mean and covariance of the root node down to the leaves of the multiscale tree through

equation image
equation image

Similarly, we propagate unconditional means and covariances to the single-scale hidden variables through

equation image
equation image

for 1 ≤ ℓ ≤ L − 1 and s ∈ s′. Next we initialize conditional single-scale means and covariances as equation image(s, L∣Y(s, L+)) = equation image(s, L) and P(s, L∣Y(s, L+)) = P(s, L) for all s ∈ s′. We can now update these initial estimates with any available observations for all s ∈ s′ starting at ℓ = L and progressing to ℓ = 1 in two steps: an observation update followed by a layer update.
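The initialization of unconditional statistics described above can be illustrated with a minimal scalar sketch. It assumes the usual coarse-to-fine form x(child) = a · x(parent) + w with Var[w] depending only on the child scale m; the function name, argument names, and numerical values are hypothetical and are not taken from the article's equations.

```python
def propagate_prior(mean_root, var_root, a, q_by_scale):
    """Propagate the unconditional mean and variance from the root (scale M)
    down to the finest scale (m = 0) of a scalar multiscale process.

    Assumption (not from the article): x(child) = a * x(parent) + w, where
    Var[w] = q_by_scale[m] for a child located at scale m.
    Returns a dict mapping scale m -> (mean, variance).
    """
    M = len(q_by_scale)  # the root sits at scale M, the leaves at scale 0
    prior = {M: (mean_root, var_root)}
    for m in range(M - 1, -1, -1):
        mean_parent, var_parent = prior[m + 1]
        # Mean is carried by the transition; variance grows by the new detail.
        prior[m] = (a * mean_parent, a * a * var_parent + q_by_scale[m])
    return prior


# Illustrative values only (volumetric soil moisture units assumed).
prior = propagate_prior(mean_root=0.25, var_root=0.01, a=1.0,
                        q_by_scale=[0.004, 0.002, 0.001])
```

Under spatial homogeneity at each scale, all nodes at a given scale share the same unconditional statistics, so storing one pair per scale is sufficient in this simplified setting.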

A2.1. Observation Update (Single-Scale Hidden States)

[53] For ℓ = L,…, 2 and s ∈ s′ such that y(s, ℓ) exists:

equation image
equation image

where I is the identity matrix and

equation image
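
As a hedged illustration of an observation update of this kind, the sketch below applies the textbook Kalman update to a scalar hidden state. It assumes a scalar observation model y = c · x + v with Var[v] = r; the names are hypothetical, and the article's own equations may include additional terms.

```python
def observation_update(x_prior, p_prior, y, c, r):
    """Textbook scalar Kalman observation update.

    Assumption (not from the article): y = c * x + v with Var[v] = r.
    Returns the updated mean, updated variance, and the Kalman gain.
    """
    k = p_prior * c / (c * p_prior * c + r)   # Kalman gain
    x_post = x_prior + k * (y - c * x_prior)  # correct the mean with the innovation
    p_post = (1.0 - k * c) * p_prior          # (I - K C) P in the scalar case
    return x_post, p_post, k


# Illustrative values: equal prior and observation variances give a gain of one half.
x_post, p_post, k = observation_update(x_prior=0.20, p_prior=0.04,
                                       y=0.30, c=1.0, r=0.04)
```

With equal prior and observation variances, the updated mean lies halfway between the prior mean and the observation, and the variance is halved.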
A2.2. Layer Update

[54] For ℓ = L − 1,…, 1 and s ∈ s′:

equation image
equation image

where

equation image
equation image
equation image

If m′ ≠ 0, we initialize conditional means and covariances for multiscale states at the finest scale of the tree (m = 0) as equation image(s, ℓ = 1∣Y(s−, ℓ = 1)) = equation image(s, ℓ = 1) and P(s, ℓ = 1∣Y(s−, ℓ = 1)) = P(s, ℓ = 1). On the other hand, if m′ = 0, we set conditional means and covariances at the finest scale of the tree as equation image(s, ℓ = 1∣Y(s−, ℓ = 1)) = equation image(s, ℓ = 1∣Y(s, ℓ+)) and P(s, ℓ = 1∣Y(s−, ℓ = 1)) = P(s, ℓ = 1∣Y(s, ℓ+)), which are obtained from equations (A6a) and (A6b), respectively. We then proceed to update multiscale hidden states with their corresponding observations starting at the finest scale and progressing to the root node via three recursive steps: an observation update, a scale update, and a fusion step.

A2.3. Observation Update (Multiscale Hidden States)

[55] For all s at scales m = 0,…, M such that y(s, ℓ = 1) exists:

equation image
equation image

where

equation image
A2.4. Scale Update

[56] Let H(s) denote the number of children of multiscale state x(s, ℓ = 1), and introduce the operator αhs to designate its hth child. For m = 0,…, M − 1 and 1 ≤ h ≤ H(s):

equation image
equation image

where

equation image
equation image
A2.5. Fusion of Predictions (for m > 0)

[57] If m ≠ m′

equation image
equation image

On the other hand, if m = m′

equation image
equation image

The upward sweep proceeds until we reach the root node and compute x(s = root, ℓ = 1∣Y(s = root, ℓ = 1)) and P(s = root, ℓ = 1∣Y(s = root, ℓ = 1)).
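
The fusion of predictions can be sketched in information (inverse variance) form for a scalar state. The sketch follows the standard multiscale result of Chou et al. [1994], generalized here to a nonzero prior mean; it is an assumption that the article's fusion equations reduce to this form in the scalar case, and all names are hypothetical.

```python
def fuse_child_predictions(prior_mean, prior_var, child_predictions):
    """Fuse predictions of a parent state obtained from each of its children.

    child_predictions: list of (mean, variance) pairs, one per child subtree.
    Each pair already contains the prior once, so with n children the prior
    is counted n times; the (1 - n) terms remove the surplus copies.
    """
    n = len(child_predictions)
    precision = (1 - n) / prior_var + sum(1.0 / p for _, p in child_predictions)
    info = (1 - n) * prior_mean / prior_var + sum(x / p for x, p in child_predictions)
    fused_var = 1.0 / precision
    return fused_var * info, fused_var


# Two children, each halving the prior variance of the parent state.
x_fused, p_fused = fuse_child_predictions(0.25, 0.04, [(0.30, 0.02), (0.22, 0.02)])

# Sanity check: children carrying no information return the prior unchanged.
x_same, p_same = fuse_child_predictions(0.25, 0.04, [(0.25, 0.04)] * 4)
```

The second call shows why the correction term is needed: without it, four uninformative children would spuriously shrink the parent variance by a factor of four.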

A3. Downward Sweep

[58] For the multiscale variables, we proceed from the root down to the leaves of the tree until we reach the finest scale. For m = M − 1,…, 0:

equation image
equation image

where

equation image

We additionally compute the conditional cross covariances between all multiscale hidden states and their parents, which will be needed for parameter estimation. For s ≠ root

equation image

Once we reach m = m′, we can perform the downward sweep for single-scale hidden states. For 2 ≤ ℓ ≤ L and s ∈ s′:

equation image
equation image

with

equation image

We also compute the conditional cross covariances between adjacent single-scale hidden states (i.e., those with successive ℓ indices), which are needed for parameter estimation:

equation image
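
A single step of the downward sweep can be sketched with the scalar Rauch-Tung-Striebel smoothing recursion, which the multiscale downward sweep is assumed to mirror. The quantities x_parent_pred and p_parent_pred stand for the prediction of the parent made from the child during the upward sweep, and f_up for the fine-to-coarse transition coefficient; these names are hypothetical and the article's equations may be organized differently.

```python
def smooth_child(x_filt, p_filt, x_parent_pred, p_parent_pred,
                 x_parent_smooth, p_parent_smooth, f_up):
    """Scalar smoothing step from a parent node to one of its children.

    x_filt, p_filt: child estimate from the upward sweep.
    x_parent_pred, p_parent_pred: parent predicted from this child (upward sweep).
    x_parent_smooth, p_parent_smooth: parent estimate given all observations.
    Returns the smoothed child mean and variance, the smoothing gain, and the
    conditional cross covariance between the child and its parent.
    """
    j = p_filt * f_up / p_parent_pred                          # smoothing gain
    x_smooth = x_filt + j * (x_parent_smooth - x_parent_pred)  # pull toward parent
    p_smooth = p_filt + j * j * (p_parent_smooth - p_parent_pred)
    cross_cov = j * p_parent_smooth                            # Cov(child, parent | all data)
    return x_smooth, p_smooth, j, cross_cov


# Illustrative values only.
x_s, p_s, j, cross = smooth_child(x_filt=0.20, p_filt=0.02,
                                  x_parent_pred=0.20, p_parent_pred=0.03,
                                  x_parent_smooth=0.26, p_parent_smooth=0.015,
                                  f_up=1.0)
```

If the smoothed parent coincides with its upward-sweep prediction, the child estimate is left unchanged, which is a convenient check on any implementation.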

Appendix B: Parameter Estimation for MKF-Based Assimilation (EM Algorithm)

[59] This appendix presents the equations that compose the expectation maximization (EM) algorithm for maximum likelihood estimation of parameters in the proposed MKF-based data assimilation approach. For an in-depth discussion on this subject the reader may consult Digalakis et al. [1993], Kannan et al. [2000], and references therein. To make parameter estimation possible, one must impose some notion of stationarity or homogeneity of parameters, so that a sample of reasonable size can be defined. For ease of notation and to be congruent with the application in the main text, we present the parameter estimation equations under the two assumptions given in section 3.2.2, which may be summarized as

equation image

for all s at scale m < M

equation image

for all y(s, ℓ) at scale m.

[60] Let us denote the set of all parameters of the MKF-based data assimilation algorithm by

equation image

The EM algorithm iteratively maximizes the true likelihood of the data by updating the model parameters in two steps: (1) the expectation step, or E step, where necessary conditional expectations or so-called sufficient statistics are computed, and (2) the maximization step, or M step, where the parameter values are updated.

B1. E Step

[61] Define the operator 〈·〉 = E[·∣Y(s = root, ℓ = 1)]. In each iteration of the EM algorithm, the E step amounts to the computation of the following conditional expectations or sufficient statistics

equation image
equation image

In addition, we compute

equation image

for s ≠ root, and

equation image

for s ∈ s′ and 2 ≤ ℓ ≤ L. Note that upon running the MKF-based assimilation algorithm we have all necessary values to compute the expectations in (B1a) through (B1d). Hence the E step involves the evaluation of the assimilation algorithm given in Appendix A followed by the computation of sufficient statistics as outlined above. This is done in every iteration of the EM algorithm.

B2. M Step

[62] For all nodes s at scale m, define

equation image
equation image

After the E step is complete, we are ready to update all parameter values with the exception of equation image(s = root, ℓ = 1), which is undetermined in the presence of biases (see discussion in section 3.2.2). To do so, it is necessary to impose some notion of homogeneity of parameter values so that a sample of reasonable size may be defined. In this appendix, we present the parameter update equations under the assumption that the noise processes w(s, ℓ) and v(s, ℓ) have spatially homogeneous covariances at each scale, namely

equation image

for all s at scale m

equation image

for all s at scale m such that Y(m, ℓ) exists. These assumptions are without loss of generality in the expressions for the parameter updates as homogeneity may be imposed on arbitrarily small sets of nodes.

[63] In each iteration of the EM algorithm, we update the parameter values so as to increase the likelihood of the observations at hand through the following expressions.

equation image

For all m and ℓ so that Y(m, ℓ) exists, let nm,ℓ denote the number of observations contained in Y(m, ℓ). Observation noise variances may be updated as

equation image

where the summation in equation (B3) ranges over all s such that y(s, ℓ) exists. For all m < M, let Nm,ℓ designate the number of hidden states of type ℓ at scale m. If the multiscale hidden variables are not required to possess fractal behavior, we update Q(m, ℓ = 1) for m < M as follows

equation image

Conversely, if the fractal behavior appears to hold and it is imposed on the multiscale hidden states, x(s, ℓ = 1), the fractal exponent may also be updated via the EM algorithm. For ease of notation, let us consider the case in which the multiscale hidden state is a scalar (i.e., it has dimension dx(s, ℓ = 1) = 1). In this special case, A = 1 and we may write Q(m, ℓ = 1) = qm, where qm is also a scalar. We additionally assume that we are dealing with a quadtree structure (i.e., H(s) = 4 for all s associated to scales m > 0). The fractal behavior is imposed by requiring that qm = q0 2^(−ηm) for 0 ≤ m < M and η < 0. The update for q0 is obtained from equation (B4). To obtain the update for the fractal exponent, we start by writing out the likelihood for complete data with qm = q0 2^(−ηm) for 0 ≤ m < M. Taking the derivative with respect to η and setting it equal to zero, we obtain a polynomial of degree M − 1 in which the independent variable is z = 2^(−η). We solve for the M − 1 roots of this polynomial. One of these roots, z*, is real and greater than one. The fractal exponent may then be obtained from η = −ln(z*)/ln(2).
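
The last step of this procedure, recovering η from the root z* and building the scale-dependent variances qm = q0 2^(−ηm), can be sketched as follows. The function names and numerical values are hypothetical; finding z* itself (the polynomial root search) is not shown.

```python
import math


def fractal_exponent(z_star):
    """Return eta = -ln(z*) / ln(2) for a real root z* > 1 of the polynomial in z = 2^(-eta)."""
    if z_star <= 1.0:
        raise ValueError("z* must be real and greater than one")
    return -math.log(z_star) / math.log(2.0)


def scale_variances(q0, eta, num_scales):
    """Return [q_0, ..., q_{M-1}] with q_m = q0 * 2^(-eta * m)."""
    return [q0 * 2.0 ** (-eta * m) for m in range(num_scales)]


# Illustrative values: z* = 2 corresponds to eta = -1, so q_m doubles per scale.
eta = fractal_exponent(2.0)
q = scale_variances(q0=0.004, eta=eta, num_scales=3)
```

Because z* > 1 implies η < 0, the variances qm increase from the finest scale (m = 0) toward coarser scales in this parameterization.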

[64] Last, for m = m′ such that s ∈ s′ and 1 < ℓ ≤ L

equation image
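
As an illustration of the M step, the sketch below shows the standard EM update of an observation noise variance for a linear Gaussian model, specialized to scalars. It assumes y = c · x + v at every observed node of a given scale and type; the names are hypothetical, and this is the textbook form rather than a transcription of the article's update equations.

```python
def update_observation_noise_variance(samples, c=1.0):
    """Textbook EM update of a scalar observation noise variance.

    samples: list of (y, x_smooth, p_smooth) triples, one per observed node,
    where x_smooth and p_smooth are the mean and variance of the hidden state
    conditional on all observations (E step output).
    """
    n = len(samples)
    # Expected squared residual: squared mean residual plus remaining state uncertainty.
    total = sum((y - c * x) ** 2 + c * c * p for y, x, p in samples)
    return total / n


# Illustrative values only.
r_new = update_observation_noise_variance([(0.30, 0.25, 0.01),
                                           (0.20, 0.25, 0.01)])
```

The term involving p_smooth is what distinguishes the EM update from a plain residual variance: it accounts for the fact that the hidden state is itself uncertain.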
Notation
m

scale index, equal to 0 at the finest resolution and increasing to M at the coarsest scale; m = m′ at the resolution associated to single-scale hidden states (i.e., the scale at which VIC-3L predictions are made).

s = (m, i, j)

abstract index describing a given scale m and position (i, j) in the multiscale tree, equal to root at scale m = M and s′ at scale m = m′.

ℓ

index for the L types of hidden state variables; ℓ = 1 for multiscale states and 1 < ℓ ≤ L for single-scale states.

x(s, ℓ)

hidden state of type ℓ at node s. x(s, ℓ) has dimension dx(s, ℓ) by 1.

A(s, ℓ = 1)

scale transition matrix for multiscale hidden states (equation (4)).

w(s, ℓ = 1)

noise term describing the new details of x(s, ℓ = 1) that become available as the resolution gets finer (equation (4)).

Q(s, ℓ = 1)

covariance matrix for w(s, ℓ = 1).

η

scaling exponent for Q(s, ℓ = 1) (equation (5b)).

H(s)

number of children of multiscale hidden state x(s, ℓ = 1).

F(s, ℓ) and G(s, ℓ), 1 < ℓ ≤ L

matrices relating single-scale hidden states to each other and to multiscale hidden states (equation (6)).

w(s, ℓ), 1 < ℓ ≤ L

noise terms describing the uncertainties in the expressions relating single-scale hidden states to one another and to multiscale hidden states (equation (6)).

Q(s, ℓ), 1 < ℓ ≤ L

covariance matrices for w(s, ℓ), 1 < ℓ ≤ L.

y(s, ℓ)

observation of type ℓ at node s. y(s, ℓ) has dimension dy(s, ℓ) by 1.

C(s, ℓ) and D(s, ℓ)

matrices relating observations to hidden states (equation (8)).

v(s, ℓ)

observation noise term describing the uncertainty of y(s, ℓ) (equation (8)).

R(s, ℓ)

covariance matrix for observation noise v(s, ℓ).

yVIC(s, ℓ = 1), yESTAR(s, ℓ = 1)

VIC-3L predictions and ESTAR retrievals of near-surface soil moisture respectively.

RVIC(s, ℓ = 1), RESTAR(s, ℓ = 1)

covariance matrices for observation noises associated to yVIC(s, ℓ = 1) and yESTAR(s, ℓ = 1) respectively.

K(s, ℓ)

Kalman gain matrix for observation update of x(s, ℓ).

Y(s, ℓ)

set of all observations of types κ ≥ ℓ at node s and all of its descendants.

equation image(s, ℓ∣Y)

mean of x(s, ℓ) conditional on all observations in the set Y.

P(s, ℓ∣Y)

covariance of x(s, ℓ) conditional on all observations in the set Y.

P({s, γs}, ℓ∣Y)

conditional cross covariance between x(s, ℓ) and its parent, x(γs, ℓ).

P(s, {ℓ, ℓ + 1}∣Y)

conditional cross covariance between x(s, ℓ) and x(s, ℓ + 1).

equation image(s, ℓ∣Y(s = root, ℓ = 1))

expected value of x(s, ℓ) conditional on observations of all types (1 ≤ ℓ ≤ L) and at all spatial scales (0 ≤ m ≤ M).

P(s, ℓ∣Y(s = root, ℓ = 1))

covariance of x(s, ℓ) conditional on observations of all types (1 ≤ ℓ ≤ L) and at all spatial scales (0 ≤ m ≤ M).

Acknowledgments

[65] This work is partially supported by NASA under Grant NAG5-10673 to the University of California, Berkeley, and by the Hellman Junior Faculty award, University of California, Berkeley.
