A suboptimal data assimilation algorithm based on the ensemble Kalman filter


  • Ekaterina Klimova (corresponding author)

    Institute of Computational Technologies SB RAS, Ac. Lavrentjev Ave., 6, Novosibirsk, 630090, Russia


A suboptimal algorithm for data assimilation based on the ensemble Kalman filter (EnKF) is proposed. An advantage of the algorithm is that it does not require an additional calculation of the ensemble of perturbations that correspond to the analysis-error covariance matrix because it is calculated automatically with this algorithm. The operation count of the algorithm is close to that of the local ensemble transform Kalman filter (LETKF), but its formulae are different from those of the LETKF. Copyright © 2012 Royal Meteorological Society

1. Introduction

The Kalman filter algorithm is currently one of the most popular approaches to solving the problem of data assimilation (Ghil and Malanotte-Rizzolli, 1991). In this method, an equation for the conditional mean is solved to obtain an optimal estimate of the current atmospheric state from observational data and a forecast model, which is in general nonlinear (Jazwinsky, 1970). The equation is difficult to solve, but in simplified versions it can be reduced to equations for the first and second moments. The simplifications are based either on linearization about some basic state (the extended Kalman filter) or on expansion into a power series in terms of the estimation error (second-order truncated filters). The random fields considered are assumed to be Gaussian (Jazwinsky, 1970).

The ensemble approach is today a leading tool in applying the Kalman filter to data assimilation. It was first proposed by Evensen (1994) and later developed by Burgers et al. (1998), Evensen (2003, 2007) and Houtekamer and Mitchell (1998, 2001, 2005). This approach makes it possible to calculate the covariance matrices of estimation errors for nonlinear forecast models. In this case a version of the extended Kalman filter (EKF) is used in which the forecast error covariances are estimated with an ensemble of forecasts.

Implementation of the ensemble algorithm raises many problems, owing to the limited number of ensemble members and the need to obtain an ensemble whose covariance matrix corresponds to the analysis error covariances. To overcome these problems, EnKF algorithms that generate random observation errors (perturbed observations) can be used (Burgers et al., 1998). Such versions of the EnKF were considered by Houtekamer and Mitchell (1998, 2005) and Burgers et al. (1998), but Whitaker and Hamill (2002) showed that the sampling error can be large. To avoid this effect the authors proposed an ensemble square root filter that allows implementation of the EnKF without perturbed observations. Such an approach to determining the analysis error ensemble is commonly referred to as 'deterministic'. Tippett et al. (2003) give an overview of ensemble filters using the deterministic approach. It has been shown that the various square root Kalman filters produce ensembles that differ from one another, but all of them yield the same covariance matrix. One more version of the deterministic approach, a simplified ensemble square root filter, was described by Sakov et al. (2008).

To avoid the problems that occur because of the limited number of ensemble members, the EnKF may be applied locally. A method of localization that limits the correlation radius was proposed by Houtekamer and Mitchell (1998). These ideas were developed further by Houtekamer and Mitchell (2005) and Hunt et al. (2007), who proposed applying the algorithm in subdomains.

Another significant problem is to avoid a spurious decrease in the forecast error variance estimated from the ensemble, which results in the so-called divergence of the algorithm in time (Lorenc, 2003). To solve this problem, non-zero model noises have to be specified, but their exact values are unknown.

The EnKF is a technically sophisticated algorithm operating with high-order matrices. In this article, an efficient data assimilation algorithm for nonlinear models based on the EnKF is proposed. The basic idea is taken from automatic control theory (Krasovskii et al., 1979). A simple algorithm proposed by Krasovskii et al. (1979), called the π-algorithm, was considered by Klimova (2008a) for its possible application to data assimilation. Klimova (2008b) generalized the π-algorithm so that it could be used with an ensemble of forecasts (the ensemble π-algorithm). In the present article the formulae of the ensemble π-algorithm are derived from assumptions that are more general than those in Klimova (2008b). The operation count of the algorithm is close to that of the local ensemble transform Kalman filter (LETKF; Hunt et al., 2007; Szunyogh et al., 2008), but its formulae differ from those of the LETKF. In particular, the ensemble π-algorithm does not require calculating an ensemble that corresponds to the analysis error covariances, because this is done automatically. As in the LETKF, all operations in the ensemble π-algorithm are performed with matrices of the same order as the ensemble size.

The ensemble π-algorithm formulae for nonlinear model and observation operators are derived in section 2. A modification of the ensemble π-algorithm for use in ensemble forecasting is given in section 3. Section 4 provides a comparative analysis of the π-algorithm and the LETKF. Section 5 presents the results of numerical experiments on model data assimilation with the one-dimensional Burgers' equation, and section 6 presents the main conclusions of the article.

2. Ensemble π-algorithm

To derive the ensemble π-algorithm formulae for a nonlinear model, consider a discrete Kalman filter (Jazwinsky, 1970). Let the 'true' value x^t satisfy the equation

x^t(t_{k+1}) = M(x^t(t_k)) + η(t_k), (1)

where x^t(t_{k+1}) is the vector of 'true' values at time t_{k+1}, M is the model operator and η(t_k) is Gaussian white noise with covariance matrix Q_k. The forecast step can be expressed as:

x^f(t_{k+1}) = M(x^a(t_k)), (2)

where x^f(t_{k+1}) is the forecast vector at time t_{k+1} and x^a(t_k) is the analysis vector at time t_k. The analysis step is

x^a(t_k) = x^f(t_k) + P^a_k H^T R_k^{-1} [y^o_k − H(x^f(t_k))], (3)

where P^a_k is the analysis error covariance matrix, R_k is the observation error covariance matrix, H is an operator (in general, nonlinear) mapping grid-point values to observation-point values, with H also denoting its linearization, and y^o_k is the observation vector at time t_k (Jazwinsky, 1970). Formula (3) was derived by Jazwinsky (1970, p. 277) for the EKF. A similar formula for the analysis step can be derived by neglecting the terms with second derivatives in the formulae of the second-order filter for a discrete–continuous filtration problem (Jazwinsky, 1970, pp. 345–346). The EKF is an approximation of the classic Kalman filter for the nonlinear case, and under strong nonlinearity it does not produce an optimal estimate (Houtekamer and Mitchell, 2001).

The algorithm can be written in the equivalent form:

x(t_{k+1}) = M(x(t_k)) + P_{k+1} H^T R_{k+1}^{-1} [y^o_{k+1} − H(M(x(t_k)))], (4)

where P_{k+1} is the estimation error covariance matrix at time t_{k+1}. This formula unites the steps of analysis and forecast, so that the indexes 'a' and 'f' can be omitted in what follows.

In the classic Kalman filter the estimation error covariance matrix is calculated by the following formula:

P_k = E[(x^t(t_k) − x̂(t_k))(x^t(t_k) − x̂(t_k))^T],

where x^t is the true value of the estimated parameter and x̂ is the estimate obtained using the Kalman filter (Jazwinsky, 1970). The true value is unknown, but an ensemble of errors can be modelled with the available prior information about the errors.

The observation data can be expressed as:

y^o_k = H(x^t(t_k)) + ε_k,

where ε_k is a random observation error with zero mean and covariance matrix R_k. Define the estimation error as dx_{k+1} = x^t(t_{k+1}) − x(t_{k+1}). From Eqs (1) and (4) the error satisfies the following equation:

dx_{k+1} = M(x^t(t_k)) − M(x(t_k)) + η(t_k) − P_{k+1} H^T R_{k+1}^{-1} [y^o_{k+1} − H(M(x(t_k)))].

The estimate obtained with the classic Kalman filter is the conditional mean (Jazwinsky, 1970). As the conditional mean is estimated using an ensemble of realizations of the random field x^t, an ensemble of estimation errors can be considered. Let M and H be the linearized operators used to calculate the error. Then the ensemble of estimation errors can be described with the following equation:

dx^n_{k+1} = (I − P_{k+1} H^T R_{k+1}^{-1} H)(M dx^n_k + η^n(t_k)) − P_{k+1} H^T R_{k+1}^{-1} ε^n_{k+1}, (5)

where n = 1, …, N, {η^n(t_k)} is an ensemble of the model noises and {ε^n_{k+1}} is an ensemble of the observation errors. As above, the error is understood as a deviation from the 'true' value.

Now consider an ensemble of initial values of the errors

dx^n(t_0) = ζ^n_0, n = 1, …, N,

where n is the number of a vector in the ensemble and {ζ^n_0} are random perturbation fields with covariance P_0. Estimating P_{k+1} with the formula (Yaglom, 1987)

P_{k+1} = (1/N) Σ_{n=1}^{N} dx^n_{k+1} (dx^n_{k+1})^T,

a version of the ensemble Kalman filter has been derived. Taking this formula for P_{k+1}, one obtains the system of equations (5) for {dx^n_{k+1}}.
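The ensemble estimate of P_{k+1} above can be sketched in a few lines of NumPy; the state dimension L and ensemble size N below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 40, 100                 # state dimension and ensemble size (illustrative)

# Ensemble of error vectors dx^n, stored as the columns of an (L x N) matrix D.
D = rng.normal(size=(L, N))

# Ensemble estimate of the covariance: P ~ (1/N) sum_n dx^n (dx^n)^T = D D^T / N
# for zero-mean error samples.
P = D @ D.T / N

assert P.shape == (L, L)
assert np.allclose(P, P.T)     # a covariance estimate is symmetric
```

With a limited N the off-diagonal entries of P are noisy, which is exactly the sampling problem that motivates the localization discussed later in the article.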

Let D_{k+1} be an (L × N) matrix whose columns are the vectors {dx^n_{k+1}}, where L is the dimension of the vectors, i.e. the number of variables to be forecast.

Formula (5) can be seen as a matrix equation:

D_{k+1} = F − D_{k+1} Π, (6)

where Π is an (N × N) matrix with elements defined as:

Π_{mn} = (1/N) (H dx^m_{k+1})^T R_{k+1}^{-1} (H f^n + ε^n_{k+1}), (7)

where m, n are the numbers of vectors in the ensemble and f^n = M dx^n(t_k) + η^n(t_k). It is obvious that each element Π_{mn} is a number. Let F be the matrix with columns {f^n}, and let F̃ be the matrix with columns {f̃^n}, where f̃^n = H f^n + ε^n_{k+1}.

To simplify the notation, the index 'k' will be omitted in what follows. Formula (7) is equivalent to the matrix equality

Π = (1/N) (H D)^T R^{-1} F̃. (8)

From formula (6) the following relation is derived:

D = F (I + Π)^{-1}, (9)

where I is the unit matrix. Substitute (9) into (8), multiply the left and right parts by (I + Π)^T, and obtain the following equation with respect to matrix Π:

Π + Π^T Π = (1/N) (H F)^T R^{-1} F̃. (10)

Let C = (1/N)(HF)^T R^{-1} F̃. For symmetric Π, from Eq. (10) it follows that

(Π + 0.5 I)^2 = C + 0.25 I. (11)

To calculate the square root, the matrix (C + 0.25I) must be positive-definite. In this case

Π = (C + 0.25 I)^{1/2} − 0.5 I. (12)

Obviously Π ≥ 0. Since operator H is linear, F̃ = (HF + E), where E is the matrix whose columns are the vectors {ε^n}. Then matrix C is

C = C_1 + C_2, C_1 = (1/N)(HF)^T R^{-1} HF, C_2 = (1/N)(HF)^T R^{-1} E. (13)

In the case where C_2 = 0, C is symmetric and positive-definite, hence the square root of the matrix (C + 0.25I) can be calculated by finding its eigenvectors and eigenvalues (Bellman, 1960).
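The square root in Eq. (12) via an eigendecomposition can be sketched as follows; the small random symmetric matrix is an illustrative stand-in for C_1:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
A = rng.normal(size=(N, N))
C = A @ A.T / N                      # symmetric positive semi-definite stand-in for C1

# Eigendecomposition C = Psi diag(mu) Psi^T; then
# (C + 0.25 I)^{1/2} = Psi diag(sqrt(mu + 0.25)) Psi^T, and Pi = sqrt - 0.5 I.
mu, Psi = np.linalg.eigh(C)
sqrt_C = Psi @ np.diag(np.sqrt(mu + 0.25)) @ Psi.T
Pi = sqrt_C - 0.5 * np.eye(N)

# Check: Pi solves Pi^2 + Pi = C, i.e. Eq. (10) with symmetric Pi.
assert np.allclose(Pi @ Pi + Pi, C)
```

Because the eigenvalues of C are non-negative, the shifted eigenvalues mu + 0.25 are strictly positive and the square root is well defined.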

Consider the more general case, when the elements of E are not equal to zero. The following nonlinear equation for matrix Π^T is to be solved:

(Π^T + 0.5 I)^2 = C^T + 0.25 I. (14)

An algorithm to solve Eq. (14) when the norm of matrix C_2 is one order of magnitude less than the norm of matrix C_1 is given in Appendix A.

In the above assimilation algorithm, N equations (Eq. (5)) for the estimation errors have to be solved by applying a linearized model. The change of the error in time can be estimated using an ensemble of forecasts, taking, for instance, (M(x^n(t_k)) − M(x^t(t_k))) approximately equal to M(x^n(t_k) − x^t(t_k)), with M the linearized model operator.

3. A version of the π-algorithm based on ensemble forecasting

What follows is a modification of the algorithm described above that can be applied to ensemble forecasting. It is well known that such a forecast requires setting an ensemble of initial fields {x^n} in such a way that the ensemble average x̄ equals x^a and the ensemble covariance equals P^a, where x^a is the result of the analysis step of the Kalman filter and P^a is the analysis error covariance matrix. The following ensemble of initial fields satisfies the first condition:

x^n(t_0) = x^a(t_0) + ζ^n_0, n = 1, …, N.

The ensemble mean will be the estimate obtained with the Kalman filter, while the deviation of an ensemble member from the ensemble mean is considered as the estimation error. To describe the errors with the formulae of the classic Kalman filter, a perturbed-observation ensemble is specified:

y^{o,n}_k = y^o_k + ε^n_k, n = 1, …, N.

In this case the best estimate is the ensemble mean. Let dx^n_{k+1} = x̄(t_{k+1}) − x^n(t_{k+1}) be the estimation error. Then dx^n_{k+1} satisfies an equation of the same form as Eq. (5).

To solve this system of equations, calculations similar to those above are made. This version of the ensemble π-algorithm is expressed as:

X_{k+1} = F_2 + D_{k+1} Π̃, (15)

where X_{k+1} is the (L × N) matrix with columns {x^n(t_{k+1})} and Π̃ is an (N × N) matrix with elements

Π̃_{mn} = (1/N) (H dx^m_{k+1})^T R_{k+1}^{-1} [y^o_{k+1} + ε^n_{k+1} − H(M(x^n(t_k)) + η^n(t_k))], (16)

and F_2 is an (L × N) matrix with columns {M(x^n(t_k)) + η^n(t_k), n = 1, …, N}. The matrix D^T is calculated by formulae similar to Eqs (6)–(9). The only difference is that F here is the matrix with columns {f^n}:

f^n = M(x^n(t_k)) + η^n(t_k) − (1/N) Σ_{m=1}^{N} [M(x^m(t_k)) + η^m(t_k)], (17)

and F̃ is the matrix with columns {f̃^n}:

f̃^n = H f^n + ε^n_{k+1}. (18)

The best estimate of the desired field x at time t_{k+1} is the ensemble mean:

x̄(t_{k+1}) = (1/N) Σ_{n=1}^{N} x^n(t_{k+1}). (19)

Thus, the ensemble π-algorithm consists of the following steps:

  • (1) specify the (L × N) matrices F and F̃ (Eqs (17) and (18));
  • (2) calculate the (N × N) matrix C (Eq. (13));
  • (3) calculate the (N × N) matrix Π (Eq. (14));
  • (4) calculate the (L × N) matrix D (Eq. (9));
  • (5) calculate the ensemble {x^n(t_{k+1})} (Eqs (15) and (16)) and the ensemble mean (Eq. (19)).
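The five steps above can be sketched for a single assimilation cycle. This is a toy illustration under simplifying assumptions (H given as a linear observation matrix, diagonal R, zero model noise η and zero observation perturbations ε, so that C_2 = 0 and the symmetric square root of Eq. (12) applies directly); the sizes are illustrative, and this is not the full configuration of the experiments in section 5:

```python
import numpy as np

rng = np.random.default_rng(2)
Lx, N, J = 20, 10, 8                 # state dim, ensemble size, obs count (illustrative)

ens_f = rng.normal(size=(Lx, N))     # columns: forecasts M(x^n(t_k)), no model noise
H = np.zeros((J, Lx))
H[np.arange(J), np.arange(J)] = 1.0  # observe the first J grid points
R = 0.25 * np.eye(J)                 # diagonal observation error covariance
y = rng.normal(size=J)               # observation vector y^o
Rinv = np.linalg.inv(R)

# Step 1: F = forecast deviations from the ensemble mean (Eq. (17) with eta = 0);
# with eps = 0 the matrix F-tilde reduces to H F.
F = ens_f - ens_f.mean(axis=1, keepdims=True)
HF = H @ F

# Step 2: C = (1/N) (HF)^T R^{-1} HF (Eq. (13) with C2 = 0).
C = HF.T @ Rinv @ HF / N

# Step 3: Pi = (C + 0.25 I)^{1/2} - 0.5 I (Eq. (12)), via eigendecomposition.
mu, Psi = np.linalg.eigh(C)
Pi = Psi @ np.diag(np.sqrt(mu + 0.25)) @ Psi.T - 0.5 * np.eye(N)

# Step 4: analysis-error ensemble D = F (I + Pi)^{-1} (Eq. (9)).
D = F @ np.linalg.inv(np.eye(N) + Pi)

# Step 5: analysis ensemble X = F2 + D Pi-tilde (Eqs (15)-(16), with eps = 0), where
# Pi-tilde_mn = (1/N) (H dx^m)^T R^{-1} (y - H f2^n).
innov = y[:, None] - H @ ens_f       # J x N innovation matrix
Pi_t = (H @ D).T @ Rinv @ innov / N  # N x N
ens_a = ens_f + D @ Pi_t

assert ens_a.shape == (Lx, N)
```

Note that, since Π is symmetric positive semi-definite here, (I + Π)^{-1} is a contraction, so the analysis deviations D have no larger total variance than the forecast deviations F, as expected from an analysis step.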

The ensemble π-algorithm, like all ensemble algorithms, has a number of disadvantages caused by the limited number of ensemble members. In particular, spurious non-zero forecast error covariances may occur at long distances, and the number of leading eigenvectors of the model operator may exceed the number of ensemble members. This is why some authors suggest applying the ensemble algorithm locally (Hunt et al., 2007; Szunyogh et al., 2008). As follows from formula (7) for the elements of matrix Π, this matrix is the same for all grid points where the analysis is carried out. This feature significantly simplifies applying the algorithm to each grid point or to a group of points.

The algorithm in question can also be performed independently for a group of grid points belonging to a particular subdomain. The observation subvector y^o and the matrices H and R are then specific to each subdomain. As follows from formulae (12), (14), (15) and (16), the matrices Π and D can be estimated for all grid points that share the same observation subvector y^o.

To reduce the spurious covariances at long distances, so-called localization is used. There are two main types of localization (Greybush et al., 2011). In the first, the forecast error covariance matrix elements are multiplied by a function of the distance between grid points, chosen so that the corresponding correlation tends to zero at long distances. In the second, the observation error covariance matrix is multiplied by a distance-dependent function, so that observations located far from the current grid point are treated as having infinite error. Formally, such localization can be applied to the ensemble π-algorithm. It must be stressed, however, that in both cases localization violates the classic Kalman filter formulae for the analysis-error and forecast-error covariance matrices.

The simplest way of localizing the ensemble π-algorithm is shown by a numerical example in section 5. Applying the algorithm under more realistic conditions will require additional research into localization methods.
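The first type of localization mentioned above (tapering the covariances by a distance-dependent function) can be illustrated with a Schur (element-wise) product; the Gaussian taper and length-scale below are illustrative choices, not the ones used in the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
Lx, N = 30, 5                        # small ensemble -> noisy long-range covariances

D = rng.normal(size=(Lx, N))
P = D @ D.T / N                      # raw ensemble covariance estimate

# Distance-dependent taper rho(i, j) = exp(-(d_ij / c)^2) on a periodic grid;
# the Schur product rho o P damps spurious covariances at long distances
# while leaving the diagonal (the variances) unchanged.
i = np.arange(Lx)
d = np.abs(i[:, None] - i[None, :])
d = np.minimum(d, Lx - d)            # periodic grid-point distance
c = 5.0                              # localization length-scale (illustrative)
rho = np.exp(-(d / c) ** 2)
P_loc = rho * P

assert np.allclose(np.diag(P_loc), np.diag(P))
```

Because rho has ones on the diagonal and values below one elsewhere, the tapered matrix keeps the ensemble variances but shrinks exactly the long-range covariances that a small ensemble estimates poorly.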

4. Comparative analysis of π-algorithm and LETKF

Consider the formulae of the ensemble π-algorithm as given in section 3 of this article. As usual in the Kalman filter algorithm, mark the analysis and forecast steps with the indexes 'a' and 'f', respectively. To simplify the presentation, take operator H as linear. Define Dx^a as the matrix with columns {dx^{a,n}}, Dx^f as the matrix with columns {dx^{f,n}}, and E as the matrix with columns {ε^n}.

The steps of analysis can be written as:

x^a = x^f + K (y^o − H x^f), K = P^a H^T R^{-1}, (20)

Dx^a = (I − KH) Dx^f + K E. (21)

For simplicity, the dependence on time t_k is omitted, because a single moment of time is considered.

In the LETKF the analysis step is performed only for the ensemble mean (Hunt et al., 2007):

x̄^a = x̄^f + Dx^f w̄^a,

w̄^a = P̃^a (H Dx^f)^T R^{-1} (y^o − H x̄^f).

Dx^a is calculated from Dx^a = Dx^f W^a, where W^a = [(N − 1) P̃^a]^{1/2}. Matrix P̃^a is (Hunt et al., 2007)

P̃^a = [(N − 1) I + (H Dx^f)^T R^{-1} H Dx^f]^{-1}.

The above formulae show that the LETKF and the ensemble π-algorithm generate the analysis ensemble in different ways. In the LETKF, Dx^a = Dx^f W^a, with W^a = [(N − 1) P̃^a]^{1/2}. For the ensemble π-algorithm:

Dx^a = F (I + Π)^{-1}.

The ensemble element for the analysis step in the LETKF can be expressed as:

x^{a,n} = x̄^f + Dx^f w^{a,n}, (22)

where w^{a,n} = w̄^a + W^{a,n} and W^{a,n} is the nth column of W^a; that is,

x^{a,n} = x̄^a + Dx^f W^{a,n}.

It should be emphasized that the matrices K = P^a H^T R^{-1} in Eqs (20) and (22) are different. This means that Eq. (22) for x^{a,n} in the LETKF differs from Eq. (20) by the term K ε^n on the right-hand side. As the ensemble mean of ε^n tends to zero for the ensemble algorithms, and the order of convergence of the estimation is O(1/√N), the analysis steps in the two schemes differ by a term of that order.

Thus, in the ensemble π-algorithm an ensemble of fields is specified so that its mean is the best estimate according to the Kalman filter, while the deviation from the mean models the estimation error described by Eq. (21). This is an important property of the π-algorithm, because the error Eq. (21) is of the same form as the error Eq. (5). Originally, Eq. (5) was obtained under the assumption that the estimation error is a deviation from the 'true' value.

It must be stressed that, when an analysis ensemble is set with formula (20) in the π-algorithm, the classic relation between P^a and P^f,

P^a = (I − KH) P^f,

cannot be reproduced with sufficient accuracy (see Appendix B). This is typical of all ensemble algorithms that use perturbed data (Evensen, 2007, pp. 41–43; Whitaker and Hamill, 2002).

Comparing the above algorithm with the LETKF, the following conclusions can be drawn. The basic arithmetical operations in both algorithms are performed with matrices of order equal to the number of ensemble members. Steps 1–4 require the same number of operations as steps 1–6 of the LETKF (Hunt et al., 2007). Step 5 differs from step 8 of the LETKF by the calculation of matrix Π̃ (Eq. (16)). Here the number of computer operations is proportional to J × N², where J is the dimension of the observation vector.

5. Numerical experiments

To assess the above algorithm for practical applications, numerical experiments were carried out with the one-dimensional nonlinear Burgers' equation:

∂u/∂t + (u / (a cos θ_0)) ∂u/∂λ = α ∂²u/∂λ²,

where a is the Earth's radius. The equation was considered on the latitude circle θ_0 = 45° with periodic boundary conditions. It was solved by a leapfrog/DuFort–Frankel finite-difference scheme (an example is described by Kalnay, 2002) on a grid with Δλ = 2.5°, time step Δt = 1 h and initial conditions in the form of a large-scale wave u(λ) = U_0 sin λ, U_0 = 10 m s⁻¹. The parameter α = 0.001.
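A leapfrog (advection) / DuFort–Frankel (diffusion) discretization of a periodic Burgers' equation can be sketched as follows. The grid, coefficients and crude first step here are illustrative assumptions in Cartesian form (u_t + u u_x = α u_xx), not the exact configuration of the experiments:

```python
import numpy as np

nx, dx, dt, alpha = 144, 0.05, 0.001, 0.001   # illustrative grid and coefficients
x = np.arange(nx) * dx
u_prev = 10.0 * np.sin(2 * np.pi * x / (nx * dx))   # large-scale initial wave

def step(u_prev, u_now):
    """One leapfrog/DuFort-Frankel step on a periodic grid."""
    up = np.roll(u_now, -1)          # u_{j+1}^n
    um = np.roll(u_now, +1)          # u_{j-1}^n
    adv = u_now * (up - um) / (2 * dx)
    r = 2 * alpha * dt / dx**2
    # (u^{n+1} - u^{n-1}) / (2 dt) =
    #   -adv + alpha * (u_{j+1}^n - u_j^{n+1} - u_j^{n-1} + u_{j-1}^n) / dx^2
    return ((1 - r) * u_prev - 2 * dt * adv + r * (up + um)) / (1 + r)

u_now = u_prev.copy()                # crude start for the three-level scheme
for _ in range(50):
    u_prev, u_now = u_now, step(u_prev, u_now)

assert np.isfinite(u_now).all()
```

Treating the diffusion term with DuFort–Frankel (replacing u_j^n by the average of u_j^{n+1} and u_j^{n-1}) keeps the three-level scheme explicit while avoiding the unconditional instability of pure leapfrog diffusion.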

A true value was simulated by the same model with initial conditions

x^t(λ, 0) = U_0 sin λ + ζ_0,

where ζ_0 = σ_f N(0,1), σ_f = 1.5, and N(0,1) is a random variable distributed according to a normal law with mean equal to 0 and variance equal to 1. The observation data were simulated by the same model (twin experiments). The experiments were performed with observation data obtained by superposition of a random observation error ε = σ_0 N(0,1) on the true value x^t(λ, t).

The model was integrated for 48 h. The observation data were available every 6 h within a band of 36 grid points, from number i_1 to number i_2:

y^o_i = x^t(λ_i, t_ntime) + ε_i, i = i_1, …, i_2,

where ntime is the time-step number (6, 12, 18 and so on).

A series of numerical experiments on model data assimilation was performed using the above ensemble-based algorithms. With the ensemble π-algorithm (section 3), as well as with the LETKF algorithm, the following ensemble of N initial fields was used:

x^n(t_0) = x_0 + ζ^n_0, n = 1, …, N,

where x_0 was regarded as a preliminary estimate of x, and {ζ^n_0} are random perturbations with ζ^n_0 = σ_f N(0,1). In all experiments σ_0 = 0.5. With the N initial fields, N forecasts with assimilation were calculated. The 'best' estimate at each time step was the ensemble mean x̄, and the estimation error was the root-mean-square deviation of x̄ from x^t (the true value). In addition, the trace (tr) of the estimation error covariance matrix was calculated.

The numerical experiments were carried out with different values of the ensemble size N. The forecast error covariance matrix was initially set as P_0 = (σ_f)² I, and the observation error covariance matrix was R = (σ_0)² I, where I is the unit matrix; Q = 0 (no noise in the model).

To calculate the square root of a matrix, a procedure from netlib was used (http://www.netlib.org/eispack/). The procedure calculates the eigenvalues and eigenvectors of a symmetric positive-definite matrix. Let the eigenvectors of matrix C_1 be the columns of Ψ and the eigenvalues be {μ_i, i = 1, …, N}; then (Bellman, 1960)

(C_1 + 0.25 I)^{1/2} = Ψ diag{(μ_i + 0.25)^{1/2}} Ψ^T.

Equation (14) was solved by the approximate algorithm described in Appendix A. After the approximate estimation of (C + 0.25I)^{1/2}, the ratios tr(C_2)/tr(C_1) and tr(ΔZ)/tr(Z_0) were compared. These values were compared for different σ_0 ∈ {0.1; 0.5; 1} and different N ∈ {25, 50, 100, 200}. In all cases the results were of order 10⁻²–10⁻³. For the inversion of Z_0 in calculating ΔZ (Eq. (A2)), the eigenvalues and eigenvectors of matrix (C_1 + 0.25I) were used. To calculate D^T with formula (9), a simple iterative method (Marchuk, 1987) was used for the matrix inversion. As the first approximation for the iterative algorithm, the value of D^T obtained under the condition C_2 = 0 was used.

The approximate description of the covariances with a limited number of ensemble members results in so-called 'false covariances' of errors at grid points separated by long distances. This is a common disadvantage of all ensemble methods (Lorenc, 2003). In the numerical experiments, corrections at the analysis steps were therefore performed only in the area where observations were available (grid points i_1 to i_2). To perform the analysis at each grid point, all the observations from the selected area were used.

The numerical experiments on data assimilation were carried out using both the ensemble π-algorithm and LETKF (Hunt et al., 2007) for different values of the ensemble size. It should be stressed that the root-mean-square (RMS) error and the covariance matrix trace were almost the same for N ≥ 50. In Figure 1 the forecast RMS errors obtained using the ensemble π-algorithm (pi) and LETKF (letkf) are shown. Figure 1(a) is for N = 10 and Figure 1(b) is for N = 100. In Figure 2 the estimation error covariance matrix trace obtained using the ensemble π-algorithm (pi) and LETKF (letkf) for N = 100 is shown.

Figure 1.

(a) The root-mean-square (RMS) error (m s⁻¹) obtained in numerical experiments with N = 10; 'pi' denotes the ensemble π-algorithm and 'letkf' the LETKF. (b) As (a), but for N = 100.

Figure 2.

The trace of the estimation error covariance matrix (m² s⁻²) for N = 100; 'pi' denotes the ensemble π-algorithm and 'letkf' the LETKF.

Figure 1 shows that the RMS errors of the ensemble π-algorithm are smaller than those of the LETKF when N is significantly smaller than the dimension of the vector to be estimated (as is the case in practice). For N comparable with this dimension, the RMS errors of both algorithms are almost identical, although the π-algorithm retains a small advantage over the LETKF. From Figure 2 it can be concluded that the covariance matrix trace in the π-algorithm decreases more slowly than in the LETKF. Since the RMS errors remain almost identical, the π-algorithm appears to diverge more slowly in time.

Thus, the numerical experiments have shown that the ensemble π-algorithm can be used for practical applications. Additional investigation is required before the algorithm can be applied to real assimilation problems.

6. Conclusion

In this article an ensemble-based algorithm for data assimilation was proposed and compared with the LETKF. An advantage of the algorithm is that it does not require an additional calculation of an ensemble of perturbations that matches the analysis error covariance matrix, because the algorithm does this automatically.

To summarize:

  • (1) A suboptimal data assimilation algorithm based on an ensemble approach was proposed. The peculiarity of the algorithm is the introduction of an equation for the estimation error, whose solution is used to evaluate the covariances.
  • (2) The basic arithmetic operations are carried out with matrices of the same order as the ensemble size. The operation count of the algorithm is close to that of the LETKF.
  • (3) The algorithm can be implemented locally, for individual grid points as well as for groups of points.


Acknowledgements

I would like to thank the anonymous reviewers for their valuable remarks, which improved the content of the paper and helped the author partly reconsider their point of view on the results obtained. This work was supported by Project IV.31.2.1 under the Fundamental Research Program of the Siberian Branch of the Russian Academy of Sciences (2010–2012), by Project 4 under the Interdisciplinary Integration Research Program of the Siberian Branch of the Russian Academy of Sciences (2009–2011), and by Grant 09-07-00103-a of the Russian Foundation for Basic Research.

Appendix A: Solution for the System of Nonlinear Equations

Consider element (n, m) of matrix C_2:

(C_2)_{nm} = (1/N) (H f^n)^T R^{-1} ε^m.

Then the trace of matrix C_2 can be calculated by:

tr(C_2) = (1/N) Σ_{n=1}^{N} (H f^n)^T R^{-1} ε^n = Σ_j (1/r_j) [(1/N) Σ_{n=1}^{N} (H f^n)_j ε^n_j], (A1)

where (H f^n)_j is the interpolation of f^n to observation point j and r_j is a diagonal element of matrix R_{k+1} (matrix R_{k+1} is assumed to be diagonal). The inner sum in the bottom line of Eq. (A1) is approximately equal to the covariance between the observation error and the forecast error at the observation points. As is well known, one of the conditions of the Kalman filter is that these covariances are zero. Taking this into account, assume that the norm of matrix C_2 is one order of magnitude less than that of matrix C_1.

Equation (14) can be solved approximately in the following way. Let Z = (Π^T + 0.5I) and Z = Z_0 + ΔZ, with Z_0 = (C_1 + 0.25I)^{1/2}. Assuming that the matrix ΔZ² can be neglected and that the matrices Z_0 and ΔZ commute, the following equation for ΔZ is obtained:

2 Z_0 ΔZ = C_2^T,

and, therefore,

ΔZ = 0.5 Z_0^{-1} C_2^T. (A2)

Thus, the equation is solved in two steps: calculation of Z_0, the square root of the symmetric positive-definite matrix (C_1 + 0.25I), and estimation of ΔZ, for which C_2^T is multiplied by 0.5 Z_0^{-1} (Eq. (A2)).
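Under the stated assumptions, the two-step solution can be sketched as follows; the random C_1 and the small non-symmetric C_2 are illustrative stand-ins for the matrices of Eq. (13):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 6
A = rng.normal(size=(N, N))
C1 = A @ A.T / N                     # symmetric part, as in Eq. (13)
C2 = 0.05 * rng.normal(size=(N, N))  # small non-symmetric part, ||C2|| << ||C1||

# Step 1: Z0 = (C1 + 0.25 I)^{1/2} via eigendecomposition.
mu, Psi = np.linalg.eigh(C1)
Z0 = Psi @ np.diag(np.sqrt(mu + 0.25)) @ Psi.T

# Step 2: correction dZ = 0.5 * Z0^{-1} C2^T (Eq. (A2)),
# then Pi^T = Z0 + dZ - 0.5 I.
dZ = 0.5 * np.linalg.solve(Z0, C2.T)
PiT = Z0 + dZ - 0.5 * np.eye(N)

# The correction stays small when ||C2|| is small relative to ||C1||.
assert abs(np.trace(dZ)) / np.trace(Z0) < 0.5
```

Note that np.linalg.solve(Z0, C2.T) forms Z0^{-1} C2^T without explicitly inverting Z0, which is the numerically preferable route when Z0 is well conditioned.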

Appendix B: Relation between Analysis-Error and Forecast-Error Covariance Matrices

Let K = P^a H^T R^{-1}. Formula (21) implies that:

P^a = (I − KH) P^f (I − KH)^T + (1/N) K E E^T K^T + A, (B1)

where A = (1/N)[(I − KH) Dx^f E^T K^T + K E (Dx^f)^T (I − KH)^T]. As the classic Kalman filter assumes no correlation between the observation and forecast errors, A ≈ 0. Estimating the matrices P^a and P^f by the corresponding ensembles, and taking (1/N) E E^T ≈ R, formula (B1) shows that

P^a = (I − KH) P^f (I − KH)^T + K R K^T. (B2)

After some manipulations, the following equation is obtained:

P^a = P^f − KH P^f − P^f H^T K^T + KH P^f H^T K^T + K R K^T.

Assume that the terms containing K^T H^T can be neglected. Then P^a H^T R^{-1} (I + H P^f H^T R^{-1}) = P^f H^T R^{-1}, or

P^a H^T R^{-1} = P^f H^T (H P^f H^T + R)^{-1}. (B3)

Thus, satisfying the classic Kalman filter formula (B3) requires generating the errors in such a way that (1/N) E E^T ≈ R and (1/N) E (Dx^f)^T ≈ 0.