Keywords: data assimilation; error growth

Abstract


The justification for the standard four-dimensional variational data assimilation (4D-Var) method used at several major operational centres assumes a perfect forecast model, which is clearly unrealistic. However, the method has been very successful in practice. We investigate the reasons for this using a toy model with fast and slow time-scales and with non-random model error. The model error is chosen so that the solution remains predictable on both time-scales. The fast modes are much less well observed than the slow modes. We show that poorly observed modes can be best forecast by using a regularization matrix in place of the background-error covariance matrix, and using it to give a much stronger constraint than that implied by the true background error for these modes. The effect is that use can be made of observations over a longer time period. This allows the resulting forecast-error growth to be reduced to much less than that of random perturbations generated using the analysis-error covariance matrix and even less than the model error growth given sufficiently accurate observations. © Crown Copyright 2010. Published by John Wiley & Sons, Ltd.


1. Introduction


Four-dimensional variational data assimilation (4D-Var) has been used successfully at many major operational centres for several years. Examples are the Met Office (Rawlins et al., 2007) and the European Centre for Medium-Range Weather Forecasts (ECMWF: Klinker et al., 2000). In each case the operational introduction of 4D-Var has resulted in significant improvements in performance. However, the theoretical justification of 4D-Var, for instance given by Lorenc (1986), assumes a linear and perfect forecast model and Gaussian errors with zero mean in the background forecast and observations. Under these assumptions, 4D-Var gives a statistically optimal estimate of the state of the atmosphere. As a result, much research has been carried out since the operational introduction of 4D-Var to improve the formulation so that these assumptions can be relaxed. Examples are the nonlinear transform used to assimilate humidity into the ECMWF model by Hólm (2007) and the use of weak-constraint 4D-Var (Trémolet, 2006) to allow for model error.

It is difficult, however, to reconcile the theoretical limitations of 4D-Var with its practical success in situations far from those for which it is valid. This has motivated studies of which aspect of 4D-Var contributes most to its performance, for instance those by Lorenc and Rawlins (2005) and Laroche et al. (2007), which demonstrate that much of the improvement comes from incorporating a linearized version of the forecast model within the system. Diagnostic studies, for instance that by Cardinali et al. (2004), show that most of the information in an analysis comes from earlier observations via the background state. In the ECMWF 12 hour cycled system, Cardinali et al. showed that 85% of the information typically came from the background. The effect is that the analysis error is only slightly smaller than the background error.

Satisfactory performance of a cycled system requires that the growth of the analysis error during the assimilation window is on average compensated for by the reduction in error due to the observations. This is achieved in operational systems. Since the next background error is the analysis error evolved through the assimilation window, this can only be reconciled with Cardinali et al.'s results if the analysis error does not project strongly on to rapidly growing modes. The theory of 4D-Var shows that the analysis preferentially uses rapidly growing modes to fit the data; thus the analysis error in these modes is small. This is confirmed by toy-model experiments, e.g. Trevisan et al. (2010). The evidence from operational performance (Piccolo, 2010) is that this must be done efficiently, so that the subsequent error growth during the assimilation window is small. This is despite the fact that the background-error covariance matrix used in all current operational systems is essentially climatological, and thus contains considerable averaging. It is thus likely that this matrix underestimates the true errors in rapidly growing modes. This appears inconsistent with the observed efficiency of the analysis in correcting them.

In this article we illustrate how optimum forecast performance is obtained by forcing the analysis to use only slowly growing modes to fit the observations. This results in greater weight being given to observations from earlier assimilation cycles. Given sufficiently accurate observations, we show that the subsequent error growth in the forecast can also be reduced. We demonstrate this using the three-body model used for studies of 4D-Var by Watkinson (2006). This model supports rapidly growing perturbations, so is suitable for investigating the issue raised in the previous paragraph. The three bodies are referred to as sun, planet and moon. In this model there are two time-scales: a slow time-scale associated with the motion of the planet round the sun and a fast time-scale associated with the motion of the moon round the planet. We aim to make useful predictions of both modes. The case in which the fast time-scale is not accurately predicted by the model and has to be treated as ‘noise’ is discussed in a companion article (Cullen, 2010). We use two different versions of this model as the ‘truth’ model, which generates the trajectory from which the observations are drawn, and the ‘forecast’ model. This ensures that the forecast diverges from the truth unless the observations are successfully assimilated, as is the case in the real atmospheric system.

We can justify the use of a smaller background-error covariance matrix by thinking of 4D-Var as a method of regularizing the otherwise ill-posed problem of fitting a model state to the observations. This is described in Johnson et al. (2005b), where it is shown that 4D-Var corresponds to a Tikhonov regularization using the forecast background. We show that the 4D-Var algorithm can be re-interpreted as a regularization using a complete model trajectory, under the assumption that the model trajectory is accurately represented by the evolution of the Jacobian of the model starting from a given initial state. The studies of Lorenc and Rawlins and of Laroche et al. cited above suggest that the use of the trajectory is an essential part of the success of 4D-Var. The ‘optimal’ regularization would be the choice of background-error covariance matrix that minimized the short-range forecast error, which will not necessarily be the ‘true’ background-error covariance matrix.
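The regularization viewpoint can be made concrete with a minimal linear sketch (the matrices and numbers are illustrative assumptions, not taken from Johnson et al. (2005b)): fitting a state to observations through a nearly singular operator is ill-posed, and a Tikhonov background term of the kind 4D-Var supplies tames the solution.

```python
import numpy as np

# Nearly singular "observation operator": fitting the state to the
# observations alone is ill-posed (illustrative numbers only).
G = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-8]])
y = np.array([1.0, 1.00001])  # small noise in the second observation

# Unregularized least-squares fit: solve G x = y directly.
x_naive = np.linalg.solve(G, y)

# Tikhonov regularization: minimize ||G x - y||^2 + alpha ||x - xb||^2
# with background xb = 0; alpha plays the role of the inverse
# regularization matrix in 4D-Var.
alpha = 1e-3
x_reg = np.linalg.solve(G.T @ G + alpha * np.eye(2), G.T @ y)

print(np.linalg.norm(x_naive), np.linalg.norm(x_reg))
```

The unregularized solution is of order 10^3 because the operator is nearly rank-deficient; the background penalty keeps the fit bounded at the cost of a small bias.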

There is a close link between this procedure and the use of a model-state control variable to represent model error in weak-constraint 4D-Var (Trémolet, 2006). If we consider an arbitrarily long window, so that the background becomes irrelevant, the weak-constraint method fits a time sequence of observations with a model trajectory to which small corrections are applied periodically. The regularization approach would seek the smallest corrections that would have to be made to a model trajectory to enable it to fit the observations to within the observational error over a long time period. The optimal-state estimation approach would make the corrections depend on the model error, so that large corrections could be made to modes where the model is inaccurate. We show that in situations where the model error growth is slower than the growth of perturbations under the action of the model, the regularization approach makes corrections to the trajectory that are smaller than the model error and is successful in improving the forecasts. However, unlike the long-window approach, the trajectory is computed sequentially rather than by solving a simultaneous minimization problem. It would be of interest to see whether further benefit could be obtained by applying the same regularization matrix within a simultaneous minimization problem.

All calculations have been carried out using Mathematica® 6.0 (Wolfram, 2007).

2. Formulation of variational assimilation


We use the standard notation for describing variational data assimilation defined by Ide et al. (1997). The conventional formulation of 4D-Var can then be written as follows. Suppose time is discretized, with the index j denoting time steps. Define a state vector xj: for each j this is an l-dimensional vector. We denote truth values of xj as xt,j. Assume truth values evolve forward through one time step under the nonlinear operator N, so that

  • x_{t,j} = N(x_{t,j-1})  (1)

and that we have a nonlinear forecast model Mj,j−1 that evolves xj−1 forward for one time step so that

  • x_j = M_{j,j-1}(x_{j-1})  (2)

In weak-constraint 4D-Var (Trémolet, 2006), we assume that N(x) = M(x) + q, where q represents the model error.

We assume we have m observations y_i, 1 ≤ i ≤ m, and a nonlinear observation operator H_i generating the ith observation from a four-dimensional state {x_{t,j}: 0 ≤ j ≤ n} valid at the same time, where n is the number of time steps. Assume the observations have uncorrelated zero-mean Gaussian observation errors v_i with covariance R. Then we write

  • y_i = H_i({x_{t,j}}) + v_i  (3)

Denote the Mahalanobis norm for column vector x and matrix A by

  • ||x||_A^2 = x^T A^{-1} x

Then strong-constraint 4D-Var implies that

  • J(x_0) = (1/2) ||x_0 − x_b||_P^2 + (1/2) Σ_{i=1}^{m} ||y_i − H_i(x(·))||_R^2  (4)

where xb is the background state valid at t = 0. This requires determination of a single state vector x0 and a means of calculating the four-dimensional state x(·) from it. The perfect model assumption is made so that (2) is used for this purpose. The statistical assumption behind the formulation of the observation term in J(x0) is stated before (3). The assumption behind the background term is that we can write

  • x_b = x_{t,0} + x′  (5)

where x′ is a Gaussian random variable with zero mean and covariance P. P can evolve in time, as in Kalman filter theory. We also assume that x′ is uncorrelated with vi for all i.

Following Lorenc (1986), Cullen (2010) shows that in the strong-constraint formulation of 4D-Var, Jb can be rewritten as

  • J_b = (1/2) (x_{jp} − x_{b,jp})^T P_{jp}^{-1} (x_{jp} − x_{b,jp})  (6)

where M^p_{j,0} is the Jacobian matrix of M evaluated at x_b, x_{b,jp} is the background evolved to time jp, the cost function is evaluated every p time steps and the background-error covariance grows according to the equation

  • P_{jp} = M^p_{j,0} P (M^p_{j,0})^T  (7)

Equations (6) and (7) are only valid for Gaussian background errors with zero mean and a perfect linear forecast model.

Essentially the same argument applies in incremental 4D-Var, where the forecast model is nonlinear but perturbations are assumed to obey linear equations. The effect of (7) is then that the error growth is most rapid for the most rapidly growing singular vector, as illustrated for an idealized problem by Johnson et al. (2005a). If nonlinear 4D-Var is used, then (7) will be in error because the assumption of Gaussianity is not valid under nonlinear evolution. Since the initial error growth is linear, the Gaussian assumption will be valid for a while and the fastest growing structures may not be too badly estimated. The total error growth over the assimilation window will be incorrect if the linearity of the perturbation growth breaks down. If the model is imperfect, but the model errors are not too large, the estimated error growth may still be accurate enough to be useful.

We next rewrite Jo in (4) as a function of x0 as in Lorenc and Payne (2007). First assume x_0 − x_b is small, as in the justification of incremental 4D-Var. We can then write

  • x(·) ≈ M_{(·,0)}(x_b) + M (x_0 − x_b)  (8)

where M propagates the initial state forward to all times within the window using (2). Then write

  • J(x_0) = (1/2) ||x_0 − x_b||_P^2 + (1/2) ||y′ − H M (x_0 − x_b)||_R^2  (9)

where y′ = {y_i − H_i(M_{(·,0)}(x_b))} and H is the Jacobian matrix of H, which includes the selection of observation times as well as positions. We only consider the evolution of errors in the space spanned by M, so that M is invertible. Suppose x0 = xa minimizes (4). Since the gradient of J is zero at the minimum, we can show that

  • x_a − x_b = K y′  (10)

where the gain matrix K is defined by

  • K = (P^{-1} + M^T H^T R^{-1} H M)^{-1} M^T H^T R^{-1}  (11)

The behaviour of K depends on whether P^{-1} is larger or smaller than M^T H^T R^{-1} H M. This can be illustrated in the special case where all matrices are square, diagonal and invertible, so that P = {P_k}, with k labelling the kth diagonal element; we also write the shorthand (M^T H^T R^{-1} H M)_k = (HM)_k^2 R_k^{-1} and K = {K_k}. For variables k where P_k^{-1} ≫ (HM)_k^2 R_k^{-1}, (11) gives K_k ≈ P_k (HM)_k R_k^{-1}, and the assumption on P_k implies K_k ≪ (HM)_k^{-1}. If P_k^{-1} ≪ (HM)_k^2 R_k^{-1}, then K_k ≈ (HM)_k^{-1}. In both cases K_k is smaller for rapidly growing modes, and thus (10) shows that the analysis increments will also be small.
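The diagonal argument can be checked numerically with a scalar sketch (H = 1 and the values of P, M and R are illustrative assumptions, not from the paper):

```python
import numpy as np

def gain(P, M, R):
    """Scalar form of (11) with H = 1: K = (1/P + M^2/R)^-1 * M/R."""
    return (1.0 / P + M**2 / R) ** -1 * (M / R)

P, R = 1.0, 0.01               # illustrative background and observation variances
K_slow = gain(P, M=1.0, R=R)   # slowly growing mode
K_fast = gain(P, M=10.0, R=R)  # rapidly growing mode

# Analysis-error variance from the scalar form of (13), A = (1 - K M) P.
A_fast = (1.0 - K_fast * 10.0) * P

print(K_slow, K_fast, A_fast)
```

The gain for the rapidly growing mode is an order of magnitude smaller, and its analysis-error variance is far below P, consistent with the analysis preferentially correcting rapidly growing modes.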

Now assume, as in the analysis of standard 4D-Var, that the errors in the background and observations are Gaussian with zero mean and the model is perfect. Then the analysis-error covariance matrix is

  • A = (P^{-1} + M^T H^T R^{-1} H M)^{-1}  (12)
  • A = (I − K H M) P  (13)

Consider again the case of diagonal matrices. If Pk is small in the sense described above, then (KHM)_k ≪ 1, so A_k ≈ P_k. If Pk is large, then (KHM)_k ≈ 1 and A_k ≪ P_k.

If all the observations are at the end of the 4D-Var window, we can consider 4D-Var to be equivalent to 3D-Var at the end of the window with a background-error covariance matrix given by (7). The analysis error at the end of the window is MAMT, which will also be the background error for the next cycle.

In the present article, we consider the problem of cycled 4D-Var where the same background-error covariance matrix is used for all the analysis cycles. This reflects current operational practice in many centres. Usually this B matrix is regarded as an estimate of the ‘true’ background error P, which will evolve through the cycles. Use of B instead of P is justified under a stationarity assumption. In the present case we use a regularization matrix in the sense of Johnson et al. (2005b). Since this may be very different from an estimate of P, we write it as C. If too small a value of C is used, it is possible that the analysis may be unable to stay close to the observations (equivalent to filter divergence in a true Kalman filter). If too large a value is used, the analysis will largely depend on the current set of observations. If these are incomplete, the analysis error may become very large. These effects are demonstrated in Fisher (2007).

The use of a regularization matrix C instead of P means that (11) becomes

  • K = (C^{-1} + M^T H^T R^{-1} H M)^{-1} M^T H^T R^{-1}  (14)

A direct calculation of < (x_a − x_{t,0})(x_a − x_{t,0})^T >, where < · > denotes an expectation, then gives

  • A = (I − K H M) P (I − K H M)^T + K R K^T  (15)

instead of (13).

Satisfactory performance of a cycled system requires that the increase in P due to the time evolution is on average balanced by the reduction in P due to the observations. In the case of a perfect linear model where the true P is used in the analysis, this balance is expressed by

  • P = M A M^T  (16)

If we allow for model error and assume that the model error accumulated over an assimilation window has a Gaussian distribution with zero mean and covariance Q, and is additionally uncorrelated with the analysis error and the observation errors, then (16) becomes

  • P = M A M^T + Q  (17)

If a regularization matrix C is used instead of P in the analysis, the equivalent of (16) is obtained by replacing P by MAMT in (15) and the equivalent of (17) is obtained by replacing P by MAMT + Q in (15).

Considering again the diagonal case, we can see from (15) that if, for some k, Ck is underestimated then Kk will be reduced and so A_k ≈ P_k. Therefore the growth in Pk under the action of the model will not be compensated for and Pk will grow. Now suppose that Ck is overestimated. In this case we can show that (14) implies that K ≈ (HM)^{-1} if HM is invertible, in which case (15) implies that A is bounded by

  • A ≲ (H M)^{-1} R {(H M)^{-1}}^T  (18)

In 3D-Var, when M = I, this will only be finite if the observations are complete. In 4D-Var it can be finite for incomplete observations if the action of the model through M introduces sufficient multivariate coupling. This is an important advantage of 4D-Var.
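This advantage of 4D-Var can be seen in a two-variable sketch (the model and observation matrices are illustrative assumptions): observing only the first component gives a rank-deficient operator in 3D-Var, but stacking H M over successive observation times restores full rank once M couples the variables.

```python
import numpy as np

M = np.array([[1.0, 0.5],   # model step: variable 2 feeds into variable 1
              [0.0, 1.0]])
H = np.array([[1.0, 0.0]])  # only the first variable is observed

# 3D-Var: repeating the same single-time observation, M plays no role.
G3 = np.vstack([H, H])
# 4D-Var: observations at the ends of two windows, each seen through M.
G4 = np.vstack([H @ M, H @ M @ M])

print(np.linalg.matrix_rank(G3), np.linalg.matrix_rank(G4))
```

G3 has rank 1 however many times the single-component observation is repeated, so the bound (18) cannot be finite in 3D-Var; G4 has full rank 2, so the 4D-Var bound is.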

It is not practical to construct a formal optimization problem to find C to minimize the forecast error, since the result would also depend on the observations. Two practicable strategies are discussed.

The first is to ensure that the error reduction in rapidly growing modes given by (13) is large enough to compensate for the growth. This requires a sufficiently large C, but also sufficient observations to ensure that (18) controls the analysis error. Experiments with the three-body model using 3D-Var, to which (18) also applies with M replaced by I, showed that satisfactory performance of a cycled system could not be obtained with incomplete observations. 4D-Var performs better because of the multivariate coupling introduced by the use of M in (18).

The second strategy is to fit the model trajectory to the observations over a longer time period. This is the justification for long-window 4D-Var (Trémolet, 2006). In order to exploit information from observations over a long time period, it is necessary to use the smallest possible C, which acts as a proxy for the model error corrections used in long-window schemes. In order to prevent systematic error growth over the cycles, it is at least necessary that the analysis increment evolved over the window is as large as the model error growth during the window. In the diagonal case, this requires

  • M_k^2 C_k ≥ Q_k  (19)

If the perturbation grows under the action of the model, this implies C_k < Q_k. Note that this is different from the optimal-state estimation approach, which would set C = Q. Fisher et al. (2005) show that a large Q acts as a ‘forgetting factor’, so that information from past observations is not used to analyse modes where the model error is large. Equation (19) shows that perturbation growth in the model can be exploited to allow smaller increments to be made. This allows more information from earlier observations to be retained.
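Inequality (19) can be checked with a scalar cycled caricature (all numbers are illustrative assumptions): a mode growing by M = 2 per window with a deterministic model-error increment q, assimilated with a regularization variance C deliberately smaller than Q = q^2, still equilibrates at an error well below the per-window model error provided the observations are accurate enough.

```python
import random

M, q = 2.0, 0.05        # per-window growth factor and model-error increment
Q = q**2                # model-error variance per window
R = 0.001               # observation-error variance ("sufficiently accurate")
C = 0.4 * Q             # regularization variance, deliberately below Q

k = M**2 * C / (M**2 * C + R)   # gain for an observation at the window end
random.seed(1)
err, errors = 0.0, []
for _ in range(500):
    b = M * err + q                                   # background error after one window
    err = (1 - k) * b + k * random.gauss(0.0, R**0.5) # analysis error at window end
    errors.append(abs(err))

mean_err = sum(errors[200:]) / len(errors[200:])
print(mean_err)
```

With C = Q (the optimal-state-estimation choice) more weight would fall on each new observation; the point of (19) is that growth under M lets a smaller C, and hence older observations, do more of the work.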

Suppose that we choose a C matrix and use it to complete a set of analysis cycles that fits the observations to within observational error. This means that a model trajectory to which a correction has been added at the end of each window fits the observations to within the observational error over a long period. The statistics of the analysis increments will then define an upper bound on the size of the increments necessary to maintain the fit to the observations. These statistics can be used to define a new C matrix and the cycles repeated. There is clearly no guarantee that this method will converge, or that it will find the smallest C that is sufficient to do the job independently of the first guess.
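The procedure above can be sketched as a scalar caricature (a persistence forecast with a deterministic drift standing in for model error; all values are illustrative assumptions). Each batch of cycles supplies increment statistics that define the next C:

```python
import random

random.seed(0)
R = 0.01          # observation-error variance
drift = 0.01      # model error per cycle: the truth drifts, the model persists
C = 1.0           # deliberately loose first-guess regularization variance

x_truth, x_a = 0.0, 0.0
C_history = [C]
for batch in range(3):
    increments, errors = [], []
    for _ in range(300):
        x_truth += drift                    # truth evolves
        x_b = x_a                           # persistence forecast (imperfect model)
        y = x_truth + random.gauss(0.0, R**0.5)
        K = C / (C + R)                     # scalar gain with H = M = I
        x_a = x_b + K * (y - x_b)
        increments.append(x_a - x_b)
        errors.append(x_a - x_truth)
    # re-estimate C from the increment statistics of the completed batch
    C = sum(d * d for d in increments) / len(increments)
    C_history.append(C)

rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
print(C_history, rmse)
```

In this sketch each batch shrinks C and the increments with it while the analysis error stays below the observation error; the caveat in the text applies, since nothing guarantees convergence, and continued shrinkage is what can eventually produce ill-conditioning.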

3. Three-body model


3.1. Basic formulation

The description given here is an abbreviated version of that given by Cullen (2010). Using Cartesian coordinates to represent the motion of the three bodies in a plane, the αth body has mass mα′, position vector qα′ and momentum pα′, with Greek subscripts now being used to label each body in contrast to Roman time-index subscripts. The position and momentum vectors for all three bodies are combined into the single vectors q′ = (q1′T,q2′T,q3′T)T and p′ = (p1′T,p2′T,p3′T)T, each having six elements to represent the two position and two momentum components, respectively, for each body's planar motion. Where necessary, indices 1, 2 and 3 represent the sun, planet and moon respectively. Introduce dimensionless variables m_α = m_α′/m_1′, q_α = q_α′/L and p_α = p_α′/{m_1′ (G m_1′/L)^{1/2}}, where L is a characteristic orbital distance (taken here to be the initial distance between the sun and planet) and G is the universal gravitational constant. The Hamiltonian describing the gravitational interaction of the three bodies is

  • H = Σ_α |p_α|^2 / (2 m_α) − Σ_{α<β} m_α m_β / |q_α − q_β|  (20)

where sums are taken over all three bodies.

Define the dimensionless time τ = t (G m_1′ / L^3)^{1/2}. Noting that ∂|r|/∂r = r/|r| for an arbitrary vector r, Hamilton's equations of motion for 1 ≤ α ≤ 3 are

  • dq_α/dτ = ∂H/∂p_α = p_α / m_α  (21)
  • dp_α/dτ = −∂H/∂q_α = −Σ_{β≠α} m_α m_β (q_α − q_β) / |q_α − q_β|^3  (22)

The masses and initial position and momentum coordinates used to generate the truth trajectory from which the observations are drawn are given in Table I. These are the values used by Watkinson (2006), and result in the moon orbiting the planet while the planet orbits the sun. Table I also shows values used to define the imperfect model that is used to assimilate the observations. This model error is chosen to be large enough to affect the results, but does not dominate the error growth during an assimilation window. It is chosen to represent the sort of systematic error typical of operational models, and so intentionally does not have the form assumed in (17).

Table I. Masses (mα), initial conditions for the Cartesian position coordinates (qα,x and qα,y) and the initial momenta (pα,x and pα,y) used by Watkinson (2006) for the three-body orbital problem. The row labelled 2a indicates alternate values used for the planet's mass in the forecast model.
α    mα      qα,x    qα,y    pα,x     pα,y
1    1.0     0.0     0.0     0.01     −0.11
2    0.1     1.0     0.0     0.00     0.10
2a   0.101   1.0     0.0     0.00     0.10
3    0.01    1.0     0.1     −0.01    0.01
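The truth trajectory can be reproduced directly from Table I. The following sketch (a standard Störmer–Verlet integration of (21)–(22) in Python, not the paper's Mathematica code) checks that the energy (20) and the total linear momentum are conserved at the time step of 0.005 used in section 4:

```python
import numpy as np

# Masses and the truth initial conditions from Table I (sun, planet, moon).
m = np.array([1.0, 0.1, 0.01])
q = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.1]])
p = np.array([[0.01, -0.11], [0.0, 0.1], [-0.01, 0.01]])

def force(q):
    """Gravitational forces -dH/dq_alpha for the Hamiltonian (20) (G = 1)."""
    F = np.zeros_like(q)
    for a in range(3):
        for b in range(3):
            if a != b:
                r = q[a] - q[b]
                F[a] -= m[a] * m[b] * r / np.linalg.norm(r) ** 3
    return F

def energy(q, p):
    """Hamiltonian (20): kinetic energy plus pairwise potential."""
    ke = sum(p[a] @ p[a] / (2 * m[a]) for a in range(3))
    pe = -sum(m[a] * m[b] / np.linalg.norm(q[a] - q[b])
              for a in range(3) for b in range(a + 1, 3))
    return ke + pe

def verlet_step(q, p, dt):
    """One Stormer-Verlet (kick-drift-kick) step for the separable Hamiltonian."""
    p_half = p + 0.5 * dt * force(q)
    q_new = q + dt * p_half / m[:, None]
    return q_new, p_half + 0.5 * dt * force(q_new)

E0, P0 = energy(q, p), p.sum(axis=0)
for _ in range(1000):           # 5 time units, about nine moon orbits
    q, p = verlet_step(q, p, 0.005)
drift = abs(energy(q, p) - E0) / abs(E0)
print(drift, p.sum(axis=0))
```

The symplectic integrator keeps the relative energy drift small over the integration, and the total linear momentum, which is zero for the Table I values, is conserved to rounding error.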

Equations (21)–(22) have 12 degrees of freedom. Data assimilation theory requires the use of statistically independent analysis variables. In Bannister et al. (2008), it is shown that the use of dynamically independent variables such as normal modes is a reasonable proxy for statistically independent variables. In the present case, we achieve something similar by rewriting the system in relative coordinates, the wide binary coordinates of Chambers et al. (2002).

Define a new set of dimensionless position coordinates Q_α by

  • Q_2 = (m_1 q_1 + m_2 q_2 + m_3 q_3) / (m_1 + m_2 + m_3)  (23)
  • Q_1 = q_1 − (m_2 q_2 + m_3 q_3) / (m_2 + m_3)  (24)
  • Q_3 = q_3 − q_2  (25)

Equation (23) defines the centre of mass, (24) is the position of the sun relative to the centre of mass of the planet and moon and (25) is the position of the moon relative to the planet.

Define a new set of dimensionless momentum coordinates P_α by

  • P_1 = p_1 − m_1 (p_1 + p_2 + p_3) / (m_1 + m_2 + m_3)  (26)
  • P_2 = p_1 + p_2 + p_3  (27)
  • P_3 = p_3 − m_3 (p_2 + p_3) / (m_2 + m_3)  (28)

P2 is the total linear momentum, P1 is the linear momentum of the sun relative to the solar system's centre of mass and P3 is the linear momentum of the moon relative to the planetary system's centre of mass. P2 and Q2 are conserved quantities under the time evolution described by (21) and (22).

If all 12 degrees of freedom are used in the cost function and C is evaluated using data from the model, which inherently conserves P2 and Q2, then C will only have rank 8 and the minimization cannot be carried out. This is avoided in the current work by excluding the conserved quantities P2 and Q2 from the assimilation, which reduces the data assimilation problem to an eight-dimensional system. The truth and forecast models have different centres of mass. This difference is excluded from the assimilation by working, for each system separately, in coordinates relative to that system's centre of mass and moving with it. The effect is that the centre of mass and total linear momentum are set to zero for both the truth and forecast trajectories. The result is then identical to that of a twelve-dimensional assimilation in which the background-error covariance matrix C is written in terms of the control variables and all rows and columns corresponding to P2 and Q2 are set to zero. By doing this, any projection of the innovations on to the conserved quantities is ignored.
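The change of variables is linear and invertible when the conserved pair Q2, P2 is retained. The sketch below verifies this by a round trip; the explicit formulae are a reconstruction of the wide-binary coordinates of Chambers et al. (2002) consistent with the descriptions in (23)–(28), and should be read as an assumption rather than the paper's exact convention.

```python
import numpy as np

m1, m2, m3 = 1.0, 0.1, 0.01        # masses from Table I
Mtot, mu23 = m1 + m2 + m3, m2 + m3

def to_wide_binary(q1, q2, q3, p1, p2, p3):
    """Assumed form of the wide-binary coordinates (23)-(28)."""
    Q2 = (m1 * q1 + m2 * q2 + m3 * q3) / Mtot   # centre of mass
    Q1 = q1 - (m2 * q2 + m3 * q3) / mu23        # sun rel. planet-moon c.o.m.
    Q3 = q3 - q2                                # moon relative to planet
    P2 = p1 + p2 + p3                           # total linear momentum
    P1 = p1 - (m1 / Mtot) * P2                  # sun momentum rel. c.o.m.
    P3 = p3 - (m3 / mu23) * (p2 + p3)           # moon momentum rel. planet system
    return Q1, Q2, Q3, P1, P2, P3

def from_wide_binary(Q1, Q2, Q3, P1, P2, P3):
    """Inverse transform, recovering the Cartesian coordinates."""
    mu = Q2 - (m1 / Mtot) * Q1     # centre of mass of planet + moon
    q1 = Q1 + mu
    q2 = mu - (m3 / mu23) * Q3
    q3 = q2 + Q3
    p1 = P1 + (m1 / Mtot) * P2
    s23 = P2 - p1                  # p2 + p3
    p3 = P3 + (m3 / mu23) * s23
    return q1, q2, q3, p1, s23 - p3, p3

rng = np.random.default_rng(0)
state = [rng.standard_normal(2) for _ in range(6)]
roundtrip = from_wide_binary(*to_wide_binary(*state))
print(max(np.abs(a - b).max() for a, b in zip(state, roundtrip)))
```

The inverse is obtained by first recovering the planet–moon centre of mass and the total momentum, then unwinding the relative coordinates.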

We therefore write

  • x = (q^T, p^T)^T  (29)

and use the relative coordinates Q_1, Q_3 and P_1, P_3 as the control variables in the assimilation. This is done by defining

  • w = (Q_1^T, Q_3^T, P_1^T, P_3^T)^T  (30)

Using (23)–(28), we can derive formulae for linear transformations between x and w in the form

  • w = T x,  x = T^{-1} w  (31)

and write the cost function J defined in (4) as

  • J(w_0) = (1/2) ||w_0 − w_b||_{C̃}^2 + J_o(w_0)  (32)

where

  • C = T^{-1} C̃ (T^{-1})^T  (33)

4. Three-body model experiments


4.1. Experimental framework

We use the model defined in Table I to generate the ‘truth’ state from which the observations are drawn. The model was integrated forward using a Störmer–Verlet method (Hairer et al., 2003), with a time step of 0.005. We fix a time-scale by choosing an assimilation period of 0.3 time units, where the period of the moon's orbit round the planet is 0.54 time units, and that of the planet round the sun is 7.2 time units. The standard observation errors used for most experiments are listed in Table II. Experiments were carried out with observations spaced at equal intervals within the assimilation window. Those illustrated used either one observation at the end of each window or four equally spaced observations with the first taken after a quarter of the window length and the last at the end of the window. This strategy was used to allow the benefits of 4D-Var to be studied.

Table II. Diagonal R-matrix standard deviations. Elements 1–2 are observations of q1, 3–4 of q2 and 5–6 of q3. Elements 7–8 are observations of p1, 9–10 of p2 and 11–12 of p3.

Elements         Standard deviation
1–2 (q1)         8 × 10^−4
3–4 (q2)         8 × 10^−3
5–6 (q3)         2 × 10^−2
7–8 (p1)         6 × 10^−4
9–12 (p2, p3)    10^−3

As discussed in section 2, a regularization matrix C was used in the assimilation. A first guess C̃ was chosen to be a diagonal matrix with the entries given in Table III; C was then calculated using (33). A perturbation drawn from a normal distribution with zero mean and covariance C was added to the initial conditions in Table I. Observations were drawn from the truth trajectory, and perturbations drawn from a normal distribution with zero mean and variances given in Table II were added to them. A set of 300 assimilation cycles was then carried out. We then set

  • C̃_kk = < (w_a − w_b)_k^2 >  (34)

Another set of 300 assimilation cycles was then run, and C̃, and thus C, re-evaluated. This procedure was then repeated till ill-conditioning of the matrix caused failure of the assimilation. This invariably resulted from a few eigenvalues of C̃ becoming very small.

Table III. Diagonal C̃-matrix standard deviations. Elements 1–4 correspond to Q1 and P1. Elements 5–8 correspond to Q3 and P3.

Elements         Standard deviation
1–2 (Q1)         3 × 10^−2
3–4 (P1)         5 × 10^−3
5–6 (Q3)         10^−2
7–8 (P3)         10^−3

The assimilation tests were carried out using a Gauss–Newton method to minimize the cost function. The first iterate of this corresponds to incremental 4D-Var, and a second iteration to a single ‘outer loop’ as used in some operational centres. It was found that the second iteration gave considerable benefit regarding the accuracy of the resulting forecasts, but iterating to convergence did not give much further improvement and resulted in a number of failures because of the generation of excessive analysis increments. This may be related to the non-quadratic nature of the nonlinear minimization problem.
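The Gauss–Newton iteration can be illustrated on a scalar caricature (a weakly nonlinear model m(x) = x + 0.1x^2 and illustrative error variances, not the paper's system); each outer loop relinearizes the model about the current iterate and minimizes the resulting quadratic cost exactly:

```python
def model(x):           # weakly nonlinear forecast "model" (illustrative)
    return x + 0.1 * x * x

def jac(x):             # its Jacobian
    return 1.0 + 0.2 * x

xb, C = 0.0, 1.0        # background and regularization variance
R = 0.01                # observation-error variance
y = model(0.5)          # perfect observation of the truth x_t = 0.5

def cost(x):
    return 0.5 * (x - xb) ** 2 / C + 0.5 * (y - model(x)) ** 2 / R

x, costs = xb, [cost(xb)]
for _ in range(3):      # Gauss-Newton outer loops
    g = jac(x)
    # minimize the quadratic cost obtained by linearizing model() about x
    x = x + ((xb - x) / C + g * (y - model(x)) / R) / (1.0 / C + g * g / R)
    costs.append(cost(x))

print(costs)
```

In this sketch the first pass (incremental 4D-Var) recovers most of the cost reduction and the second pass, the single ‘outer loop’, captures nearly all of the rest, after which further iterations change little.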

4.2. Results

The standard experiment with which others are compared uses an observation of all 12 variables once at the end of each assimilation window. In the first experiment illustrated, C was re-estimated after each set of 300 cycles by using

  • C̃_kk = < (w_b − w_t)_k^2 >  (35)

instead of (34). Though it is not possible to use this in an operational system, because the truth xt is unknown, it shows how the system performs if the background error is set to a climatological estimate of its correct value.

All the analysis increments and forecast errors are illustrated using the control variables Q_1, Q_3, P_1, P_3 defined in (23)–(28). For convenience we use the term ‘sun’ to refer to the ‘slow’ variables Q1, P1 and ‘moon’ to refer to the ‘fast’ variables Q3, P3.

Figure 1 shows the effect of the iterative re-evaluation of C and C̃ described above. The standard deviations of the components of the background error wt − wb for the sun's position and momentum are plotted. The squares of these form the diagonal elements of C̃ for the next set of assimilation cycles. The values decrease during the early iterations and essentially converge after four sets of assimilation cycles. Figure 1 also shows that the analysis error is larger than the background error in the first set of cycles, but is about 25% lower than the background error in the final set. The standard deviations of the components of wb − wa are also illustrated. These are the analysis increments, but can also be considered as a proxy for the background error. The values are substantially lower than the true background error in the final set of cycles.


Figure 1. Experiment with C calculated using (35). Background-error diagnostics in dimensionless units for the sun's position (a) and momentum (b) are used for each set of assimilation cycles. Plus signs indicate values of the standard deviation of the x and y components of the (truth−background) difference. Asterisks indicate values of the standard deviation of the x and y components of the (truth−analysis) difference and diamonds indicate values of the standard deviation of the x and y components of the (analysis−background) difference.


Figure 2 shows the same information for the moon's position and momentum. There is now a very large difference, more than a factor of 3, between the values of wb − wa and wb − wt. The analysis errors are slightly less than the background errors, typically about 10% less. The iteration only has a small effect.


Figure 2. As Figure 1 for the moon's position (a) and momentum (b).


Figure 3 shows the effect of the iterative re-estimation of C on the quality of the forecasts from each set of cycles. The forecasts are assessed over both a short-range (0.3–1.2 time units) and a medium-range (1.2–3.0 time units) period. The short-range forecasts show improvement till convergence between the last two sets of cycles. This is consistent with the behaviour of the background error. The medium-range forecast is actually best at the penultimate iteration, but this may simply represent noise as the other diagnostics indicate convergence of performance. The errors are much lower than those for a ‘no assimilation’ run, where the truth and forecast models are integrated for 300 cycles (90 time units). The resulting r.m.s. error in the sun's position is 1.02 and that in the momentum is 0.091.


Figure 3. Standard deviations of the r.m.s. forecast errors in dimensionless units for the sun's position (a) and momentum (b). Plus signs are forecast errors averaged over the period 0.3–1.2 time units and over a set of 300 assimilation cycles. Asterisks are the corresponding respective values averaged over the period 1.2–3.0 time units.


Figure 4 shows the errors in the forecasts of the moon's position and momentum. These are almost independent of iteration, consistent with the diagnostics shown in Figure 2. The errors are nevertheless less than in the ‘no assimilation’ run, which gave values of 0.096 for the moon's position and 0.0103 for the moon's momentum. The reduction in the error due to the assimilation is much less than for the sun. The medium-range errors for the moon are about half the ‘no assimilation’ values, while for the sun they are only about 3% of the ‘no assimilation’ values.

Figure 4. As Figure 3 for the moon's position (a) and momentum (b).

The next set of experiments uses (34) to estimate C after each set of 300 cycles. The iterative recalculation of C was repeated until failure of the assimilation due to ill-conditioning of the C matrix. This was caused by two of the eight eigenvalues of the implied background-error covariance becoming very small, and indicates that two of the degrees of freedom were not being incremented by the analysis. The values of the background error wb − wt, analysis error wa − wt and analysis increments wa − wb are shown in Figure 5 for the sun and Figure 6 for the moon.
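The re-estimation of C from the statistics of the accumulated analysis increments can be sketched as follows. This is a minimal illustration only, assuming a diagonal C built from per-component increment variances with a small floor to delay the ill-conditioning described above; the function and parameter names are illustrative, and the exact form of (34) is given in section 2.

```python
import numpy as np

def estimate_C_from_increments(increments, floor=1e-12):
    """Estimate a diagonal covariance C from a history of analysis
    increments (one row per cycle).  The small floor guards against
    eigenvalues collapsing when some modes stop being incremented."""
    incs = np.asarray(increments)      # shape (n_cycles, n_state)
    var = np.mean(incs ** 2, axis=0)   # per-component sample variance
    return np.diag(np.maximum(var, floor))

# Example: 300 cycles of an 8-dimensional state, as in the three-body model
rng = np.random.default_rng(0)
incs = rng.normal(scale=0.01, size=(300, 8))
C = estimate_C_from_increments(incs)
```

Since 4D-Var tends to make small increments in rapidly growing modes, a C built this way is much tighter than the climatological background error for those modes, which is the regularizing behaviour exploited in these experiments.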

Figure 5. As Figure 1 with C calculated using (34).

Figure 6. As Figure 5 for the moon's position (a) and momentum (b).

Figure 5 shows that the background and analysis errors for the sun decrease for the first four sets of cycles but then increase in the last set. The analysis increments decrease throughout. Comparing Figure 1 with Figure 5 shows that the minimum background error is slightly larger than that found when C is set to equal the true climatological background error. The reduction in background error from the analysis is similar in both experiments. The analysis increments associated with the lowest background error in Figure 5 are about 30% less for the sun's position but 15% greater for the sun's momentum than those corresponding to the converged solution shown in Figure 1.

Figure 6 shows that the diagnosed values for the errors in the moon's position and momentum are strongly affected by the recalculation of C. The background and analysis errors decrease through four sets of cycles and increase in the final set. The analysis increments decrease throughout. The minimum value reached for the background error is about half the values shown in Figure 2. The reduction in background error by the analysis is even less than in Figure 2, only about 5%. The analysis increments in the fourth set of cycles, which gave the lowest background error, are only 10% of the values shown in Figure 2. Those for the final set are even less, but now appear non-optimal.

Figure 7 shows the forecasts for the sun from this experiment. The verification periods are the same as in Figure 3. The errors for the sun's position for both periods decrease through all sets of cycles. The final value is slightly higher than that shown in Figure 3 obtained by setting C to the true background error. The short-range forecast errors for the sun's momentum are similar for all the sets of cycles. However, the medium-range forecast errors decrease from set to set throughout, though again the final value is higher than that shown in Figure 3.

Figure 7. As Figure 3 for the experiment where C is defined using (34).

Figure 8 shows the forecasts for the moon. The short-range forecasts are most accurate for the fourth set of cycles, as with the values of the background error shown in Figure 6. However, the smallest values of the medium-range forecast error are obtained in the final set, as is the case with the forecasts of the sun shown in Figure 7. The best values of the forecast error are about 30% of the converged forecast error in Figure 4 for the moon's position and momentum in the short range and about 25% of the value shown in Figure 4 for the medium range.

Figure 8. As Figure 7 for the moon's position (a) and momentum (b).

In order to understand this behaviour, we plot the forecast errors against time for the different sets of cycles. Figure 9 shows this information for the experiment with C calculated from (35) and Figure 10 for the experiment with C calculated from (34). For the converged solution for the sun's position in Figure 9, we see about a 30% growth of error during the assimilation window, as noted in Figure 1. There is then a further growth of a factor of 8 up to time 2.1. The effect of the recalculation of C is to reduce the analysis errors relative to the background error. The forecast-error growth in the short range increases with the recalculation, but after time 1 there is no clear signal. There is little effect of the recalculation on the forecasts of the moon's position, as noted in Figure 4. However, there is no growth of error between analysis and background, which is not consistent with (16) or (17). By time 2.1, the error grows by a factor of 8.

Figure 9. Error growth against time for the separate sets of cycles in the experiment shown in Figures 1–4. Successive sets are indicated by plus signs, asterisks, diamonds and triangles. (a) sun's position. (b) moon's position.

Figure 10. Error growth against time for the separate sets of cycles in the experiment shown in Figures 5–8. Successive sets are indicated by plus signs, asterisks, diamonds, triangles and squares. (a) sun's position. (b) moon's position.

Figure 10 shows a similar growth of error in the sun's position during the assimilation window in the final set of cycles, as noted after Figure 5. Again, the growth during the window and the short-range forecast period increases with recalculation of C. The growth rate in the subsequent forecast does not depend significantly on the recalculation. The results for the moon's position are very different. The background error is decreased by a factor of 1.4 as a result of the recalculations, but the medium-range forecast error is decreased by a factor of 4, a dramatic reduction in forecast-error growth. There is hardly any growth during the assimilation window.

We compare this with the evolution of the differences between the truth trajectory and the trajectory started from perturbed initial conditions using the forecast model. The differences are averaged over an ensemble of 10 initial perturbations consistent with the analysis-error covariance matrix that gave the best forecasts in the experiment illustrated in Figure 10 with C generated using (34). Figure 11 shows the evolution over the assimilation window in terms of the control variables. Both the nonlinear evolution and the evolution generated by the Jacobian of the model have been plotted, but in this case they are almost indistinguishable. The linearity assumption is thus accurate. The model error is also plotted, but it only makes a small contribution to the total error, showing that the error growth implied by M will be quite accurate. The error growth for the sun's position is consistent with the growth of error from the analysis to the next background state shown in Figures 5 and 10, while that for the sun's momentum is less than that shown in Figure 5. However, the error growth for the moon is about a factor of 10 for both position and momentum, which is completely different from the negligible growth between analysis and background shown in Figures 6 and 10. The analysis error for all variables is greater than or similar to the model error evolved over an assimilation window.
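The diagnostic comparing the nonlinear and tangent-linear evolution of the perturbations can be sketched as below. Here `model` and `jacobian` are stand-ins for the three-body integration over a window and its Jacobian M, so the example only illustrates the linearity check, not the actual system.

```python
import numpy as np

def check_linearity(model, jacobian, x0, scale=1e-3, n=10, seed=0):
    """Compare the nonlinear evolution of small perturbations with the
    evolution given by the model Jacobian, averaged over an ensemble of
    n perturbations of the given r.m.s. scale."""
    rng = np.random.default_rng(seed)
    J = jacobian(x0)
    nl, tl = [], []
    for _ in range(n):
        dx = scale * rng.standard_normal(x0.size)
        nl.append(model(x0 + dx) - model(x0))  # nonlinear difference
        tl.append(J @ dx)                      # tangent-linear evolution
    nl, tl = np.array(nl), np.array(tl)
    rms = lambda a: np.sqrt(np.mean(a ** 2))
    return rms(nl), rms(tl), rms(nl - tl)      # sizes and their discrepancy

# Stand-in quadratic map with its exact Jacobian
model = lambda x: x + 0.1 * x ** 2
jac = lambda x: np.diag(1.0 + 0.2 * x)
r_nl, r_tl, r_diff = check_linearity(model, jac, np.ones(4))
# For small perturbations the two evolutions are almost indistinguishable
```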

Figure 11. Dashed line: r.m.s. differences in the evolution of the model defined in Table I resulting from perturbing the initial conditions in Table I consistently with the analysis error as defined in the text. Thin line: as the dashed line, but using the Jacobian of the model. Dotted line: differences between the evolution using the model and initial conditions defined in Table I and that using the alternate model and initial conditions from Table I. Solid line: total difference including both components. The differences are expressed using the control variables.

Now we consider whether the results are consistent with the requirement expressed in (19) that the evolved analysis increments must be sufficient to compensate for model error growth. The analysis increments from the runs that gave the best forecasts are shown in Figure 5 for the sun. Applying the growth rate shown in Figure 11 to the values of the increments shown for the fourth set of cycles gives values larger than the model error growth. However, applying the same calculation for the moon gives values only half as high, suggesting that the mechanism by which error growth is prevented involves more knowledge of the model dynamics than the simple argument used to derive (19).

Figure 12 shows the growth of the same perturbations over the whole period for which the forecasts are verified, so covering the full period shown in Figures 7 and 10. The errors at time 2.1 should compare with the errors averaged over the period 1.2–3.0 shown in Figures 3–4 and 7–10. The growth in the error in the sun's position and momentum is now dominated by the model error. The forecast errors shown in Figures 7 and 10 are somewhat smaller than that implied by the model error growth. The errors for the moon saturate by about time 1.5, but are primarily due to evolution of the initial error. The linearization assumption breaks down after time 1.2, much longer than the assimilation window. Thus the assimilation technique should be able to reduce the error, as we have demonstrated. The optimum forecast errors shown in Figures 8 and 10 are much lower than those implied by the model error growth over the forecast period.

Figure 12. As Figure 11, plotted for the period of the forecast verification (three time units).

We now consider the degree of control of the analysis error exerted by the observations. Equation (18) shows that the analysis error evolved over the assimilation window should be less than the observation error. Figure 13 shows the same information as Figure 11 but using the physical variables qα,pα in which the observations are measured. This shows that for the less rapidly growing variables the evolved analysis error is less than the observation error. In the rapidly growing modes, which are the planet's momentum and the moon's position and momentum, the evolved analysis error is much greater than the observation error. However, in these modes the analysis error itself is still less than the observation error.

Figure 13. As Figure 11, plotted in terms of the physical variables qα,pα.

Table IV shows that the analysis error is between 25% and 70% of the observation error for the various parameters. It also shows the results of another experiment where the observation errors were doubled, but the experimental method was otherwise the same as before. Thus C was separately re-evaluated for each set of cycles using the new observation errors. The analysis errors for the planet and moon are also doubled, but those for the sun are only slightly changed. These results suggest that the analysis error for the moon is being controlled by the observation error, but that (18) does not describe the control that is being exerted. In particular, the relation between analysis error and observation error does not seem to be related to the growth in the window. If we also consider the evidence of Figures 8 and 10 that the evolution of the analysis error over the window to give the next background error shows hardly any growth, there is a consistent message that (16) and (17) do not describe the evolution of the analysis error in practice, so that the analysis error must project largely on to slowly growing modes of the model. The fact that the error is being controlled by the accuracy of the observations, but is much smaller than that given by (18), which describes the information available over a single window, means that the use of observations over multiple windows is essential to the success of the forecasts.

Table IV. The r.m.s. differences between truth and forecast-model trajectories with forecast models as defined in lines 2 and 2a of Table I. Values are averaged over the period used for all assimilation cycles (30 000 time steps).

Analysis errors compared with observation errors

Parameter        q1        q2       q3       p1        p2        p3
Obs              0.0008    0.008    0.02     0.0006    0.001     0.001
Anal             0.00056   0.0035   0.0049   0.00030   0.00048   0.00037
Obs (doubled)    0.0016    0.016    0.04     0.0012    0.002     0.002
Anal             0.00054   0.0050   0.0081   0.00056   0.00091   0.00074
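The quoted range of 25% to 70% can be checked directly from the first two rows of the table:

```python
# Ratios of analysis error to observation error, first experiment in Table IV
obs  = [0.0008, 0.008, 0.02, 0.0006, 0.001, 0.001]    # q1 q2 q3 p1 p2 p3
anal = [0.00056, 0.0035, 0.0049, 0.00030, 0.00048, 0.00037]
ratios = [a / o for a, o in zip(anal, obs)]
# q3 gives the smallest ratio (~0.25) and q1 the largest (~0.70)
```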

Figure 14 shows the evolution of the forecast errors in the experiment with doubled observation errors in the same format as Figure 10. The results for the sun show that the forecasts where the analysis used the first-guess C matrix are 40% worse than with the original observation errors, but that the final set of cycles gave forecasts only 10% worse than those shown in Figure 10, with an error close to that implied by the model error growth. The re-evaluation of C is far more effective at reducing the error than in the original experiment. The results for the moon show increased errors throughout. The forecast errors from the final set of cycles are double those in the original experiment, confirming that the accuracy of the observations is controlling the optimum performance for the moon.

Figure 14. As Figure 10 for the experiment with doubled observation errors. (a) sun's position. (b) moon's position.

Figure 15 shows the effect of using incomplete observations. Observations of momentum only were used, with the original set of observation errors as shown in Table II. Experiments showed that 3D-Var with a guessed diagonal C matrix could not complete a set of 300 cycles. This is consistent with the fact that (18) can no longer control the analysis error. 4D-Var was able to complete the cycles with a diagonal C matrix, and the estimation of a new value of C using (34) could be carried out. It was not possible to carry out as many successive re-evaluations of C before failure of the assimilation as with a complete set of observations, indicating that the control of the error growth exerted by the observations is weaker. The forecast results for the moon in the final iteration show that the performance is degraded compared with a full set of observations. However, the forecasts for the sun are much more responsive to the re-evaluation than in the fully observed experiment, and the medium-range forecast error is substantially lower. The optimum value is now much lower than that implied by the model error growth illustrated in Figure 12.

Figure 15. As Figure 10 for the experiment with observations of momentum only. (a) sun's position. (b) moon's position.

Figure 16 shows an experiment where all the variables were observed four times per assimilation cycle. The results for the moon show that the medium-range forecast errors are larger than those shown in Figure 10 and the reduction in the error due to the re-evaluation of C is less. The analysis increments (not shown) are increased, indicating that the need to fit observations throughout the window prevents the analysis from using small increments and relying on the error growth to fit the observations at the end of the window. The results for the sun show that the effects of the re-evaluation of C have almost disappeared, and the medium-range forecast errors are also greater than those shown in Figure 10 and almost twice as large as those shown in Figure 15. The values are now close to, but still slightly less than, the values implied by model error growth. These results suggest that optimum performance with forecast errors below that implied by model error growth can be achieved with sufficiently accurate observations, but not too many observations over a short period. The use of a lower value of C gives more weight to observations over previous windows, so that control by the observations is maintained over a longer time period.

Figure 16. As Figure 10 for the experiment with four complete sets of observations per cycle. The notation for the iterations is as in Figure 10 with the addition of a multiplication symbol for the final iteration.

Figure 17 shows an experiment where the model error is increased by a factor of 5, so that the mass of the planet shown in Table I is set to 1.05. The experiment is otherwise as shown in Figure 10. The errors in the sun's position are about five times greater than in Figure 10, consistent with the increase in model error. However, the errors in the moon's position only increase by a factor of 1.5, indicating that the model error is not the controlling parameter. The analysis increments (not shown) are increased by a factor of 3, so that the evolved increments are comparable to the model error growth over the window and (19) now holds.

Figure 17. As Figure 16 for the experiment with increased model error.

5. Discussion


We have tested cycled 4D-Var with an imperfect model that supports two time-scales, both of which are regarded as predictable. The assimilation window was chosen to be comparable to the fast time-scale, and the same mix of observations was used in each window. The effect is that the growth of errors on the fast time-scale can be large between observation times, while that of errors on the slow time-scale is small. It is shown that the large growth of analysis increments on the fast time-scale during the window can be exploited to construct a model trajectory that is almost an interpolator between the observations at different times; only a very small increment is added at each analysis time. For large model errors, this increment is such that the evolved increment during an assimilation window is comparable to the model error accumulated over a window. The effect is that the forecast errors can be reduced to values smaller than the model error growth over the same period, provided the observations are accurate enough and used over a sufficiently long time interval. This is confirmed by the sensitivity of the results to the observation error. Though the error is controlled by the accuracy of the observations, the size of the error is much less than that implied by the normal observability calculation for a single window. Thus the accuracy of the forecasts shows that effective use is being made of observations spread over multiple windows. If more observations are used in each window, the effect is degraded, because less use is made of the observations in previous windows and the time interpolation effect, which allows the model error growth to be compensated for, is reduced.
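The cycling mechanism summarized above can be sketched schematically. The code below is a minimal illustration under stated assumptions (linear observation operator H, a single linear analysis per cycle with regularization matrix C in place of the background covariance); the names are illustrative and the real experiments use the three-body model of section 3.

```python
import numpy as np

def cycled_var(model, H, obs_seq, x0, C, R):
    """Schematic cycled variational assimilation: each cycle computes a
    linear analysis using the regularization matrix C in place of the
    background-error covariance, then propagates the analysis with the
    (possibly imperfect) model to give the next background."""
    x_b = np.asarray(x0, dtype=float)
    increments = []
    for y in obs_seq:
        # Gain implied by C:  K = C H' (H C H' + R)^{-1}
        K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)
        x_a = x_b + K @ (y - H @ x_b)   # analysis
        increments.append(x_a - x_b)    # analysis increment
        x_b = model(x_a)                # background for next cycle
    return x_b, np.array(increments)

# Toy usage: persistence model, identity observations
H = np.eye(2); C = np.eye(2); R = np.eye(2)
obs_seq = [np.array([1.0, 0.0])] * 10
x_final, incs = cycled_var(lambda x: x, H, obs_seq, np.zeros(2), C, R)
```

Making C smaller down-weights each individual observation, so the state only converges to the data after many cycles: information is effectively accumulated over multiple past windows, which is the behaviour found here for the poorly observed fast modes.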

The forecast errors in the slow mode, which is well observed, are controlled by the model error. The time interpolation of observations does not take place over a long enough period to compensate for the model error. However, if the observation coverage is degraded, optimization of the C matrix becomes effective in reducing the forecast error and the forecast errors can be reduced below those implied by the model error growth.

In situations where the forecast-error growth is less than the model error growth, the analysis error must be compensating for future model error growth, so the analysis itself is suboptimal. This explains why the standard theory set out in section 2 does not describe what is happening. The behaviour is more like that of a long-window 4D-Var (Trémolet 2006), where small model-state corrections are added periodically to allow a model trajectory to be fitted to a set of observations over a long time period. The present method is a degenerate example of this procedure, where the trajectory is built up in short sequential steps rather than by a simultaneous minimization over the whole window. However, the recalculation of C does introduce a dependence on the whole assimilation period of 300 cycles. It would be of interest in future work to see whether further benefit is obtained from a simultaneous minimization.

There is an important difference from the long-window approach of Fisher et al. (2005) and Trémolet (2006) in that the corrections that give the optimal forecasts are much less than would be implied by the size of the model error. This appears to be mainly because (19) permits the analysis increments to be much less than the model error accumulated over a window for rapidly growing modes of the model. However, the model error used in this study does not conform to the statistical assumptions used to define the model error in Kalman filtering, so there is no reason why that theory should predict the optimal size of the increments.

The experiments shown in this article share the characteristic behaviour of operational systems in that the analysis error is not much less than the background error. The main difference in behaviour is that 3D-Var works well in the operational context but is unable to control error growth in the three-body system. In the three-body system, the typical perturbation growth for the fast variables dominates the model error growth. This results in the best forecasts being obtained with a C that is much less than that implied by model error growth. Assuming that the viability of 3D-Var means that the finite-amplitude error growth in real systems is not large over an assimilation window, which is consistent with the diagnostics in Piccolo (2010), it is likely that the increments in real systems will have to be comparable to the model error.

The issue for practical application is how to choose C. In well-observed situations, such as the slow mode in most of the experiments, the optimum performance is given by setting C to a climatological estimate of the true background error, so the behaviour is consistent with standard theory. However, the optimum performance for the fast mode is obtained by using much smaller values of C. The use of analysis increments to determine C is a practical way of achieving the required behaviour, since the analysis increments found by 4D-Var will tend to be small for rapidly growing modes.

In the three-body system, the best forecasts are not consistently obtained from the C that gives the best analyses. This can be seen in Figures 10 and 16. This illustrates that the C most appropriate for accurate forecasts may not be the same as that where the analysis itself is the most important product, such as in reanalyses. This issue has been widely recognized.

Acknowledgements


The three-body model code and much of the expository material were provided by Gordon Inverarity. Gordon also gave much useful advice and constructive criticism during the project and reviewed the final draft of the article. Andrew Lorenc, the Associate Editor and two anonymous referees also provided valuable suggestions for improvements.

References
