## 1. Introduction

Two- and three-dimensional elliptic partial differential equations (PDEs) play a major role in numerical weather prediction (NWP), and their solution is often the main computational bottleneck.

In NWP, the state of the atmosphere is described using a set of ‘model variables’ such as wind velocity, pressure, moisture and temperature. Data assimilation (DA) is the process of finding the best estimate of the current state of the atmosphere by combining a previous forecast and the knowledge of atmospheric dynamics with observational data and statistical data that measure the accuracy of the forecast and of the observations. Uncertainties are quantified with probability density functions, and a model state is found that is the statistically optimal estimate of the truth given the previous forecast and new observations, as well as estimates of the errors in each of them. Due to the chaotic nature of the governing equations in the model, any errors in the initial conditions will be amplified in the forecast. Thus, the benefits of continuous advances in computational power and in numerical methods cannot be fully realised in NWP without accurate data assimilation techniques.

A very successful and popular data assimilation technique, used e.g. in the UK Met Office's DA software VAR, is incremental 4D-VAR described by Lorenc (1986), Rawlins *et al.* (2007), and Katz *et al.* (2011). This method attempts to minimise a cost function with respect to the forecast error (or increment) **x**′ = **x** − **x**_{f} of the model variables, where **x** is the true model state and **x**_{f} is the previous forecast. It relies on the inversion of the ‘background error covariance’ matrix **B**, which measures the uncertainties of the background state, i.e. an approximation of the current state of the atmosphere produced from a previous forecast. Unfortunately, **B** is a dense matrix due to the strong correlations between errors in the model variables. Moreover, **B** is a large matrix of size *O*(10^{7} × 10^{7}), hence it is too large to store explicitly, and attempting to invert it is operationally impractical.

However, the VAR problem can be expressed in terms of new variables that simplify the background error covariance matrix, in a process known as the ‘control variable transformation’ (CVT) or parameter transform (Bannister, 2008). The CVT is the transformation between the model variables and new ‘control variables’, whose errors may be assumed uncorrelated to a reasonable approximation. By expressing the VAR problem in terms of the control variables, it is possible to approximate the **B**-matrix by a block-diagonal matrix, neglecting correlations between different control variables. Further transformations can then be used to spatially decorrelate the variables so that the **B**-matrix essentially becomes a diagonal matrix (see Ingleby, 2001).
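The effect of such a change of variables can be illustrated on a toy problem. The sketch below is not the operational CVT: it assumes the standard quadratic background term **x**′ᵀ**B**⁻¹**x**′, uses a small hypothetical covariance matrix with exponentially decaying correlations, and takes the Cholesky factor **U** (with **B** = **UU**ᵀ) as a stand-in for the transformation. The function names are illustrative only.

```python
import numpy as np

def make_toy_covariance(n, length_scale=3.0):
    """Hypothetical dense covariance: correlations decay exponentially with index distance."""
    idx = np.arange(n)
    return np.exp(-np.abs(idx[:, None] - idx[None, :]) / length_scale)

def control_variable_transform(B):
    """Toy stand-in for the CVT: lower-triangular U with B = U @ U.T."""
    return np.linalg.cholesky(B)

B = make_toy_covariance(6)
U = control_variable_transform(B)

# Covariance of the control variables v = U^{-1} x' is U^{-1} B U^{-T}, i.e. the identity.
U_inv = np.linalg.inv(U)
B_control = U_inv @ B @ U_inv.T

# The background term x'^T B^{-1} x' becomes the plain sum of squares v^T v.
rng = np.random.default_rng(0)
x_inc = rng.standard_normal(6)           # an arbitrary increment x'
v = np.linalg.solve(U, x_inc)            # control variables
J_model = x_inc @ np.linalg.solve(B, x_inc)
J_control = v @ v

print(np.round(B_control, 12))
print(J_model, J_control)
```

In this toy setting the dense matrix **B** never has to be inverted once the problem is posed in terms of **v**. In practice **U** is not computed by factorising **B**; it is built from physically motivated operators, which is where the elliptic solves discussed below enter.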

The idea of the CVT is to partition errors into ‘balanced’ and ‘unbalanced’ components which are assumed mutually uncorrelated. ‘Balanced’ flows are those which are close to geostrophic and hydrostatic balance. Following Hoskins *et al.* (1985), it is known that the evolution of such flows can be described by the transport of the potential vorticity (PV) together with a diagnostic calculation of the mass and wind fields. This calculation involves solving two- and three-dimensional elliptic problems which can be computationally very costly. For example, an option in the current CVT is the solution of the quasi-geostrophic Omega (QG-Ω) equation for finding the vertical velocity increment (see Fisher, 2003). A robust numerical method for this equation has recently been developed by Buckeridge and Scheichl (2010).

Operationally at the UK Met Office, the control variables used are the streamfunction *ψ*, the unbalanced pressure *p*_{u} and the velocity potential *χ*. The streamfunction is used to represent the balanced component of the flow. However, that assumption cannot be applied on scales larger than the Rossby radius of deformation, or aspect ratios greater than *f/N* where *f* is the Coriolis parameter and *N* the Brunt-Väisälä frequency, meaning that the balance operator has to be disabled on those scales.

A better decorrelation of the variables can be obtained by a CVT based on PV, as described and formulated by Cullen (2003), Wlasak *et al.* (2006) and Bannister *et al.* (2007). The balanced components of the flow are described by the PV, whilst the unbalanced components have no PV and are therefore associated with so-called ‘anti-PV’. Using this premise, a new control variable is chosen that is related to PV to represent the balanced component of the forecast error. The control variables used in this new ‘PV-based’ CVT (see Bannister and Cullen, 2009) are the balanced streamfunction *ψ*_{b}, the unbalanced pressure *p*_{u} and the velocity potential *χ*. The new formulation recognises the presence of an unbalanced streamfunction and exploits the association between PV and the balanced component of the flow. Hence, it should not suffer from the shortcomings of the vorticity-based CVT. However, it introduces a new highly ill-conditioned PDE, namely the ‘balanced PV-equation’ (see Bannister and Cullen, 2007):

where is the 2D Laplacian in spherical polar coordinates (*r,ϕ,λ*). The coefficients *α*_{0}(·,·), *β*_{0}(·,·), *γ*_{0}(·,·), *ε*_{0}(·,·) and *m*_{0}(·,·) are reference state values, specified in Appendix A (see also Bannister and Cullen, 2009), that only vary with latitude and radius. *PV*′(·,·,·) is the potential vorticity increment. The balanced PV-equation (1) has to be solved for the balanced streamfunction increment *ψ*_{b}′(·,·,·). However, it requires the balanced pressure increment *p*_{b}′(*r,*·,·) which is the solution to the linear balance equation (see Bannister and Cullen, 2007):

where *ρ*_{0} is a reference value of the density and *f* is the Coriolis parameter. Note that for a constant Coriolis parameter, i.e. *f*(*ϕ*) = *f*_{0}, (2) reduces to . In this case, (1) simplifies to a standard elliptic PDE, similar to the QG-Ω equation, that can be solved optimally using the novel multigrid method of Buckeridge and Scheichl (2010), which is based on a conditional semi-coarsening strategy. Appendix C briefly explains some of the mathematical jargon related to the iterative solution of elliptic linear systems.

For global computations it is not reasonable to assume that *f*(*ϕ*) is constant, and so, due to the embedded 2D elliptic solves for *p*_{b}′, the multigrid method in Buckeridge and Scheichl (2010) cannot be applied directly to the 3D problem (1). Instead we have to resort to a Krylov subspace method (such as Conjugate Gradients) and precondition it by applying multigrid to a simplified form of (1), similar to the QG-Ω equation. This preconditioning system can be solved optimally (i.e. robustly with respect to grid refinement) using the method in Buckeridge and Scheichl (2010). Since, asymptotically, as the mesh size goes to zero, the simplified system has essentially the same spectrum as the full system (1), the optimal convergence of the multigrid preconditioner for the simplified system (demonstrated already in Buckeridge and Scheichl (2010)) also translates into optimal convergence of the resulting preconditioned Krylov method for (1). Numerical tests confirm that the number of iterations does not grow with problem size.
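The spectral-equivalence argument can be seen in a minimal one-dimensional sketch. It is not the solver used in this paper. The ‘full’ operator is assumed to be a variable-coefficient problem −(*a u*′)′ with a hypothetical coefficient *a*(*x*) = 1 + 0.5 sin(2π*x*), the ‘simplified’ operator is the constant-coefficient case *a* ≡ 1, and an exact dense solve with the simplified matrix stands in for a multigrid cycle. All names are illustrative.

```python
import numpy as np

def stiffness_matrix(coeff):
    """Finite-difference matrix for -(a u')' with zero Dirichlet ends (up to a factor 1/h^2).
    coeff holds a at the n+1 cell interfaces; the result is n x n."""
    main = coeff[:-1] + coeff[1:]
    off = -coeff[1:-1]
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

def pcg(A, b, apply_precond, tol=1e-8, max_iter=5000):
    """Preconditioned Conjugate Gradients; returns the solution and the iteration count."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_precond(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = apply_precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

plain_iters, precond_iters = [], []
for n in (100, 200, 400):
    h = 1.0 / (n + 1)
    x_half = (np.arange(n + 1) + 0.5) * h                             # interface positions
    A = stiffness_matrix(1.0 + 0.5 * np.sin(2.0 * np.pi * x_half))    # "full" operator
    M = stiffness_matrix(np.ones(n + 1))                              # "simplified" operator
    b = np.ones(n)
    _, k_plain = pcg(A, b, lambda r: r)                               # no preconditioner
    _, k_prec = pcg(A, b, lambda r: np.linalg.solve(M, r))            # exact simplified solve
    plain_iters.append(k_plain)
    precond_iters.append(k_prec)
    print(n, k_plain, k_prec)
```

Because 0.5 ≤ *a* ≤ 1.5, the preconditioned operator in this sketch has a condition number of at most 3 for every *n*, so the preconditioned iteration count stays bounded as the grid is refined, whereas the unpreconditioned count grows with *n*. Replacing the exact solve with one or more multigrid cycles keeps this behaviour only if the multigrid method itself is robust with respect to grid refinement.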

The Krylov method is only optimal when preconditioned with a robust multigrid method. With standard preconditioners, such as ADI-type preconditioners used at the Met Office (see Birkhoff *et al.*, 1962), the number of iterations grows with problem size and the method actually fails to converge beyond a residual tolerance of 10^{−1}.

The rest of the paper is organised as follows. In Section 2, the individual steps in the PV-based CVT are described. Section 3 describes the discretisation of a general class of elliptic problems in spherical polar coordinates by means of a finite volume method on the type of grids used in VAR. In particular, this will cover the simplified preconditioning system. Section 4 then gives the results for solving this system numerically using the multigrid method in Buckeridge and Scheichl (2010) confirming its optimality in solving elliptic problems in spherical polar coordinates. In Section 5, we use a Krylov subspace method to devise an optimal solver with a multigrid preconditioner for (1). We test the performance of the method and make comparisons with the method currently employed at the Met Office using actual case studies in VAR. We then use this solver to test the accuracy of the complete PV-based CVT in Section 6. Finally, in Section 7 we confirm the parallel scalability of the code on a typical multi-core architecture.