### Abstract

This study focuses on diagnosing variations of background-error covariances between precipitating and non-precipitating areas, and on presenting a heterogeneous covariance formulation to represent these variations in a variational framework. The context of this work is the assimilation of observations linked to precipitation (radar data especially) in the AROME model, which has been running operationally at Météo-France since December 2008 over French territory with a 2.5 km horizontal resolution. This system uses multivariate background-error covariances deduced from an ensemble-based method. First, such statistics have been computed for 17 precipitating cases using an ensemble of AROME forecasts coupled with an ALADIN ensemble assimilation. Results, obtained from 3 h forecast differences computed separately for non-precipitating and precipitating columns, display large discrepancies in error variances, correlation lengths and the correlations between humidity, temperature and divergence errors.

These results argue in favour of including heterogeneous background-error covariances in AROME incremental 3D-Var, allowing different covariances to be used in regions with different meteorological patterns. Such a method enables us to get increments more adequately structured in those regions, and thus potentially to make better use of observations in a data assimilation system. The implementation consists of expressing the analysis increment as the sum of two terms, one for precipitating areas and the other for non-precipitating areas, making use of a mask that defines rainy regions. This implies a doubling in the size of the control variable and of the gradient of the cost function. The feasibility of this method is shown through experiments with four isolated observations. Copyright © 2010 Royal Meteorological Society

### 1. Introduction

In order to provide information in areas with no observations and to supply a realistic reference state for use in observation operators, variational data assimilation needs an *a priori* (or background) meteorological state. The background state is not perfect and its errors are taken into account in the variational system through the use of the background-error covariance matrix, denoted **B**. As pointed out by Daley (1991), **B** has a profound impact on the analysis, by (i) weighting the importance of the *a priori* state, (ii) smoothing and spreading information from observation points, and (iii) imposing balance between the model control variables. Accurate knowledge of background-error statistics is thus important to the success of the assimilation process. To estimate the background error, two main practical difficulties occur. First, the ‘true’ state is unknown. This can be addressed by using differences between forecasts as a proxy for background errors (see reviews by Berre *et al.*, 2006; Bannister, 2008). Second, because of its size, **B** can be neither estimated at full rank nor stored explicitly. This is commonly addressed by simplifying the covariance model as the product of a balance operator, which takes multivariate couplings into account, and of a spatial transform, which often uses assumptions such as homogeneity and isotropy.
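For reference, the three roles of **B** listed above can be read off the generic incremental variational problem. The block below is the standard textbook form rather than a formula quoted from this paper; the symbols $\delta\mathbf{x}$ (analysis increment), $\mathbf{d}$ (innovation), $\mathbf{H}$ (linearized observation operator), $\mathbf{R}$ (observation-error covariance matrix), $\boldsymbol{\chi}$ (control variable), $\mathbf{K}$ (balance operator) and $\mathbf{B}_s$ (spatial covariance of the unbalanced variables) are notational assumptions.

```latex
\begin{aligned}
J(\delta\mathbf{x}) &= \tfrac{1}{2}\,\delta\mathbf{x}^{T}\mathbf{B}^{-1}\delta\mathbf{x}
  + \tfrac{1}{2}\,(\mathbf{d}-\mathbf{H}\,\delta\mathbf{x})^{T}\mathbf{R}^{-1}(\mathbf{d}-\mathbf{H}\,\delta\mathbf{x}),\\
\delta\mathbf{x} &= \mathbf{B}^{1/2}\boldsymbol{\chi}
  \;\;\Rightarrow\;\;
  J(\boldsymbol{\chi}) = \tfrac{1}{2}\,\boldsymbol{\chi}^{T}\boldsymbol{\chi}
  + \tfrac{1}{2}\,(\mathbf{d}-\mathbf{H}\mathbf{B}^{1/2}\boldsymbol{\chi})^{T}\mathbf{R}^{-1}(\mathbf{d}-\mathbf{H}\mathbf{B}^{1/2}\boldsymbol{\chi}),\\
\mathbf{B}^{1/2} &= \mathbf{K}\,\mathbf{B}_s^{1/2},\\
\delta\mathbf{x}^{a} &= \mathbf{B}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{T}+\mathbf{R}\right)^{-1}\mathbf{d}.
\end{aligned}
```

The last line makes the point explicit: the minimizing increment is a combination of columns of **B**, so the variances, correlation lengths and cross-variable couplings stored in **B** directly set the amplitude, the spatial spread and the multivariate balance of the analysis increment.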

The AROME numerical weather prediction (NWP) system, which provides operational forecasts of tropospheric phenomena with a 2.5 km horizontal resolution over France, uses a climatological multivariate background-error covariance matrix deduced from an ensemble-based method (Brousseau *et al.*, 2008). The ensemble of forecasts built for that purpose gathers summer and winter cases in order to compute background-error covariances representative of a wide range of forecast errors. However, it is well known that the structure functions that compose the covariances depend strongly on the weather regime. For instance, large differences between midlatitude and tropical error covariances have been shown at synoptic (Daley, 1991; Derber and Bouttier, 1999) and regional (Montmerle *et al.*, 2006) scales. More recently, Caron and Fillion (2010) have shown that, compared to dry areas, forecast errors drift away from linear geostrophic balance over precipitation areas and that this deviation is proportional to the intensity of precipitation. In variational data assimilation, such discrepancies have a direct impact on the structures of analysis increments and on multivariate couplings through balance relationships. As a consequence, climatological covariances often produce sub-optimal increment structures in regions characterized by strong gradients such as precipitating fronts or the top of the boundary layer. This is particularly true when assimilating observations linked to precipitation such as radar data, because rainy areas are likely to be under-represented in the ensemble.

As pointed out by Auligné *et al.* (2010), one of the main issues in analyzing clouds and precipitation at the mesoscale is the development of methods that bring more flow-dependency to the background-error covariances. By applying specific diabatic balance operators in the linear balance operator proposed by Derber and Bouttier (1999), Caron and Fillion (2010) show that some improvements in the coupling between mass and rotational wind increments could be obtained over precipitating areas. However, the horizontal variation of this type of diabatic balance to be used during the minimization is not addressed in the latter study. One way to get some flow dependency is to consider nonlinear balance relationships in the balance operator, such as the nonlinear geostrophic balance equation (Barker *et al.*, 2004; Fisher, 2003) and the quasi-geostrophic omega balance (Fisher, 2003). The latter approach is properly justified for synoptic scales, but for mesoscales prognostic rather than diagnostic balance equations (where moist physical processes play a dominant role to the leading order of approximation) may be more appropriate (e.g. Pagé *et al.*, 2007). One could also use 4D-Var instead of 3D-Var, the evolution of **B** in 4D-Var corresponding to the evolution of the error covariances in a Kalman filter (Fisher *et al.*, 2005). However, in this case, flow-dependency is somewhat limited because **B** is replaced by a static estimate at the beginning of each assimilation cycle. Most effort is now devoted to the use of ensemble-based flow-dependent background-error covariances, which consists of including (partially or totally) in 3D-Var or 4D-Var a flow-dependent **B** matrix computed from daily runs of an ensemble assimilation system (e.g. Kucukkaraca and Fisher, 2006; Berre *et al.*, 2007).
Although very attractive (weather-dependent covariances, sharper correlations), this method is computationally very expensive, especially for cloud-resolving models (CRMs). A solution is to consider an ensemble with few members and use optimized filtering techniques to reduce sampling error in **B**. Two main approaches are currently used. The first uses a control variable transform in the ensemble sub-space (En3D-Var: Lorenc *et al.*, 2003; En4D-Var: Liu *et al.*, 2008), with localization by Schur operators to reduce the effects of sampling noise. The second approach, which is developed and used operationally at Météo-France (Desroziers *et al.*, 2008), is based on spectral filtering of error standard deviations (Raynaud *et al.*, 2009). Flow-dependent ensemble-based correlations may be represented and filtered using a wavelet approach (Fisher, 2003; Pannekoucke *et al.*, 2007).

We propose here an alternative approach allowing specifically computed background-error covariances to be applied in regions characterized by different meteorological situations. In this paper, these different regions are precipitating and non-precipitating areas. The goal of such a method is to make better use of observations through the application of more adequate multivariate relationships and through a better localization of increments, while avoiding the computation of a daily ensemble of forecasts. This idea has been briefly mentioned in Courtier *et al.* (1998; their Eq. (31)), who propose such a formulation in order to use different length-scales depending on geographical location, allowing for broader length-scales in data-sparse areas than in data-rich areas. A similar idea has been considered at Environnement Canada (M. Buehner, personal communication) for representing latitude-dependent covariances.

By applying geographical masks to differences of AROME forecasts valid at the same time, section 2 presents to what extent modelled covariances can differ in precipitating and non-precipitating areas over France for a CRM that explicitly represents convective processes. The large differences that have been found lead us to include a heterogeneous background-error covariance matrix in AROME incremental 3D-Var, allowing different covariances to be used in precipitating and non-precipitating areas. The theoretical aspects of such a method are discussed in section 3, followed by a proof of concept based on a simple four-observation experiment in section 4.
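The masking step described above can be illustrated with a minimal Python sketch. It assumes that the forecast differences and the rain mask are already available as flat sequences of equal length; the function name `masked_error_std`, the sample values and the use of a plain population standard deviation are illustrative assumptions, not the diagnostics actually used in section 2. No normalization of the differences is applied here.

```python
import statistics

def masked_error_std(differences, rain_mask):
    """Split forecast differences by a rain mask and return
    (std over rainy points, std over non-rainy points)."""
    rainy = [d for d, is_rainy in zip(differences, rain_mask) if is_rainy]
    dry = [d for d, is_rainy in zip(differences, rain_mask) if not is_rainy]
    # Population standard deviation of each sub-sample.
    return statistics.pstdev(rainy), statistics.pstdev(dry)

# Hypothetical differences between two forecasts valid at the same time.
differences = [2.0, -2.0, 0.5, -0.5]
rain_mask = [True, True, False, False]
std_rainy, std_dry = masked_error_std(differences, rain_mask)
```

The same split can be applied level by level and variable by variable, which yields one set of statistics per region instead of a single climatological set.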

### 4. Results of assimilation with four observations

To ensure the reliability of the new formulation of the variational system described in the previous section, three different experiments have been performed:

**CNTRL-Bnp** checks the impact on the analysis of the non-precipitating **B**_{np} matrix, using the standard formulation of the variational system (i.e. considering only one **B** matrix). For that purpose, four pseudo-observations, located at 48°N, 4.5°E and 42.5°N, 4.5°E at 800 and 500 hPa, are assimilated. These pseudo-observations have been generated by considering −30% relative humidity innovations (i.e. observation minus background) at those locations.

**CNTRL-Bp** is the equivalent of CNTRL-Bnp but using the precipitating **B**_{p} matrix.

**Bnp-Bp** uses the hybrid formulation of Eq. (4) and other ingredients listed in the previous section in the variational system, considering that the northern (i.e. north of 46.5°N) and southern halves of the domain are precipitating and non-precipitating areas respectively.

The large differences of correlation lengths between the two **B** matrices displayed in Figure 2 directly impact on the structure of *q* and *T* increments, much tighter increments being obtained in CNTRL-Bp than for CNTRL-Bnp (Figures 10 and 9 respectively). A more pronounced impact on the wind field can be seen for CNTRL-Bp, due to differences in error balances. The Bnp-Bp experiment displays what was expected: in the ‘rainy’ northern part and in the ‘non-precipitating’ southern part of the domain, Figure 11 shows exactly the same increment structures as CNTRL-Bp and CNTRL-Bnp respectively. This is a proof of concept that increments with very different behaviours, in terms of intensities and shapes, can be obtained simultaneously using the heterogeneous **B** matrix formulation, and that different multivariate relationships can be used over different areas in a more adaptive way.
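The behaviour reported for Bnp-Bp can be reproduced qualitatively on a one-dimensional toy grid. The Python sketch below is not the AROME implementation: it builds the covariance implied by a binary mask explicitly, as `mask_i * mask_j * B_p + (1 - mask_i) * (1 - mask_j) * B_np`, which is one reading of the two-term formulation, and uses the closed-form increment for a single direct observation. The grid size, variances, correlation lengths, observation-error variance and all function names are hypothetical.

```python
import math

def gaussian_cov(n, sigma, length):
    """Homogeneous covariance with variance sigma**2 and a Gaussian
    correlation of length scale `length` (in grid units)."""
    return [[sigma ** 2 * math.exp(-0.5 * ((i - j) / length) ** 2)
             for j in range(n)] for i in range(n)]

def heterogeneous_cov(b_rain, b_dry, mask):
    """Covariance implied by a binary mask (1 = rainy, 0 = non-rainy)."""
    n = len(mask)
    return [[mask[i] * mask[j] * b_rain[i][j]
             + (1 - mask[i]) * (1 - mask[j]) * b_dry[i][j]
             for j in range(n)] for i in range(n)]

def single_obs_increment(b, k, innovation, obs_var):
    """Increment for one direct observation at grid point k:
    column k of B (B is symmetric) times innovation / (B[k][k] + obs_var)."""
    gain = innovation / (b[k][k] + obs_var)
    return [row[k] * gain for row in b]

n = 40
mask = [1 if i >= n // 2 else 0 for i in range(n)]  # upper half is "rainy"
b_rain = gaussian_cov(n, sigma=1.0, length=2.0)     # shorter correlation length
b_dry = gaussian_cov(n, sigma=1.0, length=4.0)
b_het = heterogeneous_cov(b_rain, b_dry, mask)

# One pseudo-observation in each region, with a negative innovation.
inc_rain = single_obs_increment(b_het, 30, -0.3, 0.1)
inc_dry = single_obs_increment(b_het, 10, -0.3, 0.1)
ctrl_rain = single_obs_increment(b_rain, 30, -0.3, 0.1)
```

With a binary mask the two regions are uncorrelated, so `inc_rain` coincides with the homogeneous `ctrl_rain` increment inside the rainy half and vanishes outside it, while `inc_dry` decays more slowly with distance because of its longer correlation length.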

To check how the variational system behaves near the precipitating/non-precipitating border, one extra experiment that considers four pseudo-observations located near 46.6°N, 4.5°E and 46.3°N, 4.5°E has been performed. Strongly anisotropic vertical structures of increments are displayed in Figure 12 for that case. This shows that, using the heterogeneous background-error covariance formalism, the information brought by observations of precipitation such as radar reflectivities or radial velocities could be localized more realistically and used more optimally in precipitating areas, without spreading too much towards non-precipitating regions, thanks to the masking and the use of smaller correlation lengths. To smooth increment structures in the transition zone, however, recent studies suggest that a Gaussian transition between masks in **D** is well suited.
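One possible way to build such a smooth transition is to convolve the binary mask with a normalized Gaussian kernel. The Python sketch below is only an illustration under that assumption: the function name `smooth_mask`, the kernel width, the truncation at three widths and the edge treatment are hypothetical choices, and how the resulting weights enter **D** is not specified here.

```python
import math

def smooth_mask(mask, width):
    """Convolve a 1D binary mask with a normalized Gaussian kernel of
    standard deviation `width` (grid units), replicating edge values."""
    half = math.ceil(3 * width)  # truncate the kernel at about 3 widths
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (k / width) ** 2) for k in offsets]
    total = sum(kernel)
    n = len(mask)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, weight in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)  # clamp the index at the edges
            acc += weight * mask[j]
        smoothed.append(acc / total)
    return smoothed

binary_mask = [0] * 10 + [1] * 10      # non-rainy half, then rainy half
rainy_weight = smooth_mask(binary_mask, width=2.0)
dry_weight = [1.0 - w for w in rainy_weight]  # complementary weight
```

The weights stay between 0 and 1 and sum to one at every grid point, so the two covariance models are blended across the border instead of being switched abruptly.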

### 5. Conclusions

When assimilating observations at the mesoscale, such as satellite radiances in clear-air conditions or radar data in precipitating areas, one can often notice unrealistic increments. Typically, an isolated observation spreads information too far spatially, because the background-error covariance matrix propagates the innovation (observation minus background) with no regard to the surrounding meteorological conditions. In the French operational model at the mesoscale (AROME), such a matrix contains climatological values computed from an ensemble assimilation that gathers winter and summer situations, precipitating and non-precipitating cases. Thus, it poorly represents strong spatial and temporal meteorological variations, such as the overpass of a narrow rainband or the presence of strongly convective conditions. In order to quantify this misrepresentation, background-error covariances have first been diagnosed for precipitating and non-precipitating areas. For that purpose, an ensemble of six AROME forecasts, nested in an ensemble of six ALADIN forecasts initialized from perturbed analyses and coupled to perturbed ARPEGE forecasts, has been built for 17 convective situations. Statistics have been computed from forecast differences, considering rainy and non-rainy profiles separately in those forecasts. Compared to non-rainy areas, precipitating areas are mainly characterized by:

- (i) larger error standard deviations for *η* and *ζ* (and for *T*, to a lesser extent), which denotes more intense small-scale dynamical activity,
- (ii) smaller error standard deviations for *q*,
- (iii) 50% smaller correlation lengths for *q* and *T*,
- (iv) larger vertical auto-correlations in the mid-troposphere, reflecting the stronger vertical mixing within clouds performed by the explicitly resolved convection, and
- (v) very different contributions, in scale and in intensity, to the explained *q* error variances, due to different effects of precipitating clouds such as the presence of a low-level cold pool, low-level convergence, latent heat release, and cloud-top divergence.

Thanks to smaller correlation lengths, using such statistics exclusively in precipitating regions would allow the number of observations such as radar data to be increased, permitting more realistic analysis of precipitating structures. Furthermore, the impact of such observations would be optimized through the different auto-covariances and couplings between variables which favour convective activity. This has motivated us to find a heterogeneous covariance formulation to represent different background-error covariances over different regions.

An original method, which consists of expressing the increment as a linear combination of two terms, characterizing precipitating and non-precipitating areas respectively, has been developed to this end. This method implies the doubling of the size of the control variable and of the gradient of the cost function. A mask, which specifies where each of the two matrices should be applied, must also be defined. The feasibility of such a method has been shown by performing academic experiments using four observations and by considering basic masks. The experiment that uses the heterogeneous **B** matrix formulation displays increments with strongly different shapes in pseudo-non-precipitating and pseudo-precipitating areas. Thus, this approach addresses some of the main issues that have been pointed out by Caron and Fillion (2010), namely the use of different horizontally varying scale-dependent correlations between mass, humidity, vorticity and divergence increments. In real-case experiments, however, one important issue may be the conservation of the subsequent structures of the analysis increments in the forecast, considering the spin-up of precipitation and the possible generation of spurious gravity waves.
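The doubling of the control variable and of the gradient follows directly from writing the increment as a sum of two terms. The block below is a sketch in assumed notation, not a reproduction of Eq. (4): $\mathbf{F}_{p}^{1/2}$ and $\mathbf{F}_{np}^{1/2}$ stand for the operators that apply the precipitating and non-precipitating masks, $\boldsymbol{\chi}_{p}$ and $\boldsymbol{\chi}_{np}$ for the two halves of the control variable, and $\mathbf{d}$, $\mathbf{H}$ and $\mathbf{R}$ for the innovation, the linearized observation operator and the observation-error covariance matrix.

```latex
\begin{aligned}
\delta\mathbf{x} &= \mathbf{F}_{p}^{1/2}\mathbf{B}_{p}^{1/2}\boldsymbol{\chi}_{p}
  + \mathbf{F}_{np}^{1/2}\mathbf{B}_{np}^{1/2}\boldsymbol{\chi}_{np}
  = \mathbf{U}\boldsymbol{\chi},
\qquad
\mathbf{U} = \begin{pmatrix}\mathbf{F}_{p}^{1/2}\mathbf{B}_{p}^{1/2} & \mathbf{F}_{np}^{1/2}\mathbf{B}_{np}^{1/2}\end{pmatrix},
\quad
\boldsymbol{\chi} = \begin{pmatrix}\boldsymbol{\chi}_{p}\\ \boldsymbol{\chi}_{np}\end{pmatrix},\\
J(\boldsymbol{\chi}) &= \tfrac{1}{2}\,\boldsymbol{\chi}^{T}\boldsymbol{\chi}
  + \tfrac{1}{2}\,(\mathbf{d}-\mathbf{H}\mathbf{U}\boldsymbol{\chi})^{T}\mathbf{R}^{-1}(\mathbf{d}-\mathbf{H}\mathbf{U}\boldsymbol{\chi}),\\
\nabla_{\boldsymbol{\chi}}J &= \boldsymbol{\chi}
  - \mathbf{U}^{T}\mathbf{H}^{T}\mathbf{R}^{-1}(\mathbf{d}-\mathbf{H}\mathbf{U}\boldsymbol{\chi}),\\
\mathbf{B} &= \mathbf{U}\mathbf{U}^{T}
  = \mathbf{F}_{p}^{1/2}\mathbf{B}_{p}\mathbf{F}_{p}^{T/2}
  + \mathbf{F}_{np}^{1/2}\mathbf{B}_{np}\mathbf{F}_{np}^{T/2}.
\end{aligned}
```

Under these assumptions $\boldsymbol{\chi}$ and $\nabla_{\boldsymbol{\chi}}J$ each have twice the length of their counterparts in the single-**B** system, while the observation term is unchanged, and the implied covariance is the sum of the two masked covariance models.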

Tests on such real cases are ongoing using the operational AROME 3D-Var and by considering both reflectivities (as in Wattrelot *et al.*, 2008) and Doppler velocities (as in Montmerle and Faccani, 2009) from the French ARAMIS radar network as additional data in the variational process. For these experiments, the mask is deduced from an interpolation of reflectivities from the first elevations onto the AROME grid. Other applications will be addressed at the mesoscale, especially the use of heterogeneous background-error covariances devoted to the analysis of fog.