## 1. Introduction

[2] A typical Eulerian atmospheric chemistry-transport model (CTM) computes the concentrations *c* of a set of chemical species by solving the system of advection-diffusion-reaction equations

$$\frac{\partial c_i}{\partial t} = -\operatorname{div}\left(\mathbf{V} c_i\right) + \operatorname{div}\left(\rho \mathbf{D} \nabla \frac{c_i}{\rho}\right) + \chi_i(c, t) + E_i,$$

where *c*_{i} is the concentration of the *i*th species, **V** is the wind velocity, *ρ* is the air density, **D** is the turbulent diffusion matrix, *χ*_{i}(*c*, *t*) stands for the species production and loss due to the chemical reactions, and *E*_{i} stands for the elevated emissions. At the ground, the boundary condition is given by

where **n** is the upward unit vector, *S*_{i} stands for the surface emissions, and *v*_{i}^{dep} is the dry deposition velocity.

[3] In the numerical model (the CTM), the dimension of the discretized system is usually 10^{6}–10^{7}. Given the initial conditions and the input data (also referred to herein as parameters), the model computes, for instance, hourly ozone concentrations over Europe.

[4] Data assimilation can be considered as the determination of the initial conditions or of uncertain model parameters by combining heterogeneous sources of information, for example, model simulations, observations, and error statistics. Data assimilation methods are broadly classified into variational and sequential ones [*Le Dimet and Talagrand*, 1986; *Evensen*, 1994]. The objective of the former can be defined as state estimation by minimizing the quadratic discrepancy between the model simulation and a block of observations, usually combined with a priori background knowledge. This can be formalized and solved efficiently with optimal control theory. The sequential methods make use of observations as soon as they become available. Since this is a filtering process, filter theory (linear or nonlinear) applies.
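To make the distinction concrete, the following is a minimal scalar sketch, not one of the algorithms implemented in this paper. It assumes a single state variable that is observed directly, with Gaussian background and observation errors; the names and values (`x_b`, `b`, `y`, `r`) are hypothetical. In this linear setting, minimizing the quadratic cost function and applying the filter analysis update should give the same estimate.

```python
def cost_gradient(x, x_b, b, y, r):
    """Gradient of J(x) = (x - x_b)^2 / (2 b) + (y - x)^2 / (2 r)."""
    return (x - x_b) / b - (y - x) / r


def variational_analysis(x_b, b, y, r, n_iter=100):
    """Variational view: minimize J by gradient descent from the background."""
    # J is quadratic with constant second derivative 1/b + 1/r.
    # Half of its inverse is a safe step size (error halves at each iteration).
    step = 0.5 / (1.0 / b + 1.0 / r)
    x = x_b
    for _ in range(n_iter):
        x -= step * cost_gradient(x, x_b, b, y, r)
    return x


def sequential_analysis(x_b, b, y, r):
    """Sequential view: scalar Kalman-type update when the observation arrives."""
    gain = b / (b + r)                # weight given to the observation
    x_a = x_b + gain * (y - x_b)      # analysis state
    var_a = (1.0 - gain) * b          # analysis error variance
    return x_a, var_a


if __name__ == "__main__":
    # Hypothetical background (value, variance) and observation (value, variance).
    x_b, b = 80.0, 100.0
    y, r = 60.0, 25.0
    print(variational_analysis(x_b, b, y, r))
    print(sequential_analysis(x_b, b, y, r))
```

The equivalence holds only for this linear, Gaussian toy case. With a nonlinear model and observations distributed in time, the two families generally differ in both cost and result.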

[5] Both families of methods have been applied to CTMs. The pioneering work dates back to *Fisher and Lary* [1995]. On the variational side, *Elbern and Schmidt* [2001] use a comprehensive model rather than an academic one in order to assimilate real observations and assess the resulting ozone forecasts. *Chai et al.* [2007] follow with the assimilation of new types of observations and investigate in detail several practical issues, for example, background error modeling. Very few studies deal with the assimilation of initial conditions jointly with uncertain parameters [*Elbern et al.*, 2007]. On the sequential side, *Segers* [2002] conducts in-depth studies on the application of efficient filtering methods, in which emissions, photolysis rates, and deposition are considered to be uncertain. Both the model state and the uncertain parameters are estimated. *Constantinescu et al.* [2007b] report the filtering results obtained with perturbations on emissions and on boundary conditions, and with distance constraints on the spatial correlations.

[6] All these efforts are part of the recent diffusion of data assimilation expertise from numerical weather prediction (NWP) to the air quality community. For a review, see *Carmichael et al.* [2008]. The CTMs are stiff but stable systems with high uncertainties [*Hanna et al.*, 1998]; perturbations on the initial conditions tend to be smoothed out rather than amplified. Therefore the conclusions drawn from meteorological experience [*Lorenc*, 2003; *Kalnay et al.*, 2007] cannot be applied directly.
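The damping of initial-condition perturbations can be illustrated with a toy example. The sketch below is a hypothetical single-box model with a constant emission `emission` and a linear loss rate `loss_rate`, integrated with the implicit Euler scheme. It is not the CTM used in this study, and all names and values are illustrative assumptions.

```python
def implicit_euler_box(c0, emission, loss_rate, dt, n_steps):
    """Integrate dc/dt = emission - loss_rate * c with implicit Euler."""
    c = c0
    trajectory = [c]
    for _ in range(n_steps):
        # Implicit step: c_new = c + dt * (emission - loss_rate * c_new)
        c = (c + dt * emission) / (1.0 + dt * loss_rate)
        trajectory.append(c)
    return trajectory


def perturbation_decay(c0, delta, emission, loss_rate, dt, n_steps):
    """Gap between a reference run and a run whose initial state is shifted by delta."""
    reference = implicit_euler_box(c0, emission, loss_rate, dt, n_steps)
    perturbed = implicit_euler_box(c0 + delta, emission, loss_rate, dt, n_steps)
    return [abs(p - q) for p, q in zip(perturbed, reference)]


if __name__ == "__main__":
    # Hypothetical values: 24 steps of unit length.
    gaps = perturbation_decay(c0=50.0, delta=10.0, emission=2.0,
                              loss_rate=0.1, dt=1.0, n_steps=24)
    print(gaps[0], gaps[-1])
```

In this toy model each step multiplies the gap by 1 / (1 + `dt` * `loss_rate`), which is smaller than one, so the two runs converge toward the same equilibrium `emission / loss_rate`. This is the opposite of the error growth that motivates many NWP assimilation strategies.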

[7] The objective of this paper is to evaluate different assimilation algorithms for ozone forecasts under the same experimental settings. We hope that this can serve as a reference point for the design of assimilation algorithms suitable for ozone forecasts in realistic applications. Four algorithms were implemented: optimal interpolation (OI), ensemble Kalman filter (EnKF), reduced-rank square root Kalman filter (RRSQRT), and four-dimensional variational assimilation (4DVar).

[8] We note that this comparison study has two limitations. First, only the model state is adjusted; uncertain model parameters remain unchanged. Second, the treatment of uncertainties differs among the algorithms. OI parameterizes aggregate uncertainties using the homogeneous Balgovind correlation function. In 4DVar the uncertainties are taken into account in a way similar to OI (Balgovind correlation), but only at the initial date of the assimilation period. The underlying model is assumed to be perfect, that is, we consider a strong-constraint 4DVar. By contrast, EnKF and RRSQRT represent model uncertainties with ensembles generated by Monte Carlo sampling of uncertain parameters. The first limitation arises because the adjoint model with respect to model parameters is not available and because correlations between the model state and the parameters are unknown. Clearly this should be a research task in the near future. The second limitation stems from the unsettled formulation of model error. A novelty of our EnKF and RRSQRT implementation is the perturbation method, originally employed in uncertainty studies for air quality models [*Hanna et al.*, 2001].
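As an illustration of the background error parameterization mentioned above, the sketch below evaluates the Balgovind correlation in its usual form, f(r) = (1 + r/L) exp(−r/L), and builds a homogeneous covariance matrix on a one-dimensional set of points. The correlation length `length`, the variance `variance`, and the point coordinates are hypothetical values, not the settings used in this study.

```python
import math


def balgovind(distance, length):
    """Balgovind correlation f(r) = (1 + r / L) * exp(-r / L)."""
    ratio = distance / length
    return (1.0 + ratio) * math.exp(-ratio)


def background_covariance(coordinates, length, variance):
    """Homogeneous covariance: depends only on the distance between points."""
    return [
        [variance * balgovind(abs(xi - xj), length) for xj in coordinates]
        for xi in coordinates
    ]


if __name__ == "__main__":
    # Hypothetical one-dimensional grid (km), correlation length (km), variance.
    points = [0.0, 50.0, 100.0, 200.0]
    for row in background_covariance(points, length=100.0, variance=4.0):
        print(["%.3f" % value for value in row])
```

The correlation equals one at zero distance and decays monotonically with distance; "homogeneous" here means that the same function and the same length are applied everywhere in the domain.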

[9] The paper is organized as follows. Section 2 documents the assimilation algorithms and their implementations. The experimental setup concerning the model and observations is detailed in section 3. We report the comparison results in section 4, where sensitivity studies with respect to the assimilation algorithm settings are also conducted. Conclusions and discussion can be found in section 5.