## 1. Introduction

Numerical weather prediction (NWP) models are complex technical expressions of the science of weather forecasting. They generally operate at a high level of forecast skill, which implies that all relevant multi-scale interactions and dynamics–physics feedbacks have been tuned into harmony. The need for model tuning arises in part from the fact that discrete numerical representation splits atmospheric processes into resolved and unresolved ones. Subgrid-scale physical processes are parametrized with numerical schemes that contain explicit closure parameters (e.g. Stensrud, 2007). Typically, expert knowledge and manual techniques are used to specify the optimal parameter values at various stages of the model development and tuning process. This is a laborious task, which needs to be repeated after any major model upgrade. Due to the high computational cost of NWP models, the tuning is limited by the affordable number of test cases. The optimal values and uncertainties of these parameters are therefore only approximately known.

Algorithmic techniques to estimate model parameters can speed up model development and improve the usefulness of simulation results, as their uncertainties are better understood. A prerequisite for parameter estimation is to understand the relationship between parameter variations and model response. Parameter variations can be used in ensemble prediction systems (EPS) to represent model uncertainty in addition to stochastic parametrization schemes (e.g. Bowler *et al.*, 2008, and references therein). The reason is that in EPS, initial condition perturbations alone do not generate enough spread in the ensemble of forecasts. Thus the ensemble, which should properly sample forecast uncertainty, may appear overconfident unless uncertainties in the model formulation and boundary forcing are also accounted for. For instance, the impact of parameter variations related to convection and boundary-layer parametrization on tropical ensemble spread and Brier scores was positively assessed by Reynolds *et al.* (2011) in a global forecasting system.

Nielsen-Gammon *et al.* (2010) advocate studies of the sensitivities of model simulations to model parameter variations: successful parameter estimation requires that variations in the subset of parameters to be estimated produce sufficiently large, well-behaved, and unique signatures in the model output. Hacker *et al.* (2011) studied the model response to parameter variations in a mesoscale ensemble prediction system. They did not find any clear linear scaling between parameter variations and ensemble properties: the perturbed models were typically indistinguishable. They concluded that ensemble prediction using perturbed parameters complements more complex model-error simulation methods, but that parameter estimation may prove difficult or costly for real mesoscale NWP applications. A possible alternative avenue is to apply meta-models for parameter dependencies (Neelin *et al.*, 2010).

Applicability of ensemble techniques in parameter estimation has usually been considered from the state augmentation viewpoint, i.e. using state filters augmented with parameters as artificial states (e.g. Aksoy *et al.*, 2006a, 2006b). In filtering approaches, the focus is on very-short-range forecasting as the state is propagated basically from one observation time to the next. Thus, parameter estimation is conditional on model performance in very-short-range forecasts. NWP systems are known to suffer from spin-up/down problems. For instance, moist variables exhibit a tendency towards the model attractor because the model hydrological cycle is not in balance at the initial state (e.g. Trenberth and Guillemot, 1998; Betts *et al.*, 2003). Uncertainties are often related to parameters in moist physical processes, and may be affected by this imbalance early in the forecast. In the following, we put forward our parameter estimation approach, which is not in the immediate context of data assimilation, but uses short-to-medium range forecasts generated in abundance by ensemble prediction systems.

Forecast error growth studies hint at how the effects of parameter variations evolve in complex systems. Forecast errors are due to initial state errors and model errors, but these are not easily separable because the estimation of the initial state involves a forecast model, and thus initial state errors are affected by model errors too (Leutbecher and Palmer, 2008). Growth of very-short-range forecast error is nevertheless dominated by the exponential growth of initial state errors, and the linear growth of model errors becomes important later in the forecast (Savijärvi, 1995). This tends to imply that early in the forecast range, the parameter variations do not yet have a sizable effect. On the other hand, late in the forecast, parameter variations have a stronger impact but are masked by the quadratic nonlinearity of the system, which weakens parameter identifiability. The nonlinearity is considered quadratic because the main terms of atmospheric dynamics have quadratic expressions. Thus, somewhere in between these extremes, there might be an optimal forecast range where parameter variations already affect the model output but chaoticity does not yet dominate the system behaviour and overwhelm parameter identifiability. This is supported by the finding of Zhu and Navon (1999) that a low-resolution global atmospheric general circulation model (GCM) tends to first lose the impact of the optimal initial condition while the impact of optimally identified parameter values persists beyond 72 hours. Interestingly, experiments with the European Centre for Medium-Range Weather Forecasts (ECMWF) analysis and forecasting system, in which satellite data are first denied and then reintroduced, suggest that observations older than about three days have no influence on the quality of the analysis (Fisher, 2006).
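The error-growth argument above can be illustrated with a schematic model of a kind often used in error-growth studies. The symbols are assumptions introduced here for illustration only: $E$ is a measure of forecast error, $\alpha$ the growth rate of existing errors, $S$ a constant model-error source, and $E_\infty$ the saturation level.

$$
\frac{dE}{dt} = (\alpha E + S)\left(1 - \frac{E}{E_\infty}\right)
$$

For $E \ll E_\infty$ the saturation factor is close to one, and the solution with initial error $E_0$ is

$$
E(t) \approx E_0\, e^{\alpha t} + \frac{S}{\alpha}\left(e^{\alpha t} - 1\right),
$$

where the second term reduces to $S\,t$ for $\alpha t \ll 1$. In this sketch the initial-state error grows exponentially, the model-error contribution (to which parameter variations belong) starts out linear in time, and the term quadratic in $E$ eventually saturates both, which is consistent with the existence of an intermediate range where parameter signals are most identifiable.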

A recent dual article (Järvinen *et al.*, 2012; Laine *et al.*, 2012; hereafter JL2012) presented an Ensemble Prediction and Parameter Estimation System (EPPES), and argued that ensemble prediction systems can be utilized to make statistical inference about the NWP model closure parameters by means of parameter perturbations. The idea of JL2012 was to impose initial-time parameter variations on an ensemble of forecasts and to infer the parameter values and their uncertainties based on how likely different ensemble members appear against observations. From the parameter-estimation point of view, the initial values can be seen as ‘nuisance parameters’ whose uncertainty should be integrated out. In EPPES, this is done by sampling over a large number of different flow types (i.e. initial states). This is further enhanced by the use of initial state perturbations.

In EPPES, the likelihood can be formulated in terms of forecast skill at some suitable forecast range covered by the ensemble, say at five days. Thus, one directly attempts to optimize the medium-range forecast skill. The appeal of EPPES is that the computational power traditionally used in operational ensemble production for assessing the forecast uncertainties could be harnessed for model tuning too. The method can be implemented in an operational EPS with minimal technical changes to the code infrastructure. The EPPES algorithm itself is virtually cost-free. Moreover, it is a model-independent algorithm which can be easily transferred to new modelling systems, as long as the relationship between parameter variations and their signatures in the model output is sufficiently understood. NWP model tuning is certainly challenging, but these potential benefits render further experimentation worthwhile.
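The general idea described above (perturb parameters across ensemble members, weight members by how well they verify, and update the parameter distribution over many flow situations) can be sketched with a toy example. This is a deliberately simplified importance-weighting scheme and not the EPPES algorithm of JL2012; the scalar decay "model", the Gaussian proposal re-fit, and all names and constants (`forecast`, `update_proposal`, `THETA_TRUE`, `SIGMA_FLOOR`, and so on) are hypothetical choices made for illustration.

```python
import math
import random

# All constants below are illustrative assumptions, not values from JL2012.
THETA_TRUE = 0.5      # hypothetical "true" closure parameter of the toy model
LEAD_TIME = 2.0       # forecast range at which skill is evaluated
OBS_SD = 0.05         # assumed observation-error standard deviation
INIT_SD = 0.05        # assumed size of initial-state perturbations
SIGMA_FLOOR = 0.02    # keeps the proposal from collapsing to a point


def forecast(x0, theta, lead_time=LEAD_TIME):
    """Toy 'model': exponential decay whose rate is the closure parameter."""
    return x0 * math.exp(-theta * lead_time)


def update_proposal(thetas, errors, obs_sd=OBS_SD, floor=SIGMA_FLOOR):
    """Re-fit a Gaussian proposal from likelihood-weighted ensemble members."""
    log_w = [-0.5 * (e / obs_sd) ** 2 for e in errors]
    shift = max(log_w)  # subtract the maximum so the weights cannot all underflow
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    mu = sum(wi * th for wi, th in zip(w, thetas)) / total
    var = sum(wi * (th - mu) ** 2 for wi, th in zip(w, thetas)) / total
    return mu, max(math.sqrt(var), floor)


def run_toy_estimation(n_days=60, n_members=50, mu=0.8, sigma=0.3, seed=0):
    """Cycle over many 'days', updating the parameter proposal after each one."""
    rng = random.Random(seed)
    for _ in range(n_days):
        x0 = rng.uniform(1.0, 5.0)  # a new initial state ("flow type") each day
        obs = forecast(x0, THETA_TRUE) + rng.gauss(0.0, OBS_SD)
        thetas, errors = [], []
        for _ in range(n_members):
            theta = rng.gauss(mu, sigma)            # parameter perturbation
            x0_pert = x0 + rng.gauss(0.0, INIT_SD)  # initial-state perturbation
            thetas.append(theta)
            errors.append(forecast(x0_pert, theta) - obs)
        mu, sigma = update_proposal(thetas, errors)
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = run_toy_estimation()
    print(f"estimated parameter: {mu:.3f} +/- {sigma:.3f} (truth {THETA_TRUE})")
```

In this sketch the verification is done at a single fixed lead time, mirroring the idea of defining the likelihood through forecast skill at a chosen forecast range, and drawing a new initial state every cycle plays the role of sampling over different flow types. The floor on the proposal width is an ad hoc safeguard against weight degeneracy in the toy set-up.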

Based on experimentation with a stochastic version of the Lorenz-95 model (Lorenz, 1995; Wilks, 2005), JL2012 concluded that EPPES might be a step towards algorithmic model parameter estimation. The results of JL2012 cannot, however, be directly scaled up to realistic systems since the rich dynamics of the atmospheric circulation cannot be simulated with the Lorenz-95 model. Therefore, this article takes a step towards a more realistic set-up, and demonstrates the EPPES method using a global atmospheric GCM. An ‘EPS emulator’ is developed based on the ECMWF model HAMburg version (ECHAM5: Roeckner *et al.*, 2003). The motivation to use a climate model rather than an NWP model is that for the proof of concept in large- and multi-scale systems, basically any primitive equation system should suffice. In our case, the ECHAM5 model provided the shortest development path. Furthermore, we rely on the ECMWF operational analyses and their EPS initial state perturbations. We copy the perturbed initial conditions and verifying analyses from ECMWF, and use ECHAM5 as a state propagator to make 10-day global forecasts. This enables very convenient testing of the EPPES algorithm and allows full control of the necessary components while avoiding the need to develop the EPS infrastructure. We present the experimental set-up in section 2 and the parameter estimation and validation results in section 3, followed by discussion and conclusions.