## 1. Introduction

The inclusion of uncertainty in weather forecasts can lead to a substantial increase in their economic value (Zhu *et al.*, 2002; Palmer *et al.*, 2007). The most popular way to account for this uncertainty is through probabilistic forecasts, which assign a probability to the occurrence of future events based on a combination of the presently available information about those events (e.g. numerical weather predictions, forecaster experience, past observations). Ensemble forecasting (Epstein, 1969) has introduced new tools for the computation of probabilities, since it can provide estimates of the first and second moments of the future probability density function (PDF) of any variable of interest, such as precipitation amount. The ability of ensemble systems to outperform deterministic forecasts and to predict forecast skill has been convincingly established (see Palmer *et al.*, 2007 and references therein). However, several challenges remain in the statistical postprocessing of ensemble output. As documented in many papers (e.g. Wilks and Hamill, 2007 and references therein), probabilities derived directly from the ensemble are strongly affected by model errors, which leads to unreliable probabilistic forecasts and reduces the economic value of the information.
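For instance, the simplest ensemble-derived probability of an event is the fraction of members forecasting it; this is exactly the raw, uncalibrated quantity that statistical postprocessing seeks to improve. A minimal sketch in Python (the member values and threshold below are illustrative, not taken from the study):

```python
import numpy as np

def raw_ensemble_probability(members, threshold):
    """Uncalibrated event probability: the fraction of ensemble
    members forecasting precipitation above `threshold`."""
    members = np.asarray(members, dtype=float)
    return float(np.mean(members > threshold))

# Hypothetical 24 h precipitation forecasts (mm) from a 10-member ensemble
forecasts = [0.0, 1.2, 5.4, 0.3, 8.9, 2.1, 0.0, 12.5, 3.3, 6.7]
p = raw_ensemble_probability(forecasts, threshold=5.0)
print(p)  # 4 of the 10 members exceed 5 mm, so p = 0.4
```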

Several techniques have been developed to correct the effect of ensemble systematic errors. All of them are based on the study of the relationship between forecast error and forecast value, and on the development of statistical models that compute a calibrated probability given the forecasts of the ensemble members (Hamill and Colucci, 1997, 1998; Eckel and Walters, 1998; Applequist *et al.*, 2002; Gahrs *et al.*, 2003; Gallus and Segal, 2004; Raftery *et al.*, 2005; Hamill and Whitaker, 2006; McLean Sloughter *et al.*, 2007; Stensrud and Yussouf, 2007, among others). Some of these techniques can even be applied to a single deterministic forecast, allowing the computation of probabilities without running an ensemble system (e.g. Gallus and Segal, 2004). Although most of them share common principles, differences can be recognized in their implementation and/or in the mathematical algorithms they employ. Given the variety of calibration strategies available, their relative performance, for example in terms of forecast quality, remains unclear. Accordingly, one objective of this paper is to describe the sensitivity of probabilistic forecasts to the calibration algorithm using selected techniques, including a brief discussion of their pros and cons. In addition, some modifications are proposed to improve either the performance or the ease of computation of some of these algorithms; these modifications are also compared with the original implementations.
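As an illustration of the general principle shared by these techniques, the following toy sketch fits a logistic regression mapping an ensemble-derived predictor to a calibrated event probability, an approach in the spirit of those compared by Applequist *et al.* (2002). All data here are synthetic; this is not the implementation or dataset used in the paper:

```python
import numpy as np

def fit_logistic(x, y, lr=0.01, n_iter=10000):
    """Fit p(event) = sigmoid(a + b*x) by gradient ascent on the
    Bernoulli log-likelihood. x is a predictor (here the ensemble-mean
    precipitation) and y the binary observed event."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a + b * x)))
        a += lr * np.mean(y - p)        # gradient w.r.t. the intercept
        b += lr * np.mean((y - p) * x)  # gradient w.r.t. the slope
    return a, b

def calibrated_probability(x, a, b):
    """Calibrated probability for a new forecast value x."""
    return 1.0 / (1.0 + np.exp(-(a + b * x)))

# Synthetic training sample: ensemble-mean forecasts (mm) and whether
# the precipitation event was actually observed (1) or not (0).
rng = np.random.default_rng(0)
x_train = rng.gamma(2.0, 3.0, size=500)
true_p = 1.0 / (1.0 + np.exp(-(x_train - 6.0)))
y_train = (rng.random(500) < true_p).astype(float)

a, b = fit_logistic(x_train, y_train)
print(calibrated_probability(10.0, a, b))  # event probability for a 10 mm forecast
```

The training pairs of predictor and observed outcome play the role of the historical forecast/observation archive that real calibration schemes rely on.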

All the calibration techniques considered are applied to precipitation forecasts, since precipitation is one of the most challenging and least accurate products available from numerical weather prediction (Ebert, 2001; Stensrud and Yussouf, 2007). The algorithms have been applied to a short-range regional ensemble system over South America based on the WRF (Weather Research and Forecasting) model. This work extends Ruiz *et al.* (2009), in which PQPFs (Probabilistic Quantitative Precipitation Forecasts) generated with two different ensemble systems and two calibration strategies (Hamill and Colucci, 1997; Gallus and Segal, 2004) were compared. In that case, the validity of the results was limited by the relatively small number of observations available for calibration. In the present work, satellite estimates have been used for calibration and verification, in order to reduce the uncertainty due to the small number of observations over the area of interest. According to Ruiz (2009), this choice does not affect the general conclusions regarding calibration performance, while providing more robust statistics owing to the larger data sample.

Sensitivity to calibration strategies constitutes the first part of this assessment. It is also of interest to carry out a more comprehensive analysis that includes ensemble generation, given the variety of alternatives for generating computationally cheap regional ensemble systems (see Applequist *et al.*, 2002; Gahrs *et al.*, 2003). Both assessments could help in designing a probabilistic forecast system suitable for small operational/research centres. This complementary evaluation is addressed in Ruiz *et al.* (2011).

The article is organized as follows: Section 2 describes the ensemble system, the dataset used for verification/calibration and the different calibration methods; results are analysed in Section 3, and Section 4 presents the conclusions of this work.