1. Single- and probabilistic-based approaches to the prediction problem
Ensemble systems, which are capable of estimating the probability distribution of forecast states, can be used not only to identify the most likely outcome, as was done in the past using single, high-resolution forecasts, but also to assess the probability of occurrence of weather events of interest. In particular, a probabilistic forecast can be used to assess the probability of occurrence of events linked with maximum acceptable losses, and can thus be more valuable than single forecasts in weather-risk management. Following the operational implementation of global ensemble prediction systems in the 1990s at the European Centre for Medium-Range Weather Forecasts (ECMWF, Molteni et al., 1996), the National Centers for Environmental Prediction (NCEP, Tracton and Kalnay, 1993) and the Meteorological Service of Canada (MSC, Houtekamer et al., 1996), other centres implemented ensemble systems in the 2000s. Today, ten operational centres (based in Australia, Brazil, Canada, China, England, Japan, Korea, and the USA) run global ensemble prediction systems for up to 10 days. Following the availability of ensemble-based probabilistic weather forecasts and indications of their value, the past decade saw the development and implementation of hydrological probabilistic forecasting systems (e.g. Gouweleeuw et al., 2005; Schaake et al., 2006). A key role in the development and promotion of a probabilistic, ensemble-based approach has been played by HEPEX (the Hydrological Ensemble Prediction Experiment). Established in 2004, it has fostered communication between meteorologists and hydrologists, and promoted the establishment of test-bed projects to assess the potential value of ensemble prediction in hydrological applications (see the HEPEX web page for more details at http://hydis8.eng.uci.edu/hepex/). The scope of this communication is to discuss two of the advantages of ensemble-based probabilistic forecasts compared to single ones.
The first advantage, already mentioned above, is that ensemble systems make it possible to predict not only the most likely scenario but also the probability of occurrence of any event; the resulting benefit can be quantified using the potential economic value (Richardson, 2000; Buizza, 2001). The second advantage is that ensemble systems can provide forecasters with more consistent (i.e. less changeable) successive forecasts; this is illustrated with one case study.
After this introduction, Section 2 briefly reviews the ECMWF approach to ensemble meteorological prediction and the use of ECMWF ensemble forecasts in hydrological ensemble prediction systems. The seasonal average potential economic values of single and ensemble forecasts of meteorological variables are compared in Section 3. The importance of having consistent successive forecasts is discussed in Section 4, considering one case study. Finally, some conclusions are drawn in Section 5.
2. The meteorological and hydrological ensemble approaches to probabilistic prediction
In meteorology, ensemble prediction systems based on a finite number of deterministic integrations appear to be, so far, the only feasible method of predicting the probability density function of forecast states beyond the range of linear error growth. These systems have been designed to simulate the effect of initial and model uncertainties on forecast states. When first implemented in 1992 (at the time of writing, ECMWF is celebrating 15 years of operational ensemble prediction), the ECMWF Ensemble Prediction System (EPS, Molteni et al., 1996) was based on 33 forecasts produced with a T63L19 (spectral triangular truncation T63 with 19 vertical levels) resolution version of the ECMWF model. The initial uncertainties were simulated by starting 32 members from perturbed initial conditions defined by the fastest growing perturbations (the singular vectors of the tangent forward model version, Buizza and Palmer, 1995). Between December 1992 and September 2006, the EPS was upgraded several times, benefiting both from changes to the ECMWF data assimilation and forecasting system, and from modifications of the EPS configuration designed to improve the simulation of initial and model uncertainties. Among these changes, it is worth recalling the introduction, in 1998, of a stochastic scheme to simulate the effect of model uncertainties on the forecast probability; the resolution increases in 1996 and 2000; and the increase in membership from 33 (32 perturbed members plus a control integration) to 51 in 1996. More recently, in 2006, the EPS was upgraded to the new variable resolution ensemble prediction system (VAREPS, Buizza et al., 2007), characterized by a TL399L62 (spectral triangular truncation TL399 with linear grid and 62 vertical levels) resolution between forecast day 0 and 10, and TL255L62 between day 10 and 15.
Since September 2006, VAREPS has been providing ECMWF users with 15-day global ensemble forecasts twice a day, with initial times of 0000 UTC and 1200 UTC (at the time of writing, September 2007, this is still the operational system).
Hydrological ensemble prediction models have been developed either to use global meteorological ensemble forecasts directly as input, or to use dynamically downscaled meteorological ensemble forecasts, e.g. those generated by meso-scale ensemble prediction systems nested in global ensembles. In hydrological ensemble systems, observations are used to estimate the initial states, and an ensemble of weather forecasts (of variables such as precipitation, surface temperature, soil moisture, and snow) is used to predict the future hydrological conditions. For example, the European Flood Alert System (EFAS), developed and run in pre-operational mode at the Joint Research Centre in Ispra (De Roo et al., 2003; Thielen et al., 2006), uses the ECMWF global ensemble forecasts to generate an ensemble of river discharge forecasts for all European river catchments larger than 4000 km². EFAS forecasts are used as a pre-alert to make the receiving authorities aware of the possibility of a flood. The reader is referred to, e.g., Diomede et al. (2007); Grossi et al. (2007); Hou et al. (2007); Voisin et al. (2007) and Pietroniro et al. (2007) for a review of similar hydrological systems.
3. The ‘Potential Economic Value’: a metric of forecast value
One way to quantify the potential value of a forecasting system is to use the concept of potential economic value, estimated using simple cost/loss models (Richardson, 2000; Buizza, 2001).
Consider a user who has access to a forecast and who can take a protective action of cost C to avoid a loss L (note that L denotes only the avoidable losses, not all the losses that a user can incur). Using this forecast can lead to four possible outcomes (Table I). The forecast system can produce two right decisions, a hit and an inverse hit: in these cases, an event (or no event) that is forecast is also observed (or not observed). Alternatively, the system can produce two false decisions, a false alarm and a failure. A false alarm means that an event has been forecast but does not occur; a failure means that no event has been forecast, but the event occurs. The potential economic value of a forecast system can be calculated by combining these outcomes with an economic decision model such as the static cost/loss model. In this model, a hit and a false alarm are each associated with a cost C, since an alarm causes the user to protect against the event at a cost C. By contrast, if no alarm is given, no protective action is taken: if the event is not observed, no loss occurs, but if the event occurs the user faces a loss L.
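Table I. Possible outcomes (hit a, false alarm b, failure c and inverse hit d) and expense matrix (protection cost C, loss L) of a decision-making process for a decision maker who either takes a protective action or not. A hit and a false alarm are associated with a cost C, whereas a failure of the system causes a loss L.
                                    | Event observed        | Event not observed
Event forecast (user protects)      | hit a, expense C      | false alarm b, expense C
Event not forecast (no protection)  | failure c, expense L  | inverse hit d, no expense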
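The four outcomes of Table I can be counted directly from paired yes/no forecasts and observations. The following is a minimal Python sketch, not the procedure used for the results in this paper; the names `Contingency` and `count_outcomes` and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Iterable, NamedTuple


class Contingency(NamedTuple):
    a: int  # hits: event forecast and observed
    b: int  # false alarms: event forecast but not observed
    c: int  # failures: event not forecast but observed
    d: int  # inverse hits: event neither forecast nor observed


def count_outcomes(forecast: Iterable[bool], observed: Iterable[bool]) -> Contingency:
    """Count the four outcomes for paired yes/no forecasts and observations."""
    a = b = c = d = 0
    for f, o in zip(forecast, observed):
        if f and o:
            a += 1
        elif f:
            b += 1
        elif o:
            c += 1
        else:
            d += 1
    return Contingency(a, b, c, d)


# Made-up example: five forecast occasions.
example = count_outcomes(
    forecast=[True, True, False, False, True],
    observed=[True, False, True, False, True],
)
```

In this made-up sample there are two hits, one false alarm, one failure and one inverse hit.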
Considering Table I, the mean expense Ef faced by users with a cost-loss ratio C/L can be calculated in the following way:
$$E_f = \frac{a}{n}\,C + \frac{b}{n}\,C + \frac{c}{n}\,L \qquad (1)$$
where a, b and c are defined in Table I and n = a + b + c + d is the total number of forecasts. This average expense can be compared with the average expense of a reference forecast Ec:
$$E_c = \min(E_a, E_n) = \min(C, s\,L) \qquad (2)$$
where Ec is the smaller of the expenses of the following two strategies: either (1) the user always protects, thus incurring an expense Ea = C, which is the cheaper strategy if the climatological base rate s of the event is greater than the cost-loss ratio C/L, or (2) the user never protects, thus incurring an expense En = s·L, which is the cheaper strategy if s is smaller than the cost-loss ratio. Since for some variables, e.g. for synoptic scale meteorological variables or for hydrological discharges, the forecasts are highly auto-correlated, a persistence forecast might be successful in predicting an event, especially for the next forecast time-step. In this case, the average expense of a persistence forecast Ep can be used as reference:
$$E_p = \frac{a_p}{n_p}\,C + \frac{b_p}{n_p}\,C + \frac{c_p}{n_p}\,L \qquad (3)$$
where ap is the number of hits, bp is the number of false alarms, cp is the number of failures and np is the number of forecasts. In this case, the reference expense can be defined in the following way:
$$E_c = \min(E_a, E_n, E_p) \qquad (4)$$
Note that the average expense sustained by using a perfect forecast system E1 is given by:
$$E_1 = s\,C \qquad (5)$$
The average expense Ef associated with the forecast can be transformed into its potential economic value PEVf using the average expenses of the reference forecast Ec and of the perfect forecast E1:
$$PEV_f = \frac{E_c - E_f}{E_c - E_1} \qquad (6)$$
PEVf ranges from minus infinity to 1: a forecast that is better than the reference has positive PEV, and a perfect forecast has PEVf = 1. Equation (6) is very similar to the formulation of Richardson (2000), the only difference being that the reference forecast is extended by the persistence forecast.
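The computation of PEVf can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the results in this paper. It assumes the usual static cost/loss forms: a mean expense of ((a + b)·C + c·L)/n, a reference expense equal to the smallest of C, s·L and (optionally) the persistence expense, and a perfect-forecast expense of s·C. It also assumes that the base rate s is estimated from the sample as (a + c)/n unless supplied. All function and parameter names are hypothetical.

```python
from typing import Optional


def mean_expense(a: int, b: int, c: int, n: int, cost: float, loss: float) -> float:
    """Mean expense: hits and false alarms cost C, failures cost L."""
    return ((a + b) * cost + c * loss) / n


def potential_economic_value(
    a: int, b: int, c: int, d: int,
    cost: float, loss: float,
    base_rate: Optional[float] = None,
    persistence_expense: Optional[float] = None,
) -> float:
    """PEV = (E_ref - E_f) / (E_ref - E_perfect) under the assumptions above."""
    n = a + b + c + d
    # Assumption: sample base rate unless a climatological value is given.
    s = (a + c) / n if base_rate is None else base_rate
    e_f = mean_expense(a, b, c, n, cost, loss)
    # Reference: cheapest of always protecting, never protecting, persistence.
    e_ref = min(cost, s * loss)
    if persistence_expense is not None:
        e_ref = min(e_ref, persistence_expense)
    e_perfect = s * cost  # protect only when the event actually occurs
    if e_ref == e_perfect:
        raise ValueError("PEV undefined: reference is already as cheap as a perfect forecast")
    return (e_ref - e_f) / (e_ref - e_perfect)


# Made-up counts: 20 hits, 10 false alarms, 5 failures, 65 inverse hits.
pev = potential_economic_value(20, 10, 5, 65, cost=1.0, loss=5.0)
print(round(pev, 3))  # 0.6
```

With these made-up counts the forecast recovers 60% of the saving that a perfect forecast would achieve relative to the reference; a forecast with no false alarms and no failures would return exactly 1.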
Figure 1 shows the potential economic value of single EPS-control forecasts (i.e. forecasts given by the single EPS member started from the unperturbed analysis) and of EPS probabilistic predictions for four events: total precipitation in excess of 5 and 20 mm/day, and 850-hPa temperature anomalies warmer or colder than 4 °C (these variables were selected because precipitation and temperature are two key drivers of hydrological discharge models). Figure 1 shows that for all four events the potential economic value of the EPS probabilistic forecasts is higher than that of the single control forecasts. As discussed in Richardson (2000) and Buizza (2001), this is due to the fact that the information content of a probabilistic forecast is higher than the information content of a single (yes/no) forecast. The end result is that users can manage weather-related risks better if they are given access to the whole probability distribution function.
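One way to see why a probability carries more information than a yes/no forecast is that each user can convert it into a decision suited to their own cost-loss ratio. The Python sketch below assumes reliable probabilities and the simple rule that a user protects whenever the forecast probability p exceeds C/L (protecting costs C, whereas not protecting carries an expected loss of p·L). The function name and the sample values are hypothetical.

```python
from typing import Sequence


def expense_from_probabilities(
    probabilities: Sequence[float],
    observed: Sequence[bool],
    cost: float,
    loss: float,
) -> float:
    """Mean expense of a user who protects whenever p exceeds C/L."""
    if not 0 < cost < loss:
        raise ValueError("expected 0 < cost < loss")
    if len(probabilities) != len(observed) or not probabilities:
        raise ValueError("need equally long, non-empty sequences")
    threshold = cost / loss  # assumed decision rule: protect if p > C/L
    total = 0.0
    for p, event in zip(probabilities, observed):
        if p > threshold:
            total += cost  # protected: pay C whether or not the event occurs
        elif event:
            total += loss  # unprotected and the event occurred: pay L
    return total / len(probabilities)


# Made-up probabilities and outcomes for four forecast occasions.
probs = [0.1, 0.4, 0.8, 0.05]
obs = [False, True, True, False]

low_ratio_user = expense_from_probabilities(probs, obs, cost=1.0, loss=5.0)
high_ratio_user = expense_from_probabilities(probs, obs, cost=1.0, loss=2.0)
```

The user with C/L = 0.2 protects on the second and third occasions, while the user with C/L = 0.5 protects only on the third: the same probabilities lead to different actions, which a single yes/no forecast cannot offer.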
4. On the consistency of successive single and ensemble forecasts: the case of the flood of Northern Italy of October 2000
Consistency between forecasts issued on consecutive days is a desirable property of a forecasting system. Figures 2 and 3 illustrate the importance of consistency for the prediction of 48-hour precipitation accumulated between 14 and 16 October 2000. During this period, intense precipitation caused the Po river to flood parts of Northern Italy, causing deaths, the evacuation of 40 000 people from flooded areas, and extensive damage. This event was the most severe flood of the Po river since the terrible events of 1951, which led to an even higher death toll and to the evacuation of about 160 000 people from flooded areas.
Single control forecasts (Figure 2) of 48-hour precipitation issued on 8, 10 and 12 October and valid from 14 to 16 October were rather inconsistent. The 144-to-192 hour forecast issued on the 8th predicted more than 50 mm of rain over Central Italy and the Northern Adriatic Sea rather than over Northwestern Italy, where more than 50 mm (in fact up to 100 mm, see Figure 2 bottom-right panel) of rain was observed (the observed map was constructed by interpolating data from SYNOP stations onto the regular 0.5-degree grid on which the forecasts were generated). The 96-to-144 hour forecast issued on the 10th moved the area with more than 50 mm of precipitation over the whole of Northern Italy, in better agreement with the observed pattern. The 48-to-96 hour forecast issued on the 12th kept the area with more than 25 mm of precipitation over the whole of Northern Italy, but reduced the area with more than 50 mm to a single point. Making a prediction using these successive forecasts would have been rather difficult, because of these changes in the area and the amount of predicted rainfall.
By contrast, EPS probabilistic forecasts of precipitation in excess of 50 mm (Figure 3) issued on the same dates were more consistent. The 144-to-192 hour forecast gave a 2–10% probability that more than 50 mm of rain would fall over Northwestern Italy. The 96-to-144 hour forecast gave a higher probability of 10–20% over Northern Italy, with a 20–30% probability over Liguria (the strip around 44°N latitude and 7–10°E longitude) and a 30–60% probability over Northeastern Italy. Finally, the 48-to-96 hour forecast gave a 30–50% probability over Northwestern Italy, in the region where more than 50 mm of rainfall was observed, and reduced the probability over Northeastern Italy to 10–20%. Figure 3 indicates a more consistent, gradual refinement of the area with higher probabilities of intense precipitation towards the area where intense precipitation was observed, with probabilities there increasing from one forecast to the next.
It is also interesting to point out that combining ensemble-based probabilistic forecasts with single forecasts can help forecasters to judge the potential skill of the single forecast: regions with smaller probabilities are regions where the ensemble members differ most, and are thus more uncertain and less predictable. Disagreements between probabilistic and single forecasts can be used as an indication of potentially low predictability. For example, the disagreement between the 144-to-192 hour probability and the EPS-control forecast that more than 50 mm of rain would hit Central Italy and the Northern Adriatic Sea can be used as an indication that the control forecast has a high chance of being wrong. By contrast, the agreement between the 96-to-144 hour probability and the EPS-control forecast can be used as an indication that the northwestward shift of the area of intense precipitation towards Northern Italy has a high chance of being right. Furthermore, the agreement between the 48-to-96 hour and the 96-to-144 hour probabilities can be used as an indication that there was a high chance that Northern rather than Central Italy was going to be hit by intense precipitation. By contrast, the disagreement between the 48-to-96 hour probability of precipitation in excess of 50 mm and the EPS-control forecast can be used as an indication that the 48-to-96 hour control forecast was most likely underestimating the precipitation amount.
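Probabilities such as those quoted above can be estimated at each grid point from the ensemble members. The Python sketch below assumes the common convention of taking the fraction of members that exceed the threshold; the text does not spell out how the EPS probabilities were computed, and the ten member values are made up (the operational EPS has 51 members).

```python
from typing import Iterable


def exceedance_probability(member_values: Iterable[float], threshold: float) -> float:
    """Fraction of ensemble members whose value exceeds the threshold."""
    values = list(member_values)
    if not values:
        raise ValueError("at least one ensemble member is required")
    return sum(1 for v in values if v > threshold) / len(values)


# Made-up 48-hour precipitation totals (mm) at one grid point, one per member.
members_mm = [12.0, 55.0, 61.5, 8.0, 49.9, 70.2, 30.0, 52.1, 5.0, 44.0]
prob_over_50mm = exceedance_probability(members_mm, threshold=50.0)
```

Here four of the ten made-up members exceed 50 mm, giving a probability of 40% at this grid point.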
5. Conclusions, and ongoing research at ECMWF
Two of the main advantages of ensemble-based probabilistic forecasts compared to single forecasts have been discussed: first, the fact that an ensemble system predicts not only the most likely scenario but also the probability of occurrence of any event, and second, the fact that an ensemble system can provide more consistent (i.e. less changeable) successive forecasts. The first point has been discussed by comparing the average potential economic value of four types of forecasts given either by the single EPS-control forecast or by EPS probabilities. The second point has been illustrated by discussing precipitation forecasts for the case of the flood of the Po river of October 2000 (although the skill of an ensemble system can only be judged considering a large sample of cases, and thus no statistically significant conclusions can be drawn, this case illustrates the advantage of having more consistent, ensemble-based forecasts).
Despite all the recent progress in probabilistic prediction (see Palmer et al., 2007 for a review of the status of the ECMWF ensemble system), it should be stated that the current operational ensemble prediction systems still suffer from some limitations. One of them is that their spatial resolution is still lower than that of single high-resolution systems (at ECMWF the two systems have a spatial resolution equivalent to ∼50 and ∼25 km, but the difference in resolution is even larger for other centres), a fact that makes it difficult to use ensemble forecasts to complement single higher-resolution forecasts. Another limitation, which is shared by single high-resolution forecasts, is that forecasts of the weather variables that drive hydrological models (e.g. rainfall, surface temperature) are still problematic and often not quantitatively accurate enough, so that weather forecasts require post-processing and calibration before they are used in hydrological systems. On this second point, ECMWF is planning for the beginning of 2008 the operational implementation of a reforecast suite, which should help users to calibrate ensemble weather forecasts before they are used to drive hydrological models. In fact, as Hamill et al. (2004) have shown, reforecast datasets can be used to estimate the ensemble model climatology more accurately, and can thus reduce the problem. At ECMWF, work is also progressing in two other areas to further improve the performance of the ensemble system: in the simulation of initial uncertainties, by exploring the use of ensemble data assimilation methods combined with singular vectors and by improving the tangent forward and adjoint physics in the singular vector computation; and in the simulation of model errors, by testing new stochastic schemes. This ongoing work should make it easier to integrate meteorological and hydrological systems, and provide users with better, more integrated meteorological/hydrological forecast information.