A European precipitation index for extreme rain-storm and flash flood early warning



Extreme rain storms are known for triggering devastating flash floods in various regions of Europe and particularly along the Mediterranean coasts. Despite recent notable advances in weather forecasting, most operational early warning systems for extreme rainstorms and flash floods are still based on rainfall measurements from rain gauges and weather radars, rather than on forecasts. As a result, warning lead times are limited to a few hours and warnings are usually issued when the event is already taking place.

This work proposes a novel early warning system for heavy precipitation events in Europe, aimed at identifying forecasts of extreme rainfall accumulations over short durations and within small-size catchments prone to flash flooding. The system is based on the recently developed European Precipitation Index based on simulated Climatology (EPIC), which is calculated using COSMO-LEPS ensemble weather forecasts and subsequently fitted with gamma distributions at each time step of the forecast horizon. The probability of exceeding warning thresholds is calculated, and alert points are generated where potentially extreme events are detected. Comparison of daily runs over 22 months with observed rainstorm events and flash floods in Europe indicates a probability of detection of up to 90%, corresponding to 45 events correctly predicted, with an average lead time of 32 h. Copyright © 2012 Royal Meteorological Society

1 Introduction

Flash floods have a devastating impact on human activities. They are difficult to predict and develop so rapidly that time to react and to initiate emergency action is very short, often too short to prevent loss of life. Although flash floods typically affect limited areas of up to a few hundred square kilometres (Gaume et al., 2009), they are reported as the deadliest weather-related hazard in a number of countries (Jonkman and Kelman, 2005; Ashley and Ashley, 2008). The highest proportion of casualties is recorded among drivers trapped in their vehicles and swept away by the raging flow (Ruin et al., 2008; Xia et al., 2011). Thus, providing earlier warnings is of utmost importance to leave more time to prepare for and react to emergencies. The importance of flash flood forecasting and mitigation has also been acknowledged at the European Union level. European projects such as HYDRATE (Borga et al., 2011) and IMPRINTS (e.g., see Alfieri et al., 2011a) represent recent concrete efforts to improve the understanding of hydrometeorological processes leading to extreme events and to increase preparedness through specific tools for flash flood forecasting and early warning. Particular attention is paid to ungauged catchments in flash flood prone areas along the Mediterranean coasts and the main mountain ranges.

Hapuarachchi et al. (2011) provided an in-depth review of recent advances in existing flash flood forecasting and early warning systems. They classified the different approaches into: (1) Flood Susceptibility Assessment procedures (FSA); (2) Rainfall Comparison Methods (RCM); and (3) Flow Comparison Methods (FCM), sorted by increasing reliability and complexity. FCM approaches probably have the largest share among operational early warning systems. They often use distributed or semi-distributed hydrological modelling to produce quantitative streamflow estimates (e.g., Addor et al., 2011; Vincendon et al., 2011) or simply detect the upcoming exceedance of alert thresholds through frequency analysis (e.g., Reed et al., 2007; Alfieri et al., 2012). FSA methods provide simple qualitative information at the event scale (e.g., Collier and Fox, 2003), but have the drawback that they are mostly based on heuristic approaches and are unsuitable for use in operational warning systems.

When dealing with such extreme events, an alternative approach for early warning systems is to develop robust indicators for flood detection, rather than quantitative discharge/level forecasting. Here, RCM represent a good tradeoff between low complexity and good skill. RCM require limited effort in data collection and model calibration, as the main input data is quantitative precipitation estimation (QPE) or forecast (QPF). Some approaches are aimed at detecting extreme weather conditions by using indices based purely on meteorological variables (e.g., Lalaurette, 2003; Golding, 2009; Hurford et al., 2011). Müller et al. (2009) analysed the extremeness of different meteorological variables for a set of days in which significant flooding occurred in the Czech Republic between 1958 and 2002. A combined index of 26 predictors was found in strong correlation with the most severe events, suggesting strong potential in flood detection through the use of Numerical Weather Prediction (NWP). A further step to relate meteorological extremes to flash floods is the use of topographic data such as the drainage network at the ground and of information on the initial soil wetness condition. The Flash Flood Guidance (FFG) has been developed in the US since the 1970s (e.g., Mogil et al., 1978; Georgakakos, 1987; Ntelekos et al., 2006) and makes use of a simple hydrological model, run in backward mode, to estimate the amount of rainfall which produces flood flow at selected outlets, for specific soil moisture conditions. Despite some inherent limitations (e.g., see Reed et al., 2007; Hapuarachchi et al., 2011) the FFG has been widely used for operational monitoring and a number of similar methods based on rainfall thresholds have been proposed in recent years (Martina et al., 2006; Norbiato et al., 2009; Javelle et al., 2010).

Within such an overview, the approach proposed in this work tackles the issue of flash flood early warning in Europe through the detection of rainstorms with extreme rainfall accumulations over short durations and within small catchments prone to flash flooding. The European Precipitation Index based on simulated Climatology (EPIC, Alfieri et al., 2011b) is used in the proposed system as an indicator to monitor the European domain for upcoming hazardous events. System results only depend on the QPF and on the modelled river network, while other hydrological processes (e.g., initial soil moisture, snow accumulation and melting) are not considered. Despite some important simplifications compared to the actual processes involved, the system has no calibration parameters and can be seen as an extreme frequency analysis of the aforementioned indicator. As Guillot and Duband (1967) described in the Gradex method, the gradient of the statistical distribution of discharges tends to follow asymptotically that of rainfall for high return periods. On this basis, this work investigates the extent to which EPIC can be used to predict discharge threshold exceedance in extreme conditions.

The paper is structured as follows. Section 'Data' includes a description of the meteorological data and of the testbed used for validation. Section 'Methods and results' is divided into three parts. First, it shows a comparison between EPIC and the simulated discharge in a testbed, both calculated from the same meteorological input data. Section 'Probabilistic EPIC and visualization of results' gives details on the probabilistic formulation of EPIC and the visualization of daily forecasts on a web platform. Section 'Performance in early detection of extreme storms and flash floods' discusses results of EPIC in extreme rain-storm and flash flood detection over a 22 month testing period. System performance is discussed in Section 'Discussion and concluding remarks' together with some concluding remarks.

2 Data

2.1 Meteorological data

Ensemble weather predictions are provided daily by the Consortium for Small-scale Modeling (COSMO). The Limited-Area Ensemble Prediction System (LEPS) of the COSMO model (Marsigli et al., 2005) is produced once per day at 1200 UTC, with a forecasting range of 132 h. COSMO-LEPS is a 16-member ensemble covering central-southern Europe, stretching as far north as Scotland, Denmark and Latvia. Maps are provided on a rotated spherical grid with horizontal resolution of about 7 km (∼10 km before December 2009) and temporal resolution of 3 h.

Climatological values are derived from two long-term reforecast datasets produced by COSMO with the same model configuration used for operational forecasts. A 30 year meteorological climatology starting in 1971, with 0.09° resolution (∼10 km), was created by initializing the model every 90 h using the ERA-40 re-analysis dataset as initial and boundary conditions. Further details on this dataset are given by Fundel et al. (2010). Similarly, an additional 20 year climatology starting in 1989 was created with the same model configuration as the current forecasts, with 0.0625° resolution (∼7 km). Initial and boundary conditions are taken from the ERA-Interim dataset (Dee et al., 2011).

2.2 Testbed and hydrological model

The study area is the Gard catchment in the French Cévennes-Vivarais Region (see Figure 1). The catchment ranges between 25 m a.s.l. and 1570 m a.s.l., covering an upstream area of 1890 km2 at the considered outlet near the village of Remoulins. The topography is characterized by high mountain peaks, steep hill slopes and narrow valleys, resulting in a herring-bone channel network (Moussa et al., 2007). High intensity rainfall, particularly in autumn, can produce devastating floods.

Figure 1.

Map of the Gard catchment and modelled river network at 1 km resolution. The boundary of the Gardon d'Anduze is shown with a thick solid line, while a triangle indicates the outlet in Anduze. Circles indicate points on the river network where KPmax is compared with KQ.

The model used for this study is the distributed hydrological model Lisflood (van der Knijff et al., 2010). Lisflood is a hybrid between a conceptual and a physically based rainfall-runoff model, combined with a routing module in the river channels. It has been specifically designed for large river basins but has also been applied to smaller watersheds, including the one presented in this work (Younis et al., 2008). For this study, Lisflood has been set up for the Gard on a 1 km horizontal grid, as shown in Figure 1. Seven model parameters were calibrated using the Shuffled Complex Evolution algorithm (Duan et al., 1994). The observation dataset consists of 4 months of spatially interpolated observed hourly precipitation at up to 101 concurrent rain-gauges, daily average temperature and evapotranspiration at 4 synoptic stations, and hourly discharge measurements in Anduze (upstream area of 544 km2) collected in 2002, which include data for a severe flash flood event that occurred in September. Simulated discharges from the calibrated model fit the observation record with a Nash–Sutcliffe efficiency of 0.81 and a Pearson correlation coefficient of 0.91. For the proposed validation experiment, the hydrological model of the Gard is run for 30 years, starting in 1971, using as meteorological input the COSMO deterministic climatology at 0.09° grid resolution and 3 h time steps.
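The two goodness-of-fit scores quoted above can be computed as in the following minimal Python sketch. It uses the standard textbook definitions of the Nash–Sutcliffe efficiency and the Pearson correlation coefficient; the function names and the discharge values are hypothetical and are not taken from the calibration dataset.

```python
import math


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (sum of squared errors) / (spread of observations)."""
    mean_obs = sum(observed) / len(observed)
    squared_errors = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_errors / spread


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly discharges (m3/s), for illustration only
obs = [10.0, 40.0, 120.0, 80.0, 30.0]
sim = [12.0, 35.0, 110.0, 90.0, 28.0]
print(nash_sutcliffe(obs, sim), pearson_r(obs, sim))
```

An efficiency of 1 indicates a perfect fit, while a value of 0 means the simulation is no better than the mean of the observations.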

3 Methods and results

The European Precipitation Index based on simulated Climatology (EPIC) is an indicator to monitor the European domain for upcoming severe storms possibly leading to flash floods (Alfieri et al., 2011b). EPIC is defined as:

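$$\mathrm{EPIC}(t) = \max_{d_i}\left(\frac{UP_{d_i}(t)}{\frac{1}{N}\sum_{y_i=1}^{N}\max\left(UP_{d_i}\right)_{y_i}}\right), \qquad d_i \in \{6, 12, 24\}\ \mathrm{h} \qquad (1)$$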

where UPdi is the upstream cumulated precipitation, that is the double summation of precipitation depth over the upstream area and over a certain duration di preceding the considered time t. Operationally, UP is calculated for each time step t of the forecasting range and then rescaled by the corresponding mean of the annual maxima derived from a consistent climatology of N years, for the same point and rainfall duration. EPIC is run once a day from the latest COSMO-LEPS forecasts. It is calculated for each pixel of the river network at 1 km grid resolution within the COSMO-LEPS domain, resulting in more than one million points.
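As a rough illustration of this definition, the following Python sketch computes EPIC at a single river pixel and time step. It assumes that the upstream cumulated precipitation and the annual maxima from the climatology are already available for each duration; the function names and all numbers are hypothetical.

```python
def mean_annual_max(annual_maxima):
    """Mean of the annual maxima of upstream precipitation over the N climatology years."""
    return sum(annual_maxima) / len(annual_maxima)


def epic(up_by_duration, annual_maxima_by_duration):
    """EPIC at one point and time step.

    For each duration, the upstream cumulated precipitation is rescaled by the
    corresponding mean annual maximum; EPIC is the largest of these ratios.
    """
    return max(
        up / mean_annual_max(annual_maxima_by_duration[duration])
        for duration, up in up_by_duration.items()
    )


# Hypothetical upstream cumulated precipitation (mm) for durations of 6, 12 and 24 h
up_t = {6: 30.0, 12: 66.0, 24: 70.0}
# Hypothetical annual maxima (mm) from a very short 3 year climatology
climatology = {
    6: [35.0, 45.0, 40.0],
    12: [50.0, 60.0, 70.0],
    24: [70.0, 90.0, 80.0],
}
value = epic(up_t, climatology)
print(value)
```

In this made-up case the 12 h accumulation gives the largest ratio, so it determines the index at this time step.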

3.1 Validation

Assessing the performance of EPIC in operational flash flood detection is a non-trivial matter, as its values include errors due to (1) incorrect weather predictions, and (2) using EPIC as a proxy estimation for normalized discharge at each point. However, in an early warning system based on numerical weather predictions, the error due to incorrect input cannot be eliminated if the aim is to detect hazardous events before they take place. In the following, a validation experiment is presented, based on a comparison between EPIC and normalized discharge derived from the same meteorological input data. Variables are calculated for a number of points in the modelled river network of the Gard River (see the circles in Figure 1), where simulated discharge is calculated for a 30 year period through a calibrated hydrological model, as described in Section 'Testbed and hydrological model'. A series of 30 maps of annual maxima is then extracted from the simulated discharge climatology and their mean is derived accordingly. Discharge maps Q for each time step t are then normalized by their mean of the annual maxima, according to the relation:

$$K_Q(t) = \frac{Q(t)}{\frac{1}{N}\sum_{y_i=1}^{N}\max\left(Q\right)_{y_i}} \qquad (2)$$

where N and yi are defined as in Equation (1).

The same meteorological climatology derived with the COSMO model at 0.09° grid resolution and 3 h time steps is used to calculate cumulated upstream precipitation maps for durations of 3, 6, 12 and 24 h, which are then normalized by their corresponding mean of the annual maxima. The resulting coefficients are referred to as KP(di) and correspond to the ratio in brackets in Equation (1), for each duration di. These are typical durations of intense storms leading to flash floods in catchments with a size up to about 2000 km2 (Reed et al., 2007; Gaume et al., 2009). Unlike in Equation (1), the 3 h duration, which is also the temporal resolution of the input data, is tested at this stage as well.

Figure 2(a) compares KQ to KP in Anduze. Variables are calculated for each 3 h time step in the 30 year dataset, resulting in 87 657 data pairs for each rainfall duration and each point in the selected sample. One can note that the highest (normalized) discharges in Anduze are mostly produced by storms of duration 12 (x symbol) and 24 (+ symbol) hours. In Figure 2(b), the maximum KP is selected among the four durations (KPmax), as is done in operational EPIC (see Equation (1)). As KPs do not account for the hydrological processes which turn precipitation into runoff (e.g., snow accumulation and melting, initial wetness conditions in the catchment, among others), data pairs in Figure 2(b) are widely scattered for low flow conditions. However, extreme flows (i.e., right side of the figure) in small-size catchments are mostly driven by short-lived storms with high rainfall rates, which result in a stronger linear correlation between KQ and KPmax. This property is of special interest in flood early warning, where the aim is to detect upcoming extreme events, rather than to estimate their discharge values quantitatively. To this purpose, a subset of the KQ–KPmax data pairs was selected and is shown in Figure 2(c), considering only peak flows above a fixed threshold KQ,T = 0.3, which is representative of high flow conditions following significant rainfall events. For the modelled river network this corresponds to discharge percentiles ranging between 98.4 and 99.6% of their climatology. For example, the simulated normalized climatology in Anduze is shown in Figure 3, where 207 peaks over threshold have been derived. KQ–KPmax data pairs for these peaks show a well-defined linear trend in Figure 2(c), with a coefficient of determination R2 = 0.93 passing the t-test at the 0.05% significance level. A similar comparison is shown in Figure 2(d), where the return period T of each peak is estimated for the two variables, assuming a Gumbel extreme value distribution for the annual maxima of KQ and of KPmax.
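The peaks-over-threshold comparison can be sketched in Python as follows. The text does not specify how each discharge peak is paired with a KPmax value, so this sketch assumes, purely for illustration, that the peak KQ and the peak KPmax are both taken within each contiguous run of time steps with KQ above the threshold. The function names and the short series are hypothetical.

```python
def peaks_over_threshold(kq, kpmax, threshold=0.3):
    """Return one (peak KQ, peak KPmax) pair per contiguous run with KQ > threshold.

    Assumption: both peaks are searched within the same exceedance run.
    """
    pairs, run = [], []
    for q, p in zip(kq, kpmax):
        if q > threshold:
            run.append((q, p))
        elif run:
            pairs.append((max(r[0] for r in run), max(r[1] for r in run)))
            run = []
    if run:  # series ends while still above the threshold
        pairs.append((max(r[0] for r in run), max(r[1] for r in run)))
    return pairs


def r_squared(pairs):
    """Coefficient of determination of the linear regression between the two peaks."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy ** 2 / (sxx * syy)


# Hypothetical 3-hourly normalized series, for illustration only
kq = [0.1, 0.4, 0.9, 0.5, 0.2, 0.1, 0.35, 1.6, 0.7, 0.1, 0.5]
kpmax = [0.0, 0.6, 1.0, 0.2, 0.0, 0.0, 0.5, 1.5, 0.3, 0.0, 0.6]
pairs = peaks_over_threshold(kq, kpmax)
print(pairs, r_squared(pairs))
```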

Figure 2.

Scatter plots comparing KQ and KP in Anduze over the 30 year simulation period: in panel (a) KQ versus KP, in (b) only the maximum KP is plotted among the four rainfall durations, in (c) KPmax versus KQ for peaks over threshold and in (d) the corresponding return periods.

Figure 3.

Time series of normalized discharge KQ in Anduze, over the 30 year simulation period. Selected peaks over threshold are shown with circles, while the threshold value is plotted with a horizontal dotted line.

Further insight is given in Figure 4, where simulated KQ and KPmax in Anduze are compared over 2 months in 1976, including two extreme events. The figure shows that the two variables often differ over low-flow periods. KPmax has no memory of past events and drops to zero as soon as no rainfall is forecast. On the other hand, for high flows and especially for extreme events, KQ and KPmax have a similar response and peak magnitude (KQ ≈ 2 for the two most severe events shown in Figure 4). Also, one can see from the figure that the rising and falling limbs of the two variables can vary substantially during high flow events. In fact, KPmax is not constrained by the mass balance, so that the total area below its curve is not necessarily the same as that of the normalized discharge KQ. This justifies the approach of comparing the two variables only for selected peaks over threshold in the context of an early warning system, where detecting threshold exceedance is the ultimate result. Interestingly, two flood events were actually observed in Anduze on the same days as the forecast ones, with peak flow reaching Q1 = 450 m3 s−1 on 29 August 1976 and Q2 = 910 m3 s−1 on 12 September 1976 (for further details see Bouvier et al., 2004; Marchandise, 2007), which correspond to normalized discharges of KQ1 = 0.8 and KQ2 = 1.6, respectively. This is a noteworthy result, considering the space-time resolution of the forecast data and the limited availability of meteorological data to initialize the weather prediction model as early as 1976.

Figure 4.

Simulated KQ (dashed black) and KPmax (solid grey) for 2 months in 1976. The threshold KQ value is plotted with a horizontal dotted line.

An assessment of the linear correlation between peaks over threshold of KQ and KPmax was carried out for 70 evenly distributed points in the modelled river network of the Gard (see Figure 1), with upstream area ranging between 6 and 1620 km2. Results are summarized in Figure 5, which shows the Root Mean Square Factor (RMSF), the coefficient of determination (R2) and the slope of the regression line, plotted against the upstream area of each point. The RMSF is calculated with the following formulation:

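$$\mathrm{RMSF} = \exp\left\{\left[\frac{1}{N_p}\sum_{i=1}^{N_p}\ln^2\left(\frac{K_{P\max,i}}{K_{Q,i}}\right)\right]^{1/2}\right\} \qquad (3)$$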

considering all the Np peaks over threshold for each location. Whereas the RMS Error can be interpreted as giving a scale to the additive error, i.e. KPmax = KQ ± RMS, the RMSF can be interpreted as giving a scale to the multiplicative error, i.e. KPmax = KQ ×/÷ RMSF. The RMSF also gives more weight to the highest values, which is important when extreme values are of highest interest as in this method.
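A minimal Python sketch of this score is given below. It assumes the commonly used definition of the RMSF as the exponential of the root mean square of the logarithmic ratios between the two variables; the function name and the sample values are hypothetical.

```python
import math


def rmsf(kq_peaks, kpmax_peaks):
    """Root Mean Square Factor between paired peaks of KQ and KPmax.

    Assumed definition: exp of the RMS of ln(KPmax / KQ) over all peaks.
    Both series must be strictly positive.
    """
    n = len(kq_peaks)
    mean_sq_log = sum(
        math.log(p / q) ** 2 for q, p in zip(kq_peaks, kpmax_peaks)
    ) / n
    return math.exp(math.sqrt(mean_sq_log))


# Hypothetical peaks: KPmax is twice KQ for one peak and half KQ for the other
print(rmsf([1.0, 1.0], [2.0, 0.5]))
```

With this definition, a value of 1 means perfect agreement, and overestimation and underestimation by the same factor are penalized equally.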

Figure 5.

Root Mean Square Factor (RMSF), coefficient of determination (R2) and slope of the linear regressions between KQ and KPmax, plotted against the upstream area of the selected points in the Gard modelled river network.

In the case of perfect matching between the two sets of variables, the three scores shown in Figure 5 would all converge to the horizontal line with y = 1. The main condition for using KPmax as a proxy estimator for peak discharge is its ability to explain the variability of the peak discharge, which is best measured by the coefficient of determination. Indeed, as the system is aimed at detecting values with low probability of occurrence, additive or multiplicative constants would not affect its detection skill. Nevertheless, Figure 5 shows good values for all three scores, particularly for points with the largest upstream area. Also, one should consider the error induced by using rainfall fields with a spatial resolution (i.e., 100 km2 for the considered dataset) coarser than the catchment size, which results in an underestimation of the peak magnitude (see e.g., Sangati and Borga, 2009), particularly for short and intense events. In the operational runs of EPIC, results are calculated and shown for the whole river network, but alert points are created only if their upstream area is larger than the rainfall spatial resolution, which is about 50 km2 for the current COSMO-LEPS forecasts.

3.2 Probabilistic EPIC and visualization of results

In operational daily runs, EPIC is calculated for each member of COSMO-LEPS forecasts, resulting in an ensemble of 16 possible temporal evolutions on each grid point over the forecast range (e.g., see Figure 6(a)). Reference values are EPIC = 0, when no rainfall is forecast, and EPIC = 1, when the cumulated upstream precipitation at a point equals the corresponding mean of the annual maxima for at least one of the considered rainfall durations. As flood warning thresholds are often set for specific return periods, a more intuitive representation is to estimate the return period of EPIC and show its values for the selected events. The adopted approach is described as follows.

  • At each forecast, a preliminary empirical set of rules selects the most downstream points with EPIC > 1 for at least 4 members out of 16 (25% probability) and with EPIC > 1.5 for at least 3 members. An additional criterion is set on the upstream area of points, which is bounded to the range 50–5000 km2, to focus the analysis on flash flood prone catchments. As mentioned in Section 'Validation', the lower bound is set by the spatial resolution of the weather prediction data.
  • For the selected set of points, a two parameter gamma distribution is fitted to EPIC ensembles at each time step of the forecast horizon (e.g., see Figure 6(b)). The probability density function (pdf) of a gamma-distributed random variable x is defined as:
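    $$f(x) = \frac{x^{\alpha-1}\,e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad x > 0;\ \alpha, \beta > 0 \qquad (4)$$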
    where α is the shape parameter, β the scale parameter, and Γ(·) denotes the gamma function. L-moments estimators are used to fit empirical values as described in Hosking (1990). A similar approach is used and described by Alfieri et al. (2012) for fitting ensemble streamflow predictions derived by COSMO-LEPS weather forecasts. Results by Alfieri et al. (2012) show that fitting raw ensembles with gamma distributions leads to improvements both in the quantitative streamflow estimation and particularly in the threshold exceedance analysis.
  • A Gumbel extreme value distribution is hypothesized for the annual maxima of KP for durations of 6, 12 and 24 h, derived from the 20 year climatology. Its cumulative distribution function (cdf) takes the form:
    $$F(x) = \exp\left\{-\exp\left[-\frac{x-\xi}{\alpha}\right]\right\} \qquad (5)$$
    where α is the scale parameter and ξ the location parameter. The two parameters of each distribution are estimated by equating the first two sample L-moments (λ1, λ2) with those of the analytical distribution:
    $$\lambda_1 = \xi + \gamma\,\alpha \qquad (6)$$
    $$\lambda_2 = \alpha \ln 2 \qquad (7)$$
    where γ ≈ 0.5772 is Euler's constant. Return periods T of KP values are estimated from the fitted analytical distributions through the relation T = 1/(1 − F(KP)).
  • The initial set of reporting points is regrouped into three alert classes. The medium alert class includes all points having a maximum probability larger than 15% of exceeding the 2 year return period. Similarly, the high and severe alert classes include all points having a maximum probability larger than 15% of exceeding the 5 and 20 year return periods, respectively.
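The steps listed above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the operational implementation: the gamma shape is found by bisection on the analytical L-CV of the gamma distribution rather than with the approximations in Hosking (1990), the selection of the most downstream points is omitted, and the EPIC thresholds for the three return periods are hypothetical inputs. All function names and numbers are invented for the example.

```python
import math


def sample_l_moments(values):
    """First two sample L-moments (l1, l2) from probability-weighted moments."""
    x = sorted(values)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * xj for j, xj in enumerate(x)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def fit_gamma_lmom(values):
    """Gamma (shape, scale) matching l1 and l2; needs a positive, non-constant sample."""
    l1, l2 = sample_l_moments(values)
    target = l2 / l1  # sample L-CV

    def l_cv(shape):
        # Analytical gamma L-CV: Gamma(shape + 1/2) / (sqrt(pi) * Gamma(shape + 1))
        log_ratio = math.lgamma(shape + 0.5) - math.lgamma(shape + 1.0)
        return math.exp(log_ratio) / math.sqrt(math.pi)

    lo, hi = 1e-3, 1e4  # assumed bracket; the L-CV decreases as the shape grows
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if l_cv(mid) > target:
            lo = mid
        else:
            hi = mid
    shape = math.sqrt(lo * hi)
    return shape, l1 / shape


def gamma_cdf(x, shape, scale):
    """Gamma cdf (regularized lower incomplete gamma function) by its power series."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = total = 1.0 / shape
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def is_candidate(members, area_km2):
    """Preliminary rule: >= 4 members above 1, >= 3 above 1.5, area within 50-5000 km2."""
    above_1 = sum(m > 1.0 for m in members)
    above_1_5 = sum(m > 1.5 for m in members)
    return above_1 >= 4 and above_1_5 >= 3 and 50.0 <= area_km2 <= 5000.0


def max_exceedance_probability(ensembles_by_step, threshold):
    """Largest P(EPIC > threshold) over the forecast steps, from the gamma fits."""
    best = 0.0
    for members in ensembles_by_step:
        shape, scale = fit_gamma_lmom(members)
        best = max(best, 1.0 - gamma_cdf(threshold, shape, scale))
    return best


def alert_class(max_prob, min_prob=0.15):
    """Highest class whose return period is exceeded with probability above min_prob."""
    label = None
    for name, period in (("medium", 2), ("high", 5), ("severe", 20)):
        if max_prob[period] > min_prob:
            label = name
    return label


# Hypothetical 16-member EPIC ensembles at two forecast steps
steps = [[0.2 + 0.02 * i for i in range(16)], [0.7 + 0.07 * i for i in range(16)]]
# Hypothetical EPIC values corresponding to return periods of 2, 5 and 20 years
thresholds = {2: 0.95, 5: 1.4, 20: 2.0}
max_prob = {T: max_exceedance_probability(steps, x) for T, x in thresholds.items()}
print(is_candidate(steps[1], area_km2=300.0), alert_class(max_prob))
```

Fitting a distribution to the 16 members gives a smooth exceedance probability, whereas counting members above a threshold can only yield multiples of 1/16.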

EPIC is calculated operationally and results are visualized in a web-interface and analysed on a daily basis for detecting small-scale extreme events over Europe. Products have been designed in analogy to those developed for the European Flood Alert System (Thielen et al., 2009), as they are specifically targeted to explore and visualize probabilistic forecasts. Products shown include a map of the maximum probability of exceeding the mean annual maximum of EPIC over the forecast range (i.e., max[P(EPIC > 1)]), where different probabilities are indicated with colour shades. Also, the three layers of alert points defined above are shown with triangles of size proportional to the probability that EPIC exceeds the corresponding alert class. An example of the resulting display on the web interface is shown in Figure 7, for EPIC forecasts on 28 February 2011 1200 UTC in the Marche Region (Central Italy). At each reporting point, a time-plot displays with colour shadings the forecast return period of EPIC for a probability range of 5–95% (e.g., see Figure 6(c)). One can see in Figure 6(c) that return period time-plots put the focus on extreme conditions, while flows below the mean annual peak tend to converge to the 1 year return period. Conversely, in such plots the uncertainty spread increases for extreme values, which demonstrates the usefulness of probabilistic information but also shows the difficulty of providing accurate alerts in operational warning systems. As an example, for the event peak in Figure 6 the coefficient of variation of the predicted EPIC ensemble is CVEPIC = 0.17, while the corresponding one calculated on the ensemble of return periods, CVT = 0.87, is about five times larger.
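The widening of the uncertainty spread when EPIC values are converted to return periods can be reproduced with a short Python sketch. It uses the standard L-moment estimators of the Gumbel distribution and the relation T = 1/(1 − F); the Gumbel parameters and the ensemble members below are hypothetical and are not those of the event in Figure 6.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649


def sample_l_moments(values):
    """First two sample L-moments (l1, l2) from probability-weighted moments."""
    x = sorted(values)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * xj for j, xj in enumerate(x)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def fit_gumbel_lmom(annual_maxima):
    """Gumbel (location xi, scale alpha) from l1 = xi + gamma * alpha, l2 = alpha * ln 2."""
    l1, l2 = sample_l_moments(annual_maxima)
    alpha = l2 / math.log(2.0)
    xi = l1 - EULER_GAMMA * alpha
    return xi, alpha


def return_period(value, xi, alpha):
    """Return period in years, T = 1 / (1 - F); fails if F rounds to exactly 1."""
    cdf = math.exp(-math.exp(-(value - xi) / alpha))
    return 1.0 / (1.0 - cdf)


def coefficient_of_variation(values):
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical Gumbel parameters of the normalized annual maxima
xi, alpha = 0.85, 0.30
# Hypothetical EPIC ensemble members at the event peak
epic_members = [1.4, 1.6, 1.8, 2.0, 2.2]
periods = [return_period(v, xi, alpha) for v in epic_members]
cv_epic = coefficient_of_variation(epic_members)
cv_t = coefficient_of_variation(periods)
print(cv_epic, cv_t)
```

Because the return period grows roughly exponentially with the value in the upper tail, a moderate spread in EPIC translates into a much larger spread in T, while values well below the location parameter all map to T close to 1 year.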

Figure 6.

Probabilistic EPIC forecast on 28 February 2011 1200 UTC at a reporting point in Central Italy. (a) Raw ensemble, (b) gamma fit and (c) corresponding return period. In panel (c), the maximum probabilities of exceeding the three alert classes are shown on the right. The ensemble mean and the interquartile range are also shown with solid and dashed black lines in panels (b) and (c).

Figure 7.

Maximum probability (%) of exceeding EPIC mean annual maxima in the Marche Region (Central Italy), forecast on 28 February 2011 1200 UTC (see legend for colour shadings). Reporting points on severe alert are shown with triangles, with the corresponding probability of threshold exceedance alongside. A circle indicates the point whose forecast is shown in Figure 6.

3.3 Performance in early detection of extreme storms and flash floods

By its definition EPIC is not designed to detect all types of floods, but rather those in small size catchments (with a lower boundary depending on the resolution of the NWP) induced by short and intense rainfall events. The collection of quantitative discharge data and of flood thresholds in small rivers throughout Europe, for validation purposes, is a huge and painstaking task. Flash floods usually occur in ungauged catchments, where the only source of information is post-event descriptive reports. Besides, even when gauging stations are available they are sometimes damaged and made inoperative by the intensity of the flood flow. Hence, performance of the proposed early warning system in operational monitoring is assessed through a qualitative approach, by selecting the strongest recorded signals of upcoming severe events from EPIC and verifying the actual occurrence of flooding events in the areas where they were forecast.

EPIC was calculated over Europe from 1 December 2009 and results were evaluated until 30 September 2011. In total, 45 340 reporting points with at least 15% probability of exceeding the 2 year return period were automatically selected by the system (see Figure 8, black dots). The cdfs of the probability of exceeding return periods of 2, 5 and 20 years for the full set of reporting points (i.e., only for the event peak) are shown in the three panels of Figure 9. They are shown with grey shades for each forecast lead time between 6 and 132 h. Contour lines are also plotted at selected quantiles. In Figure 9, one can see that the probability of exceeding the three warning thresholds is roughly constant over the lead time range for most quantiles of their distributions. For example, in Figure 9(b), half of the points of each class (i.e., quantile 0.5) have a probability of about 12% of exceeding the 5 year return period, for most lead times. On the other hand, probabilities of exceedance close to 100% are detected only for the shortest lead times, especially for the two highest thresholds. In detail, the two peaks in Figure 9(c) for lead times of 24 and 84 h are mostly due to points which correctly detected the devastating flash floods that occurred in Andalusia, Spain, on 22 and 24 December 2009. For the first event, five reporting points detected a probability larger than 50% of exceeding the severe threshold (i.e., 20 year return period) in the forecasts of 18 December 2009. For the second event, eight points detected probabilities larger than 80% of exceeding the severe threshold, 24 h before the storm peak.

Figure 8.

EPIC reporting points in the period December 2009 to September 2011 (black dots) and storm high alerts (circles). COSMO-LEPS spatial domain is also indicated with a grey shaded area.

Figure 9.

(a) Empirical cdf of the probability of exceeding the medium, (b) high, and (c) severe threshold for all reporting points, versus their forecast lead time (in grey shades). Relevant quantiles are contoured with solid lines.

In order to evaluate the system's skill against observed events, a warning threshold and a minimum probability need to be defined for classifying a detected reporting point as a potential alert. Similarly to EFAS, the high warning level, corresponding to a 5 year return period, was chosen as the threshold for defining an alert. From the set of reporting points, those with a probability larger than 60% of exceeding the high threshold were then selected. This produced a subset of 363 points (i.e., circles in Figure 8) belonging to 57 different ensemble forecasts. Hereinafter, these are referred to as ‘storm high alerts’. Points were intuitively clustered into 50 different events according to criteria of proximity and timing of the peak value of EPIC. Resulting events include a range between 1 and 85 reporting points, spotted by up to 4 consecutive forecasts.
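The selection and grouping of storm high alerts can be illustrated with the Python sketch below. The clustering in this work was done by judgement, so the single-linkage rule and the distance and timing tolerances used here (100 km and 24 h) are assumptions made only for the example; the field names and the reporting points are also hypothetical.

```python
import math


def storm_high_alerts(points, min_prob=0.60):
    """Keep points whose probability of exceeding the 5 year level is above min_prob."""
    return [p for p in points if p["p_high"] > min_prob]


def _linked(a, b, max_km, max_hours):
    """Two alerts are linked if they are close in space and in peak timing."""
    distance = math.hypot(a["x_km"] - b["x_km"], a["y_km"] - b["y_km"])
    return distance <= max_km and abs(a["peak_h"] - b["peak_h"]) <= max_hours


def cluster_events(alerts, max_km=100.0, max_hours=24.0):
    """Group alerts into events by single linkage; returns lists of alert indices."""
    clusters = []
    unvisited = list(range(len(alerts)))
    while unvisited:
        stack = [unvisited.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(i)
            linked = [
                j for j in unvisited
                if _linked(alerts[i], alerts[j], max_km, max_hours)
            ]
            for j in linked:
                unvisited.remove(j)
            stack.extend(linked)
        clusters.append(sorted(members))
    return sorted(clusters)


# Hypothetical reporting points: position (km), time of EPIC peak (h), P(T > 5 years)
points = [
    {"x_km": 0.0, "y_km": 0.0, "peak_h": 10.0, "p_high": 0.90},
    {"x_km": 30.0, "y_km": 0.0, "peak_h": 16.0, "p_high": 0.70},
    {"x_km": 500.0, "y_km": 0.0, "peak_h": 12.0, "p_high": 0.80},
    {"x_km": 10.0, "y_km": 10.0, "peak_h": 12.0, "p_high": 0.40},
    {"x_km": 40.0, "y_km": 20.0, "peak_h": 100.0, "p_high": 0.95},
]
alerts = storm_high_alerts(points)
events = cluster_events(alerts)
print(len(alerts), events)
```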

Figure 10 shows the empirical pdf and cdf of the forecast lead time and the upstream area of the selected subset of storm high alerts. Figure 10(a) shows that the majority of events was forecast in the range 24–30 h before their peak, while the average alert lead time of the subset was 32 h. The upstream area of alert points in the system is confined in the range 50–5000 km2. Figure 10(b) shows that more than 50% of the alerts refer to catchments smaller than 300 km2, where flood events usually develop most rapidly and unexpectedly.

Figure 10.

(a) Frequency histogram and empirical cdf of forecast lead time and (b) upstream area of storm high alerts.

The actual occurrence of the 50 forecast events was verified by searching for news reports on the internet. The main source of information used is the Flooding section of European Media Monitoring (EMM, http://www.emm.jrc.it/). EMM News Brief was developed at the Joint Research Centre of the European Commission. It is a summary of news from around the world in several languages, which is generated automatically by software algorithms. EMM news has been complemented by the Emergency Events Database (EM-DAT, http://www.emdat.be/) of the Centre for Research on the Epidemiology of Disasters (CRED) and by targeted internet searches on national and regional news websites.

Out of 50 events, news reports of rain storms and economic losses due to a combination of floods, flash floods, surface water flooding, debris flow, landslides, hail, lightning, sea waves or wind storms were found in 42 cases. Results are summarized in Tables 1 and 2, which show for each predicted event the maximum probability of exceeding the high alert threshold, the corresponding date of the forecast and the lead time to the predicted event peak. In Table 1 the causes of the economic losses that occurred are indicated in the last column. The magnitude of confirmed events ranges from local storms, causing temporary disruption to transport and human activities (e.g., no. 7, 29, 42), to flash floods affecting large areas, causing massive flooding on downstream rivers (e.g., no. 5, 16, 22, 32). A striking example is that of the flash floods affecting southern Poland and the eastern Czech Republic in the middle of May 2010, which preceded and contributed to the following catastrophic floods in central Europe affecting several thousand people. These were first forecast as extreme events with a lead time of 36 h to the event peak.

Table 1. Storm high alerts detected by EPIC, with reported economic losses

| No. | Max P(T > 5) (%) | Forecast date | LT (h) | Location | Type |
|---|---|---|---|---|---|
| 1 | 99 | 9 December 2009 | 36 | Central Greece | Flood |
| 2 | 62 | 20 December 2009 | 54 | Corse (France) | Flood |
| 3 | 97 | 21 December 2009 | 24 | Andalusia (Spain) | Flood |
| 4 | 84 | 22 December 2009 | 72 | NW Slovenia, Isonzo River (Italy) | Flood |
| 5 | 100 | 23 December 2009 | 24 | Andalusia (Spain) | Flood |
| 6 | 85 | 24 December 2009 | 12 | Magra River (Toscana, Italy) | Flood |
| 7 | 67 | 31 December 2009 | 24 | Corse (France) | RS, WS |
| 8 | 79 | 10 January 2010 | 48 | Portugal | SWF, H |
| 9 | 88 | 15 February 2010 | 24 | Malaga Province (Spain) | Flood |
| 10 | 62 | 5 March 2010 | 18 | Estremadura, Spain | Flood, SWF |
| 11 | 84 | 9 March 2010 | 24 | South Italy | SWF, LS |
| 12 | 71 | 30 March 2010 | 24 | Edinburgh (Scotland, UK) | Flood |
| 13 | 60 | 3 May 2010 | 42 | Midi Pyrénées (France) | RS, WS |
| 14 | 73 | 14 May 2010 | 36 | Marche (Italy) | LS, WS |
| 15 | 68 | 14 May 2010 | 54 | Eastern Austria | SWF, LS, H |
| 16 | 62 | 15 May 2010 | 36 | East Czech Rep., South Poland | Flood |
| 17 | 68 | 8 June 2010 | 30 | Lot Department (France) | SWF |
| 18 | 86 | 9 June 2010 | 30 | Asturias, Galicia (Spain) | Flood |
| 19 | 63 | 15 June 2010 | 12 | Piemonte (Italy) | LS |
| 20 | 68 | 15 June 2010 | 36 | NE Spain | Flood, SWF |
| 21 | 70 | 20 June 2010 | 24 | Emilia Romagna (Italy) | SWF |
| 22 | 60 | 21 June 2010 | 30 | Bosnia | Flood |
| 23 | 84 | 14 August 2010 | 36 | Paris (France) | SWF |
| 24 | 62 | 15 August 2010 | 30 | Belgium | SWF |
| 25 | 66 | 17 August 2010 | 24 | Denmark | SWF |
| 26 | 94 | 17 September 2010 | 24 | Slovenia | Flood |
| 27 | 72 | 2 October 2010 | 18 | Galicia (Spain) | SWF |
| 28 | 83 | 9 October 2010 | 42 | Languedoc-Roussillon (France) | Flood |
| 29 | 62 | 11 October 2010 | 36 | Ibiza (Spain) | SWF, LS, WS |
| 30 | 68 | 18 October 2010 | 24 | Calabria (Italy) | Flood |
| 31 | 69 | 27 October 2010 | 24 | North Greece | Flood |
| 32 | 98 | 30 October 2010 | 48 | Veneto (Italy) | Flood |
| 33 | 63 | 31 October 2010 | 12 | SE France | SWF |
| 34 | 74 | 31 October 2010 | 24 | Piemonte (Italy) | RS, LS |
| 35 | 67 | 8 November 2010 | 24 | Campania (Italy) | Flood |
| 36 | 61 | 6 December 2010 | 30 | Andalusia (Spain) | Flood |
| 37 | 63 | 1 February 2011 | 12 | Sardegna (Italy) | LS |
| 38 | 86 | 3 February 2011 | 12 | Athens, Peloponnese (Greece) | SWF |
| 39 | 60 | 18 February 2011 | 24 | Sicilia (Italy) | Flood |
| 40 | 92 | 28 February 2011 | 42 | Marche (Italy) | Flood |
| 41 | 62 | 19 July 2011 | 30 | Bayern (Germany) | SWF, WS |
| 42 | 75 | 17 September 2011 | 30 | Ticino (Switzerland) | SWF |

H = hail, LS = landslide, RS = rain storm, SWF = surface water flooding, WS = wind storm.

Table 2. Storm high alerts detected by EPIC with no confirmed news on follow-up events

| No. | Max P(T > 5) (%) | Forecast date | LT (h) | Location |
|---|---|---|---|---|
| 43 | 95 | 5 December 2009 | 42 | Norway |
| 44 | 84 | 25 December 2009 | 30 | Estonia, Russia |
| 45 | 100 | 10 February 2010 | 6 | Greece |
| 46 | 81 | 30 March 2010 | 36 | Norway |
| 47 | 63 | 14 May 2010 | 42 | North Croatia |
| 48 | 62 | 25 December 2010 | 30 | West Slovakia |
| 49 | 70 | 7 November 2010 | 54 | South Albania |
| 50 | 73 | 18 September 2011 | 30 | Mur River (Austria) |

Although EPIC is intended to give an overview of areas potentially at risk of extreme precipitation events, on some occasions the forecast location of the worst affected areas was very accurate, identifying flash floods in catchments as small as a few hundred square kilometres. Relevant examples are the flash floods in the River Esk (no. 12) in Scotland, UK (∼300 km²) on 31 March 2010, in the Magra River (no. 6) in Liguria, Italy (∼1600 km²) on 25 December 2009, and in the Ete, Chienti, Tronto and Aso rivers (no. 40) in the Marche Region, Italy (ranging between ∼300 and 1300 km²) on 2 March 2011 (see also Figures 6 and 7).

4 Discussion and concluding remarks

This work describes the theoretical basis and the operational implementation of EPIC, a newly proposed indicator for short-lived extreme precipitation events potentially inducing flash floods in small European catchments. The analysis in Section 'Validation' shows that EPIC is an accurate proxy estimator of the normalized discharge (i.e., discharge rescaled by the corresponding mean of the annual maxima) in high flow conditions and is thus a suitable indicator to consider in the context of flood early warning. EPIC provides probabilistic alerts for upcoming extreme rain-storms over a 132 h forecast horizon, for small-size catchments within the spatial domain covered by COSMO-LEPS weather predictions. The long-term reference climatology, derived from the COSMO reforecast dataset, is particularly useful in flash flood prediction, as these events often take place in small watersheds where few or no measurements are available. In addition, the operational weather predictions used are coherent with the climatological values (i.e., they have the same space-time resolution and result from the same circulation model). As a result, no additional post-processing or bias correction is necessary (e.g., see Reed et al., 2007; Hopson and Webster, 2010), as warning thresholds are consistent with real-time forecasts.

A noteworthy novelty proposed with this system concerns the visualization of probabilistic forecasts through return period time-plots, as shown in Figure 6(c). The main strengths of this type of representation are:

  • results derived from a dimensionless indicator are translated into intuitive quantities, which are overlaid on the alert thresholds in the same graph. Despite the considerable amount of information carried in each plot and the ongoing debates on the communication of probabilistic results (e.g., Pappenberger et al., 2011), the graphs are simple to interpret; and,
  • return periods of EPIC are derived through a non-linear transformation of the initial variable. The resulting representation is optimized to display results in extreme conditions. It shows that a relatively narrow uncertainty range in the linear space (Figure 6(b)) is translated into a wider spread in the estimated return period of the event (Figure 6(c)), particularly for extreme values.
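
The second point can be illustrated numerically. The sketch below assumes, purely for illustration, that annual maxima of the indicator follow a Gumbel distribution; the location and scale values and the function names are hypothetical and are not taken from the EPIC climatology. It maps intervals of equal width in the linear space to return periods and prints the resulting spread for increasingly extreme values.

```python
import math

# Hypothetical Gumbel parameters for annual maxima of the indicator
# (illustrative assumption, not values from the EPIC climatology).
LOC = 0.9
SCALE = 0.25


def gumbel_return_period(x, loc=LOC, scale=SCALE):
    """Return period in years of value x under a Gumbel law for annual maxima."""
    cdf = math.exp(-math.exp(-(x - loc) / scale))
    return 1.0 / (1.0 - cdf)


def return_period_spread(centre, half_width):
    """Return periods at the lower and upper end of a linear-space interval."""
    low = gumbel_return_period(centre - half_width)
    high = gumbel_return_period(centre + half_width)
    return low, high


# Same uncertainty range (+/- 0.1) in the linear space, centred on
# progressively more extreme values of the indicator.
spreads = []
for centre in (1.0, 1.5, 2.0):
    low, high = return_period_spread(centre, 0.1)
    spreads.append(high - low)
    print(f"indicator {centre:.1f} +/- 0.1 -> return period {low:.1f} to {high:.1f} years")
```

Under these assumptions the width of the return-period interval grows rapidly with the central value, even though the width in the linear space is fixed.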

In addition, warnings detected by EPIC are meant to describe the hazard conditions in a certain time span: hence they are not uniquely related to the magnitude of the economic losses associated with the events. In particular:

  • EPIC accounts for the hazard of extreme event occurrence but it does not include vulnerability maps, which give information on urban settlements, transport networks and flood protection measures.
  • EPIC is calculated on the full COSMO-LEPS domain; yet, regions prone to flash flooding are mostly located across mountain ranges with steep slopes and fast runoff response following rainfall events. Lowland areas also suffer from extreme rainfall events causing surface water flooding, which can affect road networks, building basements and crop fields, among others. However, in such cases the extent of the losses (particularly in terms of victims) is on average lower, as the decreased flow velocity and debris content reduce the destructive power of the flood flow. The flood wave also rises more slowly than in flash floods, which increases the warning lead time and gives the affected population more time to find shelter.

The unconfirmed events shown in Table 2 were analysed more closely to identify the causes of the false alarms and possibilities for improving future system performance. Three main reasons were found, which are described in the following.

  1. Errors in event severity. In two cases (events nos. 47 and 50 in Table 2) rain storms occurred with the predicted location and timing, though no significant disruption or economic loss was reported in post-event news. This error type is mainly influenced by the reliability of weather predictions and by the appropriateness of the chosen alert threshold (i.e., 5 year return period) as an indicator of hazardous events. However, false alarms are intrinsic to a probabilistic system like the one proposed and are likely to occur whenever the selected threshold probabilities are lower than 100%. Indeed, the two reported cases had maximum probabilities of high threshold exceedance of 63 and 73%.
  2. Location errors. For three events (nos. 45, 48, 49) floods were observed in catchments about 200–300 km from the predicted locations, though with the correct peak timing. For this error type the same considerations hold as for the previous category, as performance mostly depends on the skill of quantitative precipitation forecasts.
  3. Boundary errors. Reported interpolation issues near the boundary of the COSMO-LEPS domain often result in overestimated precipitation rates, which led to the prediction of three unreported events in Norway, Russia and Estonia (nos. 43, 44, 46). These occurrences can be easily recognized when points are overlaid on the extent of the COSMO-LEPS domain (see Figure 8) and removed from the group of high alerts.
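
The link between alert probabilities and false alarms noted in item 1 can be made concrete with a small sketch. It assumes a method-of-moments gamma fit to the ensemble values at a single forecast time step; the fitting procedure, the sample ensemble, the threshold value and all function names are hypothetical choices made for illustration, not the operational implementation.

```python
import math
from statistics import mean, pvariance


def regularized_lower_gamma(a, x, tol=1e-12, max_terms=10_000):
    """Regularized lower incomplete gamma P(a, x), computed by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def exceedance_probability(members, threshold):
    """P(X > threshold) for a gamma law fitted to the members by moments."""
    m = mean(members)
    v = pvariance(members)
    shape = m * m / v   # gamma shape from mean and variance
    scale = v / m       # gamma scale from mean and variance
    return 1.0 - regularized_lower_gamma(shape, threshold / scale)


# Hypothetical ensemble of indicator values at one forecast time step.
ensemble = [0.8, 1.1, 0.9, 1.4, 1.6, 1.2, 0.7, 1.9,
            1.3, 1.0, 1.5, 2.1, 1.1, 0.9, 1.7, 1.2]

# Hypothetical indicator value standing in for the 5 year return period.
HIGH_ALERT_THRESHOLD = 1.5

p_exceed = exceedance_probability(ensemble, HIGH_ALERT_THRESHOLD)
print(f"P(exceeding high alert threshold) = {p_exceed:.0%}")
```

Whatever probability is printed, its complement is the chance, under the fitted distribution, that the threshold is not exceeded, which is why alerts issued at probabilities below 100% carry an expected share of false alarms.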

In summary, the results of this work show that EPIC is a useful tool to aid the detection of extreme rain-storms and flash floods, with probability of detection reaching 90% if alerts located at the boundary of the forecast window are excluded. By its definition, EPIC is designed to detect only specific types of event within a well-defined range of event duration and catchment size. Therefore, it is intended to be used as a complementary tool to support the detection of extreme rain-storms and flash floods. Results show that using such a simple framework, based only on accumulated upstream precipitation, is often justified by the large uncertainty spread of the ensemble weather predictions, which outweighs that of other hydrological processes not considered.
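
The detection figure can be checked against the counts in Tables 1 and 2 and the three false-alarm categories. The short calculation below is a sketch: reading the 90% figure as confirmed alerts divided by the alerts that remain once boundary points are removed is an assumption about how it was derived, and the variable names are illustrative.

```python
# Counts taken from Tables 1 and 2 and from the false-alarm breakdown.
total_alerts = 50
confirmed = 42
unconfirmed = {"severity": 2, "location": 3, "boundary": 3}

# The confirmed and unconfirmed alerts should account for every event.
assert confirmed + sum(unconfirmed.values()) == total_alerts


def confirmed_fraction(hits, alerts):
    """Share of issued alerts that were confirmed by reported events."""
    return hits / alerts


all_alerts = confirmed_fraction(confirmed, total_alerts)
without_boundary = confirmed_fraction(
    confirmed, total_alerts - unconfirmed["boundary"]
)

print(f"all 50 alerts:             {all_alerts:.1%}")        # 84.0%
print(f"boundary alerts excluded:  {without_boundary:.1%}")  # 89.4%
```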


This work has been carried out within the IMPRINTS project (FP7-ENV-2008-1-226555). Davide Muraro and Milan Kalas are gratefully acknowledged for their support in the operational visualization of the daily forecasts on the web interface.