Probabilistic rainfall thresholds for landslide occurrence using a Bayesian approach



[1] Various methods have been proposed in the literature to predict the rainfall conditions that are likely to trigger landslides in a given area. Most of these methods, however, only consider the rainfall events that resulted in landslides and provide deterministic thresholds with a single possible output (landslide or no-landslide) for a given input (rainfall conditions). Such a deterministic view is not always suited to landslides. Slope stability, in fact, is not ruled by rainfall alone and failure conditions are commonly achieved with a combination of numerous relevant factors. When different outputs (landslide or no-landslide) can be obtained for the same input, a probabilistic approach is preferable. In this work we propose a new method for evaluating rainfall thresholds based on Bayesian probability. The method is simple, statistically rigorous, and returns a value of landslide probability (from 0 to 1) for each combination of the selected rainfall variables. The proposed approach was applied to the Emilia-Romagna Region of Italy, taking advantage of the historical landslide archive, which includes more than 4000 events for which the date of occurrence is known with daily accuracy. The results show that landsliding in the study area is strongly related to rainfall event parameters (duration, intensity, total rainfall) while antecedent rainfall seems to be less important. The distribution of landslide probability in the rainfall duration-intensity space shows an abrupt increase at certain duration-intensity values, which indicates a radical change of state of the system and suggests the existence of a real physical threshold.

1. Introduction

[2] Rainfall is the most common cause of landslides. However, the actual failure mechanism is complex and involves a number of factors that influence the hydrologic behavior of the slope, the shear stresses acting in the slope, and the mechanical resistance along the potential slip surface. There is therefore no direct cause-effect relationship between rainfall and slope failure. To analyze this relationship, many studies have developed rainfall thresholds for landslide occurrence using empirical or physical (process-based) models.

[3] Models employing physically based thresholds use spatially variable characteristics (e.g., slope gradient, soil depth, and shear resistance) with a simplified dynamic hydrological model to predict pore pressure in which rainfall is the most important input parameter [Frattini et al., 2009; Gabet et al., 2004; Terlien, 1998; Wilson and Wieczorek, 1995]. These models usually require calibration over a well-specified type of failures and, in general, they are difficult to apply over large areas where detailed knowledge of input parameters (e.g., soil thickness, groundwater conditions, shear resistance parameters) is very difficult to acquire.

[4] Empirical models are more suitable for the development of rainfall thresholds at regional scale provided a sufficient amount of information is available. In particular, the timing and location of landslides are required in addition to rainfall data sufficiently detailed to describe the rainfall at the landslide site. Caine [1980] was the first to propose a global rainfall intensity–duration (ID) threshold for the occurrence of shallow landslides. Since then, many rainfall thresholds have been proposed, either in the ID plane or using other rainfall parameters, at the local, regional, and global scales (for detailed reviews of the published thresholds see, e.g., Aleotti [2004], Corominas [2000], Guzzetti et al. [2007, 2008], Saito et al. [2010]). The threshold can be drawn visually or by statistical techniques. In the latter case, the percentage of known landslide events below a threshold over the ID space can be used, or more sophisticated threshold-like models can be applied for an objective identification [Brunetti et al., 2010; Guzzetti et al., 2007; Saito et al., 2010]. However, given that rainfall is not the only factor that causes landslides, a certain degree of uncertainty is unavoidable in the definition of rainfall thresholds [Aleotti and Chowdhury, 1999]. The poor quality of data on which the empirical methods are based may also increase uncertainty. Only very recently have researchers tried to address this issue by complementing the rainfall threshold information with probabilities of landslide occurrence. Frattini et al. [2009] used logistic regression to define ID thresholds associated with different return periods of the rainfall responsible for landslide triggering. Jaiswal and van Westen [2009] validated visually drawn thresholds using a control data set not employed in the empirical model. This way, they can estimate the conditional probability of landsliding after the threshold has been exceeded and the overall temporal probability of landslide initiation. The use of rainfall thresholds is becoming common in the context of flash flood forecasting [Carpenter et al., 1999; Georgakakos, 2006; Martina et al., 2006; Norbiato et al., 2008; Ravazzani et al., 2007]. Flash floods are rapid processes closely correlated to rainfall events and in many cases traditional flood warning systems based on hydrological models are not suitable for the short forecast times required.

[5] A number of factors may impact the results of the empirical approach. Major limitations are caused by the quality of data (homogeneity and completeness, landslide timing, rainfall data resolution and rain gauge location), but a key factor is the method used to identify and describe the triggering rainfall. Many authors either do not specify the criteria used for rainfall identification or generally refer to “the beginning of a rainfall event.” However, qualitative criteria for rainfall identification leave room for subjectivity and impair comparison of results. To our knowledge, few authors have addressed the problem of rainfall identification [Aleotti, 2004; Brunetti et al., 2010; Tiranti and Rabuffetti, 2010].

[6] The problem of defining a threshold is implicitly linked to the problem of estimating the probability of landsliding and choosing an acceptable probability value. In this work, we propose a Bayesian approach to estimate the probability of landsliding conditional on characteristics of rainfall events. We take advantage of the extensive catalog of historical landslides of the Emilia-Romagna Region (12,000 km² of mountain territory) to apply the proposed methodology. After a selection based on the quality of landslide information (timing and location) we used a data set consisting of 4141 events that occurred between 1939 and present. Daily rainfall data were provided by a network of 176 tipping-bucket rain gauges homogeneously distributed in the study area and available for the full study period. We specifically address the problem of identifying the rainfall event. This problem is relevant for all methods but it is of major concern in Bayesian inference because we will compare the frequency distribution of the rainfall events that resulted and did not result in slope failures, and it is therefore essential to adopt the same criterion for the two data sets.

2. Methodology

2.1. Motivation

[7] A threshold is defined as the level or the value that must be exceeded to produce a given effect or result. When a threshold is crossed, a radical change of state within a system will occur and this change often manifests suddenly. Implied in this definition is an inherently deterministic view: the state of the system can be predicted by comparing the input value (or a set of input values) with the threshold. Also implicit is that a given input will have a single possible output (above or below the threshold) since no randomness is involved in the development of future states of the system.

[8] Such a deterministic approach can be successfully used to define the rainfall threshold in the simplest cases. Figure 1a, for example, shows an ideal condition in which there is a clear separation between rainfall that triggered (black dots) and did not trigger (white dots) landslides. This may be the case of debris flows in coarse granular material that are initiated by channel runoff [Berti and Simoni, 2005; Coe et al., 2008], a triggering mechanism directly controlled by rainfall. In most cases, however, the distinction between critical and non-critical rainfall is not trivial. Figure 1b shows the conceptual case of deep-seated landslides. For these landslides the stability conditions are controlled by a complex combination of rainfall forcing and time-dependent factors such as near-surface soil moisture, pore pressure distribution, weathering and softening of materials, and long-term changes in field stress [Fell et al., 2000; Leroueil, 2001]. Failure conditions are achieved with a unique combination of all these relevant factors and the state of the system cannot be predicted by rainfall alone. When different outputs (failure or no-failure) can be obtained for the same input (a given rainfall event), a deterministic approach is no longer applicable and a probabilistic model is needed.

Figure 1.

Rainfall intensity-duration thresholds in the two conceptual cases of debris flows triggered by (a) channel runoff and (b) deep-seated landslides. In this latter case it is difficult to identify a threshold because rainfall events that result in landslides are not clearly distinguished from those that do not.

[9] Probability-based methods are advantageous for several reasons. First, they incorporate variability and uncertainty into the model, providing a quantitative assessment of threshold reliability [Bean, 2009]. For instance, in both cases described above a deterministic threshold could be defined at the lower bound of the rainfall that triggered landslides (Figure 1). This way, however, the meaning of the threshold is ambiguous (what happens when the threshold is exceeded?) and uncertainty is unaccounted for. A probabilistic analysis, including the distribution of non-triggering rainfall, is much more informative and can assign a reliability to a given threshold (e.g., threshold reliability is higher for case 1a in Figure 1). Second, unlike the categorical forecast of deterministic methods, probabilistic models furnish a probability distribution of the forecast quantity, thus providing a better ground for estimating extreme events (which correspond to the tail of probability distributions). Finally, probabilistic approaches are commonly used in quantitative risk assessment to determine the confidence levels of the prediction [Refice and Capolongo, 2004].

[10] In this section we describe a method to determine probabilistic rainfall thresholds based on Bayesian theory. The probabilistic approach provides an objective way to define thresholds in complex cases when conventional methods become highly subjective.

2.2. One-Dimensional Case

[11] Bayes' theorem is a direct application of conditional probabilities. The conditional probability is the probability of some event A (in our case a landslide) given the occurrence of some other event B (a rainfall episode with a certain magnitude, expressed in terms of total rainfall, intensity, or any other variable). Conditional probability is written P(A|B) and is read “the probability that a landslide (A) occurs given a rainfall episode (B).” This probability is given by Bayes' theorem:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \qquad (1)$$

where
P(B|A) = conditional probability of B given A (also called the likelihood), that is the probability of observing a rainfall event of magnitude B when a landslide occurs,

P(A) = prior probability of A (or simply prior), that is the probability a landslide occurs regardless of whether a rainfall event of magnitude B occurs or not,

P(B) = marginal probability of B, that is the probability of observing a rainfall of magnitude B regardless of whether a landslide occurs or not,

P(A|B) = conditional probability of A given B (also called posterior probability), that is the probability of observing a landslide when a rainfall event of magnitude B occurs.

[12] Bayesian probability is usually computed in terms of relative frequencies. Thus, if NR is the total number of rainfall events recorded during a given time reference; NA is the total number of landslides that occurred during the same period; NB is the number of rainfall events of magnitude B; and N(B|A) is the number of rainfall events of magnitude B that resulted in landslides, the probability terms in (1) can be approximated as:

$$P(A) \approx \frac{N_A}{N_R}, \qquad P(B) \approx \frac{N_B}{N_R}, \qquad P(B|A) \approx \frac{N(B|A)}{N_A} \qquad (2)$$

and equation (1) reduces to P(A|B) ≈ N(B|A)/NB.

[13] The fundamental aspect of Bayes' inference is the use of prior and marginal probabilities. For example, assume that 10 landslides occurred in a certain area during a given time reference, and that 8 of them were triggered by rainfall B with an intensity I > 50 mm/day. Common thinking would say that a rainfall of magnitude B (I > 50 mm/day) has a probability 8/10 = 0.8 to trigger a landslide. This is incorrect because the ratio 8/10 indicates the probability P(B|A) to observe a rainfall of magnitude B when a landslide occurs (the likelihood), not the probability P(A|B) to observe a landslide when a rainfall B occurs. According to Bayes' theorem, the value of P(A|B) depends on P(B|A) and also on prior and marginal probabilities. If, for instance, 1000 rainfall events occurred in the considered area and 200 of them had an intensity higher than 50 mm/day, we have P(A) = 10/1000 = 0.01 and P(B) = P(I > 50) = 200/1000 = 0.2. The actual landslide probability is then P(A|B) = P(A|I > 50) = 0.8 · 0.01/0.2 = 0.04 rather than 0.8.
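The arithmetic of this example can be checked with a minimal Python sketch. The function name `posterior_from_counts` and its argument names are illustrative choices, not part of the original method description; the counts are those quoted in the paragraph above.

```python
def posterior_from_counts(n_rain, n_landslides, n_b, n_b_given_a):
    """Return (prior, marginal, likelihood, posterior) from event counts.

    n_rain        -- NR, total number of rainfall events
    n_landslides  -- NA, total number of landslides
    n_b           -- NB, rainfall events of magnitude B
    n_b_given_a   -- N(B|A), events of magnitude B that resulted in landslides
    """
    if n_rain <= 0 or n_landslides <= 0 or n_b <= 0:
        raise ValueError("counts must be positive")
    prior = n_landslides / n_rain              # P(A)
    marginal = n_b / n_rain                    # P(B)
    likelihood = n_b_given_a / n_landslides    # P(B|A)
    posterior = likelihood * prior / marginal  # Bayes' theorem, P(A|B)
    return prior, marginal, likelihood, posterior


# Counts from the example in the text: 1000 rainfall events, 10 landslides,
# 200 events with I > 50 mm/day, 8 of which triggered landslides.
prior, marginal, likelihood, posterior = posterior_from_counts(1000, 10, 200, 8)
print(round(likelihood, 2), round(posterior, 2))  # 0.8 0.04
```

The posterior equals N(B|A)/NB = 8/200, which is the reduced form of the theorem when all terms are estimated from relative frequencies.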

[14] A rainfall threshold based solely on the rainfall events that resulted in landslides (the likelihood) is not truly informative because prior and marginal probabilities are neglected. This error is particularly insidious since it seems to be inherent in our way of thinking. Most cognitive scientists acknowledge that our beliefs rely on a limited number of heuristic principles such as similarity or representativeness [Tversky and Kahneman, 1974]. These principles may lead to severe and systematic errors when applied to uncertain events because of their insensitivity to prior probability.

[15] An additional working example is given in Table 1. The table lists the duration (D) and intensity (I) of all the rainfall events recorded in a hypothetical area during a given time frame. Five of the twenty rainfall events resulted in landslides, indicating a prior landslide probability P(A) = 5/20 = 0.25. The data also show that a rainfall intensity I > 40 mm/day was responsible for most of the historical landslides (4 out of 5), though this value was exceeded five times without causing any landslides. One-dimensional Bayes inference expresses this uncertainty in terms of probability. For a rainfall B (I > 40 mm/day), it is P(B|A) = P(I > 40|A) = 4/5 = 0.80 and P(B) = P(I > 40) = 9/20 = 0.45 (9 out of the 20 rainfall events fall in the considered range of intensity). The corresponding landslide probability is P(A|B) = P(A|I > 40) = 0.80 · 0.25/0.45 = 0.44.

Table 1. Sample Data Set for the Application of the Bayes Theorem
N | Duration (day) | Intensity (mm/day) | Landslide

[16] Running the same analysis for different intensity classes (0 ≤ I < 40, 40 ≤ I < 80, I ≥ 80 mm/day), a histogram of landslide probability is obtained (Figure 2). The intensity classes with the higher P(A|B) values are the most susceptible to landslides. The computed probabilities can be compared with the prior landslide probability P(A) = 0.25 to evaluate the significance of the conditional event B (I1 ≤ I < I2), where I1 and I2 define each specific intensity class (see the dashed line in Figure 2). In Bayesian terms, this comparison indicates how effective the variable is in shifting the prior probability P(A) to the posterior probability P(A|B), that is, how much our prior knowledge is improved by the additional information provided by the variable B. If the variable is completely irrelevant to the process, it would be randomly related with A and the two probability distributions P(B) and P(B|A) would be roughly the same. According to equation (1) the posterior probability would then be P(A|B) ≈ P(A). In our example, instead, the posterior probability is well above the reference prior, indicating a good explanatory power in the highest classes of rainfall intensity (Figure 2b).
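The class-by-class calculation can be sketched in Python as follows. The event list below is hypothetical: it is not the content of Table 1, and was only built to reproduce the aggregate counts quoted in the text (20 events, 5 landslides, 9 events with I > 40 mm/day, 4 of them with landslides). The function name `posterior_by_class` and the half-open class convention (lower bound included, last class open-ended) are also assumptions made for illustration.

```python
def posterior_by_class(events, edges):
    """Posterior landslide probability P(A|B) for each intensity class.

    events -- list of (intensity_mm_per_day, landslide) with landslide in {0, 1}
    edges  -- ascending lower bounds of the classes; the last class is open-ended
    Returns (prior, [(lower, upper, posterior or None), ...]).
    """
    n_rain = len(events)
    n_land = sum(flag for _, flag in events)
    if n_rain == 0 or n_land == 0:
        raise ValueError("need at least one rainfall event and one landslide")
    prior = n_land / n_rain                        # P(A)
    bounds = list(edges) + [float("inf")]
    results = []
    for lower, upper in zip(bounds[:-1], bounds[1:]):
        flags = [flag for inten, flag in events if lower <= inten < upper]
        if not flags:                              # empty class: undefined
            results.append((lower, upper, None))
            continue
        likelihood = sum(flags) / n_land           # P(B|A)
        marginal = len(flags) / n_rain             # P(B)
        results.append((lower, upper, likelihood * prior / marginal))
    return prior, results


# Hypothetical (intensity, landslide) pairs, NOT the data of Table 1.
events = [(5, 0), (8, 0), (10, 0), (12, 0), (15, 0), (18, 0), (20, 0),
          (25, 0), (30, 0), (35, 1), (38, 0),
          (45, 0), (50, 1), (55, 0), (60, 0), (70, 1), (75, 0),
          (85, 1), (95, 0), (110, 1)]

prior, classes = posterior_by_class(events, [0, 40, 80])
for lower, upper, post in classes:
    label = "n/a" if post is None else f"{post:.2f}"
    print(f"{lower} <= I < {upper}: P(A|B) = {label} (prior {prior:.2f})")
```

With these made-up values the posterior rises from below the prior in the lowest class to well above it in the highest class, which is the kind of contrast the comparison with P(A) is meant to reveal.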

Figure 2.

Worked example of one-dimensional Bayesian analysis. (a) Comparison of prior landslide probability P(A), prior rainfall probability P(B), and conditional probability P(B|A) for three different classes of rainfall intensity. (b) Computed values of conditional landslide probability P(A|B) and comparison with prior landslide probability P(A).

2.3. Two-Dimensional Case

[17] Equation (1) can be easily extended to the case of two variables B and C:

$$P(A|B,C) = \frac{P(B,C|A)\,P(A)}{P(B,C)} \qquad (3)$$

where the notation B, C indicates the joint probability of having a certain value (or range of values) of the two variables. If, for example, B ≡ I is rainfall intensity and C ≡ D is rainfall duration, equation (3) provides the probability of a landslide in response to a rainfall event of given duration and intensity.

[18] Figure 3 shows the application of equation (3) to our sample data set (Table 1). All twenty rainfall events are plotted in the duration-intensity plane and the plane is divided into four regions delimited by I and D values (Figure 3a). Equation (3) is then computed separately for each region, obtaining probabilistic information in the ID space (Figure 3b). In the upper-left cell, for example, 2 rainfall events out of 4 resulted in landslides, that is, P(I, D|A) = 2/5 = 0.40 and P(I, D) = 4/20 = 0.20. The prior landslide probability is P(A) = 5/20 = 0.25 and the posterior landslide probability is P(A|I, D) = 0.40 · 0.25/0.20 = 0.50 (Figure 3b).
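A minimal Python sketch of the cell-by-cell computation is given below. The split values (40 mm/day and 3 days), the cell labels, and the event list are assumptions made for illustration only; the events are not those of Table 1 but were arranged so that the short-duration, high-intensity cell contains 4 events, 2 of them with landslides, as in the text.

```python
from collections import Counter


def posterior_2d(events, i_split, d_split):
    """Posterior P(A|I, D) for the four cells of a split duration-intensity plane.

    events  -- list of (duration_days, intensity_mm_per_day, landslide in {0, 1})
    i_split -- intensity separating "low I" from "high I" (hypothetical value)
    d_split -- duration separating "short D" from "long D" (hypothetical value)
    """
    n_rain = len(events)
    n_land = sum(flag for _, _, flag in events)
    if n_rain == 0 or n_land == 0:
        raise ValueError("need at least one rainfall event and one landslide")
    prior = n_land / n_rain                          # P(A)
    n_cell = Counter()                               # events per cell
    n_cell_land = Counter()                          # landslide events per cell
    for duration, intensity, flag in events:
        cell = ("high I" if intensity >= i_split else "low I",
                "long D" if duration >= d_split else "short D")
        n_cell[cell] += 1
        n_cell_land[cell] += flag
    posterior = {}
    for cell, n_bc in n_cell.items():
        likelihood = n_cell_land[cell] / n_land      # P(B, C|A)
        marginal = n_bc / n_rain                     # P(B, C)
        posterior[cell] = likelihood * prior / marginal
    return posterior


# Hypothetical events (duration, intensity, landslide), NOT the data of Table 1.
events = ([(1, 60, 1)] * 2 + [(1, 60, 0)] * 2      # high I, short D: 4 events
          + [(5, 60, 1)] * 2 + [(5, 60, 0)] * 3    # high I, long D: 5 events
          + [(1, 10, 0)] * 6                       # low I, short D: 6 events
          + [(5, 10, 1)] * 1 + [(5, 10, 0)] * 4)   # low I, long D: 5 events

post = posterior_2d(events, i_split=40, d_split=3)
print(round(post[("high I", "short D")], 2))  # 0.5
```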

Figure 3.

Worked example of two-dimensional Bayesian analysis. (a) Rainfall intensity-duration plot showing rainfall that did and did not result in landslides. (b) Histogram of conditional landslide probability for four different combinations of rainfall intensity and duration.

[19] Any pair of variables can be considered in two-dimensional Bayesian analysis (e.g., peak rainfall intensity, total event rainfall, antecedent rainfall, groundwater level), and their significance can be assessed by comparing the computed posterior landslide probability with the prior landslide probability P(A). In principle, the Bayes' approach is suited to handle multidimensional analysis with n-variables, for example, the combined effect of rainfall duration, rainfall intensity and antecedent precipitation on landslide triggering. However, besides the limits imposed by the scarcity of data, multidimensional data are difficult to visualize in an efficient or useful manner and for this reason we will restrict the analysis to the two-dimensional case.

2.4. Multiple Rain Gauges

[20] In some cases it may be desirable to use multiple rain gauges to get a more accurate description of rainfall in the area. To do this, we divide the study area into homogeneous zones Ai in which the rainfall conditions are similar, and assign a reference rain gauge to each zone. Individual areas of influence were defined using Thiessen polygons [Croley and Hartmann, 1985]. The Bayesian analysis is then applied separately to each rain gauge polygon by considering the local rainfall and the historical landslides that occurred in that polygon. The result is a mosaic map of spatially variable landslide probability.

[21] A major drawback of this approach, however, is that the number of historical landslides in each rain gauge area can be very small, and this leads to inaccurate estimates of landslide probability. A way to avoid data splitting while retaining the spatial dependence of rainfall is conceptually shown in Figure 4. We assign to each historical landslide the rainfall data recorded in the corresponding rain gauge “homogeneous” area in order to have representative data. All the data recorded by the NG rain gauges (NG = 2 in the figure) are then merged into a single data table and analyzed as a unique data set. The result is a unique landslide probability value P(A|B, C) for the entire study area. This value, however, indicates the probability to have a landslide within an area Ai ≈ A/NG proximal to the ith rain gauge (referred to in the following as “rain gauge area”), not in the entire area A. The Bayesian probability unavoidably depends on the scale at which the observations were available and the computed values decrease with the number of rain gauges because the prior probability P(A) = NA/NR becomes progressively smaller (the same rainfall event is recorded by multiple instruments while landslides are spatially discrete events). Scale dependence is therefore implicitly included in Bayesian analysis and the ratio A/NG indicates the reference area for the computed probability. For example, if the study area is 100 km² and the data set combines data from 5 rain gauges, equation (3) will provide the landslide probability P(A|B, C) within a reference area of about 20 km².

Figure 4.

Conceptual sketch showing a merged data set which combines the rainfall events recorded by two rain gauges (R1 and R2) and the historical landslides that occurred in the corresponding reference areas (A1 and A2). N = event number; D = rainfall duration (day); I = rainfall intensity (mm/day); L = landslide occurrence.

[22] Sometimes it is necessary to apply the results to a different scale than that used to infer the probabilities. The landslide probability P(A|B, C) can be upscaled to a larger area made of NP adjacent polygons by employing a binomial probability model. The underlying assumption is that the change of rainfall probability with scale can be neglected. If this assumption holds, the binomial distribution can be used to obtain the probability of observing k successes in n trials, with the probability of success on a single trial denoted by p:

$$P(k) = \binom{n}{k}\,p^{k}\,(1-p)^{n-k} \qquad (4)$$

If we define p = P(A|B, C), equation (4) gives the probability to observe k landslides (“successes”) in an area constituted by n = NP polygons. Therefore, the binomial probability Pbin(A|B, C) to have at least one landslide in the large area is given by the complementary probability of no landslides 1 − P(k = 0):

$$P_{bin}(A|B,C) = 1 - P(k=0) = 1 - \left[1 - P(A|B,C)\right]^{N_P} \qquad (5)$$

Back to the previous example, a Bayesian probability P(A|B, C) = 0.10 in a rain gauge area (20 km²) would be upscaled to Pbin(A|B, C) = 1 − [1 − 0.1]^5 = 0.41 for the entire area (100 km²).
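The upscaling step can be reproduced with a short Python sketch. The function names are illustrative; note that the binomial model treats the polygons as independent trials with the same probability p, which is an assumption of the model in addition to the one stated in the text.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upscaled_probability(p, n_polygons):
    """Probability of at least one landslide in n_polygons rain gauge areas."""
    return 1.0 - binomial_pmf(0, n_polygons, p)


# Example from the text: p = 0.10 in one rain gauge area, 5 polygons in total.
p_large = upscaled_probability(0.10, 5)
print(round(p_large, 2))  # 0.41
```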

2.5. Multiple Landslides Triggered by the Same Rainfall

[23] It is quite common during major storms that multiple landslides are triggered by the same rainfall event in the same area. This case can be explicitly considered in the Bayesian analysis by introducing an additional variable which counts the number of landslides triggered by each rainfall event. However, multidimensional analysis is hampered by the scarcity of data and hence is of limited applicability in practical applications. It is therefore better to count multiple landslides as one single event and to define the event A as “the occurrence of at least one landslide in the proximity area.” This also ensures that the number of landslides NA is not larger than the number of rainfall events NR, which would lead to the unrealistic prior P(A) = NA/NR > 1.

[24] The main objection to this approach is that an important piece of information (the number of landslides triggered by a given rainfall) is lost. An alternative solution might be to count the multiple landslides in each bin, and to replicate the rainfall events as many times as the number of multiple landslides (to ensure P(A) ≤ 1). This would increase the landslide probability in the rainfall classes that generated multiple landslides in the past. In the example application described in section 4 we used the first approach because it is simpler and more statistically rigorous.
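The first approach (counting multiple landslides as a single event) amounts to converting landslide counts into a binary indicator before computing the prior. A minimal Python sketch, with a hypothetical list of landslide counts per rainfall event, illustrates why this keeps P(A) within [0, 1]:

```python
def collapse_to_events(landslide_counts):
    """Map the number of landslides per rainfall event to a 0/1 indicator
    of 'at least one landslide' (the definition of event A)."""
    return [1 if n > 0 else 0 for n in landslide_counts]


# Hypothetical counts: landslides triggered by each of 8 rainfall events.
counts = [0, 3, 0, 1, 0, 0, 7, 0]

naive_prior = sum(counts) / len(counts)   # NA/NR with every landslide counted
flags = collapse_to_events(counts)
prior = sum(flags) / len(flags)           # NA/NR with one event per rainfall

print(naive_prior, prior)  # 1.375 0.375
```

The sketch does not implement the alternative replication scheme mentioned above.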

3. Study Area

3.1. General Setting

[25] The Emilia-Romagna region is located in the north of Italy and is one of the country's most populated areas. The study area includes the mountainous part of the region, which belongs to the northern Apennines chain and covers approximately 12,000 km². The elevation ranges from 50 m to 2100 m a.s.l. over a distance of about 50 km running north-south. The area has a mild Mediterranean climate with distinct cold and dry seasons. The average annual rainfall is around 1300–1400 mm and varies across the area from a minimum of 500–600 mm in the foothills to more than 2000 mm along the main divide. The bedrock geology is characterized by three main rock types (Figure 5): clastic rocks, flysch, and clay units [Bettelli and Vannucchi, 2003; Pini, 1999]. Clastic rocks account for about 10% of the mountainous territory and are mostly sandstones, calcarenites and marls. Flysch (48% of the area) consists of interbedded clastic rocks (mostly sandstones and calcarenites) and marls with variable ratio of coarse to fine beds. Clay units (42% of the area) consist of overconsolidated fissured clays, clayshales, and chaotic clay complexes made of rock blocks and disrupted strata floating in a scaly clay matrix. The map in Figure 5 provides a view of the geological complexity at the scale of the analysis.

Figure 5.

Lithological map of the Emilia-Romagna region.

[26] The Emilia-Romagna region is strongly affected by landslides. More than 20% of the mountain territory is covered by active or dormant landslide deposits. Though landslides do not usually cause casualties, they cause severe damage to properties, facilities, and infrastructure. About €130 million has been spent in the last 4 years on regeneration and remedial works.

[27] The most common types of landslides are earth slides and earth flows in the clay units. Earthflow deposits are usually elongated with moderately lobate shape. The feeding zone typically consists of a bowl-shaped area characterized by failure scars, disturbed slumps, and rotational slips, and it is bounded by a main headscarp whose rugged morphology is clearly visible in case of active or recent phenomena. The toe of the deposit often reaches the main valley bottom or the bed of a small tributary. Multiple deposits formed by the juxtaposition of single earthflows are common. The vast majority of earth flows are subjected to periodic reactivations triggered by intense rainfall. The return period and the extent of the reactivation (partial or complete) are highly variable while the reactivation mechanism consistently shows earth slides in the feeding basin followed by flow (or sliding) along the main track.

[28] Flysch units are affected by complex landslides, large rotational slips, translational slides along bedding planes, or compound failures. Transition to flow is less common than in clay units and strongly depends on local geological and geomorphological conditions. Rockfalls are present in massive rocks although not very common. In addition to these large slope failures, many small-scale shallow landslides occur almost everywhere in the area affecting the weathered cover of both clay and flysch units. In particular, the frequency of landslide-induced debris flows is increasing in recent years (unpublished data), seriously endangering local communities which are not accustomed to these phenomena.

[29] We performed our analysis considering all historical records of landslides in the study region. We have not distinguished between different types of landslide or the different rock units for the following reasons. First, in most cases the information reported in the historical landslide catalog does not allow identification of the landslide type or even the precise location of the failure. Second, the partition of the database into subsets reduces the number of data in each “homogeneous” group, affecting the reliability of the result. Third, our field experience has shown that major storms are able to trigger landslides of different types in all the lithological units, while light rainfall events do not trigger landslides anywhere. If present, site-specific triggering conditions do not emerge clearly. Finally, the rainfall threshold should be implemented in the civil protection alert system, which works at a regional and sub-regional scale.

3.2. Landslide Database

[30] The Emilia-Romagna Geological Survey maintains a catalog of historical landslides in the study area. The catalog includes the data of the Italian Archive of floods and landslides [Guzzetti et al., 1994] integrated with the information collected from parochial archives, technical documentation, reports to local authorities, and the national and local press. The landslides listed in the catalog are those reported to local authorities or described in some historical or journalistic document. Any slope movement causing some sort of damage was most likely reported, while minor phenomena that occurred in remote areas were likely to go undetected. Although the catalog is not truly comprehensive, it provides an accurate inventory of the landslides that caused any damage in the area. Rossi et al. [2010] showed that the catalog is statistically complete from about 1951 and they used it as a proxy of actual landsliding.

[31] The historical catalog is available as an Access database. A total of 9004 landslides are reported over the period 1400–2009, and for each landslide the following information is stored: location, date of occurrence, landslide characteristics (length, width, type, and material), triggering factors, damage, and references. Not all information is available for all landslides and in most cases the classification is lacking or ambiguous. For the purpose of the analysis, however, we only need the location and the triggering date of the historical landslide, in order to identify the triggering rainfall (section 4.1). The selection of the landslides for which the date of occurrence was known with daily accuracy led to a data set of 4141 landslides in the period 1939–2009 that was used for the analysis (Figure 6a). These historical landslides are quite evenly distributed in the area and affect all the geological units described above (compare Figure 6a with Figure 5).

Figure 6.

(a) Distribution of the 4141 historical landslides for which the date of occurrence is known with daily accuracy and (b) rain gauge network in the mountainous part of the Emilia-Romagna region.

3.3. Rainfall Database

[32] The monitoring rain gauge network of the Emilia-Romagna region consists of over 200 tipping-bucket rain gauges homogeneously distributed over the entire regional territory, 176 of them located in the mountainous part (Figure 6b). Rainfall data were collected daily in manual rain gauges before 2001 and automatically every 30 min since 2001. Daily rainfall was then used for the analysis throughout the period 1939–2000. Despite their low resolution, rainfall data are suitable to identify the critical rainfall events that triggered historical landslides in the study area (see section 4.1). In fact, because of the predominance of fine-grained soils, landslides generally occur after several days of continuous rainfall and the response time to rainfall forcing is much longer than usually observed for shallow landsliding in coarse soils [e.g., Caine, 1980; Aleotti, 2004]. Measurements of the snow cover are only available in a few stations since 2000 and were not used for the analysis.

4. Application

4.1. Identification of Rainfall Events

[33] The first step in the evaluation of any rainfall threshold is to identify the rainfall episodes that triggered the historical landslides, here referred to as “triggering rainfall.” Ideally a triggering rainfall event should be a well-defined rainfall episode, described by its duration (D), amount of precipitation (E), and intensity (I) and clearly related to a given landslide. In some cases the identification is simple (for instance if the landslide occurred after a heavy rainfall preceded by a prolonged dry period) but usually it is not. Landslides may result from complex rainfall sequences made of multiple bursts of variable duration and intensity that make it difficult to detect a well-defined triggering episode. The greatest uncertainties usually derive from the identification of the beginning of the triggering rainfall, while the time of landsliding is taken as its end. A certain amount of time without rainfall (or with limited rainfall) can be used as a criterion to truncate the rainfall sequence that precedes a landslide event [Brunetti et al., 2010]. Alternatively, the identification relies on author judgment [Aleotti, 2004] or on the use of multiple time frames [Frattini et al., 2009]. Additional sources of uncertainty are the time of occurrence of the landslide (which may have been reported late or may not be representative of the initial failure in the source area) and the role of snowmelt.

[34] The identification of triggering rainfall is of particular concern in Bayesian inference. For such an analysis it is essential to adopt the same objective criteria to detect rainfall events that have triggered landslides and those that have not caused landslides. In Bayesian terms, this means that the same criteria must be used to define the conditional distribution of the triggering rainfall B|A and the marginal rainfall distribution B (see section 2). In order to address the problem, we first analyzed the rainfall sequence for each landslide event and isolated the triggering rainfalls based on expert judgment. We then used an automatic procedure to extract the criteria that best reproduced our results in terms of rainfall identification.

[35] For each of the 4141 historical landslides, we compared the rainfall data recorded by the reference rain gauge (see Figure 4) with those recorded by the other two closest rain gauges. This comparison was done to detect a possible malfunctioning of the reference instrument and to investigate the spatial variability of the rainfall event. In almost all cases no significant differences were observed between the recorded data, thus the reference rain gauge was used in the analysis. The triggering rainfall was then defined visually by selecting the rainfall episode closest to the date of occurrence of the landslide. This work was done independently by three of the authors to evaluate possible interpretation differences. The results were then compared and the discrepancies (usually related to the beginning of the triggering event) were discussed to arrive at a shared definition. We generally agreed to define the triggering event as a period of continuous or nearly continuous precipitation which starts with the onset of the rainfall (or with an abrupt increase of rainfall intensity in a period of light rain) and ends on the day of occurrence of the landslide. For those landslides that occurred after the end of the rainfall, the duration of the triggering event was set equal to the rainfall duration.

[36] Each triggering rainfall was then classified as: well-defined (type 1), uncertain (type 2), or undefined (type 3).

[37] Type 1: Well-defined rainfall events can be clearly identified, as shown in the examples of Figures 7a–7d. In these cases the degree of subjectivity is very low, though not entirely absent. For instance, the first rainfall pulse in Figure 7d could have been included in the triggering event, or the triggering rainfall in Figure 7b limited to the second rainfall burst. Type 1 events were detected for 2741 of the 4141 historical landslides (66%).

Figure 7.

Examples of manual (gray areas) and automatic (dashed areas) identification of the triggering rainfall. Black arrows indicate the date of occurrence of the landslide reported in the historical catalog. (a–f) Types 1, 2, and 3 are distinguished for the different ease of identification (see text).

[38] Type 2: Uncertain events (10% of the historical landslides) consist of distinct rainfall episodes characterized by uncertain or subjective limits because of the presence of secondary rainfall episodes (Figure 7f).

[39] Type 3: Undefined events (24% of the historical landslides) consist of all those landslides without a significant rainfall event close to the date of occurrence, such as landslides triggered during complex rainfall sequences (Figure 7g) or by light rainfall in winter time (snowmelt?). The difficulty of establishing the exact time of a landslide [Guzzetti et al., 2007] and the influence of factors other than rainfall are among the main reasons for these undefined events.

[40] Only well-defined events (Type 1) were considered in the Bayesian analysis. When using this method, the triggering rainfall must be reliably identified in order to obtain a reliable likelihood function P(B|A). Landslides triggered by snowmelt or influenced by other factors (Types 2 and 3) were therefore omitted. Among the 2741 Type 1 records, 1573 correspond to duplicated landslides triggered by the same rainfall in the same rain gauge area. Since multiple landslides are not counted in the analysis (see section 2.5), the suitable landslide records are then NA = 2741 − 1573 = 1168. Interestingly, about 60% of these well-defined landslides occurred at the end of the rainfall event or a few days later (as in Figures 7b and 7c), while the remaining 40% occurred even before the end (Figure 7d). This indicates a rapid hydrological response of the slopes to rainfall, even though the data set mainly consists of landslides in fine-grained soils.

[41] An automated detection algorithm was used to detect all the rainfall events that happened in the study area in the last 50 years. The algorithm scans a rainfall time series and detects the rainfall events using a moving-window technique: a new event starts when the precipitation cumulated over DT days exceeds a certain threshold ET, and ends when it goes below this threshold. For instance, if DT = 3 days and ET = 2 mm, the rainfall event starts when the cumulative rainfall exceeds 2 mm in 1, 2, or 3 days (that is, if 2 mm are exceeded on the first day, the rainfall starts at day 1). Then, the rainfall event stops when it rains less than 2 mm in 3 days; the end of the event is defined as the last of the three days in which the rainfall is greater than zero.
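
The moving-window rule can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `detect_events`, the list-of-daily-totals input, and the handling of edge cases (an event is closed at the last rainy day seen so far, and a new event never reuses days of the previous one) are assumptions made here for concreteness.

```python
def detect_events(daily_rain, dt=3, et=5.0):
    """Return (start, end) day indices (inclusive) of rainfall events.

    daily_rain: list of daily totals in mm; dt: window length in days;
    et: threshold in mm on the rainfall cumulated over the window.
    All names and edge-case choices are illustrative assumptions.
    """
    events = []
    start = None      # first rainy day of the event currently open
    last_wet = None   # most recent rainy day of the open event
    prev_end = -1     # end index of the last closed event
    for i, rain in enumerate(daily_rain):
        lo = max(0, i - dt + 1)
        total = sum(daily_rain[lo:i + 1])  # rain cumulated over the last dt days
        if start is None:
            if total > et:
                # Open an event at the first rainy day of the window that
                # does not belong to the previous event.
                first = max(lo, prev_end + 1)
                start = next(k for k in range(first, i + 1) if daily_rain[k] > 0)
                last_wet = i
        else:
            if rain > 0:
                last_wet = i
            if total <= et:
                # The window no longer exceeds the threshold:
                # close the event at the last rainy day.
                events.append((start, last_wet))
                prev_end = last_wet
                start = None
    if start is not None:  # series ended while an event was still open
        events.append((start, last_wet))
    return events
```

With DT = 3 days and ET = 5 mm, the series `[0, 0, 3, 4, 0, 0, 0, 0, 10, 0, 0, 0]` yields two events: days 2–3 and day 8 (zero-based indices).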

[42] Different combinations of DT and ET were tested, with values ranging from 1 to 10 days and from 0 to 10 mm respectively. The best combination was defined as the one that most closely reproduces our expert judgment. To this purpose we reconsidered the 1168 triggering rainfalls of Type 1 and, for all the landslides which occurred before the end of the rainfall (40% of the cases), we extended the event to the end of the rainfall. This modified data set was used to calibrate the automatic detection algorithm. For each combination of DT and ET, the rainfall events detected by the algorithm were compared with those manually defined, using the percentage root mean square error of the prediction (RMSEP) to measure the goodness of fit. As can be seen in Figure 8, the prediction error has a minimum using DT = 3 days and ET = 5 mm. By adopting these values the algorithm is able to replicate our expert judgment (see the examples in Figures 7a and 7b) although minor discrepancies still occur in some cases (Figures 7c and 7d).

Figure 8.

Calibration of the automated detection algorithm used to isolate the rainfall events in the rain gauge data series. The chart shows the goodness of fit between automated and manually defined triggering rainfall (in terms of percentage root mean square error of the prediction) as a function of the two algorithm parameters. The circled values indicate the calibrated values of DT and ET associated to the minimum square error (black dot).

[43] The calibrated algorithm was finally applied to all the 176 rain gauges available in the area. A total number of 250177 rainfall events were identified and characterized in terms of total event rainfall E (mm), event duration D (days), average intensity over the event I = E/D (mm/day), and antecedent rainfall in the 14 (AE14), 30 (AE30), and 60 (AE60) days preceding the triggering rainfall (mm). The frequency distribution of these parameters will be compared with those pertaining to critical rainfalls to compute landslide probability. In this respect it could be argued that the two data sets are not truly comparable because triggering rainfalls are truncated at the landslide date. However, as already mentioned, about 60% of the landslides occurred at (or after) the end of the rainfall and this percentage rises to 90% by including the three preceding days. In our sample application the discrepancy between the two data sets is then small. In cases where a systematic difference exists between triggering and non-triggering rainfall (for instance, if landslides typically occur at peak rainfall) the latter should be differently defined to make the two data sets comparable.
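
As an illustration of how each detected event could be characterized, the sketch below computes E, D, I = E/D, and the antecedent rainfall over the days preceding the event onset. The function name `describe_event` and the truncation of the antecedent window at the start of the record are assumptions introduced here, not details given in the text.

```python
def describe_event(daily_rain, start, end, antecedent_days=(14, 30, 60)):
    """Summarize one rainfall event given its inclusive day indices.

    Returns total rainfall E (mm), duration D (days), mean intensity
    I = E / D (mm/day), and antecedent rainfall AE<n> (mm) cumulated over
    the n days before the onset. Windows reaching before the start of the
    series are truncated (an assumption of this sketch).
    """
    event_rain = daily_rain[start:end + 1]
    e = sum(event_rain)
    d = end - start + 1
    summary = {"E": e, "D": d, "I": e / d}
    for n in antecedent_days:
        summary[f"AE{n}"] = sum(daily_rain[max(0, start - n):start])
    return summary
```

Measuring the antecedent rainfall backward from the onset, rather than from the landslide date, keeps event rainfall and antecedent rainfall from overlapping.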

4.2. Conventional Methods

[44] Conventional rainfall thresholds are usually obtained by drawing the lower-bound limit of the rainfall events that have resulted in landslides (triggering rainfall) or by defining a dividing line through these data using some statistical technique. Intensity-duration thresholds are the most commonly reported. They often take the form of a power law with a negative scaling exponent and exhibit a linear trend on a logD − logI space.

[45] Before applying the Bayesian method it can be useful to draw visually the lower boundary of triggering rainfalls for our data. The lower envelope of the critical rainfall events is often used as operational threshold in conventional methods [e.g., Jibson, 1989; Tiranti and Rabuffetti, 2010]. Figure 9a shows the log-log ID plot for the 1168 well-defined rainfall events that triggered 2741 landslides in the Emilia-Romagna region. The lower envelope of experimental points is located below the regional threshold proposed by Guzzetti et al. [2007] for the CADSES area (Central European Adriatic Danubian South-Eastern Space, mild midlatitude climates). This threshold was defined by a thorough analysis of many empirical thresholds published in the literature, and it is taken as representative outcome of conventional methods with the purpose of comparing deterministic and probabilistic results (although this threshold was inferred using a Bayesian technique it is here considered a “conventional threshold” since only the triggering rainfall events were used in the analysis).

Figure 9.

Rainfall intensity-duration thresholds for the initiation of landslides in the Emilia-Romagna region. Black lines show the regional threshold proposed by Guzzetti et al. [2007] and the lower envelope of (a) the triggering rainfall of the Emilia-Romagna data set. (b) The two thresholds are compared with the rainfall events that have not resulted in landslides during the same time period.

[46] It is very unlikely that a rainfall event below the two lines in Figure 9a will trigger a landslide. Both thresholds, however, would be of limited use in practice: if we compare the two thresholds with the distribution of the rainfall that did not result in landslides (Figure 9b) we see a large number of non-critical rainfall events falling above the thresholds. If the threshold is used to trigger an alarm, the percentage of false alarms is as high as 32% for the CADSES threshold and 75% for the lower envelope. Similar results are obtained using other rainfall descriptors such as total event rainfall, or normalizing the data by the mean annual precipitation. In all cases it is not possible to draw any line of distinction between triggering and non-triggering rainfall.
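
The comparison between a power-law threshold and the non-triggering rainfall can be sketched as below. The coefficients `a` and `b` are placeholders (the coefficients of the CADSES threshold and of the lower envelope are not given in this section), and the false-alarm measure used here, the share of non-triggering events lying above the threshold, is one possible reading of the percentages quoted above, not a definition taken from the text.

```python
def threshold_intensity(duration, a, b):
    """Power-law intensity-duration threshold I = a * D**(-b).

    a and b are hypothetical coefficients supplied by the caller.
    """
    return a * duration ** (-b)


def share_above_threshold(events, a, b):
    """Fraction of (duration, intensity) events lying above the threshold.

    Applied to non-triggering events, this gives one possible measure of
    the false-alarm rate (an assumption of this sketch).
    """
    above = sum(1 for d, i in events if i > threshold_intensity(d, a, b))
    return above / len(events)
```

On a log-log plot such a threshold is a straight line with slope −b, which is why a single line cannot separate the two populations when they overlap as in Figure 9b.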

4.3. One-Dimensional Bayesian Probability

[47] The one-dimensional Bayesian analysis evaluates the significance of a variable B in explaining a certain event A. In our application A represents the occurrence of landslide in the study area and B any variable describing the rainfall event, such as rainfall duration or intensity. As described in section 2.2, the comparison between the posterior landslide probability P(A|B) and the prior landslide probability P(A) shows the gain of knowledge due to the condition, and therefore indicates the significance of B in the occurrence of event A. If P(A|B) objectively differs from P(A) the variable B has a significant influence, if P(A|B) ≈ P(A) it has not.

[48] The analysis can be applied to the Emilia-Romagna data set following the procedure described in section 2.2. The marginal rainfall probability P(B) is computed using the NR = 250177 rainfall events recorded in the last 50 years, and the conditional probability P(B|A) using the NA = 1168 well-defined rainfalls that triggered the historical landslides. Prior landslide probability is therefore P(A) = NA/NR = 1168/250177 = 0.005. Five explanatory variables are tested: event rainfall E, rainfall duration D, mean rainfall intensity I, and antecedent rainfall in the 14 (AE14) and 30 (AE30) days before the event.

[49] The results of the analysis are shown in Figures 10 and 11. The charts on the left compare the frequency distributions of triggering rainfall versus overall rainfall, that is P(B|A) versus P(B). The ratio of the two distributions (multiplied by P(A)) gives the landslide probability P(A|B) shown on the right. A large difference between P(B|A) and P(B) gives high landslide probability and indicates the high significance of the considered variable. The results in Figure 10 clearly show that event rainfall, rainfall duration, and rainfall intensity are all strongly significant: in all cases the distributions P(B|A) and P(B) are markedly different and the corresponding landslide probability is well above the prior probability P(A) (see section 2.2). Rainfall intensity, in particular, seems to be the most significant variable among the three, showing values of P(A|B) as high as 0.28 for I > 100 mm/day.
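
A minimal sketch of the one-dimensional computation is given below, assuming that both distributions are estimated as relative frequencies over the same bins and that the triggering events are a subset of all recorded events. The function names and the half-open binning convention are choices made for this illustration.

```python
from bisect import bisect_right


def bin_counts(values, edges):
    """Count values falling in the half-open bins [edges[j], edges[j+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        j = bisect_right(edges, v) - 1
        if 0 <= j < len(counts):
            counts[j] += 1
    return counts


def posterior_1d(trigger_values, all_values, edges):
    """Return the prior P(A) and the posterior P(A|B) for each bin of B.

    P(A|B) = P(B|A) * P(A) / P(B), with every term estimated as a relative
    frequency. Bins with no recorded rainfall get None.
    """
    n_a, n_r = len(trigger_values), len(all_values)
    prior = n_a / n_r
    counts_a = bin_counts(trigger_values, edges)
    counts_r = bin_counts(all_values, edges)
    posterior = []
    for c_a, c_r in zip(counts_a, counts_r):
        if c_r == 0:
            posterior.append(None)  # this rainfall condition never occurred
            continue
        likelihood = c_a / n_a      # P(B|A)
        marginal = c_r / n_r        # P(B)
        posterior.append(likelihood * prior / marginal)
    return prior, posterior
```

With frequency estimates, the posterior in each bin reduces to the number of triggering events divided by the number of all events in that bin, which makes the comparison with the prior P(A) immediate.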

Figure 10.

One-dimensional Bayesian analysis of the Emilia-Romagna data set. Three different rainfall variables B are considered: (a–b) event rainfall, (c–d) rainfall duration, and (e–f) rainfall intensity. Charts on the left show the prior landslide probability P(A), the prior rainfall probability P(B), and the conditional (known) probability P(B|A) for different values of the considered variable. Charts on the right compare the computed landslide probability P(A|B) with the prior landslide probability P(A) (dotted lines) to evaluate the significance of the considered variable. Dashed lines on the right charts indicate the 95% confidence bounds around P(A|B).

Figure 11.

One-dimensional Bayesian analysis of the Emilia-Romagna data set considering the antecedent precipitation in the (a–b) 14 and (c–d) 30 days before the triggering event. Symbols are the same as in Figure 10.

[50] The probability of landsliding increases with the severity of the event (increased values of event rainfall, duration, or intensity) although the rise is somewhat irregular due to the uneven distribution of data. However, at the highest values of the three parameters, the landslide probability seems to decrease (Figures 10b–10d and 10f). This unexpected trend is mainly due to two factors. First, the computed probabilities of such extreme events are affected by a lack of significance due to low sample sizes. Bins with few data may not be sufficiently informative and a small variation in the reported number of events could result in a very different probability. To evaluate the impact of this uncertainty we computed the 95% confidence intervals from Poisson counting errors for the number of landslides [Bailar and Ederer, 1964; Naylor et al., 2009] and propagated these through the analysis to define the confidence bounds of landslide probability (dashed lines in Figures 10 and 11). As expected, the upper bounds increase rapidly with the severity of the rainfall event and the final decrease of landslide probability is less noticeable. A second reason of equal importance is the bias introduced by the definition of the triggering rainfall (section 4.1). Triggering rainfalls are truncated at the landslide date while non-triggering rainfalls continue until the end of the rain. Therefore, it may happen that the landslides triggered by a long-lasting rainfall event are counted in another bin because they occurred before the end of the rainfall. This bias may explain, together with sampling effects, the observed trend of landslide probability associated with extreme rainfall events.
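
The effect of small counts on the confidence bounds can be illustrated with the sketch below. It uses the exact Poisson interval for an observed count, found by bisection on the Poisson distribution function, and it treats the number of rainfall events in the bin as known; both choices are assumptions of this illustration and are not necessarily the procedure of the cited references. The plain summation of the distribution function is intended for the small counts of sparsely populated bins.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation (small k only)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def _root(f, lo, hi, iterations=200):
    """Bisection for an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_interval(k, conf=0.95):
    """Exact two-sided confidence interval for a Poisson mean, given count k."""
    alpha = 1.0 - conf
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | lam) = alpha / 2.
        lower = _root(lambda lam: (1.0 - poisson_cdf(k - 1, lam)) - alpha / 2,
                      0.0, float(k))
    # Upper bound: P(X <= k | lam) = alpha / 2.
    upper = _root(lambda lam: alpha / 2 - poisson_cdf(k, lam),
                  float(k), k + 10.0 * math.sqrt(k + 1.0) + 10.0)
    return lower, upper


def probability_bounds(n_landslides, n_rain, conf=0.95):
    """Confidence bounds on P(A|B) = n_landslides / n_rain for one bin."""
    lower, upper = poisson_interval(n_landslides, conf)
    return lower / n_rain, min(1.0, upper / n_rain)
```

For a bin with no reported landslides the upper bound on the count is about 3.7, so the upper bound on the probability stays well above zero whenever the bin holds only a few rainfall events.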

[51] Bayesian analysis also shows that landslides in the study area are not correlated with the antecedent precipitation in the 14 or 30 days before the event (Figure 11). In both cases the conditional distribution of triggering events P(B|A) is very similar to the rainfall marginal P(B) and for all classes it is P(A|B) ≈ P(A). This finding is rather surprising because it is generally regarded that antecedent rainfall conditions strongly affect slope stability in fine-grained soils [Corominas, 2000]. This issue will be discussed in section 5.

4.4. Two-Dimensional Bayesian Probability

[52] Two-dimensional Bayesian analysis evaluates the conditional probability of the event given the joint occurrence of two control variables (section 2.3). The variables should be selected among those exhibiting the highest explanatory power in one-dimensional analysis, which in our case are event rainfall, rainfall duration, and rainfall intensity. Given that rainfall thresholds are usually defined in the rainfall duration-intensity space, the analysis is performed using these two variables.

[53] The application to the Emilia-Romagna data set follows the procedure described in section 2.3. All the rainfall events recorded in the last 50 years are plotted on a log-log ID chart (NR = 250177, white dots in Figure 12a) together with the rainfall events that resulted in landslides (NA = 1168, black dots in Figure 12a). The logD − logI space is then divided into 9 × 13 cells (a reasonable compromise between rainfall resolution and number of data in each class), and the probability of a landslide occurring is computed for each cell using equation (3). As discussed in section 2.4 these values indicate the probability of having a landslide in the proximity area of a single rain gauge (about 65 km²). It should also be remembered that the computed values are just a proxy of the “true” landslide probability because the historical catalog may be incomplete for remote, minor landslides (section 3.2).
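
The cell-by-cell computation can be sketched as follows. Equation (3) is not reproduced in this section, so the sketch assumes that, with frequency estimates, it reduces to the ratio between the number of triggering events NA(c) and the number of all events NR(c) in each cell. The function names, the uniform grid in log space, and the grid limits passed by the caller are illustrative assumptions.

```python
import math
from collections import Counter


def cell_of(duration, intensity, log_d_range, log_i_range, shape):
    """Return the (column, row) cell of an event in logD - logI space.

    Ranges are (min, max) in log10 units; shape is (n_duration_cells,
    n_intensity_cells). Events outside the grid return None.
    """
    (d0, d1), (i0, i1) = log_d_range, log_i_range
    n_d, n_i = shape
    x = (math.log10(duration) - d0) / (d1 - d0)
    y = (math.log10(intensity) - i0) / (i1 - i0)
    if not (0 <= x < 1 and 0 <= y < 1):
        return None
    return int(x * n_d), int(y * n_i)


def landslide_probability_grid(all_events, trigger_events,
                               log_d_range, log_i_range, shape):
    """Map each populated cell c to NA(c) / NR(c).

    all_events must include the triggering events. Cells missing from the
    result hold no recorded rainfall; cells with value 0.0 hold rainfall
    that never resulted in a reported landslide.
    """
    n_r, n_a = Counter(), Counter()
    for d, i in all_events:
        c = cell_of(d, i, log_d_range, log_i_range, shape)
        if c is not None:
            n_r[c] += 1
    for d, i in trigger_events:
        c = cell_of(d, i, log_d_range, log_i_range, shape)
        if c is not None:
            n_a[c] += 1
    return {c: n_a[c] / n_r[c] for c in n_r}
```

Keeping empty cells out of the result distinguishes rainfall conditions that never occurred from conditions that occurred without producing landslides.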

Figure 12.

Two-dimensional Bayesian analysis of the Emilia-Romagna data set. (a) Log duration-log intensity chart showing the distribution of the rainfall events that resulted in landslides (black dots) and that of all recorded rainfall during the same time period (gray dots). (b) Histogram of landslide probability as a function of rainfall duration and intensity. The black line is the regional threshold proposed by Guzzetti et al. [2007]. The “no landslide” area indicates rainfall conditions that never resulted in landslides during the considered time period; the striped area indicates rainfall conditions that never occurred during the same period.

[54] The result of the analysis is shown in Figure 12b. As can be seen, the probability of having a damaging landslide is zero if the intensity of the rainfall event is lower than about 2.5 mm/day (logI = 0.4), regardless of the duration of the event. For short-duration rainfall (less than about 2 days) this probability is zero up to an intensity of 10 mm/day. The “no landslides” area shown in the chart encloses all the cells in which P(A|I, D) = 0 because NA(c) = 0, that is, no landslides are reported in the historical catalog. The regional threshold proposed by Guzzetti et al. [2007] falls just above this area.

[55] The landslide probability increases with both rainfall duration and intensity, although this latter variable affects the results more. The maximum probability value of 0.4 is reached for rainfall events with a duration of 3–5 days and an intensity greater than 100 mm/day. The lower landslide probability computed for rainfall events characterized by high intensity and long duration (10 days or more) is partly due to contouring effects and partly to the data bias previously discussed (section 4.3).

[56] The results can be visualized best as lines of equal landslide probability on a 2D plot (Figure 13). Interestingly, the isolines of Bayesian probability are roughly parallel to the regional threshold proposed by Guzzetti et al. [2007], indicating that our statistical analysis provides results comparable with those of the more traditional methods. The linearization of the isolines using a constant slope provides a series of possible rainfall thresholds associated with different probabilities of landsliding. This raises the question of which probability value, if any, is the “true value” to consider as an alert threshold. For example, the Guzzetti threshold corresponds to a landslide probability of less than 0.01. A reasonable choice would be to set the threshold where there is an abrupt increase of the probability of failure, which indicates a radical change of state of the system. In our case a threshold could then be defined at P(A|I, D) ≈ 0.05 since the landslide probability rapidly increases above this line (at least for short-duration rainfall). However, there is no general rule for this choice. The acceptable probability of failure is strongly related to the acceptable risk and the acceptable amount of damages and losses. Even a probability of 0.01 can be unacceptable in a vulnerable area.

Figure 13.

Lines of equal landslide probability in the rainfall duration-intensity chart (note that isolines are not equally spaced). Probability values refer to the average extent of the 176 rain gauge areas (about 65 km²). Black lines indicate possible rainfall thresholds for different values of acceptable landslide probability. Dashed line is the regional threshold proposed by Guzzetti et al. [2007]. The “no landslide” area indicates rainfall conditions that never resulted in landslides during the considered time period; the striped area rainfall conditions that never occurred during the same period. The thick ticks along the I and D axes indicate the spacing of the data points that underlie the interpolation (see the grid in Figure 12a).

[57] It should also be recalled that landslide probability strongly depends on the areal extent. In the scenario of spatially uniform rainfall over the area, the larger the area the higher the probability of having a landslide in response to a given rainfall, and this scale-dependence is implicitly considered in Bayesian analysis (see section 2.4). For example, if we apply the binomial theorem to upscale the computed probability to a larger area made of 40 adjacent rain gauges, which corresponds to the reference territorial unit of about 2600 km² used in the regional alert system, we obtain the results shown in Figure 14. The isolines of landslide probability are well defined, parallel, and with slope similar to the previous case, but the numerical values are considerably higher. The Guzzetti threshold now corresponds to a landslide probability of 0.1, and the slope break shown by the isolines (at about 0.2–0.4) is much lower than for a smaller area.
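
The scale dependence can be illustrated with the textbook probability of at least one occurrence in n independent trials. This is only a sketch: section 2.4 is not reproduced here, so the independence assumption, the function name, and the choice of n are assumptions of this illustration, and the values it returns need not match those plotted in Figure 14. In particular, the effective number of independent units may be smaller than the number of rain gauges if failures in adjacent areas are correlated.

```python
def upscale_probability(p_unit, n_units):
    """Probability of at least one landslide among n_units areas.

    Assumes each unit area has the same landslide probability p_unit and
    that units respond independently (an assumption of this sketch).
    """
    if not 0.0 <= p_unit <= 1.0:
        raise ValueError("p_unit must lie in [0, 1]")
    return 1.0 - (1.0 - p_unit) ** n_units
```

Under these assumptions the upscaled probability grows monotonically with the number of units and is zero wherever the unit probability is zero, so the "no landslide" area is unchanged by the upscaling.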

Figure 14.

Lines of equal landslide probability in the rainfall duration-intensity chart (note that isolines are not equally spaced). Probability values refer to the reference territorial unit used in the regional alert system (about 2600 km²) and were computed by applying the binomial theorem to the probability values shown in Figure 13. The “no landslide” area indicates rainfall conditions that never resulted in landslides during the considered time period; the striped area rainfall conditions that never occurred during the same period. The thick ticks along the I and D axes indicate the spacing of the data points that underlie the interpolation (see the grid in Figure 12a). Note that the color scale is different from Figure 13.

5. Discussion

5.1. Advantages of the Method

[58] Several methods have been proposed in the literature to determine the uncertainty related to the definition of rainfall thresholds using either frequentist statistics or Bayesian inference [Brunetti et al., 2010; Peruccacci et al., 2012]. In all these methods, however, only the rainfall events that resulted in landslides are considered. Therefore, the computed probabilities of landslide occurrence only represent a part of the overall uncertainty, namely that related to the scatter of the triggering rainfall. In our approach both the critical and non-critical rainfall events are considered and this allows all the uncertainties to be expressed in terms of probabilities. Bayes rule explicitly considers the fact that in complex geological conditions the same rainfall event may or may not result in a landslide depending on a large number of factors such as hydrological behavior, decrease in shear strength, long-term deformation rate, progressive failure, or human actions. In our sample application, for instance, although it is practically impossible to draw any dividing line between the rainfall that resulted and did not result in landslides (Figure 9), it is fairly easy to guess a reasonable threshold in terms of landslide probability (Figures 12–14).

[59] In addition, Bayes predictions can be updated dynamically as new data become available by simply using the posterior probability P(A|B) as the new prior P(A). Last, the proposed method is tailored to practical decision making, where it is generally essential to consider the cost of both missed alerts (false negatives) and false alarms (false positives).

5.2. Limitations

[60] Critics of Bayesian methods complain that Bayes rule does not tell us anything we didn't already believe [Gelman, 2008]. The posterior probability P(A|B), in fact, depends entirely on the likelihood function and on the prior and marginal probabilities: therefore, we do not learn anything beyond what is already suggested by the data. Although these criticisms can be seen as a strength of the method (the Bayes approach does not create anything but makes evident what is hidden in the data), they highlight the philosophical conflict between frequentists and Bayesians, and are justified when the prior is unknown and must be selected subjectively [Efron, 1986]. The objection is not relevant in situations where, as in our application, the prior and marginal distributions have a physical basis and can be derived from empirical data. Even in this case, however, it is important not to believe the results blindly, bearing in mind that Bayesian methods tend to support our preconceptions. For example, if the rainfall B never resulted in landslides during the observation period, the landslide probability P(A|B) will be zero because the likelihood P(B|A) is zero. However, given the uncertainties in the landslide catalogs, it can be dangerous to disbelieve results that may surprise us. It is therefore important to analyze long-term data series in order to evaluate the marginal distribution of rainfall P(B) and to use a complete landslide catalog which covers the same long period to estimate the likelihood function P(B|A). In some cases this can be a serious limitation to the application of the method.

[61] Another limitation (which also applies to traditional methods) concerns the long-term representativeness of the historical data. Natural systems are affected by radical changes in land use, land cover, rainfall pattern, and human expansion, which influence the frequency of reported landslides. Therefore, the conditions that favored or triggered landslides in the past may not be representative of the future. In probabilistic terms this means that prior probabilities may no longer be significant. Bayesian analysis allows us to add control variables or to redefine prior information to account for these changes, but we must be aware that probabilistic predictions only rely on previous knowledge. When (as in our study area) the frequency, style, and spatial distribution of landslides do not change significantly during the observation period we can assume that long-term variations are slow and that our predictions will be reliable for the near future.

[62] Another (minor) limitation of the method is that the results computed for two different areas cannot be directly compared. In fact, the landslide probability depends (besides other factors) on the extent of the reference area: the larger the area, the higher the probability of having a landslide. Moreover, even if the two areas are similar, different criteria could have been adopted to compile the landslide catalog or to define the rainfall event, leading to different values of likelihood, prior, and posterior probability. Some sort of normalization must be done in these cases in order to compare the landslide probability.

5.3. Specific Results

[63] The proposed approach was applied to the Emilia-Romagna Region of Italy (12,000 km² of mountain territory) taking advantage of the landslide historical archive that includes more than 9,000 records. The vastness of the archive offers a rare opportunity to explore the relationship between rainfall and landslides. A careful selection was made to reduce uncertainties about landslide timing and/or location and to exclude landslides possibly triggered by other factors (e.g., snowmelt, anthropic), leaving 2741 landslides associated with 1168 well-defined triggering rainfalls.

[64] The objective identification of a rainfall event and the choice of representative parameters for its description are two closely related problems. They obviously influence the results of any deterministic or probabilistic rainfall threshold analysis. Based on the analysis of triggering rainfall patterns, we propose 3 days and 5 mm of rain as the criterion to locate the beginning (criterion is exceeded) and the end (criterion is not exceeded) of rainfall events. Antecedent rainfalls are measured backward from the onset of the triggering event. In Bayesian inference, an objective identification criterion is doubly important since it is used to obtain both marginal and conditional distributions.

[65] Our results show a clear dependence between the parameters describing the rainfall event (total rainfall, duration and mean intensity) and the probability of landsliding (Figure 10). Much more surprisingly, we found no clear correlation with the antecedent rainfall in the 14 and 30 days before the event (Figure 11), which seems to be unimportant compared to the event rainfall. These findings were systematically confirmed in the sensitivity analysis (section 5.4). In the literature, there is general agreement that the role of antecedent rainfall decreases with soil conductivity and thickness of the unstable mass [Bonnard and Noverraz, 2001; Campbell, 1975; Corominas, 2000]. Since our data set includes many landslides in fine-grained soils we would expect to register some sort of dependence on antecedent rainfall conditions.

[66] A possible explanation of this result may lie in the criterion used for rainfall identification. Most reported works have adopted a fixed rainfall duration (one to a few days) that may cause some event rainfall to be accounted for as antecedent [Chleborad et al., 2006; Glade et al., 2000; Jaiswal and van Westen, 2009; Kim et al., 1991; Terlien, 1998]. In those few cases where researchers adopted a clear distinction between antecedent and event rainfall, the relevance of antecedent precipitation did not emerge as clearly [Aleotti, 2004]. Also, it is important to bear in mind that our analysis provides the probability of having at least one landslide and does not consider multiple events. This might help to explain the result because widespread (multiple) landsliding proved to be satisfactorily related to antecedent moisture conditions [Godt et al., 2006].

[67] In any case, the substantial independence from antecedent rainfall indicates that in the study area the slopes react rapidly to the rainfall events. This finding agrees with the field data collected by Berti and Simoni [2010], who monitored representative clay slopes in the area detecting fast and transient increases of pore pressures in response to individual rainfall events. The fact that nearly all the historical landslides occurred during (or immediately after) the rainfall event is further evidence of the strong linkage between slope stability and short-term hydrologic response.

[68] Focusing on the possible effects of a rainfall event, the two-dimensional Bayesian inference allowed a probability of triggering damaging landslides to be assigned to any event in the ID space. Lines of equal probability show a trend similar to other rainfall thresholds defined using conventional methods. However, in this case the impossibility of separating rainfall events that trigger or do not trigger landslides is formally recognized and any “user” faces the difficult task of choosing appropriate probabilities to associate with alert levels or other specific actions. The critical value of landslide probability would depend on the vulnerability and value of exposed properties and infrastructures, and it is closely related to the concept of acceptable risk. It should then be evaluated by means of a formal risk analysis. From the point of view of researchers investigating landslide mechanisms, we stress that the numerical value of probability is tightly connected to the reference area chosen for the analysis. For example, in the specific application presented here, the probability of having at least one landslide within the reference area of a single rain gauge (∼65 km²) is 0.15 for a two-day rainfall with average intensity of 40 mm/day. The same rainfall has a probability of 0.80 within a larger territorial unit (∼2600 km²). In other terms, the selection of appropriate levels of probability cannot be separated from the specific problem that is addressed nor from the area of application.

[69] Finally, it must be borne in mind that the database includes different landslide types in different materials, which were analyzed together in order to have a suitable number of data (section 3.1). Since it is physically reasonable to expect different landslide types to be triggered by different rainfall conditions, the proposed probabilistic approach cannot provide reliable insight into landslide mechanics. More focused theoretical and observational studies are required for this purpose. In addition, the lack of resolution in the rainfall data (only daily rainfall is available for the whole period) prevents a more detailed analysis of the slope response to rainfall.

5.4. Uncertainty Associated With the Input Parameters

[70] To verify and validate the results, a sensitivity analysis was carried out by changing the input values of the parameters according to the associated uncertainty. Bayesian probability was recomputed for different combinations of input data and the results were compared with those described above. The considered range of variation of the parameters is listed in Table 2. In particular: (1) we included the “uncertain events” (Type 2) in the data set of the triggering rainfall in order to evaluate the sensitivity of the results to uncertainties in the event definition; (2) for all the landslides triggered before the end of a rainfall event (40% of the cases) we assumed that the duration of the critical rainfall (triggering event) is equal to the overall rainfall duration, in order to make the two data sets of triggering and non-triggering rainfall perfectly comparable (see section 4.1); (3) we assigned different values to the parameters DT and ET used for automatic rainfall detection (DT = 1 day and ET = 5 mm according to the values used by Pizziolo et al. [2008] for the same data set); (4) we restricted the analysis to the complete period of the catalog (1951–2009) [Rossi et al., 2010].

Table 2. Range of Parameters Considered for Sensitivity Analysis^a

Parameter | Combination 1 | Combination 2
Historical landslides | Only the well-defined events [Type 1] | Well-defined + uncertain events [Type 1 + Type 2]
Rainfall detection criterion | DT = 3 days, ET = 5 mm [calibrated parameters] | DT = 1 day, ET = 5 mm [Pizziolo et al., 2008]
Duration of the triggering event | The duration of the triggering event is always equal to the rainfall duration (even if the landslide occurs before the end of the rainfall) | The triggering event stops when the landslide occurs (or at the end of the rainfall if the landslide occurs later)
Period of the analysis | All the available data [1939–2009] | Only the complete part of the catalog [1951–2009] [Rossi et al., 2010]

^a The left column (Combination 1) corresponds to the combination of parameters used in Figures 10–14.
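The rainfall detection criterion in Table 2 can be sketched as a simple segmentation of a daily rainfall series. The sketch below is only an illustration under a stated assumption: a day counts as "wet" when its rainfall exceeds ET, and an event ends once at least DT consecutive non-wet days follow. Other readings of the criterion (for example, ET applied to the cumulative rainfall of the separating period) are possible, and the function name `detect_rainfall_events` and the returned field names are hypothetical.

```python
def detect_rainfall_events(daily_mm, dt_days=3, et_mm=5.0):
    """Split a daily rainfall series (mm/day) into rainfall events.

    Assumed rule (illustrative): a day is 'wet' if rain > et_mm; an event
    runs from its first wet day to the last wet day that precedes a run
    of at least dt_days non-wet days.
    """
    events = []
    start = None      # index of the first wet day of the open event
    last_wet = None   # index of the most recent wet day

    def close(first, last):
        total = sum(daily_mm[first:last + 1])
        duration = last - first + 1
        events.append({
            "start": first,
            "end": last,
            "duration_days": duration,
            "total_mm": total,
            "intensity_mm_per_day": total / duration,
        })

    for day, rain in enumerate(daily_mm):
        if rain > et_mm:
            if start is None:
                start = day
            last_wet = day
        elif start is not None and day - last_wet >= dt_days:
            # Enough non-wet days have passed: the open event is over.
            close(start, last_wet)
            start = None

    if start is not None:
        # The series ended while an event was still open.
        close(start, last_wet)
    return events


series = [0, 10, 20, 0, 0, 0, 8, 2, 12, 0, 0, 0, 0]
for dt in (3, 1):
    found = detect_rainfall_events(series, dt_days=dt, et_mm=5.0)
    print(dt, [(e["start"], e["end"], e["total_mm"]) for e in found])
```

With this toy series, DT = 3 days merges the last three rainy days into one event, whereas DT = 1 day splits them, which shows why the number and duration of events depend on the detection parameters.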

[71] The sensitivity analysis shows that landslide probability is very sensitive to the rainfall detection criterion (point 3) and, to a lesser extent, to the duration of the critical rainfall (point 2). Adding the “uncertain events” (point 1) or restricting the analysis to the complete catalog (point 4) has instead little effect on the results. Although the values of landslide probability may change by up to 30% for some combinations of the parameters, in all cases the shape of the probability distribution in the ID plane is remarkably similar. The sensitivity analysis confirms the trend shown in Figures 12 and 13, with peaks of landslide probability in the same rainfall classes. Therefore, the uncertainty associated with the input parameters does not have a great influence on the overall pattern of the results for our sample data set.
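A comparison of this kind can be sketched as follows, assuming each probability distribution is stored as a mapping from a rainfall class (here a hypothetical (duration bin, intensity bin) pair) to landslide probability. The function name `compare_probability_grids` is hypothetical, and the change is measured here as a relative change with respect to the baseline, which is an assumption about how a figure such as "30%" might be expressed.

```python
def compare_probability_grids(base, alt):
    """Compare two landslide-probability grids keyed by rainfall class.

    Returns (max_relative_change, same_peak), where max_relative_change
    is taken over classes present in both grids with a non-zero baseline
    probability, and same_peak tells whether both grids have their
    maximum probability in the same class.
    """
    if not base or not alt:
        raise ValueError("both grids must contain at least one class")

    common = [c for c in base if c in alt and base[c] > 0]
    max_rel = max(
        (abs(alt[c] - base[c]) / base[c] for c in common),
        default=0.0,
    )
    same_peak = max(base, key=base.get) == max(alt, key=alt.get)
    return max_rel, same_peak


# Toy values, not taken from the study.
baseline = {(0, 0): 0.10, (1, 1): 0.50}
variant = {(0, 0): 0.13, (1, 1): 0.45}
change, same_peak = compare_probability_grids(baseline, variant)
print(round(change, 2), same_peak)
```

In this toy case the probability values differ noticeably in one class while the peak stays in the same class, which mirrors the kind of outcome described in the paragraph above.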

6. Conclusions

[72] The following conclusions can be drawn from the present study.

[73] 1. Bayesian statistics offers a convenient way to evaluate rainfall thresholds in complex geological environments, where the distinction between critical and non-critical rainfall is difficult and conventional methods become highly subjective.

[74] 2. In these cases it is essential to consider both the probability that a rainfall event resulted in landslides (likelihood function) and the probability that a rainfall event did not trigger landslides (marginal rainfall distribution) in order to express the uncertainties in terms of probability.

[75] 3. The computed landslide probability is scale-dependent and refers to a well-defined area. Upscaling to a larger area can be done by applying the binomial model.

[76] 4. The application to the Emilia-Romagna data set (northern Apennines, Italy) proved that the proposed method is effective. Though there was no obvious difference between critical and non-critical rainfall, Bayesian analysis clearly showed an abrupt increase of landslide probability in the rainfall duration-intensity plane, which allows one to define an operational threshold, that is, a critical level of rainfall beyond which we observe a radical change of state of the system.

[77] 5. In the study area, landslide triggering is largely controlled by single rainfall episodes (here defined as a period of continuous rainfall separated from the next one by a period of at least 3 days with rainfall E ≤ 5 mm). Event rainfall, event duration, and average rainfall intensity are significantly correlated with landslide probability. Quite surprisingly, antecedent rainfall in the previous 14, 30, or even 60 days seems to be unimportant for landslide triggering.
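The role of the likelihood and the marginal rainfall distribution named in conclusion 2 can be illustrated with a minimal sketch of Bayes' theorem on a binned duration-intensity grid, P(L | D, I) = P(D, I | L) P(L) / P(D, I). This is not the authors' code: the function names, the input layout (one tuple per rainfall event) and the bin edges are hypothetical choices made for the example.

```python
from bisect import bisect_right
from collections import Counter


def bin_index(value, edges):
    """Index k of the bin [edges[k], edges[k+1]) holding value, else None."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def landslide_probability(events, dur_edges, int_edges):
    """Posterior landslide probability per (duration, intensity) class.

    events: iterable of (duration_days, intensity_mm_per_day, triggered).
    Returns {(dur_bin, int_bin): P(L | D, I)} for classes with rainfall.
    """
    n_all, n_trig = Counter(), Counter()
    total = total_trig = 0
    for duration, intensity, triggered in events:
        cell = (bin_index(duration, dur_edges), bin_index(intensity, int_edges))
        if None in cell:
            continue  # event falls outside the grid
        total += 1
        n_all[cell] += 1
        if triggered:
            total_trig += 1
            n_trig[cell] += 1
    if total == 0:
        return {}

    prior = total_trig / total                      # P(L)
    posterior = {}
    for cell, count in n_all.items():
        likelihood = n_trig[cell] / total_trig if total_trig else 0.0  # P(D,I | L)
        marginal = count / total                    # P(D,I)
        posterior[cell] = likelihood * prior / marginal
    return posterior


# Toy data, not taken from the study.
toy = [(1, 10, False)] * 3 + [(1, 10, True), (2, 40, True), (2, 40, False)]
print(landslide_probability(toy, dur_edges=[0.5, 1.5, 2.5], int_edges=[0, 20, 60]))
```

With counts as inputs, the posterior in each class reduces to the fraction of rainfall events in that class that triggered landslides, which is why the non-triggering events must be counted alongside the triggering ones.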

[78] The proposed method needs to be tested under a wide variety of geological conditions to prove its practical effectiveness. The authors are willing to share the algorithms and Matlab codes developed for the analysis with anyone interested.


[79] This work was supported by the Italian Ministry of University and Scientific Research (PRIN 2007 N.2007ASECS4) and by the Civil Protection Agency of the Emilia-Romagna Region (ASPER-RER, 2011–2015).