Detecting fingerprints of landslide drivers: A MaxEnt model
HumNat Lab, Department of Biological Systems Engineering, Virginia Tech, Blacksburg, Virginia, USA
Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, USA
Institute for Critical Technology and Applied Science, Virginia Tech, Blacksburg, Virginia, USA
Department of Agricultural and Biological Engineering and Florida Climate Institute, University of Florida, Gainesville, Florida, USA
Corresponding author: M. Convertino, Department of Agricultural and Biological Engineering and Florida Climate Institute, University of Florida, Frazier Rogers Hall, Museum Rd., Gainesville, FL 32611, USA. (email@example.com)
 Landslides are important geomorphic events that sculpt river basins by eroding hillslopes and providing sediments to coastal areas. However, landslides are also hazardous events for socio-ecological systems in river basins causing enormous biodiversity, economic, and social impacts. We propose a probabilistic spatially explicit model for the prediction of landslide patterns based on a maximum entropy principle model (MAXENT). The model inputs are the centers of mass of historical landslides and environmental variables at the basin scale. The model has only three parameters requiring calibration: the threshold for the network extraction, the trade-off factor between model complexity and accuracy, and the threshold of landslide susceptibility. The calibration on a subset of observations detects the environmental drivers and their relative importance for landslides. We employ the model in the Arno basin, Italy, selected because of its widespread landslide dynamics and the large availability of landslide observations. The model reproduces the size distribution and location of over 27,500 historical landslides for the Arno basin with an accuracy of 86% obtained from the variable-landslide inference on about 37% of observed landslides. Future landslide patterns are predicted for 17 A1B and A2 rainfall scenarios and for a multimodel ensemble from 2000 to 2100. We show that potential landslide hazard is strongly correlated with variation in the 12 and 48 h rainfall with a return time of 10 years. As the climate gets wetter, the average probability of landslides gets higher which is shown by the landslide size distribution. Hence, the landslide size distribution is a fingerprint of the geomorphic effectiveness of rainfall as a function of climate change. MAXENT is proposed as a parsimonious model for the prediction of landslide patterns with respect to more complex models. 
The need for very accurately sampled and delineated landslides is lower than for other prediction models. Moreover, the model informs about the drivers of landslides and their relative importance without assumptions on the main triggering factors. This is important to inform monitoring of environmental variables. Our modeling approach can enhance the planning of socio-ecological systems in river basins by improving the accuracy of landslide prediction in space and time.
1.1 Landslides, Climate Change, and Landslide Size Distribution
 Landslides are major and abrupt hazards whose devastating power has long-term impacts on river basin ecosystems [Schulz et al., 2009; Mangeney, 2011; Huggel et al., 2012]. Human populations, infrastructure, and biodiversity in river basins are threatened by landslides with a varying degree of risk depending on their vulnerability and exposure, and on the landslide size and frequency [Dislich et al., 2009; Vorpahl et al., 2010]. With “socio-ecological systems” (SES) [Gardner and Dekens, 2007], we refer to the combination of urban and natural systems of a landscape. Rainfall-triggered landslides are the most common type of landslides, particularly in temperate biomes [Buma and Dehn, 2000; Burlando and Rosso, 2002; Larsen et al., 2010]. Climate change will strongly affect rainfall, and thus will affect landslide size and frequency [Schulz et al., 2009; Lin et al., 2010; Winter et al., 2010; Lee and Chi, 2011; Mangeney, 2011; Huggel et al., 2012]. In general, anomalous variation of rainfall is a key driver of climate change impacts on landscapes [Rinaldo et al., 1995; Huggel et al., 2012]. Variation in rainfall affects groundwater resources, species richness, and sediment discharge [Dankers and Feyen, 2009]. Sediment discharge is also strongly affected by landslide events [Goubanova and Li, 2007; Catani et al., 2010]. On average, global rainfall is expected to increase with warming [Westra et al., 2012]. However, model projections show that rainfall does not scale linearly with surface air temperature; thus, rainfall projections are characterized by a high degree of uncertainty due to the complexity of the climatological processes [Weigel et al., 2010; Schaller et al., 2011]. The landslide risk to SES in river basins is expected to vary in time due to changing climate. Therefore, there is a need to predict landslides with parsimonious but accurate models.
 One of the most commonly studied landslide variables is the landslide size, whose exceedance probability follows a power law distribution [Stark and Hovius, 2001; Turcotte, 2002; Guzzetti et al., 2002; Guzzetti, 2006; Stark and Guzzetti, 2009]. The power law structure of the landslide size distribution has a striking ubiquity in ecosystems. Brunetti et al. examined 19 data sets with measurements of landslide volume, V (proportional to landslide size), for subaerial, submarine, and extraterrestrial mass movements. The individual data sets covered different landslide types, including rock fall, rock slide, rock avalanche, soil slide, slide, and debris flow, with individual landslide volumes ranging over 10⁻⁴ ≤ V ≤ 10¹³ m³. The scaling behavior of pdf(V), which is equivalent to pdf(S) where S is the landslide size, held over 17 orders of magnitude for the ensemble of the 19 data sets and was independent of lithological characteristics, morphological settings, triggering mechanisms, length of period and extent of the area covered by the data sets, presence or lack of water in the failed materials, and magnitude of gravitational fields. Brunetti et al. argued that the statistics of landslide volume is conditioned primarily on the geometrical properties of the slope or rock mass where failures occur. Due to the disparity in the mechanics of rock falls and slides, rock falls exhibit a smaller scaling exponent (1.1 ≤ α ≤ 1.4) than slides and soil slides (1.5 ≤ α ≤ 1.9). However, considering all of the observed landslides, the exponent was in the range 1.3 ≤ α ≤ 1.6, which gives an average characterization of the entire sample of observations. Power laws have been observed for many phenomena in nature. Species patches and forest fires are two widely known examples. 
Size distributions of species patches, soil moisture patches, nutrient patches, and ponds are widely used to characterize spatial changes in species abundance, soil-water availability, nutrient cycling, and hydrologic regime, respectively [Kéfi et al., 2011]. However, few studies have explored how natural and anthropic stressors change the scaling exponent of the power law distribution or the shape of the size distribution of the analyzed variable. Song et al. and Zheng et al. addressed this topic for forest fires, Kéfi et al. investigated it for vegetation patches subjected to grazing, and Scanlon et al. studied it for vegetation patches as a function of rainfall. In this study, one of our purposes is to examine the shape of the landslide size distribution and how it is affected by climate change. This has implications for the assessment of both landslide hazard and risk of SES at the local scale, because a spatial landslide pattern corresponds to each landslide size distribution. The landslide size distribution, like other patterns, emerges from the combined effects of many complex processes. For example, Densmore and Hovius used the hillslope-to-channel distance as a fingerprint of earthquake and rainfall processes that promote bedrock landslides. Further efforts are needed to understand the coupling of the landslide size distribution to the size distributions of other biogeomorphological and climatological variables [Dislich et al., 2009; Malamud, 2004].
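The power law behavior of the size distribution discussed above can be illustrated with a minimal sketch. It assumes a pure continuous power law pdf(S) ∝ S^(−α) above a lower cutoff, and uses the standard maximum-likelihood estimator for α; the function names, the cutoff `s_min`, and the synthetic sample are illustrative assumptions and are not taken from the landslide inventory or from the model presented in this paper.

```python
import math
import random

def sample_power_law(alpha, s_min, n, rng):
    """Draw n synthetic sizes from pdf(s) ~ s**(-alpha), s >= s_min (inverse transform)."""
    return [s_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def fit_exponent(sizes, s_min):
    """Maximum-likelihood estimate of alpha for a continuous power law above s_min."""
    tail = [s for s in sizes if s >= s_min]
    log_sum = sum(math.log(s / s_min) for s in tail)
    return 1.0 + len(tail) / log_sum

def exceedance(sizes, s):
    """Empirical exceedance probability P(S >= s)."""
    return sum(1 for x in sizes if x >= s) / len(sizes)

# Synthetic example (assumed values): alpha = 1.5, cutoff of 500 size units.
rng = random.Random(0)
sizes = sample_power_law(alpha=1.5, s_min=500.0, n=20000, rng=rng)
alpha_hat = fit_exponent(sizes, s_min=500.0)
print(f"estimated alpha: {alpha_hat:.3f}")
print(f"P(S >= 100 * s_min): {exceedance(sizes, 50000.0):.3f}")
```

A shift in the fitted exponent between two samples (for example, two rainfall scenarios) is one simple way to quantify a change in the shape of the size distribution, although a double-Pareto fit would be needed to capture both scaling regimes.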
1.2 Previous Models of Landslide Prediction
 The calculation of landslide hazard in terms of spatial location of landslides, size, and frequency is the first step in risk assessment and management of SES [Guzzetti, 2006]. Exposure and vulnerability of SES are assessed after the detection of the landslide hazard, and the risk is determined by the product of hazard, vulnerability, and exposure [Ardizzone et al., 2002; Guzzetti et al., 2005; Romeo et al., 2006; Ardizzone et al., 2008; Fell et al., 2008; de Luiz Rosito Listo and Vieira, 2012]. The determination of the hazard of landslides has traditionally been performed using heuristic, statistical, or correlative models [Catani et al., 2005; Caniani et al., 2008; Goetz et al., 2011] and deterministic or stochastic physical-based models [Ermini et al., 2005]. Many physical-based models have been developed for reproducing the relevant physical processes of landslide dynamics [D'Odorico and Fagherazzi, 2003; D'Odorico et al., 2005; Rosso et al., 2006; Talebi et al., 2008]. However, physical-based models are complex [Muller and Muñoz Carpena, 2012; Ruddell et al., 2013] and difficult to calibrate due to the large number of parameters and variables that are input factors of the models [Saltelli et al., 2004]. This calibration often relies on costly field observations of multiple factors of the processes investigated [Catani et al., 2005]. Frequently, not all of these data are used as information for the models. Moreover, physical-based models are based on a priori assumptions about the drivers and their importance in triggering landslides. These assumptions constrain the model architecture by excluding other potential factors, and they also restrict monitoring plans to the variables considered. Therefore, the transferability of physical-based models from one area to another is often limited, and model complexity does not fully allow the exploration of landslide process complexity. 
The assumptions about landslide processes create an “ignorance loop” between the process, the model, and the monitoring.
 Heuristic approaches are also used for assessing the relationships between environmental drivers and landslide occurrences. The pitfalls of heuristic methods include their large uncertainty and lack of a rigorous quantitative framework. Statistical approaches are generally aimed at modeling the relationships that associate the presence of a landslide to the physical terrain attributes responsible for its occurrence. For this reason, statistical or correlative approaches can be considered as indirect mapping techniques. However, correlative models assume a priori dependencies between selected environmental drivers and landslide occurrences and are generally constrained to run at the pixel scale. Thus, typically these models do not explore the full spectrum of landslide drivers. An example is the model in Berti et al.  where the focus is on rainfall-triggered landslides, yet the rainfall and the rainfall-triggering threshold are the only variables considered. Hence, there is a lack of models at a medium level of complexity [Muller and Muñoz Carpena, 2012; Ruddell et al., 2013] that guarantee high accuracy in representing spatiotemporal landslide patterns based on limited data and without a priori assumptions. The absence of a priori assumptions allows for the exploration of factors which may be driving landslide patterns by analyzing the relative importance and interactions of all environmental variables that are screened. There are tools derived from statistical physics, such as the MAXENT theory that is frequently used in geomorphology and ecology [Kleidon et al., 2010; Nieves et al., 2010; Brunsell and Anderson, 2011; Wang and Bras, 2011; Ruddell et al., 2013], which simplify the computation of ecosystem patterns.
1.3 Proposed Approach
 Here we propose a model based on the statistical physics principle of entropy maximization [Banavar et al., 2010] derived from information theory [Shannon, 1948; Ruddell et al., 2013] which provides a mid-level complexity and high accuracy model useful in predicting spatiotemporal patterns of landslide occurrence and size. The selected model (MAXENT) [Phillips et al., 2006; Phillips and Miroslav, 2008; Elith et al., 2010] is a pattern-oriented model [Levin, 2003; Grimm et al., 2005; Paola and Leeder, 2011] that was developed to predict species distributions in ecosystems based on species occurrences regardless of the biological details of the species considered. The purpose of MAXENT is to detect the driving set of variables for which a pattern occurs. Thus, this set of variables creates the most “susceptible” conditions of the process considered, where the susceptibility is a spatially dependent variable.
 In this paper, MAXENT combines landslide data from heuristic models [Catani et al., 2005] with a correlative approach similar to that of statistical methods for landslides. However, no assumption is made on the drivers of landslides. The model architecture for landslides is the same as the one for modeling species distributions. Moreover, the model can easily be used to predict species distributions simultaneously with landslides. We take a statistical physics approach that focuses on the similarities, rather than the differences, among landslides [Banavar et al., 2010; Ruddell et al., 2013]. The maximum entropy principle is, in fact, an inference technique for constructing an estimate of a probability distribution (in our case, of the landslide susceptibility) using the least available information rather than all data available. Thus, MAXENT requires neither the accurate delineation of all historical landslides nor the assumption of causal relationships between environmental variables and landslide occurrences. The model itself identifies the necessary and sufficient variables and the values of the model factors that maximize the prediction accuracy of landslides in terms of their spatial location and size. The entropy that is maximized is the information useful to predict the desired patterns with the highest accuracy. In contrast to other models, it extracts the driving factors, their importance, and their interactions for representing the observed landslide patterns without making a priori assumptions on these factors. The coexistence of environmental variables and their values is what determines the landslide susceptibility. Slope stability, in fact, is not ruled by one variable alone, and failure conditions are commonly reached through a combination of relevant factors.
 These tools allow one to transcend particular microscale features of processes and focus only on well-established and emergent macroscale system features dictated by the drivers of the observed patterns [Pueyo et al., 2007; Ruddell et al., 2013]. In this respect, MAXENT is an innovative and parsimonious model that optimizes the trade-off among model complexity, information relevance, and uncertainty [Muñoz Carpena and Muller, 2009; Muller and Muñoz Carpena, 2012], focusing on the patterns rather than on the processes. Simple patterns emerge from the complexity of system components' processes [Levin, 2003; Paola and Leeder, 2011] and these patterns are ultimately useful for management purposes and for prediction of other socio-ecological patterns. Considering management, the model constitutes an approach for assessing the spatiotemporal hazard of landslides in river basin ecosystems. Yet the model is potentially applicable for reducing socioeconomic losses by planning the development of socio-ecological systems in areas where landslide hazard or exposure is low [Gardner and Dekens, 2007].
1.4 Framework and Hypotheses
 The prediction of size, frequency, and location of landslides is enormously useful in determining areas with potentially elevated hazard of landslide events. To illustrate the utility of the MAXENT model [Phillips et al., 2006; Phillips and Miroslav, 2008; Elith et al., 2010], we use historical data from the Arno basin (section 2.2), chosen because of widespread and frequent landslide activity and data availability. Figure 1 shows the modeling framework. The model is run conditional to past landslide occurrences and to a set of selected environmental variables at the basin scale (Figure S1 in the supporting information). We initially consider a large set of environmental variables (17) from which we detect the most important set of variables (6) to describe the observed landslides at a resolution of 500 m². We then select all the available variables that we believe are influencing the landslide patterns without making any a priori assumption. The screening of variables is performed with a jackknife test (section 2.6) that calculates the variable importance by running the models with different sets of variables, excluding each variable in turn. All the environmental variables considered are described in Text S1.1 (supporting information). The observed landslides are both smaller and larger than the minimum grid size (500 m²) used by the model. The river network of the Arno (Figure 2) is extracted with a morphology-based criterion capable of detecting the channelized valleys of the basin [Luo and Stepinski, 2008] (section 2.3). The parameters of the model to calibrate are the threshold on a morphology-based curvature for the extraction of the river network, the threshold on the landslide susceptibility for which a pixel is considered to be part of a landslide, and the Lagrangian multiplier (or “global regularization parameter”) of the predicted entropy by MAXENT (sections 2.6 and 2.4). 
The Lagrangian multiplier that multiplies all environmental features finds the optimal trade-off between model complexity (defined as the number of environmental variables) and model accuracy. The set of parameters is identified by minimizing the prediction error between the observed and modeled landslides in 2010. The threshold on landslide susceptibility calculated by MAXENT is used to calculate the landslide size in an analogy to ecology wherein there is a threshold to the habitat suitability for calculating the species patch size [Elith et al., 2010]. The landslide size is defined as the sum of adjacent pixels whose landslide susceptibility is higher than the selected threshold. Once all landslides are delineated, the landslide size distribution is determined (section 2.5). Each predicted landslide pattern is the average over 50 replicates of the MAXENT model. Predictions of landslide patterns are made from the year 2000 to 2100 as a function of climate change scenarios (section 2.4). The following hypotheses and assumptions are made.
 We assume that the river network is statistically invariant in the simulated period. In this study, we consider landscape processes at temporal scales smaller than geological timescales; thus, invariant properties of river basins [Rodriguez-Iturbe and Rinaldo, 1997] that hold also for the Arno basin are preserved.
 We hypothesize landslide patterns as stationary realizations of a spatial stochastic process [Cox and Isham, 1980] of activation/deactivation of nonchanneled sites within river basins. Yet each landslide is thought of as a point occurrence [Cox and Isham, 1980; Finkenstadt et al., 2006] that coincides with its center of mass regardless of the landslide size (Figure 3). The maximum entropy principle predicts the landslide susceptibility of each pixel “nearby” the centers of mass as a function of the relationships inferred between the environmental variables and the centers of mass of randomly selected observed landslides (“background points”) at the landscape scale.
 We hypothesize that rainfall is the main driver of landslides in the Arno basin. Because of this hypothesis, we explore the effect of climate change (A1B and A2 scenarios from 2000 to 2100) on landslide patterns by just varying the rainfall. Variations of the double-Pareto distribution of the landslide size are expected as a function of the rainfall. Hence, the landslide size distribution is hypothesized as a potential fingerprint of climate change. Small and large landslides are determined considering the double scaling regime of the landslide size distribution.
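The delineation rule stated in the framework above (a landslide is a group of adjacent pixels whose susceptibility exceeds the selected threshold) can be sketched as a connected-component search. This is only an illustration: the 4-neighbor adjacency, the function name `landslide_sizes`, the `pixel_area` argument, and the toy grid are assumptions, since the neighboring criterion is specified elsewhere in the paper.

```python
from collections import deque

def landslide_sizes(susceptibility, threshold, pixel_area=1.0):
    """Return the size of each patch of adjacent pixels with susceptibility > threshold.

    Adjacency is assumed to be 4-neighbor (shared edges); sizes are pixel counts
    multiplied by pixel_area.
    """
    rows, cols = len(susceptibility), len(susceptibility[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or susceptibility[r][c] <= threshold:
                continue
            # Breadth-first search over the patch that contains pixel (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            count = 0
            while queue:
                i, j = queue.popleft()
                count += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj]
                            and susceptibility[ni][nj] > threshold):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            sizes.append(count * pixel_area)
    return sizes

# Toy susceptibility map in [0, 1] (assumed values) with three separate patches.
grid = [
    [0.9, 0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1, 0.6],
    [0.1, 0.1, 0.1, 0.7],
    [0.1, 0.9, 0.1, 0.1],
]
print(sorted(landslide_sizes(grid, threshold=0.5)))
```

The list of patch sizes returned here is the raw material for the landslide size distribution; raising or lowering the threshold merges or splits patches, which is why the threshold is one of the calibrated parameters.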
2 Materials and Methods
2.1 Hydrogeomorphology of the Arno Basin
 The combination of geography, geology, and climate change makes Italy one of Europe's most landslide prone territories, with an average of 54 lives lost each year for the last half century [Canuti et al., 2000; Bianchi and Catani, 2002; ESA, 2005]. The Arno basin in Tuscany (Italy) is one of the most active in Italy for landslide activity [Catani et al., 2005]. The presented integrated modeling of rainfall-triggered landslides (Figure S1) is applied to the Arno basin (Figure 2) as a case study.
 The basin, whose extent and mean elevation are 9.13 × 10³ km² and 353 m above sea level, respectively, is representative of a Mediterranean climate, with a total annual rainfall from about 700 to 1700 mm. The basin is located in one of the regions in Italy that are expected to be significantly affected by global climatic change [Dankers and Feyen, 2009; Coppola and Giorgi, 2010]. The Arno River is 241 km long. Heavy storms mainly occur in autumn following dry summers. The mean monthly distribution for summer and autumn rainfall clearly denotes that the analysis of potential climate change impacts is important not only for floods but also for water shortages caused by the potential increase in summer dryness that is already observed in the current climate [Bianchi and Catani, 2002; Burlando and Rosso, 2002; Coppola and Giorgi, 2010]. The Arno basin has recently shown increases in monthly temperatures, increases in winter rainfall, and typically decreasing summer rainfall but with amplified variability compared to the winter season. Hence, these extremes are likely affecting the stability of hillslopes, and an alteration of landslide frequency and magnitude is expected. The mixture of mountainous and flat areas offers a good opportunity to test the effect of global change on landslide activity at different elevations and for different local climatic patterns. The basin shows an ample variability in its biogeomorphological variables (i.e., mainly vegetation and geology). Catani et al. divided the Arno basin into five macroareas as a function of lithology and microclimate.
 The more than 27,500 recorded landslides (reported in Catani et al. and updated in 2010; Figure S1) in the territory of the Arno basin are the result of the evolution of the landscape over the Holocene. These landslides are classified as 75% rotational, 17% shallow slow translational, 2% rock falls, 1% rock flow, 1% topple, and 1% lateral spreading [Catani et al., 2005]. The inventory is derived from aerial photo interpretation and remote sensing analysis techniques on the Synthetic Aperture Radar (SAR) images detected by the European Remote Sensing spacecraft [Catani et al., 2005; Lu et al., 2009, 2011]. About 350 SAR images were interferometrically processed by means of the Permanent Scatterers (PS) technique in Catani et al. This allowed Catani et al. to detect about 600,000 PS, corresponding to natural reflectors on the ground where it is possible to assess precisely the velocity history of landslides over the investigated period [Catani et al., 2005b]. The landslides affecting the Arno River basin, and more generally the Northern Apennine landslides in Italy, mainly move by reactivation of dormant slides. These landslides were probably initiated during the early phases of the Holocene as a consequence of the ice retreat that occurred at the end of the last glaciation [Bertolini et al., 2004]. In the Arno River basin, the frequency of first-time landslides is very low and the susceptibility of a given land area is largely a function of the presence or absence of known instability [Catani et al., 2005]. Unfortunately, the historical landslide data set does not presently include the time at which each landslide occurred, which would allow a backcasting of landslide predictions in time.
2.2 River Network Extraction
 The river network extraction is performed with a morphology-based criterion [Luo and Stepinski, 2008] applied to the elevation field. The DEM (Text S1.1) (Figure 2a) is coarse grained from 30 m to 500 m resolution (Figure 4), corresponding to the scale at which the prediction of landslides is performed. A curvature value kt is calculated for every pixel as
where z is the elevation (Figure 4a), and the subscripts of z indicate differentiation (zx and zy, and zxx and zyy are the first and second derivatives of the elevation in the x and y directions, respectively). Thus, zx and zy are the slopes and zxx and zyy are the curvatures in both directions. The curvature in equation (1) introduced by Luo and Stepinski is a combination of a planform curvature and planar curvature [Bogaart and Troch, 2006]. For the extracted network, all the network-dependent geomorphic variables described in Text S1.1 (Strahler stream order, Hack's length, geomorphic classes, stream diameter, rescaled distance, Euclidean distance, and the hillslope-to-channel distance) are calculated. Positive values of kt indicate concave upward (U-like shape) morphology corresponding to valleys. We define k0 as a threshold above which a terrain is considered sufficiently curved upward to be a valley; thus, for an optimal value of k0, the condition kt ≥ k0 defines all the channelized valley pixels. The value of k0 is defined simultaneously with the optimal values of the regularization parameter and of the threshold on the landslide susceptibility (section 2.4). Together, these three parameters predict the observed landslide pattern with the smallest prediction error (section 3).
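The thresholding step described above can be sketched in code. Because equation (1) is not reproduced in this text, the sketch assumes the standard tangential curvature expression, kt = (zxx zy² − 2 zxy zx zy + zyy zx²) / (p √q) with p = zx² + zy² and q = p + 1, which is positive for concave upward terrain; the form actually used by the authors may differ. The function names, the central-difference scheme, and the synthetic valley are also assumptions made for illustration.

```python
import math

def tangential_curvature(z, h):
    """Curvature kt on interior pixels of elevation grid z with cell size h.

    Assumed form: the standard tangential curvature; border pixels and
    pixels with zero slope (where the expression is undefined) are left at 0.0.
    """
    rows, cols = len(z), len(z[0])
    kt = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Central finite differences for first and second derivatives.
            zx = (z[i][j + 1] - z[i][j - 1]) / (2 * h)
            zy = (z[i + 1][j] - z[i - 1][j]) / (2 * h)
            zxx = (z[i][j + 1] - 2 * z[i][j] + z[i][j - 1]) / h ** 2
            zyy = (z[i + 1][j] - 2 * z[i][j] + z[i - 1][j]) / h ** 2
            zxy = (z[i + 1][j + 1] - z[i + 1][j - 1]
                   - z[i - 1][j + 1] + z[i - 1][j - 1]) / (4 * h ** 2)
            p = zx * zx + zy * zy
            if p == 0.0:
                continue
            q = p + 1.0
            kt[i][j] = (zxx * zy * zy - 2 * zxy * zx * zy + zyy * zx * zx) / (p * math.sqrt(q))
    return kt

def channel_mask(kt, k0):
    """Pixels with kt >= k0 are flagged as channelized valley pixels."""
    return [[value >= k0 for value in row] for row in kt]

# Synthetic U-shaped valley draining along the y direction (assumed surface).
z = [[0.01 * (j - 2) ** 2 + 0.1 * i for j in range(5)] for i in range(5)]
kt = tangential_curvature(z, h=1.0)
mask = channel_mask(kt, k0=0.01)
print(round(kt[2][2], 4), mask[2][2])
```

In a calibration loop, k0 would be varied together with the other two parameters, and the mask recomputed each time before deriving the network-dependent variables.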
2.3 Jackknife Test and Calibration
 The jackknife test for the environmental variables provides alternate estimates of the landslide patterns as a function of the variables considered. This allows one to identify the most important variables in the model and to calibrate the three parameters of the model. The jackknife test considers all 17 environmental variables (Text S1.1 in the supporting information) and the historical landslide pattern in 2010 (section 2.1). We run MAXENT and the jackknife test for different sets of input parameters: the curvature threshold for network extraction (section 2.2), the global regularization parameter of MAXENT (section 2.4), and the threshold on the landslide susceptibility to define the landslide area (section 1.4). The calibration on the observed landslide pattern (section 2.1) allows the selection of the parameter value combinations and the most important environmental variables necessary to reproduce the observed landslides. Thus, this set of parameters and variables is likely the optimal set to reproduce future landslide patterns as a function of climate change. The optimal set of parameters is defined as the one that simultaneously minimizes the error in reproducing the observed landslides and maximizes the “area under the curve” (AUC) statistics [Zweig and Campbell, 1993].
 The AUC is evaluated considering the receiver operating characteristic (ROC) curve. The ROC is a graphical plot of the sensitivity, or true positives (i.e., the percentage of predicted landslide occurrences that match the observed ones), versus the complement of the specificity, or false positives (i.e., the percentage of predicted landslide occurrences which do not match any observation). This is the typical evaluation of binary classifier systems, performed by varying the discrimination threshold [Zweig and Campbell, 1993] which determines the percentage of true and false positives. Therefore, the sensitivity evaluates the absence of “omission” errors, while 1-specificity evaluates “commission” errors. The AUC is the probability that a random landslide occurrence site has a higher predicted value in the model than a random site where no landslide occurs. Thus, the higher the AUC, the better the prediction. In the jackknife test, each variable is excluded in turn from the MAXENT run, and a model is created with the remaining variables. Then, a model is created using each variable in isolation. In addition, a run is created using all the available variables. When only one variable is used in the prediction, the AUC measures the absolute importance of the variable in predicting landslide patterns. The difference between the AUC with all variables and the AUC for the single-variable prediction is a proxy of the sum of the interactions between the variable considered in isolation and all the others. Thus, the jackknife can also be used as a global sensitivity method for the model [Saltelli et al., 2004]. After the calibration on the historical landslide pattern (for which the error is the minimum), we retain the variables for which the jackknife test shows an AUC greater than 0.5 (the value expected for a random prediction), a standard threshold in establishing the importance of the environmental variables considered.
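The AUC and the jackknife bookkeeping described above can be sketched as follows. The AUC is computed directly from its probabilistic definition (pairwise comparison of occurrence and non-occurrence scores), and the jackknife loop takes an arbitrary scoring function; the `toy_score` function, its additive contributions, and the variable names are hypothetical stand-ins for a full MAXENT run and are not results from this study.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence site scores higher than a random
    background site (ties count as one half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def jackknife(variables, score):
    """For each variable, score the model with the variable alone, without it,
    and with all variables; score(subset) is assumed to return an AUC."""
    full = score(tuple(variables))
    report = {}
    for v in variables:
        without = tuple(u for u in variables if u != v)
        only = score((v,))
        report[v] = {
            "only": only,
            "without": score(without),
            "full": full,
            "interaction_proxy": full - only,
        }
    return report

# Direct AUC example with assumed susceptibility scores.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]))

# Hypothetical stand-in for "train MAXENT on this subset and return the AUC".
contribution = {"rainfall": 0.30, "slope": 0.10, "lithology": 0.05}

def toy_score(subset):
    return 0.5 + sum(contribution[v] for v in subset)

report = jackknife(list(contribution), toy_score)
retained = [v for v, r in report.items() if r["only"] > 0.5]
print(report["rainfall"], retained)
```

In practice each call to the scoring function would be a separate model fit, so the jackknife costs roughly twice as many runs as there are variables, plus one run with the full set.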
 The error is assessed by comparing the predicted and observed probability distributions of the landslide susceptibility conditional to the environmental variables. This estimation of the error uses the landslide susceptibility everywhere in the landscape and not only in the area of the observed landslides. The total error is defined as the sum of the errors for each distribution, where each error is the difference between the predicted and the observed distribution. Hence, there are multiple observed distributions to be fitted simultaneously because there are multiple environmental variables. The MAXENT set of input factors (parameters and variables) with the minimum total error in the prediction of landslide patterns is selected. A set of input factors that fits the observed landslide pattern distributions may not exist if the model does not consider all the necessary environmental variables or if the algorithm contains analytical and/or numerical mistakes. The simultaneous fit of the diverse distributions related to each environmental variable is a very stringent test of the predictive ability of a model [Muneepeerakul et al., 2008], especially for a model with only three parameters as in this case.
 The error as defined above is observed to be proportional to 1−Np/Nh, where Np is the predicted number of landslide pixels and Nh is the total number of historical landslide pixels. The calculation of 1−Np/Nh is conditional to the landslide occurrences. Thus, this calculation is conditional to the delineation of landslides defined by the neighboring criteria (sections 1.4 and 2.5). Therefore, the comparison of the probability distributions of landslide susceptibility unconditional to the environmental variables is the best way to assess the error, since it explores both the intensity and the distribution of landslide susceptibility and avoids the dependence on the delineation criteria.
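The two error measures discussed in this section can be compared with a short sketch. The text defines the total error as a sum of differences between predicted and observed distributions without naming a specific distance, so the sum of absolute differences between binned probabilities used here is an assumption, as are the function names and the toy histograms.

```python
def distribution_error(predicted, observed):
    """Assumed distance: sum of absolute differences between two binned
    probability distributions defined on the same bins."""
    return sum(abs(p - o) for p, o in zip(predicted, observed))

def total_error(predicted_by_var, observed_by_var):
    """Sum of the per-variable distribution errors."""
    return sum(
        distribution_error(predicted_by_var[v], observed_by_var[v])
        for v in observed_by_var
    )

def pixel_count_error(n_predicted, n_historical):
    """Proxy error 1 - Np/Nh based on landslide pixel counts."""
    return 1.0 - n_predicted / n_historical

# Toy binned susceptibility distributions for two variables (assumed values).
predicted = {"rainfall": [0.2, 0.5, 0.3], "slope": [0.6, 0.4]}
observed = {"rainfall": [0.1, 0.6, 0.3], "slope": [0.5, 0.5]}

print(total_error(predicted, observed))
print(pixel_count_error(n_predicted=900, n_historical=1000))
```

The first measure depends only on the susceptibility values, whereas the second changes whenever the delineation threshold or adjacency rule changes the pixel count.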
2.4 Maximum Entropy Model for Landslide Susceptibility Prediction
MAXENT [Phillips et al., 2006; Phillips, 2008; Phillips and Miroslav, 2008; Phillips et al., 2009; Elith et al., 2010] is a machine-learning model that estimates patterns and/or processes while incorporating the minimum amount of information. Here MAXENT uses occurrence-only data of landslides to predict the landslide susceptibility, L, based on the principle of maximum entropy. The entropy can be interpreted as the expected value of the information contained in data [Jaynes, 1982; Kleidon et al., 2010; Nieves et al., 2011] that is needed to describe a pattern or a process. Information is constituted by data; however, not all data are necessary to explain the pattern or process analyzed [Elith et al., 2010]. By thinking of the explanatory variables as drivers of the process considered, the goal is to detect the variables that best predict the observed pattern (i.e., with the “highest information” in information theory). By maximizing the entropy, MAXENT detects the variables with the highest value of information to predict the landslide patterns.
 The landslide pattern is schematized as the ensemble of point occurrences, where each point is placed in the center of mass of each landslide observed and delineated by Catani et al. Figure 3 shows the application of this assumption to one landslide from Figure 2. Thus, each landslide is characterized by only one point regardless of the landslide size and shape (Figure 2d). The point occurrence schematization is adopted in analogy to the application of MAXENT in modeling species distributions [Phillips et al., 2006; Phillips and Miroslav, 2008; Elith et al., 2010], in which a point occurrence is an occurrence of an individual or a group of the species considered independently of the abundance of the species.
 The estimate of landslide susceptibility (in a [0, 1] range) is conditional on the recorded occurrences or presences (defined as unitary binary variables, y=1) and on the environmental variables c. We define landslide susceptibility as the probability of occurrence of a landslide whose size is at least 500 m², that is, the resolution of the environmental variables. The higher the susceptibility, the higher the probability of occurrence of landslides. Information on true nonoccurrence events (“absences,” i.e., the sites where no landslides occurred) is not necessary in the prediction of landslide susceptibility. Thus, we ignore sites that are potentially unstable because only landslides that have occurred (presences) are incorporated into the model. We performed 50 replicates of landslide patterns for each year in order to reduce the uncertainty of predictions. Below, we report analytical details about the estimation of landslide susceptibility in MAXENT.
MAXENT depends on both the occurrence sample and the background sample that are used in forming the estimate of the landslide susceptibility. Background points are defined as all locations (pixels) or a random sample of pixels within the landscape, over which MAXENT assesses the relationships between the variables and landslide susceptibility. We select 10,000 random background points within the landscape, which are sufficient for the inference of the probability of occurrence of landslides. The number of background points is consistently lower than the number of landslide points and is consistent with the number of background points in previous applications of MAXENT [Elith et al., 2010; Convertino et al., 2011]. Further studies will explore the accuracy in predictions as a function of the number of background points. For each MAXENT run, 75% of the point occurrences are randomly selected for model training and cross validation, and 25% of the data are set aside for model testing and independent validation when the test sample is chosen at random. The maximum entropy distribution of landslide susceptibility conditional to the environmental variables is shown in Dudik et al. to be a Gibbs distribution that minimizes the relative entropy in the variable space (i.e., the Kullback-Leibler divergence; Text S1.4 in the supporting information). Maximizing entropy in geographic space is equivalent to minimizing the relative entropy in variable space [Elith et al., 2010]. The relative entropy is defined as the distance between the predicted and observed probability distributions of the landslide susceptibility [Cover and Thomas, 1991]. This distribution is an exponential probability distribution function (pdf hereafter). In the variable space, the entropy assumes the form of
$$f_1(c) = f(c)\,e^{\eta(c)} \qquad (2)$$
where f(c) is the probability density of variables across the landscape and η(c)=α+ρh(c); α is a normalizing constant that ensures that f1(c) integrates to one (f1 is the pdf of the variable or covariate c over the landslide occurrences) and ρ is the constant multiplier of the MAXENT features h(c) [Phillips et al., 2006; Phillips and Miroslav, 2008; Elith et al., 2010] that are functions related to the environmental variables (Text S1.4 in the supporting information). In fact, MAXENT fits the model on features (h(c)) that are transformations of the variables in the variable space. Features allow potentially complex relationships between variables and landslide occurrences to be modeled. Features can be linear, quadratic, product, threshold, hinge, and categorical functions. All types of features are allowed in MAXENT. The feature functions are selected automatically as a function of the occurrence size (Text S1.4 in the supporting information). We refer to the supporting information for a broader explanation of the features in MAXENT. The objective of MAXENT is the calculation of expη(c), which estimates the ratio f1(c)/f(c) (referred to as “raw” output) between the conditional pdf's and the marginal pdf's of landslide susceptibility over the landscape [Elith et al., 2010].
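The feature classes listed above can be illustrated with a minimal sketch of how a single variable value is transformed into features and combined into the linear score η(c)=α+ρh(c). The knot positions, weights, and function names are hypothetical, and the exact conventions (for example, whether a threshold feature uses a strict inequality) are assumptions; the features actually used are described in Text S1.4.

```python
def linear(x):
    """Linear feature: the variable itself."""
    return x


def quadratic(x):
    """Quadratic feature: the square of the variable."""
    return x * x


def product(x1, x2):
    """Product feature: interaction between two variables."""
    return x1 * x2


def threshold(x, knot):
    """Threshold feature: 1 above the knot, 0 otherwise (assumed convention)."""
    return 1.0 if x > knot else 0.0


def hinge(x, knot, x_max):
    """Forward hinge feature: 0 up to the knot, then linear up to 1 at x_max."""
    if x <= knot:
        return 0.0
    return (x - knot) / (x_max - knot)


def linear_score(features, weights, alpha):
    """eta(c) = alpha + sum_j rho_j * h_j(c)."""
    return alpha + sum(w * h for w, h in zip(weights, features))


# Hypothetical example: a slope of 30 degrees, knot at 20, maximum at 60
slope = 30.0
h = [hinge(slope, 20.0, 60.0), threshold(slope, 20.0)]
print(linear_score(h, weights=[2.0, -0.5], alpha=0.1))
```

Categorical features (for example, land cover classes) would simply be indicator functions of each class and are omitted here for brevity.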
 Landslide susceptibility at the pixel scale in geographical space is defined according to Bayes' rule (where the two events are the susceptibility and the variables; see Elith et al. [2010]) as
$$P(y=1 \mid c) = \frac{f_1(c)\,P(y=1)}{f(c)} \qquad (3)$$
where P(y=1) is the marginal pdf of the occurrences. However, P(y=1) is not available for calculating the conditional probability of occurrence P(y=1|c). Coupled to this issue, exponential models can produce bad predictions when applied to new data, for instance, the extrapolation of susceptibility from the current scenario to new environments or new climate scenarios [Elith et al., 2010]. To avoid these problems, and to sidestep the nonidentifiability of the landslide occurrence (P(y=1) in equation (3)), MAXENT's logistic output (the log of the output η(c)=log(f1(c)/f(c)) is defined as “logit score”) transforms the model from an exponential family model (equation (2)) [Elith et al., 2010] to a logistic model [Phillips and Miroslav, 2008; Phillips et al., 2009]. Therefore, the probability of landslide occurrence conditional to the variables is
$$P(y=1 \mid c) = \frac{\tau\,e^{\eta(c)-r}}{1-\tau+\tau\,e^{\eta(c)-r}} \qquad (4)$$
where η(c)=log(f1(c)/f(c)) is the linear score from equation (2), r is the relative entropy of MAXENT's estimate, and τ is the intercept expressing the probability of occurrence at sites with “typical” conditions for the landslide (i.e., where η(c) = the average value of η(c) under f1). The probability of occurrence at sites with “typical” conditions for the landslides is a parameter τ. Knowledge of τ would solve the nonidentifiability of occurrence; however, in the absence of such knowledge, MAXENT arbitrarily sets τ to equal 0.5. The distance from f(c) is taken as the relative entropy of f1(c) with respect to f(c), known as the Kullback-Leibler divergence. This divergence is given by
where c is the variables vector for occurrence point i in the m pixels and for the j features. The feature multiplier for the feature hj is given by λj = RME·√(s²[hj]/m), where s²[hj] is the feature's variance over the m occurrence sites; α and ρ are the intercept and the weight of h(c), respectively, in the linear relationship that provides η(c). A regularization multiplier RME, which is contained in the expression of all the feature multipliers λj, controls the fitting of the model and the geographic range of the landslide occurrences. The geographic range of landslide occurrences is defined as their maximum extent in the geographical space considered. The global regularization multiplier can be thought of as the global Lagrangian multiplier for all the features and is employed as a strategy for finding the local maxima of the entropy [Jaynes, 1982; Kleidon et al., 2010; Nieves et al., 2011] subject to the observed landslides. Regularization is not specific to MAXENT; rather, it is a common modern approach to model selection [Phillips, 2008]. It can be thought of as a way of shrinking the coefficients (the ρs), i.e., “penalizing them,” to values that balance model fit and complexity (the first and second terms in equation (5), respectively), allowing both accurate prediction and generality. A highly complex model will have a high log of the probability of an observed outcome (log likelihood) [Elith et al., 2010] but may not generalize well. In this sense, MAXENT fits a penalized maximum likelihood model closely related to other penalties for complexity such as Akaike's Information Criterion [Elith et al., 2010]. The prediction error of MAXENT is a function of the regularization multiplier RME. MAXENT calculates the probability map of landslide occurrence (from 0 to 1) assuming a regularization parameter RME that minimizes the difference between the historical and predicted landslide patterns.
High/low RME values extend/shrink the geographic range of the predicted landslide patterns, respectively.
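The logistic transformation described in this section can be sketched as follows. The functional form is an assumption: it is the standard MAXENT logistic output reported by Elith et al. [2010], written with the symbols defined above (linear score η(c), relative entropy r, and τ=0.5 by default). The function name is hypothetical.

```python
import math


def logistic_output(eta, r, tau=0.5):
    """Logistic susceptibility from the linear score eta and relative entropy r.

    Assumed form: tau * exp(eta - r) / (1 - tau + tau * exp(eta - r)),
    evaluated in a way that avoids overflow for large |eta - r|.
    """
    x = eta - r
    if x >= 0:
        # Algebraically identical form that only evaluates exp of a non-positive number
        return tau / (tau + (1.0 - tau) * math.exp(-x))
    z = tau * math.exp(x)
    return z / (1.0 - tau + z)


# At a site with "typical" conditions (eta equal to r) the output equals tau
print(logistic_output(eta=2.0, r=2.0))
```

With this form, sites whose linear score exceeds the relative entropy receive a susceptibility above τ, and sites below it receive a susceptibility below τ, which is why τ acts as the intercept of the model.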
2.5 Landslide Size Distribution
 The size of landslides is calculated after the prediction of landslide susceptibility patterns by the MAXENT model (section 2.4). The pixels whose landslide susceptibility is equal to or higher than Lth, the threshold on the landslide susceptibility, are considered part of a landslide. The value of the landslide activation threshold is considered homogeneous throughout the basin. The size of a landslide is computed as the sum of adjacent pixels according to the von Neumann neighborhood criterion (Figure 3). The von Neumann neighborhood comprises the four cells orthogonally surrounding a central cell on a two-dimensional square lattice. Figure 3 shows schematically an example of the calculation of the landslide size.
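The thresholding and von Neumann clustering described above can be sketched with a breadth-first search over a susceptibility grid. The grid values and the function name are hypothetical; sizes are returned in pixels and would need to be multiplied by the pixel area to obtain an area.

```python
from collections import deque


def landslide_sizes(susceptibility, l_th=0.58):
    """Sizes (in pixels) of clusters of pixels with susceptibility >= l_th.

    Pixels are joined only through their four orthogonal neighbors
    (von Neumann neighborhood); diagonal contact does not merge clusters.
    """
    rows, cols = len(susceptibility), len(susceptibility[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or susceptibility[r0][c0] < l_th:
                continue
            # Start a new landslide and grow it through orthogonal neighbors
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            size = 0
            while queue:
                r, c = queue.popleft()
                size += 1
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and susceptibility[nr][nc] >= l_th):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            sizes.append(size)
    return sizes


# Hypothetical 3 x 4 susceptibility map
grid = [
    [0.9, 0.7, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.8],
    [0.1, 0.1, 0.6, 0.1],
]
print(landslide_sizes(grid))  # [3, 1, 1]
```

In the example, the pixel in the last row touches other active pixels only diagonally, so it forms a separate landslide of size one.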
 The probability density of landslide size can be universally described [Stanley, 1999] by the double-Pareto distribution introduced by Convertino et al. in the form:
where t is the truncation point (hard truncation) for which the transition in the scaling regime of the probability distribution is observed and m is the upper cutoff (Table 1). We refer to “hard truncation” when the pdf clearly exhibits two regimes (for s<t and s>t) in which two scaling exponents can be identified for at least 1 order of magnitude [Convertino et al., 2013]. Θ(x)=1 if x>0 and zero otherwise, where x is any independent function [Simini et al., 2009]. We introduce the function f(x) to give more generality to the cutoff function. f(x) is a function such that f(x)=1 if x≪1 and f(x)=0 if x≫1. Here f(x)=Θ(1−s/m). β and ε are the scaling exponents of the two regimes of the landslide size distribution. The double-Pareto pdf of the aggregate size is widely studied [Reed and Jorgensen, 2004] and has been investigated in the context of landslides by Stark and Hovius, Guzzetti et al., and Stark and Guzzetti. Here we propose a novel analytical formulation that is capable of fitting the observed and predicted landslide size distributions. The exceedance probability of the landslide size is given by integration of equation (6):
where , C0 and C1 are constants, and F is a homogeneity function that depends on a characteristic size (L∥ is the diameter of a landslide along its principal axis of inertia, and H is the Hurst exponent) [Convertino et al., 2013], and ε=Df/2 [Mandelbrot, 1982; Convertino et al., 2013]. Df is the fractal dimension of the landslide pattern that is on average the fractal dimension of each landslide because of the scale invariance of landslide patterns [Convertino et al., 2013]. The fractal dimension of each landslide is found from the scaling of the landslide perimeter and size as in Convertino et al. The scaling exponent is the Hurst exponent that is half of the fractal dimension [Rodriguez-Iturbe and Rinaldo, 1997]. The homogeneity function F considers possible finite size effects in the landslide size distribution. The power law distribution of landslide size is fitted by the Maximum Likelihood Estimation (MLE) method introduced by White et al. and Clauset et al. The MLE is explained in Text S1.5 (supporting information).
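As a hedged illustration of the fitting step, the sketch below computes an empirical exceedance probability and the standard continuous maximum likelihood estimate of a power law exponent above a lower bound, 1 + n / Σ ln(s_i/s_min). It is not the procedure of Text S1.5: it fits a single scaling regime with a pdf proportional to s^(−exponent), and the function names and sample sizes are hypothetical.

```python
import math


def exceedance(sizes):
    """Empirical P(S >= s) for each distinct size s, as sorted (s, probability) pairs."""
    ordered = sorted(sizes)
    n = len(ordered)
    pairs = []
    for i, s in enumerate(ordered):
        if i == 0 or s != ordered[i - 1]:
            # n - i values are greater than or equal to s
            pairs.append((s, (n - i) / n))
    return pairs


def powerlaw_mle(sizes, s_min):
    """Continuous MLE of the exponent of a pdf proportional to s**(-exponent), s >= s_min.

    Raises ZeroDivisionError if no size in the tail exceeds s_min.
    """
    tail = [s for s in sizes if s >= s_min]
    log_sum = sum(math.log(s / s_min) for s in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical landslide sizes in pixels
sizes = [1, 1, 2, 4]
print(exceedance(sizes))
print(powerlaw_mle(sizes, s_min=1))
```

For a double-Pareto distribution, one possible use is to apply the estimator separately below and above the truncation point t, restricting the sample to each regime.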
Table 1. Glossary of the Landslide Size Distribution Parameters (in White) (Section 2.5) and Parameters of the MAXENT Model to Calibrate (in Grey) (Section 2.4)
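Landslide size distribution parameters (section 2.5):
β: scaling exponent of the double-Pareto distribution for small landslides (s<t)
ε: scaling exponent of the double-Pareto distribution for large landslides (s≥t)
t: lower truncation point
m: upper truncation point
Ns: number of small landslides (for 0<s<t)
Nb: number of large landslides (for s≥t)
MAXENT model parameters to calibrate (section 2.4):
Lth: landslide susceptibility threshold (0.58)
RME: MAXENT regularization multiplier (1.5)
k0: curvature threshold for river-network extraction (0.003)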
2.6 Climate Change Scenarios
 The selection of rainfall scenarios is based on the intent to consider the most extreme and the average scenarios of global warming due to climate change. At the global scale, the most extreme scenario of climate change corresponds to the A2 scenario, and the average scenario corresponds to the A1B scenario (Figures S1 and S2). It is well established that temperature and rainfall are not strongly correlated and that rainfall is a much more nonlinear phenomenon than temperature. Because of this nonlinear behavior and the large uncertainty in climate change predictions, we consider different models of climate change separately and a weighted multimodel ensemble in order to provide average predictions (Text S1.2, Table S1, and Figure S3). The weighted and unweighted ensembles provide very similar predictions of rainfall change (Figure S2). Model weights are assigned by comparing the predictions with past observations of 87 rainfall gauges from 1993 to 2006 over the Arno basin.
 The predicted 12 h rainfall, R12, for the A1B and A2 scenarios at 25 km² resolution is taken from each of the 17 ENSEMBLES regional circulation models (RCMs) and for the mean of the multimodel ensemble [ENSEMBLES, 2011] in the Arno basin (Text S1.2 and Figure S3). This rainfall is hypothesized to be the most important variable for predicting landslide patterns. However, we also consider R48, the 48 h rainfall, to verify the importance of rainfall of different durations for the prediction of landslide patterns. The upscaling in time from R12 to R48 is described in section 2.6.1. Figure S3 shows the average monthly rainfall of the multimodel ensemble in space. The list of models is reported in Text S1.2 (supporting information).
 Figure S1 in the supporting information reports the daily rainfall for the Arno basin and worldwide for the A1B and A2 scenarios. The predicted daily rainfall for the A1B and A2 scenarios at 25 km resolution is taken as the mean of the models in ENSEMBLES [2011]. From 1961 to 2000, the A1B and the A2 scenarios coincide. For this reason, only the rainfall from 2000 to 2100 is reported for the A2 scenario. The increase in rainfall for the A2 scenario is approximately 0.2 mm/d higher than for the A1B scenario. Thus, at the scale of the Arno basin, the A1B and the A2 scenarios are very similar. Consequently, the A2 scenario is not expected to significantly affect the landslide patterns with respect to the A1B scenario. For the Arno basin, the rainfall is predicted to decrease from 2000 to 2100 with an average magnitude of 6 mm/d for the A1B scenario and of 4 mm/d for the A2 scenario. This is contrary to the increase in rainfall that is expected at the worldwide scale (Figure S1). From 2000 to 2100, the wettest year is 2020 for the A1B and A2 scenarios. The average value of the change in seasonal and annual rainfall is reported for the 17 RCMs (Figure S3). Rainfall is expected to decrease in winter and summer, remain similar to the current value in the spring, and increase in the autumn.
 We define ΔR12 as the difference between the 12 h rainfall of the multimodel ensemble and the rainfall of the reference period 1961–1990 in each pixel of the Arno basin. The rainfall of the reference period is provided by the same climate change models. The reference period 1961–1990 is commonly adopted in studies that evaluate climate change effects on geomorphological and ecological patterns. Moreover, the use of this reference period aids in understanding which years are going to be wetter and drier than the average climate. The calculation of ΔR12 is also performed for each of the 17 models of ENSEMBLES [2011]. ΔR12 is used to make a comparison among models of climate change in the Arno basin. Figure S2 shows the average monthly rainfall for the A1B climate change scenario, derived from the multimodel ensemble [ENSEMBLES, 2011] at 25 km resolution. The analysis of these maps alone does not reveal a clear trend of rainfall (wetter or drier) for the area of analysis. However, the maps of Figure S2 reveal a large variability in the rainfall patterns of northern Italy and of the Arno basin due to climate change.
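A minimal sketch of the two operations described in this section, the (optionally weighted) multimodel ensemble mean and the per-pixel anomaly ΔR12, is given below. The weights are taken as given inputs because the weighting scheme is described in Text S1.2 and not reproduced here; the function names and rainfall values are hypothetical.

```python
def ensemble_mean(model_fields, weights=None):
    """Weighted mean over models, pixel by pixel.

    model_fields: list of rainfall fields, one flat list of pixel values per model.
    weights: one non-negative weight per model; equal weights if omitted.
    """
    if weights is None:
        weights = [1.0] * len(model_fields)
    total = sum(weights)
    n_pixels = len(model_fields[0])
    return [
        sum(w * field[i] for w, field in zip(weights, model_fields)) / total
        for i in range(n_pixels)
    ]


def delta_r12(projected, reference):
    """Per-pixel difference between projected and reference-period 12 h rainfall."""
    return [p - r for p, r in zip(projected, reference)]


# Hypothetical 12 h rainfall (mm) from two models over two pixels
fields = [[10.0, 20.0], [30.0, 40.0]]
mean_field = ensemble_mean(fields, weights=[1.0, 3.0])
print(mean_field)                            # [25.0, 35.0]
print(delta_r12(mean_field, [20.0, 40.0]))   # [5.0, -5.0]
```

Positive values of the anomaly mark pixels that are wetter than in the reference period and negative values mark drier pixels.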
2.6.1 Spatiotemporal Scaling of Rainfall
 The scaling of rainfall in this study is composed of a spatial downscaling of the 12 h rainfall of the RCMs (Text S1.3) and a temporal upscaling to calculate the 48 h rainfall. A statistical model is established between the large-scale RCM variables (predictors) and the small-scale observed parameters of interest (predictands) using a historical common period [Maraun et al., 2010]. The predictands are taken from the 12 h rainfall with a return period of 10 years, collected from 87 rainfall gauges from 1993 to 2006 over the Arno basin. The space-time heterogeneities of this rainfall are hypothesized to trigger landslides in the Arno basin. The empirical relationship between predictors and predictands is used to correct the predicted rainfall from 2000 to 2100. The climatic correction between predictors and predictands is derived after spatially downscaling the predictors from 25 km to 500 m, which is the resolution of the rainfall data and the scale at which landslide susceptibility is calculated. The space-time downscaling is performed with both a linear regression and an analog model [Zorita and Storch, 1999; San-Martin et al., 2008; Maraun et al., 2010]. The analog model is defined as “weather-typing” because it statistically relates the observed rainfall to a weather classification scheme. The yearly rainfall patterns used in modeling the landslides in the future are the average of the fields obtained by the two methods. In the landslide predictions, we consider the average rainfall of the downscaling methods in order to potentially reduce the systematic error that can be intrinsically present in both methods. Both downscaling methods are performed using MeteoLab [Cofino et al., 2011]. The analytics of the linear regression and the analog models are explained in Text S1.3 (supporting information). The temporal upscaling from the predicted 12 to the 48 h rainfall is performed each year by considering the relationship between the observed 12 and 48 h rainfall. 
The empirical scaling relationship for the Arno River basin between the 48 and 12 h rainfall with a return time of 10 years is reported in Figure S4. This relationship is built on observations of rainfall performed with 87 rainfall gauges from 1993 to 2006 over the basin. We assume this scaling relationship to hold in the future. This is a simplistic approach; however, (i) it is very difficult to predict the variation of this relationship in the future and (ii) the uncertainty on data is very small, and therefore, the relationship is meaningful for rainfall of different durations in the Arno basin.
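The temporal upscaling and the averaging of the two downscaled fields can be sketched as follows. The functional form of the 48 h versus 12 h relationship is shown in Figure S4 and is not reproduced in the text, so the straight-line fit used here is purely an assumption for illustration, as are the function names and the gauge values.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def upscale_r12_to_r48(r12_field, slope, intercept):
    """Apply the fitted relationship to a projected 12 h rainfall field."""
    return [slope * r + intercept for r in r12_field]


def average_fields(field_a, field_b):
    """Pixel-wise average of two downscaled fields (e.g., regression and analog)."""
    return [(a + b) / 2.0 for a, b in zip(field_a, field_b)]


# Hypothetical gauge observations of 12 h and 48 h rainfall (mm)
r12_obs = [10.0, 20.0, 30.0]
r48_obs = [25.0, 45.0, 65.0]
slope, intercept = fit_line(r12_obs, r48_obs)
print(upscale_r12_to_r48([40.0], slope, intercept))
print(average_fields([12.0, 18.0], [14.0, 22.0]))
```

Holding the fitted coefficients fixed for all future years corresponds to the stationarity assumption stated above.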
2.6.2 MaxEnt Climate Change Projections
 In order to perform predictions of landslide patterns under future climate change scenarios, the values of the weights ρ of features and f(c) at year i (equation (2)) are used to compute the weights of features for the projected variables at year i+1. The landslide susceptibility at year i is calculated based on the probabilities of the predicted landslide occurrences conditional to the environmental variables at year i−1. Hence, we assume that the landslide activity of each year is only a function of the landslide activity in the previous year. The only variable that changes in time is the 12 and 48 h rainfall with a return time of 10 years as a function of the climate projections (section 2.6 and Text S1.2 in the supporting information). Thus, the rainfall is hypothesized as the controlling variable of landslide patterns. This is verified on data by running the MAXENT model with the jackknife test. The weights ρ are not probabilities and are not required to sum to one since they use the normalization constant computed for the environmental features rather than the normalization constant for the projected features related to rainfall projections (supporting information). Their relative magnitudes represent how much a given location is suitable to be an active landslide location over other locations. By default in MAXENT, two kinds of “clamping” are done during the projection [Elith et al., 2010]. Clamping is the process by which features are constrained to remain within the range of values in the observed training data. First, the environmental variables are clamped: If a variable in the projection has values that are larger than the maximum of the corresponding variable used during training, those values are reduced to the maximum, and similarly for values below the corresponding minimum.
Second, if a feature derived from the projection has a value larger than its maximum on the training data, it is reduced to the maximum, and similarly for values below the corresponding minimum. Thus, features are also clamped. This clamping process helps to alleviate problems that can arise from making predictions outside the range of data used in the training of the model [Elith et al., 2010].
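Clamping as described above amounts to clipping each projected value to the range observed during training. The sketch below applies this to one variable; the same operation would then be applied to each derived feature. The function names and rainfall values are hypothetical.

```python
def clamp(value, lower, upper):
    """Restrict a value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def clamp_projection(projected_values, training_values):
    """Clamp projected values of one variable to the range seen in training."""
    lower, upper = min(training_values), max(training_values)
    return [clamp(v, lower, upper) for v in projected_values]


# Hypothetical rainfall values (mm): training range is [10, 100]
training = [10.0, 40.0, 100.0]
projected = [5.0, 50.0, 120.0]
print(clamp_projection(projected, training))  # [10.0, 50.0, 100.0]
```

Values inside the training range are left unchanged, so clamping only affects locations where the projection would otherwise require extrapolation.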
3 Results and Discussion
 The jackknife test (section 2.3) finds the key variables for reproducing the observed landslide pattern (Figure 5). The most important variables are the variables for which the AUC is greater than 0.5. The AUC is calculated for the model using each variable in isolation. The six selected variables out of the 17 variables considered (Section S1.1) are the slope, elevation, hillslope-to-channel distance, 12 and 48 h rainfall, erodibility [Larsen et al., 2010], and land cover (Figure 4). Thus, the jackknife test largely reduces the number of environmental variables with respect to all the variables originally considered. The AUC is the highest for rainfall (∼0.77), supporting the hypothesis that rainfall is the most important driver of the landslide pattern for the Arno basin. For the land cover and for the hillslope-to-channel distance, the AUC is very close to 0.54, which shows the low/medium importance of these variables in predicting landslide patterns. The intermediate importance of the land cover and of the hillslope-to-channel distance partially supports our assumption of considering these variables invariant despite the variation of rainfall due to climate change in the modeled period. The hillslope-to-channel distance is the only network-dependent geomorphic variable that is necessary to describe the landslide pattern. In contrast to Catani et al., we account for network structure in our predictions of landslide patterns; thus, we consider network-dependent variables. We believe that the network has an influence on landslide patterns [Convertino et al., 2013]; thus, the hillslope-to-channel distance, which is inversely proportional to the drainage density, is a necessary variable to be considered in this type of modeling. The optimal MAXENT regularization parameter, RME, is estimated to be 1.5 when calibrating the model on the historical landslide pattern of 2010 (Figure S5).
This value of the regularization parameter, together with the threshold on the network extraction, k0 = 0.003 m⁻¹, and the threshold on the landslide susceptibility, 0.58±0.02, provides a prediction error that is about 14%. The AUC for the model with the optimal values of parameters and the most important variables is 0.92 (Figure 5). The AUC should not be confused with the prediction error. In fact, the AUC is a statistical test on the convergence of the model rather than a test on the predictive accuracy of a model [Convertino et al., 2011]. In Figure 5, the AUC of a randomly selected testing sample of landslides (50% of the observed landslides) is 0.87, which is similar to the AUC (0.92) considering all the observed landslides. This shows the robustness of MAXENT with respect to the number of occurrences. The AUC for landslides as randomly generated point occurrences is 0.5, which rejects the null hypothesis of randomly distributed landslides. This result is expected because landslide patterns are the manifestation of nonrandom processes in river basins [Rodriguez-Iturbe and Rinaldo, 1997]. The aforementioned combination of values for the three parameters of the integrated modeling gives the lowest prediction error. These values and variables are used in predicting future landslide patterns as a function of climate change. Figure S5 (supporting information) shows the trend of the error versus the global regularization parameter RME. For RME<1.0, MAXENT produces a more localized landslide distribution that is a closer fit to the observed landslide distribution, but it results in overfitting. This means that the fitting is too close to the training data, so that the model does not generalize well to independent test data. The overfitting produces a very high landslide susceptibility around the recorded occurrences and a severe underestimation of the landslide susceptibility in other areas of the basin.
For small values of RME, there is an overall underprediction of the landslide geographic range. A larger regularization parameter gives a more distributed, less localized prediction of the landslide susceptibility. This does not necessarily imply that larger landslides will occur because it depends on the value of the landslide susceptibility. For RME>2, MAXENT produces a very widespread landslide distribution. Hence, this causes an overprediction of the landslide range with an overall underfitting of the susceptibility over the observed landslides. The curves of the error versus the threshold on landslide susceptibility, Lth, and versus the threshold on the network extraction, k0, show the same pattern as the curve of the error versus RME.
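For reference, the AUC used in the jackknife test can be computed from model scores as the probability that a randomly chosen occurrence pixel receives a higher score than a randomly chosen background pixel, with ties counted as one half. The sketch below is a direct pairwise implementation under that definition; the scores and the function name are hypothetical, and in the jackknife test the scores would come from a model fitted with a single variable.

```python
def auc(presence_scores, background_scores):
    """AUC as the fraction of (presence, background) pairs ranked correctly.

    Ties contribute one half. The pairwise loop is quadratic in the sample
    sizes, which is acceptable for a sketch but slow for very large samples.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical susceptibility scores at landslide and background pixels
print(auc([0.9, 0.8, 0.4], [0.5, 0.3]))
```

A value of 0.5 corresponds to scores that do not discriminate occurrences from background, which is the reference used above to screen variables.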
 The conditional probabilities of the landslide susceptibility as a function of the most important environmental variables screened by the jackknife test are reported in Figure 6. These plots are called “response curves” using MAXENT terminology [Elith et al., 2010; Convertino et al., 2011]. P(L|c) is the probability calculated by equation (4) everywhere in the landscape, where c are the variables. These variables are the ones for which the AUC is higher than 0.5 (Figure 5). The logistic prediction of MAXENT changes as each environmental variable is varied, keeping all other environmental variables at their average sample value. In other words, the represented response curves show the marginal effect of changing exactly one variable, whereas the model may take advantage of sets of variables changing together. Thus, the response curves identify the classes or ranges of the environmental variables for which a landslide is more prone to be activated. The response curves reflect the high heterogeneity of the environmental variables considered. From the plots of Figure 6, it is possible to observe that the probability of landslide occurrence is higher for medium-high elevation and slope, small-medium hillslope-to-channel distance, medium erodibility, grasslands/low-vegetation and open lands with absent or spotted vegetation, and low and high rainfall of 12 and 48 h with a return time of 10 years, respectively. It is interesting to note how the landslide susceptibility is higher for high values of the erodibility that correspond to a high susceptibility of soil erosion. This explains the fact that in the Arno basin, more than 70% of the landslides are rotational [Catani et al., 2005]. Figure 6c shows that many landslides happen in highly channelized regions of the basin that occur for a low value of the hillslope-to-channel distance d (where d is the complement of the drainage density). 
In fact, many landslides occur in concave upward valleys of subbasins near the main streams. These landslides are often the result of river cutting phenomena causing hillslope instability [Catani et al., 2005].
 The landslide pattern predicted by MAXENT and the observed landslide pattern for the Arno basin are compared (Figure 7). The comparison of these patterns is also performed with respect to the pattern predicted by the Artificial Neural Network (ANN) model of Catani et al. [2005], from which the landslide occurrences used in this study are taken. Figure 7 reports the historical landslide pattern, the landslide pattern predicted by the ANN, and the landslide pattern predicted by MAXENT for the year 2010. In Figures 2 and 7, the red pixels represent the sites whose landslide susceptibility is higher than 0.58, which is the calibrated Lth for the Arno basin. This threshold results from the calibration of the model and is used in the prediction of landslides in time. The ANN in Catani et al. [2005] estimates a categorical field of landslide susceptibility composed of five classes according to the landslide probability of occurrence. In Figure 7 we consider three out of the five classes defined in Catani et al. [2005]. Specifically, we consider the classes for which the susceptibility has a return time of 1, 10, and 100 years according to Catani et al. [2005]. These classes better represent the observed landslide pattern and landslide size distribution. From Figure 7b, it appears that the ANN underestimates the size of landslides, and the range of landslide occurrence is poorly represented with respect to the historical landslide pattern. The MAXENT prediction of the size and spatial distribution of landslides is a much closer estimate of the observed landslides than the ANN estimate. The similarity between the historical and the ANN patterns and between the historical and the MAXENT patterns is higher than 60% and 80%, respectively. The similarity index can be defined as the complement of the error defined in section 2.3 multiplied by 100. The smaller the value of the error, the higher the predictability of the model.
The similarity is calculated considering the predicted number of landslide pixels over the total number of historical landslide pixels of the historical, ANN, and MAXENT landslide patterns. Because we successfully reproduce the observed landslides, we believe that the model reproduces both “stochastic landslides” and “steady landslides” as defined in Booth et al. Stochastic landslides are typically small, and they are more difficult to detect because they are considered as random events. On the contrary, steady landslides are medium-/large-size landslides for which the entropy between the observed and predicted landslide susceptibility is higher than for small landslides. Thus, large landslides are better predicted than small landslides.
 Figure 8a shows the exceedance probability of the landslide size for the historical landslides, the landslides predicted by the Artificial Neural Network of Catani et al. [2005], and the landslides predicted by MAXENT. The size of all predicted landslides is calculated on the patterns of Figure 7 with the method explained in section 2.5 and shown in Figure 3. The exponents ε and β of the double-Pareto distribution of the landslide size determine the probability to observe large and small landslides, respectively (equation (7) in section 2.5). Double-Pareto distributions for landslides are widely reported in the literature [Guzzetti et al., 2002; Stark and Guzzetti, 2009; Convertino et al., 2013], although there is a debate about the origin of such a distribution. The best set of exponents of the double-Pareto distribution that fits the observed distributions is estimated by a Kolmogorov-Smirnov (KS) test (supporting information). The parameters of the landslide size distribution (Table 1) for the historical, ANN, and predicted patterns are reported in Table 2. The exceedance probability of the landslide size is also calculated for the landslides predicted up to 2100 as a function of the varying rainfall (Figure 8b). Potential patterns of future landslides are shown in Figure 9.
Table 2. Parameters of the Landslide Size Distribution (Equation (7)) for the Historical, ANN, and Climate Model Average Patterns^a
^a t and m are the truncation points, while β and ε are the power law exponents of the predicted landslide size distribution. The number of small and large landslides is Ns and Nb, respectively, and the total number of landslides is Ntot.
 The pdf of the future landslide susceptibility L is represented in Figure S6 for the years 2020, 2040, 2060, and 2100 for the whole Arno basin. The pdf in 2020 is very similar to the pdf in 2010. The pdf is computed considering all the susceptibility values predicted by MAXENT throughout the basin without applying any threshold of activation for identifying the landslide sites. The landslide susceptibility L shows a remarkable trimodal distribution. The sites in the basin that contribute to the first mode of L are the sites very close to the river network. The pdfs show that the probability of observing large values of landslide susceptibility is larger than the probability of observing small values. In very dry years (e.g., in 2100; Figure S6d), landslide susceptibility assumes a unimodal distribution (above Lth) for which only small/medium-size landslides are observed (Figure 9d). After setting the calibrated threshold Lth = 0.58, the pdf(L) is not bimodal; however, even without the bimodality in the landslide susceptibility, we observe a double-Pareto distribution of the landslide size. We emphasize that the two pdfs, of the landslide susceptibility and of the landslide size [Guzzetti et al., 2002; Catani et al., 2005; Stark and Guzzetti, 2009], are distributions of two different random variables. The landslide susceptibility is the likelihood that each pixel is a landslide, while the landslide size is a quantity defined at larger scales that describes probabilistically the extent of landslides. The size of landslides is defined after applying the threshold Lth and the von Neumann criterion (section 2.4). The two random variables are certainly interrelated, but their relationship needs to be explored further.
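The step from a susceptibility map to landslide sizes can be sketched as follows. The Python example assumes that the von Neumann criterion of section 2.4 means joining pixels that share an edge (north, south, east, west neighbors); that reading, the function name, and the toy grid are assumptions made for illustration, not the implementation used in this study.

```python
from collections import deque


def landslide_sizes(susceptibility, threshold=0.58):
    """Sizes (in pixels) of 4-connected clusters with susceptibility > threshold."""
    rows, cols = len(susceptibility), len(susceptibility[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or susceptibility[r][c] <= threshold:
                continue
            # Breadth-first flood fill over the von Neumann (N, S, E, W) neighbors.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                i, j = queue.popleft()
                size += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj]
                            and susceptibility[ni][nj] > threshold):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            sizes.append(size)
    return sizes


# Hypothetical 4 x 5 susceptibility grid, for illustration only.
grid = [
    [0.10, 0.70, 0.80, 0.20, 0.10],
    [0.20, 0.60, 0.30, 0.20, 0.90],
    [0.10, 0.20, 0.10, 0.65, 0.30],
    [0.70, 0.10, 0.10, 0.20, 0.10],
]
sizes = landslide_sizes(grid)
```

In this toy grid the pixels at (1, 4) and (2, 3) touch only diagonally, so under the assumed 4-neighbor rule they count as two separate landslides of one pixel each.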
 The exponents of the exceedance probability of the landslide size are observed to increase in dry years (Figure 8b and Table 2). Therefore, the probability of large landslides is smaller in dry years with respect to the reference period 1961–1990 that affected the historical landslides. Although β increases in dry years, the probability of small landslides does not increase considerably with respect to wet years. This is because in wet years, the majority of the susceptible landslide areas are activated, and in the subsequent years, the landslide activity is smaller. Generally, after widespread landslide phenomena, the entropy of the landscape is smaller due to the higher stability of potential landslide areas along the hillslopes. The wet-dry year cycles can be thought of as cycles of instability-stability of the landscape. In 2020, the power law of landslide size shows a remarkable finite-size effect due to the high presence of very large landslides. During dry years, the number of small landslides increases and the landslide size distribution assumes the form of a hard-truncated power law. The rainfall of the multimodel ensemble of RCMs is found to outperform all single models in reproducing the historical landslide patterns in 2010. We ran MAXENT for all 17 RCMs (section 2.6); the standard deviations around the means of β and ε are 0.016 and 0.027, respectively. The low variability of RCMs' rainfall for the Arno basin can be associated with the low variability of the scaling exponents. Regardless of the source of low variability, we argue that multimodel combination is a pragmatic approach to reduce model uncertainties and, thus, to make climate-related predictions more reliable [Schaller et al., 2011]. Table S1 (supporting information) reports the parameters of the landslide size distribution for each climate model in 2010.
The average value of the exponents of the landslide size distribution reflects those found in other studies, for example, in Turcotte, Guzzetti et al., Malamud and Turcotte, and Brunetti et al.
 Future landslides in Figure 9 are composed of sites for which the landslide susceptibility is higher than 0.58, as for the predicted historical pattern in 2010. In 2020, the wettest year, a peak of landslide activity is predicted to occur very close to the mountain ridges near the valley where Florence is located. The macroarea of the basin where Florence is located (delimited as in Catani et al.) is characterized by high elevation and slope, low erodibility, a medium value of the hillslope-to-channel distance, and forest vegetation. Thus, very wet years seem to activate sites whose intrinsic landslide instability is potentially low considering their biogeomorphological features. Hence, large changes in biodiversity may occur due to peaks in landslide activity; it is therefore important to explore the linkage between geomorphological and ecological processes in river basins [Rodriguez-Iturbe et al., 2009], and MAXENT offers this possibility. The landslide pattern in 2040 resembles the observed pattern of landslides (2010), although the climatic regime (wet) is predicted to be opposite of the current climatic regime (dry). This may explain the nonlinearity of landslide phenomena that most likely occur due to the relaxation of the basin after large landslide events in wet years (e.g., in 2020). The pattern in 2100 presents very small landslides scattered throughout the basin. The average landslide area, considering all landslides in the Arno basin, is 9.81%, 41%, and 3.6% of the Arno basin area in 2010, 2020, and 2100, respectively. These percentages correspond to total landslide areas of 8.04×10², 3.74×10³, and 3.28×10² km².
 The time series of the 12 h rainfall variation, ΔR12, with respect to the reference period 1961–1990 for the whole modeled period (2000–2100), is shown in Figure 10a. Wet and dry years are defined by a positive and negative ΔR12, respectively. The power law exponents ε and β decrease with an increase in rainfall (Figure 10b). It can be argued that the probability of large landslides gets higher as the climate gets wetter, and the probability of small landslides gets lower, as shown in Figure 8 for selected years. In fact, the trend of β and ε (represented by the first derivative shown at the bottom of Figure 10) as a function of rainfall is generally the same. The hard truncation points t and m, which determine the scaling regions of β and ε of the landslide size distribution (Figure 8), vary correspondingly with the wet-dry cycles. In general, t is higher for drier years because the number of small landslides is higher than for wetter years. Figure S7a shows the relationship between the exponent of the landslide size distribution of medium-large landslides (for s>t) and the variation of the rainfall R12 for the period 2000–2100 with respect to the reference period 1961–1990. This scaling law holds for both the dry and wet regimes. Figure S7b shows the relationship β∼ε^ψ between the exponents of the landslide size distribution for small and large landslides, respectively. The exponents γ and ψ of the aforementioned relationships are certainly related to the local climate. However, the scaling relationships in Figures S7a and S7b may hold for any rainfall-dominated landslides in basins located in different ecoregions. In general, the determination of one of the two exponents of P(S≥s) does not allow the direct calculation of the other exponent because γ and ψ may not be scaling exponents that are invariant to climate and geology. Therefore, further research would be useful to investigate the variability of these relationships for basins in different geographical areas.
Figure S7 also shows the invariance of the exponents γ and ψ to the climate scenario considered. The A2 scenario determines exponents that are very similar to the exponents derived for the A1B scenario. However, this seems related to the small variability of climate at the Arno scale.
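To make the idea of such scaling relationships concrete, the exponent of a power law of the form y ∼ x^k can be estimated by ordinary least squares in log-log space. The Python sketch below uses synthetic pairs generated from a known power law; the function name and all values are hypothetical and do not come from Figure S7, and the fitting procedure is a generic choice rather than the one used in this study.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of log(y) = log(a) + k * log(x); returns (k, a).

    All x and y values must be strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    k = sxy / sxx                         # slope in log-log space = scaling exponent
    a = math.exp(mean_y - k * mean_x)     # prefactor recovered from the intercept
    return k, a


# Synthetic pairs generated from y = 0.8 * x**1.5 (illustrative values only).
eps = [1.2, 1.4, 1.6, 1.8, 2.0]
beta = [0.8 * e ** 1.5 for e in eps]
k, a = fit_power_law(eps, beta)
```

Repeating such a fit separately for the A1B and A2 scenarios would be one simple way to check whether the estimated exponents differ between scenarios.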
 Further discussion about the model and results is in the supporting information.
 The prediction of landslides is fundamentally important for landscape and urban managers in order to plan the spatiotemporal evolution of socio-ecological systems (or “anthromes” [Ellis and Ramankutty, 2008]) in critically unstable ecosystems [Keefer and Larsen, 2007]. Recent catastrophes show that it is necessary to carefully plan the development of urban settlements, especially in consideration of the potential and abrupt changes of climate that may trigger the instability of landscapes [Elsner et al., 2009]. This planning, however, requires the detection of the driving factors of landslides and their relative importance.
 Here we propose the MAXENT model, whose aim is to predict patterns using the smallest amount of the most valuable information [Phillips et al., 2006; Phillips and Miroslav, 2008]. Each landslide can be represented as a point occurrence that coincides with its center of mass, and landslide susceptibility can be inferred by the maximum entropy principle conditional on environmental variables and a portion of observed landslide occurrences (background points). Information on sites where no landslides occurred and the accurate delineation of landslides are not required by the model. This reduces the computational burden of MAXENT with respect to other models and possibly the monitoring effort in the field. The driving environmental variables of landslide patterns are detected by the calibration, which reduces the complexity of the model and allows its transferability.
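The maximum entropy principle behind this approach can be illustrated with a deliberately small Python sketch. It fits a Gibbs distribution over pixels so that the model averages of the environmental variables match the averages observed at the landslide occurrence pixels, using plain gradient ascent. The pixel values, function names, step size, and number of iterations are hypothetical, and the sketch omits the regularization (the trade-off between model complexity and accuracy) and the feature transformations of the MAXENT software; it is not the implementation used in this study.

```python
import math


def gibbs(features, lam):
    """Normalized Gibbs distribution p(x) ~ exp(lam . f(x)) over all pixels."""
    scores = [sum(l * v for l, v in zip(lam, f)) for f in features]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def fit_maxent(features, occurrences, steps=20000, rate=1.0):
    """Unregularized maximum entropy fit by gradient ascent on the log-likelihood."""
    n_feat = len(features[0])
    # Mean of each environmental variable at the occurrence pixels.
    target = [sum(features[i][k] for i in occurrences) / len(occurrences)
              for k in range(n_feat)]
    lam = [0.0] * n_feat
    for _ in range(steps):
        probs = gibbs(features, lam)
        expected = [sum(p * f[k] for p, f in zip(probs, features))
                    for k in range(n_feat)]
        # Gradient: observed feature mean minus model feature mean.
        lam = [l + rate * (t - e) for l, t, e in zip(lam, target, expected)]
    return lam


# Hypothetical pixels with two variables scaled to [0, 1], e.g., rainfall and slope.
pixels = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.5], [0.5, 0.4],
          [0.6, 0.8], [0.7, 0.6], [0.9, 0.7], [0.8, 0.9]]
landslide_pixels = [3, 4, 5, 6]  # indices of pixels holding landslide centers

weights = fit_maxent(pixels, landslide_pixels)
susceptibility = gibbs(pixels, weights)
```

Only occurrence pixels enter the fit, which mirrors the point made above that information on sites without landslides is not required. The resulting `susceptibility` values sum to one over all pixels and would need rescaling before being compared with a threshold such as Lth.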
 As a case study, we consider the Arno basin. The hypothesis that landslides in the Arno are driven by rainfall is verified by reproducing the size distribution and the spatial pattern of the observed 27,500 historical landslides with an accuracy of 86%, where rainfall is the most important variable. This prediction is performed by inferring the relationships between environmental variables and landslide occurrences on only 10,000 background points, which are about 37% of the whole landslide data set. Thus, the predictive power of the model is quite high compared to the results of previous models for the same basin [see, e.g., Catani et al., 2005], considering that previous studies state that at least 20% of landslides in river basins are difficult to predict (particularly small landslides [Guzzetti et al., 2002; Catani et al., 2005; Stark and Guzzetti, 2009]). MAXENT predicts at least 6% of those landslides that are difficult to detect. By reproducing the observed landslides, we verify that MAXENT, a model developed in ecology to predict species distributions, can be used in geomorphology to predict landslides. Simulations of landslides can be run in parallel with species distribution simulations to detect the linkages between geomorphological and ecological patterns.
 Our modeling framework calculates both spatial patterns of landslides and landslide size distributions without making a priori assumptions on landslide drivers as in physically based models. We propose the landslide size distribution and its scaling exponents as potential indicators of change to inform about the landslide hazard of socio-ecological systems (SES) in river basins. Studies in other fields have emphasized the connection of size distributions of natural process variables to external changes [Kéfi et al., 2007; Convertino et al., 2013]. In our case study of rainfall-triggered landslides, we show how the scaling exponents of the landslide size distribution decrease in wet periods and increase in dry periods under climate change. Further studies are ongoing to connect these exponents with risk calculations of the landslide hazard.
 The research was performed at the Department of Agricultural and Biological Engineering, Institute of Food and Agricultural Sciences, and Florida Climate Institute, University of Florida, Gainesville, Florida, USA. The computational resources of the University of Florida High-Performance Computing Center (http://hpc.ufl.edu) are kindly acknowledged. The Autorità di Bacino del Fiume Arno, Italy, is kindly acknowledged for the data sets made available. M.C. acknowledges the funding of project “Decision and Risk Analysis Applications Environmental Assessment and Supply Chain Risks” for his research at the Risk and Decision Science Team, Environmental Laboratory, Engineering Research and Development Center, U.S. Army Corps of Engineers, in Concord, Massachusetts, USA. M.C. acknowledges N. Nelson (Ag. and Bio. Eng., University of Florida) for editing the last version of the manuscript. M.C. also acknowledges Z. Collier (Risk and Decision Science Team, ERDC, USACE) and D. Wang (Carnegie Mellon University) for a careful review of the final version of the manuscript. The authors also kindly acknowledge Paolo D'Odorico for providing comments on an earlier version of the paper. F. Morales (MSc student at the Civil and Environmental Engineering Department, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA) assisted in the research during his research internship at the Risk and Decision Science Team, Concord, Massachusetts. The authors gratefully acknowledge the editorial work and the comments on the manuscript of the Editor-in-Chief Alexander Densmore and of anonymous reviewers. Permission is granted by the USACE Chief of Engineers to publish this material. The views and opinions expressed in this paper are those of the individual authors and not those of the U.S. Army, or other sponsor organizations.