Threshold modeling of extreme spatial rainfall



[1] We propose an approach to spatial modeling of extreme rainfall, based on max-stable processes fitted using partial duration series and a censored threshold likelihood function. The resulting models are coherent with classical extreme-value theory and allow the consistent treatment of spatial dependence of rainfall using ideas related to those of classical geostatistics. We illustrate the ideas through data from the Val Ferret watershed in the Swiss Alps, based on daily cumulative rainfall totals recorded at 24 stations for four summers, augmented by a longer series from nearby. We compare the fits of different statistical models appropriate for spatial extremes, select that best fitting our data, and compare return level estimates for the total daily rainfall over the stations. The method can be used in other situations to produce simulations needed for hydrological models, and in particular, for the generation of spatially heterogeneous extreme rainfall fields over catchments.

1. Introduction

[2] The spatial modeling of rainfall is a long-standing topic in the environmental sciences and has grown in importance with the realization that a warming world is likely to bring more intense precipitation events, and thus higher risk to infrastructure and populations. The topic is currently a highly active research area, some recent articles being Yang et al. [2005], Cooley et al. [2007], Feng et al. [2007], Vrac and Naveau [2007], Zheng and Katz [2008], Van de Vyver [2012], Shang et al. [2011], and Villarini et al. [2011]. Wilks and Wilby [1999] and Chandler et al. [2006] review the earlier literature. The emphasis on rare events means that extreme-value statistics [Coles, 2001; Beirlant et al., 2004] are widely used to estimate return levels and associated quantities. Classical statistics of extremes [Katz et al., 2002] underpins standard approaches to the analysis of annual maximum or partial duration series, using block maxima or peaks over threshold methods, respectively, but tools for spatial analysis that extend the classical extreme-value models have only recently begun to be used. The simplest approach to spatial analysis is to fit extreme-value distributions separately to each of many time series, as for example in Feng et al. [2007], and then to ignore any spatial correlation between the individual fits, though Madsen et al. [2002] suggest a more sophisticated approach. In some cases, this type of model may be appropriate, but in others involving spatial quantities such as joint return levels or areal rainfall, spatial dependence must be taken into account, and models are then needed that respect appropriate dependence properties of extremal distributions.

[3] Max-stable processes [de Haan, 1984; de Haan and Ferreira, 2006; Davison et al., 2012] extend the generalized extreme-value (GEV) distribution, which is widely used to describe univariate maxima, to the spatial setting, and thus provide consistent multivariate distributions for maxima in arbitrary dimensions. Although proposed some time ago [Coles, 1993; Coles and Tawn, 1996; R. L. Smith, University of Surrey, unpublished manuscript, 1990], such processes have been little applied until very recently. Padoan et al. [2010] show how composite likelihood methods can be used to fit max-stable processes and illustrate this with U.S. rainfall data. Shang et al. [2011] use them to gauge the effect of El Niño–Southern Oscillation on winter rainfall in California, and Westra and Sisson [2011] use them to understand how extreme rainfall in Eastern Australia depends on explanatory variables such as the Southern Oscillation index and sea surface temperature. However, all three papers are limited to block maxima and fit only a single family of max-stable models, whereas in many applications it would be preferable to use threshold exceedances, which make more efficient use of the data, and to be able to compare several model classes. Indeed, Davison et al. [2012] found that other max-stable models fit extreme rainfall data appreciably better than the Smith (unpublished manuscript, 1990) model used by Shang et al. [2011] and Westra and Sisson [2011]. Huser and Davison [2013a] show that the Smith model also has theoretical drawbacks. Renard [2011] describes another approach to annual maximum rainfall analysis, based on Bayesian hierarchical models using a copula approach [Sang and Gelfand, 2010], but mentions that use of max-stable modeling of spatial dependence might constitute an improvement. Davison et al. [2012] found that max-stable models indeed provided better estimates of extreme spatial rainfall than did Bayesian and standard copula approaches. The copula approach of Salvadori and De Michele [2010] is intended for a given network of gauge stations rather than for a truly spatial analysis. Buishand et al. [2008] use a rather special max-stable model to simulate daily spatial rainfall in a homogeneous region of North Holland, but their approach would be difficult to generalize to more complex settings.

[4] The contributions of the present paper are to explain how max-stable models for extreme spatial rainfall may be fitted to several simultaneous partial duration series using thresholds and a censored likelihood approach, to fit a variety of models to daily rainfall data in a small spatial domain, and to extend the max-stable models themselves by fitting so-called inverted max-stable models, which allow more flexible forms of tail behavior that are coherent with recent developments in statistics of extremes. We illustrate the ideas using data from 24 rainfall time series over a small upland domain, supplemented by a longer series from a nearby site.

[5] Section 'Study Area and Available Data' describes briefly the study site and details how the data we analyze were collected. Section 'Extreme-Value Theory' presents the main results about extremes used in this paper. Inference tools, which are based on Gaussian models and composite likelihood, are explained in section 'Inference' and applied in section 'Modeling Extreme Rainfall in Val Ferret'. Functions for fitting our models were written in R [R Development Core Team, 2012].

2. Study Area and Available Data

[6] The data set used in this study stems from an experimental catchment (see Figure 1) located in the Val Ferret region in the Swiss Alps, a valley in the southernmost ridge that borders Italy. The study area covers a total surface of 20.4 km² with elevation ranging from 1773 m above mean sea level (amsl) at the outlet of the catchment to 3206 m amsl; its mean elevation is 2423 m amsl. It is characterized by moderate to steep slopes (mean slope: 31.6°, maximum: 88.9°). The watershed is mainly oriented southeast to northwest and is a subcatchment of the Dranse de Ferret, a tributary of the Rhone. The land use consists mostly of vegetation (mountain grassland 58% and shrubs 2%) and bare ground (bedrock outcrops 24.7% and rocks 12.7%). A small glacier and three small lakes feed the Dranse de Ferret throughout the year. During spring, snowmelt is the main contributor to the discharge, though extreme rainfall events and occasional snowfall are more important in early autumn and can lead to large rainfall-runoff peaks in the hydrograph.

Figure 1. The Val Ferret watershed, showing the sites of the meteorological stations, with contours showing their elevations above mean sea level in meters.

[7] The site was chosen because there is very little anthropogenic influence on the hydrological regime and micrometeorological processes, and for its representativeness of small alpine watersheds. Since 2008, it has been heavily monitored with gauging stations and a wireless network of small meteorological stations relying on Sensorscope technology [Ingelrest et al., 2010]. Cumulative precipitation is measured every minute with tipping bucket rain gauges (Davis Rain Collector II), along with air humidity and temperature, skin temperature, wind speed and direction, incoming solar radiation, soil moisture and suction. In such complex terrain, meteorological characteristics may vary greatly over small spatial scales, and this is usually not captured by remote sensing techniques. In order to measure this spatial variability, 10 Sensorscope stations were deployed in 2009, 15 in 2010, 26 in 2011, and 24 in 2012. There are three permanent stations, but the others are usually only deployed from May to October, due to the many avalanches in the area. Using these data, the impact of the spatial variability of the main hydrological forcings on hydrological models has been assessed [Simoni et al., 2011], and degree-day snowmelt models have been improved [Tobin et al., 2013].

[8] In this study, we restrict our analysis to a subset of 24 stations, of which 8 were deployed during the four field campaigns. Some stations were moved and thus are considered to be different for the different years. About 58% of the data in the 24 time series are missing, with 470 days of records for the longest time series and only 47 for the shortest, but with more than 100 days for 21 series. We could have excluded stations with very few data, but we kept them in order to improve the estimation of spatial association. The high proportion of missing data arises mainly because not all the stations were deployed each year. Moreover, due to the harsh conditions in this high altitude catchment, the remote location of the stations, and some wireless communication failures, data from some stations exhibit gaps of several days. However, the missing data are independent of the rainfall amounts and thus will not bias our analysis.

[9] With such a short period of records, estimates of extremal characteristics are very variable. We therefore added another station located outside the catchment. The Swiss Federal Office of Climatology and Meteorology, MétéoSuisse, deploys weather stations all over the country, one of which is situated at the Col du Grand St-Bernard (GSB), only 5 km away from most of the stations deployed in the Val Ferret. This station is located at 2472 m amsl, with similar topographic conditions to those of the catchment. More than 31 years of data (from 1982 to 2012) are recorded at this station with good quality sensors, and with few missing data. Inclusion of these data can be expected to improve our estimation of the marginal distribution of extreme events.

[10] To reduce the strong temporal dependence, the precipitation is cumulated on a daily basis centered at noon. The resulting daily cumulative rainfall time series for the 24 stations of the Val Ferret catchment, some of which are shown in Figure 2, display the expected strong dependence across series but only limited serial dependence. The most extreme daily value recorded in the catchment over these four summers is 58 mm, with 109 mm recorded at the GSB during the 31 summers (see Figure 3). Statistical models that are capable of consistent extrapolation beyond available data are needed to estimate probabilities for higher rainfall levels, and these are provided by extreme-value theory.

Figure 2. Daily cumulative rainfall totals for 575 days in summers 2009–2012, recorded by Sensorscope stations 1–4 in the Val Ferret region. Vertical dashed lines separate the 4 years. White spaces correspond to missing data.

Figure 3. Daily cumulative rainfall totals for 31 years in summers 1982–2012, recorded by MétéoSuisse at the Grand St-Bernard.

3. Extreme-Value Theory

3.1. Univariate Theory

[11] Extreme-value theory began with results of Fisher and Tippett [1928] on the limiting distributions of linearly rescaled maxima of a sample of independent random variables. If such a limiting distribution H exists and is nondegenerate, then it must be max-stable, i.e.,

$$H^n(\alpha_n y + \beta_n) = H(y) \tag{1}$$

for all $n \in \mathbb{N}$ and for some $\alpha_n > 0$ and $\beta_n$. In the univariate case, max-stable distributions are of the form [Coles, 2001, chapter 3]

$$H(y) = \exp\left[-\left\{1 + \xi\,\frac{y - \mu}{\sigma}\right\}_+^{-1/\xi}\right], \tag{2}$$

where $a_+ = \max(a, 0)$ for a real number a, with location parameter $\mu \in \mathbb{R}$, scale parameter $\sigma > 0$, and shape parameter $\xi \in \mathbb{R}$; the case $\xi = 0$ is interpreted as the limit for $\xi \to 0$. The GEV distribution H encompasses the Weibull ($\xi < 0$), Gumbel ($\xi = 0$), and Fréchet ($\xi > 0$) cases. The shape parameter determines the weight of the tail of H; in particular, H has a finite upper limit for $\xi < 0$. The distributions of other extreme order statistics and of threshold exceedances are closely related to this basic result on maxima.
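As a minimal sketch of the GEV distribution function (2), the following Python function treats the Gumbel case as the limit of small ξ and handles the points at which the term in braces is truncated to zero. The function names and the numerical tolerance are illustrative choices, not part of the original analysis.

```python
import math


def gev_cdf(y, mu, sigma, xi):
    """GEV distribution function H(y) with location mu, scale sigma, shape xi."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    s = (y - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel case, taken as the limit xi -> 0
        return math.exp(-math.exp(-s))
    a = 1.0 + xi * s
    if a <= 0:
        # a_+ = 0: below the lower endpoint (xi > 0) or above the upper one (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-a ** (-1.0 / xi))


def gev_upper_endpoint(mu, sigma, xi):
    """Finite upper limit of H when xi < 0, otherwise infinity."""
    return mu - sigma / xi if xi < 0 else math.inf
```

With mu = sigma = xi = 1 the function reduces to the unit Fréchet distribution exp(-1/y), which is the standardization used later for max-stable processes.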

[12] Partial duration series analysis was developed by hydrologists in the 1940s [see Langbein, 1949] and became increasingly popular in the 1970s [Todorovic and Rousselle, 1971; Todorovic and Zelenhasic, 1970]. Following theory developed by Pickands [1975], these threshold models were generalized by Davison and Smith [1990]. Under suitable conditions and for a sufficiently high threshold u, the upper tail distribution of a wide class of random variables Y can be well approximated by

$$\Pr(Y > y) \approx \zeta_u \left\{1 + \xi\,\frac{y - u}{\tau + \xi u}\right\}_+^{-1/\xi}, \tag{3}$$

where $y > u$ and $\tau + \xi u > 0$. Here $\zeta_u = \Pr(Y > u)$ is the probability that the threshold u is exceeded, and τ and ξ are scale and shape parameters, respectively, determining the distribution of exceedances, with ξ equal to the shape parameter of the limiting distribution of maxima (2). The parametrization of the generalized Pareto distribution (GPD), whose survivor function appears in the braces on the right of (3), is different from the usual one [Coles, 2001, chapter 4] and has the advantage that the parameters τ and ξ do not depend on the choice of threshold u.
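The threshold invariance of τ and ξ can be checked numerically. The sketch below assumes that the GPD scale at threshold u is τ + ξu, which is the standard threshold-invariant parametrization; the parameter values used in the usage note are hypothetical and are not estimates from the Val Ferret data.

```python
import math


def tail_prob(y, u, zeta_u, tau, xi):
    """Approximate Pr(Y > y) for y >= u, with GPD scale tau + xi * u (assumed form)."""
    sigma_u = tau + xi * u
    if sigma_u <= 0:
        raise ValueError("tau + xi * u must be positive")
    if abs(xi) < 1e-12:
        return zeta_u * math.exp(-(y - u) / tau)  # exponential limit, xi -> 0
    a = 1.0 + xi * (y - u) / sigma_u
    if a <= 0:
        return 0.0  # beyond the finite upper endpoint (only possible when xi < 0)
    return zeta_u * a ** (-1.0 / xi)


def return_level(p, u, zeta_u, tau, xi):
    """Level exceeded with probability p on a given day, for 0 < p <= zeta_u."""
    if not 0 < p <= zeta_u:
        raise ValueError("p must lie in (0, zeta_u]")
    if abs(xi) < 1e-12:
        return u + tau * math.log(zeta_u / p)
    return u + (tau + xi * u) / xi * ((zeta_u / p) ** xi - 1.0)
```

For example, with hypothetical values u = 11, zeta_u = 0.1, tau = 5, and xi = 0.1, raising the threshold to 17 and replacing zeta_u by tail_prob(17, 11, 0.1, 5, 0.1) leaves tail_prob(30, ...) unchanged, so the same τ and ξ describe the tail above either threshold.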

[13] Equation (3) provides a model for the extremes of independent stationary data. To account for dependence and nonstationarity, declustering and covariate regression are often used [Chavez-Demoulin and Davison, 2012], sometimes using nonparametric methods [Davison and Ramesh, 2000; Hall and Tajvidi, 2000; Ramesh and Davison, 2002; Chavez-Demoulin and Davison, 2005]. Examination of our data shows no evidence of temporal or spatial nonstationarity, but in order to avoid dealing with intraday effects we model daily cumulative rainfall.

[14] One way to investigate dependence in extremes of a stationary time series $\{Y_t\}$ is through the extremogram [Davis and Mikosch, 2010], various choices of which are possible. We employ the tail dependence coefficient [Ledford and Tawn, 1996],

$$\chi(h) = \lim_{u \to \infty} \Pr(Y_{t+h} > u \mid Y_t > u), \tag{4}$$

which can be estimated by considering the joint exceedances of $Y_t$ and $Y_{t+h}$ above some fixed finite u. Figure 4 shows this with u corresponding to the empirical 90% quantile of the daily rainfall data for a subset of the data. There is slight dependence for h = 1, but it vanishes as u increases. For simplicity we model daily rainfall fields as independent from day to day; this should have little impact on our conclusions.

Figure 4. Extremogram (4) for the daily cumulative rainfall time series at four locations computed with thresholds corresponding to the 90% quantile for each series. Horizontal dashed lines show the upper 0.975 confidence limit for independent data, obtained by random permutation of the data.
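The empirical version of the extremogram at a fixed threshold, and the permutation limit shown in Figure 4, might be computed along the following lines. This is a sketch only: missing days are coded as None, pairs with a missing value are skipped, and the number of permutations and the seed are arbitrary choices rather than those used for the figure.

```python
import math
import random


def empirical_chi(series, u, h):
    """Estimate Pr(Y_{t+h} > u | Y_t > u) from pairs in which both days are observed."""
    joint = cond = 0
    for t in range(len(series) - h):
        a, b = series[t], series[t + h]
        if a is None or b is None:
            continue  # skip pairs with a missing value
        if a > u:
            cond += 1
            if b > u:
                joint += 1
    return joint / cond if cond else float("nan")


def permutation_upper_limit(series, u, h, level=0.975, n_perm=500, seed=1):
    """Upper limit for the estimate under independence, by random permutation."""
    rng = random.Random(seed)
    values = list(series)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(values)  # destroys any serial dependence
        est = empirical_chi(values, u, h)
        if not math.isnan(est):
            stats.append(est)
    if not stats:
        return float("nan")
    stats.sort()
    return stats[min(len(stats) - 1, int(level * len(stats)))]
```

An estimate at lag h that lies below the permutation limit is compatible with independent daily extremes at that threshold.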

3.2. Spatial Extremes

[15] Ignoring the spatial nature of extreme events is inappropriate in situations involving estimation of quantities that depend on the multivariate distribution of the process, for example, joint return levels of rainfall at several locations or the discharge from the catchment [see Davison and Gholamrezaee, 2012]. Spatial modeling of extremes is needed for such purposes. Next we present the natural spatial extension of the univariate extreme-value distributions, namely, max-stable processes.

[16] Max-stable processes are spatial extensions of the max-stable distributions satisfying (1). By analogy with the univariate case, they arise as the only possible class of limits for rescaled componentwise maxima of spatial processes. Consider independent stochastic processes $Y_1(x), Y_2(x), \ldots$ defined for x lying within a spatial domain $\mathcal{X}$ and with continuous sample paths, and suppose that there exist rescaling functions $a_n(x) > 0$ and $b_n(x)$ such that the sequence of rescaled maxima

$$\frac{\max_{i=1,\ldots,n} Y_i(x) - b_n(x)}{a_n(x)}, \quad x \in \mathcal{X},$$

converges weakly to a process Z(x) having a nondegenerate distribution for each $x \in \mathcal{X}$. Then the only possibility is that the limiting process $Z(x)$ is max-stable [de Haan and Ferreira, 2006, chapter 9]: in analogy with (1), after a suitable linear rescaling, for any positive integer k, the pointwise maximum of k independent copies of $Z(x)$ has the same distribution as does $Z(x)$ itself. For each site x, the scalar Z(x) has a GEV distribution, and for any finite set of sites $\{x_1, \ldots, x_D\} \subset \mathcal{X}$, the corresponding variates $Z(x_1), \ldots, Z(x_D)$ have a multivariate extreme-value distribution [Tawn, 1988]. There is a close analogy here to a spatial Gaussian process, all of whose finite-dimensional margins are Gaussian.

[17] Just as it is convenient to standardize the marginal distributions of a Gaussian process, it is convenient to transform the max-stable process Z(x) to have a unit Fréchet marginal distribution, i.e., $\Pr\{Z(x) \le z\} = \exp(-1/z)$, for $z > 0$ and $x \in \mathcal{X}$. In this case the process $Z(x)$ is called simple max-stable, the renormalizing sequences are $a_n(x) = n$, $b_n(x) = 0$, and the joint distribution function of $Z(x_1), \ldots, Z(x_D)$ can be written as

$$\Pr\{Z(x_1) \le z_1, \ldots, Z(x_D) \le z_D\} = \exp\{-V(z_1, \ldots, z_D)\}, \quad z_1, \ldots, z_D > 0, \tag{5}$$

where the so-called exponent measure function V satisfies $V(tz_1, \ldots, tz_D) = t^{-1} V(z_1, \ldots, z_D)$ for any $t > 0$ and $V(\infty, \ldots, z_d, \ldots, \infty) = 1/z_d$ for each $d = 1, \ldots, D$.

[18] A key result stemming from the work of de Haan [1984] is that every simple max-stable process can be represented in the form

$$Z(x) = \max_{i \ge 1} \frac{W_i(x)}{P_i}, \quad x \in \mathcal{X}, \tag{6}$$

where $\{P_i\}$ are the points of a unit-rate Poisson process on the positive half-line and the $W_i(x)$ are independent replicates of a nonnegative random process W(x) that satisfies $E\{W(x)\} = 1$ for each $x \in \mathcal{X}$. Expression (6) can be interpreted in terms of a “rainfall-storms” model, where the $P_i^{-1}$ are the storm amplitudes, the $W_i(x)$ are their shapes, and Z(x) represents the effect of the largest storm observed at x. This interpretation of max-stable processes has affinities to the stochastic rainfall models of Rodriguez-Iturbe et al. [1987, 1988] and Cox and Isham [1988], and Huser and Davison [2013b] exploit this to construct a space-time model for extreme hourly rainfall. In terms of W(x), the exponent measure function in (5) may be written as

$$V(z_1, \ldots, z_D) = E\left[\max_{d=1,\ldots,D} \left\{\frac{W(x_d)}{z_d}\right\}\right], \tag{7}$$

but although this function can usually be computed for D = 2, it is only available for $D > 2$ in certain cases. We discuss the consequences of this in section 'Inference'.

[19] Max-stable processes are asymptotically dependent [Ledford and Tawn, 1996] when the limit

$$\lim_{z \to \infty} \Pr\{Z(x_2) > z \mid Z(x_1) > z\} = 2 - \theta(x_1, x_2) \tag{8}$$

is strictly positive for pairs of sites $x_1, x_2 \in \mathcal{X}$. The so-called extremal coefficient $\theta(x_1, x_2)$ lies in the interval [1, 2] and summarizes the asymptotic dependence between $Z(x_1)$ and $Z(x_2)$. If $\theta(x_1, x_2) = 2$, then the extremes at x1 and x2 are ultimately independent for very high z, whereas if $\theta(x_1, x_2) = 1$, they are completely dependent. It is straightforward to see that $\theta(x_1, x_2) = V(1, 1)$, where V is the exponent measure function given by (5) and (7) for the sites $x_1$ and $x_2$, i.e., in the case D = 2.

[20] The discussion earlier and the existing literature focus on maxima of spatial processes, but threshold modeling is preferable in applications for the reasons discussed in section 'Univariate Theory'. Huser and Davison [2013b] extend the threshold approach and use it to fit a model for extreme spatiotemporal rainfall, based on the bivariate threshold likelihood described by Coles [2001, section 8.3.1]. Let $(Y_1, Y_2)$ be a bivariate process whose large values are to be modeled. First, note that if Y1 and Y2 have marginal unit Fréchet distributions, then under the conditions needed for the joint distribution of their maxima to be max-stable, we have for sufficiently large z1 and z2 that [Coles, 2001, section 8.3.1]

$$\Pr(Y_1 \le z_1, Y_2 \le z_2) \approx \exp\{-V(z_1, z_2)\}. \tag{9}$$

[21] Expression (9) implies that we can use the joint distribution for maxima to approximate the joint upper tail of $(Y_1, Y_2)$, for sufficiently large values of these variables.

[22] In practice, the marginal distributions of Y1 and Y2 will not be unit Fréchet. However, the monotone increasing transformations $t_1(y) = -1/\log F_1(y)$ and $t_2(y) = -1/\log F_2(y)$, where $F_1$ and $F_2$ are fitted GPDs (3), are such that the bivariate random variable $\{t_1(Y_1), t_2(Y_2)\}$ has approximately unit Fréchet margins for $Y_1 > u_1$ and $Y_2 > u_2$, where u1 and u2 are high thresholds for Y1 and Y2. Then

$$\Pr(Y_1 \le y_1, Y_2 \le y_2) \approx \exp[-V\{t_1(y_1), t_2(y_2)\}] \tag{10}$$

for $y_1 > u_1$ and $y_2 > u_2$. In section 'Inference', we show how this may be used for inference on the model.

3.3. Asymptotic Independence Models

[23] Although max-stable processes arise as the natural extension of standard scalar and multivariate extreme-value models, they can be inappropriate for modeling real data. In the threshold approach, if the threshold is chosen too low, the dependence structure may not have converged to the max-stable limit. Moreover, if the true limiting distribution yields independent extremes, this may be impossible to verify on data, for which some dependence will always be present because the limit is never attained in practice. In such cases, it will be preferable to model threshold exceedances using a model in which the degree of dependence varies according to the severity of the extreme event. de Carvalho and Ramos [2012] review related models and techniques, and Wadsworth and Tawn [2012] describe an approach to constructing models that capture this phenomenon in spatial extremes, by inverting max-stable models. Another class of asymptotic independence models involves Gaussian copulas. All can be fitted using the methods for max-stable processes described in section 'Inference'. Wadsworth and Tawn [2012] also propose hybrid models, based on max-mixtures of max-stable and asymptotically independent models, that are asymptotically dependent but not max-stable and can smoothly approach max-stability in the extremes. Owing to our limited data we do not fit them in this paper.

[24] Wadsworth and Tawn [2012] show that if the process Z(x) is simple max-stable on the domain $\mathcal{X}$, then the inverted process

$$g\{Z(x)\} = -\frac{1}{\log[1 - \exp\{-1/Z(x)\}]}, \quad x \in \mathcal{X}, \tag{11}$$

has unit Fréchet margins and asymptotically independent extremes, except in the pathological case never met in practice where the extremes of Z are perfectly dependent. Inverted max-stable models for exceedances of $(Y_1, Y_2)$ over thresholds u1 and u2 are easily derived in terms of g, the transformations t1 and t2 applied in (10), and the exponent measure V of Z(x), yielding

$$\Pr(Y_1 > y_1, Y_2 > y_2) \approx \exp\left(-V[g\{t_1(y_1)\}, g\{t_2(y_2)\}]\right), \quad y_1 > u_1,\ y_2 > u_2. \tag{12}$$

[25] In max-stable models the strength of dependence between pairs of extremes is summarized by expression (8) whereas that in asymptotic independence models is summarized by the coefficient of tail dependence $\eta \in (0, 1]$, which appears through the expression [Ledford and Tawn, 1996]

$$\Pr\{Y(x_1) > z \mid Y(x_2) > z\} \sim \mathcal{L}(z)\, z^{1 - 1/\eta}, \quad z \to \infty, \tag{13}$$

for a process $Y(x)$ with unit Fréchet margins, where $\mathcal{L}$ is a slowly varying function, i.e., one satisfying $\mathcal{L}(tz)/\mathcal{L}(z) \to 1$ as $z \to \infty$, for any $t > 0$. The process $Y(x)$ is asymptotically independent if $\eta < 1$, since in that case the limit of (13) equals zero; interest then focuses on the rate of approach to zero, which is determined by η. Models derived by applying the inversion transformation (11) to a max-stable process with pairwise extremal coefficient $\theta(x_1, x_2)$ have $\eta = 1/\theta(x_1, x_2)$. Thus, a max-stable field in which $\theta(x_1, x_2) \approx 1$, so that extremes of $Z(x_1)$ and $Z(x_2)$ are highly dependent, will give a transformed field with unit Fréchet marginal distributions and with $\eta \approx 1$. The corresponding $g\{Z(x_1)\}$ and $g\{Z(x_2)\}$, though asymptotically independent, approach this limit only slowly. If on the other hand $\theta(x_1, x_2) \approx 2$, then $g\{Z(x_1)\}$ and $g\{Z(x_2)\}$ will approach asymptotic independence much more rapidly.

[26] Ledford and Tawn [1996] proposed a simple estimator of η, but with little data, as in our application, their estimator is too variable to help in distinguishing between asymptotic dependence and independence, so we must rely on models. In the next section we discuss max-stable models for spatial extremes, from which asymptotic independence models may be constructed through the inversion transformation (11), and describe how they may be fitted.

[27] Another class of asymptotically independent models, related to the classical theory of geostatistics and kriging [Diggle and Ribeiro, 2007], corresponds to fitting a Gaussian copula to threshold exceedances, or equivalently to fitting a Gaussian process to transformed margins. The standard bivariate normal distribution function $\Phi_2(\cdot, \cdot; \rho)$ with correlation ρ is used to model threshold exceedances through

$$\Pr(Y_1 \le y_1, Y_2 \le y_2) \approx \Phi_2\{s_1(y_1), s_2(y_2); \rho\}, \quad y_1 > u_1,\ y_2 > u_2, \tag{14}$$

for transformations $s_1$ and $s_2$ defined such that $\{s_1(Y_1), s_2(Y_2)\}$ follow a standard bivariate normal distribution. If $\rho = \rho(h) < 1$, then Y1 and Y2 are asymptotically independent with $\eta = \{1 + \rho(h)\}/2$ [Ledford and Tawn, 1996], where $h = x_1 - x_2$ is the lag vector, yielding another asymptotic independence model.

4. Inference

4.1. Gaussian Models

[28] Various parametric models have been proposed for the process $W(x)$ appearing in equation (6) [see Schlather, 2002; de Haan and Pereira, 2006; Kabluchko et al., 2009; Blanchet and Davison, 2011; Davison et al., 2012; Davison and Gholamrezaee, 2012; Wadsworth and Tawn, 2012; Smith, unpublished manuscript, 1990]. We concentrate here on models based on the Gaussian distribution, which are easily interpretable, and are related to classical geostatistics via the inclusion of correlation functions and variograms.

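[29] A first class of models takes W(x) to be a probability density function. The Gaussian model of Smith (unpublished manuscript, 1990) takes $W(x)$ to be a multivariate normal density with covariance matrix Σ and mean s uniformly chosen on $\mathcal{X}$. Then the exponent measure of the process Z(x) at x1 and x2 is

$$V(z_1, z_2) = \frac{1}{z_1}\,\Phi\left\{\frac{a}{2} + \frac{1}{a}\log\left(\frac{z_2}{z_1}\right)\right\} + \frac{1}{z_2}\,\Phi\left\{\frac{a}{2} + \frac{1}{a}\log\left(\frac{z_1}{z_2}\right)\right\}, \tag{15}$$

where $\Phi$ is the standard normal cumulative distribution function and $a^2 = (x_1 - x_2)^T \Sigma^{-1} (x_1 - x_2)$.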

[30] A second, the Brown-Resnick model [Brown and Resnick, 1977; Kabluchko et al., 2009], is obtained by taking $W(x) = \exp\{\varepsilon(x) - \gamma(x)\}$, where $\varepsilon(x)$ is a centered intrinsically stationary Gaussian process with semivariogram γ and $\varepsilon(0) = 0$ almost surely. Then the exponent measure of the process Z(x) has the form (15) with $a^2 = 2\gamma(x_1 - x_2)$. Popular semivariograms include the power-law, or stable, function [Banerjee et al., 2004, p. 28]

$$\gamma(h) = \left(\frac{\|h\|}{\lambda}\right)^{\kappa}, \quad \lambda > 0,\ 0 < \kappa \le 2, \tag{16}$$

where $\|\cdot\|$ denotes the Euclidean norm. Taking $\kappa = 2$ is equivalent to using the Smith model with an isotropic covariance matrix Σ, i.e., one proportional to the identity [Huser and Davison, 2013a].
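The link between the semivariogram and the extremal coefficient can be made concrete with a short sketch. It assumes the usual Hüsler-Reiss form of the bivariate exponent measure shared by the Smith and Brown-Resnick models, and a power-law semivariogram γ(h) = (‖h‖/λ)^κ; the names λ (range) and κ (smoothness) and the parameter values in the usage note are illustrative assumptions.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_semivariogram(h_norm, lam, kappa):
    """gamma(h) = (||h|| / lam) ** kappa, with lam > 0 and 0 < kappa <= 2."""
    return (h_norm / lam) ** kappa


def V_gauss(z1, z2, a):
    """Bivariate exponent measure of the Smith / Brown-Resnick form, a > 0."""
    return (Phi(a / 2.0 + math.log(z2 / z1) / a) / z1
            + Phi(a / 2.0 + math.log(z1 / z2) / a) / z2)


def extremal_coefficient(h_norm, lam, kappa):
    """theta = V(1, 1) = 2 * Phi(a / 2), with a^2 = 2 * gamma(h) and ||h|| > 0."""
    a = math.sqrt(2.0 * power_semivariogram(h_norm, lam, kappa))
    return V_gauss(1.0, 1.0, a)


def smith_a_isotropic(h_norm, variance):
    """a for the Smith model when Sigma = variance * identity."""
    return h_norm / math.sqrt(variance)
```

For instance, extremal_coefficient(0.01, 2.0, 1.5) is close to 1 (near-complete dependence at short range), while extremal_coefficient(10.0, 2.0, 1.5) is close to 2. With κ = 2, the Brown-Resnick value of a coincides with smith_a_isotropic(h_norm, lam ** 2 / 2).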

[31] Schlather [2002] proposes a third model, taking W(x) proportional to the positive part of a stationary centered Gaussian process with unit variance and correlation function $\rho(h)$. The corresponding exponent measure is

$$V(z_1, z_2) = \frac{1}{2}\left(\frac{1}{z_1} + \frac{1}{z_2}\right)\left[1 + \left\{1 - \frac{2\{\rho(h) + 1\}\, z_1 z_2}{(z_1 + z_2)^2}\right\}^{1/2}\right], \quad h = x_1 - x_2.$$

[32] Various choices of $\rho(h)$ are available, though we only use the stable correlation function $\rho(h) = \exp\{-\gamma(h)\}$ with $\gamma(h)$ defined in (16). Extremes from Schlather models cannot attain independence for any correlation function, since $\theta(x_1, x_2) < 2$ for all pairs of sites x1 and x2 in $\mathcal{X}$ [Schlather, 2002].
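The closed-form Schlather exponent measure can be checked against the expectation in (7) by Monte Carlo. The sketch below assumes the standard bivariate form of the Schlather exponent measure and takes W(x) = √(2π) max{0, ε(x)}, so that E{W(x)} = 1; the sample size and seed are arbitrary.

```python
import math
import random


def V_schlather(z1, z2, rho):
    """Closed-form bivariate exponent measure of the Schlather model (assumed standard form)."""
    s = z1 + z2
    return 0.5 * (1.0 / z1 + 1.0 / z2) * (
        1.0 + math.sqrt(1.0 - 2.0 * (rho + 1.0) * z1 * z2 / s ** 2))


def V_monte_carlo(z1, z2, rho, n=200_000, seed=0):
    """Monte Carlo estimate of E[max{W(x1)/z1, W(x2)/z2}] for the Schlather W."""
    rng = random.Random(seed)
    c = math.sqrt(2.0 * math.pi)  # normalizes the positive part to have mean 1
    total = 0.0
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        # second Gaussian variable with correlation rho to the first
        e2 = rho * e1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        total += max(c * max(e1, 0.0) / z1, c * max(e2, 0.0) / z2)
    return total / n
```

Setting z1 = z2 = 1 gives the extremal coefficient 1 + {(1 − ρ)/2}^{1/2}, which for the positive correlations produced by the stable family never exceeds 1 + 2^{-1/2} ≈ 1.71, illustrating why this model cannot represent independent extremes.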

[33] Asymptotic independence models can be obtained by taking any of these exponent measure functions and applying the transformation leading to (12).
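To illustrate the inversion, the following sketch applies the map g(z) = −1/log{1 − exp(−1/z)} to a Brown-Resnick-type exponent measure and recovers the relation η = 1/θ numerically. It assumes the Hüsler-Reiss form of the bivariate exponent measure; the value of the dependence parameter a in the usage note is hypothetical.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def V_gauss(z1, z2, a):
    """Bivariate exponent measure of the Smith / Brown-Resnick form, a > 0."""
    return (Phi(a / 2.0 + math.log(z2 / z1) / a) / z1
            + Phi(a / 2.0 + math.log(z1 / z2) / a) / z2)


def g(z):
    """Inversion map; it is decreasing and is its own inverse on (0, infinity)."""
    return -1.0 / math.log1p(-math.exp(-1.0 / z))


def inverted_joint_survival(z1, z2, a):
    """Pr(Z'(x1) > z1, Z'(x2) > z2) = exp[-V{g(z1), g(z2)}] for the inverted process Z' = g(Z)."""
    return math.exp(-V_gauss(g(z1), g(z2), a))


def implied_theta(z, a):
    """Ratio of log joint survival to log marginal survival on the diagonal."""
    marginal = -math.expm1(-1.0 / z)  # Pr(Z' > z) = 1 - exp(-1/z)
    return math.log(inverted_joint_survival(z, z, a)) / math.log(marginal)
```

On the diagonal the joint survival probability equals the marginal survival probability raised to the power θ, so implied_theta(z, a) returns θ = 2Φ(a/2) at every level z, and the joint survivor function decays like z^{−θ}, i.e., η = 1/θ. With a = 1, for example, θ ≈ 1.38 and η ≈ 0.72.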

4.2. Pairwise Composite Likelihood

[34] Given data $y_{i,d}$ ($i = 1, \ldots, n$; $d = 1, \ldots, D$) consisting of n independent replicates from a max-stable process observed at D sites, the likelihood for the models described earlier cannot easily be expressed for general D, for two reasons. First, exact computation of the joint cumulative distribution function (5) would entail calculating (7), and in general this is out of reach for $D > 2$. Second, even if an explicit form for (7) were available, computation of the likelihood function would involve D-fold differentiation of (5), and this leads to a combinatorial explosion [Davison and Gholamrezaee, 2012]; with D = 25 the number of terms in the likelihood is of order $10^{18}$. However, if the bivariate margins can be computed and the model parameters ψ can be identified from them, then it is possible to estimate ψ by maximizing the pairwise log likelihood [Lindsay, 1988; Varin et al., 2011]

$$\ell_p(\psi) = \sum_{i=1}^{n} \sum_{d_1 < d_2} \log f(y_{i,d_1}, y_{i,d_2}; \psi), \tag{17}$$

where f denotes the likelihood contribution from two distinct observations from the same replicate. The marginal and dependence parameters are estimated simultaneously, as suggested by Padoan et al. [2010].

[35] Under essentially the same regularity conditions as those needed for the limiting normality of the standard maximum likelihood estimator, the maximum pairwise likelihood estimator $\hat\psi$ has a limiting multivariate normal distribution with mean ψ and covariance matrix of sandwich form $J(\psi)^{-1} K(\psi) J(\psi)^{-1}$ as $n \to \infty$, where

$$K(\psi) = \mathrm{var}\left\{\frac{\partial \ell_p(\psi)}{\partial \psi}\right\}, \quad J(\psi) = E\left\{-\frac{\partial^2 \ell_p(\psi)}{\partial \psi\, \partial \psi^T}\right\}$$

are the variance of the score function and the expected information matrix derived from (17). An estimate $\hat J$ of $J(\psi)$ is easily obtained from the Hessian given by the optimization algorithm. When independent replications of the process are available, an estimate $\hat K$ of $K(\psi)$ can be obtained by the empirical variance of the score contribution of each observation [Varin et al., 2011]. We have found this to be somewhat unstable, so we instead approximate $J(\psi)^{-1} K(\psi) J(\psi)^{-1}$ by the covariance matrix of bootstrap copies of the estimates [Varin et al., 2011] and then obtain $\hat K$ by multiplying both sides of this covariance matrix by $\hat J$.

[36] Model selection may be guided by minimizing the composite likelihood information criterion $\mathrm{CLIC} = -2\ell_p(\hat\psi) + 2\,\mathrm{tr}(\hat J^{-1} \hat K)$ [Varin and Vidoni, 2005], but its values can be huge because of the numbers of terms in (17), so we prefer $\mathrm{CLIC}^* = (D - 1)^{-1}\,\mathrm{CLIC}$, which corresponds closely to the usual Akaike information criterion (AIC) for independent observations. Similarly, we define a scaled pairwise log likelihood, $\ell_p^* = (D - 1)^{-1} \ell_p(\hat\psi)$.

[37] In applying pairwise likelihood we must account for the fact that exceedances may occur in both variables, in one variable or in neither, and to do so we apply the censoring approach described by Coles [2001, section 8.3.1]. If the bivariate distribution above thresholds u1 and u2 is F, then the likelihood contribution is

$$f(y_1, y_2) = \begin{cases} \partial_1 \partial_2 F(y_1, y_2), & y_1 > u_1,\ y_2 > u_2, \\ \partial_1 F(y_1, u_2), & y_1 > u_1,\ y_2 \le u_2, \\ \partial_2 F(u_1, y_2), & y_1 \le u_1,\ y_2 > u_2, \\ F(u_1, u_2), & y_1 \le u_1,\ y_2 \le u_2, \end{cases}$$

where $\partial_i$ denotes differentiation with respect to yi. Thus, observations that lie below a threshold contribute only a censored contribution to the likelihood. These equations are used to derive composite likelihoods for the different models of equations (10), (12), and (14). If one observation of a pair is missing, then the marginal GPD contribution from the remaining observation is included in the likelihood and contributes to estimation of τ and ξ.
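The four censoring cases can be sketched in Python for the max-stable model (10), using finite differences in place of analytical derivatives. Several simplifications are assumed and are not those of the original R implementation: a Brown-Resnick-type exponent measure with a single dependence parameter a per pair, common marginal parameters and a common threshold u at both sites, GPD scale τ + ξu with ξ ≠ 0, and pairs containing a missing value (coded as None) are skipped rather than replaced by a marginal GPD term. The helper pair_F is hypothetical.

```python
import itertools
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def V_gauss(z1, z2, a):
    """Bivariate exponent measure of the Smith / Brown-Resnick form, a > 0."""
    return (Phi(a / 2.0 + math.log(z2 / z1) / a) / z1
            + Phi(a / 2.0 + math.log(z1 / z2) / a) / z2)


def gpd_cdf(y, u, zeta, tau, xi):
    """Marginal tail model: 1 - zeta * {1 + xi (y - u) / (tau + xi u)}^(-1/xi), xi != 0."""
    return 1.0 - zeta * (1.0 + xi * (y - u) / (tau + xi * u)) ** (-1.0 / xi)


def make_F(a, u, zeta, tau, xi):
    """Joint distribution exp[-V{t(y1), t(y2)}] with t(y) = -1 / log F_marginal(y)."""
    def F(y1, y2):
        t1 = -1.0 / math.log(gpd_cdf(y1, u, zeta, tau, xi))
        t2 = -1.0 / math.log(gpd_cdf(y2, u, zeta, tau, xi))
        return math.exp(-V_gauss(t1, t2, a))
    return F


def censored_contribution(F, y1, y2, u1, u2, h=1e-3):
    """Censored likelihood contribution, with central finite differences of F."""
    if y1 > u1 and y2 > u2:  # both exceed: mixed second derivative
        return (F(y1 + h, y2 + h) - F(y1 + h, y2 - h)
                - F(y1 - h, y2 + h) + F(y1 - h, y2 - h)) / (4.0 * h * h)
    if y1 > u1:              # only the first exceeds: censor the second at u2
        return (F(y1 + h, u2) - F(y1 - h, u2)) / (2.0 * h)
    if y2 > u2:              # only the second exceeds: censor the first at u1
        return (F(u1, y2 + h) - F(u1, y2 - h)) / (2.0 * h)
    return F(u1, u2)         # neither exceeds


def pairwise_loglik(data, pair_F, u):
    """Pairwise log likelihood over complete pairs; pair_F(d1, d2) returns F for that pair."""
    total = 0.0
    for row in data:  # one row per day, one entry per site
        for d1, d2 in itertools.combinations(range(len(row)), 2):
            y1, y2 = row[d1], row[d2]
            if y1 is None or y2 is None:
                continue  # simplification: no marginal term for incomplete pairs
            total += math.log(censored_contribution(pair_F(d1, d2), y1, y2, u, u))
    return total
```

In practice pair_F would map the distance between sites d1 and d2 to a through the semivariogram, and the resulting function would be passed to a numerical optimizer over the marginal and dependence parameters jointly. When a is very large the two sites are effectively independent and the contribution for a joint exceedance reduces to the product of the two marginal GPD densities, which provides a convenient check of the code.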

5. Modeling Extreme Rainfall in Val Ferret

[38] In this section, we fit asymptotic dependence and independence models (10), (12), and (14) to the data from the 24 stations in the Val Ferret region and to the 31 years of data at the GSB. The daily records are viewed as independent replicates of a spatial rainfall process, at least for extreme levels. Estimation of the extremal dependence is challenging, because it relies on a subset of 575 days of data, with about 58% missing. Marginal estimation is made more precise because of the use of the longer series of GSB data. The threshold for each station cannot be taken too high but must be high enough that the extremal models fit adequately. One consequence of having limited data is that standard errors of our estimates are large, and that return levels have large confidence intervals. With longer time series, predictions would be more accurate.

[39] We first chose the thresholds for fitting model (3) at each station by taking $\zeta_u = 0.1$, corresponding to the 90% empirical quantiles for each series. For the 24 stations in the catchment, this choice corresponds to an estimated threshold of between 7 and 15 mm, but these are affected by the number of missing values. Bootstrap 95% confidence intervals associated with these estimates contain 11 mm for all but one of the stations, so we decided to use a fixed threshold of 11 mm throughout. For the GSB station, the threshold was estimated to be 17 mm. These different thresholds might be explained by the different climatic conditions inside and outside the catchment. The corresponding exceedance probabilities can be supposed to equal 10%. These choices produce reasonable fits of the marginal GPD model (3).

[40] Expressions (10), (12), and (14) then model the marginal distributions and dependence structure of the rainfall series above the chosen thresholds. Taking higher thresholds had little effect on the parameter estimates and did not improve the fits.

[41] We use the composite likelihood approach described in section 'Pairwise Composite Likelihood' to fit max-stable and asymptotic independence models under the assumption that the marginal parameters τ and ξ in (3) are constant for all stations of the catchment, but with a different scale parameter τGSB for the GSB station, because we can expect different behavior for rainfall inside and outside the catchment. The shape parameter ξ was taken to be constant, as it is usually difficult to estimate. We compare the fits of the different models using the CLIC* (see Table 1). For the max-stable models, the values of CLIC* indicate that the Schlather model is better than the Smith and Brown-Resnick models. The Smith model is by far the worst in terms of CLIC*, agreeing with the findings of Davison et al. [2012] that it may be too smooth to adequately model complex environmental processes. The likelihood maximization fails to converge for the inverted Smith model. Among the other asymptotic independence models, the best CLIC* is for the inverted Schlather model; it is similar to that for the inverted Brown-Resnick model. The Gaussian copula model, whose likelihood is greater than those for all max-stable models, nevertheless has a larger CLIC*. Asymptotic independence models based on inverted max-stable processes seem better overall, since they outperform max-stable models both in terms of likelihood and in terms of CLIC*. The Gaussian copula model seems to fit poorly: the uncertainty for its estimated range is rather large, and this inflates the CLIC*. These results suggest that the limiting distribution is not yet attained and that higher thresholds may be preferable, but we tried using thresholds up to 20 mm without any change to the conclusions. The results are similar to those found for extreme summer rainfall over a larger region of Switzerland by Davison et al. [2013], which also suggest that daily rainfall processes are asymptotically independent. Buishand [1984] found similar results for annual maximum daily rainfall in the Netherlands, at larger spatial scales.

Table 1. Parameter Estimates, Maximum Composite Log Likelihood, and CLIC* for Models (10), (12), and (14)a

Model      τ (mm)              τGSB (mm)          ξ                   Dependence parameters                                      Log lik.  CLIC*
Max-Stable Models
Smith      7.48 (5.18, 10.05)  4.69 (2.47, 7.38)  0.11 (0.03, 0.19)   0.48 (0.33, 0.73)  2.94 (1.75, 4.92)    0.57 (0.27, 1.03)  −6122     12,264
Schlather  7.34 (5.19, 10.05)  5.17 (2.79, 7.88)  0.09 (0.01, 0.17)   -                  10.63 (5.26, 25.85)  0.52 (0.41, 0.62)  −6035     12,087
B-R        6.89 (4.74, 9.32)   5.46 (2.99, 7.79)  0.10 (0.02, 0.17)   -                  5.82 (2.93, 14.85)   0.46 (0.34, 0.58)  −6036     12,092
Asymptotic Independence Models
Schlather  7.85 (5.79, 10.53)  6.42 (3.97, 9.21)  0.04 (−0.04, 0.11)  -                  186 (105, 367)       0.60 (0.52, 0.67)  −6023     12,065
B-R        7.79 (5.41, 10.30)  6.48 (4.43, 9.15)  0.04 (−0.04, 0.12)  -                  113 (48, 185)        0.57 (0.48, 0.69)  −6027     12,074
GC         7.53 (5.32, 10.00)  6.21 (4.04, 8.95)  0.06 (−0.02, 0.12)  -                  103 (44, 387)        0.50 (0.38, 0.59)  −6029     12,118

a For the Smith model, the dependence parameters are the elements of Σ, the variance matrix of the underlying bivariate Gaussian density. For the Brown-Resnick (B-R), Schlather, and Gaussian copula (GC) models, they are the range and smoothness parameters. The units of τ, τGSB, and the range parameter are mm, mm, and km, respectively. The values in parentheses indicate 95% confidence intervals obtained by bootstrapping the daily data 500 times. The likelihood maximization fails to converge for the inverted Smith model.
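The ranking implied by Table 1 can be reproduced with a few lines. The sketch below copies the tabulated log likelihoods and CLIC* values; the "implied penalty" column assumes that CLIC* has the form of minus twice the maximized composite log likelihood plus a penalty, which is not restated in this section, so that column should be read as indicative only.

```python
# Values copied from Table 1: (maximum composite log likelihood, CLIC*).
FITS = {
    "Smith (max-stable)": (-6122, 12264),
    "Schlather (max-stable)": (-6035, 12087),
    "Brown-Resnick (max-stable)": (-6036, 12092),
    "Schlather (inverted)": (-6023, 12065),
    "Brown-Resnick (inverted)": (-6027, 12074),
    "Gaussian copula": (-6029, 12118),
}

def rank_by_clic(fits):
    """Return (model, CLIC*, implied penalty) tuples, smallest CLIC* first.

    The implied penalty is CLIC* + 2 * loglik, under the ASSUMED form
    CLIC* = -2 * loglik + penalty.
    """
    rows = [(name, clic, clic + 2 * loglik) for name, (loglik, clic) in fits.items()]
    return sorted(rows, key=lambda row: row[1])

for name, clic, penalty in rank_by_clic(FITS):
    print(f"{name:28s} CLIC* = {clic:6d}   implied penalty = {penalty}")
```

Under that assumption the Gaussian copula model carries a much larger penalty than the other models, which matches the remark that the uncertainty of its estimated range inflates its CLIC*.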

[42] In light of the earlier results, we base further discussion on the max-stable and asymptotic independence Schlather models. Table 1 shows that the marginal parameters are very similar, with positive estimates of ξ corresponding to the Fréchet distribution, but the standard errors do not allow any clear distinction of the sign of ξ for asymptotic independence models. The estimates of the range and smoothness parameters λ and κ indicate dependence at fairly long ranges but rough processes with small-scale variation. The confidence intervals for the range parameters are highly asymmetric, showing that it is impossible to estimate the upper bound of the dependence owing to the small size of the catchment. We tried to include nugget parameters [Diggle and Ribeiro, 2007, section 3.5] in the correlations and semivariogram to account for very small variation and measurement error, but it was then difficult to estimate both the smoothness and nugget parameters.

[43] To assess the validity of our marginal models, we first checked the quantile-quantile plots (QQ plots) for data from each station (not shown). As we assumed the same marginal model for all the locations of the catchment, we also computed a pooled QQ plot for the 24 stations (Figure 5), with confidence bounds based on the overall best model, i.e., the inverted max-stable model based on the Schlather model, thus taking into account the spatial dependence in the data. This QQ plot indicates a reasonable fit. It is unsurprising that stationarity seems to be reasonable, considering the size of the study region and of the data set. The QQ plots for the other models (not shown) are similar.

Figure 5.

Pooled unit Fréchet QQ plot for the marginal fits of model (12) with the Schlather model. The same marginal model is fitted to the data from all the 24 locations in the catchment. Dotted lines are the 95% confidence bounds, obtained by simulating from fitted model (12). The solid diagonal line indicates a perfect fit.

[44] Max-stable random fields can be simulated using the R package SpatialExtremes (R package version 1.8-1, 2011), and simulations from inverted models are then easily obtained using equation (11). Spatial rainfall can then be simulated by marginal transformation of the simulated max-stable random fields, using (3) above the threshold and the empirical distributions below the threshold. Since the empirical distributions contain zero rainfall values, so too do the simulated ones. In our case the threshold is constant across our (small) region, so we simply merged the empirical distributions below the threshold, but over larger regions a suitable interpolation procedure could be used. Our transformation simulates rainfall having the estimated extremal dependence structure both above and below the thresholds, and this dependence structure may be inappropriate below them. However, the marginal distributions below the thresholds are correct, and the dependence structure for low rainfall should have little impact on conclusions for extreme events.
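The marginal transformation can be sketched as below. Neither equation (3) nor equation (11) is reproduced in this section, so the code assumes their usual forms: a GPD tail Pr(Y > y) = p{1 + ξ(y − u)/τ}^(−1/ξ) above the threshold u, and the inversion Z' = −1/log{1 − exp(−1/Z)} of a unit Fréchet field. The function names are hypothetical, ξ is assumed nonzero, and the default parameter values are the inverted Schlather estimates from Table 1 with the 11 mm threshold and 10% exceedance probability.

```python
import numpy as np

def invert_max_stable(z):
    """ASSUMED form of equation (11): inverted field with unit Frechet margins."""
    z = np.asarray(z, dtype=float)
    return -1.0 / np.log1p(-np.exp(-1.0 / z))

def frechet_to_rainfall(z, below, threshold=11.0, tau=7.85, xi=0.04, p_exceed=0.10):
    """Map unit Frechet values to rainfall (mm).

    Above the threshold the ASSUMED GPD tail is inverted; below it, empirical
    quantiles of `below` (pooled observations not exceeding the threshold) are used.
    """
    z = np.atleast_1d(np.asarray(z, dtype=float))
    tail = -np.expm1(-1.0 / z)            # 1 - exp(-1/z): probability of exceeding z
    rain = np.empty_like(tail)
    above = tail < p_exceed
    # Solve p_exceed * (1 + xi * (y - threshold) / tau) ** (-1 / xi) = tail for y.
    rain[above] = threshold + tau / xi * ((tail[above] / p_exceed) ** (-xi) - 1.0)
    # Rescale the remaining probabilities to [0, 1] and read off empirical quantiles.
    q = np.minimum((1.0 - tail[~above]) / (1.0 - p_exceed), 1.0)
    rain[~above] = np.quantile(np.asarray(below, dtype=float), q)
    return rain
```

Because the pooled below-threshold sample contains zeros, small Fréchet values map to dry days, as noted above.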

[45] Figure 6 shows rainfall processes simulated from the fitted Smith and Schlather models, and a simulation from the inverted Schlather model. They show the general behavior of simulated rainfall extremes for the different models, though for ease of comparison the maximum value is around 50 mm in all three cases. The elliptic contours of the Smith model look quite unrealistic, and the Schlather model seems more plausible. The difference between the Schlather max-stable and inverted models is striking. Extremes of the latter are more local, but a realization with a maximum of 50 mm is more likely to appear than for the two max-stable models.

Figure 6.

Simulation of max-stable random fields, on the original data scale (mm), from the fitted (a) Smith and (b) Schlather models and (c) an inverted max-stable process based on the Schlather model. Black dots show the locations of the 24 stations. Distances (km) have as origin the Swiss coordinate system (CH1903), and the contour shows the Val Ferret watershed, as in Figure 1.

[46] Figure 7 shows the estimated extremal coefficients and coefficients of tail dependence for the max-stable and asymptotic independence models, respectively. Empirical estimates of θ and η are obtained using the likelihood estimator of Schlather and Tawn [2003] and that of Ledford and Tawn [1996], respectively. To reduce the uncertainty of the estimated extremal coefficients, we have grouped pairs of stations into distance classes. The fitted Smith model is nonisotropic, and its extremal coefficients lie in the hatched polygon. Although the confidence intervals are large, the Smith model is not flexible enough to capture the general pattern of extremal dependence. The Schlather and Brown-Resnick model estimates are very close and seem to provide a better fit. For all these max-stable models η = 1, which seems acceptable considering the empirical estimates and their confidence intervals. However, asymptotically independent models seem to perform better, since they capture the decrease of η with distance. The two inverted models are indistinguishable. The Gaussian copula model produces a rather different fit that does not lie within the confidence interval based on the most distant pairs.

Figure 7.

(a) Fitted extremal coefficients and (b) tail dependence coefficients for the different models. In Figure 7a, the hatched polygon shows the limits of the Smith extremal coefficient curves, which are direction-dependent. In Figure 7b, the solid line corresponds to the Gaussian copula model. In both plots, dashed lines correspond to the Schlather model, and dotted lines correspond to the Brown-Resnick model. Points are the estimated coefficients for pairs of stations grouped into distance classes, with 95% confidence intervals, obtained by bootstrapping the daily data, shown as grey vertical lines.
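Curves of the kind shown in Figure 7 can be sketched for the Schlather model from its well-known extremal coefficient, θ(h) = 1 + [{1 − ρ(h)}/2]^(1/2), together with the relation η(h) = 1/θ(h) that holds for the corresponding inverted model. The correlation family is not restated in this section, so a powered exponential form ρ(h) = exp{−(h/λ)^κ} is assumed here, and the function names are hypothetical.

```python
import math

def stable_correlation(h, lam, kappa):
    """ASSUMED correlation family: rho(h) = exp{-(h / lam) ** kappa}, h in km."""
    return math.exp(-((h / lam) ** kappa))

def schlather_extremal_coefficient(h, lam, kappa):
    """theta(h) = 1 + sqrt((1 - rho(h)) / 2); lies between 1 and 1 + sqrt(1/2)."""
    rho = stable_correlation(h, lam, kappa)
    return 1.0 + math.sqrt((1.0 - rho) / 2.0)

def inverted_tail_dependence(h, lam, kappa):
    """Coefficient of tail dependence of the inverted model: eta(h) = 1 / theta(h)."""
    return 1.0 / schlather_extremal_coefficient(h, lam, kappa)

# Range and smoothness estimates taken from Table 1.
for h in (1.0, 5.0):
    theta = schlather_extremal_coefficient(h, lam=10.63, kappa=0.52)
    eta = inverted_tail_dependence(h, lam=186.0, kappa=0.60)
    print(f"h = {h} km: theta (max-stable) = {theta:.3f}, eta (inverted) = {eta:.3f}")
```

The sketch makes the qualitative point of the figure explicit: θ increases with distance for the max-stable model, whereas η decreases with distance for the inverted model.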

[47] After having used CLIC* to identify the best stationary models for our data, we attempted to fit nonstationary models in which the scale parameter τ of the marginal GPDs depends on covariates such as altitude, latitude, and longitude; we kept ξ constant. The QQ plots (not shown) show no real improvement, so we retain the stationary model.

[48] We now examine results for our selected models, which are based on Schlather random fields. For comparison, we also consider the Gaussian copula model, which represents a standard geostatistical approach. As a pairwise diagnostic, Figure 8a shows the conditional exceedance probabilities

Pr{Y(x1) > y | Y(x2) > y}

predicted by the different models, for distances 1 and 5 km. Figure 8a shows that the differences between max-stable and asymptotic independence models are small for the lower rainfall levels, indicating that all the models can adequately fit the observed data, but they differ in their predictions for higher levels.
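The divergence at high levels can be illustrated with two standard identities for a pair of sites with unit Fréchet margins and extremal coefficient θ. For a max-stable pair, Pr(Z1 ≤ z, Z2 ≤ z) = exp(−θ/z); for the corresponding inverted pair, Pr(Z1 > z, Z2 > z) = {1 − exp(−1/z)}^θ. The sketch below, with hypothetical function names, writes both conditional probabilities in terms of the marginal exceedance probability p; it is an illustration rather than the computation behind Figure 8a.

```python
import math

def cond_exceed_max_stable(p, theta):
    """Pr{Z2 > z | Z1 > z} for a max-stable pair, where p = Pr{Z1 > z}.

    Joint exceedance = 1 - 2 * (1 - p) + (1 - p) ** theta, computed stably for small p.
    """
    joint_exceed = 2.0 * p + math.expm1(theta * math.log1p(-p))
    return joint_exceed / p

def cond_exceed_inverted(p, theta):
    """Pr{Z2 > z | Z1 > z} for the inverted pair: p ** theta / p."""
    return p ** (theta - 1.0)

# theta = 1.5 is an arbitrary illustrative value, not an estimate from the paper.
for p in (0.1, 0.01, 0.001):
    print(p, cond_exceed_max_stable(p, 1.5), cond_exceed_inverted(p, 1.5))
```

As p decreases, the max-stable probability tends to 2 − θ, whereas the inverted probability tends to zero whenever θ > 1, so the two classes of model separate as the rainfall level increases.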

Figure 8.

Comparison of results from max-stable and asymptotic independence models. (a) Theoretical conditional probabilities of exceedances for pairs of locations 1 km apart (solid lines) and 5 km apart (dashed lines). (b) Return levels (solid) with 95% confidence intervals (dotted), for the total daily rainfall falling at the 24 stations in the Val Ferret watershed. In Figures 8a and 8b, blue lines correspond to the Schlather max-stable model, red lines correspond to the inverted max-stable Schlather model, and green lines correspond to the Gaussian copula model. Black lines in Figure 8b correspond to a spatially independent model.

[49] Simulations from our models provide information about quantities depending on the spatial rainfall process, such as return levels for the total amount of rainfall at the 24 stations of the catchment in 1 day. For a random variable Y of daily records, the r-day return level yr has probability 1/r of being exceeded on one particular day. Return levels for the peaks over threshold model can be derived by inverting (3), and return levels for the total daily rainfall at the 24 stations can be derived by simulation from our model; since we model daily rainfall for summer months, the return levels are expressed in terms of summer days. The estimates shown in Figure 8b were obtained by simulating 200,000 summer days and taking empirical quantiles of the total amount of rainfall at the 24 stations. The confidence intervals, which are obtained from bootstrapping 200 times using all the summer days of the original data, are rather wide, but the asymptotic independence models give lower estimated return levels. The predictions based on the max-stable model can be seen as giving an upper bound for joint extreme quantities, such as return levels. Figure 8b also shows the return levels corresponding to a spatially independent model whose marginal parameters are the same as those estimated for the max-stable Schlather model; this gives much lower return levels and would lead to severe underestimation of risk.
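Both kinds of return level can be sketched briefly. The marginal version assumes that model (3) is the usual GPD tail Pr(Y > y) = p{1 + ξ(y − u)/τ}^(−1/ξ) with ξ ≠ 0, which is not restated in this section; the function names are hypothetical, and the defaults are the inverted Schlather estimates from Table 1 with the 11 mm threshold and 10% exceedance probability.

```python
import numpy as np

def gpd_return_level(r, threshold=11.0, tau=7.85, xi=0.04, p_exceed=0.10):
    """r-day return level: the level exceeded with probability 1/r on a given day.

    Obtained by solving p_exceed * (1 + xi * (y - threshold) / tau) ** (-1 / xi) = 1 / r
    for y under the ASSUMED GPD tail; only valid when the level is above the threshold.
    """
    if r * p_exceed < 1.0:
        raise ValueError("return level lies below the threshold; use empirical quantiles")
    return threshold + tau / xi * ((r * p_exceed) ** xi - 1.0)

def simulated_return_levels(totals, periods):
    """Empirical return levels from simulated daily totals (e.g., sums over stations).

    The r-day level is the empirical (1 - 1/r) quantile, so the number of simulated
    days should be much larger than the longest return period requested.
    """
    probs = 1.0 - 1.0 / np.asarray(periods, dtype=float)
    return np.quantile(np.asarray(totals, dtype=float), probs)
```

With 200,000 simulated summer days, as used for Figure 8b, the empirical quantile for a return period of a few thousand summer days still rests on many simulated exceedances.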

6. Discussion

[50] In this paper we propose the use of extreme-value models to estimate the extremes of spatial daily rainfall. Our approach consists of fitting GPDs to marginal threshold exceedances and modeling spatial dependence using max-stable models or asymptotic independence models based upon them. For comparison, we also consider a model based on the Gaussian copula. The models are fitted to threshold exceedances using a censored pairwise likelihood. Nonstationary models could be fitted by regression on covariates, with model selection performed using CLIC*. Daily rainfall fields can be simulated over the whole catchment. Estimates of quantities depending on spatial extremes, such as joint return levels, can be derived by simulation from the fitted model.

[51] Our application to the Val Ferret watershed, a small mountainous catchment in the Swiss Alps, shows that stationary max-stable and inverted max-stable models seem appropriate for modeling extreme rainfall in this small catchment. Although max-stable models are natural models for extremes of random fields, model selection seems to favor asymptotic independence models, for which the very rarest events are increasingly local. Simulations from the max-stable model, which assumes stronger dependence, give an upper bound for the effects of joint extremes. Schlather models give simulations that look reasonable, but the Smith model produces unrealistically smooth extremes and is also worst in terms of the information criterion CLIC*. The Schlather model seems to be appropriate for rainfall in small regions such as Val Ferret, though it cannot model the independence that would be expected to arise at larger spatial scales. This must be introduced using a Brown-Resnick model or a random set [Davison and Gholamrezaee, 2012]. Indeed, Huser and Davison [2013b] find that a Schlather model with a random set is suitable for a larger hourly summer rainfall data set, though at a much bigger spatial scale.

[52] Rainfall can be simulated over the whole region by applying a marginal transformation to max-stable simulations. In case of spatial nonstationarity, rainfall could be simulated by specifying a marginal model for the threshold (as in Northrop and Jonathan [2011]) and then applying the procedure described in section 'Modeling Extreme Rainfall in Val Ferret', with a suitable spatial extrapolation of the empirical distribution functions from observed rainfall sites to ungauged ones. We do not include temporal nonstationarity in our model, but with more extensive data this could easily be added.

[53] The interpolation of rainfall values at unobserved sites from nearby observations is generally solved via kriging [Diggle and Ribeiro, 2007], but although this yields “optimal” prediction for Gaussian processes, it may produce unrealistic predictions for extremes owing to the unsuitability of a joint Gaussian model. A more appropriate max-stable approach uses conditional simulation of rainfall at ungauged sites [Dombry et al., 2013].

[54] Our approach could be used in other situations where spatial simulation of extreme rainfall is needed. The proposed model is appropriate at timescales for which consecutive records appear to be independent, which was assumed in our application, but if finer temporal resolution is required, stronger serial correlation will be present and spatiotemporal models will be needed. Max-stable processes have been used for spatiotemporal rainfall by Huser and Davison [2013b], though fitting and simulating from their model is burdensome.

[55] In many hydrological models for rainfall-runoff simulation, temperature and other variables that affect predictions from hydrological models are also needed. An open challenge is the joint modeling of these various dependent processes, taking into account the extremes of some of them.


[56] We thank the Swiss National Science Foundation, the ETH Competence Center Environment and Sustainability SwissEx and Extremes projects, and Christophe Ancey, Marc Parlange, Simone Padoan, and anonymous reviewers for constructive comments. We thank MétéoSuisse for making the Grand St-Bernard data available.