Global warming is expected to intensify the hydrologic cycle. Documenting whether significant changes in extreme precipitation regimes have already happened is consequently one of the challenging topics in climate research. The high natural variability of extreme precipitation often prevents significant results from being obtained when testing for changes in the empirical distribution of extreme rainfall at regional scale. A regional integrated approach is proposed here as one possible answer to this complex methodological problem. Three methods are combined in order to detect regionally significant trends and/or breakpoints in series of annual maximum daily rainfall: (1) individual stationarity tests applied to the raw point series of maxima, (2) maximum likelihood testing of time-dependent generalized extreme value (GEV) distributions fitted to these series, and (3) heuristic testing of a regional time-dependent GEV distribution. This approach is applied to a set of 126 daily rain gauges covering the Sahel over the period 1950–1990. Only a few stations are found to be nonstationary when applying classical tests to the raw series, while the two GEV-based models converge to show that the extreme rainfall series did undergo a negative breakpoint around 1970. The study highlights the limits of the widely used classical stationarity tests for detecting trends in noisy series affected by sampling uncertainties, whereas using a parametric space- and time-dependent GEV efficiently reduces this effect. The other main result of this study is that the great Sahelian drought was accompanied by a significant decrease in extreme rainfall events.
 The evolution of extreme rainfall events has become a major concern in the past few years because global warming is expected to intensify the hydrological cycle and thus increase precipitation intensities, even in regions where the mean annual rainfall might decrease [Alpert, 2002; Trenberth et al., 2003; Emori and Brown, 2005; Held and Soden, 2006; O'Gorman and Schneider, 2009]. This prospect is clearly visible at global scale in general circulation model (GCM) and regional climate model simulations [Easterling et al., 2000; Allen and Ingram, 2002; Milly et al., 2002; Voss et al., 2002; Groisman et al., 2005; Kharin and Zwiers, 2005; Alexander et al., 2006; Sun et al., 2007; Min et al., 2011], but clear trends in that direction remain to be confirmed in observations, even though a few studies assert that they are already perceivable in some places [Dore, 2005; Zhang et al., 2007].
 Researchers looking for possible changes in the frequency and magnitude of extreme rainfall face the challenge of detecting a statistically significant trend or break in a process that is characterized by a large natural variability in space and time. The core of the problem is that few data sets span a period long enough to perform meaningful statistical tests of nonstationarity on series characterized by such a high time variability. Satellite observations are still too recent in that respect, which implies that the only relevant long-term observations are rain gauge series. This explains why most studies aiming at detecting trends in extreme precipitation events during the last century are based on the analysis of daily values recorded by rain gauges [see, e.g., Kunkel et al., 1999; Manton et al., 2001; Bocheva et al., 2009; Costa and Soares, 2009; Shahid, 2010; Begueria et al., 2011].
 However, point rainfall series have their own weakness when it comes to sampling the spatial variability properly: the strong spatial variability of extreme rainfall is another complicating factor as far as detecting trends is concerned. There is thus a need for methods that combine, optimally and robustly, the temporal and the spatial information provided by point rainfall series covering a climatic region. The goal of this paper is precisely to propose such a method and to illustrate its efficiency by applying it to a region, the West African Sahel, where rainfall variability is notoriously high at all scales. An extensive review of the literature shows that the scientific community is still lacking an integrated approach for characterizing the extreme rainfall distribution at regional scale.
 A first widely used approach [Frich et al., 2002; Easterling et al., 2003; Kiktev et al., 2003; Klein Tank and Können, 2003; Moberg and Jones, 2005; Alexander et al., 2006] to assess trends in extreme precipitation is to describe the evolution of simple rainfall indices (for instance, the number of rainy days over a fixed threshold or the evolution of the 95th percentile of the daily rainfall distribution). By incorporating most of the recorded heavy rainfall events, this approach deals fairly well with the issue of sampling effects; however, the resulting statistics are poorly informative regarding the evolution of high rainfall quantiles (typically 20 to 100 year, or larger, return period values), which are of greatest interest from both a climatological and a water resources management point of view.
 An alternative common approach is provided by the extreme value theory (EVT) [see Coles, 2001 for details]. In the EVT framework, extreme series are created by extracting maxima in predefined periods (block maxima analysis: BMA) or values exceeding a given threshold (peaks-over-threshold: POT). Analytical functions are specified to infer the distributions of the extreme values as for instance the widely used generalized extreme value (GEV) distributions (used in the BMA approach) or the Generalized Pareto (GP) distributions (used in the POT approach). One of the main assumptions of the EVT is the stationarity of extremes. The detection of nonstationarity in series within the EVT framework mainly consists in examining the validity of the stationarity assumption.
 To that purpose, the method that has long prevailed in the literature is the use of statistical stationarity tests, designed to detect either linear or breakpoint changes in the extreme event series [Robson et al., 1998; Haylock and Nicholls, 2000; Manton et al., 2001; Frich et al., 2002; Easterling et al., 2003; Klein Tank and Können, 2003; Moberg and Jones, 2005; Alexander et al., 2006; Aguilar et al., 2009; Rahimzadeh et al., 2009; Costa and Soares, 2009; Shahid, 2010; Guhathakurta et al., 2011; Chu et al., 2012]. Beyond that, only a few studies use regional approaches to test the stationarity of extreme precipitation; among them, we can cite Renard, Pujol et al., and Neppel et al.
 More recently, parametric approaches have been proposed based on extreme value distributions incorporating time-dependent parameters or time-varying covariates [Coles, 2001; Katz et al., 2002]. Comparing these nonstationary GEV [Re and Barros, 2009; Marty and Blanchet, 2011; Park et al., 2011; Seo et al., 2011] or nonstationary GP distributions [Re and Barros, 2009; Sugahara et al., 2009; Begueria et al., 2011] to their stationary counterpart is a way to assess the significance of the temporal trend in extreme rainfall distribution. The time-dependent distributions have mainly been applied to assess linear trends in extreme series, but, to our knowledge, no attempt has so far been made at adapting them to detect breakpoint changes.
 Among the most recent advances in the modeling of extreme events, original developments have been proposed to model the extreme events at regional scale by taking into account the spatial heterogeneities of the distributions by incorporating spatial covariates [see, e.g., Blanchet and Lehning, 2010; Panthou et al., 2012]. Panthou et al.  show that pooling individual point series in order to fit directly a regional model using a spatial covariate reduces significantly the impact of temporal sampling effects. Some recent studies proposed to model the spatial or spatiotemporal dependence of extremes, either using spatial latent processes [see, e.g., Cooley et al., 2007; Sang and Gelfand, 2009] or max-stable models [see, e.g., Padoan et al., 2010; Blanchet and Davison, 2011]. However, all these references develop and apply one single model. Among those modeling the evolution of extremes, there is no study which compares different possible approaches for describing the temporal evolution of extremes.
 This review of literature points to the need for studying how different approaches behave comparatively to each other. The present paper thus proposes a comparison between three methods: a classical pointwise stationarity test analysis, a pointwise time-dependent GEV model, and a regional time-dependent GEV model. Beyond the comparison itself, original methodological developments are proposed to detect both linear trends and breakpoints in the time series. The integration of these various analytical steps provides a statistically coherent way for a global analysis of extreme rainfall distribution at regional scale and for detecting changes in this distribution.
2 Region and Data
2.1 Sahelian Climatological Context
 The Sahel is a semiarid region in West Africa roughly delimited by latitudes 10°N and 17°N. The Sahelian rainfall regime is characterized by both a few well-organized patterns at large scales and a strong chaotic variability at smaller scales. The well-organized patterns result from the annual cycle being governed by a monsoon regime with a single rainy season alternating with a single dry season (Figure 1a). The length of the rainy season decreases from roughly 7 months in the South (April to October) to 3 months in the North (July to September). Associated with this gradient of the rainy season length is a negative South to North gradient of the interannual average rainfall on the order of 1 mm/km [Lebel et al., 1992]. Over the period considered here (1950–1990), the average annual rainfall ranges from around 1000 mm at 10°N to around 200 mm at 16°N (Figure 1b); in the Central Sahel, this gradient is in fact slightly tilted to the North-East. It is worth noting that in periods of lower rainfall, such as during the 1970s and 1980s, the gradient remained roughly unchanged, the isohyets being globally shifted to the South by about 180 mm [Lebel and Ali, 2009].
 Superimposed on these globally stable features, which are associated with large-scale forcing factors, is a strong year to year variability (Figure 1c) as well as a strong spatial variability, resulting from the convective nature of rainfall at the mesoscale. As a consequence, a given rainy season may display a rainfall pattern extremely different from the long-term average pattern, especially when looking at the mesoscale, as shown in Figure 1d.
 The standardized precipitation index of Figure 1c also displays a decadal scale signal. Decadal variability may at times be a rather fuzzy concept; in the Sahelian case, however, it refers to the long-lasting drought of the 1970s and the 1980s, which was still clearly perceptible over the 1990s, making it the most important climatic signal ever recorded in the world [Dai et al., 2004]. It has had dramatic impacts on populations, mainly because food production is almost exclusively provided by rainfed agriculture. Paradoxically, the region has also been affected by a significant increase in damage due to extreme hydrological events [Di Baldassarre et al., 2010]. It is thus particularly challenging to understand how the recent regional climate change in this region has affected the distribution of extreme rainfall events.
 While many studies have documented and analyzed the great Sahelian drought, showing that a statistically significant break occurred at the end of the 1960s, extreme rainfall has not received comparable attention. One exception is the work of New et al., who examined trends in indices of extreme daily precipitation and showed that a roughly equal proportion of the six West African stations used in their study was characterized by either increasing or decreasing trends. This preliminary study points to the fact that attempting to detect nonstationarity in extreme rainfall series might lead to more ambiguous results than when working on total annual rainfall series. This is primarily because the sampling distribution of extreme values is much more dispersed than the sampling distribution of annual totals. In this respect, a comprehensive regional vision is needed in order to decrease the effect of sampling dispersion by incorporating a larger set of observations.
2.2 Rainfall Data
 The core of the data used here comes from work undertaken under the umbrella of the Centre Inter-états d'Etudes Hydrauliques in the mid-1980s. This made it possible to pool the data from more than 700 stations since their starting date of operation. Much of West Africa was covered by this data set, albeit not in a homogeneous way, the coverage being less dense in Nigeria and Guinea, for instance. Complementary work by Le Barbé et al. extended a large number of series in time, so that the whole critical period 1950–1990 could be covered. Of course, there are gaps in the series, due either to operating problems of the gauges or to data having been lost or being impossible to retrieve. All in all, this led us to select for this research on extreme rainfall an area extending from 10°W to 5°E and 10°N to 15°N, corresponding to the Central Sahel as defined in Lebel and Ali: a total of 126 rain gauges with less than 2 years of missing data over the entire 1950–1990 period are available in this area (Figure 1b) [see Panthou et al., 2012 for a detailed justification of the data selection].
 Three steps are taken to detect if there is a significant break in the extreme rainfall distribution over the period 1950–1990:
 (1) Extracting a relevant sample of so-called extreme rainfall: the choice is made here to work on annual maxima in the framework of block maxima analysis;
 (2) Point statistical analysis: the stationarity of annual maxima series is assessed by applying a battery of seven selected statistical tests and by using the GEV distribution to represent annual maxima distributions;
 (3) Regional analysis: because point statistical analysis is very sensitive to the strong sampling effects inherent to extreme value distributions, a regional perspective has to be taken, which involves deriving appropriate methods.
 The rationale behind the choices summarized above is detailed below.
3.1 Extracting Extreme Rainfall Series - Block Maxima Framework
 The series of extreme daily rainfall are extracted through the block maxima procedure [Coles, 2001]. Let X1,...,Xk be a sequence of k independent and identically distributed variables. The block maxima approach consists of defining blocks of n observations and taking the maximum within each block. This yields a vector Z of N maxima (here N=41):
$$ Z = (Z_1, \ldots, Z_N), \qquad Z_i = \max\{X_{(i-1)n+1}, \ldots, X_{in}\}, \quad i = 1, \ldots, N \tag{1} $$
 In this study, each block contains 1 year of data (n=365 observations). For each of the 126 stations, a series of 41 annual maximum daily rainfall is extracted from the daily data.
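 The extraction step can be illustrated with a minimal sketch, assuming the daily record of one station is available as (year, rainfall) pairs. The function name `annual_maxima` and the toy values are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def annual_maxima(daily: List[Tuple[int, float]]) -> Dict[int, float]:
    """Return {year: maximum daily rainfall in mm} from (year, rainfall_mm) records.

    Each calendar year is one block, as in the block maxima framework.
    """
    maxima: Dict[int, float] = {}
    for year, rain in daily:
        if year not in maxima or rain > maxima[year]:
            maxima[year] = rain
    return maxima


# Hypothetical toy records (year, daily rainfall in mm), not data from the study.
records = [(1950, 0.0), (1950, 42.5), (1950, 12.0),
           (1951, 7.5), (1951, 61.0), (1952, 33.0)]
z = annual_maxima(records)
print(z)
```

 Applied to a complete 1950–1990 record, such a function would return the 41 annual maxima used for one station.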
3.2 Analysis of Extreme Point Rainfall Series
3.2.1 Testing Stationarity Versus Nonstationarity
 Statistical tests of stationarity are used here to detect the possible existence of a linear trend or of a breakpoint in our time series. A test compares the null hypothesis H0 “the series is stationary” to the alternative hypothesis H1 “the series has a linear trend and/or a breakpoint.” It returns a significance level, which is the risk of rejecting the null hypothesis incorrectly (the error of the first kind). For the present study, seven tests have been chosen (see section A for a brief description of the different tests): four of them test for a linear trend, two are designed to detect a breakpoint, and one is able to detect either a linear trend or a breakpoint. All these tests are nonparametric, except for the Pearson test, meaning that they make no assumptions about the distribution of the series tested.
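 As an illustration of a nonparametric breakpoint test, the sketch below implements the Pettitt test, which is used later in the paper, in its textbook form with the usual approximate significance formula. It is a minimal sketch and not necessarily the exact implementation used by the authors; the function name and the toy series are hypothetical.

```python
import math
from typing import List, Tuple


def pettitt(x: List[float]) -> Tuple[int, float, float]:
    """Pettitt change-point test.

    Returns (index of the last value before the most likely break,
             K statistic, approximate two-sided p-value).
    """
    n = len(x)

    def sign(v: float) -> int:
        return (v > 0) - (v < 0)

    u = 0
    best_index, best_k = 0, 0.0
    for t in range(n - 1):
        # Recursion: U_t = U_{t-1} + sum_j sign(x_t - x_j)
        u += sum(sign(x[t] - x[j]) for j in range(n))
        if abs(u) > best_k:
            best_k, best_index = float(abs(u)), t
    # Standard approximation of the significance of K
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_index, best_k, p


# Hypothetical toy series with a clear downward shift after the fifth value.
series = [10, 11, 12, 13, 14, 1, 2, 3, 4, 5]
index, k, p_value = pettitt(series)
print(index, k, round(p_value, 4))
```

 With only ten values, even a perfect separation of the two halves does not reach the 5% level here, which gives a small-scale picture of how short or noisy series limit the power of such tests.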
3.2.2 GEV Models
 There are at least two reasons for fitting statistical models to the empirical series, and in the case of annual maxima, a GEV (generalized extreme value) distribution: (i) it allows providing theoretical estimates of return period rainfall, and (ii) it opens the way for looking at the stationarity versus nonstationarity issue by comparing the fitting performances of a stationary GEV model versus that of a nonstationary GEV model.
3.2.2.1 Pointwise Stationary GEV (PGEV)
 If the sample size is large enough, then the appropriate model to describe the block maxima variable is the GEV distribution [Coles, 2001]:
$$ F(z;\mu,\sigma,\xi) = \exp\left\{-\left[1+\xi\,\frac{z-\mu}{\sigma}\right]^{-1/\xi}\right\}, \qquad 1+\xi\,\frac{z-\mu}{\sigma} > 0 \tag{2} $$
with μ being the location parameter, σ>0 the scale parameter, and ξ the shape parameter. The shape parameter describes the behavior of the distribution tail: a positive (respectively, negative) shape corresponds to a heavy-tailed (respectively, bounded) distribution. When ξ is equal to zero, the GEV reduces to the Gumbel distribution (light-tailed distribution):
$$ F(z;\mu,\sigma) = \exp\left\{-\exp\left[-\frac{z-\mu}{\sigma}\right]\right\} \tag{3} $$
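 A minimal sketch of the GEV distribution function and of the associated return level (the quantile exceeded on average once every T years) may help fix ideas. It follows the standard formulas of Coles [2001]; the function names and the parameter values in the example are hypothetical.

```python
import math


def gev_cdf(z: float, mu: float, sigma: float, xi: float) -> float:
    """GEV cumulative distribution function (Gumbel when xi is zero)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    y = (z - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-y))
    t = 1.0 + xi * y
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(T: float, mu: float, sigma: float, xi: float) -> float:
    """Value exceeded on average once every T blocks (here, years)."""
    yp = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(yp)
    return mu - (sigma / xi) * (1.0 - yp ** (-xi))


# Hypothetical parameters (mm), chosen only for illustration.
z20 = gev_return_level(20, mu=50.0, sigma=15.0, xi=0.1)
print(z20, gev_cdf(z20, 50.0, 15.0, 0.1))
```

 The round trip in the last line returns a non-exceedance probability of 0.95, that is, 1 − 1/T for T = 20 years.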
 GEV models require that the underlying random variables X1,...,Xk are independent, or at least short-term dependent (i.e., that they satisfy Leadbetter's D-condition [Leadbetter, 1974]). The short-term dependence of daily rainfall was verified by Ali et al. and Gerbaux et al. The independence of block maxima is ensured by working on annual daily rainfall maxima.
 In the stationary GEV version, the parameters μ, σ, and ξ are assumed to be constant:
$$ Z \sim \mathrm{GEV}(\mu,\sigma,\xi) \tag{4} $$
 A stationary GEV model is fitted independently to each of the 126 maxima series by maximizing the log-likelihood function l(θ) [more convenient in practice than maximizing the likelihood function L(θ)] [Coles, 2001]:
$$ l(\theta) = \sum_{i=1}^{N} \log g(z_i;\theta) \tag{5} $$
where θ denotes the vector of the three GEV parameters (μ,σ,ξ) and g is the GEV density function.
 For a given station, the pointwise stationary GEV is referred to as PGEV, lPGEV denotes the associated log-likelihood value.
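 The log-likelihood of a stationary GEV can be written in a few lines. The sketch below only evaluates l(θ) for given parameters; the maximization itself is left to a numerical optimizer. The function names, the sample of maxima, and the parameter values are hypothetical.

```python
import math
from typing import Iterable


def gev_logpdf(z: float, mu: float, sigma: float, xi: float) -> float:
    """Log of the GEV density g(z; mu, sigma, xi); -inf outside the support."""
    if sigma <= 0:
        return -math.inf
    y = (z - mu) / sigma
    if abs(xi) < 1e-12:  # Gumbel limit
        return -math.log(sigma) - y - math.exp(-y)
    t = 1.0 + xi * y
    if t <= 0:
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)


def gev_loglik(maxima: Iterable[float], mu: float, sigma: float, xi: float) -> float:
    """Stationary GEV log-likelihood: sum of log densities over the maxima."""
    return sum(gev_logpdf(z, mu, sigma, xi) for z in maxima)


# Hypothetical annual maxima (mm) and two candidate parameter sets.
sample = [48.0, 61.5, 39.0, 72.0, 55.5, 44.0, 90.0, 51.0]
print(gev_loglik(sample, mu=50.0, sigma=15.0, xi=0.0))
print(gev_loglik(sample, mu=80.0, sigma=15.0, xi=0.0))
```

 The first parameter set, whose location is close to the bulk of the sample, yields the larger log-likelihood, which is the criterion the fitting procedure seeks to maximize.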
3.2.2.2 Pointwise Nonstationary GEV (PNSGEV)
 The GEV distribution can be made time-dependent by incorporating a time covariate in the GEV parameters. This has two main advantages: (i) it allows a direct parameterization of the time dependency, if the latter does exist, and (ii) it provides an alternative way of testing nonstationarity. It is assumed here that only the location parameter μ varies over time:
$$ Z(t) \sim \mathrm{GEV}(\mu(t),\sigma,\xi) \tag{6} $$
 There are two main reasons for limiting the time dependency to the location parameter: (i) it has by far the smallest sampling variance, which means that statistically testing the time dependency will provide much more robust results for the location parameter than for the two other parameters; (ii) previous studies [e.g., Le Barbé and Lebel, 1997; Le Barbé et al., 2002; Balme et al., 2006] stressed that a key pattern of the 1970s–1980s drought was the diminution in rainfall occurrence rather than in rainfall intensity: according to the renewal theory [Cox, 1962], this suggests that the location parameter of the GEV distribution should have been the most affected by this change in the rainfall regime.
 The existence of a linear trend and that of a change point are considered in turn as possible sources of the time evolution of μ(t):
 - Linear trend:
 This formulation is the most widely used in the literature [Re and Barros, 2009; Marty and Blanchet, 2011; Park et al., 2011; Seo et al., 2011]. It is written as follows:
$$ \mu(t) = \mu_0 + \mu_1 t \tag{7} $$
 - Breakpoint:
 The location parameter is modeled by a breakpoint at time t0 through the following formulation:
$$ \mu(t) = \begin{cases} \mu_0 & \text{if } t \le t_0 \\ \mu_0 + \mu_1 & \text{if } t > t_0 \end{cases} \tag{8} $$
 The method for determining t0 is heuristic: it consists in comparing the respective performances of all the possible breakpoint models. Theoretically, N−1 breakpoint models can be fitted by moving t0 between 1 and N−1. In practice, to avoid border effects, the first five and the last five t0 values are not considered, which leaves N−11 change-point models to be tested.
 Finally, N−10 PNSGEV models (1 linear trend model and N−11 change-point models) are thus fitted by maximizing the log-likelihood function l(θ):
$$ l(\theta) = \sum_{i=1}^{N} \log g(z_i;\mu(t_i),\sigma,\xi) \tag{9} $$
 The best PNSGEV (among the N−10 tested), hereafter referred to as PNSGEV*, is the one yielding the highest log-likelihood value (noted lPNSGEV*).
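 The enumeration of candidate models can be sketched as follows. The break year is taken here as the last year of the first regime, and the two-level location is parameterized by its values before and after the break; both conventions, like the function names, are assumptions made for illustration.

```python
from typing import List


def location_linear(t: float, mu0: float, mu1: float) -> float:
    """Location parameter under a linear trend."""
    return mu0 + mu1 * t


def location_break(t: float, t0: float, mu_before: float, mu_after: float) -> float:
    """Two-level location parameter with a break after time t0."""
    return mu_before if t <= t0 else mu_after


def candidate_break_years(first_year: int, last_year: int, margin: int = 5) -> List[int]:
    """Break years kept after discarding the first and last `margin` candidates.

    With N years there are N-1 possible break positions (first_year to
    last_year - 1); removing `margin` at each end leaves N - 1 - 2*margin.
    """
    return list(range(first_year + margin, last_year - margin))


years = candidate_break_years(1950, 1990)
n_models = len(years) + 1  # breakpoint models plus one linear trend model
print(years[0], years[-1], len(years), n_models)
```

 For the 1950–1990 period (N=41), this gives break years from 1955 to 1984, that is, N−11 = 30 breakpoint models and N−10 = 31 models in total once the linear trend model is added.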
3.2.3 Stationarity Testing in the Framework of PNSGEV Modeling
 Keeping PGEV as the distribution of maxima means that the stationary hypothesis is accepted; conversely, if a PNSGEV distribution is found to be significantly better, then the stationary hypothesis is rejected. Since the time covariate in PNSGEV* adds a degree of freedom, lPNSGEV* is necessarily at least as large as lPGEV. The significance of the contribution of PNSGEV* compared to that of PGEV is evaluated through the likelihood ratio test [Coles, 2001], whose statistic is defined as follows:
$$ D = 2\,\left(l_{PNSGEV*} - l_{PGEV}\right) \tag{10} $$
 Since the series of maxima are not autocorrelated and since PGEV is nested in PNSGEV*, this ratio follows a chi-square distribution with 1 degree of freedom which is used to accept or reject the stationary hypothesis.
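 The test can be sketched with the standard library only, using the fact that the upper tail of a chi-square distribution with 1 degree of freedom can be written with the complementary error function. The function name and the two log-likelihood values are hypothetical.

```python
import math
from typing import Tuple


def likelihood_ratio_test(l_stationary: float, l_nonstationary: float) -> Tuple[float, float]:
    """Return (deviance D, p-value) for one additional parameter.

    For a chi-square variable with 1 degree of freedom,
    P(X > d) = erfc(sqrt(d / 2)).
    """
    d = max(2.0 * (l_nonstationary - l_stationary), 0.0)
    p = math.erfc(math.sqrt(d / 2.0))
    return d, p


# Hypothetical log-likelihood values for one station.
d, p = likelihood_ratio_test(l_stationary=-170.0, l_nonstationary=-168.0)
print(d, round(p, 4))
```

 In this example a gain of 2 units of log-likelihood gives D = 4, which exceeds the 5% critical value of about 3.84, so stationarity would be rejected at that level.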
3.3 Regional GEV Models
Panthou et al.  recently proposed a regional statistical model of extreme rainfall over West Africa. The principle of the model is to gather all station data in one unique sample and to fit the GEV parameters by incorporating spatial covariates. One way to deal with time nonstationarity is then to add a time covariate to the spatial covariates of the regional GEV models. Stationarity can then be tested globally at the regional scale by comparing performance of the reference regional stationary model with that of its time-dependent counterpart. As the GEV models incorporate a spatial covariate, Z becomes Zj, where Zj is the annual maximum daily precipitation series at station j. The spatial covariate at station j will be denoted sj.
3.3.1 Regional Stationary GEV Model (RGEV)
 The regional GEV includes spatial covariates to describe the spatial pattern of the GEV parameters. Among the different spatial covariates tested, Panthou et al. have shown that the mean annual rainfall (computed over the whole period 1950–1990 and represented in Figure 1b) was the most appropriate to represent the pattern of both the location (μ) and the scale (σ) parameters over the study area, ξ being assumed uniform:
$$ Z(s) \sim \mathrm{GEV}(\mu(s),\sigma(s),\xi) \tag{11} $$
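As in Panthou et al., μ and σ are linear functions of s:
$$ \mu(s) = \mu_0 + \mu_1 s, \qquad \sigma(s) = \sigma_0 + \sigma_1 s \tag{12} $$
 The stationary regional GEV model, noted RGEV, is fitted over the whole study period by maximizing the log-likelihood function, where the sum runs over the J stations and the N years:
$$ l(\theta) = \sum_{j=1}^{J} \sum_{i=1}^{N} \log g\left(z_{i,j};\mu(s_j),\sigma(s_j),\xi\right) \tag{13} $$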
 Figure 2 shows the maps of the two RGEV parameters, μ(s), σ(s). Since they are conditioned by the same spatial covariate, their pattern is similar.
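 The pooling principle of the regional model can be sketched as a single log-likelihood summed over all stations, with the location and scale parameters computed from each station's covariate. The station names, covariate values, and coefficients below are hypothetical, and the sketch treats the stations as if they were independent when summing.

```python
import math
from typing import Dict, List


def gev_logpdf(z: float, mu: float, sigma: float, xi: float) -> float:
    """Log of the GEV density; -inf outside the support."""
    y = (z - mu) / sigma
    if abs(xi) < 1e-12:  # Gumbel limit
        return -math.log(sigma) - y - math.exp(-y)
    t = 1.0 + xi * y
    if t <= 0:
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)


def rgev_loglik(maxima: Dict[str, List[float]], covariate: Dict[str, float],
                mu0: float, mu1: float, sigma0: float, sigma1: float,
                xi: float) -> float:
    """Regional log-likelihood with mu(s) = mu0 + mu1*s and sigma(s) = sigma0 + sigma1*s."""
    total = 0.0
    for station, series in maxima.items():
        s = covariate[station]  # e.g. mean annual rainfall (mm) at the station
        mu = mu0 + mu1 * s
        sigma = sigma0 + sigma1 * s
        if sigma <= 0:
            return -math.inf
        total += sum(gev_logpdf(z, mu, sigma, xi) for z in series)
    return total


# Hypothetical two-station example.
maxima = {"A": [50.0], "B": [70.0]}
covariate = {"A": 500.0, "B": 700.0}
print(rgev_loglik(maxima, covariate, mu0=0.0, mu1=0.1, sigma0=15.0, sigma1=0.0, xi=0.0))
```

 Only five parameters are estimated for the whole region, instead of three per station, which is how the regional model limits the impact of sampling effects.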
3.3.2 Regional Nonstationary GEV (RNSGEV)
 Temporal nonstationarity is introduced in the regional GEV model by making the location parameter dependent on a time covariate. As in the pointwise GEV-based approach, this time dependency of the location parameter can take the form of either a linear trend or a breakpoint. Following the same rules as for the point models (see section 3.2.2), N−11 breakpoint models and one trend model are tested. In order to find the best time-dependent regional model, the procedure described in section 3.2.2 for the point model is adapted as follows:
where s denotes the spatial covariate (mean annual rainfall). The different nonstationary models are the following:
 -Linear trend formulation
 -Breakpoint formulation
 The scale parameter is stationary in time and linearly depends on the spatial covariate s:
 The log-likelihood function becomes the following:
 Among the N−10 RNSGEV models that are tested (N−11 breakpoint models + 1 linear trend model), the one having the largest log-likelihood is considered as the optimal RNSGEV model (denoted RNSGEV* in the following); the performance of RNSGEV* is then compared to the performance of its stationary counterpart (RGEV).
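 The regional time-dependent model can be written in the same notation as the pointwise case. The following is only one plausible reading, assuming that the time term is simply added to the spatial term of the location parameter; this additive form and the coefficient name μ2 are assumptions and are not taken from the text.

```latex
\begin{align*}
Z(s,t) &\sim \mathrm{GEV}\big(\mu(s,t),\,\sigma(s),\,\xi\big) \\
\mu(s,t) &= \mu_0 + \mu_1 s + \mu_2 t
  && \text{(linear trend)} \\
\mu(s,t) &= \mu_0 + \mu_1 s + \mu_2\,\mathbf{1}\{t > t_0\}
  && \text{(breakpoint at } t_0\text{)} \\
\sigma(s) &= \sigma_0 + \sigma_1 s \\
l(\theta) &= \sum_{j=1}^{J}\sum_{i=1}^{N}
  \log g\big(z_{i,j};\,\mu(s_j,t_i),\,\sigma(s_j),\,\xi\big)
\end{align*}
```

 Under this reading, setting μ2 = 0 recovers RGEV, which is what makes RGEV nested in each RNSGEV model.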
3.3.3 Model Selection: Permutation Procedure
 As for the point models, RGEV is nested in RNSGEV*. However, for the regional models, the computation of the log-likelihood ratio is not meaningful because of the spatial dependency between the point series constituting the regional sample. The assessment of how significant RNSGEV* is compared to a simple RGEV model thus has to rely on a different approach. The alternative method proposed here is to assess this significance by estimating the probability of obtaining the observed log-likelihood value. This is done by using a permutation resampling procedure [Kundzewicz and Robson, 2004], which makes it possible to determine the Empirical Cumulative Distribution Function (ECDF) of the log-likelihood of the RNSGEV* model. This consists in the following:
 (1) For m=1 to M, creating a regional sample ψm by randomly swapping all the years simultaneously for all maxima series. This conserves the spatial dependency of the observations within each year;
 (2) Fitting the N−10 RNSGEV models to this sample ψm;
 (3) Choosing the best model among these N−10 RNSGEV models: RNSGEV*(m);
 (4) Storing the log-likelihood of RNSGEV*(m) in a vector of length M;
 (5) Going back to step (1) until the number of created series M is sufficient to provide a robust estimate of the log-likelihood ECDF; here M=10,000 was shown to fulfill this requirement;
 (6) Evaluating the probability of the log-likelihood corresponding to the observed RNSGEV* from the log-likelihood ECDF.
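 The steps above can be sketched as follows. To keep the example short and runnable, the fit of the N−10 RNSGEV models is replaced by a simple stand-in statistic (the largest absolute difference between the regional means before and after a candidate break); this substitution, the function names, and the toy data are assumptions made for illustration only.

```python
import random
from typing import Callable, Dict, List

Series = Dict[str, List[float]]


def best_break_statistic(maxima: Series, margin: int = 5) -> float:
    """Stand-in for the log-likelihood of the best time-dependent model."""
    n = len(next(iter(maxima.values())))
    yearly_mean = [sum(z[i] for z in maxima.values()) / len(maxima) for i in range(n)]
    best = 0.0
    # k = number of years in the first regime; the first and last
    # `margin` candidate positions are skipped to avoid border effects.
    for k in range(margin + 1, n - margin):
        before = sum(yearly_mean[:k]) / k
        after = sum(yearly_mean[k:]) / (n - k)
        best = max(best, abs(before - after))
    return best


def permutation_pvalue(maxima: Series, statistic: Callable[[Series], float],
                       n_perm: int = 500, seed: int = 0) -> float:
    """Probability of a statistic at least as large as the observed one under permutation."""
    rng = random.Random(seed)
    n = len(next(iter(maxima.values())))
    observed = statistic(maxima)
    count = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # The same permutation of years is applied to every station,
        # which preserves the spatial dependency within each year.
        sample = {st: [z[i] for i in order] for st, z in maxima.items()}
        if statistic(sample) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Hypothetical toy region: two stations, 20 years, a drop after year 10.
high = [60, 62, 58, 61, 59, 63, 57, 60, 61, 59]
low = [40, 42, 38, 41, 39, 43, 37, 40, 41, 39]
region = {"A": [float(v) for v in high + low],
          "B": [float(v + 5) for v in high + low]}
observed = best_break_statistic(region)
p_value = permutation_pvalue(region, best_break_statistic)
print(observed, p_value)
```

 The selection of the best model is repeated inside every permutation, so the selection effect is reflected in the reference distribution. With a full GEV fit in place of the stand-in statistic, each permutation requires N−10 optimizations.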
 This permutation resampling procedure is an efficient method to assess the significance of the RNSGEV* performance; however, it is in practice relatively time-consuming: almost 1 month of CPU time was necessary to complete the total resampling procedure on an eight-processor (1.86 GHz) computer. For all statistical models proposed in this study, the calibration has been achieved with the package Introduction to Statistical Modeling of Extreme Values [ISMEV, Heffernan et al., 2013] implemented in the R environment [R-Development-Core-Team, 2009]. In order to limit the technical problems of convergence failure and local maxima detection, a numerical protocol has been defined which compares four different optimization procedures (Nelder-Mead, Broyden-Fletcher-Goldfarb-Shanno, Conjugate Gradient, Limited-memory Broyden-Fletcher-Goldfarb-Shanno Box-constraints) fed with five initial values for each distribution parameter. No convergence failure has been detected for our applications.
4.1 Statistical Tests of Stationarity on Observed Series
 As a preliminary step, the seven stationarity tests described in section A are applied to the 126 series of cumulative annual rainfall. In line with results from previous studies on this issue [see, e.g., Hubert and Carbonnel, 1987; Hubert et al., 1989], all the tests consistently detect an overwhelming majority of stations (90% or more) as displaying nonstationarity (empty circles) over the 1950–1990 period (Figure 3a) at a 5% significance level. A focus on the results of the Pettitt test is displayed in Figure 3b. This test is one of the most widely used in the literature to detect breakpoints, especially because it can locate the likely date of the break. In our case, 115 out of 126 series display a negative breakpoint, occurring between the years 1962 and 1976 with a marked mode in 1968–1970. Apart from confirming the findings of the previous studies cited above, this brief comparison between the seven tests shows that in a situation where nonstationarity is clearly the dominant behavior, all tests perform very similarly: out of the 126 series, a subset of 117 are assessed as being nonstationary by all the trend-detection tests, and 113 stations are rejected by all tests (trend + breakpoint). Another comment is that the existence of a breakpoint often leads the trend tests to reject the stationarity hypothesis [Xiong and Guo, 2004]. This could be interpreted as pointing to the existence of a trend, which is obviously not the most appropriate conclusion. Thus, using trend tests alone may prove to be misleading, and it is advisable to use trend tests in conjunction with breakpoint tests.
 The seven tests are then applied to the series of annual maximum daily rainfall. As in Figure 3a, the maps of Figure 4a show the stations for which the hypothesis of stationarity H0 is accepted or rejected at the 5% significance level. Here again, even though the balance of accepted/rejected series (75%/25%) is very different from what it is for the annual rainfall series (10%/90%), all the tests provide very similar results, which provides some robustness to our findings. The 20 to 25% nonstationary stations are randomly distributed over the whole study area, indicating that there is no subregion more subject to nonstationarity than another. The percentage of stations detected as being nonstationary (more than 20% for each test) is larger than the level of significance (5%). If the 126 series were statistically independent, then this would provide some ground to conclude that there is a significant nonstationarity in the region; however, given the important spatial autocorrelation of daily rainfall at the scale covered by the study area, it is not possible to conclude on the significance of these statistics.
 Additional interesting information is provided by the distribution of the 23 break dates obtained from the Pettitt test (Figure 4b); the breaks mostly occur at the end of the 1960s, in line with the dates found for the annual rainfall series. This suggests that there might be a link between a possible regional scale break in annual maxima series and the break found in total annual rainfall series. Since no clear conclusion valid at the regional scale can be drawn from applying nonparametric statistical tests to the point series of annual maxima, the issue of nonstationarity will be further explored by fitting stationary and nonstationary GEV models and comparing their respective performances in terms of likelihood.
4.2 Pointwise Time-Dependent GEV Models
 Figure 5 shows the results of the likelihood ratio test applied to the comparison of pointwise nonstationary GEV (PNSGEV*) distributions to their stationary counterpart (PGEV) for the 126 maxima series. The map of Figure 5a displays the stations for which the likelihood ratio test indicates a significant break at the 5% level; it indicates that in this parametric context, the stationarity hypothesis H0 is rejected for a much larger number of stations (81 stations out of 126) than when working with nonparametric tests. Over the whole network, these 81 rejections correspond to 74 negative breakpoints, seven positive breakpoints, and no linear trend.
 In Figure 5b, the distribution of the dates of negative (left) and positive (right) breaks are reported. The seven positive breakpoints are distributed around 1960, 1974, and 1982 and thus do not display any consistent pattern. The distribution of the negative break dates extends from 1956 to 1984 and has a concomitant median, mode, and mean located in year 1970 with 50% of the stations having a break between 1962 and 1978. A secondary mode also appears in the first years of the 1980s, reflecting the influence of years 1983 and 1984 which have been the driest years during the great Sahelian drought. All this suggests that the break in annual rainfall that has occurred at the end of the 1960s has been accompanied by a significant negative break in rainfall extreme series. The fact that the break dates distribution for maxima series is more dispersed than when dealing with annual rainfall relates to the larger sampling variance of rainfall maxima as compared to that of annual rainfall. This higher sampling variability manifests itself in two ways. First, extreme daily rainfall may occur at places on a globally dry year; conversely, on a wet year, heavy daily rainfall is not guaranteed since a large annual total may simply arise from a greater number of rainy days. Second, the statistical detection of breakpoints is sensitive to both the length of the available series and to the position of the break in the series. These two effects are illustrated in Figure 6.
 The Niamey station provides the longest record in our database, from 1904 to 2010. When applied to the annual rainfall series (Figure 6a), all statistical tests detect a negative breakpoint between 1969 and 1970, whatever the length of the series (1950–1990 period, left panel, or 1904–2010, right panel). Figure 6b shows the Niamey annual maxima series. On the left panel, the PNSGEV models are applied to the 1950–1990 period; on the right panel, to the whole available series (1904–2010). The most likely break date (PNSGEV* model) is exactly the same in both cases, suggesting that the break in 1979 is marked enough not to be influenced by the length of the series. One will note, however, a slight difference in the location parameter after the break, mainly due to high values over the recent period (1990–2010). Applying a similar analysis to the Sagabari station (whole period of record from 1950 to 2000) leads to different results (Figure 6c). The break detection is highly sensitive to the sampling effect associated with the length of the series: the date of the break changes from 1983 when working on the 1950–1990 series to 1974 when working on the longer 1950–2000 series. The PNSGEV* parameters are also very different.
 While identifying a breakpoint in an individual maxima series may be of interest in the case of an isolated local case study, the sampling effects illustrated through the two examples above are an incentive to explore whether a global approach—which is not dealing with each series separately—could provide a more robust and synthetic vision of nonstationarity at regional scale.
4.3 Regional Time-Dependent GEV Models
 Following the methodology presented in section 3.3, testing stationarity versus nonstationarity at regional scale can be done by systematically exploring possible nonstationary models and comparing their likelihood to the likelihood of the stationary model (RGEV) obtained by maximizing equation (13). A total of 31 nonstationary models (RNSGEV) are fitted here: one with a linear time trend [equation (15)] and 30 breakpoint models, each corresponding to a possible year (1955 to 1984) for the break t0 [equation (16)].
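 As a rough illustration of this systematic exploration, the sketch below profiles the negative log-likelihood of a step-change GEV over a set of candidate break years. It is a simplification and not the RNSGEV model itself: it pools all values into one sample with a single location parameter and ignores the spatial covariate of equation (13), which is not reproduced in this section. The function names (`step_gev_nll`, `profile_break_years`), the starting values, and the Nelder-Mead optimizer are assumptions made for the example. Note that SciPy's `genextreme` uses a shape parameter `c` whose sign is opposite to the usual GEV shape parameter.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme


def step_gev_nll(params, x, after):
    """Negative log-likelihood of a GEV whose location shifts by `delta` after the break."""
    mu0, delta, log_sigma, c = params
    loc = mu0 + delta * after                      # `after` is 0 up to t0 and 1 afterwards
    nll = -np.sum(genextreme.logpdf(x, c, loc=loc, scale=np.exp(log_sigma)))
    return nll if np.isfinite(nll) else 1e12       # penalize parameters outside the support


def profile_break_years(years, maxima, candidates):
    """Return {t0: minimized negative log-likelihood} for each candidate break year t0."""
    years = np.asarray(years)
    maxima = np.asarray(maxima, dtype=float)
    sigma0 = np.std(maxima) * np.sqrt(6.0) / np.pi  # Gumbel moment estimate of the scale
    profile = {}
    for t0 in candidates:
        after = (years > t0).astype(float)
        before_mean = maxima[after == 0].mean()
        after_mean = maxima[after == 1].mean()
        # Start from a Gumbel-like guess (c = 0) with the observed shift in means.
        start = [before_mean - 0.5772 * sigma0, after_mean - before_mean,
                 np.log(sigma0), 0.0]
        fit = minimize(step_gev_nll, start, args=(maxima, after), method="Nelder-Mead")
        profile[t0] = fit.fun
    return profile
```

 The candidate year with the smallest value in the returned dictionary plays the role of the most likely break date; a linear-trend variant would replace `delta * after` with a slope multiplied by time.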
 Figure 7 shows the negative log-likelihood values obtained with the RNSGEV linear trend model and the 30 RNSGEV breakpoint models. The value of the negative log-likelihood of RGEV is also reported in the figure. It is larger than all the values of the RNSGEV models, indicating a smaller likelihood. However, as explained previously, no direct conclusion can be drawn from this, since the RNSGEV models have an additional parameter as compared to the RGEV. On the other hand, it is meaningful to intercompare the negative log-likelihood values of the various RNSGEV models. The function has a clear minimum in 1969–1970 lower than the negative log-likelihood of the linear trend model. This indicates that if there does exist a breakpoint, then it is located at that date.
 The best RNSGEV (RNSGEV*) is compared to RGEV by using Quantile-Quantile plots (QQ-plots). Since the location parameter μ is modeled in RNSGEV* as a process with two constant levels and a change point in 1969, two nonoverlapping QQ-plots for the periods 1950–1969 and 1970–1990 are reported in Figure 8. The QQ-plots summarize the regional information by grouping all the stations together. The bulk of the distribution is better fitted by RNSGEV* than by RGEV, which seems to underestimate the extremes over the period 1950–1969 and overestimate them over 1970–1990.
 The significance of the break detected in 1969–1970, as opposed to the series being stationary, is evaluated by the permutation resampling procedure described in section 3.3.3. Figure 9 shows the empirical probability distribution of the negative log-likelihood values associated with the best RNSGEV models RNSGEV*(m), m = 1, ..., 10,000. Out of the 10,000 RNSGEV*(m), only one performs better than the RNSGEV* fitted to the observed series. The resampling procedure is tantamount to assuming that all the observed values are drawn from a stationary process. The fact that only 0.01% of the series generated under this assumption have a negative log-likelihood smaller than the one computed for the observed series means that the breakpoint identified in the observed series is extremely significant; in other words, there is about a 0.01% chance of obtaining such an observed series if it were drawn from a stationary process. This is confirmed when using information criteria such as the Akaike information criterion [Akaike, 1973], the Bayesian Information Criterion [Schwarz, 1978], or the Takeuchi Information Criterion [Takeuchi, 1976], all of them showing values about 100 to 300 units lower for RNSGEV* than for RGEV.
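 The logic of this resampling test can be sketched as follows. The details of section 3.3.3 are not reproduced here, so the sketch assumes that whole years are permuted jointly for all stations (which keeps the spatial structure of each year intact); `best_nll` is a hypothetical callable that refits all candidate nonstationary models to a sample and returns the smallest negative log-likelihood.

```python
import numpy as np


def permutation_p_value(years, maxima, best_nll, n_perm=10_000, seed=0):
    """Fraction of year-shuffled samples whose best model beats the observed best model.

    `best_nll(years, maxima)` is a user-supplied function (hypothetical here) returning
    the smallest negative log-likelihood over all candidate nonstationary models.
    """
    rng = np.random.default_rng(seed)
    years = np.asarray(years)
    # `index` maps every observation to the position of its year in `unique_years`.
    unique_years, index = np.unique(years, return_inverse=True)
    observed = best_nll(years, maxima)
    n_better = 0
    for _ in range(n_perm):
        # Relabel the years: all values recorded in the same year move together.
        shuffled_years = rng.permutation(unique_years)[index]
        if best_nll(shuffled_years, maxima) < observed:
            n_better += 1
    return n_better / n_perm
```

 With this convention, one better-fitting permutation out of 10,000 gives a returned value of 0.0001, that is, 0.01%.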
 Combining the fact that our observed series is highly likely to arise from a nonstationary process with the fact that the nonstationarity of highest significance was found to be a breakpoint in 1969 provides strong support for the existence of a regional-scale breakpoint in Sahelian extreme daily rainfall in 1969. This rupture is concomitant with the one known to have occurred in annual rainfall series.
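 For reference, two of the information criteria cited in this section can be computed directly from a minimized negative log-likelihood, as in the minimal sketch below. The function names are illustrative; the Takeuchi criterion is left out because its penalty term requires derivatives of the log-likelihood that are not available from the negative log-likelihood value alone.

```python
import math


def aic(nll, n_params):
    """Akaike information criterion from a minimized negative log-likelihood."""
    return 2.0 * n_params + 2.0 * nll


def bic(nll, n_params, n_obs):
    """Bayesian information criterion; n_obs is the number of observations in the sample."""
    return n_params * math.log(n_obs) + 2.0 * nll
```

 Lower values indicate the preferred model, so a breakpoint model is favored when the drop in twice the negative log-likelihood outweighs the penalty for its additional parameter.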
4.4 Sampling Effects on Local Series
 As presented before, the most likely nonstationarity at the regional scale is a break in 1969. Here we look at the impact of this regional break on local series by imposing t0=1969 in the PNSGEV model. In doing so, the sampling effect due to the break date estimation is removed. The underlying assumption is that all stations were affected by the same phenomenon at the end of the 1960s. The remaining sampling effect is thus due to possible outliers and to high (low) values occurring in local series even in a dry (wet) period.
 Figure 10 illustrates this effect by mapping the relative difference of the GEV location parameter between the 1950–1969 and 1970–1990 periods. Figure 10a shows the relative difference obtained for the RNSGEV* model. The location parameter difference is 5.3 mm (μ0−μ1=−5.3 mm), which leads to a relative difference ranging from 7% in the South-West to 12% in the North-East. Figure 10b illustrates the relative difference in the location parameter obtained with the PNSGEV models having a break in 1969 (t0=1969). The relative differences are computed at each station and then interpolated by ordinary kriging. The North-South gradient is clearly visible with relative differences in the North and 5% in the South. Globally, the pattern is the same as in Figure 10a, with higher relative differences in the North (around 15%) than in the South (around 7%). Here the map is characterized by a spotty pattern. Most series are characterized by a negative difference, but locally, some series have positive differences.
 Following the example of Figure 1, which illustrates the main characteristics of the rainfall regime in the region and the high spatiotemporal variability of these features, Figure 10 illustrates the high variability of extreme rainfall, with some places that have experienced higher extreme rainfall despite a strong negative breakpoint in the signal at the regional scale.
5 Synthesis and Discussion
 Assessing nonstationarity in extreme rainfall series is both challenging and of great hydroclimatic importance. The challenge comes from extreme rainfall series displaying a much larger sampling variability in space and time than, for instance, annual rainfall series. At the same time, several recent publications related to the effect of climate change predict an intensification of the hydrologic cycle, and thus an increased probability of heavy rainfall, associated with global warming. It is thus important for the climate community to share tools that allow studying this nonstationarity issue in a consistent way.
 Acknowledging that, from a climatic point of view, nonstationarity versus stationarity must be assessed at the regional scale, two important factors must be taken into account: (i) the long-term series required for diagnosing nonstationarity of extreme rainfall are point measurements obtained from rain gauges; (ii) these point series are spatially correlated which makes it difficult to assess the statistical significance of a pattern of stationary and nonstationary series observed over a region.
 The method proposed here is thus based on a succession of three analytical steps going from a nonparametric testing of the point series to an integrated regional approach. It is applied to studying possible changes in the distribution of extreme rainfall in the Sahel associated with a very important and long-lasting drought of 20 years, based on the data of 126 daily rain gauges covering the period 1950–1990.
 The first analysis is a very classical application of statistical stationarity tests to the available point maxima series. The application to the Sahelian data set shows a good convergence between the seven nonparametric tests used; however, it is difficult to discriminate between trend detection and breakpoint detection, since the presence of a rupture can artificially lead to the detection of a trend. Furthermore, while 90% of the total annual rainfall series are tested as nonstationary, a strong indication of nonstationarity at regional scale, only 20 to 25% of the extreme daily rainfall data are tested as nonstationary, which makes it difficult to reach a significant conclusion regarding a change in the regime of extreme rainfall at regional scale.
 At a second stage, point series are analyzed in the parametric context of the generalized extreme value (GEV) distributions, by comparing the respective performances of time-constant and time-dependent models in terms of a likelihood function. Using a time-dependent formulation of the GEV distribution provides a new insight into comparing parametric and nonparametric approaches for testing nonstationarity in point series. In the application to the Sahelian case, the proportion of maxima series detected as nonstationary increases to about 65% (81 series out of 126). However, the break dates of these 81 series are rather scattered between 1955 and 1985, and seven of them are diagnosed as displaying a positive break, against 74 displaying a negative break. Setting the study in a parametric context thus leads to asserting that the existence of a nonstationarity in the regime of extreme rainfall is in fact likely. But it remains difficult to obtain a coherent regional vision of this nonstationarity.
 The last step thus makes use of an original regional space-time GEV distribution which is used to detect changes in rainfall extremes directly at regional scale. In addition to providing a synthetic and global testing of nonstationarity, this regional model also provides a direct regional modeling of the extreme value distribution, as well as a straightforward representation of the evolution of its parameters in time, whether as a trend or through a breakpoint. In the Sahelian case study, it was found that the distribution of daily rainfall maxima underwent a significant breakpoint in 1969–1970, the location parameter having decreased by 5 mm, equivalent to a relative decrease of 6% in the South of the study region and of 12% in the North. Statistically, this result is in line with the finding of Le Barbé et al. that the big Sahelian drought was associated with a decrease of rain occurrence rather than with a decrease of the intensity of the large rain events (a decrease of the intensity of the largest rain events would mean a change of the value of the shape parameter and not only a change of the value of the location parameter). It is worth noting that, currently, climate models are not able to reproduce this type of change in the rainfall regime. Not all of them are able to correctly simulate the Sahelian drought; those that do show a significant rainfall deficit for this 1970–1990 period cannot realistically attribute it either to a smaller number of rain events or to a decrease of the intensities of these events, because rain events are not properly identified in these models, for both resolution and parametrization reasons. This is why studying the climatology of the extremes in GCMs remains a challenge in terms of significance of the results.
 Beyond the specific result obtained on the Sahelian case study, there are methodological implications of larger bearing, the main one being that there is a strong rationale in going through the three steps described above. The fact that testing point series in a nonparametric context does not provide conclusive results is easily explained by the fact that extreme rainfall series are very noisy and relatively short (41 years in our case); this is a very common situation that will be faced by anyone seeking to detect a possible acceleration of the hydrologic cycle. In the Sahelian case study, the small number of stations where the stationary hypothesis H0 was rejected could well have led to accepting it. This is especially true since these stations did not present any consistent spatial pattern; therefore, the existence of rejected series could have been attributed solely to the sampling effect and to the existence of isolated values recorded at these stations during the wetter period.
 By adding, in a second step, information on the point distribution of the rainfall maxima, the parametric approach limits the sampling effect associated with potential outliers, which is helpful when prominent outliers are present.
 The last stage, gathering all the stations into one single sample, allows detecting the main nonstationarity having affected the extreme series over the whole study region. However, the use of the regional approach requires a preliminary testing of the two point approaches in order to check that there are no dramatic inconsistencies in the behavior of the point series that would make the regional approach irrelevant.
 The integrated approach proposed above for detecting possible changes of the extreme rainfall distribution at regional scale has been applied to a case study for which a strong climatic signal was known to exist (the Sahelian drought of the end of the twentieth century), in order to test its capacity. A noteworthy result of the study is that the classical testing of point series may not allow one to assert any change in the regime of the extremes even in the presence of such a strong signal. We thus look forward to applying the global regional approach to other places in order to determine regions where the probability of extreme rainfall could have increased significantly in the last two decades.
 The present study focused on detecting past nonstationarities in observed series by using the most appropriate temporal and spatial covariates over the specific region and period of study. An extension of the regional time-dependent statistical models would be to forecast future extreme distributions. For this purpose, the use of a spatial covariate that is constant in time, such as the mean annual rainfall, can be a limitation as its pattern might change in the future. Further work is required to select alternative covariates able to represent potential changes in extreme rainfall spatial patterns.
Appendix A: Statistical Tests Used
 A brief description of the principle of the seven statistical tests is given here.
 The Pearson test is used to detect linear relationships between variables. For independent pairs from a bivariate normal distribution, the distribution of Pearson's correlation coefficient follows a Student's t-distribution. This distribution is thus used to estimate the p-value of the observed Pearson's coefficient.
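 As an illustration, the p-value described above can be obtained from the correlation between time and the series through the standard statistic t = r·sqrt((n − 2)/(1 − r²)), which follows a Student's t-distribution with n − 2 degrees of freedom under the null hypothesis. The function name below is illustrative, and the sketch assumes the series is not perfectly collinear with time (|r| < 1).

```python
import numpy as np
from scipy import stats


def pearson_trend_test(years, values):
    """Two-sided Pearson test of a linear relationship between time and the series."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    r = np.corrcoef(years, values)[0, 1]
    t_stat = r * np.sqrt((n - 2) / (1.0 - r**2))   # Student's t with n - 2 degrees of freedom
    p_value = 2.0 * stats.t.sf(abs(t_stat), df=n - 2)
    return r, p_value
```

 The same coefficient and p-value are returned by `scipy.stats.pearsonr`, which can be used directly instead.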
 Mann-Kendall [Mann, 1945; Kendall, 1975] and Spearman's rho [Lehmann and D'Abrera, 1998] tests are two nonparametric rank-based tests. They are used to detect monotonic trends in time series. By using a simulation procedure, Yue et al. demonstrate that these two tests have similar power for detecting trends in time series.
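 A minimal sketch of the Mann-Kendall test is given below for illustration. It uses the usual normal approximation with continuity correction and assumes a series without ties (no tie correction is applied to the variance); the function name is illustrative. Spearman's rho against time can be obtained in a similar spirit with `scipy.stats.spearmanr`.

```python
import numpy as np
from scipy import stats


def mann_kendall(values):
    """Mann-Kendall trend test (normal approximation, no correction for ties)."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    signs = np.sign(x[None, :] - x[:, None])       # signs[i, j] = sgn(x_j - x_i)
    s = np.sum(np.triu(signs, k=1))                # sum over all pairs with i < j
    var_s = n * (n - 1) * (2 * n + 5) / 18.0       # variance of S without ties
    z = (s - np.sign(s)) / np.sqrt(var_s)          # continuity correction; z = 0 if s = 0
    p_value = 2.0 * stats.norm.sf(abs(z))
    return s, z, p_value
```

 A positive S indicates an increasing tendency and a negative S a decreasing one; dividing S by n(n − 1)/2 gives Kendall's tau between the series and time.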
 The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test [Kwiatkowski et al., 1992] can be used for testing a null hypothesis that an observable time series is level stationary. The series is expressed as the sum of a stationary level, a random walk, and a stationary error. The test is the Lagrange multiplier test of the hypothesis that the random walk has zero variance.
 The Smadi and Zghoul test [Smadi and Zghoul, 2006] is used to detect a breakpoint in time series. The statistic of the test is a simple cumulative sum of deviations from the mean. The distribution of the statistic is evaluated by a permutation procedure (in our application, 10,000 random permutations are used).
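 The principle of a cumulative-sum statistic evaluated by permutation can be sketched as follows. This is a generic CUSUM sketch under stated assumptions: the statistic is taken as the maximum absolute cumulative sum of deviations from the mean, and the exact form and normalization used by Smadi and Zghoul may differ. The function name is illustrative.

```python
import numpy as np


def cusum_break_test(values, n_perm=10_000, seed=0):
    """Generic CUSUM breakpoint test with a permutation-based p-value."""
    x = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)

    def max_abs_cusum(v):
        return np.max(np.abs(np.cumsum(v - v.mean())))

    cusum = np.cumsum(x - x.mean())
    break_index = int(np.argmax(np.abs(cusum)))    # last index of the first segment
    observed = np.abs(cusum[break_index])
    # Shuffling the series destroys any temporal structure while keeping its values.
    exceed = sum(max_abs_cusum(rng.permutation(x)) >= observed for _ in range(n_perm))
    return break_index, exceed / n_perm
```

 The returned index is the position of the last value before the most likely break, and the p-value is the fraction of shuffled series whose statistic is at least as large as the observed one.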
 Pettitt's test [Pettitt, 1979] is a nonparametric rank-based test. It is used to detect a breakpoint in time series.
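 For illustration, a compact sketch of Pettitt's test is given below. It computes the statistic U_t as the sum of sgn(x_i − x_j) over i ≤ t < j, takes K as the maximum of |U_t|, and uses the usual approximation p ≈ 2 exp(−6K²/(n³ + n²)) for the significance level. The function name is illustrative.

```python
import numpy as np


def pettitt_test(values):
    """Pettitt's rank-based breakpoint test with the approximate p-value."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    signs = np.sign(x[:, None] - x[None, :])       # signs[i, j] = sgn(x_i - x_j)
    # U_t compares the first t values with the remaining n - t values.
    u = np.array([signs[:t, t:].sum() for t in range(1, n)])
    k = np.max(np.abs(u))
    t_break = int(np.argmax(np.abs(u))) + 1        # number of values before the break
    p_value = min(1.0, 2.0 * np.exp(-6.0 * k**2 / (n**3 + n**2)))
    return t_break, k, p_value
```

 With a series indexed by year, the break is located between the value at position `t_break` and the next one.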
 Lombard's test [Lombard, 1987; Quessy et al., 2011] is able to test smooth change against stationarity. It is also able to detect a breakpoint (Pettitt's test is a particular case of Lombard's test), a trend, or a smooth breakpoint. This test is clearly the most flexible among the seven tests applied in this study.
 This study was funded by IRD, INSU, and French ANR project ESCAPE. It has benefited from the access to rainfall data sets provided by the AMMA international program, DMN Burkina, and DMN Niger: we greatly thank all of them, as well as the people at the LTHE computation center (Véronique Chaffard, Patrick Juen, and Wajdi Nechba) for their technical support. SOFRECO and ANRT are gratefully acknowledged for jointly supporting Geremy Panthou's PhD grant (contract 0054/2010).