How should trends in hydrological extremes be estimated?


  • Robin T. Clarke

    1. Instituto de Pesquisas Hidraulicas, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
    • Corresponding author: R. T. Clarke, Instituto de Pesquisas Hidraulicas, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS 91501-970, Brazil. (



[1] A comparison of six procedures for estimating the linear trend parameter β in annual maximum 1 day river flows at five sites in southern Brazil showed marked differences between, on the one hand, estimates obtained by incorporating trend into the generalized extreme value (GEV) location parameter with all parameters estimated by maximum likelihood (ML) and on the other hand, estimates found by least squares, trend removal prior to fitting the GEV by ML, boot-strap sampling, and Theil-Sen estimation. ML estimates of trend were considerably smaller than those given by all other procedures. The same was true where trend had been incorporated into the Gumbel location parameter. Where 95% confidence intervals were calculated for the “true” trend β by different procedures, some confidence intervals bracketed zero (indicating that the trend was not “significant” at the 5% level), but there was no consistency between results from different procedures; Theil-Sen confidence intervals always bracketed zero, confidence intervals given by detrending never did. It is concluded that not only do different estimation procedures give different measures of trend uncertainty, as reported elsewhere, but the estimated trends themselves may differ, and the paper suggests an explanation why this may occur. Some philosophical issues relating to estimation of trend in climatological and hydrological extremes are discussed, and it is concluded that selection of a method to estimate trend must depend on context.

1. Introduction

[2] Recent decades have seen a surge in analyses of hydrological and climatological data to seek evidence of trends brought about by anthropogenic influences. In terms of trends in river flows, the international literature reports many recent studies of trends in annual and seasonal flows [e.g., Hannaford and Buys, 2012; Burn et al., 2010; Wilson et al., 2010; Novotny and Stefan, 2007; Hodgkins and Dudley, 2006] some of which have included analyses of trends in high flows [Marsh and Harvey, 2012; Petrow and Merz, 2009; Hannaford and Marsh, 2008; Svensson et al., 2006]. The present paper addresses some issues relating to the analysis of trends in high flows, in a region where land-use change from native forest to arable cropping is likely to be at least as influential on extreme river flows as any existing or potential climate change over the last 80 years. In particular, the paper is concerned with the apparently simple issue of how to estimate trends in annual maximum 1 day river flows, although the methods discussed also have relevance to the estimation of trends in annual maximum rainfalls of any given duration, and trends in annual maxima of climatological variables.

[3] There is a very extensive literature on the use of extreme value distributions for describing the variability amongst “block maxima” (such as the series obtained by abstracting maximum values during periods or blocks, typically years) with the generalized extreme value (GEV) distribution given by

$$F(q) = \exp\left\{-\left[1 - \xi\,\frac{q - \mu}{\sigma}\right]^{1/\xi}\right\} \tag{1}$$

where μ, σ, and ξ are parameters of location, scale, and shape, respectively; q denotes the annual maximum 1 day flow. The methods of Hosking and Wallis [1997] based on estimation of the GEV parameters (μ, σ, ξ) by L-moments have proved simple to use, and software is also widely available for fitting GEV distributions by maximum likelihood (ML). Coles [2001, chap. 6] has described how, in the presence of trend in the series of block maxima, the GEV distribution can be adapted to estimate trends in any of the parameters, typically and most commonly by fitting the modified distribution GEV(μ(t), σ, ξ) where μ(t) = α + βt; extensions to GEV(μ(t), σ(t), ξ(t)) are also possible. In all such approaches, it is assumed that annual maxima are statistically independent. This paper considers only the simpler model GEV(α + βt, σ, ξ), with particular emphasis on the estimation of β, and on the uncertainty in this estimate as measured by its 95% confidence interval. The linear trend parameter β is estimated by ML using the statistical package GenStat [VSN International, 2012]; other packages (e.g., ismev, [Heffernan and Stephenson, 2013], and related packages evd, evdbayes, lmom, POT) allow more general GEV models to be fitted with link functions relating parameters to predictors [Coles, 2001, section 6.1], which may include other predictors as well as time. Also, since the GEV distribution reduces to the widely used Gumbel distribution when the GEV shape parameter ξ is zero, the paper includes this too, using a form in which the Gumbel parameter μ is μ(t) = α + βt.
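As a minimal sketch of the model GEV(α + βt, σ, ξ), the distribution function and its inverse can be evaluated as below. The sign convention is an assumption: it is the one under which the variance expression with gk = Γ(1 + kξ), quoted later in the paper, holds, so that ξ > 0 implies a finite upper bound. The function names and the choice t = 10 are illustrative only.

```python
import math

def gev_cdf(q, mu, sigma, xi):
    """P(Q <= q), assuming F = exp(-(1 - xi*z)**(1/xi)) with z = (q - mu)/sigma."""
    z = (q - mu) / sigma
    if abs(xi) < 1e-12:                 # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-z))
    y = 1.0 - xi * z
    if y <= 0.0:                        # q lies outside the support
        return 1.0 if xi > 0 else 0.0
    return math.exp(-y ** (1.0 / xi))

def gev_quantile(p, mu, sigma, xi):
    """Inverse of gev_cdf for 0 < p < 1."""
    w = -math.log(p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(w)
    return mu + sigma * (1.0 - w ** xi) / xi

def location(t, alpha, beta):
    """Time-varying location parameter mu(t) = alpha + beta * t."""
    return alpha + beta * t

# Parameter values quoted later in the paper for 14 de Julho; t = 10 is arbitrary.
mu_t = location(10, 2522.0, 5.626)
q90 = gev_quantile(0.90, mu_t, 1063.0, 0.1251)
```

Under the opposite sign convention used by some software, the same code applies with ξ replaced by −ξ.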

[4] A GEV distribution with time-variant location parameter is not the only way to estimate trend parameters when annual maxima are statistically independent. The many alternatives include bootstrap estimation [e.g., Venables and Ripley, 1999] and Theil-Sen distribution-free estimation [e.g., Hannaford and Buys, 2012], two methods that are used in this paper. Ordinary least squares also provides a valid estimate of the trend parameter β, but the fact that extreme flows are commonly heteroscedastic, with larger extremes having greater variances, means that confidence limits for β calculated on the assumption that the residual variance $\sigma_e^2$ is constant will no longer be valid. Another estimate of β explored in this paper is given by using least squares to estimate the trend coefficient β, then calculating the residuals $\{\varepsilon_t\}$ to which a GEV (or Gumbel) is fitted, the location parameter then being no longer time-dependent. An advantage of such a procedure is that ML estimation requires a search over the 3-D (μ, σ, ξ) space instead of the 4-D (μ, σ, ξ, β): a considerable advantage if many sequences are to be analyzed for trend. Other methods of trend estimation (M-estimators, least median of squares (LMS), least trimmed squares, S-estimation, MM-estimation) might have been included, but most have drawbacks [Venables and Ripley, 1999] and in any case, are not widely used by hydrologists and climatologists. Nor does the paper include the estimation of trends in “peaks-over-threshold” (PoT) models, which can also be modeled, in very general forms, by the ismev software referred to above.

[5] Thus, the purpose of the paper is to compare estimates $\hat{\beta}$ of the trend coefficient β obtained from a number of possible estimation procedures and to compare their uncertainties as measured by these estimates' approximate confidence intervals. The following sections describe the data used, the analytical procedures by which they were analyzed, and the results. A discussion follows, and conclusions are stated.

2. Data

[6] The primary data source was measurements of daily mean flow over the period 1940–2010 at five gauging stations in the southernmost Brazilian state of Rio Grande do Sul, a region where clearance of native forest and land-use change to agriculture has been extensive over the period of record. For one site, 14 de Julho, the record was even longer: 1931–2010, but with the two years 2004 and 2005 missing. The five sites were used because they are the longest in the region, and because they are almost complete. Tables 1 and 2 give details of location, upstream drainage area, and mean annual rainfall for the five sites. From the records of daily mean flow, the maximum daily flow in each year of record was abstracted. Figure 1 shows a plot of the five sequences of annual maximum flows for the period 1940–2010, standardized by subtracting each sequence mean and dividing by each sequence's standard deviation, so that the vertical axis is dimensionless. Visual inspection suggests a slight positive trend in annual maxima over the period at some sites, as well as substantial cross correlation between sites. When each record is divided into four roughly equal periods, a Bartlett test shows evidence of variance heterogeneity at four of the five sites (χ² values 9.32, 8.85, 8.08, 13.32, 6.60; p values 0.025, 0.031, 0.044, 0.004, 0.086, all with 3 degrees of freedom); but where variances differ significantly, they do not show an increasing trend with time. Variance heterogeneity was also found where data were divided into four groups by magnitude, as well as by time. Modeling the variance of annual maxima, and of cross correlation, is not discussed further in this paper, which focuses on the estimation of linear trend in the measured variable.
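A Bartlett statistic of the kind described above can be sketched as follows. The division of each record into four groups is left to the caller, the function names are illustrative, and the closed-form tail probability applies only to a chi-square variable with 3 degrees of freedom, the case reported above.

```python
import math

def bartlett_statistic(groups):
    """Bartlett's test statistic for equality of variances across groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = []
    for g in groups:
        m = sum(g) / len(g)
        variances.append(sum((x - m) ** 2 for x in g) / (len(g) - 1))
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances))
    correction = 1.0 + (sum(1.0 / (n - 1) for n in sizes)
                        - 1.0 / (n_total - k)) / (3.0 * (k - 1))
    return numerator / correction

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f. (closed form)."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Chi-square values reported in the text for the five sites.
reported_statistics = [9.32, 8.85, 8.08, 13.32, 6.60]
p_values = [chi2_sf_3df(x) for x in reported_statistics]
```

Rounded to three decimals, the computed tail probabilities agree with the p values quoted in the text.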

Table 1. Descriptions of Drainage Basins, Rio Grande do Sul, Southern Brazil

Gauge Site     Code   River        Latitude   Longitude   Area (km²)   Annual Pᵃ
14 de Julho    284    R. Antas     −29.0648   −51.6749    12,835       1613¹
Castro Alves   98     R. Antas     −29.0056   −51.3844    7784         1617²
Chapeco        94     R. Uruguai   −27.1416   −53.0449    52,949       1650³
Ernestina      110    R. Uruguai   −28.5556   −52.5456    1047         1646⁴
Ita            92     R. Uruguai   −27.2767   −52.3822    43,954       1623⁵

ᵃ Periods: ¹1944–2010, ²1944–2010, ³1957–2010, ⁴1930–1985, ⁵1958–2008.

Table 2. Summary Statistics of Annual Maximum 1 Day Flowsᵃ

14 de Julho     1170   2045   2938   3910   6912
Castro Alves     559   1097   1499   2391   4507

ᵃ Q1 and Q3 are the first and third quartiles.

Figure 1.

Annual maximum 1 day flows at five sites, standardized by subtracting the mean of the sequence and dividing by its standard deviation.

3. Analytical Method

[7] The analysis is to compare estimates of trend, and the uncertainties in those estimates (as measured by 95% confidence intervals), when different analytical procedures were used for trend estimation. Because the flow sequences at the five gauge sites are annual maxima, it is appropriate to assume that serial correlation is absent, and this was quantified by calculating the correlogram and a portmanteau test for each sequence [Box et al., 2008]. For the site 14 de Julho, Figure 2 gives a Q-Q plot comparing sequence quantiles with quantiles of a standard GEV distribution. Figures for the other four sites are similar and are not shown. In all cases, quantiles of the flow sequence lay comfortably within the 95% confidence interval for the Q-Q plot, suggesting that a GEV distribution, possibly with superimposed linear trend, should be an adequate description of the statistical characteristics of annual maximum flows. ML procedures [Coles, 2001, chap. 6] were used to estimate the in-built linear trend of the form μ(t) = α + βt (Coles also used an exponential trend in the dispersion parameter σ, but this was not explored in this paper). Thus, the following procedure was used to calculate an estimate $\hat{\beta}$ of a linear trend parameter β, at each of the five sites in Tables 1 and 2:

$$F(q_t) = \exp\left\{-\left[1 - \xi\,\frac{q_t - \alpha - \beta t}{\sigma}\right]^{1/\xi}\right\} \tag{2}$$
Figure 2.

Site 14 de Julho: quantile-quantile plot when GEV distribution is fitted.

with the four parameters α, β, σ, and ξ estimated by ML. Standard errors for estimates of the parameters were calculated from the information matrix $-\partial^2 \log L / \partial\theta_i\,\partial\theta_j$ evaluated at the maximum of the log-likelihood function log L (where $\theta_i, \theta_j$ are the four parameters taken in pairs) and were used to calculate large-sample confidence intervals for the trend β. The mean of the distribution in (2) is

$$E[Q_t] = \alpha + \beta t + \frac{\sigma}{\xi}\left[1 - \Gamma(1 + \xi)\right] \tag{3}$$

which provides a linear trend with trend coefficient β. It is shown below that at each of the five sites, the shape parameter ξ was not large relative to its standard error, suggesting that the simpler Gumbel distribution, with linear trend incorporated, might also be an acceptable model for the purposes of this study. Therefore, the GEV reduces to

$$F(q_t) = \exp\left\{-\exp\left[-\frac{q_t - \alpha - \beta t}{\sigma}\right]\right\} \tag{4}$$

which was also used to estimate the trend parameter β, and the standard error of $\hat{\beta}$, from which approximate 95% confidence intervals were calculated. These estimates are also shown in Tables 3-5.
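The objective functions behind Methods 1 and 2 can be sketched as negative log-likelihoods, as below. This is not the GenStat implementation used in the paper: the sign convention for ξ is an assumption (the one in which gk = Γ(1 + kξ) appears in the variance of Q), and the function names and the small data set are hypothetical.

```python
import math

def gev_trend_nll(params, years, flows):
    """Negative log-likelihood of GEV(alpha + beta*t, sigma, xi) for independent maxima."""
    alpha, beta, sigma, xi = params
    if sigma <= 0.0:
        return math.inf
    total = 0.0
    for t, q in zip(years, flows):
        z = (q - alpha - beta * t) / sigma
        if abs(xi) < 1e-8:                       # Gumbel limit
            total += math.log(sigma) + z + math.exp(-z)
        else:
            y = 1.0 - xi * z
            if y <= 0.0:                         # observation outside the support
                return math.inf
            total += (math.log(sigma)
                      - (1.0 / xi - 1.0) * math.log(y)
                      + y ** (1.0 / xi))
    return total

def gumbel_trend_nll(params, years, flows):
    """Negative log-likelihood of Gumbel(alpha + beta*t, sigma)."""
    alpha, beta, sigma = params
    return gev_trend_nll((alpha, beta, sigma, 0.0), years, flows)

# Hypothetical ten-year record, used only to show the call.
years = list(range(1, 11))
flows = [105.0, 98.0, 120.0, 111.0, 130.0, 109.0, 125.0, 140.0, 118.0, 133.0]
value = gumbel_trend_nll((100.0, 2.0, 10.0), years, flows)
```

Either function could be passed to a general-purpose minimizer (for example `scipy.optimize.minimize`), with large-sample standard errors taken from the inverse of the Hessian at the minimum.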

Table 3. Trends (m³ s⁻¹ yr⁻¹) Obtained When Trend Is Incorporated in Location Parameter of GEV and Gumbel Distributionsᵃ

                      14 de Julho   Castro Alves   Chapeco   Ernestina   Ita
Methods 3, 4, and 5   20.88         18.99          86.50     3.85        60.00

Estimates of Shape Parameter ξ in GEV(μ + βt, σ, ξ), and of ξ in GEV(μ, σ, ξ) After Detrending

                      14 de Julho   Castro Alves   Chapeco   Ernestina   Ita
Method 1              0.1251        0.2249         0.0894    0.0372      0.1426
Method 3              −0.0725       0.0597         0.0340    −0.1133     0.1202

ᵃ That is, GEV(μ + βt, σ, ξ) and Gumbel (μ + βt, σ) (Methods 1 and 2), by removing trend and fitting GEV(μ, σ, ξ) and Gumbel (μ, σ) to the residuals (Methods 3 and 4); by bootstrapped linear regression (Method 5) with 600 bootstrapped samples, and by using the Theil-Sen estimate (Method 6) with 600 bootstrapped samples.

[8] As a third method, a linear trend was estimated at each site, using the same expression as that used in ordinary linear regression, with

$$\hat{\beta} = \frac{\sum_{i=1}^{N}(t_i - \bar{t})(q_i - \bar{q})}{\sum_{i=1}^{N}(t_i - \bar{t})^2} \tag{5}$$

where $t_i$ is the $i$th year and $N$ is the number of years in the sequence ($i = 1, \ldots, N$). For any linear trend model such that $q_t = \alpha + \beta t$ + (possibly, a function of model parameters which does not involve the data $q_t$) + $\varepsilon_t$, the expression in (5) is an unbiased estimator of β, whatever the distribution of the residuals $\varepsilon_t$, whether or not residuals are uncorrelated, and whether or not the residuals $\varepsilon_t$ have homoscedastic variance. After estimating the trend β using (5), the sequence was “detrended,” and the GEV distribution given by (1) was fitted by ML to give estimates of the remaining three parameters α, σ, and ξ. The variance of a random variable Q with a GEV distribution is $(\sigma/\xi)^2 (g_2 - g_1^2)$ where $g_k = \Gamma(1 + k\xi)$, $k = 1, 2$, so that if the N years are consecutive, the standard error of the trend estimate $\hat{\beta}$ given by (5) is the square root of

$$\operatorname{var}(\hat{\beta}) = \frac{12\,(\sigma/\xi)^2\,(g_2 - g_1^2)}{N(N^2 - 1)} \tag{6}$$

[9] As mentioned, an advantage of this procedure is that if the trend can be estimated efficiently by least squares, estimation of the GEV parameters then requires a search over the 3-D space of α, σ, and ξ instead of the 4-D space for α, σ, ξ, and β.

[10] Therefore, the following method was also used:

[11] Method 3: Trend estimated by $\hat{\beta}$ as given by (5)

with standard error given by the square root of (6), and with σ and ξ estimated from the detrended data sequence.

[12] The variance of a random variable Q with a Gumbel distribution is σ²π²/6, so that a fourth estimator is given by:

[13] Method 4: Trend estimated by $\hat{\beta}$ as given by (5)

with standard error the square root of var($\hat{\beta}$), obtained from (6) with the Gumbel variance σ²π²/6 in place of the GEV variance.
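The calculations behind Methods 3 and 4 can be sketched as below: the least squares slope, the variance of Q under each distribution, and the resulting standard error of the slope for N consecutive years, for which the sum of squared deviations of the years is N(N² − 1)/12. The function names are illustrative, and the GEV variance uses gk = Γ(1 + kξ) as quoted in the text (it is finite only for ξ > −1/2 in that convention).

```python
import math

def ls_slope(years, flows):
    """Ordinary least squares slope of flows on years."""
    n = len(years)
    t_bar = sum(years) / n
    q_bar = sum(flows) / n
    sxx = sum((t - t_bar) ** 2 for t in years)
    sxy = sum((t - t_bar) * (q - q_bar) for t, q in zip(years, flows))
    return sxy / sxx

def gev_variance(sigma, xi):
    """(sigma/xi)^2 * (g2 - g1^2) with g_k = Gamma(1 + k*xi); requires xi != 0."""
    g1 = math.gamma(1.0 + xi)
    g2 = math.gamma(1.0 + 2.0 * xi)
    return (sigma / xi) ** 2 * (g2 - g1 ** 2)

def gumbel_variance(sigma):
    """Variance of a Gumbel variable with scale sigma."""
    return sigma ** 2 * math.pi ** 2 / 6.0

def slope_se_consecutive(variance, n_years):
    """Standard error of the LS slope when the n_years are consecutive."""
    return math.sqrt(12.0 * variance / (n_years * (n_years ** 2 - 1)))

# Hypothetical scale and shape values, chosen only to show the call.
se_gev = slope_se_consecutive(gev_variance(1000.0, 0.1), 71)
se_gumbel = slope_se_consecutive(gumbel_variance(1000.0), 71)
```

In Methods 3 and 4 the values of σ and ξ passed to these functions would be those estimated from the detrended sequence.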

[14] In both Methods 3 and 4, the detrended sequences no longer consist of statistically independent values, because the estimated residuals are linear functions of the original data sequence $\{q_t\}$, $t = 1, \ldots, N$. It can be shown [e.g., Johnson and Wichern, 2007] that the detrended residuals, denoted by r*, are given by r* = (I − H)q where H is the “hat” matrix $H = X(X^{T}X)^{-1}X^{T}$, I is the unit matrix, and X is the N × 2 matrix with 1s in the first column and the year numbers in the second. Thus, the covariance between the rth and sth residuals, obtained after detrending when data are from a GEV distribution, is

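$$\operatorname{cov}(r^*_r, r^*_s) = \left(\frac{\sigma}{\xi}\right)^2 (g_2 - g_1^2)\left[\delta_{rs} - \frac{1}{N} - \frac{(t_r - \bar{t})(t_s - \bar{t})}{\sum_{i=1}^{N}(t_i - \bar{t})^2}\right] \tag{7}$$

where $\delta_{rs} = 1$ if $r = s$ and 0 otherwise.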

[15] In the case of the Gumbel distribution, the expression in square brackets is multiplied by σ²π²/6. This covariance structure differs from the covariance structure of (e.g.) a lag-one autoregression, so the type of portmanteau test used to test for serial correlation between values in the original data sequence is no longer appropriate. Thus, while the estimate of trend given by (5) is valid, the presence of correlation between residuals after detrending means that the usual log-likelihood function is incorrect. This may affect estimates of the parameters μ, σ, and ξ of the GEV distribution, and hence the standard error of the trend estimate given by (5).
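The covariance structure introduced by detrending can be inspected numerically. The sketch below builds the matrix I − H for a straight-line fit on consecutive years 1, …, N; multiplying it by the variance of Q gives the covariance matrix of the detrended residuals. The function name is illustrative and the closed form for the elements of H assumes consecutive years.

```python
def residual_covariance_factor(n_years):
    """Return I - H (as nested lists) for a straight-line fit on years 1..n_years."""
    years = list(range(1, n_years + 1))
    t_bar = (n_years + 1) / 2.0
    sxx = n_years * (n_years ** 2 - 1) / 12.0   # sum of squared deviations of years
    matrix = []
    for r, t_r in enumerate(years):
        row = []
        for s, t_s in enumerate(years):
            h_rs = 1.0 / n_years + (t_r - t_bar) * (t_s - t_bar) / sxx
            row.append((1.0 if r == s else 0.0) - h_rs)
        matrix.append(row)
    return matrix

factor = residual_covariance_factor(71)
first_vs_second = factor[0][1]    # neighbouring years at the start of the record
first_vs_last = factor[0][70]     # opposite ends of the record
```

For a 71 year record the factor is negative for the first two years and positive for the first and last years, a pattern quite unlike the decaying correlations of a lag-one autoregression.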

[16] Methods 1–4 above are all based on probability distributions for the sequences of annual maximum flows. To compare their performance with methods which do not use such assumptions, two distribution-free procedures were used: namely, a bootstrap estimate of trend and the Theil-Sen estimate.

[17] Method 5: For a data set of length N years, N pairs of values (ti, qi) were drawn with replacement, where ti is the year number and qi (i = 1, …, N) is the annual maximum for that year. The slope (given by (5)) and intercept of the sample were calculated, and the procedure was repeated 600 times; all simulations and boot-strap calculations used 600 repetitions, since this number was recommended by Wilcox [2012] for calculating confidence intervals for the Theil-Sen trend estimate described below. Quantiles corresponding to probabilities 0.025 and 0.975 then gave a 95% confidence interval for the trend β. This procedure is equivalent to a randomly weighted regression [Venables and Ripley, 1999]; as an alternative, they suggest [Venables and Ripley, 1999, section 6.6] model-based resampling in which the residuals about the fitted regression are resampled. However, the calculated residuals r* do not then have the correct variance, or even the same variance as each other. Corrections are possible [Venables and Ripley, 1999], but model-based resampling was not pursued further in this paper. Methods 3–5 all estimate trend by the expression in (5).
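The pairs bootstrap of Method 5 can be sketched as follows. The random seed, the interpolated quantile rule, and the function names are assumptions made for illustration; the paper does not state which quantile definition was used.

```python
import math
import random

def ls_slope(years, flows):
    """Ordinary least squares slope of flows on years."""
    n = len(years)
    t_bar = sum(years) / n
    q_bar = sum(flows) / n
    sxx = sum((t - t_bar) ** 2 for t in years)
    sxy = sum((t - t_bar) * (q - q_bar) for t, q in zip(years, flows))
    return sxy / sxx

def quantile(sorted_values, p):
    """Linearly interpolated sample quantile (one of several common definitions)."""
    pos = p * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def bootstrap_slope_ci(years, flows, n_boot=600, seed=0):
    """95% percentile interval for the slope from resampled (year, flow) pairs."""
    rng = random.Random(seed)
    n = len(years)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ts = [years[i] for i in idx]
        if len(set(ts)) < 2:          # slope undefined if every drawn year coincides
            continue
        slopes.append(ls_slope(ts, [flows[i] for i in idx]))
    slopes.sort()
    return quantile(slopes, 0.025), quantile(slopes, 0.975)
```

A call such as `bootstrap_slope_ci(years, flows)` returns the lower and upper limits; the interval brackets zero when the first is negative and the second positive.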

[18] Method 6 estimated trend by means of the Theil-Sen distribution-free procedure. Trends were calculated as follows: (i) at each site (except for 14 de Julho), the set of M = 70 × 71/2 = 2485 differences (qk − qi)/(k − i) for i < k was calculated, and the median of these differences gave the Theil-Sen estimate of trend β; (ii) 600 samples of size N annual maxima were drawn, with replacement, from the record of length N years, and the Theil-Sen estimate of trend was calculated from each of the 600 samples. Quantiles corresponding to probabilities 0.025 and 0.975 then gave a 95% confidence interval for the trend β. For the site 14 de Julho, where two years were missing, the longest period of unbroken record (1931–2003) was used, giving M = 2701. This procedure differs from that given by Wilcox [2012], who resampled the M differences with replacement. It also differs from the variant of the Theil-Sen estimator given by Siegel [1982], who determined, for each sample point (ti, qi), the median mi of the slopes (qj − qi)/(tj − ti) of lines through that point and then calculated the overall estimator as the median of these medians.
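Steps (i) and (ii) of Method 6 can be sketched as below. Resampling (year, flow) pairs, skipping pairs of equal years within a resample, and the simple order-statistic rule for the interval limits are assumptions made here; the function names are illustrative.

```python
import random
import statistics

def theil_sen_slope(years, flows):
    """Median of the pairwise slopes (q_k - q_i)/(t_k - t_i) for i < k."""
    slopes = [(flows[k] - flows[i]) / (years[k] - years[i])
              for i in range(len(years))
              for k in range(i + 1, len(years))
              if years[k] != years[i]]      # equal years arise only after resampling
    return statistics.median(slopes)

def theil_sen_bootstrap_ci(years, flows, n_boot=600, seed=0):
    """Approximate 95% percentile interval from resampled (year, flow) pairs."""
    rng = random.Random(seed)
    n = len(years)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ts = [years[i] for i in idx]
        if len(set(ts)) < 2:                # no slope can be formed
            continue
        estimates.append(theil_sen_slope(ts, [flows[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(0.025 * (n_boot - 1))]
    upper = estimates[int(round(0.975 * (n_boot - 1)))]
    return lower, upper

# A 71 year record (1940-2010) yields 71 * 70 / 2 pairwise slopes.
n_pairs = 71 * 70 // 2
```

Because the estimate is a median, a single very large annual maximum changes it little, whereas the same value can shift a least squares slope appreciably.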

4. Results

[19] Table 3 shows estimates of the linear trend parameter estimated by these six methods. Table 3 shows that differences between estimates of trend obtained by Method 1 (fitting a GEV by ML, with trend incorporated in the location parameter) and Method 3 (detrending by least squares, before fitting a GEV distribution to the detrended series) are large. Trend estimates from Method 3 are in all cases much larger than the ML estimates and in one case (14 de Julho) are several times larger. The same is true for trend estimates obtained by Method 2 (Gumbel, with trend incorporated in the location parameter) and Method 4 (Gumbel fitted after detrending). Where the location parameters of GEV and Gumbel distributions have trend incorporated, Gumbel estimates of the trend coefficient β are always larger than those from fitting the GEV distribution, despite the fact that the shape parameter ξ is not large, relative to its standard error, at any of the five sites, as shown at the bottom of Table 3. For Site 14 de Julho, Figure 3 compares the trend estimated by fitting a GEV with trend parameter incorporated (Method 1), with the least squares estimate of trend (Methods 3–5); visually, the least squares fit looks the better fit. Both Theil-Sen estimates of trend β and the least squares estimate of β are very considerably greater than estimates of β obtained from both GEV and Gumbel distributions, with or without detrending. As reported elsewhere [e.g., Frei, 2011], the Theil-Sen estimates of β are smaller than the $\hat{\beta}$ found by least squares.

Figure 3.

Site 14 de Julho: trends estimated from GEV distribution with trend parameter incorporated (Method 1: broken black line), and by least squares fit (Methods 3–5: broken red line). Symbols: YML, trend by maximum likelihood; YR, year; YLS, least squares; Q, annual maximum discharge.

[20] Table 4 shows 95% upper and lower confidence limits and the width of the 95% confidence intervals, for linear trends estimated by the six methods at the five sites. For fits by ML, it is assumed that large-sample properties hold so that the approximate 95% confidence intervals are given by ±2 × SE, where the standard error SE is found as described above. Where the confidence interval brackets zero, the estimated trends are consistent with a null hypothesis of zero trend; where zero is not bracketed, the null hypothesis of zero trend would be rejected. In terms of statistical significance, therefore, results given by the six methods are very different. For the site 14 de Julho, the confidence interval from Method 1 (GEV with incorporated trend) brackets zero, and so do the confidence intervals for Method 2 (Gumbel with incorporated trend), Method 5 (bootstrapped linear regression), and Method 6 (Theil-Sen), while the “detrended” methods (Methods 3 and 4) both indicated that the null hypothesis of zero trend should be rejected. At all five sites, Theil-Sen confidence intervals bracketed zero, showing that Method 6 always gave estimates of trend consistent with the zero-trend hypothesis. Confidence-interval widths often differed substantially between methods at any given site, but no consistent pattern emerges; the widths of confidence intervals where data are detrended are usually greater than where trend is built into a distribution's location parameter, and the differences are sometimes large.

Table 4. Approximate 95% Lower and Upper Confidence Limits (Denoted by L and U) and Width of Confidence Intervalsa
                14 de Julho   Castro Alves   Chapeco   Ernestina   Ita

ᵃ Units: m³ s⁻¹ yr⁻¹.

Method 1
Method 2
Method 3
Method 4
Method 5
Method 6
Table 5. Estimates of Location Parameter α (=µ) and Scale Parameter σ, From Methods With and Without Detrending by Least Squaresa
  1. a

    Values of ξ, where relevant, are shown at the bottom of Table 3.

Method 1 (GEV, Trend Parameter Incorporated in Location Parameter)
Method 2 (Gumbel, Trend Parameter Incorporated in Location Parameter)
Method 3 (GEV, Data Detrended)
Method 4 (Gumbel, Data Detrended)

[21] By comparing the upper and lower confidence limits with the least squares estimates of trend shown in Table 3, it can also be seen that at three of the five sites, the least squares estimate of trend lay outside the confidence interval for the ML estimate; at two of the five sites, the Theil-Sen estimate lay outside the confidence interval for the ML estimate. Interpretation of this result requires caution, however, because both the least squares and Theil-Sen estimates are also subject to uncertainty, and because records at the five sites are to some extent cross correlated.

[22] Table 5 shows how estimates of the location and scale parameters, α (=μ) and σ, differ when they are estimated from GEV distributions with the trend parameter incorporated, and from GEV distributions after linear trend has been removed (i.e., Methods 1 and 3). Estimates of the GEV distribution's α (i.e., the location parameter of the detrended residuals) are consistently greater when data are detrended, but the differences are not large; the same is true of estimates of σ, the scale parameter, but here the differences between Methods 1 and 3 are substantially greater. For the Gumbel distribution, differences between estimates of α with and without trend removal (Methods 2 and 4) are very small; estimates of the scale parameter are greater where data are detrended, but the increases are smaller than those for the GEV distribution. Detrending has a marked effect on estimates of the shape parameter ξ shown in Table 3; these, and their standard errors, were substantially reduced where the data were detrended before fitting a GEV.

[23] The main point emerging from Tables 3-5 is that when a trend coefficient is incorporated into the location parameter (whether GEV or Gumbel) of annual maximum 1 day flows, the estimates of trend were markedly different from any estimates of trend obtained by least squares, or by Theil-Sen, or even fitting by eye (see Figure 3). To explore this in greater detail, samples of size 70 were drawn from five GEV distributions with parameter values equal to those given in Tables 3-5 (in the case of the gauge 14 de Julho, for example, the GEV parameters were α = 2522, σ = 1063, ξ = 0.1251, β = 5.626). Six hundred samples were drawn in each case. For each sample, the trend β was estimated (i) by ML (Method 1), (ii) by ordinary least squares (giving the estimated trend for Methods 3–5), and (iii) by the Theil-Sen estimator (Method 6). Table 6 gives statistics derived from the 600 generated samples. Comparisons by t test between the means of the $\hat{\beta}$ values over all 600 simulations and the “true” trends shown at the top of Table 6 show that none of the three methods shows evidence of major bias, and indeed theory shows that the ML estimates $\hat{\beta}$ are unbiased when the sample of years is large. Provided that the trend is linear, least squares estimates $\hat{\beta}$ are unbiased whatever the length of record. Theory also shows that given certain regularity conditions [Kendall et al., 1983] no other estimation procedure will give estimates $\hat{\beta}$ with smaller variance than ML estimates when the length of record is sufficiently long (that is, ML estimates are “asymptotically efficient”). Table 6 also shows standard deviations (SDs) of the estimates over the 600 samples at the five sites. The SD for ML estimates $\hat{\beta}$ is substantially smaller than the SDs of either least squares or Theil-Sen estimates, and the SDs of the Theil-Sen estimates are substantially smaller than the SDs given by least squares. Figure 4 shows a scatter diagram for the Site 14 de Julho; the greater variability in the least squares estimate “pulls” the fitted line toward the horizontal, giving a slope (0.393) considerably less than the unit slope of the 45° line.
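A reduced version of this simulation can be sketched as below, covering only the least squares and Theil-Sen estimators; the ML fit is omitted because it needs a numerical optimizer. The sign convention for ξ (the one in which gk = Γ(1 + kξ)), the year index running from 1 to 70, the seed, and the function names are all assumptions, so the output is not expected to reproduce Table 6.

```python
import math
import random
import statistics

def gev_quantile(p, mu, sigma, xi):
    """Quantile of the GEV, assuming F = exp(-(1 - xi*z)**(1/xi))."""
    w = -math.log(p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(w)
    return mu + sigma * (1.0 - w ** xi) / xi

def ls_slope(years, flows):
    """Ordinary least squares slope of flows on years."""
    n = len(years)
    t_bar = sum(years) / n
    q_bar = sum(flows) / n
    sxx = sum((t - t_bar) ** 2 for t in years)
    return sum((t - t_bar) * (q - q_bar) for t, q in zip(years, flows)) / sxx

def theil_sen_slope(years, flows):
    """Median of all pairwise slopes."""
    return statistics.median(
        (flows[k] - flows[i]) / (years[k] - years[i])
        for i in range(len(years)) for k in range(i + 1, len(years)))

def simulate_trend_estimates(alpha, beta, sigma, xi, n_years=70, n_sim=600, seed=1):
    """Draw n_sim records by inverse-CDF sampling and estimate the trend in each."""
    rng = random.Random(seed)
    years = list(range(1, n_years + 1))
    ls, ts = [], []
    for _ in range(n_sim):
        flows = [gev_quantile(max(rng.random(), 1e-300), alpha + beta * t, sigma, xi)
                 for t in years]
        ls.append(ls_slope(years, flows))
        ts.append(theil_sen_slope(years, flows))
    return ls, ts

# Parameter values quoted in the text for 14 de Julho; fewer samples for speed.
ls_est, ts_est = simulate_trend_estimates(2522.0, 5.626, 1063.0, 0.1251, n_sim=200)
summary = {"mean_ls": statistics.mean(ls_est), "sd_ls": statistics.stdev(ls_est),
           "mean_ts": statistics.mean(ts_est), "sd_ts": statistics.stdev(ts_est)}
```

The means in `summary` can be compared with the true β, and the two SDs with each other, in the same way as the rows of Table 6.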

Table 6. Statistics Derived From 600 Simulated Samples, at Each of the Five Sites, Drawn From GEV(α + βt, σ, ξ), With Parameters α, σ, ξ, β Given in Tables 3-5ᵃ

                               14 de Julho   Castro Alves   Chapeco    Ernestina   Ita
“True” value                   5.626         9.795          58.00      1.938       44.610
Mean $\hat{\beta}_{ML}$        5.742         9.727          58.757     1.931       44.122
SD($\hat{\beta}_{ML}$)         ±6.083        ±3.059         ±21.612    ±0.659      ±17.083
Mean $\hat{\beta}_{LS}$        6.094         9.608          56.901     1.929       44.437
SD($\hat{\beta}_{LS}$)         ±9.496        ±6.489         ±31.588    ±0.853      ±30.797
Mean $\hat{\beta}_{TS}$        5.879         9.704          57.546     1.931       44.619
SD($\hat{\beta}_{TS}$)         ±7.589        ±4.138         ±25.469    ±0.753      ±22.367

ᵃ Means, standard deviations, and standard errors of means are given for 600 maximum-likelihood estimates $\hat{\beta}_{ML}$, 600 least squares estimates $\hat{\beta}_{LS}$, and 600 Theil-Sen estimates $\hat{\beta}_{TS}$. Correlations between maximum-likelihood and least squares estimates are denoted by $r_{ML,LS}$. Similarly for $r_{ML,TS}$ and $r_{TS,LS}$.

Figure 4.

Estimate b of trend coefficient β in 600 samples of size 70 “years” drawn from a GEV distribution with parameters equal to those estimated for the site 14 de Julho. Red line shows the 45° line. Ordinates along the vertical axis are the estimates of β when trend was incorporated into the GEV distribution; abscissae along the horizontal axis are estimates of β found by least squares (as for the bootstrapped estimate, and for detrended values subsequently fitted to GEV(α, σ, ξ)).

[24] Correlations between the estimates $\hat{\beta}$ obtained from the 600 samples by ML, least squares, and Theil-Sen are shown at the bottom of Table 6; the correlation between ML and least squares estimates of trend is lowest at all sites, ranging from about 0.5 to 0.75. Correlation between ML and Theil-Sen estimates is considerably greater, while correlations between least squares and Theil-Sen estimates are in the range 0.82–0.92.

[25] These correlations, together with the asymptotic efficiency of ML estimates mentioned above, suggest an explanation for the differences between ML and other trend estimates shown in Table 3. The argument is as follows. Consider the probability distributions of the ML and least squares estimates of trend (given, in the case of Figure 4, by the projections of points onto the vertical and horizontal axes, respectively). The dispersion of the former (ML) distribution will be less than that of the latter because of ML asymptotic efficiency; both distributions will be centered about the true trend value, however, because both ML and least squares estimators are unbiased (the former in large samples, the latter in samples of any size, when the trend is truly linear). Hence, when the sample of years in a record of annual maxima gives a trend that is large and positive when estimated by least squares, the ML estimate of trend will be smaller (closer to the central value common to both distributions) but still positive (because ML and least squares estimates are positively correlated). The converse must also be true: when the sample of years in the record of annual maxima shows a trend that is small or even negative when estimated by least squares (i.e., the least squares estimate lies in the left-hand tail of its probability distribution), the ML estimate will be greater (again, closer to the central value common to both distributions). A similar argument is developed by substituting “Theil-Sen estimator” for “least squares estimator.”
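The argument can be illustrated with the figures already quoted for 14 de Julho: the slope 0.393 of ML estimates on least squares estimates in Figure 4 and the SDs in Table 6. The sketch assumes a linear relation between the two estimators centered on the simulation's true trend; the function names are illustrative, and the result shows only the direction and rough size of the shrinkage, not the fitted values in Table 3.

```python
def implied_correlation(slope_y_on_x, sd_y, sd_x):
    """Correlation implied by a regression slope of y on x and the two SDs."""
    return slope_y_on_x * sd_x / sd_y

def expected_y_given_x(center, slope_y_on_x, x):
    """Value of y expected for a given x under a linear relation through (center, center)."""
    return center + slope_y_on_x * (x - center)

# Figures from the text: slope 0.393; SD of ML estimates 6.083; SD of LS estimates 9.496.
r_ml_ls = implied_correlation(0.393, 6.083, 9.496)

# True trend in the simulation 5.626; least squares estimate for the observed record 20.88.
ml_expected = expected_y_given_x(5.626, 0.393, 20.88)
```

The implied correlation is about 0.61, inside the range 0.5 to 0.75 reported for ML and least squares estimates, and a least squares estimate of 20.88 corresponds to an expected ML estimate of about 11.6: smaller, but still positive.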

5. Discussion

[26] The trends in the estimation methods described above are linear trends, and it could be argued that changes may be episodic rather than linear. The many methods for modeling episodic changes over time include piecewise or “broken stick” regression, with or without continuity at the breakpoints, the number of which may be known or unknown [Khodadadi and Asgharian, 2008; Toms and Lesperance, 2003]. Erdman and Emerson [2007] following Barry and Hartigan [1993] used Bayesian methods to estimate posterior distributions of parameters θi {i = 1, …, b} where a data sequence {Xi}, i = 1, …, n is modeled as b “blocks” with the ith block specified by θi which may be (e.g.) a distribution mean or a regression coefficient. In the physical context of hydrological extremes, however, results from any analysis of episodic trends will be more acceptable where changes are known to have occurred at specific times, and from specific causes, rather than found by using a computer algorithm to determine the number of change points and the change magnitudes. As far as is known, data used to illustrate methods described above were not subject to any such specific changes; in the absence of such knowledge, a linear trend gives a simple description of the way that annual maxima changed over the period of record.

[27] In an important paper on the calculation of trends and their uncertainties, Cohn and Lins [2005] explored aspects of the statistical significance of trends in hydroclimatological time series, concluding that “while trend magnitude can be determined with little ambiguity, the corresponding statistical significance… is less certain because significance depends critically on the null hypothesis, which in turn reflects subjective notions about what one expects to see.” They also concluded that “it may be preferable to acknowledge that the concept of statistical significance is meaningless when discussing poorly understood systems.” The results presented above suggest that uncertainty is a characteristic not only of the statistical significance of a trend, as Cohn and Lins state, but also of its magnitude. Two sets of procedures for estimating trend in annual maximum 1 day flows, both arguably valid, have been shown to give very different results in some specific cases: one set, based on the almost universal assumption in frequency analyses that annual maxima are statistically independent with some kind of extreme value distribution (GEV or Gumbel), and another set which assumed statistical independence but no distributional form (bootstrap, Theil-Sen). Table 3 shows that estimates of trend given by the two sets of procedures can be markedly different. Furthermore, as confirmation of the statement from Cohn and Lins [2005] quoted above, the two sets of procedures lead (in the case of the five records here considered) to different conclusions about the statistical significance of trends, shown by Table 4.

[28] It can be argued [Koutsoyiannis, 2006] that analyzing time series of hydrological or climatic data for trends is in any case illogical and that “a stochastic approach hypothesizing stationarity and simultaneously admitting a scaling behaviour reproduces climatic trends (considering them as large-scale fluctuations) in a manner that is logically consistent.” Such an approach, based on the well-known Hurst coefficient, “does not require the separation of the time series into two or more components, so it does not attempt to de-trend the original series. It admits that the existence of trends is the normal behaviour of real world time series” [Koutsoyiannis, 2006]. However, a careful reading of his text suggests that Koutsoyiannis' criticism is leveled principally at the fallacy of concluding that such “trends” are deterministic. But the term “deterministic” must imply that present behavior determines a system's future behavior; in the present context where annual extreme 1 day flows are analyzed over a period of 70–80 years, changes, where they exist, are almost certainly the consequence of land-use change from native forest to arable cropping, with no “deterministic” interpretation about future behavior possible. It can therefore be argued that in such circumstances, a linear trend coefficient is a useful summary of past behavior (although it can never be more than that; certainly no extrapolation to future years is possible). In a sense, a trend coefficient b and the Hurst coefficient H are complementary entities: the former is a statistical summary of time series behavior over a specified period of the past, and the latter is a statistical summary of future long-term behavior over an indeterminate (but long) period; neither conveys information about deterministic influences that caused, or will cause, variability observed in the time series.

[29] It has been mentioned that Methods 3 and 4, which detrended the sequence of annual maxima before fitting GEV or Gumbel distributions to the detrended residuals, introduce a correlation (although not a serial correlation) among those residuals. The issue of introduced correlation appears to extend well beyond the present paper. Where, for example, future “scenarios” of annual climatic extremes are produced by Global Climate Models (GCMs), the annual (or seasonal) extremes in the scenario will all be functions of GCM parameters and their initial values and will remain so however many “members” are calculated in the scenario. A similar argument would appear to hold where coupled climate-hydrological models are used to produce future scenarios of extreme flows. Further research is required to explore whether such introduced correlations are important or whether they can be safely ignored.

[30] In conclusion, we return to the title of this paper: how should trends in hydrological extremes be estimated? The answer is that the method must depend on the context, and no single method can be recommended for all circumstances. We illustrate with three cases. (i) Where the context requires exploration of the causes of trend (i.e., determination of whether certain predictors explain the trend fully, partially, or not at all), a parametric approach is required, with assumptions (to be verified subsequently) about underlying probability distributions that allow the use of likelihood theory for hypothesis testing (or in a Bayesian context, for calculating the posterior probability distribution of a trend). This would rule out the use of distribution-free methods such as bootstrapping or Theil-Sen estimation. (ii) If, however, the context requires the exploration of a regional average trend of, say, annual maximum n day rainfalls measured at P sites within the region, distribution-free methods could be used; the trend could be estimated by either bootstrapping or Theil-Sen at each of the P sites, with standard geostatistical methods used to calculate the regional trend and its uncertainty. (iii) If it were required to estimate the difference between trends at two or more sites, so that estimating differences between trends is more important than obtaining good estimates of the trend at each site separately, then any of the methods compared above could be used (although if the uncertainty in the differences were also to be estimated, this would complicate the use of Theil-Sen and bootstrap methods). Whichever methods are used, the estimates of trend may differ from method to method; the uncertainty as measured by confidence intervals may differ; and (with due reminder of the warning of Cohn and Lins [2005] referred to above) if significance testing is required, conclusions about significance may vary from one method to another.

6. Conclusion

[31] A comparison of six procedures for estimating the trend β in annual maximum 1 day river flows at five sites showed marked differences between, on the one hand, estimates obtained by incorporating trend into the GEV location parameter and, on the other hand, estimates found by (i) trend removal by least squares prior to fitting the GEV, (ii) bootstrap sampling, and (iii) Theil-Sen estimation. The same was true where trend had been incorporated into the Gumbel location parameter. Approximate 95% confidence intervals for β bracketed zero for some of the six methods but not for others, so that different conclusions about statistical significance would be drawn according to which method was used to estimate the trend.


[32] The author is grateful to Professor Juan Martin Bravo for making daily flow records available and to anonymous reviewers for constructive comments.