Climate change allowances, non‐stationarity and flood frequency analyses

When considering future adaptation to climate change in UK fluvial flood alleviation schemes, the current recommendation by the Environment Agency (England) is to increase peak design flood flows by a preselected percentage. This allowance varies depending on the period for which the estimate is being made, the vulnerability of the development being considered and its location. Recently, questions have been raised as to whether these percentage uplifts should be kept the same, or whether change has already happened within the baseline period and so uplifts should be reduced. A complicating factor is that changes in flood frequency can occur for reasons in addition to climate change, such as land‐use change or natural variability. This article describes current approaches taken by different stakeholders for catchments in England and Wales to account for climate change, and discusses these allowances where there is already an observed presence of trend in flood regimes. Theil–Sen estimators of trend were used in comparing non‐stationary and stationary flood frequency curves with allowances applied, leading to a recommendation of evaluating non‐stationary models at 1990, the end of the reference period. Examples were explored such as the Eden catchment, which was heavily affected by Storm Desmond in December 2015.


| INTRODUCTION
Flooding is a major hazard both in the United Kingdom and worldwide, threatening life and having huge socioeconomic impacts. This article investigates existing UK guidance on making allowance for the effects of climate change on fluvial flood frequency estimates (FFEs), and how this guidance is applied in practice. This is in the light of recent analysis of non-stationarity in annual maximum peak flow (AMAX) data in England and Wales (Griffin et al., 2019).
The study of non-stationarity in flood peak data is complex, not least because of the impact of multiple and interacting factors such as climate change and variability, land cover and land-use change (particularly urbanisation), hydraulic changes to river channels and the high degree of natural variability in the data. Francois et al. (2019) discussed example catchments in the United States and the Netherlands where the influence of anthropogenic climate change and natural climate variability are difficult to disentangle. The influence of the latter is further confounded in the United Kingdom by the identification of periodicities in the data, so-called flood-rich and flood-poor periods. It is commonly noted (e.g., Macdonald et al., 2014) that a large proportion of the gauged records in the United Kingdom coincide with a flood-poor period which may distort future estimates of trend or flood frequency. Prosdocimi et al. (2015) used change in urban extent in addition to a number of climate-based covariates to model changing flood regime over time in two catchments in England. Increasing urbanisation was shown to have a significant effect on high flows in one of the catchments, particularly in summer. Therefore, while anthropogenic climate change may be a major driver of non-stationarity in peak flow data, it is very important that global change impacts are attributed reliably and the risk of 'climatisation' is avoided by taking non-climatic factors into account (Wine & Davison, 2019). The UK Benchmark Network (Harrigan et al., 2018) was developed to focus on near-natural catchments to try to isolate sources of change in high and low flows. Prosdocimi et al. (2019) took a more regional approach to trends in peak flow data using areal models and, amongst others, El Adlouni et al. (2007) used covariates other than time, such as changes in the North Atlantic Oscillation or urban extent (as previously mentioned), to fit a non-stationary flood frequency model.

| Guidance on climate change allowances for England
Agencies across the United Kingdom have provided guidance on the potential impacts of climate change on floods for many years, so that these can be accounted for by flood management authorities and local planners aiming to reduce flood risk (Reynard et al., 2017). The guidance has been updated very recently (EA, 2021a, 2021b), based on research combining the UKCP18 probabilistic projections (Murphy et al., 2009; Met Office Hadley Centre, 2018) with a sensitivity-based approach to the impacts of climate change on peak flows using national-scale modelling (Kay et al., 2020, 2021). The work presented here uses the previous guidance (EA, 2016, 2020a), which was also based on research that used a sensitivity-based approach but which combined the UKCP09 probabilistic projections (Murphy et al., 2009) and catchment-scale modelling (Kay et al., 2011, 2014). Both sets of guidance adopt a regional risk-based approach, but the latest guidance is differentiated by smaller regions. The guidance for flood management authorities (EA, 2016 and Table 1) provided a set of four numbers (Central, Higher Central, Upper and H++) for each of the 10 regions covering England and Wales (Figure 1), for three future time-slices (2020s, 2050s and 2080s). The 'Central', 'Higher Central' and 'Upper' numbers represent the upper range of estimated impacts of climate change on flood peaks from the UKCP09 projections. The H++ numbers represent plausible but unlikely high-end impacts of climate change. In principle, these values are used as percentage multiplicative factors to scale present-day stationary estimates of flood flows.
The guidance (EA, 2020a) recommends that the Central estimate of change should be used to define the risk over the design lifetime, with clear evidence that the analysis included reference to the H++, Upper and Higher Central estimates, depending on the vulnerability classification of the proposed development and its location. This is to manage the fuller range of risk, for example, building flexibility into the plan to allow future adjustments if necessary (Reynard et al., 2017).
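As a minimal sketch of how such percentage allowances act as multiplicative factors, the following Python snippet scales a present-day stationary estimate for each time-slice. The allowance percentages and the baseline flow are hypothetical placeholders chosen purely for illustration; they are not values from the EA guidance or from Table 1.

```python
# Hypothetical Central allowances (percent uplift) for a single region.
# These numbers are illustrative only and are NOT taken from EA guidance.
EXAMPLE_CENTRAL_ALLOWANCES = {"2020s": 10.0, "2050s": 20.0, "2080s": 30.0}


def apply_allowance(q_baseline, allowance_percent):
    """Scale a stationary flood estimate by a percentage uplift."""
    return q_baseline * (1.0 + allowance_percent / 100.0)


def uplifted_estimates(q_baseline, allowances):
    """Return the uplifted estimate for each time-slice in `allowances`."""
    return {epoch: apply_allowance(q_baseline, pct)
            for epoch, pct in allowances.items()}


if __name__ == "__main__":
    q20_present = 200.0  # hypothetical present-day Q20 in m3/s
    for epoch, q in uplifted_estimates(q20_present,
                                       EXAMPLE_CENTRAL_ALLOWANCES).items():
        print(epoch, round(q, 1))
```

The same function applies to any of the four allowance categories; only the percentage changes.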

| Issues when applying the guidance
Some of the issues that occur when applying climate change allowances for peak flows are complex, and  [Kay et al., 2014]). It is commonly noted (e.g., Macdonald et al., 2014) that this coincides with a flood-poor period in the United Kingdom which may distort future estimates of trend or flood frequency. More up-to-date baselines are available, but the use of this standard period is still widespread. Looking ahead, the impacts for the 2020s time-slice, which we now live in, are based on the potential climate change between the baseline period and the period 2015-2039, thus prompting the question of whether some of the climate change has 'already happened'. If so, there is the question of whether the application of the full allowance is still valid. Applying the full climate change allowances immediately can result in large increases for the earliest time-slices, which could be problematic for current civil engineering projects.
There is still the question of when to apply such allowances. Even if a clear trend is apparent in the AMAX data for a particular catchment, it could be for a range of reasons other than climate change, including natural climate variability. For example, the AMAX records for different catchments in the United Kingdom are very variable in length; some only have data for more recent years, while a small number of catchments have very long records. In the United Kingdom, the recommended method for statistical flood frequency estimation for ungauged sites (Kjeldsen et al., 2008) pools the flood peak data from a network of hydrologically similar sites, and therefore the pooling-group may contain data for different time periods with differing amounts of variability and trend.

| Current methods of applying allowances
Current guidance on the application of climate change allowances (CCAs) is somewhat open to interpretation. The allowances themselves were derived from climate projections from a 1961 to 1990 baseline. This article discusses various ways in which the climate change allowances could be interpreted. It compares how different extrapolations to 2025, 2050, and 2080 are affected on a regional scale, depending on which baseline is chosen and whether the baseline is assumed to be stationary. A set of case studies is also highlighted. Methods for the regional analysis and the case studies are outlined in Section 2, and results are presented in Section 3. Finally, Section 4 discusses the findings and the implications of the different approaches to CCAs.

| Data
In this article, annual maximum instantaneous flow data (AMAX) were taken from 381 stations from the National River Flow Archive (NRFA, 2021; nrfa.ceh.ac.uk) from England and Wales, restricted to those stations determined by England's Environment Agency to be 'suitable for non-stationary flood frequency analysis' (EA, 2020c, Section 2.2), based on the accuracy of the largest AMAX values, and having at least 30 years of record. The median record length of the stations is 48 years. In 80 cases, flow was corrected due to the updating of rating curves, the exclusion of some events, or the re-inclusion of previously rejected events.
The case study catchments were chosen as stations with different types of estimated trend in peak flow, and are highlighted in Figure 1. The Little Ouse at Abbey Heath is a catchment of 688 km² with 48 years of record in Anglia, with some rejected and missing data between 2000 and 2002. Land use in the catchment comprises predominantly arable agricultural land with an urban development, Thetford, located just upstream of the gauging station. It is a fairly dry catchment with an average annual rainfall (AAR) between 1961 and 1990 of 607 mm. The AMAX series shows a small but significant (p < 0.05) negative trend in peak flow.

| 48,007: Kennal at Ponsanooth
The Kennal at Ponsanooth is a small catchment of 26.5 km² with 48 years of record with high data quality in South West England. It shows no significant (p > 0.05) trend in the AMAX series. The station is affected by exports from Stithians Reservoir 4 miles upstream and abstraction for public water supply, leading to high attenuation of flow. It is mostly grassland with low urban extent and high percentage of baseflow. It is subject to relatively heavy rainfall, with AAR of 1294 mm.

| 76,005: Eden at Temple Sowerby
The Eden at Temple Sowerby is a large, steep catchment in Cumbria (616 km²) with 54 years of record in North West England. It features low anthropogenic influence, especially above low flows. It has been subject to a number of highly extreme rainfall events in the last 20 years, leading to exceptional flood events. It is almost entirely rural with no significant land-use change and AAR of 1142 mm. There are no large water bodies affecting storage, and the percentage of baseflow is moderate. The AMAX series shows significant (p < 0.05) positive trend according to a Mann-Kendall test.
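The significance statements for the three case study series rest on the Mann-Kendall test. A minimal sketch of that test is given below, assuming a series without tied values and using the usual normal approximation with continuity correction; it is an illustration of the technique rather than the code used in the study, and the example series is invented.

```python
import math


def mann_kendall(series):
    """Two-sided Mann-Kendall trend test (normal approximation, no tie correction).

    Returns (S, Z, p_value). Assumes at least three observations.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S when there are no ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p_value


if __name__ == "__main__":
    invented_amax = [42.0, 55.1, 47.3, 60.2, 58.8, 71.5, 66.0, 80.4, 77.9, 95.3]
    s, z, p = mann_kendall(invented_amax)
    print("S =", s, "Z =", round(z, 2), "significant at 5%:", p < 0.05)
```

A positive S with p < 0.05 corresponds to the 'significant positive trend' wording used for the Eden; records with many tied AMAX values would need the tie-corrected variance instead.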

| Method: Climate change allowances
CCAs and statistical extrapolation were applied using five methods, which either used the 1961-1990 baseline or the whole record, and may or may not have applied non-stationary fitting to this baseline. These are:
• Full Record Stationary (STFULL): Stationary flood frequency estimates (FFEs) were computed based on AMAX series for the whole period of the record, then CCAs were applied as a multiplicative factor. This is the method to which the other four were compared, and is the currently recommended approach for single at-site stationary flood frequency estimation (EA, 2020b and trend in location) evaluated at the three horizons to calculate Q20 (i.e., the flood flow with a 20-year return period, and 5% annual exceedance probability).
In all cases the generalised logistic (GLO) distribution was used as the standard distribution in England and Wales for flood frequency analysis (Kjeldsen et al., 2008). In all non-stationary cases, trend was introduced through a linear trend over time in the location parameter of the fitted GLO distribution using the Theil-Sen estimator. This trend was selected as it has fewer parameters to fit than, for example, a quadratic trend, improving the ability to robustly fit non-stationary distributions. Trend was not included in the shape or scale parameters due to the increased uncertainty inherent in such models based on time series of this length. For ST6190 and STFULL, parameters were estimated using standard L-moment methods (Hosking & Wallis, 1997) as in the UK Flood Estimation Handbook (FEH; Institute of Hydrology, 1999). For NSTEXT and NSTREP, the trend was computed using the Theil-Sen estimate based on the whole period of record. Parameter estimates (ξ(t), α, κ) were then computed using the maximum likelihood method. Scale and shape parameters were left stationary for the period of record. For NSTEXT, ξ(2019) was used, and for NSTREP, ξ(1990) was used. Coles and Dixon (1999) note that there can be differences between L-moment and maximum likelihood estimates. L-moments are used where possible to match current practice, but cannot yet be extended to non-stationary methods (Jones, 2013).
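The trend-in-location model described above can be sketched as follows. This is a minimal illustration and not the fitting code used in the study: it computes the Theil-Sen slope as the median of all pairwise slopes and evaluates a GLO quantile using the FEH form of the quantile function, with ξ(t) shifted linearly in time from a reference year. The function names, the reference-year parameterisation and every numerical value are assumptions made for illustration, and the maximum likelihood estimation of (ξ, α, κ) is omitted.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen_slope(years, flows):
    """Theil-Sen trend: median of all pairwise slopes (flow units per year)."""
    pairs = combinations(zip(years, flows), 2)
    return median((q2 - q1) / (t2 - t1)
                  for (t1, q1), (t2, q2) in pairs if t2 != t1)


def glo_quantile(return_period, xi, alpha, kappa):
    """GLO flow with the given return period (FEH form of the quantile function)."""
    y = return_period - 1.0
    if abs(kappa) < 1e-9:  # limiting (logistic) case as kappa tends to zero
        return xi + alpha * math.log(y)
    return xi + (alpha / kappa) * (1.0 - y ** (-kappa))


def location_at(year, xi_ref, ref_year, slope):
    """Linear trend in the location parameter: xi(t) = xi_ref + slope * (t - ref_year)."""
    return xi_ref + slope * (year - ref_year)


def nonstationary_q(return_period, year, xi_ref, ref_year, slope, alpha, kappa):
    """Quantile of the non-stationary GLO evaluated at a chosen year."""
    xi_t = location_at(year, xi_ref, ref_year, slope)
    return glo_quantile(return_period, xi_t, alpha, kappa)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    xi_1990, alpha, kappa, slope = 100.0, 20.0, -0.1, 0.5
    for year in (1990, 2025, 2050, 2080):
        q20 = nonstationary_q(20, year, xi_1990, 1990, slope, alpha, kappa)
        print(year, round(q20, 1))
```

Evaluating at 1990 corresponds to the NSTREP reading of the model, while evaluating at later years corresponds to the linear extrapolation assumed for NSTEXT. Because scale and shape are held fixed, the whole curve shifts by the same amount as ξ(t).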
For NSTEXT, a constant trend out to 2080 was assumed for this article, though this is not necessarily realistic; it is indicative of one possible way of considering future change by extrapolating historical changes. A pooled approach is not considered here, as manually checking and adjusting all the pooling-groups, as recommended in IH (1999), was not feasible.
For the purpose of example, only the 20-year return period events (5% annual exceedance probability, denoted Q20) are discussed here. In each case, Q20 was computed for the five cases above for all of the 447 stations; see Table 2 for a breakdown by region. Then the CCA Central estimates were applied to obtain estimates for 2025, 2050 and 2080. For 1961-1990 baselines, estimates are given for values as in 1990. Preliminary work suggested that these results were generally invariant to this choice of year.
To compare these approaches, ST6190, NST6190, NSTREP and NSTEXT (NSTEXT extrapolated to 2025, 2050 and 2080) were compared with STFULL. See Figure 2 for an illustrative example of the different values of Q20. Percentage differences between STFULL and the alternatives were computed. These percentage differences were summarised regionally, taking the median over each of the river basin districts (Figure 1).
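The comparison step above can be sketched in a few lines. The snippet below assumes a simple list of (region, alternative Q20, STFULL Q20) records, which is a hypothetical data layout, and uses a sign convention in which positive values mean that STFULL is the smaller estimate. The region labels and flows are invented.

```python
from collections import defaultdict
from statistics import median


def percent_difference(q_alternative, q_stfull):
    """Percentage difference of an alternative estimate relative to STFULL.

    Positive values indicate that the STFULL estimate is smaller.
    """
    return 100.0 * (q_alternative - q_stfull) / q_stfull


def regional_medians(records):
    """Median percentage difference per region.

    `records` is an iterable of (region, q_alternative, q_stfull) tuples.
    """
    by_region = defaultdict(list)
    for region, q_alt, q_full in records:
        by_region[region].append(percent_difference(q_alt, q_full))
    return {region: median(values) for region, values in by_region.items()}


if __name__ == "__main__":
    invented = [
        ("Region A", 110.0, 100.0),
        ("Region A", 90.0, 100.0),
        ("Region A", 105.0, 100.0),
        ("Region B", 80.0, 100.0),
    ]
    print(regional_medians(invented))
```

Using the median rather than the mean keeps a single station with an extreme difference from dominating a region that contains only a handful of stations.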

| Case studies
The three catchments outlined in Section 2.1 were chosen for their long records and on the basis of trends fitted in the preliminary analysis. The AMAX data for Little Ouse show a negative trend with only the location parameter changing over time. The record at Kennal displays a slight positive trend, while the data for the Eden show a positive trend with only the location parameter changing over time. Additionally, a pooled flood frequency curve is estimated using WINFAP 4's Enhanced Single-Site analysis (Wallingford Hydrosolutions, 2019), and these curves can be seen in Figures 6, 10 and 11.
Stationary flood frequency curves were fitted to the AMAX data using the GLO distribution for each of the three case study catchments and the existing climate change allowances were applied as for STFULL. The uplifted frequency estimates were then compared with those derived from a non-stationary frequency analysis as in NSTFULL. The importance of period of record was explored by fitting stationary flood frequency curves to data from 1961 to 1990 (reflecting the baseline used in the climate change impact modelling studies that underlie the existing guidance) and comparing them with stationary flood frequency curves fitted to all the AMAX data at each site.
FIGURE 2 Illustrative example of baseline periods and Q20 modified by CCAs
[...] seems to provide small percentage decreases in some places, with up to 25% decreases in Wales and South West England. However, NSTREP shows more consistently smaller percentage differences across all regions. This is consistent across horizons, which is not surprising given that the same CCAs were used for STFULL, ST6190 and NSTREP. NST6190 (as evaluated in 1990) seems to be similar to the stationary equivalent, but all the regions show a greater percentage reduction. For NSTEXT, which does not have CCAs applied, there is on average less difference than seen for NST6190 and the overall spatial pattern is more consistent. The increase over time in difference between NSTEXT and STFULL is more evident than for the other approaches.
Figure 4 restricts the set of stations to those with a significant trend according to Mann-Kendall (p < 0.05), and that trend is positive according to the Theil-Sen estimator. Table 2 shows that there are many fewer stations which satisfy this condition, and many regions contain very few stations with a significant and positive trend (e.g., 9% of stations in South West England have significant and positive trend). For this subset of stations, one can see a different picture.
ST6190 actually produces larger estimates than STFULL in South West England and Wales, though the large difference in South West England is based on only seven stations, so this is not necessarily representative of the region. For NST6190, the pattern of negative difference occurs in nearly all locations, except for South West England. For NSTEXT, the difference is very small in all regions, but is positive in Wales, and this is more pronounced for 2080. Overall, this suggests that those stations with significant trends may not follow the same patterns as those with less significant trend. It should be noted that, at the 5% significance level, 12 stations stopped showing a significant trend without their first value, and 15 stations stopped showing a significant trend without the 2015 value (Storm Desmond). This sensitivity to period of record is a known issue in trend detection (Griffin et al., 2019).
FIGURE 3 Q20 percentage differences for various horizons and baseline calculations compared with STFULL (positive values indicate STFULL is smaller)
FIGURE 4 Q20 percentage differences for various horizons and baseline calculations, restricted to stations with positive trend compared with STFULL

| Case study 1: Little Ouse at Abbey Heath
Figure 5 shows the AMAX data for Little Ouse, together with the effect of period of record on the estimate of QMED. The data show a negative trend, with QMED estimated over the 1961-1990 period (early in the AMAX record) indicated by a red line lying well above the value of QMED estimated over the entire record. This negative trend is not common in UK catchments, where most sites with significant trend are positively trending (Griffin et al., 2019).
FIGURE 5 AMAX data and Q_T estimates for 1961-1990 and for the whole period of record (Little Ouse)
Stationary single-site and pooled flood frequency curves are shown in Figure 6 fitted to the full AMAX record and an additional single-site curve is plotted based on data from 1961 to 1990. The AMAX data points were plotted with empirical return periods according to the Gringorten plotting position without accounting for any non-stationarity. The negative trend detected in the data is reflected in the position of the single-site frequency curve for the full record which lies below that of the 1961-1990 curve. The even flatter pooled frequency curve suggests that this site has more extreme flooding than similar stations. The current climate change allowances (Central) for the 2020s, 2050s and 2080s are plotted relative to the pooled frequency curve since this represents a commonly used practice in England and Wales. Because of the negative trend in the data, the percentage uplifts for the 2080s bring the frequency estimates roughly into line with the 1961-1990 frequency curve in this particular case. There is a single AMAX value that is plotted above all three stationary frequency curves, and this represents the highest AMAX value, recorded in 1968 at the beginning of the gauge record, which has a dominant influence on the trend in the data series. However, the negative trend is still present if this event is removed.
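For readers unfamiliar with the Gringorten plotting position used for the AMAX points in the frequency plots, a minimal sketch is given below. It assigns each ranked annual maximum the exceedance probability (r − 0.44)/(n + 0.12), where r is the rank in descending order and n the record length, and converts this to an empirical return period. The flows are invented, and no adjustment for non-stationarity is made, matching the description above.

```python
def gringorten_return_periods(amax):
    """Empirical return periods from the Gringorten plotting position.

    Returns a list of (flow, return_period_years), largest flow first.
    """
    n = len(amax)
    ranked = sorted(amax, reverse=True)
    results = []
    for rank, flow in enumerate(ranked, start=1):
        p_exceed = (rank - 0.44) / (n + 0.12)  # annual exceedance probability
        results.append((flow, 1.0 / p_exceed))
    return results


if __name__ == "__main__":
    invented_amax = [12.1, 30.5, 18.2, 25.0, 15.7, 22.3, 19.9, 14.4, 27.8, 16.6]
    for flow, period in gringorten_return_periods(invented_amax):
        print(flow, round(period, 2))
```

With this formula the largest value in a 48-year record plots at a return period of roughly 86 years, which is why a single early extreme can sit well above fitted curves.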

| Case study 2: Kennal at Ponsanooth
This time the trend in the data in Figures 7 and 8 is relatively small and there is little difference between the QMED estimate for the 1961-1990 baseline period and that for the complete period of record. Figure 8 compares the non-stationary estimate for the 20-year return period (Q20) with the current climate change allowances applied to the stationary flood frequency curve. It shows that the non-stationary Q20 estimates are broadly in line with the equivalent stationary estimates uplifted by the Central climate change allowance for the 2080s, assuming a linear extrapolation from the present day. This is a feature that is seen across many stations in the dataset that exhibit slight, possibly non-significant, levels of positive trend, which aligns with the precautionary principle from which the CCAs were developed. However, if applied to the pooled estimate (dot-dashed in Figure 8), the CCA uplifts would far exceed the at-site non-stationary estimate.

| Case study 3: Eden at Temple Sowerby
For the Eden catchment, there is little difference between QMED values calculated over the 1961-1990 period and the full period of record (Figure 9). The very high AMAX values recorded in the catchment in recent years are largely responsible for the marked positive trend apparent in the time series. Figure 10 highlights the possible changes in flood frequency curves when trends were included in location parameters. The dashed lines show the evolution of the non-stationary flood frequency curve over time, showing snapshots of it in 1990, 2020, 2050 and 2080. Since the trend is positive, the non-stationary flood frequency curve moves 'up' the graph over time. Compared with the stationary model, it can also be seen that the non-stationary flood frequency curves are 'flatter', suggesting a shape parameter closer to zero. This is because the stationary model has to account for all the points at once equally, so has to fit both the new, larger extremes and the older, smaller ones using a single set of stationary GLO parameters (see Griffin et al., 2019 for a discussion of this effect). The non-stationary model can, in some sense, exchange 'variance' for 'change over time' in a way that the stationary distribution cannot. One can think of the non-stationary distribution being fitted by looking at the start, then the middle, then the end of the data; since there is less difference in the most extreme events over these shorter periods, the curve is flatter.
FIGURE 7 AMAX data and QMED estimates for 1961-1990 and for the whole period of record (Kennal)
FIGURE 8 Comparison of non-stationary Q20 estimates with stationary estimate plus climate change allowance (Kennal)
However, it is noted that the pooled flood frequency curve is at least as 'flat' as the non-stationary fit, which suggests that this difference in performance may be less pronounced in the rest of the region, something which is corroborated by Figures 3 and 4, showing little difference between methods in North West England.
Despite the strength and significance of the trend, the non-stationary estimate lies below the stationary estimate with the Central allowance for climate change added, up to and including the 2080s period. In this example, the highest observations are much closer to the stationary frequency estimates when the Central climate change allowances are added, and they exceed the non-stationary estimates (Figure 10).
To assess the uncertainty associated with the flood frequency curves, and the effect that this uncertainty may have on the appropriateness of climate change allowances, 95% confidence intervals were computed using non-parametric bootstrapping as developed in Yan et al. (2017). Figure 11 shows this confidence interval for the stationary flood frequency curve based on the whole period of record for the Eden catchment, using 1000 bootstrap resamples. Here it can be seen that the confidence interval exceeds the climate change allowances by some margin, especially as the return period increases. This suggests that, although the point estimate gives plausible allowances, there is still some chance that these allowances could be exceeded, and that such extreme floods could possibly occur during future engineering design lives. Figure 10 also shows the 95% confidence interval for the non-stationary flood frequency curve as it appears in 2020. Here one can see that the flatter curve has a narrower confidence interval, but note that for 2050 and 2080, this whole confidence region was lifted up to make more extreme flood frequency curves more plausible given the data. One key point to observe however is that, although the allowances are larger than the non-stationary estimates (the point estimates which give rise to the plotted flood frequency curves), the confidence interval greatly exceeds them, offering the possibility that the flood magnitudes in 2020 (or 2050/2080) may be much greater than predicted by these statistical models.
FIGURE 9 AMAX data and QMED estimates for 1961-1990 and for the whole period of record (Eden)
FIGURE 10 Comparison of stationary and non-stationary models with confidence interval (Eden)
FIGURE 11 Stationary flood frequency curve for the full record shown with 95% confidence interval (Eden)
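The idea behind the bootstrap intervals can be illustrated with a plain percentile bootstrap for the stationary case. The sketch below is an assumption-laden simplification and is not claimed to reproduce the scheme of Yan et al. (2017): it resamples the AMAX series with replacement, refits a GLO distribution by L-moments (Hosking & Wallis, 1997) each time, and takes empirical percentiles of the resulting Q20 values. The data and all function names are invented for illustration.

```python
import math
import random


def sample_lmoments(data):
    """Return (l1, l2, t3) from unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2


def fit_glo(data):
    """L-moment estimates (xi, alpha, kappa) of the generalised logistic distribution."""
    l1, l2, t3 = sample_lmoments(data)
    kappa = -t3
    if abs(kappa) < 1e-9:  # logistic limit
        return l1, l2, 0.0
    kp = kappa * math.pi
    alpha = l2 * math.sin(kp) / kp
    xi = l1 - alpha * (1.0 / kappa - math.pi / math.sin(kp))
    return xi, alpha, kappa


def glo_quantile(return_period, xi, alpha, kappa):
    y = return_period - 1.0
    if abs(kappa) < 1e-9:
        return xi + alpha * math.log(y)
    return xi + (alpha / kappa) * (1.0 - y ** (-kappa))


def bootstrap_interval(data, return_period=20, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for a stationary GLO quantile."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = sorted(
        glo_quantile(return_period, *fit_glo(rng.choices(data, k=len(data))))
        for _ in range(n_boot)
    )
    lower_index = max(0, int((1.0 - level) / 2.0 * n_boot))
    upper_index = min(n_boot - 1, int((1.0 + level) / 2.0 * n_boot) - 1)
    return estimates[lower_index], estimates[upper_index]


if __name__ == "__main__":
    invented_amax = [50.0 + 3.0 * i + 5.0 * (i % 7) for i in range(40)]
    point = glo_quantile(20, *fit_glo(invented_amax))
    low, high = bootstrap_interval(invented_amax)
    print(round(low, 1), round(point, 1), round(high, 1))
```

Comparing the upper bound of such an interval with the uplifted point estimate is one way to see whether an allowance sits inside or outside the sampling uncertainty of the curve itself.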

| DISCUSSION
The aim of this article was to consider how climate change allowances to flood frequency estimates should be applied in catchments where non-stationarity has been detected in gauged flow records. This was analysed regionally, and for three specific case studies, comparing stationary and non-stationary estimates with and without CCAs.
On average, all the methods investigated in this article led to smaller estimates of the 20-year event than stationary estimates based on the whole record (STFULL). For ST6190, this may be due to the effect of the flood-poor period observed in the United Kingdom during this period (Macdonald & Sangster, 2017). Note that this difference is very slight in the non-stationary representative method (NSTREP). Restricting the analysis to only those stations with positive trend does not change this general observation, although the message is more mixed; some positive percentage differences are seen in South West England and Wales. It is possible that there are other variables, such as urbanisation, which may be causing this, but this regional averaging is not sensitive enough to identify such causes. South East England stands out as the region with the most variability between approaches. This could, as mentioned above, be because of the greater impact of urbanisation, or due to surface and groundwater abstractions.
Compared with STFULL, using a linear extrapolation of a non-stationary model (NSTEXT) has greater negative differences for more distant horizons, suggesting the two approaches are diverging for far-future estimation, which is not surprising. However, compared with ST6190 and NST6190, NSTEXT gives more spatially consistent differences, which are also smaller on average for the near-future horizon. This may be due to the stark jumps in CCA factors between adjacent regions, which the NSTEXT estimates do not include.
NSTREP, which aims to give a non-stationary representation of the 1961-1990 period but by making use of surrounding data to give a broader picture, shows high similarity with STFULL, and so could be considered as a method of compromise between ignoring more recent trends, while focusing on the period of record used to develop CCAs. The flood-rich or flood-poor periods which are another way of interpreting the longest records could have negative effects on the overall trends observed, especially if the observation period moves from a flood-poor period (such as the 1970s in the United Kingdom) to a flood-rich one (2000s).
It would be interesting to combine the climate change allowance regions with the use of regionalised trends as discussed in Kjeldsen and Prosdocimi (2021), as these may reduce the sensitivity of at-site trend estimates which can vary significantly within a region. However, such a method might not account for small scale variance due to land-use change affecting runoff.
For an approach which follows a precautionary principle, the present method of applying climate change allowances to a stationary estimate based on the whole period of record gives larger values of Q20 on average in the northern regions, compared with the other methods examined except for NSTREP. Central (50th percentile) climate change allowances appear to be appropriate in the examples presented here where no negative trend is observed. However, the uncertainty associated with statistical flood frequency is high, as indicated by the plotted confidence intervals in Figures 10 and 11. The guidance (EA, 2020a) is not explicit about whether CCAs should be applied to flood estimates derived from the whole record or from a sub-period; however, the use of just the 1961-1990 period seems to lead to smaller flood frequency estimates, which may not be suitable for more risk-averse engineering projects.
However, the detailed analysis of only three case study catchments with different degrees and directions of trend cannot be easily generalised. Making use of additional data, such as rainfall statistics, can reduce uncertainty in flow, but the uncertainty in rainfall would remain as a problem to be solved. Additionally, there is no clear way for policy makers to make assessments on future engineering projects without needing high quality models of future rainfall, though it would be fruitful if these were available. For example, applying the climate change allowances to the upper bound of the 95% confidence interval may be useful in circumstances requiring a highly precautionary estimate; compare this to the H++ uplift mentioned above. Some places use different approaches, such as El Salvador, which varies uplift by project type (Wasko et al., 2021).
Overall, the different methods lead to broadly similar messages in many parts of England and Wales, but the small pockets of stations with negative trend may not be well represented in these allowances.
It is recommended that, given the patterns observed, an appropriate method of applying existing climate change allowances is to fit a non-stationary distribution to the AMAX values but evaluate this non-stationary fit in 1990, as in the NSTREP method used here. Obviously, this should be compared with applying the CCAs to estimates derived from a stationary distribution, especially in the case where the trend is not significant, or where a significant trend is due to the influence of one or two extreme events at the start or end of the record.
It is hoped that this information and research can feed back into direct guidance for policy makers in a way which is both well researched and practical for practitioners to act on and implement.