Using lake sediment archives to improve understanding of flood magnitude and frequency: Recent extreme flooding in northwest UK

We present the first quantitative reconstruction of palaeofloods using lake sediments for the UK and show that for a large catchment in NW England the cluster of devastating floods from 1990 to present is without precedent in this 558‐year palaeo‐record. Our approach augments conventional flood magnitude and frequency (FMF) analyses with continuous lake sedimentary data to provide a longer‐term perspective on flood magnitude recurrence probabilities. The 2009 flood, the largest in >558 years, had a recurrence interval larger (1:2,200 year) than revealed by conventional flood estimation using shorter duration gauged single station records (1:1,700 year). Flood‐rich periods are non‐stationary in their correlation with climate indices, but the 1990‐2018 cluster is associated with warmer Northern Hemisphere Temperatures and positive Atlantic Multidecadal Oscillation. Monitored records rarely capture the largest floods and our palaeoflood series shows, for this catchment, such omissions undermine evaluations of future risk. Our approach provides an exemplar of how to derive centennial palaeoflood reconstructions from lakes coupled well with their catchments around the world. © 2019 The Authors. Earth Surface Processes and Landforms Published by John Wiley & Sons Ltd.


Introduction
Extreme flooding is the world's most damaging natural hazard affecting >100 million people each year (Di Baldassarre et al., 2013;Guha-Sapir et al., 2014). Simulations of 21 st century climate change project for many regions an increase in the frequency of occurrence of threatening floods, which currently cost the global community approximately US$1 trillion per year (Hallegatte et al., 2013;Guha-Sapir et al., 2014). A challenge for flood hazard managers and the insurance industry is to identify accurately the magnitude and risk of flooding to develop adequate protection. Flood magnitude and frequency (FMF) analyses are used by industry to estimate the magnitude (design specification for flood protection) and risk (probability of recurrence) of floods. Typically short duration (<50 years) gauging station data underpin most FMF analyses, either directly for single stations or by pooling station series with data from other gauged stations, but neither approach presents a full picture of past flooding (Wilhelm et al., 2013;Macdonald et al., 2014;Wilhelm et al., 2018a;Wilhelm et al., 2018b).
Incorporating additional extreme events from historical or palaeoflood archives can extend the temporal range of FMF analysis (Wilhelm et al., 2018a;Evin et al., 2019) if information on both event timing and magnitude can be secured.
Under the right circumstances, essentially basins well coupled to their catchments, discrete coarser sediment layers record past floods in the sedimentary records of lakes. To date sedimentary palaeoflood analyses have focused typically on event timing with limited estimation of event magnitude (e.g. Schillereff et al., 2014;Wilhelm et al., 2018a;Wilhelm et al., 2018b). Whilst flood frequency data are useful for exploring the relationship between flood phasing and other climate indices (Schulte et al., 2015;Schillereff et al., 2016;Wilhelm et al., 2018a;Wilhelm et al., 2018b) without quantification of event magnitude it is challenging to calculate reliable recurrence intervals (NERC, 1999). Here, we report the first lacustrine palaeoflood record series for a large catchment in the UK and one of few applications of quantitative FMF analyses to sedimentary palaeoflood data (Evin et al., 2019).

Study Site and Methodology
Bassenthwaite lake Bassenthwaite Lake, NW England ( Figure 1A), operates as a large (5.1 km 2 ) sediment sink receiving drainage and trapping sediments from more than 15% of the upland Lake District National Park (Thackeray et al., 2010). Bassenthwaite Lake is relatively shallow (maximum 20 m and average 5.3 m) and the major inflow rivers enter the south of the lake from the Derwent (254 km 2 ) and Newlands (49 km 2 ) catchments. These flows cross a shallow (< 3 m) low gradient littoral zone before descending to a profundal basin that is 200 m wide, 0.5 km from the inflows, 10-12 m deep, and has a pronounced 2-3 m high lip before descent to the deeper basin (19 m) that dominates the remainder of the lake ( Figure 1B) (Thackeray et al., 2010). The large catchment, high catchment relief (maximum elevation 950 m) and catchment-to-lake ratio (57:1) combine with average annual precipitation~2,077 mm to produce a 19-day average water retention time (Thackeray et al., 2010) and conditions ideal for preserving regional palaeoflood data. These conditions are a strong coupling between catchment and lake, a large catchment-to-lake area ratio, a relative simple lake basin geometry that reduces potential for remobilisation, abundant sediment in the catchment and stability to the configuration of inflow streams (Schillereff et al., 2014). Skiddaw Group silt/mudstones to the north and Borrowdale Volcanic Group (andesitic lava and tuff) series to the south form the catchment bedrock. The catchment landscape reflects the impacts of repeated glaciation resulting in steep slopes mantled with glacial debris, both in situ glacial sediments and those modified by periglacial activity. The regional bedrock has been subject to two phases of mineralization producing substantial quantities of metal ore, with a Lower Devonian (Caledonian) age 'chalcopyrite-pyrite-arsenopyrite' phase and early Carboniferous 'galena sphalerite' phase (Stanley and Vaughan, 1982). The catchment was mined for metals (Cu, Zn, Pb, Ba and Co) from the 1560s at numerous locations ( Figure 1A), with mining ceasing in the 1970s at Force Crag in the Bassenthwaite catchment (Tyler, 2003;Tyler, 2005a;Tyler, 2005b).
There are lakes upstream of Bassenthwaite Lake that form sediment traps for 120.1 km 2 of the catchment ( Figure 1A). The extensive pre-lake floodplain also moderates the downstream flux of sediments, but also is a source of sediment during floods from bank erosion. The dominant wind direction is towards the delta zone and so negates the effects of wind-driven re-suspension to our core sites (Thackeray et al., 2010). The cores were sampled from up-fetch of theoretically modelled regions of wind driven re-suspension in the lake (Schillereff et al., 2014). Boat generated suspended sediment is absent as boating is banned from the southern half of the lake. Thus, the overwhelmingly dominant sediment trajectories to the core site are via river inflows from the River Derwent and Newlands Beck catchments (Johnson and Warburton, 2006;Warburton, 2010).
The Bassenthwaite catchment has experienced a series of recent extreme floods, 2005, 2009 and 2015, with each of these associated with warm, moist south-westerly airstreams and with deep Atlantic depressions tracking north-eastwards across the UK Parry et al., 2016). During the largest event from 4 to 6 December 2015, NW England was hit by exceptional and record-breaking precipitation (Storm Desmond) in the Derwent catchment headwaters. The event set new records for 24 hour rainfall totals e.g. Honister Pass 341.4 mm and 322.6 mm at Thirlmere, with rainfall frequency models estimating a local return period of about 1300 years, a 0.08% annual probability Burt et al., 2016;Parry et al., 2016). Flood levels for the River Derwent at Portinscale evidence a high magnitude event (365 m 3 s -1 ), impacting catastrophically on the population and infrastructure.
In November 2009, a weather front became stationary over Cumbria, resulting in prolonged (36 hour) and heavy rainfall (Miller et al., 2013). High altitude areas received more than 400 mm of rainfall (72-hours), setting a then 24 hour rainfall record for the UK in the Bassenthwaite catchment (316 mm, Recurrence Interval~1,862 years) (Stewart et al., 2012). The resulting peak flow into Bassenthwaite was estimated as 402 m 3 s -1 for the River Derwent, with a single-site return period of 228 years (95% confidence limit of >40 and <50,000) (Miller et al., 2013;Wong et al., 2015). At the lake outflow (Derwent at Ouse Bridge: UK Environment Agency), discharge Figure 1. A. Location of Bassenthwaite Lake; the core sites, river gauging stations, metal mines and the active and passive (hatched) sediment supply sectors. B. Core locations, bathymetry and topographical cross-sections (20x vertical exaggeration). C. Line-scan photographs, X-radiographs and 90 th percentile (D90) of the PSD data for the three cores (grey: raw D90 data, black: D90 smoothed 25-period centrally weighted function). [Colour figure can be viewed at wileyonlinelibrary.com] 2367 EXTREME FLOODING IS THE WORLD'S MOST DAMAGING NATURAL HAZARD was estimated at 378 m 3 s -1 with a single-site return period of 311 (95% confidence limits of >50 <50,000) years (Miller et al., 2013). In January 2005 a westerly airstream generated a near stationary weather front across northern England and southern Scotland that contributed heavy rainfall (<180.4 mm) across Cumbria with an estimated return frequency of <200 years (Stewart et al., 2012) resulting in an estimated peak flow of 219 m 3 s -1 on the Derwent (Portinscale) (Miller et al., 2013).
In summary, the river discharges recorded over 40 years for the Derwent at Portinscale show the most three extreme events in the last two decades. These floods have resulted in economic damages totalling £276 million in 2009, and £1.6 billion in 2015 (Storm Desmond) (Environment Agency, 2018) loss of life, extensive flooding of property, the destruction and disruption of infrastructure and the largest river channel adjustments on record (Miller et al., 2013, Wong et al., 2015.

Field and Laboratory Sampling
Sediment cores (BASS-2, BASS-2010 and BASS-2016) came from multiple adjacent sites in the gentle gradient profundal basin that fronts both the Derwent and Newlands river inflows to Bassenthwaite Lake ( Figure 1B). The basin has flanking slopes at less than 1-2°, which produces minimal potential for debris flows, wave resuspension and sediment remobilization. BASS-2 (0.8 m: Figure 1C) was sampled collecting the sediment water interface using a piston mini-corer and a gravity corer. Longer sediment records at BASS-2010 and BASS-2016 (both 2.8 m; Figure 1C) were taken from a stable floating platform and comprise overlapped 1.5 × 0.07 m core lengths and the intact sediment water interface by gravity corer (Boyle, 1995). BASS-2 and BASS-2010 locations (originally sampled 2010) after the November 2009 flood were revisited after the winter storms of 2015/16 (e.g. Storm Desmond) to collect gravity cores and extend the temporal record. Extrusion at 5 mm contiguous intervals produced subsamples from the majority of our piston and gravity cores, with the recent cores after Storm Desmond (December 2015) sampled at 2.5 mm contiguous intervals to characterise better this most recent extreme flood and the preceding flood lamination from November 2009. The stratigraphy in the Russian-style cores is largely uniform in colour comprising grey-brown limnic muds, whereas x-radiographs (BASS-2010, BASS-2016) reveal some finer than 5 mm laminations that reflect subtle variations in sediment density (Figure 1C). Given the lack of a consistent and clear visual stratigraphy ( Figure 1C); the palaeoflood records for the sediments were generated using the particle size statistics measured by laser granulometry on the subsamples.

Particle Size and Geochemical Analyses
Sediment geochemistry was determined via X-ray fluorescence (XRF) using two approaches. BASS-2 and all gravity cores were measured using 2.5-5 mm interval discrete samples on a dry-mass basis using a Bruker S2 Ranger energydispersive instrument Schillereff et al., 2015). The BASS-2010 and BASS-2016 cores were cleaned, with the BASS-2016 core photographed at 15 μm resolution using the Line-scan camera fitted to the Liverpool Geotek Multi-sensor Core Logger (MSCL) ( Figure 1C). Both cores were covered with 5 μm polypropylene film and measured at 5 mm intervals on a wet sediment basis using an Olympus Delta energy dispersive μXRF mounted on the Liverpool Geotek MSCL . They were also measured on the Itrax-Core scanner (BOSCORF, National Oceanography Centre, Southampton) using a step size of 300 μm and a measurement dwell time of 30 seconds. A 22 mm wide radiographic image was also collected using the Itrax-CS as a digital, continuous-strip down the core for both BASS-2010 and BASS-2016 ( Figure 1C). Wet sediment element concentrations were converted to dry-weight equivalent using a training set of dried samples measured on the S2 Ranger . The XRFs undergo regular laboratory consistency checks using certified reference materials (e.g. Boyle et al., 2015).
The palaeoflood record was generated using particle size data measured by laser granulometry, given the strong sediment hydrodynamic relationship with river discharge (see reviews in Schillereff et al., 2014, Wilhelm et al., 2018b and also the patchy visual signature for the coarse grained units at Bassenthwaite ( Figure 1C). Analysis using a Coulter LS 13 320 Single-Wavelength Laser Diffraction Particle Size Analyser, which determines the dimensions of individual particles 0.375-2000 μm, measured particle size distributions (PSD) for the subsamples. Digestion in H 2 O 2 removed the organic component, with the samples then dispersed and sonicated in Na 6 O 18 P 6 and analysed under sonicating measurement conditions. Results are the average of three repeats after elimination of outliers, a process that minimises intra-sample noise in the laser granulometry. The Coulter LS 13 320 undergoes regular calibration checks using samples with known size distributions. Particle size frequency statistics were calculated using standard geometric formulae (Folk and Ward, 1957) using the GRADISTAT 8.0 software (Blott and Pye, 2001). Further data exploration involved a statistical end-member analysis using the EMMAgeo package (Dietze and Dietze, 2016) to identify and quantify the proportions of significant PSD end members (EM) using a principal components methodology implemented in R (R Core Team, 2017).

Geochronology
Sediment core chronologies were generated by Bayesian agedepth modelling (Blaauw and Christen, 2011) of radioisotope dating ( 210 Pb, 137 Cs and 14 C) and using a metals geochemical stratigraphy (Pb, Zn, Ba and Cu) that reflects the catchment history of mineral ore production. Generating robust chronological control for sediment sequences spanning the last 500 years is difficult and compounded by significant changes in sedimentation rate (SR) rendering 210 Pb dating (Appleby and Oldfield, 1978;Appleby, 2008;Appleby, 2013) challenging. However, the rich record of historical (CE 1560(CE -1950 metal pollution for the Bassenthwaite catchment ( Figure 2B) provides data for an age-depth model generated for the lake sediments (Tyler, 2003;Tyler, 2005a;Tyler, 2005b). We recovered repeatable geochemical records for the metals Cu, Pb, Zn, and Ba from multiple core sites ( Figure 2C) and these data show a strong match to the history of metal ore production collated for mines in the active catchment ( Figure 2B) (Tyler, 2003, Tyler, 2005a, Tyler, 2005b. Coherence of the inflow-derived metal stratigraphic record supports the lack of major resuspension of sediments by wind driven stress in this region of the lake (Thackeray et al., 2010;Schillereff et al., 2014). From the metals data ( Figure 2C: repeatable across these three and other cores) a series of chronological marker horizons (Table I) were identified for the sediments spanning the last 400 years. Pb concentrations have been plotted against both linear and log axes for BASS-2010 and BASS-2016, because the high concentrations from CE 1800 dwarf smaller in magnitude patterns in the earlier record ( Figure 2C). 2368 R. C. CHIVERRELL ET AL.
Radiometric dating, using 137 Cs (half-life of 30 yr) and 210 Pb (half-life of 22.4 yr) was carried out for BASS-2 using established methods (Croudace et al., 2012;Miller et al., 2014;Sjögren et al., 2017) at the GAU-Radioanalytical Laboratories (National Oceanography Centre, Southampton). Bulk dried and homogenised samples were counted for 100,000seconds in scintillation vials using Canberra well-type HPGe spectrometers (Canberra Industries, Harwell, UK). Gamma spectra were processed using FITZPEAKS gamma deconvolving software (JF Computing, Stanford in the Vale, UK). The photopeak areas for 210 Pb (46 keV), 137 Cs (661 keV) and other gamma radionuclides were calibrated using a certified NPL radionuclide mixture dispersed in a sediment matrix. Detection limits are nominally 5 Bg/kg for 210 Pb and 2 Bq/kg for 137 Cs. The 210 Pb activity vs depth profile ( Figure 3A) shows a general exponential form implying a uniform sediment rate for the sequence down to approximately 40 cm depth, below which patterns in the measured 210 PB concentrations make age estimation unreliable. The anthropogenic radionuclide 137 Cs profile for Bass2 ( Figure 3B) shows two key datable features: the CE 1963 'main bomb pulse' 137 Cs from atmospheric nuclear testing and the CE 1986 Chernobyl spike. These features Cs) in 1963 and 1986, the modelled weighted mean age (red pecked line), the 2 σ age probability (greyscale), and 2 σ uncertainty age range (grey pecked lines). B. Metal ore production in tons from mines in the catchment (Tyler, 2003;Tyler, 2005a;Tyler, 2005b). C. Metal concentration data measured by XRF plotted against age derived from agedepth models for BASS-2016, BASS-2010 and BASS-2. Pb (solid red), Zn (blue), Ba (green) and Cu (grey fill). Pb is plotted for BASS-2016 and BASS-2010 with a log axis (red dashed line) to show patterns at lower concentrations. Time markers and their age extend across the plot (see Table I).
[Colour figure can be viewed at wileyonlinelibrary.com] identify an average sedimentation rate of 0.87 cm/y. A linear regression of the Ln 210 Pb data (less a supported 210 Pb of 10 Bq/kg; Simple Model) gives a sedimentation rate of 0.90 cm/y which is consistent with the average from 137 Cs dating ( Figure 3D). The markers for mined heavy metals and 137 Cs horizons ( Figure 3C-D) incorporated and show a strong fit to the independent 210 Pb radioisotope chronology (Figure 3). The single 14 C age determination of 329 ± 24 BP (SUERC-41877) from a terrestrial leaf fragment at 2.68 m depth on BASS-2010 is a good fit to the longer metals stratigraphy (Figure 2A).
The final age models were produced independently for each core profile, using all the geochronological control using the Bayesian package 'rBacon' (Blaauw and Christen, 2011;Blaauw and Andres Christen, 2018) operating in R (R Core Team, 2017). Markov Chain Monte Carlo repetitions in the age-depth modelling were constrained by a gamma distribution with mean 2 yr mm -1 and shape 1.5 and a beta distribution with mean 0.4 and memory strength 25. The lack of a consistently clear visual stratigraphy for the core negated the incorporation of geologically instantaneous event laminations in the agedepth model. Sedimentation rates throughout the profile average 6 mm yr -1 , and vary from to 4-10 mm yr -1 with the exception of the high water content surface sediments. Thus the 5 mm contiguous sampling implicit in the PSD analyses are close to an annual resolution. The timing of events discerned in the palaeoflood series uses the probability weighted mean age and 95% uncertainties from the Bayesian age-depth model (Figure 2A).

Developing the Sedimentary Palaeoflood RecordT
he age-depth model displays a series of variations in SR ( Figure 4C) and these are largely mirrored in increased flux of lithogenic elements (e.g. K: Figure 4C), and these are interpreted as responses to catchment soil/sediment erosion conditioned mostly by agricultural intensification and a minor contribution from metal mining. Timescales for patterns present in regional palaeo-vegetation data (Pennington, 1981;Pennington, 1991;Chiverrell, 2006) support linkage of these erosion phases to agricultural intensification. The PSDs (Figure 4) show repeated and regular laminations that contain a coarser component. End-Member (EM) analyses (Dietze et al., 2012) for an integrated dataset from the cores decomposes the PSDs into two groupings that explain 87% of the variance in the dataset ( Figure 4F). EM1 comprises fine silt (7 -8 μm), whereas EM2 is coarser and comprises a fine sand (mode = 63 μm). Where present EM2 forms more than 30% of the PSD and occurs as discrete laminations or beds ( Figure 4B). The 90th percentile of the PSDs (D90) is a strong measure of this coarse EM2 and characterises the falling limb of the distributions. Our process interpretations for the end members is that EM1 reflect background or normal accumulation of fine grained gyttja and the EM2 coarser deposits lain down under flood conditions. The absence of further statistically significant PSDs end members, we interpret as reflecting that other process end members are not a component of the Bassenthwaite Lake record. At other deeper British lakes, with steeper gradient lake beds, we have encountered multimodal poorly sorted PSDs as layers, which are probably subaqueous debris flows and have characteristics similar to equivalent deposits in other environments (Håkanson and Jansson, 1983;Mulder and Alexander, 2001). Peaks in the D90 and EM2 form repeated subcentimetre coarser laminations that display poor sorting and right-skewed distributions. Lower-frequency down core changes in EM2 and the D90 show the sediments are coarser grained when there are increases in the SR and/or the flux of lithogenic elements ( Figure 4C), e.g. K which is abundant in the catchment bedrock, sediments and soils. These changes reflect subtle variations in sediment availability to the fluvial system, typically driven in the longer term by patterns of land use change and high magnitude erosion events enriching the channel system with coarser sediment. The smoothed D90 curve describes these changes in the baseline grain size regime and they are subtle varying in the range 60-40 μm ( Figure 4D). Coarse laminations dominated by EM2 punctuate the smoothed record with the increases in D90 much larger than the background variations. These coarse laminations are present throughout the record therefore the inflow channels and pre-lake floodplain appears to have provided abundant sediment to Bassenthwaite Lake (Orr et al., 2004;Warburton, 2010). Changes in the D90 smoothed baseline suggest that the catchment sedimentary regime is not stationary. To correct for this we calculated i) a centrally weighted 25-point (~125 mm) smoothing function (D90smoothed) and ii) the residuals of D90 against D90smoothed. This D90residual expresses the PSDs in terms of their magnitude (μm) within a 40 year moving temporal window that is similar in length to gauged recorded river flow records (Miller et al., 2013). We undertook a sensitivity analysis of the effect of our sampling resolution on this D90residual. Samples were taken at 2.5 mm resolution over the period containing the most recent known extreme floods (2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016) and measured for PSD. The PSD were then combined to intervals equating to 5, 10, 15 and 20 mm contiguous sampling, and the D90residuals were recalculated. The resulting variance across known discrete extreme events was~5%. Applying this variance to the entire palaeoflood dataset suggests our sampling resolution has no impact on the estimation of relative magnitudes for extreme events.
We are cautious about allocating specific flood layers to historical documented floods; rather we assign each sedimentary flood a ±2σ age range derived from the Bayesian age-depth models (Figure 2A). The precise event lamination is regarded therefore as constrained to a defined period within which historic floods may fall. BASS-2016 extends the duration of this palaeoflood record and shows cluster of events at 1840-1925 and 1930-2016 ( Figure 4E) that match flood-rich periods in both the regional and local historical data (Black and Law, 2004;Macdonald and Sangster, 2017). The magnitude of the largest floods, 12/2015 (364 m 3 s -1 ), 11/2009 (402 m 3 s -1 ), 8/2005 (219 m 3 s -1 ) and 1/1995 (159.6 m 3 s -1 ) (Derwent at Portinscale), broadly matches the magnitude ranking in the D90residual ( Figure 5). We are therefore confident that our reconstructed palaeoflood series from lake sediments is recording the structure and individual flood hydrology of the catchment. Extending the flood record back to 1460 reveals distinct earlier flood-rich phases (c.1460-1575, c.1630-1720), which are concomitant with events present in historical datasets for northwest England (Macdonald and Sangster, 2017).
For the past 43 years the sediment flood record shows a significant correlation (r 2 = 89%) with a gauged discharge (m 3 s -1 ) flood series for the Derwent (Portinscale) (Figures 5C).
To generate a training data set for flood frequency analysis (FMF) including assessment of flood magnitude we allocated peaks in the D90residual from the recent gauged period  to flood discharges ( Figure 5). The gauged period contains a range of floods including four recent extreme events. Best-fit regression was used to define an exponential function (r 2 = 89%, p<0.0001) describing the relationship between recorded discharges and D90residual ( Figure 5C). Applying this function to the entire D90residual time series produced a sedimentary palaeo-discharge series ( Figure 6A). The precision of the reconstructed discharges are +/-the uncertainty inherent the exponential function as with any model, but we are confident in the relative magnitudes that are important for using the time series in subsequent FMF. To assess the impact of calculating the D90residual as a palaeoflood proxy, an equivalent best fit (linear) regression model was generated for the relationship between recorded discharges and raw D90 (r 2 = 70%, p<0.0001), and the application of that to the entire D90 time series is virtually identical to the discharge reconstruction using the D90residuals. Thus, whilst there is some non- Events entries for the Derwent/Cocker catchment (http://www.cbhe.hydrology.org.uk/) (Black and Law, 2004). F. PSDs of the statistically significant particle size end members. [Colour figure can be viewed at wileyonlinelibrary.com] 2371 EXTREME FLOODING IS THE WORLD'S MOST DAMAGING NATURAL HAZARD stationarity due to changes in catchment state arising from agricultural land-use intensity; it appears to have little or no impact on the grain sizes delivered to Bassenthwaite Lake in flood events over the last 558 years. Detrending the flood series has no negative impacts on the flood series, for example, there is no exaggeration of the extremes of the past two decades, with  the extreme peaks reducing in magnitude in the calculation of the D90residual.

Flood Magnitude Frequency (Fmf) Analysis
Here we explore the impact of including the palaeoflood series in FMF analyses, with the analysis involving fitting of a Generalised Logistic (GLO) distribution to the reconstructed series of peak flow events (m 3 s -1 ) based on the method of L-moments (Robson and Reed, 1999;Castellarin et al., 2012). The logistic reduced variate generated permits the cumulative probability distribution to be plotted in linear relationship to the discharge, assuming a logistic (no skewness) distribution. No conclusive method has been developed within the L-moment framework for combining gauged data with censored (either historical or sedimentary) events in the period pre-dating the installation of a gauging station. A probabilistic model  has been proposed for censored annual maximum series formulated as a maximum-likelihood function (NERC, 1999). This model assumes that events from the gauged and historical/sedimentary period are independent and follow a GLO distribution with a Probability Density Function (PDF) and Cumulative Density Function (CDF).
The FMF analyses drew upon four flood time series: 1 a single station (Portinscale) gauged series 2016 to 1970 ( Figure 5A), 2 the sedimentary proxy discharge data for BASS-2016 ( Figure 6A), 3 the sedimentary series expressed as Peaks Over Threshold (POT) with events <115 m 3 s -1 excluded , and 4 the gauged series 2016 to 1970 augmented by the sedimentary series from CE 1970 to 1460 with events <115 m 3 s -1 excluded.
The 115 m 3 s -1 threshold was identified to ensure a similar number of high magnitude floods were included in both the gauged and sedimentary data series, and was based on no two events within any given year in either the sedimentary or instrumental series exceeding the >115 m 3 s -1 threshold . Using a high threshold ensures that it can be reasonably concluded that the two data series (gauged and sedimentary) are both realizations of the same underlying distribution (Macdonald and Sangster, 2017).
Confidence intervals for the different flood frequency curves are presented ( Figure 6A-C), using the Delta method adopted to obtain a confidence interval for a given annual threshold event (e.g. 1 in 100 or 1 in 1,000 years) (as shown in Kjeldsen and Jones, 2006). The fitted GLO models ( Figure 6B) are plotted with 95% confidence intervals against the observed AMAX showing: the GLO distribution fitted directly to the 43 annual maximum events in the gauged record (Gringorten plotting position); and the GLO distribution fitted to the sedimentary data series (for assumptions of known and unknown peak discharges) . In applying a censored approach (>115m 3 s -1 ; Figure 6B-C) the censored sedimentary series is statistically treated as Peak-Over-Threshold (POT) data . These differing series highlight an important difference between 1) water dominated floods reflected in gauged records and 2) the sediment dominated floods implicit to the lake palaeorecord. More often than not for the larger floods water and sediment dominated overlap, but in identifying event magnitude from the sediments then we are also capturing the importance of sediment supply in the flood event, which is an aspect not present in the gauged record.
The single site (gauged) FMF analysis ( Figure 6A) compares well with previously published FMF analysis at Camerton, further downstream in the Derwent catchment, with single site and a data pooling approach used (Miller et al., 2013). Pooling is a conventional approach in flood frequency estimation used when a single discharge series is of insufficient length in determining a required estimate for a design flood (Robson and Reed, 1999). Pooled FMF categorized the November 2009 flood as the largest on record with a return period of between 50 and 500 years (Miller et al., 2013). Here, our single station FMF analysis using the gauged annual maximum series for Portinscale suggests a return period of 1,700 years for the 2009 flood ( Figure 6A). Incorporating the full palaeoflood record into the FMF analysis increases this return period tõ 10,000 years reducing to~2,200 years when the series was censored to the peaks >115 m 3 s -1 threshold ( Figure 6B). Incorporating our threshold censored palaeoflood data into the FMF analysis reduces the magnitude of the estimated 100-and 1,000-year events compared to that generated using the 43 year gauged series alone, reducing from~215 to~182 m 3 s -1 and from~365 to~300 m 3 s -1 respectively ( Figure 6B). Our palaeoflood data ( Figure 6B) show that~2,200 year (1,050 to >10,000 years: 95% confidence interval) return period for 2009 flooding was rarer than indicated by the short gauged record ( Figure 6A) and the 50-500 year return period calculated in FMF analysis conducted before Storm Desmond for the Derwent (Camerton) using a data pooling approach (see Miller et al., 2013).
FMF analysis using the palaeoflood series returns ã 750 year (300 to >10,000 years) return period for the 2015 flood (Storm Desmond), which is slightly shorter than the 1,000 year (375 to >10,000) indicated by equivalent single station analysis of the Derwent (Portinscale) gauged series ( Figure 6A). The discrepancy for the 2015 event reflects the lower discharge reconstructed from the sediment palaeorecord. FMF analysis using the gauged series augmented by sedimentary palaeoflood data applying a censored approach (>115 m 3 s -1 ; Figure 6C), shows the impact of including a further high magnitude event on the analyses. Combining gauged with the sedimentary series produces estimated magnitudes for 100-and 1,000-year events of 170 m 3 s -1 and 250 m 3 s -1 respectively, and the return periods for the 2009 and 2015 to~20,000 and~14,000 years respectively. The sedimentary palaeoflood series reveals that the magnitudes of floods over the last 20-25 years form a cluster without equal and unprecedented in~558 years ( Figure 7A).

Analysis of Climate Indices
Comparing records of past precipitation and associated fluvial flooding to a range of climate and ocean indices can improve our understanding of the forcing of extreme weather. Instrumental and modelled data (Sutton and Dong, 2012;Huntingford et al., 2014) record a relationship between winter (DJF) North Atlantic Oscillation (wNAO) and rainfall over NW England. Positive wNAO indices are associated with higher precipitation and river flows via increased cyclonicity and enhanced zonal flow over the region (Hannaford and Marsh, 2008;Lavers et al., 2011). Since wNAO tends are positively correlated with Mean Annual Temperature, it is likely that the moisture content of westerly air masses will also increase when the wNAO index is strongly positive, amplifying the impact of increased airflow Marsh, 2008, Lavers et al., 2011). Thus, the effects of warming northern hemisphere 2373 EXTREME FLOODING IS THE WORLD'S MOST DAMAGING NATURAL HAZARD temperatures (NHT) could potentially increase the moisture content of westerly airflows during positive wNAO. There is observational and modelling evidence of increases in rainfall over NW Europe associated with a positive wNAO index and a switch to a more positive phase of Atlantic Multidecadal Oscillation (AMO) (Sutton and Dong, 2012).
In an attempt to move beyond visual comparison of curves (Figure 7), we have explored the strength of the relationship between our palaeorecord of POT and a series of climate indices linked with the occurrence of storms and river floods in NW Europe (Blöschl et al., 2017). The duration of our POT series extends beyond the measured wNAO and AMO (red lines: Figure 7B, C), and so we have used a composite series (black lines: Figure 7B, C) that is integrated with published longer reconstructed time series. These series were: i) a tree ring based reconstruction of the AMO (Gray et al., 2004) back to 1567 ( Figure 7B); ii) a multi-proxy reconstruction of wNAO (Cook et al., 2002) back to 1400 ( Figure 7C); and iii) an annuallyresolved reconstruction of northern hemisphere temperatures for the last millennium (Shi et al., 2013) from annually-resolved palaeo-data ( Figure 7D). Each of these proxy records performs well against the equivalent measured data for a >150 year duration calibration period (Cook et al., 2002;Gray et al., 2004;Shi et al., 2013), and is of sufficient duration for comparison with our POT series that extends to 1460.
There is no significant correlation (Pearson product moment) on a whole record basis for the D90residual or for the D90residual record smoothed with a 40 year moving average with the selected climate indices ( Figure 7A). The calculated correlation analyses for a time-stepped 50 year moving window through the datasets highlights periods with significant positive (red) and negative (blue) correlation using 99% significance limits (right panel: Figure 7B-D). Limitations to this analysis stem from comparing the sedimentary palaeoflood series with an average chronological uncertainty of ±18.5 years (2σ range 1.4 to 50.1 years) with annually resolved reconstructions, alongside the uncertainties inherent in the selected reconstructions (Cook et al., 2002, Gray et al., 2004, Shi et al., 2013, and should be viewed in the context that a phase with a significant correlation does not implicitly mean causation. Thus, we do not attempt a mechanistic interpretation; rather our data ( Figure

Conclusions
We show the value of lake sediments in extending the flood series used to estimate the probability of high extreme magnitude events. Short flood records and non-stationarity in climate time series can hamper flood design estimation. Our results illustrate that increasing the record length, here using sedimentary palaeoflood data, has lengthened the return period estimates for these events produced by FMF analyses. Reconstructions of flood series have focused less on variability in flood magnitude and more on the phasing of flood-rich and poor periods (Hall et al., 2014). This focus reflects the challenges in reconstructing event magnitude rather than just frequency, and that flood phases rather than individual events are easier to attribute to climatic forcing. UK-scale assessments of recent flooding (Wilby and Quinn, 2013) show that during the last 140 years the recent flood-rich decade is unusual, but not unprecedented. Our continuous 558-year palaeoflood time series that incorporates both event magnitude and frequency shows that the extreme floods (top 1%) whilst infrequent occur in groups and that the most extreme floods in our series clustered between 1990 and 2018 ( Figure 7). Potentially, FMF analyses derived using short gauged time series during a cluster of extremes could produce recurrence probabilities that are too frequent; which might lead to higher estimates of flood risk and costly over-design of flood defences. Thus, being able to identify the longer-term context for the shorter gauged records is critical in helping refine plans for flood protection. Furthermore, our focus on magnitude frequency relationships for palaeofloods targets the geomorphologically important events, and provides an assessment based on the sediment-floods associated with societal flood impact (e.g. erosive floods). Calibrated reconstructions of long-term flood magnitude can also help more accurately parameterize statements about the "unprecedented" nature of floods and long-term data that our approach can yield require reconsideration of the definitions of unprecedented flooding. Our approach is widely applicable at lakes that are well-coupled to the hydrodynamics of their catchments (see reviews in Schillereff et al., 2014; n=152 such lakes in UK alone) offering the possibility of quantitative long term flood series for different hydroclimatic regions of the world. Future research should focus on developing such records for other hydroclimatic areas to understand better the drivers and responses to regional variability and extension of records back into periods characterized by warmer NHT such as the Medieval Climatic Optimum.