Developing Low‐Likelihood Climate Storylines for Extreme Precipitation Over Central Europe

Heavy precipitation and associated flooding during the cold season, such as the 1993 flood in central Europe (CEU), are a major threat to society and ecosystems. Due to the lack of long homogeneous climate data and methodological frameworks, it is challenging to estimate how extreme precipitation could get and what the physical drivers are. This study presents two complementary strategies to extrapolate beyond the precipitation records: (a) statistical estimates based on fitting generalized extreme value distributions, which provide probabilistic information on return periods, and (b) ensemble boosting, a model‐based re‐initialization of heavy precipitation events in large ensembles, which provides physically coherent storylines in space and time but no direct quantification of their probability. Both show that 3‐day accumulated precipitation maxima over CEU can be substantially exceeded, by around 30%–40%, and even higher magnitudes cannot be ruled out in the near future. An empirical orthogonal function analysis reveals that certain sea level pressure patterns, partly reminiscent of atmospheric rivers, are more often associated with heavy precipitation than with more moderate events. Additionally, ensemble boosting is a suitable tool for case studies that analyze how extreme a heavy precipitation event such as the one in 1993 can become in simulations. When a 1993‐analog is boosted, one‐quarter of the resulting storylines show more rainfall than observed, due to a stronger north‐south pressure gradient that may have exacerbated the flooding. Overall, the precipitation estimates demonstrate that ensemble boosting is a complementary method to statistical tools and is suitable for stress testing, for example, of infrastructure protection measures against potentially unseen heavy precipitation events.

Atmospheric rivers can be understood as footprints of cyclones on their poleward travel from the subtropics rather than as actual rivers with direct water vapor transport (Dacre et al., 2015; Zhu & Newell, 1998). The water vapor is transported along the cold front through a continuous cycle of evaporation from the sea surface and local water vapor convergence. Many annual maximum precipitation days in western Europe and CEU are caused by such atmospheric rivers (Lavers & Villarini, 2013, 2015; Pasquier et al., 2019).
Taking into account the anthropogenic warming of the atmosphere and its higher water vapor holding capacity, Kunkel et al. (2013) estimate that the greatest accumulation of precipitation (probable maximum precipitation) will increase due to a higher moisture transport into storms. The frequency of atmospheric rivers is projected to increase over Europe during all seasons (Gao et al., 2016; Lavers & Villarini, 2013; Ramos et al., 2016; Warner et al., 2015; Zavadoff & Kirtman, 2020), connected to alterations in cyclone activity (Zappa et al., 2013). Indeed, observations show a positive trend in the intensity of daily precipitation extremes over CEU for warm and cold seasons during the last decades (Coumou & Rahmstorf, 2012; Lehmann et al., 2015; Madsen et al., 2014; Masson-Delmotte et al., 2021; Moberg et al., 2006; Scherrer et al., 2016; Van den Besselaar et al., 2013) as a consequence of climate change (Fischer & Knutti, 2016; Hegerl et al., 2004; Min et al., 2011; Pall et al., 2011). Future changes in heavy precipitation, however, are uncertain, mainly due to uncertain alterations of the atmospheric circulation (O'Reilly et al., 2021). Nevertheless, it is imperative for flood risk management by, for example, city planners and insurance companies to accurately estimate how extreme heavy precipitation can get, because more intense precipitation increases the risk of flood damage (Barredo, 2007; Blöschl et al., 2017; Brunner et al., 2021; Ramos et al., 2016).
Extreme precipitation events are rare, and observational records are often too short to provide a sample of extreme events large enough to reliably estimate their return period, let alone to extrapolate to higher levels. In general, there are two methods to increase the sample size of very rare precipitation events in climate data: (a) direct sampling from long climate model simulations (van den Brink et al., 2005) and (b) applying statistical tools to extrapolate into the tails of the precipitation distribution (Coles, 2001; Katz, 1999). Statistical tools based on extreme value theory (Brown, 2018; Frei et al., 1998; Furrer & Katz, 2008) are a computationally cheap way to estimate the rareness of an event. The sampling uncertainty is reduced by parametric descriptions. However, extrapolating beyond the observations based on statistical assumptions yields large uncertainties (Jones et al., 2014; Thompson et al., 2017), since the precipitation estimates are not necessarily physically realistic. Moreover, observations of precipitation extremes are uncertain as well. Undercatch errors in high latitudes and mountainous regions are typically between 3% and 20% but can also be up to 80%, where the spatio-temporal variability of precipitation is high and the density of point stations is low (Goodison et al., 1997; A. Prein & Gobiet, 2016). Additionally, measurement errors may occur during the conversion from radar reflectivity to rain intensity (Marra et al., 2017).
In contrast to statistical approaches, direct sampling of climate extremes from dynamical climate model simulations provides physically coherent storylines, even though the representation of essential mesoscale or smaller-scale physical processes, for example, convection, is limited by the model's resolution and parametrization accuracy. On daily or sub-daily time scales, in the summer season, for complex terrain and high intensities, convection-permitting models show more accurate amounts of rainfall as the events happen on smaller spatial and temporal scales, depending on the region and the model (Ban et al., 2014; Fosser et al., 2014; A. F. Prein et al., 2013a). Nevertheless, the various convection parametrizations show realistic climatological precipitation (Ban et al., 2014; Fosser et al., 2014; Maher et al., 2018; A. G. Prein et al., 2013b). However, to catch very rare precipitation extremes that occur only once in a millennium through direct sampling, climate model simulations must run sufficiently long, which is expensive (Kent et al., 2022; Webber et al., 2019). Recent studies overcome this drawback by generating so-called storylines, that is, physically self-consistent unfoldings of past or plausible future events (definition after Shepherd et al., 2018). Storylines provide a physical basis for climate events, which helps to explore their upper bounds. This can help to raise people's risk awareness and improve stakeholder risk management through stress testing in an event-oriented framing (Hazeleger et al., 2015; Sillmann et al., 2021).
This study employs both types of methods, a statistical and a storyline method, to extrapolate beyond the observational record. The main research questions are: (a) is extreme precipitation of much higher magnitude than observed plausible today and in the near future, and (b) what are the physical drivers that amplify the rainfall intensity over CEU?
In recent years, multiple studies have started to explore climate extremes beyond observations. The UNprecedented Simulated Extremes using ENsembles (UNSEEN) approach uses large ensembles of initialized climate model simulations to increase the sample size of rare climate events. It has been used, for example, to estimate plausible rainfall extremes in the UK (Kent et al., 2022; Thompson et al., 2017) and return periods of precipitation extremes over Norway and Svalbard (Kelder et al., 2020). Bouchet et al. (2019) present rare event sampling methods that follow the idea of performing independent climate simulations, selecting and cloning the realizations that evolve in a promising direction, and omitting the realizations that deviate from it. In doing so, very intense heat waves (Ragone & Bouchet, 2021; Ragone et al., 2018) and tropical cyclones (Webber et al., 2019) are simulated at a much lower computational cost than with direct sampling. Another storyline method for the simulation of very rare climate extremes is ensemble boosting (Gessner et al., 2021), which re-initializes extreme events in climate model simulations a few days to weeks before the peak of the event is reached. It has been used effectively to explore unseen heat waves and is easily extendable to the simulation of multi-year droughts (Gessner et al., 2022). Perturbing the initial conditions through a round-off error ensures physical consistency within the laws of the implemented physics. In fact, ensemble boosting is a versatile method: in this study, it is not only used to estimate how extreme unrelated precipitation events could get, but also to perform a case study on a specific event, the flood-related heavy precipitation in 1993.
At first, the study evaluates daily precipitation and 3-day accumulated precipitation extremes in the near future climate model simulation over Europe during the cold seasons to conclude whether the simulated precipitation extremes are plausible. Then, we investigate the tails of the precipitation distribution over CEU from a statistical point of view and determine prevailing circulation patterns. In the last part, the storyline-based method of ensemble boosting is applied to the most extreme precipitation events in the near future simulation and to an analog of the 1993 heavy precipitation event, to simulate the worst possible outcomes and to demonstrate that ensemble boosting is an appropriate tool for stress testing the resilience of flood protection, for example, sewer systems, flood barriers and rainwater catch basins.

Data and Methods
This study focuses on very rare large-scale precipitation extremes over central Europe (CEU) in near future climate. The climate model simulations are run with the fully coupled Community Earth System Model version 2.1 (CESM2), consisting of ocean, atmosphere, land, sea-ice and river models (Danabasoglu et al., 2020). The atmosphere is described by the Community Atmosphere Model version 6 (CAM6), using a nominal 1° horizontal resolution with 32 vertical levels. The 12 members of a historical large ensemble simulation are branched from the pre-industrial control run every 100 years and run freely from 1850 to 2015 under prescribed historical forcing. For the near future climate scenario, a large initial condition ensemble of 30 members runs freely from 2005 until the end of the simulation, using the same prescribed historical forcing from 2005 until 2014 and assuming SSP3-7.0 from 2015 to 2034. For this study, only the period from 2015 to 2034 is considered. The sample size of 20 years × 30 members = 600 years of near future climate simulation improves statistical confidence in estimating return periods. However, for events with large return periods, which also have a higher damage potential, not even simulations of many hundreds of years are necessarily sufficient. Therefore, we use the ensemble boosting method to directly simulate storylines of very rare precipitation events in the tail of the precipitation distribution.
The ensemble boosting method (Gessner et al., 2021) is implemented for the CESM2 large ensemble near future simulations by perturbing the initial field of specific humidity by a round-off error with a magnitude of 10⁻¹⁸ to 10⁻¹⁹ kg/kg in each ensemble member. The time period between the perturbation and the climate event of interest is called lead time. The chaotic nature of the atmosphere, also known as the butterfly effect or theory of chaos (Lorenz, 1963), makes the initial perturbations grow so that after a few days, the ensemble members differ from the unperturbed simulation and from each other, each providing an alternative to the unperturbed climate storyline.
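The growth of a round-off-sized perturbation can be illustrated with the three-variable Lorenz (1963) system. The following is a minimal sketch, not the CESM2 implementation: the initial state, the perturbation of 10⁻¹³, the time step and the function names are illustrative assumptions.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = lorenz63(state)
    k2 = lorenz63(shifted(state, k1, dt / 2))
    k3 = lorenz63(shifted(state, k2, dt / 2))
    k4 = lorenz63(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_over_time(perturbation=1e-13, n_steps=5000, dt=0.01):
    """Distance between an unperturbed and a perturbed run at every step."""
    reference = (1.0, 1.0, 20.0)  # illustrative initial state
    boosted = (reference[0] + perturbation, reference[1], reference[2])
    distances = []
    for _ in range(n_steps):
        reference = rk4_step(reference, dt)
        boosted = rk4_step(boosted, dt)
        distances.append(
            sum((a - b) ** 2 for a, b in zip(reference, boosted)) ** 0.5
        )
    return distances


distances = separation_over_time()
print(f"separation after the first step: {distances[0]:.2e}")
print(f"largest separation reached:      {max(distances):.2e}")
```

In this toy system the two runs are indistinguishable at first and later diverge to distances comparable to the size of the attractor, which is the behavior that ensemble boosting exploits on a much larger scale.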
Precipitation events in CEU are dominated by convective storms during the warm months. The small-scale processes of moist (deep) convection are parametrized in climate models, which may lead to deficiencies in the representation of convective precipitation (Meredith et al., 2015; Wilcox & Donner, 2007). During the cold months, large-scale precipitation is more dominant than convective precipitation (Berg et al., 2009). Therefore, this analysis only considers the months from October to April. As a consequence, summer precipitation events, which might be even more intense than winter events, are not included in the analysis.
The model performance for daily precipitation, daily mean 2-m temperature and daily sea level pressure is evaluated using the gridded European observational data set E-OBS from 1950 to 2020 (Cornes et al., 2018). Since E-OBS does not provide data over the oceans, atmospheric circulation patterns are also evaluated using the reanalysis ERA5 (Hersbach et al., 2020).
To quantify the return periods of very rare heavy precipitation and extrapolate to unseen return levels, we fit the stationary generalized extreme value (GEV) distribution, using the R package "extRemes" (Gilleland & Katz, 2016). The fitted GEV parameters are based on the area-weighted global mean annual surface temperature in CESM2 over the cold months of the near future period 2015-2034. Bootstrapping is used to compute the uncertainty range by drawing 500 random subsamples from all 30 initial condition members, each containing 80% of the total near future CESM2 data set. For the E-OBS data set, we are limited to the mean surface temperature of continental Europe.
To characterize the sea level pressure patterns of extreme precipitation events, we conduct an empirical orthogonal function (EOF) analysis that resolves the spatial component patterns, that is, orthogonal modes of maximum variability. The first EOF component (EOF1) reflects the largest part of the variance between the Rx3d events. The other components follow in descending order. We use the NCL (National Center for Atmospheric Research Command Language) function "eofunc_Wrap" to apply the EOF analysis to the sea level pressure fields during extreme precipitation over CEU.
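Once the three parameters of a stationary GEV distribution have been fitted, the return level for a given return period follows from a closed-form expression (Coles, 2001). The sketch below is an illustration in Python rather than the "extRemes" workflow used in this study. The function name and the parameter values in the usage lines are hypothetical, and the sign convention assumes that a positive shape parameter denotes a heavy upper tail.

```python
import math


def gev_return_level(location, scale, shape, return_period):
    """Return level of a stationary GEV for a return period in years.

    Convention: a positive shape parameter means a heavy upper tail.
    """
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1 year")
    # y is minus the log of the annual non-exceedance probability.
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(shape) < 1e-12:
        # Gumbel limit for a shape parameter of zero.
        return location - scale * math.log(y)
    # expm1 keeps precision when the shape parameter is close to zero.
    return location + scale * math.expm1(-shape * math.log(y)) / shape


# Hypothetical parameters in mm, chosen only to show the call.
for period in (10, 100, 1000):
    level = gev_return_level(45.0, 7.0, 0.0, period)
    print(f"{period:>5}-year return level: {level:.1f} mm")
```

With a shape parameter close to zero, as reported for both data sets in the Results, the return level grows roughly linearly with the logarithm of the return period.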
The 1993 rainfall event (Figure 1a) is analyzed by using an analog of the original event in the CESM2 near future simulations, for which the ensemble boosting method is implemented. The selection of the analog is based on two criteria: (a) the amount of 3-day accumulated precipitation over CEU is about the same as observed (57 mm, Figure 1a), and (b) the associated atmospheric circulation regime resembles the dipole pattern in the observed event (Figure 1a) with low sea level pressure around Iceland and high pressure around the Azores. Figure 1b shows the selected 1993-analog with 54 mm rainfall over 3 days, averaged over CEU, and a spatial correlation of sea level pressure of 0.99 with the ERA5 reanalysis over Europe and the North Atlantic Ocean (70W-50E, 20N-70N). Slight differences between the observed event and its best analog exist in the seasonality and background climate, since the analog occurred in November 2030, simulated under the SSP3-7.0 scenario, whereas the observed event took place in December 1993. Also, large-scale conditions may differ, for example, the sea surface temperature (SST) anomalies and the static stability of the troposphere.
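The two selection criteria can be expressed as a small check. This is a sketch under stated assumptions: the cosine-of-latitude weighting, the tolerance of 5 mm and the minimum correlation of 0.95 are hypothetical choices for illustration, since the text only requires "about the same" precipitation and a resembling pressure pattern.

```python
import math


def pattern_correlation(field_a, field_b, latitudes):
    """Area-weighted centered correlation of two latitude-longitude fields.

    field_a and field_b are lists of rows (one row per latitude);
    latitudes holds the latitude of each row in degrees.
    """
    weights, a_vals, b_vals = [], [], []
    for row_a, row_b, lat in zip(field_a, field_b, latitudes):
        weight = math.cos(math.radians(lat))  # assumed area weighting
        for a, b in zip(row_a, row_b):
            weights.append(weight)
            a_vals.append(a)
            b_vals.append(b)
    total = sum(weights)
    mean_a = sum(w * a for w, a in zip(weights, a_vals)) / total
    mean_b = sum(w * b for w, b in zip(weights, b_vals)) / total
    cov = sum(
        w * (a - mean_a) * (b - mean_b)
        for w, a, b in zip(weights, a_vals, b_vals)
    )
    var_a = sum(w * (a - mean_a) ** 2 for w, a in zip(weights, a_vals))
    var_b = sum(w * (b - mean_b) ** 2 for w, b in zip(weights, b_vals))
    return cov / math.sqrt(var_a * var_b)


def is_analog(candidate_rx3d, observed_rx3d, correlation,
              max_rx3d_difference=5.0, min_correlation=0.95):
    """Apply both criteria; the two thresholds are hypothetical."""
    similar_amount = abs(candidate_rx3d - observed_rx3d) <= max_rx3d_difference
    return similar_amount and correlation >= min_correlation


# Values reported for the selected analog: 54 mm versus 57 mm, correlation 0.99.
print(is_analog(54.0, 57.0, 0.99))
```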

Model Evaluation
First, we evaluate precipitation and sea level pressure over Europe to examine whether precipitation events are reliably represented in the CESM2 simulations. To that end, the CESM2 historical and near future simulations are compared to the gridded observational data set E-OBS and the reanalysis ERA5. We consider precipitation events only in the cold months (October-April), when large-scale precipitation is dominant. Figure 2a shows that mean daily precipitation is larger in CESM2 than in E-OBS across large parts of Europe during the period 1950 to 2014. Only the Mediterranean region is drier in CESM2. This study focuses on precipitation over CEU (black rectangle in Figure 2a), which is overestimated by 0.7 mm per day (32%) on average. The standard deviation of daily precipitation is also overestimated, by 0.5 mm per day (14%, not shown). Changes in mean daily precipitation from the historical to the near future CESM2 simulations are characterized by a slight increase in the north and a slight reduction in the south (Figure 2b). This is consistent with the literature and gives confidence that the precipitation changes in the near future scenario agree with the multi-model average response (Douville et al., 2021).
In a next step, we evaluate how the wet bias in the CESM2 simulations affects the intensity of the multi-day precipitation extremes over CEU that are analyzed in this study. Does the wet bias lead to a higher number of heavy precipitation events in CESM2? To get more insight into the tails of the precipitation distributions, Figure 2c illustrates a quantile-quantile plot of monthly maximum 3-day accumulated precipitation (Rx3d). The linear regression of the quantile-quantile combinations (dashed line) shows that monthly Rx3d is approximately parallel to the angle bisector (solid line), which would represent perfect agreement of the quantiles. Therefore, almost all quantiles have about the same bias relative to E-OBS. Only the smallest and some of the highest Rx3d events show a smaller bias. The annual Rx3d values show an increasing bias for higher Rx3d events, for which the simulated events are about 10% wetter than the observed events (Figure S1 in Supporting Information S1).
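The quantile-quantile comparison described above can be sketched as follows. A regression slope near 1 together with a non-zero intercept corresponds to a line parallel to the angle bisector, that is, a roughly constant bias across quantiles. The function names, the number of quantiles and the linear interpolation between order statistics are assumptions made for this illustration.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of pre-sorted values, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def qq_regression(observed, simulated, n_quantiles=99):
    """Slope and intercept of simulated versus observed quantiles.

    The two samples may differ in length, as they do for a short
    observational record and a large model ensemble.
    """
    obs = sorted(observed)
    sim = sorted(simulated)
    probs = [(i + 1) / (n_quantiles + 1) for i in range(n_quantiles)]
    x = [quantile(obs, p) for p in probs]
    y = [quantile(sim, p) for p in probs]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic example: the "model" is wetter by a constant 4 mm.
observed = [float(v) for v in range(10, 60)]
simulated = [v + 4.0 for v in observed]
slope, intercept = qq_regression(observed, simulated)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f} mm")
```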
What might cause the systematic bias toward too much precipitation in CESM2? We analyze whether the wet bias in precipitation could arise from biases in the atmospheric dynamics, one of the main physical drivers.
From here, we only analyze the near future CESM2 simulation and we use the full available E-OBS data set from 1950 to 2020. Figure 3 shows that the mean sea level pressure field during the most extreme monthly Rx3d events (>99th percentile, 42 events) in CESM2 resembles the corresponding fields in E-OBS (>90th percentile, 50 events) and ERA5 (>90th percentile, 28 events). Here, we show ERA5, because the reanalysis also provides data over the ocean. In fact, the largest pressure differences between CESM2 and ERA5 occur over the Atlantic Ocean, which might contribute to the wet bias over CEU. However, the wet bias cannot be fully attributed to the circulation differences, as it extends over all percentiles of monthly Rx3d extremes. Note that we compare a different number of events so that unforced internal variability might also cause sampling errors. Nevertheless, in terms of magnitude, the mean pressure anomalies agree well between the three data sets. In accordance with the literature (Bandhauer et al., 2022), the wet bias for the most extreme Rx3d events is even larger in ERA5 than in CESM2 with respect to E-OBS (Figure S2 in Supporting Information S1).
Reasons for the wet bias in the CESM2 simulations might be diverse. It is well-known that climate models contain a drizzle bias, that is, too frequent and too light rainfall (Jing et al., 2017; Kay et al., 2018; Stephens, 2010), which might be part of the wet bias identified in the CESM2 simulation. Another part may relate to deficiencies in the observation-based products. For instance, Bandhauer et al. (2022) find underestimated high percentiles of precipitation over mountainous as well as flat areas in ERA5 and E-OBS, likely due to an effectively coarser resolution than the grid spacing. Precipitation undercatch is also observed during winter and early spring over mid- and high-latitude land areas (Chen et al., 2002), due to, for example, wind and evaporation effects and the selected interpolation algorithm (Hulme, 1995; Sevruk, 1982, 1989). Note that this study analyzes very rare climate extremes with amounts of precipitation that have not been observed so far and thus cannot be directly evaluated.
In summary, the large ensemble CESM2 simulation shows a constant wet bias over all percentiles of monthly Rx3d for CEU during the cold months relative to E-OBS. Since the atmospheric dynamics during the most extreme Rx3d events are reasonable in CESM2 compared to E-OBS and ERA5, and the focus is on the methodology to extrapolate precipitation extremes rather than on the exact value of rainfall, we have confidence that the large-scale precipitation is sufficiently well represented, even in the tail of the precipitation distribution.

Results
Just before Christmas in 1993, a heavy precipitation event over northern France and western Germany caused severe flooding of the same area (Fink et al., 1996), even though this event is not the most extreme observation in E-OBS of monthly maximum 3-day accumulated precipitation (Rx3d) over CEU during the cold seasons (October-April). This raises the question of how much more intense Rx3d could get and hence how much the flood damage might rise. This is of particular interest to city planners, who aim to increase the resilience of sensitive infrastructure, and to insurance companies, who need to price their products. This section uses and discusses two methods to estimate how much more precipitation is possible for this event and for heavy precipitation over CEU in general, and to identify the associated atmospheric circulation regimes of such events.

Statistical Estimates of Extreme Precipitation Over Central Europe
In the first part, statistical frameworks from extreme value theory are used to extrapolate beyond the observed amounts of heavy precipitation (Ban et al., 2020; Coles, 2001; Papalexiou & Koutsoyiannis, 2013; Serinaldi & Kilsby, 2014). In Figure 4, stationary GEV distributions are fitted to annual Rx3d during the cold seasons, averaged over CEU, in the observational data set E-OBS and the large ensemble near future CESM2 simulation, respectively. The uncertainty of the return period is smaller in CESM2, because the data set is more than eight times larger than in E-OBS (600 vs. 71 years). Larger data sets such as the CESM2 simulations are particularly important for more confidence in the tail of the Rx3d distribution. The larger data set increases the chance of catching very rare precipitation events that describe the tail and perhaps have not even been observed yet. Indeed, the observed area-average maximum of 68 mm in October 1986 is exceeded by four events in the larger CESM2 data set. Here, the maximum Rx3d reaches 76 mm, averaged over CEU, and up to 100 mm on grid cell level. Adding these Rx3d maxima to Figure 4 shows that the fitted GEV distributions estimate much more intense Rx3d for CEU, which is in accordance with studies on 5-day accumulated precipitation extremes in winter over the Alps (Ban et al., 2020) and in northern Europe (Van den Besselaar et al., 2013). The course of the GEV distribution is given by the fitted shape parameter, which is close to zero for both CESM2 and E-OBS, that is, the distributions have no heavy tail. We verified that the shape parameter is robust to small changes of the selected region. For short return periods of about 20 years or less, the fitted GEV distributions agree between CESM2 and E-OBS. However, for longer return periods, the best estimate of annual Rx3d is larger in CESM2 than in E-OBS for the same return period. This was to be expected from the wet bias found in the evaluation (see the Model Evaluation section) but might also be the consequence of intensifying extreme precipitation between the observational period (1950-2020) and the CESM2 simulation covering a present-day and near future period (2015-2034).
By adding the 1993 rainfall event in E-OBS to Figure 4, the return level plot shows that the estimated moderate return period ranges from 13 to almost 2000 years. Considering that already this moderate event caused severe flood damage to buildings and infrastructure under the given boundary conditions (Ionita et al., 2020), higher amounts of rainfall might cause even larger damage. But how extreme could this event have become? Our statistical analysis reaches some limits in the confidence of the return levels at this point, because the GEV distribution does not take the boundary conditions into account to estimate how a perfect storm could have built up from the associated circulation regime. Moreover, no physical laws are considered and no physical processes that drive the estimated events are provided. Consequently, the GEV distributions cannot guarantee plausible amounts of precipitation, especially in the extrapolated range. In the following, we focus on the atmospheric circulation patterns to understand the drivers of Rx3d extremes over CEU and how unseen events could develop.

The Atmospheric Pressure Patterns of Heavy Precipitation Over Central Europe
We investigate the characteristics of the circulation patterns that favor heavy precipitation by applying the EOF analysis to the 3-day mean sea level pressure fields of annual Rx3d events over CEU. The EOF components present patterns that explain the pressure difference between the very rare and more moderate Rx3d events. The first component (EOF1) reflects the climatological mean state, as EOF1 represents the dominant pattern of variance between the Rx3d events. The other components follow in descending order. Figures 5a-5d illustrate the resulting fields of pressure variance for annual Rx3d over CEU, based on the CESM2 near future simulations. The patterns correspond to the results of the same analysis based on ERA5 (Figure S3 in Supporting Information S1), particularly for the most dominant components EOF1 to EOF3, which increases our confidence in the accuracy of the pressure patterns. Note that the sign and the variability strengths of the EOF components do not reflect pressure patterns for specific precipitation events. The associated anomaly patterns are superpositions of all EOF components. The strength of their impact is weighted by the related EOF scores.
The EOF scores in Figures 5e-5h reveal the impact of the respective EOF component for each precipitation event. The EOF scores are sorted by the amount of precipitation per event, with more moderate Rx3d on the left and the most extreme events on the right in the subfigures (e-h). Interestingly, the EOF2 and EOF3 scores (Figures 5f and 5g) are significantly positive (negative) for the highest (smallest) amounts of precipitation. The significance is determined by a comparison with the 95th percentile of the number of positive (negative) EOF scores. By bootstrapping 1000 subsamples of 100 successive EOF2 and EOF3 scores, respectively, we find a confidence interval (5-95th percentile) of 47-61 positive (39-53 negative) scores for EOF2, which is exceeded by the top 100 (bottom 100) events that include 68 positive (65 negative) scores. For EOF3 (Figure 5g), the top 100 (bottom 100) events include 64 positive (64 negative) scores, which also exceeds the 95th percentile of 55 positive (57 negative) EOF3 scores. Note that the asymmetry in the confidence thresholds results from the binary classification of continuous EOF scores into positive and negative values. The average over all continuous EOF scores is zero. Across all events, the correlation between the Rx3d intensity and the EOF2 and EOF3 scores is low (0.23 and 0.22, respectively) but significant, tested with the p-value of a two-sided hypothesis test. This indicates that natural variability is high and that there are multiple pressure variance patterns associated with heavy precipitation over CEU. The EOF1 and EOF4 scores are not correlated with Rx3d. Therefore, heavy precipitation over CEU is favored by positive EOF2 and EOF3 components in the CESM2 near future simulations, that is, low pressure anomalies over western and north-eastern Europe, and high-pressure anomalies over Greenland and North Africa. This connection cannot be found in ERA5 (Figure S3 in Supporting Information S1). However, that data set is much smaller, and hence the behavior might not be visible and significant in the EOF scores. Other studies have also found specific pressure patterns and cyclone track types for heavy precipitation over CEU (Hofstätter et al., 2018; Müller et al., 2009). In the next section, we gain more process understanding by using a storyline method in contrast to the statistical approaches.
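The sign-count test applied to the EOF scores above can be sketched as follows. This is one possible reading of the procedure, not the authors' code: it assumes that each bootstrap subsample is a block of 100 successive scores starting at a random position in the precipitation-sorted series, and the function names, the seed and the percentile indexing are illustrative choices.

```python
import random


def positive_count_interval(scores, window=100, n_samples=1000, seed=0):
    """5th and 95th percentile of positive-score counts in random blocks.

    Each subsample is a block of `window` successive scores that starts
    at a random position in the precipitation-sorted series.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        start = rng.randrange(len(scores) - window + 1)
        block = scores[start:start + window]
        counts.append(sum(1 for s in block if s > 0))
    counts.sort()
    lower = counts[int(0.05 * (n_samples - 1))]
    upper = counts[int(0.95 * (n_samples - 1))]
    return lower, upper


def top_block_exceeds(scores, window=100, **kwargs):
    """Compare the positive count of the top block with the 95th percentile."""
    _, upper = positive_count_interval(scores, window, **kwargs)
    top_count = sum(1 for s in scores[-window:] if s > 0)
    return top_count > upper, top_count, upper


# Synthetic series of 600 sorted events: positive scores only in the top 100.
synthetic_scores = [-1.0] * 500 + [1.0] * 100
print(top_block_exceeds(synthetic_scores))
```

The same functions can be applied to the negated scores and the first 100 events to test the bottom of the distribution for an excess of negative scores.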

Estimates of the 1993 Heavy Precipitation Using Ensemble Boosting
The previous section illustrates, based on a statistical method, how much more intense Rx3d events with high return levels would be than the most extreme observed events over CEU. For planning purposes, for instance, it may be important to understand how much more extreme events like the flood-related heavy precipitation in 1993 could become (Figure 1a). To gain process understanding of heavy precipitation and to estimate high amounts of rainfall on a physical basis, we apply the model-based ensemble boosting method. In contrast to the statistical methods, the boosted members do not represent an independent sample and do not allow for a direct quantification of their return period, because they are generated by initial condition ensembles of selected precipitation events. However, ensemble boosting is a complementary method to the statistical approaches that provides the associated physical processes of the alternative precipitation events, which gives us confidence that the estimated Rx3d maxima are plausible.
Figure 5. Empirical orthogonal function (EOF) analysis of sea level pressure during the most extreme precipitation events over central Europe. The four dominant EOF components (a-d) and the EOF scores (e-h) of 3-day mean sea level pressure during annual maximum 3-day accumulated precipitation (Rx3d), averaged over central Europe during the cold months (October-April) in CESM2. The EOF scores are sorted according to the amount of Rx3d in ascending order (left to right, 600 events in total). The black line is the running mean over 100 events.
To estimate how extreme an event like the 1993 heavy precipitation event could become under the SSP3-7.0 near future scenario, we generate alternative storylines of an analog of the 1993 heavy precipitation in CESM2 (see methods for selection criteria). Then, ensemble boosting is applied to the 1993-analog, initializing a 100-member initial condition (IC) ensemble each day from 7 to 17 days before the peak of precipitation (see methods for more details). As an example, Rx3d in the boosted ensemble members for the lead time of 14 days is shown in Figure 6a during the unperturbed 1993-analog, averaged over CEU. The Rx3d of the unperturbed run is exceeded by 24 out of 100 members at day 0. The maximum reached in the ensemble (87.7 mm, dashed line) is about two thirds higher than in the unperturbed simulation and even exceeds the highest amounts in any of the 30 members of the CESM2 near future simulations. In terms of return periods, the rareness increases from around 21 years for the unperturbed analog to a range of about 6000 to 80,000 years (Figure 4).
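The comparison of boosted members with the unperturbed run rests on two simple steps: computing Rx3d from a daily precipitation series and counting how many members exceed the reference. The sketch below uses hypothetical function names and toy numbers; it is not the processing chain applied to the CESM2 output.

```python
def rx3d(daily_precip):
    """Maximum 3-day accumulated precipitation of a daily series (mm)."""
    if len(daily_precip) < 3:
        raise ValueError("need at least three daily values")
    return max(
        sum(daily_precip[i:i + 3]) for i in range(len(daily_precip) - 2)
    )


def exceedance_summary(reference_rx3d, member_rx3d):
    """Count the boosted members whose Rx3d exceeds the unperturbed run."""
    exceeding = [value for value in member_rx3d if value > reference_rx3d]
    largest = max(member_rx3d)
    return {
        "n_exceeding": len(exceeding),
        "fraction_exceeding": len(exceeding) / len(member_rx3d),
        "max_rx3d": largest,
        "max_relative_increase": largest / reference_rx3d - 1.0,
    }


# Toy daily series (mm) for an unperturbed run and three boosted members.
unperturbed = [2.0, 10.0, 25.0, 15.0, 3.0]
members = [
    [1.0, 12.0, 30.0, 18.0, 2.0],
    [3.0, 8.0, 20.0, 10.0, 4.0],
    [2.0, 11.0, 26.0, 15.0, 3.0],
]
summary = exceedance_summary(rx3d(unperturbed), [rx3d(m) for m in members])
print(summary)
```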
The spread of Rx3d in the boosted ensembles depends on the unperturbed reference event and on the lead time. The latter is shown in Figure 6b by histograms of the Rx3d peak around day 0 in the boosted ensembles of the 1993-analog, for all lead times. For short lead times of less than 10 days, the spread of Rx3d is small around a high mean value that is similar to the unperturbed maximum (red vertical line). The longer the lead time, the larger the spread and the lower the mean value (Figure 6b, top to bottom), because the initial perturbations have more time to grow and influence the precipitation event. Since the initial perturbation of the boosted members is so small, the signal of the precipitation event remains in the atmospheric circulation even for lead times longer than 14 days. Interestingly, in most of the members, the signal changes to a less intense event with about 70% of the original intensity, and the distribution in the ensembles is negatively skewed. For even longer lead times, we expect the precipitation distribution to approach the climatological distribution, but the time scale for this development is different for each event. Note that the degree of exceedance can differ for other boosted reference events: for some events, members exceed the unperturbed amount of rainfall by around 30%-40%, whereas for others they hardly exceed it, or the atmospheric circulation is quasi-stationary so that the Rx3d peak in the members stays about the same as in the original event even for lead times of weeks. This is shown by the boosted ensembles for other extreme precipitation events in CESM2 (Figure S4 in Supporting Information S1). Differences in the spread of the boosted ensembles from event to event were also found for heat waves in Gessner et al. (2021). In the case of the 1993-analog, 328 of all 1100 storylines (30%, 100 members for 11 lead times) have more Rx3d than the unperturbed simulation, and 33 (3%) exceed the unperturbed event significantly, that is, by more than the standard deviation of Rx3d in the CESM2 near future large ensemble (10.3 mm). These storylines suggest that the amount of Rx3d in 1993 could be substantially higher over CEU under the near future scenario SSP3-7.0, given the initial conditions of the analog.
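The exceedance counts for the boosted storylines rest on two simple operations: a 3-day running sum of daily precipitation to obtain Rx3d, and a comparison of each storyline's Rx3d with the unperturbed value. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input is taken to be area-averaged daily precipitation in mm, and "significant" exceedance is taken to mean exceeding the reference by more than a given standard deviation, as in the text.

```python
import numpy as np


def rx3d(daily_precip):
    """Maximum 3-day accumulated precipitation along the last (time) axis."""
    p = np.asarray(daily_precip, dtype=float)
    # Cumulative sum with a leading zero so that differences of entries
    # three steps apart give the running 3-day totals.
    zeros = np.zeros(p.shape[:-1] + (1,))
    csum = np.concatenate([zeros, np.cumsum(p, axis=-1)], axis=-1)
    rolling = csum[..., 3:] - csum[..., :-3]
    return rolling.max(axis=-1)


def exceedance_summary(boosted_rx3d, reference_rx3d, sigma):
    """Count storylines above the reference and above reference + sigma.

    Returns (n_above, n_significant, n_total).
    """
    b = np.asarray(boosted_rx3d, dtype=float)
    n_above = int((b > reference_rx3d).sum())
    n_significant = int((b > reference_rx3d + sigma).sum())
    return n_above, n_significant, int(b.size)


# Hypothetical toy input: 2 members x 5 days of daily precipitation (mm).
toy = np.array([[1.0, 2.0, 3.0, 4.0, 0.0],
                [0.0, 0.0, 5.0, 5.0, 5.0]])
toy_rx3d = rx3d(toy)
```

Applied to an array of shape (lead times, members), `exceedance_summary` gives the pooled counts; applying it row by row would give the dependence on lead time.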
In the following, we use the boosted ensembles of the 1993-analog and of other precipitation extremes to analyze the impact of the atmospheric circulation on heavy precipitation over CEU. Ionita et al. (2020) classify the original event in 1993 as an atmospheric river event (Figure 1a). Figures 6c and 6d show the 3-day mean sea level pressure and concurrent Rx3d for the 100 boosted members of the 1993-analog (Figure 1b) with the highest and lowest amounts of Rx3d, respectively, over CEU. The initial conditions of the members are nearly identical and the lead times are short enough that the storylines with the highest as well as the lowest amounts of precipitation show similar main features in the pressure pattern, including low pressure at higher latitudes and high pressure at lower latitudes, with a precipitation band along a front over the Atlantic that reaches from low latitudes to CEU. The mean field of the most intense storylines is similar to the unperturbed 1993-analog. The variance within both 100-member sets is small, and the storylines between the tails of the precipitation distribution in the ensembles form a transition between those fields. The sea level pressure differences between both fields are visualized in Figure 7e. Accordingly, higher amounts of precipitation over CEU relate to a larger pressure gradient along the zero meridian and west of it, which resembles the pressure patterns during atmospheric river events over the North Atlantic. Because the model output needed to calculate the vertically integrated water vapor transport is missing, we cannot unequivocally identify the most extreme events in the CESM2 simulations as atmospheric river events, but the patterns of sea level pressure and precipitation over the Atlantic and Europe agree closely with previously identified atmospheric rivers (Ionita et al., 2020; Lavers et al., 2011).
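The comparison between the most and least extreme boosted members amounts to a composite difference of sea level pressure fields. As a rough sketch, assuming that the storylines of all lead times are pooled and that each storyline is summarized by one 3-day mean sea level pressure field and one Rx3d value (the function name and array layout are assumptions for illustration), it could be computed as:

```python
import numpy as np


def composite_difference(slp, rx3d_values, n=100):
    """Mean SLP of the n wettest and n driest storylines and their difference.

    slp         : array (n_storylines, n_lat, n_lon), 3-day mean sea level pressure
    rx3d_values : array (n_storylines,), Rx3d over the target region
    Returns (high_mean, low_mean, high_mean - low_mean).
    """
    slp = np.asarray(slp, dtype=float)
    rx = np.asarray(rx3d_values, dtype=float)
    if slp.shape[0] != rx.size:
        raise ValueError("one Rx3d value is needed per SLP field")
    if 2 * n > rx.size:
        raise ValueError("n is too large: the two subsets would overlap")
    order = np.argsort(rx)                  # ascending in Rx3d
    low_mean = slp[order[:n]].mean(axis=0)  # least extreme storylines
    high_mean = slp[order[-n:]].mean(axis=0)  # most extreme storylines
    return high_mean, low_mean, high_mean - low_mean


# Hypothetical toy input: 4 storylines on a 1 x 2 grid (hPa).
toy_slp = np.array([[[1010.0, 1000.0]],
                    [[1015.0, 1005.0]],
                    [[1020.0, 990.0]],
                    [[1012.0, 1002.0]]])
toy_rx = np.array([3.0, 1.0, 4.0, 2.0])
high, low, diff = composite_difference(toy_slp, toy_rx, n=1)
```

With the 1100 storylines of the 1993-analog and `n=100`, `high` and `low` would correspond to the fields in Figures 6c and 6d, and `diff` to the shading in Figure 7e.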
To examine whether the pressure patterns change in the same way for other, even more intense rainfall events, we apply ensemble boosting to the four events of highest Rx3d over CEU (Figure S4 in Supporting Information S1), as done for the 1993-analog. Note that only these four events exceed the observed record (Figure 4), which is surprising given the much larger number of events in the near future CESM2 data set. The reason is that the maximum observed precipitation in E-OBS is an outlier with an anomaly of 4.7 standard deviations with respect to the mean Rx3d (Figure S1 in Supporting Information S1). Although these events are already very rare, a large fraction of 29% of the boosted ensemble members exceed even these reference return levels, reaching almost 100 mm over CEU (+32% relative to the unperturbed Rx3d event in CESM2), possibly more locally, with a return period of at least 50,000 years (Figure 4). These storylines support the statistical estimates that much more intense Rx3d events are possible over CEU.
Figures 7a-7e show the corresponding mean sea level pressure fields of the 100 most and 100 least intense Rx3d storylines. Even though all events show different pressure patterns over Europe and the Atlantic Ocean, they share a quasi-meridional pressure dipole, including high pressure over the Iberian Peninsula and a low-pressure trough over the Atlantic or northern Europe. These pressure patterns resemble the circulation pattern during atmospheric river events as found in the literature (Dacre et al., 2015; Ionita et al., 2020; Pasquier et al., 2019). During the events, CEU shows a characteristic strong sea level pressure gradient. In general, the differences between the more and less extreme precipitation storylines show changes in the position and intensity of the pressure dipole, so that rainfall is more concentrated over CEU and less rain falls over the surrounding area (Figure S5 in Supporting Information S1). The pressure patterns differ individually, depending on the selected event.
For the first event (Figure 7a), the most extreme boosted members show, on average, a stronger pressure gradient upstream of CEU, due to more pronounced low pressure and high pressure that extends further northward. In the boosted members of the second most extreme event (Figure 7b), the highest amounts of precipitation are found, on average, when the low pressure system moves southward toward CEU. In contrast, the most extreme boosted members of the third and fourth events (Figures 7c and 7d) show a less pronounced pressure gradient over CEU, and in the 1993-analog (Figure 7e), the low pressure system moves further northward. To better characterize these changes in the circulation patterns, we apply the EOF analysis to the 3-day mean sea level pressure fields of highest and lowest precipitation. To that end, the EOF analysis is repeated, adding the boosted ensemble members of interest to the data set of annual Rx3d events (Figure 5), which barely affects the dominant loadings. Figures 7f and 7g display the related EOF2 and EOF3 scores, which were found to be more positive for the most extreme heavy precipitation events over CEU (Figures 5b, 5c, 5f, and 5g). Indeed, Figure 7f reveals that all EOF2 scores are larger for the mean patterns of the most extreme storylines than for the least extreme storylines, that is, they show lower sea level pressure over north-eastern Europe and higher sea level pressure over north-western Europe. Surprisingly, the results are less clear for the EOF3 scores, for which larger values describe lower sea level pressure over western Europe and CEU. In general, the differences only slightly exceed the standard deviation of monthly Rx3d, if at all, since the boosted members do not vary strongly from each other. Additionally, the differences in 2-m temperature between the boosted members are highly correlated with the sea level pressure differences (not shown). Note that even small changes in the sea level pressure pattern can cause large differences in the amount of precipitation over CEU.
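The EOF step described above (repeating the analysis with the fields of interest appended to the annual Rx3d events and reading off their scores) can be sketched with a singular value decomposition. This is a simplified illustration, not the study's implementation: it applies no latitude (area) weighting, uses random numbers in place of CESM2 fields, and all function names are hypothetical. Note also that the sign of each EOF is arbitrary, so scores are only comparable within one decomposition.

```python
import numpy as np


def eof_analysis(fields, n_modes=4):
    """EOFs of a stack of 2-D fields via SVD of the anomaly matrix.

    fields : array (n_events, n_lat, n_lon)
    Returns (eofs, scores, explained_variance_fraction), where
    eofs has shape (n_modes, n_lat * n_lon) and scores (n_events, n_modes).
    """
    x = np.asarray(fields, dtype=float)
    flat = x.reshape(x.shape[0], -1)
    anomalies = flat - flat.mean(axis=0)   # remove the mean over events
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                    # orthonormal spatial patterns
    scores = anomalies @ eofs.T            # projection of each event
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return eofs, scores, explained


def scores_with_added_fields(event_fields, extra_fields, n_modes=4):
    """Repeat the EOF analysis with extra fields appended; return their scores."""
    event_fields = np.asarray(event_fields, dtype=float)
    extra_fields = np.asarray(extra_fields, dtype=float)
    combined = np.concatenate([event_fields, extra_fields], axis=0)
    eofs, scores, _ = eof_analysis(combined, n_modes)
    return eofs, scores[event_fields.shape[0]:]


# Random stand-ins: 20 "events" plus 2 composite fields on a 3 x 4 grid.
rng = np.random.default_rng(1)
events = rng.normal(size=(20, 3, 4))
composites = rng.normal(size=(2, 3, 4))
eofs, composite_scores = scores_with_added_fields(events, composites)
```

Column 1 of `composite_scores` (zero-based) would then play the role of the EOF2 scores in Figure 7f, and column 2 that of the EOF3 scores in Figure 7g.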
Altogether, we show that ensemble boosting is a complementary method to statistical approaches: it generates physically consistent storylines of very rare and even unseen precipitation extremes over CEU, but does not provide probabilistic information on return periods and return levels. However, ensemble boosting makes it possible to conduct case studies and analyze the physical drivers. Here, we demonstrate that observed heavy precipitation such as the event in 1993 could be more intense and possibly more damaging, based on an analog in the CESM2 near future simulations. The most extreme Rx3d events over CEU are associated with circulation patterns that resemble the patterns found for atmospheric rivers.

Conclusions
In this study, two methodological frameworks are presented to estimate possible extreme precipitation levels beyond the observational record for CEU during the cold season (October-April). In case studies on heavy precipitation in a flood-related event like 1993 and other events, we examine how extreme these events could have become, based on ensemble simulations. Addressing this question is important to increase the resilience against natural hazards, which are projected to become more frequent and more intense over CEU with increasing greenhouse gas concentrations (Seneviratne et al., 2021). Here, we analyze precipitation in a large ensemble simulation with CESM2 under the SSP3-7.0 scenario for present-day and near future conditions (2015-2034). Even though the model shows a moderate wet bias relative to the observational data set E-OBS over all percentiles of monthly maximum 3-day accumulated precipitation (Rx3d), the associated atmospheric dynamics are well captured and a suitable analog of the 1993 event was found.
In a first statistical approach, GEV distributions are fitted to annual Rx3d over CEU in CESM2 and E-OBS, respectively, estimating that much more intense Rx3d is possible than observed. Statistical approaches, however, characterize neither the physical mechanisms nor the spatial and temporal structure, and the uncertainty range of the GEV distribution increases strongly beyond the observational range. Therefore, we use a second storyline approach, called ensemble boosting, which provides physically coherent realizations of selected events. By re-running the most intense precipitation events, this approach reveals how the selected events could have developed in alternative storylines. We find that even the highest simulated amounts of precipitation are exceeded by 30%-40% in the boosted ensembles, and even higher magnitudes might be possible. However, ensemble boosting does not allow a direct estimate of the return period of the storylines, since their initial fields differ only by a round-off error and they therefore cannot be considered independent. Both methods should thus be understood not as competing but as complementary approaches. Moreover, ensemble boosting is proposed as an efficient and flexible tool to address "what if" questions, which can build a bridge between basic scientific research and safety-relevant applications for society. Here, we use ensemble boosting not only to support the statistically estimated return levels by providing consistent precipitation levels in the boosted ensembles. We also use it for a case study of a 1993-analog to investigate, by generating alternative pathways, how much more extreme the heavy precipitation event and possibly the flood damage could have turned out in the CESM2 near future simulations. We find that 24% of the boosted 1993-analog members (for a lead time of 14 days) exceed the original Rx3d and reach Rx3d return levels that have not been observed. The return periods of the most extreme simulated cases increase from 21 to at least several thousand years through ensemble boosting. Since the storylines are based on climate model simulations, they provide the physical processes that cause the climate extremes of interest. Boosting the four most extreme Rx3d events over CEU in CESM2 and the 1993-analog, we find that the highest amounts of precipitation are associated with the most pronounced variations of sea level pressure patterns that resemble the fields during atmospheric river events. Even though some changes in the atmospheric circulation are small, the effect on the amount of precipitation can be large. Additionally, the EOF analysis is applied to the 3-day mean sea level pressure fields of monthly Rx3d, revealing that, also for independent events, certain pressure patterns are more often associated with very rare heavy precipitation than with more moderate events. Since an increased amount of rainfall could have aggravated the economic damage, this tool might be of interest for stakeholders, city planners, and insurance companies to stress test protective measures and increase resilience against a large spectrum of potential event realizations. Overall, ensemble boosting is a flexible tool that can be applied to any climate event of interest in the CESM2 simulation. By selecting the storylines with the most desired development, for example, the highest amounts of precipitation or frequent precipitation events over a certain period, this method is suitable for testing the resilience of protective measures against many different extreme climate events.

Figure 1. The 1993 precipitation event in ERA5 and its analog in CESM2. Maximum 3-day accumulated precipitation (Rx3d, color shading) and 3-day mean sea level pressure during the flood-related precipitation event in 1993 over central Europe (enclosed area) in ERA5 (a) and for the best analog found in the CESM2 near future simulations (b), given the listed criteria.

Figure 2. Evaluation of precipitation in CESM2. (a) Difference of mean daily precipitation between the historical 12-member ensemble simulation with CESM2 and E-OBS from 1950 to 2014, averaged over October to April. The black rectangle encloses the region of interest in central Europe. (b) Same as (a) but for the difference between the near future (2015-2034, 30 members) and historical CESM2 simulations. Note the altered color scale. (c) Quantile-quantile plot of monthly maximum 3-day accumulated precipitation (Rx3d), averaged over central Europe (enclosed in a and b) from October to April in CESM2 and E-OBS. The dashed line is the linear regression, and the solid line is the angle bisector.

Figure 4. Generalized extreme value (GEV) distribution fit to annual maximum 3-day accumulated precipitation over central Europe. GEV distribution fits to annual block maxima of 3-day accumulated precipitation, averaged over central Europe during the cold months October to April in the CESM2 simulations (2015-2034, red shading, 30 ensemble members × 20 years per member = 600 years) and in E-OBS (1950-2020, gray shading, 71 years). The color shading is the respective 5-95th percentile of the uncertainty range, and the dashed lines are the median estimates of the GEV distribution. The top red horizontal line shows the maximum event in CESM2. The black horizontal line below is the maximum in E-OBS, and the lowest black line marks the amount of the flood-related precipitation event from 19 to 21 December 1993.

Figure 6. Ensemble boosting of the 1993-analog precipitation event in CESM2 over central Europe. (a) 3-day running accumulated precipitation in the 1993-analog precipitation event (dark red line) and the 100 ensemble members (light red lines), initialized 14 days before the peak in precipitation (day 0). The black dashed line marks the member with the highest 3-day accumulated precipitation. (b) Histogram of maximum 3-day accumulated precipitation (Rx3d) in the boosted ensembles around day 0 for different lead times. The axis at the bottom shows the normalization to the Rx3d in the unperturbed analog (vertical red line at 100%). The two gray vertical lines mark the 100 members with the highest and lowest Rx3d. (c) Rx3d (color shading) and 3-day mean sea level pressure (contour lines, 5-hPa spacing) averaged over the 100 most extreme boosted ensemble members of the 1993-analog. (d) Same as (c) but averaged over the 100 least extreme boosted ensemble members of the 1993-analog.

Figure 7. Sea level pressure differences and empirical orthogonal function (EOF) scores for boosted precipitation extremes. (a-e) Sea level pressure difference between the 100 boosted ensemble members with the highest and with the lowest amounts of precipitation (color shading) over central Europe (CEU) (black rectangle). The contour lines show the mean sea level pressure field (5-hPa spacing), averaged over the 100 members with the lowest amount of precipitation in CEU. (a-d) show the boosted events with the highest amounts of precipitation over CEU in the CESM2 simulation. (e) is the boosted analog of the precipitation event in 1993. (f) The EOF scores of the second loading (EOF2 from Figure 5b) for the presented sea level pressure fields. Orange bars are the scores for pressure fields, averaged over the 100 members with the highest amounts of precipitation. Blue bars display the scores for the average pressure field over the 100 lowest amounts of precipitation. The gray background marks the standard deviation of EOF2 of monthly Rx3d. (g) The same as (f) but for the third loading (EOF3 from Figure 5c).