Changes in the Climate System Dominate Inter‐Annual Variability in Flooding Across the Globe

Extreme flood events have regional differences in their generating mechanisms due to the complex interaction of different climate and catchment processes. This study aims to examine the capability of climate drivers to capture year‐to‐year variability in global flood extremes. Here, we use a statistical attribution approach to model seasonal and annual maximum daily discharge for 7,886 stations worldwide, using season‐ and basin‐averaged precipitation and temperature as predictors. The results show robust performance of our seasonal climate‐informed models in describing the inter‐annual variability in seasonal and annual maximum discharges regardless of the geographical region, climate type, basin size, degree of regulation, and impervious area. The developed models enable the assessment of the sensitivity of flood discharge to precipitation and temperature changes, indicating their potential to reliably project changes in the magnitude of flood extremes.


Introduction
In the last two decades, flooding affected 1.6 billion people globally and incurred $651 billion in economic costs, the highest and the second highest of any other environmental hazards, respectively (CRED & UNDRR, 2020).As warming trends are globally expected to continue over the 21st century, flood damages are expected to increase in the future (e.g., Alfieri et al., 2017;IPCC, 2021) in light of climatic and socio-economic changes (Merz et al., 2021;Wasko, Sharma, & Pui, 2021).Therefore, addressing questions about changes in flooding has important socio-economic consequences for our adaptation and mitigation strategies.
The detection of trends in flood extremes has received much attention in hydrology.However, it remains a challenge to achieve consensus on the detection of changes because of its lack of evidence (e.g., Archfield et al., 2016), sensitivity to data length (e.g., Do et al., 2017), and strong variations by basin alteration (e.g., Hodgkins et al., 2019), climate region (e.g., Slater et al., 2021), and catchment size (e.g., Bertola et al., 2020).Furthermore, extrapolating flood extremes directly from these findings for future projection is not reliable because the historical trends might not be representative of future conditions, especially if we do not understand the physical processes responsible for these changes (e.g., Do et al., 2020;Kim & Villarini, 2024;Mallakpour et al., 2018;Schlef et al., 2018;Villarini & Wasko, 2021;S. Zhang, Zhou, et al., 2022).
With this background, there is a growing interest in the attribution of changes in flood extremes based on understanding the driving flood processes (e.g., Alifu et al., 2022;Kim & Villarini, 2023a;Merz et al., 2022;Prosdocimi et al., 2015).Flood generating mechanisms vary geographically and seasonally depending on the climate drivers (e.g., summer monsoon, snowmelt, rain-on-snow, tropical cyclone).Together with various physical processes, antecedent soil moisture is an important factor in flood generation worldwide (e.g., Berghuijs et al., 2019;Stein et al., 2021;Tramblay et al., 2022;Wasko et al., 2020a).As such, climate variability and its feedbacks with the land surface play a significant role in determining flood mechanisms.Furthermore, climate variability can affect changes in flood magnitude not only by itself (Liu et al., 2022), but also by shifting floodgenerating processes (Jiang et al., 2022;Kemter et al., 2020;Tarasova et al., 2023).
This study aims to examine how climate drivers can describe changes in the magnitude of flood extremes worldwide.Despite the complexity of the flood generating mechanisms, Kim and Villarini (2023a) showed that seasonal precipitation and temperature are the key predictors as proxies for different physical processes to describe the changes in flood extremes across the conterminous United States (U.S.).Here we extend the statistical attribution framework developed by Kim and Villarini (2023a) to the global scale and assess how well the seasonal climate predictors describe seasonal and annual maximum daily discharges worldwide.

Streamflow Databases
We use four data sources providing streamflow observations and their basin boundaries: (a) Global Streamflow Indices and Metadata archive (GSIM) (Do et al., 2018a;Gudmundsson et al., 2018a); (b) African Database of Hydrometric Indices (ADHI) (Tramblay et al., 2021); (c) Australian Bureau of Meteorology's Hydrologic Reference Stations (HRS); and (d) U.S. Geological Survey (USGS).The GSIM is a global collection of metadata and indices derived from daily streamflow time series observed between 1806 and 2016 at more than 35,000 streamgages from 12 databases (Do et al., 2018a).It provides quality-controlled indices including mean, maximum, and minimum daily streamflows for monthly, seasonal, and yearly temporal scales (Gudmundsson et al., 2018a).The ADHI is an African database of indices computed from quality-controlled daily discharges at 1,466 stations over the period 1950-2018, including monthly time series of mean, maximum, and minimum daily discharges.The HRS provides a high-quality, unimpacted, daily streamflow series observed to early 2019 at 467 stations across Australia.The USGS also provides daily streamflow series over the U.S. and its territories.
From the above-mentioned data sources, we obtain the seasonal maximum mean daily discharge series between 1949 and 2019 and screen the stations as follows.
1.Only seasonal maxima based on at least 85 daily observations are considered (Gudmundsson et al., 2018a).2. Only years with seasonal maxima for all four seasons are considered.Here we consider the water year that starts in the season with the lowest median of seasonal maxima among the four seasons, similar to Wasko et al. (2020b).3.Only stations for which four seasonal maxima are available for at least 30 years are considered.
Additionally, we excluded the stations whose basin is located within a "polar" region based on the Köppen-Geiger climate classification (Beck et al., 2018).Although GSIM provides observations from streamgages worldwide, we use ADHI and HRS data sets for Africa and Australia, respectively, since they have more up-to-date observations with higher quality compared to GSIM.For the conterminous U.S., we use the USGS seasonal maximum discharges up to 2019 from Kim and Villarini (2023a).As a result, a total of 8,211 stations are selected (as mentioned below, this number is further reduced based on the modeling results).Figure S1 in Supporting Information S1 illustrates locations, basin boundaries, basin sizes, and record lengths of the selected stations worldwide, divided into seven regions (North America, NA; South America, SA; Europe, EU; Africa, AF; Northern Asia, NAS; Southern Asia, SAS; and Oceania, OC).

Meteorological Data
We consider gauge-based global precipitation and temperature data for modeling flood extremes.The Global Precipitation Climatology Center (GPCC) provides the Full Data Product (V2020) of gridded (0.25°× 0.25°) monthly precipitations from 1891 to 2019 by interpolating 85,000+ rainfall gauges worldwide (Becker et al., 2013;Schneider et al., 2020).From the GPCC V2020, we obtain monthly precipitation data.For temperature, we use the GHCN_CAMS gridded (0.5°× 0.5°) 2 m temperature data set (Fan & Van Den Dool, 2008), a global monthly mean land surface air temperature from 1948 to present.We calculate basin-averaged values of monthly precipitation and temperature by taking the mean of the grid cells belonging to the basin area, and then compute the seasonal means of these basin-averaged monthly series (hereafter referred to as basin-and seasonaveraged precipitation and temperature).

Method
To model the inter-annual variability in flood extremes, we apply the statistical attribution approach developed by Kim and Villarini (2023a).For each streamgage and each season, we first develop a statistical model for seasonal maximum mean daily discharge using climate predictors (DJF: December-February; MAM: March-May; JJA: June-August; SON: September-November).For this, we use the nonstationary gamma distribution (or zeroadjusted gamma distribution for observations with zero discharge values), widely used for modeling extreme flood magnitude (e.g., Kim & Villarini, 2023a, 2023b;Miniussi et al., 2020).To attribute the inter-annual variability in seasonal maxima, basin-and season-averaged precipitation and temperature in the season of interest are used as covariates for the distribution parameters.We also consider the season prior to the season of interest to capture lagged effects (e.g., antecedent wetness).Based on the parameterization in the Generalized Additive Models for Location, Scale, and Shape (Rigby & Stasinopoulos, 2005), we can define location (μ), scale (σ), and the probability of zero discharge occurrence (ν) parameters of the gamma and zero-adjusted gamma distribution as below: where P con (T con ) and P lag (T lag ) are basin-and season-averaged precipitation (temperature) in the concurrent and previous seasons, respectively.Depending on the linear combination of predictors, we fit all possible models to the observed seasonal maximum discharge series for the whole period.We then select the best one based on the Schwarz's Bayesian criterion (Schwarz, 1978).If the distribution parameters are not estimated due to no convergence of the fitting procedure (i.e., maximum likelihood estimation) or the maximum value of μ during the observation period is more than 10 times the observed highest seasonal maximum discharge, we consider the modeling not successful.
After selecting the best model for each season and station, we employ a Monte Carlo approach to simulate annual maximum daily discharge series by combining four seasonal models.Since the driest and wettest seasons are different geographically worldwide (Figures S2 and S3 in Supporting Information S1), we consider the water year whose beginning season is the season with the lowest median of seasonal maxima.For every year, we first generate one seasonal maxima from each of the four selected seasonal models and select the largest one as the annual maximum value.By iterating this step 1,000 times, we obtain the distribution of annual maximum daily discharge series over the whole observation years (see Kim & Villarini, 2023a for details).To assess the model performance in describing the inter-annual variability of flood extremes, we calculate the Spearman correlation coefficient between observations and the median (i.e., 50th percentile) of the modeled seasonal and annual maximum discharges.Furthermore, we conduct leave-one-out cross-validation to evaluate the model performance.For each season, we estimate the parameters of the best model excluding one seasonal maximum value at the time and predict its value based on the values of the predictors for the given year.We iterate this process for each year to obtain a time series of seasonal and annual maxima for the period of record.We then compute the Spearman correlation coefficient between the cross-validated time series and observations.Finally, we perform sensitivity analyses to examine how the changes in seasonal climate predictors affect the changes in the magnitude of flood extremes.This data-driven approach is relevant to assess flood elasticity to climate drivers (Y.Zhang et al., 2023).We set changing conditions, adding climate drivers one by one, starting with the most important one.Because the tendency of temperature change is generally similar throughout seasons, we consider both T con and T lag changes at once.Consequently, we set three conditions for changing climate drivers: (a) only P con changes (Con1); (b) both P con and P lag change (Con2); and (c) all four climate drivers change (Con3).Considering the range of projected changes in globally averaged annual land precipitation (Lee et al., 2021), we limit the change in precipitation to 10%, and do the same for temperature.We conduct Monte Carlo experiments by varying the climate predictors between 10% and 10% for each condition and season, allowing us to explore the sensitivity of the annual maximum discharges to changes in the drivers.Here we repeat the simulations 10,000 times to obtain more reliable flood quantiles with a lower annual exceedance probability (AEP).For a given AEP (e.g., the 0.01-AEP refers to the 0.99 quantile), we obtain the median of the flood quantiles over all the observation years (Q AEP med ); we then calculate its relative changes (r∆Q AEP med ) based on the Q AEP med with no changing condition (Q .nc AEP med ) as below.

Results and Discussion
Out of 8,211 stations, we evaluate the model performance for 7,886 stations with no model failure in all seasons.
Our statistical approach performs well in reproducing the inter-annual variability of annual maximum discharges worldwide, having a correlation coefficient larger than 0.56 for half of the stations for all seven regions (Figure 1).OC shows the best performance, with 75% of the stations having a correlation coefficient larger than 0.7, followed by AF and NA, with the median correlation coefficient of 0.67 and 0.66, respectively.For the performance of the seasonal maxima, there are differences across the regions (colored boxplots in Figure 1).While OC and AF show similar performances regardless of season, NAS and SAS show limited performances especially in the DJF season, with the median correlation coefficients of 0.36 and 0.23, respectively.Given that the DJF season is the driest season for most stations in these regions (Figure S2 in Supporting Information S1; e.g., Yao et al., 2010), we want to examine if the climatic conditions of the basins affect the model performance.To do this, we stratify the results depending on four climate types (Tropical, Arid, Temperate, and Cold) (Figure S4 in Supporting Information S1).Although the DJF season shows the lowest performance among the four seasons regardless of climate type, our seasonal models generally perform well, with a median correlation coefficient larger than 0.53 for all climate types and seasons.This good seasonal performance leads to a good performance in simulating annual maxima, with 75% of the stations having a correlation coefficient larger than 0.5 for all climate types.Crossvalidated annual maxima also exhibit good skills, with median correlation coefficients in the range of 0.49-0.76and 0.54-0.61depending on regions and climate types, respectively (light gray boxes in Figure 1 and Figure S4 in Supporting Information S1).These results indicate the robustness of our statistical approach in capturing the year-to-year variability of annual maximum discharges regardless of climate types.
To identify what climate drivers are responsible for the changes in the magnitude of seasonal maximum discharges, we investigate the selected covariates for the location parameter of each seasonal model.Figure 2 shows the percentage of stations where the climate covariates are selected for the location parameter depending on climate type and season.P con is the most selected covariate with a positive coefficient at more than 67% of stations for all seasons and climate types, indicating that seasonal precipitation in the concurrent season of interest is without doubt the most important climate driver for flood extremes.P con is followed by P lag , which is selected consistently worldwide with a positive coefficient.On the other hand, temperature is selected less often and shows a different pattern across the globe compared to precipitation.T lag is generally selected at more stations than T con and has a negative coefficient.Given that precipitation and temperature clearly have positive and negative relationships with soil moisture, respectively, the selection of P lag and T lag highlights their significant role in flood generation as proxies of antecedent wetness; T lag is more frequently selected in the MAM and JJA seasons.These results point to the role of lagged climate predictors as a proxy for snow accumulation and melt in the snowmeltdominated basins such as the northern U.S. (Welty & Zeng, 2021) and Europe (Berghuijs et al., 2019).NAS and SAS present the lowest covariate selection ratio in the DJF season (Figure S5 in Supporting Information S1), with no covariate selected for the location parameter at more than 30% of stations.In other words, at many of the stations, the best DJF model has a time-invariant location parameter, resulting in the poor performance of the model for this season in describing the year-to-year variability in flood magnitudes.The lack of covariate selection for the location parameter may be due to the limited suitability of our selected climate predictors in explaining the physical processes responsible for flood extremes in DJF in these regions.Nevertheless, given that the DJF season rarely contributes to the annual maxima in NAS and SAS (Figure S6 in Supporting Information S1), the limited performance of the DJF models does not impact the overall modeling of the annual maxima, which is the focus of this study.
We further examine if there is any difference in the climate covariate selection between concurrent and lagged seasons depending on basin-averaged aridity calculated from the Global Aridity Index and Potential Evapotranspiration Database-Version 3 (Global-AI_PET v3) (Zomer et al., 2022).Figure S7 in Supporting Information S1 shows the total covariate selection ratio of the location parameter for the four seasonal models depending on the four aridity classifications (UNEP, 1997).P con is selected at 79.4% of the stations in the arid regions, and from 76.5% to 90.3% as the basins become wetter (i.e., from semi-arid to humid).On the other hand, P lag is selected only at 31.3% of stations in the arid region, which is only slightly more than half as many as in the other aridity classes.Compared to precipitation, temperature does not exhibit a strong dependence on aridity.
Only in the arid region, however, T con is selected more often than T lag .These results indicate that the drier the region, the smaller the influence of the lagged effects of precipitation and temperature (e.g., antecedent wetness) compared to their concurrent ones.
The flood response to climate drivers can be affected by physical characteristics of a river basin such as topographic (Edokpa et al., 2022) and anthropogenic (Konrad, 2016) factors closely related to rainfall-runoff processes (e.g., time of concentration, soil infiltration).Therefore, we examine how the performance of our climateinformed models is affected by the basins' physical characteristics.Here we consider the basin size, dam density, and average depth of dam capacity (i.e., dam storage capacity/basin size) available from the streamflow databases we considered (Figures S1 and S8 in Supporting Information S1).We also consider the ratio of impervious areas from the global impervious-surface data set (GISD30) (X.Zhang, Liu, et al., 2022).When we stratify the results depending on these four characteristics (Figure 3), it is difficult to find any clear pattern, suggesting that the climate drivers we considered have a greater influence on the inter-annual variability in the magnitude of flood extremes compared to the effect of basin size, dam density, average depth of dam capacity, and impervious area.
The above-mentioned results demonstrate the robustness of our statistical models across geographical regions, climate types, and basins' physical characteristics.Based on these models, we examine the sensitivity of flood extremes to changes in the climate drivers.Figure 4 shows the relative changes in the median of the 0.01-AEP discharge (r∆Q 0.01 med ) for changes of 10%, 5%, +5%, and +10% in the seasonal climate drivers, stratified by climate types.When only P con changes (Con1), the Q 0.01 med values change in the same direction, with ∼95% of stations showing changing ratios smaller than those in P con for all seasons and climate types.Con2 (i.e., both P con and P lag change) shows larger changes than for Con1, even though they are generally below the percentage change used for the sensitivity.The r∆Q 0.01 med under these two conditions result from the selection of P con and P lag at most stations with a positive coefficient (Figure 2).When changing all four climate drivers (Con3), the overall patterns (i.e., median of boxplots) are similar to those under Con2, but there is much more variability among stations (orange boxes in Figure 4).To understand these results, we need to remember that T con and T lag can affect the Geophysical Research Letters 10.1029/2023GL107480 changes in flood extremes both positively and negatively, causing either a compounding or offsetting effect of the precipitation changes.When we stratify the results depending on the seven regions, there are much more regional and seasonal differences in the distribution of r∆Q 0.01 med (Figure S9 in Supporting Information S1), suggesting that temperature changes may modulate the effect of precipitation on flood extremes by modulating seasonally and locally different flood generation processes (e.g., moisture-holding capacity, antecedent conditions, snowmelt).This sensitivity analysis allows not only the identification of the key seasons controlling the changes in flood extremes, but also understanding the potential response of annual flood peaks to changes in seasonal precipitation and temperature.

Conclusions
We investigated how well season-and basin-averaged precipitation and temperature can explain the inter-annual changes in the magnitude of flood extremes at 7,886 basins worldwide.Despite using at most only four predictors averaged across the entire season and basin, our models showed robust performance in capturing year-to-year variability in seasonal and annual maximum daily discharges regardless of the geographical region, climate type, basin size, and dam density.These results suggest that seasonal precipitation and temperature are suitable to characterize the complex interplays of different climate and catchment processes leading to peak discharges.The suitability of seasonal precipitation is consistent with Liu et al. (2022), who showed that precipitation with longer durations (e.g., 30 and 90 days) dominates the changes in global flood magnitude, while precipitation with shorter durations (e.g., 1 and 7 days) has more limited impacts.
In addition to the robustness of our climate-informed models, they enabled a sensitivity analysis of flood extremes to changes in precipitation and temperature.Based on our results, we should not expect that a certain percentage of changes in seasonal precipitation would lead to the same magnitude changes in annual maximum discharge, notably due to the strong influence of soil moisture storage in many regions (Wasko, Nathan, et al., 2021) and the change/differing of dominant flooding mechanisms (Quintero et al., 2022;S. Zhang, Zhou, et al., 2022).Although this study considered only 12 scenarios (i.e., 3 conditions × 4 changing rates) of changing climate drivers, these nonlinearities in the system can be further assessed by considering more combinations and changing rates of climate drivers.Another advantage of this modeling framework is that we do not assume a fixed seasonal contribution to the annual maximum discharge; instead, the mixing varies from year to year to reflect the intraannual variability in precipitation and temperature.Finally, given that precipitation and temperature are commonly available from numerous global climate models, we expect that our models can enable to reliably project changes in flood extremes as well as in their generating processes.

Figure 1 .
Figure 1.Maps and boxplots (dark gray) showing the model performance in simulating annual maximum discharge for 7,886 stations classified into seven regions.Red crosses represent the 325 stations removed due to the failure of the seasonal models.The boxplots also show the correlation coefficient based on leave-one-out cross-validation (light gray boxes) and seasonal models for DJF (blue boxes), MAM (green boxes), JJA (orange boxes), and SON (yellow boxes).In the boxplot, the limits of the box (whiskers) represent the 25th and 75th (5th and 95th) percentiles, while the line inside of the box indicates the median.

Figure 2 .
Figure 2. Covariate selection ratio of location parameter (μ) of the seasonal models depending on four climate types.

Figure 3 .
Figure 3. 2D histogram of the Spearman correlation coefficients calculated between observations and the median of the fitted and simulated distributions of seasonal (top-four rows) and annual maxima (bottom row), respectively, as a function of the basin size (first column), dam density (second column), average depth of dam capacity (third column), and ratio of impervious area (last column).Based on data availability, the results for dam density include stations belonging to the Streamflow Indices and Metadata (GSIM), African Database of Hydrometric Indices, and U.S. Geological Survey (USGS) databases, while those for average depth of dam capacity only to the GSIM and USGS ones.

Figure 4 .
Figure 4. Sensitivity of the median of 0.01-annual exceedance probability discharge (Q 0.01 med ) to changes in only concurrent precipitation (Con1; light blue boxes), both concurrent and lagged precipitations (Con2; blue boxes), and all four climate drivers (Con3; orange boxes) depending on climate types.Each column shows the results for four different changes ( 10%, 5%, +5%, and +10%) in climate drivers, while each season's results are presented in a row.