A Method to Assess and Explain Changes in Sub‐Daily Precipitation Return Levels From Convection‐Permitting Simulations

Reliable projections of extreme future precipitation are fundamental for risk management and adaptation strategies. Convection‐permitting models (CPMs) explicitly resolve large convective systems and represent sub‐daily extremes more realistically than coarser resolution models, but present short records due to the high computational costs. Here, we evaluate the potential of a non‐asymptotic approach, the Simplified Metastatistical Extreme Value (SMEV) to provide information on the future change of extreme sub‐daily return levels based on CPM simulations. We focus on a complex‐orography area in the North Eastern Italy and use three 10‐year time periods COSMO‐crCLIM simulations (2.2 km resolution) under RCP8.5 scenario. When compared to a block r‐maxima approach currently used in similar applications, the proposed approach shows reduced uncertainty in rare return level estimates (about 5%–10% smaller confidence interval) and can improve the quantification of future changes from CPM simulations. We evaluate these changes and their statistical significance in return levels for 1–24 hr durations. The changes show an interesting spatial organization associated with orography, with significant positive changes located at high elevations. These positive changes tend to increase with increasing return period and decreasing duration. Because SMEV can separate the roles of event intensity and occurrence, it allows for physical interpretations of these changes. We suggest that non‐asymptotic approaches permit the quantification of change in rare extremes within available CPM runs.


Introduction
Short-duration extreme precipitation is often caused by convective processes and can trigger several water-related hazards, such as flash-floods, urban floods, debris flows and landslides, which, in turn, can cause severe damages and numerous victims (e.g., Formetta & Feyen, 2019;Paprotny et al., 2018).Convective activity is projected to increase in a warming climate (e.g., Fowler, Lenderink, et al., 2021;Fowler, Wasko, et al., 2021;IPCC, 2019;Prein et al., 2017), leading to increased hazard, which is not accounted for in current design standards for floodprotection and risk management practices.Thus, we urgently need accurate and reliable projections of future extreme short-duration precipitation to improve adaptation strategies.
Convection-Permitting climate Models (CPMs) are characterized by a fine spatial resolution (a few kilometers) that can explicitly resolve deep convective systems.This is a significant improvement with respect to coarser resolution models, such as Regional Climate Models, or RCMs, in which convection is parametrized as a subgrid-scale process.Due to the explicit representation of convective physics, CPMs provide a more realistic representation of sub-daily precipitation, in terms of the probability distribution of event intensity and of their spatial structure (e.g., Ban et al., 2020;Berthou et al., 2020;Kendon et al., 2014;Lind et al., 2016;Prein et al., 2015).Especially in mountainous areas, the finer representation of orography together with the ability to resolve convection results in a reduction of model bias with respect to observed precipitation (Fosser et al., 2015;Lind et al., 2016;Reder et al., 2020) and in an improvement of the characterization of extreme sub-daily precipitation (Ban et al., 2020).The use of CPMs was also shown to improve the estimation of precipitation return levels in orographically complex regions (Poschlod, 2021;Poschlod et al., 2021), although some biases remain (Dallan et al., 2023).The improved realism in CPM-modeled extremes, and their reduced model uncertainty, allows greater confidence in their projections, especially for short-duration high-intensity precipitation (Fosser et al., 2020(Fosser et al., , 2024;;Kendon et al., 2014Kendon et al., , 2017)).
The use of CPM projections for the estimation of future changes in extreme precipitation in complex terrain is still in its infancy, but could help us devise and implement effective adaptation strategies to cope with rainfall-driven hazards.Several studies estimated extreme precipitation from CPM by using percentile methods (e.g., Ban et al., 2014Ban et al., , 2015)), but practitioners require estimates of rare extremes, corresponding to high return levels, which are much rarer than the typical percentile thresholds used in climate studies.Because of the high computational costs, CPMs simulations are, on the other hand, commonly available only for relatively short periods (up to 30 years, but typically around 10-20 years), which can lead to high uncertainties in return level estimates when using traditional extreme value approaches (e.g., see Poschold, 2021).To the best of our knowledge, just a few studies estimated future changes in short-duration precipitation return levels based on CPMs (Chan et al., 2014(Chan et al., , 2018;;Ban et al., 2020;Rybka et al., 2022) instead of just analyzing changes in high percentiles.Chan et al. (2018) and Rybka et al. (2022) used a peak-over-threshold approach, and Ban et al. (2020) used a modified blockmaxima (3-largest) approach, in which three maxima per year are retained instead of one.Due to the large associated uncertainties, however, they could provide only mean estimates of the changes in hourly and daily return levels over large domains (e.g., for the entire Europe in Ban et al., 2020, in 12 "natural" regions in Rybka et al., 2022) or for relatively low return periods (e.g., Chan et al., 2018 focused on 5-year return levels and provided spatially averaged results up to 30-year return levels).
Here, we propose the use of a non-asymptotic statistical method for the analysis of extremes that may be effectively applied to the CPM short time-slices to reduce the stochastic uncertainties in the return level estimates.A new non-asymptotic method, known as the Metastatistical Extreme Value Distribution (MEVD; Marani & Ignaccolo, 2015;Zorzetto et al., 2016) has recently been presented, followed by a simplified formulation, the Simplified MEV (SMEV; Marra et al., 2019Marra et al., , 2020)).These non-asymptotic methods use information from a large proportion of the available data, instead of just using one maximum per year or a few values exceeding a high threshold (Marani & Ignaccolo, 2015).It has been demonstrated that both methods can reduce the stochastic uncertainty in the estimation of rare rainfall return levels with respect to classical extreme value analyses, especially when short data records are available (e.g., Marra et al., 2018;Miniussi & Marani, 2020;Zorzetto et al., 2016).While many methods were developed to quantify, and possibly reduce the uncertainty associated with traditional estimation approaches, applications of existing statistical models to CPM simulations did not prove satisfactory.Poschlod (2021) explored different extreme value methods for estimating daily extreme precipitation from 30 years CPMs data set, including the novel MEVD.In terms of uncertainty, he demonstrated that MEVD is preferable over block maxima and peak-over-threshold methods, although it was found subject to a systematic underestimation of the return levels.
The accuracy of non-asymptotic methods is directly linked to the accuracy of the assumption about the ordinary events distribution.In some regions of the globe, the two-parameter Weibull typically used to describe the distribution of ordinary precipitation events seems to be a good model only for the tail ordinary events distribution, rather than the entire body (Marra et al., 2023;Wang et al., 2020).Indeed, the analytical derivations by Wilson and Toumi (2005) that underpin the use of Weibull distributions for precipitation relate to the tail of the distribution.Using a left-censoring threshold, one can use the two-parameter Weibull to describe the right tail of the ordinary events distribution, overcoming the above issue (e.g., see Miniussi and Marra, 2021 that focused on the same area as Poschlod, 2021).This however decreases the amount of data points available for parameter estimation, especially when parameters need to be estimated on yearly basis like in MEVD.Compared to MEVD, the Simplified Metastatistical Extreme Value (SMEV) approach neglects the inter-annual variability.Although not exact, this approximation increases the number of observations available for parameter estimation, and thus allows robust estimates in presence of left-censoring thresholds.
The SMEV has been recently applied to study the orographic impact on precipitation extremes at different durations (Amponsah et al., 2022;Formetta et al., 2022;Marra et al., 2021;Marra, Armon, & Morin, 2022).By effectively exploiting the relatively short rainfall records usually available in mountain regions, SMEV does not require regionalizations (e.g., Buishand, 1991) or duration-scaling approaches, which may smooth existing orographic effects or other gradients in the statistics of the extremes.SMEV has already been successfully used on CPM short time-slice in few recent works.Dallan et al. (2023) applied SMEV to a 10-year reanalysis-driven CPM for analyzing the CPM's ability to represent the observed orographic effect on hourly precipitation return levels up to 100 years return time.They showed that CPM reproduces the observed decrease of rainfall intensity with elevation (reverse orographic effect) for 1 hr extreme rainfall, although with weaker magnitude with respect to observations.Shmilovitz et al. (2023) used SMEV-estimated statistics of extreme precipitation up to 100-year return levels by CPMs to explain some observed differences in the geomorphological evolution of a steep desert cliff.One interesting advantage of non-asymptotic approaches, such as MEVD and SMEV, is the explicit separation of the intensity distribution from the occurrence frequency of the ordinary events (i.e., the number of occurrences of the independent processes), which allows linking directly the extremes to the properties of the underlying ordinary events.This ability was for example, used to explain changes in the statistics of extreme hurricanes (Hosseini et al., 2020) and, more recently, of extreme precipitation (e.g., Dallan et al., 2022;Marra et al., 2021;Marra, Armon, & Morin, 2022;Vidrio-Sahagún and He, 2022).
In this work, we apply SMEV to quantify and explain projected changes of sub-daily to daily precipitation return levels based on 10-year CPM simulations.Specifically, we exploit this non-asymptotic formulation to attempt, for the first time, an explanation of the projected changes in extreme precipitation in terms of variations in intensity and occurrence frequency of the storms.This allows us to provide a physical interpretation of the projected changes.We focus on an orographically complex region in northeastern Italy, where past trends in extreme shortduration (hourly) precipitation over the last decades were found significantly positive (Libertino et al., 2019).This makes it a challenging and interesting study case for the estimation of future changes in extreme precipitation from a CPM model.The specific objectives are the following: (a) evaluation of SMEV uncertainty compared with a modified Generalized Extreme Value (GEV) 3-largest approach recently applied to CPM simulations; (b) assessment of the bias with respect to estimates based on long rain-gauge records; (c) estimation of future changes in rare precipitation return levels (up to 100 years) at different sub-daily durations for two future time-slices; (d) demonstration of the potential of the SMEV approach for providing a physical interpretation of projected changes.

Study Area and Data
We focus on an Alpine area of about 32,000 km 2 in northeastern Italy, characterized by complex orography with elevation ranging from 5 to 3,990 m a.s.l.(Figure 1) and a high climatic heterogeneity.The mean annual precipitation is about 800 mm yr 1 in the south-eastern part of the domain, mostly flat, and increases toward the central part of the domain (2300-2500 mm yr 1 ), where the Prealps represent the first orographic obstacle to the dominant atmospheric systems (e.g., Isotta et al., 2014).Drier conditions are found in the north-western part of the area (about 500 mm yr 1 , on average) which is shadowed by the surrounding mountains against prevailing moisture winds from south-east.Synoptic-scale precipitation events at the 24 hr, or longer, timescale are mainly generated by large-scale patterns associated with the Atlantic storm track and Mediterranean circulation and mostly occur in fall and winter.Shorter-duration extreme events in summer are convective in nature in the coastal zone and convective-orographic in the pre-alpine region and in the northern part of the domain (Norbiato et al., 2009).In the mountainous part of the area, Libertino et al. (2019) reported a significant positive trend in observed annual maxima (AM) for 1-24 hr rainfall durations in the period 1928-2014.Dallan et al. (2022) estimated a significant positive trend in return levels for 15 min to 6 hr durations in the period 1991-2020, and associated the stronger increase at the sub-hourly durations to an increased proportion of convective events in the summer.This tendency is also confirmed by a projected increase in lightning activity (Kahraman et al., 2022).

Climate Model Simulations
The model simulations used in this work were performed by ETH Zurich with the state-of-the-art weather prediction COSMO (Consortium for Small-Scale Modeling in Climate Mode) model, running on GPU in climate mode, here called COSMO-crCLM (Rockel et al., 2008).The non-hydrostatic limited-area COSMO-crCLM solves numerically on a three-dimensional Arakawa-C grid (Arakawa & Lamb, 1977) the fully compressible governing equations using finite difference methods (Förstner & Doms, 2004;Steppeler et al., 2003) and a thirdorder Runge-Kutta time-stepping scheme (Wicker & Skamarock, 2002).The model uses for horizontal advection a fifth-order upwind scheme, and an implicit Crank-Nicholson scheme in the vertical direction, discretized in 60 stretched model levels ranging from 20 m to 23.5 km (Baldauf et al., 2011).The physical parameterizations adopted here include a delta-two-stream radiative transfer scheme according to Ritter and Geleyn (1992) and a single-moment bulk cloud microphysics scheme with five categories of hydrometeors, that is, cloud water, cloud ice, rain, snow, and graupel (Reinhardt & Seifert, 2006).Shallow convection is parameterized using a modified version of the Tiedtke mass flux scheme with moisture convergence closure (Tiedtke, 1989), while deep convection is explicitly resolved at convection-permitting scale.COSMO-crCLIM employs a turbulent kinetic energy-based parameterization for the planetary boundary layer and the surface transfer (Mellor & Yamada, 1982;Raschendorfer, 2001), and the TERRA-ML soil-vegetation-atmosphere-transfer model, with a 10-layer soil and a maximum soil depth of 15.24 m in the lower boundary (Heise et al., 2006).More details on the physical parameterizations used can be found in Leutwyler et al. (2016).
The CPM at ∼2.2 km resolution is nested within the convection-parameterized RCM covering the Coordinated Regional Climate Downscaling Experiment (CORDEX) European domain at 12 km, which is in turn driven by the Earth System Model of the Max-Planck-Institute (MPI-ESM-LR; Stevens et al., 2013)   The same CPM, but driven by ERA-Interim for the period 2000-2009, was evaluated over the greater Alpine domain in Ban et al. (2021) and over our study area by Dallan et al. (2023).Ban et al. (2021) found that the bias compared with several observational data sets is limited and similar to the other CPMs from the CORDEX-FPS project.Dallan et al. (2023) found that the COSMO-crCLM can capture the observed orographic effects (decrease in the intensity of extreme hourly precipitation with elevation) but tends to overestimate extreme hourly precipitation at high elevations.

Observational Precipitation Data
We use observational 5-min resolution precipitation data from 174 rain gauges already used in Dallan et al. (2023) to evaluate the bias in the historical simulation (Figure 1).The rain gauges cover an elevation range from 3 to 2235 m a.s.l. and a time period from 1983 to 2020, with record lengths varying from 14 to 37 years (28 years on average).We use here the complete time series to have the most robust estimation of the observed extreme precipitation as a benchmark for the CPM evaluation.Rain gauge data is aggregated at 1h to match the CPM temporal resolution, while years with more than 10% of missing data are excluded from the analysis.

Methods
This section describes the methodology to: estimate return levels and their uncertainty; assess the CPM biases with observations; estimate the projected changes in return levels, and evaluate their statistical significance.For the CPM, precipitation times series at each grid cell are treated independently.

Simplified Metastatistical Extreme Value Approach
The SMEV is an approximation of the Metastatistical Extreme Value first introduced by Marani and Ignaccolo (2015) (MEVD, see also Zorzetto et al., 2016) in which the inter-annual variability is neglected (Marra et al., 2019).As opposed to traditional extreme value theory, in which the maximum values of asymptotically large blocks (n→ ∞) or the Poisson exceedances over an asymptotically high threshold (θ→ ∞) are examined, these methods are based on the analysis of all the independent realizations of the (rainfall, in this case) process.These realizations are usually termed ordinary events.MEVD and SMEV are thus non-asymptotic methods, because they do not hinge on asymptotic behaviors, and are closely related to ordinary statistics (Marani & Zorzetto, 2019;Serinaldi et al., 2020).In addition to the reduced parameter estimation uncertainty (e.g., Zorzetto et al., 2016), these methods may help associating the emerging statistics with the underlying processes (e.g., Araujo et al., 2023;Dallan et al., 2022;Hosseini et al., 2020).Detailed background on the method can be found in Marani and Ignaccolo ( 2015 The main idea underlying non-asymptotic methods is that, once the tail behavior of the parent distribution of the ordinary events F(x) is known, the distribution of the emerging block maxima can be written, in the simplified SMEV form, as: where n is the average number of ordinary events in a block.In the case of rainfall, F(x) , or its tail, is usually assumed to be a two-parameter Weibull distribution, an assumption emerging from thermodynamic reasoning (Wilson & Toumi, 2005) and supported by observations (e.g., Marra et al., 2020Marra et al., , 2023;;Zorzetto et al., 2016) and stochastic modeling results (Papalexiou, 2022).The two-parameter Weibull distribution is written as: Water Resources Research where λ is a scale parameter and κ is a shape parameter that defines the "heaviness" of the tail (i.e., how quickly the cumulative distribution function goes to 1).In general, larger shape parameters are associated with lighter tails, with κ = 1 corresponding to exponential tails, κ < 1 to tails heavier than exponential and κ > 1 to tails lighter than exponential.
Here, we define the ordinary events as described in Dallan et al. (2023): (a) we identify the independent storms in the time series as consecutive wet periods separated by dry hiatuses of at least 24 hr, (b) ordinary events are computed for each duration of interest (1, 3, 6, 12, 24 hr) as the maximal intensities within each storm, using running windows of that duration moved with 1 hr steps.We define the tail of F(x) (i.e., the portion of F(x) which is well approximated by a two-parameter Weibull distribution and from which block maxima are likely extracted)  2022).We define different percentile thresholds at different durations: 90th for 1h and 85th for longer durations.Once the threshold is defined, the corresponding parameters of the Weibull distribution are estimated using a Weibull coordinate transformation and a least-square linear regression, as proposed in Marani and Ignaccolo (2015).Return levels associated to a desired probability level can then be derived inverting Equation 1.All codes are made available in Marra (2020).

GEVr: r-Largest Block Maxima
The GEV analysis (Coles, 2001;Katz et al., 2002;Wilks, 2011) is an established method to estimate return levels from the block maxima of precipitation time series.The GEV's cumulative distribution function G GEV can be written as with location μ, scale σ, and shape ξ.
Here we use a modified version of this approach recently applied on 10years-long CPM runs in Ban et al. (2020).Specifically, two modifications to the classical block maxima approach method were implemented in order to improve the estimation of the distribution parameters and to avoid unrealistic estimates of the shape parameters.First, a modified maximum-likelihood estimator is used, which incorporates a Bayesian prior distribution for the shape parameter (Frei et al., 2006;Martins & Stedinger, 2000).Second, a r-largest approach (Coles, 2001) is applied, where r independent maxima per year (3 in our case, as in Ban et al., 2020) are considered.We ensure independence of the maxima by extracting them from the series of the independent ordinary events as identified in the SMEV method.Return levels associated to a desired probability level can then be derived inverting Equation 3. In the following, we refer to this modified GEV with the r-largest approach as GEVr.

Uncertainty Assessment Between SMEV and GEVr
The uncertainty associated with the return level estimates is quantified using a bootstrap procedure (Efron & Tibshirani, 1993).On the CPM data sets, 1,000 bootstrap surrogates are created by randomly sampling 10 years with replacement (Overeem et al., 2008) using the same random sequences for all the grid cells and time slices.SMEV and GEVr distributions are then fitted on each of the bootstrapped surrogates, obtaining 1,000 estimates of return levels for each duration and return period.At each grid point, the uncertainty in the return levels estimates X is then expressed as the normalized 90% confidence interval CI 90 evaluated from the distribution of the 1,000 return levels, with X 95 and X 5 representing the 95th and fifth percentile: We compare the uncertainty of SMEV and GEVr for the three CPM time slices and for different return periods representing different levels of extrapolation from the available 10 years of data record.

Water Resources Research
10.1029/2023WR035969 Then we evaluate whether 10 years of data is a sufficiently large pool to assess the uncertainty or a longer time series is required to estimate the "true" uncertainty.Thus, we use a Monte Carlo experiment to generate a synthetic population consisting of 10 4 years of data with 70 events per year from a Weibull distribution with shape parameter 0.7, typical of the study area, while the scale parameter set to 1 does not affect the results.We then considered different record lengths L ranging from 1 to 50 years and computed the CI 90 from both SMEV and GEVr, based on: (a) 1,000 bootstrap samples with replacement of L years out of the synthetic population (the 10 4 years), and (b) 1,000 bootstrap samples with replacement of L years out of L years; this second step is repeated 500 times for each length L. From the first option we obtain the true uncertainty, from the second one the estimated uncertainty, for both SMEV and GEVr.

Assessment of Model Bias and Future Changes
From the previous analysis, at each location and for each duration, the following quantities (generically denoted with X in the following equations) are obtained: (a) AM and their mean value for the whole series, (b) average yearly number of ordinary events n, (c) scale and shape parameters of the Weibull distribution for SMEV, (d) return levels up to 100 years return period.For these quantities, we analyze bias and future changes.

Computation of Model Bias and Future Changes
The biases with respect to observational data are evaluated for the historical CPM (control period).As in Dallan et al. (2023), the bias assessment is based on the CPM hourly precipitation data extracted at the grid point closest to the rain gauge.The bias is assessed by comparing observation (OBS) and station co-located historical CPM (SC_hist), for all the quantities X obtained from the frequency analysis.The relative bias B X is computed as the relative percentage difference of the CPM result X SC hist with the observed result X obs : In this work, we evaluate the bias in the model, but no bias correction is applied to the climate simulations to avoid adding additional uncertainty in the climate projections (Maraun et al., 2017).The main assumption here (see e.g.Chen et al., 2021;Maraun, 2016) is that CPMs provide a plausible representation of climate change-induced variations in precipitation extremes, even though biases may affect simulations in historical period runs.
The climate change signal in 10-year long near future and far future simulations with respect to the control scenario is evaluated for all the CPM grid points (∼6,500) in the study area.Assuming that the model bias is constant over time, the future relative change C X of each quantity X is computed by comparing the historical and future results in each grid cell of the study area.It is expressed as the relative percentage difference between the near and far future CPM result X near,f ar with the historical CPM result X hist :

Statistical Significance of Biases and of Future Changes
To assess if the differences between observation and historical CPM (to test for the biases) or historical and future CPM (to test for the changes) are statistically significant, we perform formal hypotheses testing procedures.Specifically, we adopt a nonparametric permutation test (Pesarin, 2001), which requires no assumption about the distribution of the data.
We used the same procedure for testing the statistical significance of both biases and future changes.The test is independently applied to each location (grid point) as follows: 1. we label the control data (i.e., the empirical observations) as Group A, and the test data (i.e., historical CPM) with Group B; 2. we compute SMEV parameters and return levels separately for Group A and Group B and calculate the differences between the two groups (as described at Section 3.2); Water Resources Research 10.1029/2023WR035969 DALLAN ET AL.
3. the two groups are then randomly permuted 1,000 times, thus generating 1,000 surrogates of group A and 1,000 surrogates of group B. Each of the surrogates of group A will now contain elements of group B and viceversa; 4. we repeat step ii) for each surrogate A-B pair, obtaining 1,000 differences in SMEV parameters and return levels between each of the two permuted groups; 5. we check if the differences between the original samples can or cannot be distinguished from the differences between mixed samples.Specifically, if the original difference computed at step ii) is within the 2.5th-97.5thempirical percentile of the 1,000 surrogate differences computed at step iv), the difference is not considered statistically significant.Otherwise, the original difference (bias or change) can be considered significant at the 5% level.
The same procedure is used to investigate the significance of the climate change signal, but considering in this case as group A the historical CPM, and as group B the future CPM simulations.

Uncertainty Assessment Between SMEV and GEVr
The analysis of the uncertainty associated with the GEVr and the SMEV approaches is reported in Figure 2. Here, we compare the normalized 90% confidence intervals (CI 90 ) of the return levels estimated with the two methods for different durations, probability levels (20-year and 100-year return period), and time periods (near and far future).The error bars in Figure 2 represent the range of variation of the CIs in the area, that is across the 6,500 grid points, with the median CI 90 indicated with a dot, and thick and thin lines indicating inter-quartile range and 5th-95th range, respectively.The average CI 90 for SMEV is ∼30-35% for the 20 years return time and ∼35-40% for the 100 years.For the 20 years return levels (panels a, b,c), GEVr and SMEV have similar CI 90 at 1h duration; for longer durations SMEV presents generally smaller uncertainty, with both the median value in the area and the range being smaller than for GEVr.This is more evident for the 100 years return time (panels d, e, f): at 1 hr duration SMEV CI 90 is slightly smaller than for GEVr, and for the longer durations it is about 5% smaller in the median value and 10% smaller in the 95th value than GEVr.The uncertainty reduction in SMEV when duration is increased from d = 1h to d > 1h can be related to the different portion of the distribution used for the analysis, that is respectively the top 10% at 1h and 15% for the longer durations.Figure S1 in Supporting Information S1 shows the "true" uncertainty (as defined in Section 3.1.3)and the estimated uncertainty (median and inter-quartile range across the 500 iterations is shown) as a function of the record length L. For our case, that is, L = 10, it is worth noting two aspects.First, "true" uncertainty is smaller for SMEV than for GEVr.Second, the estimated uncertainties of SMEV and GEVr are similar, but the SMEV estimated uncertainty is similar to the true one, while the GEVr estimated uncertainty is systematically underestimated with respect to the true uncertainty even for higher L. For a 10-year-long record, the estimated CI 90 is ∼86% of true CI 90 for SMEV while the estimated CI 90 is ∼78% of true CI 90 for GEVr.This ancillary analysis gives us confidence in the interpretation of the estimated uncertainties.We also point out that estimating bootstrap uncertainties from samples shorter than about 10 years leads to strong underestimations of the true uncertainty with both methods.
Thus, SMEV generally produces a lower uncertainty in the estimation of rare return levels, and a reduced underestimation of the "true" uncertainty with respect to GEVr.

Bias Assessment
The results of bias assessment for the historical period 1996-2005 are shown in Figure 3 for two durations (1 and 24hr), for mean AM and for the 20 years return level.The results for the other durations are reported in the supplemental material (Figure S2 in Supporting Information S1).The 1 hr AM (panel a) biases exhibit a spatial pattern coherent with the topography of the study area: CPM tends to overestimate extremes in the north and north-west of the domain (violet colors), characterized by mountains, while slightly underestimate them over the lowland in the southeastern part of the area (orange colors).The statistically significant biases, ∼20% of the station-points, are concentrated in the mountains in the north part of the domain.Other scattered significant points are likely associated with the sensitivity of the test and are expected given the 5% significance level used.For the 24 hr (panel b), the overestimation is widespread but stronger and significant (∼38% of the station-points) in a transversal zone, crossing the domain south-west to north-east, that is mostly mountainous.The bias in the 20year quantiles show a spatial pattern consistent with the AM of the corresponding duration.The station-points showing significant bias are reduced compared to AM both for 1 and 24 hr duration (respectively ∼18% and ∼22%).On average in the area, the uncertainty in the 20 years return levels, expressed in terms of the coefficient of variation evaluated from the bootstrap results, is 6%-9% for the observations and 9%-11% for CPM, across all the durations.For the intermediate durations, the spatial pattern shown in Figure S2 in Supporting Information S1 is consistent with Figure 3, with the significant overestimation mostly located in the mountainous part of the domain.The higher percentage of significant bias is found at 3 and 6 hr durations (Figure S2 in Supporting Information S1).

Future Changes
Maps of the future changes are reported for the 20 years return levels in Section 4.3.1, while their variation with durations and return times is reported in Section 4.3.2 for the average significant change over the study area.

Projected Changes in Return Levels
The projected changes in the 20 years return levels for the near future with respect to the historical period are reported in Figure 4.The average change in the area is positive, in the range +8-10% across all the durations, with roughly similar spatial patterns.The quantiles corresponding to return periods examined are generally projected to increase significantly in many areas over the mountains, while in the lowlands the change is both positive and negative, and generally not significant.At the 1 hr duration most of the significant changes (∼12% of the domain) are positive and span areas ranging from south-western to north-eastern Alps, as well as the coastal zone.At the 3 hr duration the map of change is consistent with the previous one, with a less noisy pattern especially over the mountains, where a significant increase in the rainfall intensity is reported (significant points ∼17% in the domain).At longer durations, the increasing signal in the mountain is still present but is located in an inner Alpine zone mostly corresponding to Alto-Adige region, while in the lowlands most of the area shows a non-significant decrease.

Water Resources Research
10.1029/2023WR035969 The projected changes in the 20-year return levels for the far future with respect to the historical period are reported in Figure 5.The average change in the area is positive, around 20% for the 1h duration and reducing across durations till 10% for the 24 hr, and negative changes are reported only in small parts of the domain.Over the mountains the return levels are projected to significantly increase, particularly so at the shorter durations (1-3 hr) over the north and western part of the domain.At the longer duration return levels the increase is generally lower in magnitude, with the significant changes concentrated in the inner mountain region in the north of the domain.In the lowlands the change is not statistically significant, with both positive and negative values and no clear spatial organization.At 1 hr duration the signal of change in lowlands is quite noisy; at the longer duration, the western part of the lowland shows no change and low positive change, while the eastern area close to the coast shows generally a small decrease.
In Figure 6, we show the maps of future changes in the 100-year return levels for three durations (1, 3, 24 hr), and for the near and the far future (in Figure S3 in Supporting Information S1 the 6 and 12 hr durations).The results for this rarer return level are consistent with those for the 20 years return level, both in terms of spatial .In purple the positive bias (CPM overestimation), in orange the negative bias (CPM underestimation).Biases significant at the 5% level are indicated with a black circle, and the fraction of significant biases in the area is reported in each panel.
organization and of proportion of significant changes.The strongest and most significant increase is projected to occur in mountainous areas at the shorter durations.Changes at the daily duration are more localized in the inner part of the mountainous region.The signal in the lowlands is generally not significant.

Dependence of the Projected Changes on Duration and Return Period
The average change in the extreme precipitation over the domain is calculated as the mean value of the significant changes and is presented in Figure 7 for both near (left panels, a, c) and far (right panels, b, d) future.The dependence on duration (Figures 7a and 7b) is remarkably different between near and far future.There is no evident variation with duration for the average significant change for the near future, with rather uniform values of about 30%-50% for 5 and 100 years return periods.For the far future, the average significant change decreases with increasing duration, passing for example, from about 55% at 1 hr to about 40% at 24 hr for the 100-year return level.The dependence of the changes on the return period are shown in panel c for the near future and in panel d for the far future.The dependence appears similar in both the time slices, with increasing change for increasing return periods, although the different dependence on duration can be clearly noticed.For the near future, passing from 2 to 100 years return time the change increases by about +20% (from 25% to 45%), for all the durations.For the far future, the change increases by about +23% at 1 hr duration and by +15% at 24 hr.In blue positive changes (increasing return levels), in orange negative changes (decreasing return levels).Changes significant at the 5% level are indicated with a dark dot, and the fraction of significant cases in the area is reported in each panel.

Changes in the Distribution Parameters
At each grid point, the projected change in the estimated distribution parameters (scale, shape) and average yearly number of events are calculated as described at Section 3.2.We then estimate the mean change in the parameters at the grid points where the change in the return levels is found to be significant at the different durations.We show in Figure 8 the results for the case of significant changes in the 20-year return levels, but the figures obtained for other return levels are analogous (see Figure S3 in Supporting Information S1 for the 100 years return period).This analysis shows that the mean change on n is almost null, the mean change on the scale parameter could be both negative (near future) or positive (far future at 1-3 hr durations), and the mean change on the shape is negative across all the durations for both the time slices.A change in the scale implies a multiplicative change across the distribution intensity across all the occurrence probabilities.A decreasing shape parameter implies an increasing "heaviness" of the tail of the intensity distribution, that is, increasing probability of extremely high intensities.The increasing tail heaviness associated with the decreasing shape shown in Figure 8 explains the higher changes with higher return periods reported in Section 4.3.2.Tail "heaviness" is important in determining extreme return levels, and the projected decrease in the shape parameter dominates over the decrease in the scale parameter projected for the near future.In the particular cases of 1 and 3 hr durations in the far future, the combination of decreasing shape and increasing scale has a synergistic effect in increasing the return levels, thus leading to the highest mean change we find in our study: 54% and 49% at 1 and 3 hr for the 100 years return level (Figures 7b and 7d).In blue positive changes (increasing return levels), in orange negative changes (decreasing return levels).Changes significant at the 5% level are indicated with a dark dot, and the fraction of significant cases in the area is reported in each panel.

Bias and Future Changes
The results on the bias assessment at 1 and 24 hr duration are generally in line with the findings in Dallan et al. (2023), where they evaluated ERA-driven CPM simulation against the same rain gauges used in our work over the10-year-long period 2000-2009.In particular, the bias in Figures 3a and 3c for the 1h AM bias and 1h 20 years return level has a spatial pattern similar to the previous study, although with reduced magnitude.For the 24 hr AM (Figure 3b; no 24 hr return level bias was analyzed in Dallan et al., 2023) the CPM driven with the GCM appears wetter than the ERA-driven one, leading to a reduction of the dry bias in lowland and an increase of the wet bias in the mountainous area.As mentioned in Dallan et al. (2023) for the hourly duration, our findings suggest that the role of the orography should be considered in the CPM bias adjustment approaches, and this appears to be different at the different durations.
The obtained projected changes can be partially compared with those in Ban et al. (2020), where the future change for the 10 years return levels estimated with a GEV 3-largest approach, for 1 and 24 hr duration was analyzed over a larger domain.They provided just the averaged change over the domain, finding positive changes for both 1 and 24 hr durations.Our results allow us to discuss in more detail the spatial pattern and the significance of the change, even for rarer return levels.Indeed, summarizing the findings of Section 4.3.1, the analysis of the future changes in extreme precipitation indicates that according to the examined model in our study area an increase in extreme precipitation is expected mostly in the mountains.At the shorter durations (1-3 hr) the increase is concentrated in a south-west to north-east mountainous band, in the near future, and over the whole mountainous area in the far In blue positive changes (increasing return levels), in orange negative changes (decreasing return levels).Changes significant at the 5% level are indicated with a dark dot, and the fraction of significant cases in the area is reported in each panel.
future.At longer durations, the CPM projects a higher increase in the inner part of the mountainous region for both the future periods.In the lowlands no statistically significant change could be detected at the 5% level.By separating the points in three elevation classes (Figure S5 in Supporting Information S1, same classes as in Dallan et al., 2023) it clearly emerges how the change in the median increases from lowlands to high elevations, for both future periods and all durations.The relatively wide ranges of change, however, suggest that elevation alone is not sufficient to explain all the changes.The percentage of significant points in each class exhibits a clear increase  with increasing elevation class, with almost no significant changes observed in the lowlands and the highest percent in the high elevation class.This is more evident at 1-3 hr than at 24 hr duration.
Despite a visual similarity in the spatial patterns of bias and future change, no quantitative relation emerges between bias and change signal, being their correlation low and slightly negative for all durations (see Figure S6 in Supporting Information S1).
Thanks to the limited stochastic uncertainty of the SMEV return level estimates (see Section 4.1 and Figure 2), the statistical of the changes in the 100 years return levels could be determined (shown for three durations in Figure 6).Changes in these very rare quantities are qualitatively consistent with those found for the 20 years case, both in terms of spatial patterns and relative magnitudes.This suggests that the signal-to-noise ratio of the detected changes is similar for 20 and 100 years return levels.The proposed method thus represents a viable approach for estimating future changes in extreme precipitation from short CPM runs.
A generally higher proportion of significant changes are reported at the shorter durations (1-3 hr) with respect to daily durations, suggesting that changes in convective storms (related with short duration high-intensity precipitation) are expected to become more severe with climate change with respect to changes in large-scale storms.

Implications of the Projected Changes
The dependence on duration of the significant changes in extreme precipitation (Figure 7) is in line with the general tendency found in Ban et al. (2020).These authors found that the average increase for the 1 hr extreme precipitation is higher than for the 24 hr and 5 days durations, in both winter and summer seasons, and concluded that convective events are likely to become more significant with climate change.Our analysis confirms that, on average, the short duration extreme precipitation, mainly related to convection, is expected to increase more significantly than the longer duration extremes, especially in the far future.At 24 hr duration, the average significant change is slightly higher in the near future than in the far future period (Figures 7a and 7b).This can be the result of multiple factors: (a) the average significant changes in near and far future periods are calculated over different grid cells since those with significant changes do not coincide between the two time slices; (b) the change in the underlying ordinary-value probability distribution parameters (see Section 5.3) reveals some nuanced dynamics, showing that thermodynamic and dynamic controls are not acting in the same direction when near and far future periods are considered; (c) natural variability may partially obscure the climate-change signal, particularly in a 10-year simulation, as years with record-breaking events may cluster and be followed by several decades with no new rainfall records (Kendon et al., 2023).In order to attenuate these limitations on the use of decadal time slices to sample future precipitation changes, recourse to an ensemble of models is recommended (Kendon et al., 2023), although beyond the purpose of this paper.Ban et al. (2020) found that for summer hourly extremes the average future increase is slightly higher for higher return periods (rarer events).In this study, in which we could isolate the statistical significance of the signals, the projected significant changes show an evident dependence on the return period at all durations.This is of particular interest for risk management and engineers dealing with the design of infrastructures.The results from the examined model suggest that in our study area the largest increase in extreme precipitation is expected for short-duration long-return-period events, with vast implications on the intensity-duration-frequency curves used for hazard assessment (e.g., Martel et al., 2021).
The dependence of the significant changes on duration and return period does not appear to be related to elevation (Figure S7 in Supporting Information S1).For medium and high elevation classes, the average change in the 20 years return level is almost constant with durations in the near future, while it decreases with duration in the far future (Figures S7a and S7b in Supporting Information S1).All elevation classes and durations show relative change increasing with increasing return time (Figures S7c and S7d in Supporting Information S1).

Physical Interpretation of the Projected Changes
The non-asymptotic structure of the SMEV model allows us to examine the changes in the distribution parameters and the number of events underlying the reported changes in extreme return levels (Figure 8).This opens the way to a physical interpretation of the results: the distribution of the ordinary events (and hence its scale and shape parameters) can be related with the local-scale dynamics and thermodynamics of the atmosphere and to the differences in large-scale dynamics associated with atmospheric motion.For example, the Clausius-Clapeyron Water Resources Research 10.1029/2023WR035969 relation quantifies the atmospheric water vapor holding capacity as a function of temperature and, when it comes to extreme precipitation, contains most of the information about the atmospheric thermodynamics.Under extreme precipitation the atmosphere is fully saturated and it is often assumed that extreme precipitation should increase with temperature at the same rate (i.e., about 7% °C 1 ).In such conditions, the scale parameter of the intensity distribution should change with temperature according to the above relation and the other parameters should remain unchanged.
Despite this, the projected increase in extreme precipitation in the examined domain for the near future seems to be explained just by a decrease in the shape parameter.Since a general increase in temperature is expected in the study area in the future (e.g., Kotlarski et al., 2023), this suggests that changes in thermodynamics are not sufficient to explain what we observe, and that atmospheric dynamics plays a dominant role in explaining the projected changes.This is consistent with past changes observed in the same region in Dallan et al. (2022), that associated the past changes to an increased proportion of convective storms in the summer season.The results in Figure 8a suggest that similar changes are to be expected also in the near future.
A different picture is provided for the far future (Figure 8b), where we see a dramatic increase in the scale parameter in addition to similar changes in the shape parameter and in the number of yearly storms.Due to the further increase in temperature toward the end of the century thermodynamic effects start to be clearly recognizable.Moreover, the dependence of the change in the scale parameter with duration shows a larger increase at the short durations, as expected from the thermodynamic effects related to the Clausius-Clapeyron relation (the assumption of full saturation is better met at short durations).
The ability of non-asymptotic methods, such as the SMEV proposed here, to separate the intensity and the occurrence frequency of storms could be further exploited in future studies, by including the analysis of temperature changes and the scaling with temperature of the extreme rainfall.This could provide viable ways to investigate the link between the change in the atmospheric dynamics and the change in the statistical characteristics of extreme rainfall.Moreover, in analysis based on model ensembles, SMEV could be beneficial in the understanding of the (possible) different results among different ensemble members, considering that the precipitation responses depend on several mechanisms and are not explained by just the change in temperature (Fosser et al., 2020).

Conclusions
In this work, we propose the use of non-asymptotic statistical methods to reduce the stochastic uncertainties related to the use of a short time period (10 years for the CPM simulations) and to analyze and attribute future changes in extreme precipitation.We exploit the ability of a high-resolution convection-permitting climate model (COSMO-crCLIM at 2.2 km resolution) in representing extreme short duration precipitation for estimating future changes in sub-daily rare return levels in a complex orography region in the eastern Italian Alps.We use a recent non-asymptotic statistical method, SMEV.We compare the uncertainty on the estimates from SMEV to the ones of a modified GEV approach (GEVr) recently used in Ban et al. (2020) and we take advantage of the reduced estimation uncertainty of SMEV to quantify the statistical significance of projected changes in return levels as high as 100 years events.Further we exploit its non-asymptotic formulation to attribute the observed changes to variations in intensity and occurrence frequency of the storms, and to suggest a physical interpretation of the underlying changes.
are likely related to changes in local and large-scale dynamics, while in the far future thermodynamics (linked to temperature) also plays a major role.
Our results demonstrate the reliability of the proposed method to investigate projected changes in sub-daily precipitation high return levels from short CPM simulations.The potential of non-asymptotic methods should be soon applied to a CPM ensemble to estimate the future changes in precipitation extremes accounting for models' uncertainty and to assess and attribute possible inter-model differences.The use of non-asymptotic methods contributes to establishing a clear relation between the changing physical processes and the changing statistics of extreme precipitation.
run under the Representative Concentration Pathways version 8.5 (RCP8.5)green-house gas scenario.For this study, three 10year long CPM simulations are performed over the extended Alpine domain defined under the CORDEX Flagship Pilot Study on Convective Phenomena over Europe and the Mediterranean (FPS-Convection; Coppola

Figure 1 .
Figure 1.Topography of the study area and location of rain gauges.

Figure 2 .
Figure 2. Comparison of the uncertainty for GEVr and Simplified Metastatistical Extreme Value methods, expressed as the CI 90 obtained with the bootstrap procedure, at varying durations (x-axis) for 20 and 100 years return time (upper and bottom rows), for the three time slices (columns).Each bar represents the variability of the CI 90 in the area: the dot symbol is the median, the thick segment is the interquartile range, the thin segment is the 5th-95th range.

Figure 3 .
Figure 3. Convection-permitting model (CPM) bias with respect to observations, at the station-collocated points, for 1 hr (a), (c) and 24 hr (b), (d), for mean Annual Maxima (AM, top row, panels (a), (b)) and 20 years return level (bottom row, panels (c), (d)).In purple the positive bias (CPM overestimation), in orange the negative bias (CPM underestimation).Biases significant at the 5% level are indicated with a black circle, and the fraction of significant biases in the area is reported in each panel.

Figure 4 .
Figure 4. Near future change for the 20 years return level, estimated at each grid point, for durations from 1 to 24 hr (panels (a) to (e)).In blue positive changes (increasing return levels), in orange negative changes (decreasing return levels).Changes significant at the 5% level are indicated with a dark dot, and the fraction of significant cases in the area is reported in each panel.

Figure 5 .
Figure5.Far future change for the 20 years return level, estimated at each grid point, for durations from 1 to 24 hr (panels (a) to (e)).In blue positive changes (increasing return levels), in orange negative changes (decreasing return levels).Changes significant at the 5% level are indicated with a dark dot, and the fraction of significant cases in the area is reported in each panel.

Figure 6 .
Figure 6.Near (panels (a), (b), (c)) and far (panels (d), (e), (f)) future change for the 100 years return level, estimated with Simplified Metastatistical Extreme Value at each grid point, for durations 1, 3, 24 hr.In blue positive changes (increasing return levels), in orange negative changes (decreasing return levels).Changes significant at the 5% level are indicated with a dark dot, and the fraction of significant cases in the area is reported in each panel.

Figure 7 .
Figure 7. Dependency of the return level future changes on time scale (durations) and probability level (return period).(a) Near future change versus.duration, for 4 return periods; (b) Far future change versus.duration, for 4 return periods; (c) Near future change versus.return period, for 5 durations; (d) Far future change versus.return period, for 5 durations.

Figure 8 .
Figure 8. Mean future changes on the distribution parameters at the different durations, for (a) near future and (b) far future change.The mean change in parameters is calculated considering the grid points with the change in the 20 years return levels is found significant.
using the test introduced by Marra et al. (2020) and later refined by Marra et al. (2023).The test assesses whether the null hypothesis of having Weibull tails can be rejected based on the available data.A detailed description of the test can be found in Marra, Levizzani, and Cattani (2022) and Marra et al. (2023), and the related codes are made available in Marra (