Estimation of Intensity‐Duration‐Area‐Frequency Relationships Based on the Full Range of Non‐Zero Precipitation From Radar‐Reanalysis Data

Intensity‐Duration‐Area‐Frequency (IDAF) models provide the mathematical link between precipitation intensities (I), durations (D), areas (A), and frequency of occurrence (F). They play a critical role in hydrological design, areal rainfall hazard quantification, storm characterization, and early warning system development. IDAF models extend the conventional Intensity‐Duration‐Frequency models by accounting for the spatial extent of precipitation (i.e., the area). In this study, we develop IDAF models using the entire non‐zero precipitation intensities, not only the extremes. We use the extended generalized Pareto distribution (EGPD) to model the precipitation intensities. To build the IDAF models, we adopt a data‐driven approach that allows the linkage of EGPD parameters with duration and area, based on empirically determined parametric relationships. The inference of model parameters is done using a global maximum likelihood estimation, and uncertainties are assessed by the bootstrap method. The study area is Switzerland, a topographically complex region of 42,000 km2 with regional precipitation variability and clear seasonality. The study utilizes 17 years of data from CombiPrecip, a radar‐reanalysis product developed by geostatistically merging radar and rain gauge data in an operational setting. We build the IDAF models for the spatiotemporal range of 1–72 hr and 1 to 1,089 km2 at each pixel in the study area. To the best of our knowledge, our study is the first attempt to use the EGPD in IDAF curve modeling. It discusses the use and limitations of CombiPrecip in extreme value analysis and highlights the challenges of modeling areal precipitation in a complex topographical environment.

duration and return period. Applications of the ARF-based IDAF models can be found in the literature. For example, De Michele et al. (2001) derived an ARF formulation based on the concept of dynamic scaling of rainfall and used it to model IDAF curves in Milan. Later, Ceresetti et al. (2012) used the ARF of De Michele et al. (2001) to model IDAF curves for storm severity assessment in southern France. Panthou et al. (2014) also used the same ARF formulation to characterize areal rainfall in West Africa. Ramos et al. (2005) used an empirical ARF formulation to model IDAF curves for storm severity assessment in Marseille. Bertini et al. (2020) used another empirical ARF formulation to build IDAF curves for the design of a dam in Italy. Mélèse et al. (2019) and Blanchet and Mélèse (2020) used an extension of the ARF formulation of De Michele et al. (2001) to build IDAF curves, for areal hazard and storm severity assessment respectively, in southern France. The extension was designed to cope with the significant spatiotemporal variability in the mountainous area.
Beyond the ARF-based IDAF curve modeling approach, Overeem et al. (2010) proposed a purely data-driven approach to model IDAF curves. This involves modeling the parameters of the statistical distribution of the precipitation intensities as a function of duration and area. The type of relationship is empirically determined from the data, with no underlying physical hypothesis such as spatial correlation (as in Rodriguez-Iturbe & Mejía, 1974) or scaling (as in De Michele et al., 2001). As highlighted by Mélèse et al. (2019), this approach has the advantage of being flexible and applicable in cases where the assumptions of the analytical ARF formulations cannot be verified.
Regardless of the chosen method of building the IDAF curves, whether ARF-based or purely data-driven, the previously cited works have one thing in common: the precipitation data they used and, by extension, the underlying parametric distribution. To elaborate, all the authors used only extreme data in the form of block maxima and, as the distribution, the generalized extreme value (GEV) distribution (Ceresetti et al., 2012; Overeem et al., 2010; Panthou et al., 2014), its special case, the Gumbel distribution (Bertini et al., 2020; Blanchet & Mélèse, 2020; Mélèse et al., 2019; Nhat et al., 2007), or the log-normal distribution (De Michele et al., 2011). A rare application of the GPD for threshold excesses is found in Zhao et al. (2023) for IDAF curve modeling. A major drawback of such approaches is the inefficient use of the data, since only one value is retained in a block (usually the maximum in a year) or only the excesses of a threshold (a tiny fraction of the data), and all the other data are discarded. This can result in significant uncertainty in estimation, especially in cases where the data series is not sufficiently long. The problem of short record length is more apparent with radar and radar-reanalysis products, which are usually used in IDAF curve modeling (Blanchet & Mélèse, 2020; Mélèse et al., 2019; Overeem et al., 2010; Zhao et al., 2023) because of the spatial information they provide.
To address this issue, our approach here is to make efficient use of information by including all the non-zero precipitation intensities, instead of only the block maxima, in modeling the IDAF curves. We then use the extended generalized Pareto distribution (EGPD) of Naveau et al. (2016) as the parametric model for the intensities. This distribution is compliant with extreme value theory in both tails (an advantage over the gamma distribution), models the entire distribution of non-zero precipitation, and does not require the choice of a threshold as in the GPD. It has been shown in many applications to adequately model precipitation (Evin et al., 2018; Haruna et al., 2022, 2023; Le Gall et al., 2022; Naveau et al., 2016). In particular, Haruna et al. (2023) showed that it is possible to model IDF curves (without area) with the EGPD, and we intend to extend their work to model IDAF curves with the EGPD. To our knowledge, this is the first time the EGPD has been used in modeling IDAF curves.
Modeling IDAF curves using all the non-zero data has two potential advantages. First, by using all the non-zero data, estimation uncertainty is expected to reduce, resulting in more accurate predictions. Second, in addition to having IDAF curves, we will have robust marginal distributions for the entire non-zero precipitation that can be used in stochastic weather generators for simulations, or for the verification of weather and climate models.
We apply the model in Switzerland, a topographically complex location with seasonality, regional variability, and multiple precipitation regimes. Following the work of Mélèse et al. (2019), which underscores the complex spatiotemporal variability of precipitation in mountainous areas, we use the more flexible data-driven method of Overeem et al. (2010) to model the IDAF curves.
The data and study area are presented in Section 2. The EGPD, the methodology for building the IDAF curves, and the method for uncertainty assessment are explained in Section 3. Results on the goodness of fit of the

Study Area
Our study focuses on Switzerland, a country covering 41,285 km². Despite its small size, Switzerland exhibits a complex topography, ranging in elevation from 191 to 4,127 m above mean sea level. Figure 1 shows the map of the study area. Approximately 30% of the land is situated above 1,500 m elevation, resulting in pronounced spatial variability in both the intensity and occurrence of precipitation. The climate of Switzerland is influenced by multiple factors, such as the Alps, the Atlantic Ocean, and the Mediterranean Sea, and these contribute to the seasonal and spatial variability of precipitation, as documented in previous studies (Giannakaki & Martius, 2015; Scherrer et al., 2016; Sodemann & Zubler, 2009). Precipitation patterns show distinct regional differences, with the highest annual sums exceeding 2,000 mm in the Alps, the Jura region (northwest), and the Ticino region (south of the Alps). Conversely, the inner valleys such as the Rhône and Inn receive the lowest annual precipitation, less than 700 mm. Summer is the primary season for precipitation throughout Switzerland, except in Ticino, where autumn dominates. Winter experiences the least amount of precipitation across all regions. In terms of heavy precipitation, defined as the average seasonal maxima, the spatial distribution varies according to accumulation duration (Panziera et al., 2018). For short-duration accumulations (e.g., 1 hr), the heaviest precipitation occurs in summer across the entire country, with maximum intensities reaching up to 30 mm/hr in Ticino, Jura, and the northern rim. For longer accumulations (1 day and more), Ticino receives the most intense precipitation, with autumn experiencing maximum 24 hr totals exceeding 130 mm. In other regions, heavy precipitation predominantly occurs during summer.

CombiPrecip
CombiPrecip (CPC) is a radar-reanalysis product resulting from the geostatistical merging of radar and rain gauge data in an operational setting (Sideris et al., 2014a). It combines the high accuracy of rain gauges with the high spatial coverage of radar. The geostatistical merging is done through co-kriging with external drift, where the rain gauge data is treated as the primary source and the radar data as the external drift. Information from the rain gauges comes from more than 250 automatic stations at 10 min resolution, and that from the radar comes from five polarimetric C-band Doppler radars that are suitably located to provide the reliable coverage required in the topographically complex area (see Figure 1). Since CPC is produced operationally, only rain gauge data within Switzerland are used in the algorithm. As a result, an extrapolation treatment is applied in which some radar pixels outside the Swiss border are used as virtual rain gauges in the merging. Additionally, a convection control scheme is implemented to overcome the limited representativeness of rain gauges during convective events, especially in summer (see Sideris et al., 2014b, for details).
The data from both the rain gauges and the radar are subjected to substantial quality control before being employed in the CPC algorithm. The gauge data is checked to ensure that recorded values are within climatologically physical limits, are consistent with those from nearby gauges, and satisfy inter-parameter consistency and variability tests (MeteoSwiss, 2017). Treatment of the radar data (Germann et al., 2006) involves clutter elimination through a robust algorithm designed for this purpose, visibility correction for orographic shielding, correction for the vertical profile of reflectivity, and bias correction. This is in addition to an automatic hardware calibration of the radars to check the stability and accuracy of the components, and a tailored operational scan strategy (20 elevation sweeps every five minutes) crucial in mountainous regions such as Switzerland (Germann et al., 2015). CPC is available at hourly temporal resolution on a spatial grid of 1 by 1 km and extends 100-150 km beyond the borders of Switzerland. It has been available since 2005, and 17 years of data are available for this study, from 1 January 2005 to 31 December 2021. It has been used in several applications in Switzerland for extreme value analysis (Panziera et al., 2016), climatological studies (Panziera et al., 2018), and meteorological forcing of hydrological models (Andres et al., 2016), and it has been evaluated in several aspects (Gabella et al., 2017; Gugerli et al., 2020; Panziera et al., 2018). Known limitations of CPC include the limited length of the data, non-homogeneity of the series due to radar upgrades and the evolving number of radars, and conditional bias (MeteoSwiss, 2017). Despite these limitations, it is the only sub-daily gridded data available in the study area, and producing a gridded product is beyond the scope of the present study. We note that these limitations are not unique to CPC, but common to other radar and radar-reanalysis products, which have nonetheless been used in IDAF modeling (e.g., Overeem et al., 2010; Mélèse et al., 2019; Blanchet & Mélèse, 2020) and extreme value analysis (Allen & DeGaetano, 2005; Durrans et al., 2002; Goudenhoofdt et al., 2017; Panziera et al., 2018; Wright et al., 2014). This is due to the detailed spatial representativeness they provide, especially in mountainous areas, which is practically unobtainable with rain gauge networks alone.

Marginal Distribution for Non-Zero Precipitation Intensities
We use the three-parameter EGPD of Naveau et al. (2016) as the marginal distribution for the non-zero rainfall intensities in the IDAF model. The model is an extension of the classical GPD (which applies only to the excesses of a chosen threshold) to model the entire distribution of precipitation intensities (the low, medium, and extreme values). The first advantage of the EGPD is that, being an extension of the GPD, it is compliant with extreme value theory: it behaves like the GPD in the upper tail of the distribution, that is, it has the same shape parameter (see Tencaliec et al., 2020, for a demonstration). Second, since it makes use of all the non-zero precipitation data, one does not need to worry about the delicate issue of threshold selection that is known with the GPD. Finally, it models the whole range of non-zero precipitation, which has several practical applications in cases where the interest is not only in the largest values but in the medium and low values as well (e.g., in simulation frameworks or climatological studies).
We define the random variable I to represent non-zero rainfall intensities. We assume that it follows the EGPD, whose cumulative distribution function (CDF) is defined as:

F(i) = \left[1 - \left(1 + \frac{\xi i}{\sigma}\right)_+^{-1/\xi}\right]^{\kappa}, \quad i > 0, \tag{1}

and the probability density function (PDF) is given as:

f(i) = \frac{\kappa}{\sigma} \left(1 + \frac{\xi i}{\sigma}\right)_+^{-1/\xi - 1} \left[1 - \left(1 + \frac{\xi i}{\sigma}\right)_+^{-1/\xi}\right]^{\kappa - 1}, \tag{3}

where a_+ = max(a, 0), σ > 0 is the scale parameter, and ξ ≥ 0 is the shape parameter that controls the upper tail of the distribution. The flexibility parameter κ > 0 controls the lower tail. With the addition of only one parameter, κ, compared to the GPD, the distribution is able to model the full range of non-zero precipitation (see applications in Evin et al., 2018; Le Gall et al., 2022; Haruna et al., 2022, 2023).
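The two formulas above translate directly into code. A minimal numerical sketch (our own, not from the original study), with the exponential limit handled at ξ = 0:

```python
import numpy as np

def egpd_cdf(i, kappa, sigma, xi):
    """CDF of the three-parameter EGPD of Naveau et al. (2016):
    F(i) = [1 - (1 + xi*i/sigma)_+^(-1/xi)]^kappa for i > 0."""
    i = np.asarray(i, dtype=float)
    if xi > 0:
        h = 1.0 - np.maximum(1.0 + xi * i / sigma, 0.0) ** (-1.0 / xi)
    else:  # xi = 0 limit: the GPD part reduces to the exponential CDF
        h = 1.0 - np.exp(-i / sigma)
    return h ** kappa

def egpd_pdf(i, kappa, sigma, xi):
    """PDF obtained by differentiating the CDF above (chain rule)."""
    i = np.asarray(i, dtype=float)
    if xi > 0:
        t = np.maximum(1.0 + xi * i / sigma, 0.0)
        h = 1.0 - t ** (-1.0 / xi)                     # GPD CDF
        dh = (1.0 / sigma) * t ** (-1.0 / xi - 1.0)    # GPD density
    else:
        h = 1.0 - np.exp(-i / sigma)
        dh = np.exp(-i / sigma) / sigma
    return kappa * h ** (kappa - 1.0) * dh
```

Setting kappa = 1 recovers the classical GPD, which is one way to sanity-check an implementation.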

Space-Time Aggregation of the Data
The total area of Switzerland is 41,285 km², and so we have hourly time series of precipitation at 41,285 CPC pixels, each of size 1 km². We take each time series and stratify it into four seasons: winter (December-January-February), spring (March-April-May), summer (June-July-August), and autumn (September-October-November). This seasonal approach accounts for the pronounced seasonality in the study area, as done in several studies in the same region (Evin et al., 2018; Fukutome et al., 2015; Haruna et al., 2022, 2023; Molnar & Burlando, 2008; Panziera et al., 2018).
To produce the areal precipitation for use in modeling the IDAF relationships, we aggregate the data into nine additional spatial scales (areas): 9, 25, 49, 81, 169, 289, 529, 729, and 1,089 km². These areas correspond to squares of sides 3, 5, 7, 9, 13, 17, 23, 27, and 33 km, which are illustrated in Figure 1. This leads to a total of 10 series of areal precipitation, with areas ranging from 1 to 1,089 km², centered around each pixel in the study domain. Since CPC is available beyond the borders, it allows us to have spatially aggregated rainfall everywhere in Switzerland (including the pixels close to the border). We comment here that the choice of square areas is for simplicity and convenience, since the CPC data is originally in this geometry. Other choices are possible, such as circular or elliptical shapes, as discussed in Mélèse et al. (2019).
Next, to build the time series for higher accumulation durations, we use a moving window to aggregate the hourly areal precipitation series into nine additional durations: 2, 3, 6, 10, 12, 16, 24, 48, and 72 hr. We consider durations up to 72 hr (3 days) because, according to Froidevaux et al. (2015), who studied catchments larger than 10 km², these time scales are the most relevant for flood-triggering precipitation accumulations in Switzerland. The intermediate durations are meant to ensure a good spread on a logarithmic scale. We also apply temporal declustering to reduce serial dependence in the time series, as done in several studies (e.g., Haruna et al., 2022, 2023; Le Gall et al., 2022; Naveau et al., 2016). To achieve this, we retain every 3rd observation in the 1 hr time series, and every 4th, 5th, 8th, 10th, 12th, 16th, 24th, 48th, and 72nd observation, respectively, in the time series of 2, 3, 6, 10, 12, 16, 24, 48, and 72 hr. A ratio plot (not shown) of the maximum intensity with and without declustering over all pixels revealed a median value between 0.8 and 0.95, without any systematic evolution with duration or season. This indicates that in certain cases, the highest intensities in each duration were left out as a result of the declustering process. However, given that we are using all the non-zero precipitation intensities, retaining the full aggregated time series from the moving window would result in significant serial dependence. A plot of the lag-1 autocorrelation after declustering (figure not shown) showed a large reduction in the autocorrelation, especially in summer and the transition seasons, whereas it remains relatively high (median of 0.44) in winter, especially for the 1 hr series. Nonetheless, we retain the declustering steps despite the potential of omitting the highest intensities.
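As an illustration of the temporal part of this procedure, the moving-window accumulation followed by declustering can be sketched as follows (our own simplification: missing values and the spatial aggregation step are ignored, and the function name is hypothetical):

```python
import numpy as np

def aggregate_and_decluster(hourly, duration, keep_every):
    """Moving-window sum of an hourly series over `duration` hours,
    then retain every `keep_every`-th value to reduce serial dependence
    (e.g., every 3rd value for the 1-hr series, as in the text)."""
    hourly = np.asarray(hourly, dtype=float)
    # moving-window accumulation over all fully-covered windows
    window = np.convolve(hourly, np.ones(duration), mode="valid")
    # temporal declustering by subsampling
    return window[::keep_every]
```

For example, a constant series of 1 mm/hr aggregated over 3 hr yields windows of 3 mm, of which every `keep_every`-th value is kept.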
At the end of the aggregation, we have a total of 100 time series of areal precipitation at each pixel, one for each pair (D, A). Unlike the case where only block maxima are used for modeling the IDAF relationships, here we retain and use all the non-zero precipitation intensities. Although we have the areal precipitation at all the pixels, for computational reasons (an average of 260,000 non-zero observations in summer at each pixel location), we fit the IDAF model only at a subset of the pixels, by considering every second and third pixel along the latitude and longitude, respectively. This results in a total of 7,056 pixels.

EGPD-IDAF Model
Our assumption is that the random variable of non-zero precipitation intensities for any duration D and area A, I(D, A), follows the EGPD, that is:

I(D, A) \sim \mathrm{EGPD}\big(\kappa(D, A), \sigma(D, A), \xi(D, A)\big),

where κ(D, A) > 0, σ(D, A) > 0, and ξ(D, A) ≥ 0 are the three EGPD parameters for the duration D and area A.
Let F_{D,A}(i) be the CDF of I(D, A), such that F_{D,A}(i) = \mathbb{P}(I(D, A) \le i). Then the IDAF curve, which gives the T-year return level over duration D and area A, is defined by the quantile function of F_{D,A}, that is:

i_T(D, A) = F_{D,A}^{-1}\left(1 - \frac{1}{\delta T}\right),

where δ is the average number of non-zero observations per year for (D, A). As already highlighted in Section 1, we use the data-driven approach of Overeem et al. (2010) to model the IDAF relationships. The approach involves empirically finding the appropriate regression model to explain the relationship between each of the three EGPD parameters and the duration and area. We will now explain our methodology to determine the appropriate regression model.
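Since the EGPD CDF inverts in closed form, the return level can be computed directly. A sketch under our assumption that p = 1 − 1/(δT) is the non-exceedance probability of a T-year event, with δ the mean number of non-zero observations per year:

```python
import numpy as np

def egpd_quantile(p, kappa, sigma, xi):
    """Inverse of the EGPD CDF:
    F^{-1}(p) = (sigma/xi) * ((1 - p**(1/kappa))**(-xi) - 1)."""
    p = np.asarray(p, dtype=float)
    if xi > 0:
        return sigma / xi * ((1.0 - p ** (1.0 / kappa)) ** (-xi) - 1.0)
    return -sigma * np.log(1.0 - p ** (1.0 / kappa))  # xi = 0 limit

def return_level(T, delta, kappa, sigma, xi):
    """T-year return level when the EGPD is fitted to all non-zero
    intensities: invert the CDF at p = 1 - 1/(delta*T)."""
    return egpd_quantile(1.0 - 1.0 / (delta * T), kappa, sigma, xi)
```

The quantile function follows by solving F(i) = p from Equation 1; the round trip F(F^{-1}(p)) = p is a convenient correctness check.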
We begin by considering each pixel and fitting the EGPD separately to the 100 aggregated time series of scales (D, A) at that pixel location. We then examine how the three EGPD parameters change with A and D. To model the relationships, we test various regression models using A, D, their transformations log(A), log(D), √A, and √D, as well as some interaction terms. To avoid having a different regression model at each pixel, we compare competing models regionally by assessing their predictive performance in cross-validation. In the end, we retain regression models of the following form for the EGPD parameters:

\log \kappa(D, A) = \beta_{0,\kappa} + \sum_{i=1}^{6} \beta_{i,\kappa}\, x_i(D, A), \quad \log \sigma(D, A) = \beta_{0,\sigma} + \sum_{i=1}^{6} \beta_{i,\sigma}\, x_i(D, A), \quad \xi(D, A) = \beta_{0,\xi} + \sum_{i=1}^{5} \beta_{i,\xi}\, x_i(D, A),

where the covariates x_i(D, A) are the retained transformations and interactions of D (in hours) and A (in km²), and β_{i,*} are the regression coefficients. The scale (σ) and flexibility (κ) parameters both have a log link because of their positive support, and both have seven regression coefficients (β_i for i = 0, 1, …, 6). The shape parameter ξ has six coefficients, making a total of 20 parameters for the complete EGPD-IDAF model for each season and pixel location. We note here that while the number of parameters might appear large, the model is still parsimonious compared to fitting the EGPD separately to each time series of (D, A), which amounts to a total of 300 parameters (three EGPD parameters for each of the 100 time series in our case). In the results section, we will show additional performance comparisons between the 20-parameter EGPD-IDAF model and the 300-parameter base model. In addition, the relative complexity of the model (in terms of parameterization) highlights the inherent difficulty of modeling areal precipitation in mountainous regions, where areal rainfall is less homogeneous in space compared to relatively flat regions. A similar attempt to model IDAF curves in southern France (Massif Central) by Mélèse et al. (2019) encountered similar complexity.
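To make the link-function idea concrete, the sketch below maps (D, A) to EGPD parameters through log links for κ and σ and an identity link for ξ. The covariate set shown is a hypothetical illustration drawn from the candidates tested above, not the actual retained regression equations:

```python
import numpy as np

def egpd_params_from_links(beta_kappa, beta_sigma, beta_xi, D, A):
    """Map duration D (hours) and area A (km^2) to EGPD parameters via
    regression links. The covariate set here (intercept, log D, log A,
    sqrt(D), sqrt(A), and two interaction terms) is illustrative only.
    kappa and sigma use a log link to guarantee positivity."""
    x = np.array([1.0, np.log(D), np.log(A), np.sqrt(D), np.sqrt(A),
                  np.log(D) * np.log(A), np.sqrt(D * A)])
    kappa = np.exp(np.dot(beta_kappa, x))     # 7 coefficients, log link
    sigma = np.exp(np.dot(beta_sigma, x))     # 7 coefficients, log link
    xi = max(np.dot(beta_xi, x[:6]), 0.0)     # 6 coefficients, identity link
    return kappa, sigma, xi
```

With all coefficients at zero, the log links give κ = σ = 1 and ξ = 0, which is a simple check of the link construction.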
To conclude this section, we illustrate a conceptual plot of IDAF curves in Figure 2. A plot of IDAF curves is three-dimensional (Figure 2a), with intensity (I) along the vertical axis, duration (D) along the horizontal axis, and area (A) along the third axis, which is perpendicular to the other two. For each specific return period (e.g., 2-year, 10-year, or 50-year), a curve is shown to visualize how the intensity changes across A and D. However, a much simpler approach is to decouple the three-dimensional plot into two two-dimensional sub-plots. The first shows how the intensities of specific return periods change across durations for a fixed area, that is, IDF curves (Figure 2b), and the second, a plot of Intensity-Area-Frequency (IAF) curves, shows how the intensities change across areas for a fixed duration (Figure 2c).

Model Estimation
Let us call θ the vector of 20 regression parameters of the EGPD-IDAF model to be estimated at a given pixel location. We estimate θ by maximizing the censored log-likelihood of the EGPD-IDAF model, l(θ), which is given by:

l(\theta) = \sum_{(D,A)} \sum_{j} \left[ \mathbb{1}\!\left\{ i_{(D,A,j)} > C_{(D,A)} \right\} \log f_{D,A}\!\left(i_{(D,A,j)}\right) + \mathbb{1}\!\left\{ i_{(D,A,j)} \le C_{(D,A)} \right\} \log F_{D,A}\!\left(C_{(D,A)}\right) \right], \tag{7}

where F_{D,A} and f_{D,A} are the CDF and PDF of the EGPD associated with (D, A), i_{(D,A,j)} is the precipitation intensity for (D, A) at time step j, and C_{(D,A)} ≥ 0 is the left-censoring threshold applied to the data of (D, A). Expanding F_{D,A} and f_{D,A} in terms of the EGPD parameters κ(D, A) > 0, σ(D, A) > 0, and ξ(D, A) ≥ 0, which are themselves functions of the regression parameters θ, yields the full expression of the log-likelihood (Equation 8); the other variables retain their earlier definitions.
The use of the censored likelihood (Equation 8) is mainly to improve the parameter estimation by reducing the influence of the small intensities (Naveau et al., 2016). Without censoring, the smaller intensities influence the parameter estimation, thereby resulting in a gross overestimation of the upper tail shape parameter (ξ). In the equation, both the data above and below the censoring threshold C contribute to the likelihood, albeit in two different ways. The data above C are considered fully observed, and so the density function f (Equation 3) is applied to them. For the data below C, it is assumed that their precise magnitude is not known, although they have been observed; all that is known is that they are less than C, and so the distribution function F (Equation 1) is applied. The need for the censored likelihood is likely due to the insufficient flexibility of the three-parameter EGPD model to adequately model the left tail of the distribution, or to the uncertainty in the instrumental recording of very small intensities. A usual censoring approach is to apply a uniform censoring threshold (e.g., 2 mm for all the daily data, or 0.5 mm for all the hourly intensities), but as highlighted by Haruna et al. (2023), this is not usually sufficient, and so an appropriate censoring threshold has to be obtained for each time series. We follow their approach and estimate, for each time series of (D, A), a threshold that minimizes the squared error between the modeled and empirical quantiles (see Equation 10). This approach usually results in an adequate fit of the model. We comment that the censoring only applies during inference; afterward, the model is applied to all the non-zero precipitation intensities, even the data below the threshold. Additionally, all the goodness-of-fit criteria are computed on the whole non-zero precipitation, not only the data above the censoring threshold.
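For a single (D, A) series with fixed EGPD parameters, one censored-likelihood term of Equation 7 can be sketched as follows (our own simplification; the full model evaluates this for all 100 series, with the parameters given by the regression links):

```python
import numpy as np

def egpd_censored_nll(params, intensities, C):
    """Negative censored log-likelihood of the EGPD for one (D, A) series:
    log f for observations above the censoring threshold C, and
    log F(C) for observations at or below it."""
    kappa, sigma, xi = params
    if kappa <= 0 or sigma <= 0 or xi < 0:
        return np.inf  # outside the parameter space
    x = np.asarray(intensities, dtype=float)
    above = x[x > C]
    n_below = int(np.sum(x <= C))
    t = 1.0 + xi * above / sigma
    if np.any(t <= 0):
        return np.inf
    if xi > 0:
        H = 1.0 - t ** (-1.0 / xi)                       # GPD CDF
        log_h = -np.log(sigma) + (-1.0 / xi - 1.0) * np.log(t)
        F_C = (1.0 - (1.0 + xi * C / sigma) ** (-1.0 / xi)) ** kappa
    else:  # exponential limit at xi = 0
        H = 1.0 - np.exp(-above / sigma)
        log_h = -np.log(sigma) - above / sigma
        F_C = (1.0 - np.exp(-C / sigma)) ** kappa
    # log-density of the EGPD: log(kappa) + (kappa-1)*log H + log h
    ll = np.sum(np.log(kappa) + (kappa - 1.0) * np.log(H) + log_h)
    if n_below > 0:
        ll += n_below * np.log(F_C)
    return -ll
```

In practice, this function (summed over all (D, A) pairs with the link-based parameters) would be passed to a numerical optimizer such as `scipy.optimize.minimize`.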
Furthermore, Equation 7 is based on the independence likelihood, which assumes independence in the data. This assumption is unlikely to hold, given that we have three levels of dependence in the data: serial dependence within the time series of the same (D, A); dependence between time series of different durations (e.g., the 1 hr, 1 km² series vs. the 2 hr, 1 km² series); and dependence between time series of different spatial scales (e.g., the 1 hr, 1 km² series vs. the 1 hr, 9 km² series). Despite this, since our target is the marginal (univariate) return levels, the violation of the independence assumption is unlikely to induce bias in our estimates (Sebille et al., 2017). Additionally, within the framework of modeling IDF curves using the GEV distribution, Jurado et al. (2020) showed that little gain in performance is achieved by explicitly modeling the dependence between the data of different durations, in addition to the added complexity. Since their application is with the GEV rather than the EGPD, an interesting perspective would be to investigate this effect with the EGPD. Here we retain the independence assumption to avoid adding complexity to a model that already has 20 free parameters. Finally, to avoid underestimating uncertainties, which is one of the main consequences of the independence assumption, we resort to block bootstrapping for uncertainty assessment (see Section 3.5).

Uncertainty Assessment
In order to assess uncertainty in the EGPD-IDAF model, we use the block bootstrap approach (Kunsch, 1989).
The principle of the block bootstrap involves dividing the time series into blocks of consecutive observations. Resamples are then generated by randomly selecting blocks with replacement and concatenating them to create a bootstrap sample. By preserving the block structure, the block bootstrap can capture the dependence structure of the original data. This approach is suitable for uncertainty estimation in our case, where we made the independence assumption in the likelihood estimation of the parameters. The block bootstrap method was used for uncertainty estimation by Overeem et al. (2010) in IDAF curve modeling, and by Overeem et al. (2009) and Haruna et al. (2023) in IDF curve modeling.
To apply the block bootstrap approach, we take the seasonal time series at each pixel and estimate the uncertainty by following the steps below:
1. Aggregate the time series into the 10 durations and 10 areas, resulting in a total of 100 time series, each for a pair of duration and area (D, A). Decluster each of the series according to the declustering procedure explained in Section 3.4. We call this sample M_orig.
2. Randomly select blocks of size 2 weeks with replacement, G times, to form the resampled time series (M_boot). Both M_orig and M_boot have the same dimensions. The block bootstrapping ensures that we keep the data of the different durations D and areas A together, and hence the dependence structure. We use a block size of 2 weeks, beyond which the autocorrelation in the data does not decrease, as done in Haruna et al. (2023) for the same study area in the case of IDF curve modeling.
3. Fit the EGPD-IDAF model to the data in M_boot and estimate the intended return levels.
4. Repeat steps 2 to 3 a total of 300 times to obtain the bootstrap distribution of the return levels. Finally, compute the 95% Confidence Interval (CI) of the return levels by the percentile method, that is, by taking the empirical 0.025 and 0.975 quantiles of the bootstrap distribution of the return levels.
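The resampling step can be implemented by drawing time indices once and applying them to all 100 (D, A) series, which is what keeps durations and areas aligned. A sketch (the function name and details are ours):

```python
import numpy as np

def block_bootstrap_indices(n, block_len, rng):
    """Sample time indices for one moving-block bootstrap replicate:
    draw random block start positions with replacement and concatenate
    blocks of `block_len` consecutive steps until n indices are obtained.
    Applying the same indices to every (D, A) series preserves the
    dependence between durations and areas."""
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_len) for s in starts])
    return idx[:n]  # trim the last block to match the original length
```

With hourly data, a 2-week block corresponds to `block_len = 14 * 24 = 336` steps before declustering.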
As a measure of model precision, we compute the normalized width of the 95% CI of a T-year return level estimate (Shehu & Haberlandt, 2023). For a given pixel location s, it is computed from:

\Delta_T(s) = \frac{i_T^{0.975}(s) - i_T^{0.025}(s)}{\hat{i}_T(s)},

where i_T^{0.975}(s) and i_T^{0.025}(s) are the upper and lower bounds of the 95% CI from the bootstrap distribution, and \hat{i}_T(s) is the estimated T-year return level.

Goodness of Fit of the EGPD IDAF Model
To assess the goodness of fit of the EGPD-IDAF model, we compute the normalized root-mean-square error (NRMSE) and the normalized bias (NBias) at each pixel s and spatiotemporal scale (D, A). To focus on the high intensities, the criteria are computed only on the exceedances above the 1-year return level, computed using the Weibull plotting position T_j = \frac{n+1}{j} \times \frac{1}{\delta}, with j being the rank (from largest to smallest), n the sample size, and δ the average number of non-zero observations for (D, A) per year. The normalization allows comparison of the scores across intensities of different spatiotemporal scales (D, A). For a given pixel s, the two criteria are given as:

\mathrm{NRMSE}(s) = \frac{\sqrt{\frac{1}{n_s} \sum_{j=1}^{n_s} \left( \hat{i}_{T_j}(s) - i_j(s) \right)^2}}{\bar{i}(s)}, \qquad \mathrm{NBias}(s) = \frac{\frac{1}{n_s} \sum_{j=1}^{n_s} \left( \hat{i}_{T_j}(s) - i_j(s) \right)}{\bar{i}(s)},

where n_s is the number of exceedances, i_j(s) is the jth largest empirical quantile with return period T_j, \hat{i}_{T_j}(s) is the corresponding T_j return level estimated from the EGPD-IDAF model, and the denominator \bar{i}(s) is the average of the exceedances. NRMSE measures the accuracy of a given model in predicting the empirical quantiles: a good model should have NRMSE = 0, and the smaller the score, the better the model. NBias measures the ability of the model to avoid systematic underestimation (NBias < 0) or overestimation (NBias > 0) of the empirical quantiles; NBias = 0 means an unbiased model.
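The two criteria, as reconstructed above, can be sketched as follows (our implementation; `empirical` holds the observed exceedances of the 1-year return level and `modeled` the matching model return levels):

```python
import numpy as np

def nrmse_nbias(empirical, modeled):
    """NRMSE and NBias over the exceedances of the 1-year return level,
    both normalized by the mean of the empirical exceedances."""
    emp = np.asarray(empirical, dtype=float)
    mod = np.asarray(modeled, dtype=float)
    norm = emp.mean()
    nrmse = np.sqrt(np.mean((mod - emp) ** 2)) / norm
    nbias = np.mean(mod - emp) / norm
    return nrmse, nbias
```

A model that overestimates each exceedance by a fixed amount produces a positive NBias equal to that amount divided by the mean exceedance.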

Cross Validation
A natural question to ask is whether the EGPD-IDAF model, which links the EGPD parameters with duration and area, is a better model, in terms of some performance indicators, than fitting the EGPD model separately to each time series of spatiotemporal scale (D, A). The two models will henceforth be referred to as the global model and the base model, respectively. To answer this, we compare the two models in a split-sample cross-validation framework. We start by describing the cross-validation framework, and then introduce the criteria for measuring the performance.
In the split-sample cross-validation, we consider each pixel and divide the time series into two sub-samples of the same length but on different randomly chosen years. We take the first sub-sample, aggregate the data into the 10 durations and 10 areas, and fit the two competing models, that is, the base model and the global model. We then assess how the two models perform on the second sub-sample (the validation sample). A good predictive model should perform well on data not used in training it. We do the same with the second sub-sample (using it as the training sample and the first sub-sample as the validation sample). Since the split is done randomly, we repeat the procedure 40 times to address sampling bias. We apply the same procedure to all the pixels in the study area and select the method that has the best regional performance (average of the scores over all the pixels). We use well-chosen predictive performance criteria that have seen wide application in the literature (see Blanchet et al., 2015; Evin et al., 2016; Garavaglia et al., 2011; Haruna et al., 2022, 2023; Renard et al., 2013). We give a brief overview of the criteria; details can be found in the cited references.
• Robustness: The robustness criterion, SPAN, measures the ability of a model to give similar estimates of a high return level when data from two different calibration periods are used to train the model (Garavaglia et al., 2011). At a given pixel s and for a spatiotemporal scale (D, A), SPAN is computed as:

\mathrm{SPAN}_T(s) = \frac{\left| i_T^{(1)}(s) - i_T^{(2)}(s) \right|}{\frac{1}{2} \left( i_T^{(1)}(s) + i_T^{(2)}(s) \right)},

where i_T^{(1)}(s) and i_T^{(2)}(s) are the T-year return levels estimated from sub-samples 1 and 2, respectively, at pixel s. A SPAN of 0.5 means that the absolute difference between the two return levels is half of their average.
A regional value of SPAN over Switzerland is computed as $\mathrm{SPAN}_{\mathrm{reg},T} = 1 - \frac{1}{N}\sum_{s=1}^{N}\mathrm{SPAN}_T(s)$, where N = 7,056 is the total number of pixels. A perfectly robust model should have $\mathrm{SPAN}_{\mathrm{reg},T} = 1$.
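As a minimal numerical sketch (assuming, as implied above, that the regional score is one minus the mean pixel SPAN):

```python
import numpy as np

def span(q1, q2):
    """SPAN: absolute difference between two return levels, relative to their mean."""
    q1, q2 = np.asarray(q1, dtype=float), np.asarray(q2, dtype=float)
    return np.abs(q1 - q2) / (0.5 * (q1 + q2))

def span_reg(q1, q2):
    """Regional robustness over all pixels; 1 means identical estimates everywhere."""
    return 1.0 - np.mean(span(q1, q2))

# A SPAN of 0.5: the absolute difference (20) is half of the average (40)
print(span(50.0, 30.0))  # -> 0.5
```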
• Reliability in predicting the maximum value: At a given pixel $s$ and for a given spatiotemporal scale (D, A), the reliability of the model fitted on sub-sample 1 in predicting the maximum of sub-sample 2 (and vice versa) is measured by the FF criterion:

$\mathrm{FF}^{(12)}(s) = \left[F_s^{(1)}\!\left(x_{\max}^{(2)}(s)\right)\right]^{n^{(2)}}$,

where $F_s^{(1)}$ is the distribution fitted on sub-sample 1 at pixel $s$, $x_{\max}^{(2)}(s)$ is the maximum of sub-sample 2, and $n^{(2)}$ is its sample size. If the model is reliable, the FF values should be uniformly distributed on [0, 1], so their departure from uniformity, diff (computed over the N pixels), should be close to zero. The regional score, $\mathrm{FF}_{\mathrm{reg}} = 1 - \mathrm{diff}$, should therefore take a value of 1 for a reliable model and 0 for a completely unreliable one; the lower the value, the less reliable the model.
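A numerical sketch of the FF criterion. The exact definition of diff follows the cited references; here a simple area-type departure from uniformity (mean distance to uniform plotting positions) is assumed for illustration.

```python
import numpy as np

def ff_value(cdf_fit, x_max_valid, n_valid):
    """FF = F(x_max)^n: the fitted CDF evaluated at the validation maximum,
    raised to the validation sample size. Uniform on [0, 1] if the model is
    reliable (the max of n iid draws has CDF value distributed as Beta(n, 1))."""
    return cdf_fit(x_max_valid) ** n_valid

def ff_reg(ff_values):
    """1 - diff, where diff here measures the departure of the FF values from
    uniformity as the mean distance to uniform plotting positions (assumption)."""
    ff_sorted = np.sort(np.asarray(ff_values, dtype=float))
    n = len(ff_sorted)
    pp = (np.arange(1, n + 1) - 0.5) / n
    return 1.0 - np.mean(np.abs(ff_sorted - pp))

# Example: a fitted uniform CDF evaluated at the max of 10 validation draws
print(ff_value(lambda x: x, 0.9, 10))  # -> 0.9**10, about 0.349
```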
• Reliability/accuracy over all observations: While the previous reliability score (FF) and SPAN focus on extremes only, it is important that the model is also reliable in the bulk of the distribution, especially given that we use the EGPD. To measure the reliability of a model in predicting all the observations in cross-validation, we use the normalized root mean square error (NRMSE_CV):

$\mathrm{NRMSE\_CV}^{(12)}(s) = \dfrac{1}{\bar{x}^{(2)}(s)}\sqrt{\dfrac{1}{n^{(2)}}\sum_{j=1}^{n^{(2)}}\left(\hat{q}_{T_j}^{(2)} - q_{T_j}^{(1)}\right)^2}$,

where $\mathrm{NRMSE\_CV}^{(12)}(s)$ is the score computed at pixel $s$, $n^{(2)}$ is the sample size of the second sub-sample, $\hat{q}_{T_j}^{(2)}$ is the empirical quantile with return period $T_j$ derived from the plotting position of rank $j$, δ is the average number of non-zero precipitation intensities for (D, A) per year in sub-sample 2, and $q_{T_j}^{(1)}$ is the corresponding $T_j$-year return level estimated from the model fitted on sub-sample 1. The denominator $\bar{x}^{(2)}(s)$ is the mean of non-zero precipitation in sub-sample 2 at pixel $s$. Similar to the other criteria, the regional score for each spatiotemporal scale (D, A), computed over the N pixels, is $\mathrm{NRMSE\_CV}_{\mathrm{reg}}^{(12)} = 1 - \frac{1}{N}\sum_{s=1}^{N}\mathrm{NRMSE\_CV}^{(12)}(s)$. The other score, $\mathrm{NRMSE\_CV}_{\mathrm{reg}}^{(21)}$, is computed symmetrically. $\mathrm{NRMSE\_CV}_{\mathrm{reg}} = 1$ indicates a perfectly accurate model (the model exactly reproduces the empirical return levels).
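A minimal sketch of the pixel score, assuming the empirical validation quantiles and the model return levels have already been paired by return period:

```python
import numpy as np

def nrmse_cv(q_emp_valid, q_model_train, mean_nonzero_valid):
    """Normalized RMSE between empirical validation quantiles and the return
    levels of the model fitted on the training sub-sample."""
    e = np.asarray(q_emp_valid, dtype=float)
    m = np.asarray(q_model_train, dtype=float)
    return np.sqrt(np.mean((e - m) ** 2)) / mean_nonzero_valid

# Perfect prediction gives 0, so the regional score 1 - mean(NRMSE_CV) equals 1
print(nrmse_cv([1.0, 2.0, 4.0], [1.0, 2.0, 4.0], 2.0))  # -> 0.0
```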

Evaluation of CPC Data
We begin by checking how the statistics from the CPC data compare with those from the rain gauges through a point-to-pixel comparison. This involves comparing the time series from a gauge with the time series from the CPC pixel at the location of the gauge. We consider 79 stations with no missing data from 2005 to 2020 (the period of overlap of both data sets), and in each case the data from the gauge is considered the "truth." The locations of the stations are shown in Figure A3. We note that the comparison is not entirely independent, since most of the stations (69 out of 79) are already used to correct the radar data when producing CPC. However, since some differences remain between the two, it is still instructive to see how the statistics from CPC compare with those from the rain gauges before using them in modeling the IDAF relationships. The differences arise mainly from the nugget effect in the variogram model, the inherent scale differences between radar and rain gauge measurements, and the convection control scheme in summer (Sideris et al., 2014a).
The comparison proceeds in two steps: first, we compare the two time series using chosen criteria; second, we fit the EGPD to both time series and compare the 20-year return level estimates. The results are presented in the following subsections.

Criteria on All Observations
Following the work of Zambrano-Bigiarini et al. (2017), we use the three sub-components of the Kling-Gupta Efficiency (KGE) criterion (Kling et al., 2012) to compare the two data sets (see Equation A1 in Appendix A). The first is the bias (the tendency of CPC to under- or overestimate the gauge data). The second is the variability ratio, which measures the under- or over-dispersion of the CPC data compared to the gauge. The third component measures the linear correlation between the two time series. For a perfect match between the gauge and CPC, all the criteria should be equal to 1. The criteria are computed on all the data, including zeros.
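The three sub-components can be sketched as follows, using the modified KGE of Kling et al. (2012), in which the variability ratio compares coefficients of variation; the paper's exact Equation A1 may differ in detail.

```python
import numpy as np

def kge_components(gauge, cpc):
    """Bias ratio, variability ratio, and linear correlation between two series."""
    g = np.asarray(gauge, dtype=float)
    c = np.asarray(cpc, dtype=float)
    beta = c.mean() / g.mean()                            # bias ratio
    gamma = (c.std() / c.mean()) / (g.std() / g.mean())   # variability (CV) ratio
    r = np.corrcoef(g, c)[0, 1]                           # linear correlation
    return beta, gamma, r

# Identical series -> all three components equal 1 (a perfect match)
beta, gamma, r = kge_components([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])
```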
The boxplot of the correlation coefficient is shown in Figure 3a for the four seasons and eight aggregation durations (1, 2, 3, 6, 12, 24, 48, and 72 hr). Generally, there is a good temporal correlation between the two data sets for all seasons and durations (median > 0.9). For all seasons, the correlation increases with the aggregation duration. Summer generally exhibits the lowest correlation irrespective of the duration, due to the localized and isolated nature of convective events that are likely to be missed by the rain gauge. The bias and variability scores are given in Appendix A. There is generally a tendency for CPC to overestimate the data (Figure A1a) in all seasons (median > 0); again, the bias is more pronounced in summer compared to the other seasons. Finally, the dispersion bias (Figure A1b) is generally negative, with a median of about 5% in magnitude for all seasons.

Criteria on Extremes
Next, we evaluate the ability of CPC to correctly detect extreme precipitation as measured by the gauges. Extremes here are defined as exceedances of a 2-year return level within the 16-year record. We compute three criteria, similar to Panziera et al. (2018). The first criterion measures the bias in extreme precipitation totals. The second computes the probability of detection (POD), that is, the ability of CPC to classify events as extremes, given that they are also extremes according to the gauge. Finally, we compute the false alarm ratio (FAR), which measures the rate at which CPC classifies events as extremes when they are not extremes according to the gauge. For perfect agreement, the bias should be 0, POD should be 1, and FAR should be 0.
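The detection scores reduce to simple contingency counts on the binary extreme/non-extreme classifications. This is a sketch; the event matching used in the paper may be more involved.

```python
import numpy as np

def pod_far(gauge_is_extreme, cpc_is_extreme):
    """Probability of detection and false alarm ratio from binary labels."""
    g = np.asarray(gauge_is_extreme, dtype=bool)
    c = np.asarray(cpc_is_extreme, dtype=bool)
    hits = np.sum(g & c)           # extreme for both gauge and CPC
    misses = np.sum(g & ~c)        # gauge extreme missed by CPC
    false_alarms = np.sum(~g & c)  # CPC extreme not confirmed by the gauge
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    return pod, far

# One hit, one miss, one false alarm -> POD = 0.5, FAR = 0.5
pod, far = pod_far([1, 1, 0, 0], [1, 0, 1, 0])
```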
Figure 3b shows the seasonal POD scores. The median of the score ranges from 0.7 to 0.99, meaning that 70%-99% of the gauge extreme events are correctly classified as extremes by CPC. Again, summer shows the lowest values compared to the other seasons. The seasonal boxplots of FAR are shown in Appendix A (Figure A2b); the median FAR decreases with duration, which shows that the agreement improves as the intensities are aggregated to longer durations. The median of the bias in extreme precipitation totals (Figure A2a) is less than 5% in all cases. Summer has the lowest bias here but shows the largest spread for short durations.

Comparison of Return Level Estimates
In the final step, we compare the 20-year return level estimates from the two data sets. We fit the EGPD to each data set and estimate the 20-year return level. Figure 4 shows the relative bias in the 20-year return level estimates. A positive bias indicates that the CPC estimate is higher than the gauge estimate. In general, the bias for durations greater than 6 hr is close to zero. For the 1 hr duration, however, CPC tends to give lower estimates for all seasons except summer, which shows the opposite.
The maps of the relative bias, as well as of the correlation coefficient (r), POD, and bias in total precipitation (β), are shown in Figure A3. In general, there is no distinct spatial pattern, except for β, which shows overestimation in the north and underestimation in the south in all seasons except summer. In addition, three stations show quite large disagreement with the CPC data. They are located at La Dole (elevation 1,669 m), Col du Grand Saint-Bernard (2,472 m), and Säntis (2,501 m), in the west, southwest, and northeast, respectively. MeteoSwiss indicates that rain gauge measurements at these stations are subject to large uncertainties since the stations are not shielded (e.g., influence of wind and snow drift) (MeteoSwiss, 2023).
In conclusion, the results in these sections are aimed at checking how the statistics of the CPC data compare with those from the rain gauges, and in which direction they deviate, before using them in modeling the IDAF relationships. Despite the noticeable disagreements, there is generally good agreement between the two data sets, given the inherent uncertainties in both (gauge vs. radar reanalysis). As mentioned before, although CPC is corrected using the rain gauge data, some differences remain, mainly due to the nugget effect in the variogram model, the convection control scheme in summer (Sideris et al., 2014a), and the measurement scale difference between radar and rain gauge. As emphasized in Section 2.2, it is beyond the scope of this study to develop a new gridded data set for this topographically complex study area. CPC is the only data set at sub-daily temporal resolution in the study area, and it brings the spatial information needed for modeling IDAF, which cannot be obtained from rain gauges due to their limited spatial representativity. In the remainder of the article, only CPC is used to build the IDAF models.

EGPD Parameters as a Function of Duration and Area
The purpose of this section is to show the complex relationship that exists between the EGPD parameters and duration D, and area A. Moreover, it aims to showcase that the EGPD-IDAF model is flexible enough to adequately capture this complexity.
As an illustration, we focus on a single pixel located at an elevation of 1,351 m in Adelboden, west of the Bernese Alps (see Figure 1). The estimated EGPD parameters as a function of D and A for the four seasons are shown in Figure 5. In each panel, the lines represent the modeled relationship using the EGPD-IDAF model, and the points show the parameter estimates using the base model. The figure reveals a clear, season-dependent relationship of the parameters with D and A. We will focus on winter and summer, since the other two seasons present behavior in between.
Starting with the top row, the flexibility parameter κ, which controls the bulk and lower tail of the distribution, shows a clear relationship with both D and A. For large A, it shows a positive monotonic relationship with D, while for small A it is non-monotonic, decreasing and then increasing with D. This non-monotonic relationship with D was also observed by Haruna et al. (2023) while modeling IDF curves in the study area using rain gauge data. Next, in the middle row, the scale parameter σ decreases with increasing D for all A in both seasons. It shows, however, a non-monotonic relationship with A, which also varies with D. Finally, in the bottom row, the upper tail shape parameter ξ shows a season-dependent relationship with D and A. The strongest relationship is observed in summer, where it decreases with both D and A. While ξ indicates an exponential tail (ξ ≈ 0) for D > 24 hr irrespective of A, it indicates a heavy tail (ξ > 0.1) for D = 1 hr, even at A = 1,089 km². In winter, however, it indicates an exponential tail for all D and A.
We highlight that the pattern of relationships observed at this pixel does not generalize over all of Switzerland; our aim is simply to illustrate the complexity of the relationships at this pixel. For instance, at some locations σ shows a positive monotonic relationship with A for all D, and the shape parameter ξ can remain positive for all D and A in winter, increase with D, or increase with A. This intricate relationship of the parameters with D and A underscores the difficulty of modeling areal precipitation in topographically complex locations, due to the regional heterogeneity of the rainfall process. Despite this, as seen in Figure 5, the proposed regression models in Equation 6 are flexible enough to capture the observed trends in the base model estimates (points). In the next section, we present the results of the goodness of fit of the EGPD-IDAF model at all pixel locations in Switzerland.

Goodness of Fit of the EGPD-IDAF Model for Extremes
We fitted the 20-parameter EGPD-IDAF model at each pixel and for each of the four seasons, and assessed the goodness of fit using NRMSE and NBias (see Section 3.6). To assess the model's performance on the extreme values, we computed the two criteria on extremes only, defined as exceedances of a 1-year return level. The normalization allows comparison of the scores across intensities of different spatiotemporal scales (D, A). Figure 6 shows the results for the two criteria. In both subfigures, each of the four panels shows the score for a given season. The results are shown as a function of area (A), so each boxplot contains the results of 7,056 pixels for the 10 aggregation durations of a given A. Figure 6a shows the results for 1 − NRMSE, so the ideal score is 1. For all seasons, the median score is greater than 0.8, and the score improves as A increases, possibly because aggregating the process over larger spatial domains decreases the variability, making the model fit better. While the score is similar across seasons, summer shows slightly lower scores for small A (as seen from the width of the boxplot). These small scales in summer largely correspond to more intense and skewed rainfall from convective events; the shape parameter is then large (heavier tail), and the fit becomes more difficult. In Figure 6b, the median NBias remains close to zero, which means that the model does not consistently overestimate or underestimate the empirical quantiles. As with the other score, the variability around zero decreases as A increases.
These two scores show that the model adequately reproduces areal precipitation across durations in the study area: it performs well as judged by the NRMSE and shows no systematic tendency to overestimate or underestimate the empirical values (NBias).

Cross Validation Results
The results of the split-sample cross-validation comparing the EGPD-IDAF model (global model) and the base model are shown in Figure 7, with one panel for each of the three cross-validation scores (NRMSE_CV, FF, and SPAN20). As a reminder, the 20-parameter global model links the EGPD parameters with duration and area, while the base model fits a separate EGPD to each of the 100 time series of spatiotemporal scales (D, A). The best model in each case has a score of 1. Starting with the first panel from the left, NRMSE_CV is nearly the same for both models, which means that both have the same accuracy in predicting the whole non-zero precipitation. Next, the FF criterion also shows similar performance for the two models. A noticeable exception is summer, where the global model performs better. Hence, according to this criterion, while the models have similar reliability in predicting the maximum value, the global model is slightly better in summer. Finally, SPAN20 shows better performance for the global model in all seasons compared to the base model. This means that the global model gives a more stable estimate of the 20-year return level when the calibration sample is changed. The heat maps in Figure B1 show the median scores for each (D, A) pair; poor scores are typically obtained for short spatiotemporal scales.
In summary, both models have similar reliability in their predictive ability (NRMSE_CV and FF); however, the global model is more robust in 20-year return level estimation (SPAN20). The robustness of the global model can be explained by the fact that it is trained with much more data (all 100 time series are pooled in the parameter estimation) than the base model. In Figure 7, for NRMSE_CV and FF each boxplot contains two regional scores (i.e., $\mathrm{FF}_{\mathrm{reg}}^{(12)}$ and $\mathrm{FF}_{\mathrm{reg}}^{(21)}$) for each pair of (D, A) and 40 resamplings; in the case of SPAN20, each boxplot contains 100 × 40 points (one regional score for each pair of (D, A) and 40 resamplings).

Uncertainty
Since the two models have similar predictive performance, we go a step further and compare them in terms of their uncertainty estimates. While a good model should give correct predictions, the uncertainty of the predictions should not be too large. Figure 8 shows the n95CI width (%) (Equation 9) of a 50-year return level estimated with both the global and the base model. The smaller the score, the more precise the model (less uncertainty). Each panel in this figure shows the results for a given season. The results are shown as a function of area (A), so each boxplot contains the results of 7,056 pixels for the 10 aggregation durations of a given A. For all seasons, the global model has the smallest values of the n95CI width, as seen from the medians and widths of the boxplots. The lower values of the global model mean less uncertainty compared to the base model, which can be explained by the fact that the global model is trained with more data, translating to narrower confidence intervals. Two more observations can be made from Figure 8. First, for all seasons, the uncertainty decreases with A, which can result from the smoothing effect of spatial averaging. Second, some inter-seasonal differences are noticeable, with summer (winter) having the highest (lowest) uncertainty. A possible explanation is that since more extremes are observed in summer (especially at sub-daily time scales), the uncertainty is expected to be larger. For a given return period, the return levels in summer at small scales are larger than in the other seasons, and so is their uncertainty.
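A sketch of the normalized CI width, under the assumption that Equation 9 normalizes the 95% bootstrap interval by the median bootstrap estimate (the paper defines the exact form):

```python
import numpy as np

def n95ci_width(bootstrap_levels):
    """Width of the 95% bootstrap confidence interval, as a percentage of the
    median bootstrap return level (assumed normalization)."""
    lo, med, hi = np.percentile(np.asarray(bootstrap_levels, dtype=float),
                                [2.5, 50.0, 97.5])
    return 100.0 * (hi - lo) / med

# Narrower intervals (relative to the estimate) mean a more precise model
print(n95ci_width([90.0, 100.0, 110.0]))  # -> 19.0
```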
To conclude, the results shown so far demonstrate that the EGPD-IDAF model can be used in the study area. It has adequate goodness of fit, is reliable and robust in prediction, and has relatively low estimation uncertainty. With this validation, we now proceed to showcase examples of IDAF curves constructed from the EGPD-IDAF model at some pixel locations in the next section.

IDAF Curves
Figure 9 shows an application of the EGPD-IDAF model to build summer IDF and IAF curves at the pixel located in Adelboden. This pixel was introduced in Section 4.2 and is shown in Figure 1. Starting with the top row (Figure 9a), IDF curves are shown for four aggregation areas, that is, A = 1, 25, 529, and 1,089 km² (one per column). In each column, the colored lines represent the T-year return level estimates from the EGPD-IDAF model across durations for T = 2, 10, and 20 years. The corresponding empirical levels, computed using the Weibull plotting position, are shown by the colored points. The EGPD-IDAF model correctly predicts the observations, as they fall within the 95% CI (shown by the colored envelopes). We also see that the uncertainty (indicated by the width of the bounds) is higher for shorter durations. Finally, irrespective of the spatial scale (A), the return levels decrease as the duration increases.
The second row (Figure 9b) shows the IAF curves for four temporal scales, D = 1, 3, 24, and 72 hr. While the model performs adequately for longer durations (D ≥ 24 hr), the fit is poorer for shorter durations, especially at higher return periods. Looking at the IAF curves for short durations, we see that the return levels tend to decrease as the spatial scale increases. For longer durations, however, the return levels have nearly the same magnitude (flat IAF curves) irrespective of the spatial scale. A possible explanation is that at short durations the rainfall events are more localized (typical of convective events), so the magnitude decreases with spatial averaging. For longer durations, the rainfall is more homogeneous in space (typical of frontal events), with no significant variations in intensity, leading to similar marginal distributions for the considered areal rainfall.
To explore the regional and seasonal variability of the IDAF relationships, Figure 10 shows the autumn IDF curves (top row) and IAF curves (bottom row) at a location in Sion, in the inner valleys of southwestern Switzerland. This location is at a relatively low elevation of 482 m and experiences low-intensity rainfall due to the shielding effect of the Alps on both sides. Remarkably, the IDF and IAF curves at this pixel diverge from the conventional trend of return levels decreasing with increasing spatial scale. The IAF curves (bottom row) highlight this feature: the IAF curves for 1 hr are nearly flat, and those for D ≥ 24 hr have positive slopes. A plausible explanation is that rainfall, of both short and long duration, is less intense at the pixel location than at its neighboring locations, which are at higher altitude (see Figures 6-8 of Panziera et al. (2018)). As a consequence, more intense rainfall enters the average as the rainfall is spatially aggregated around the pixel. Figure 1 shows that a spatial window of 1,089 km² centered on the pixel (elevation of 482 m) extends well beyond the valley into the Bernese alpine slopes (elevations up to 2,400 m). This seasonal and regional variability highlights the complexity of modeling areal precipitation in the study area due to its complex topography.

Areal Rainfall Hazard in Switzerland
In this last section, we use the EGPD-IDAF model to assess areal rainfall hazards in the study area. We investigate the 20-year return level for two spatiotemporal scales, specifically (D = 1 hr, A = 1 km²) and (D = 24 hr, A = 1,089 km²). The corresponding maps of the seasonal 20-year return level are shown in Figures 11a and 11b, respectively. For the scale (D = 1 hr, A = 1 km²), the highest return levels occur during the summer months, while the lowest values are observed in winter. This can be attributed to the prevalence of convective rainfall in summer. We also see significant regional variability across all seasons, particularly in summer. The Ticino region south of the Alps, the Bernese Alps in the north, and the Jura Mountains consistently exhibit the highest return levels. Conversely, the inner valleys of Valais and Grisons, located between mountains, show the lowest values, as they are shielded from both directions.
Moving to the scale (D = 24 hr, A = 1,089 km²), we see a shift in the seasonal and regional variability of the 20-year return level. The black square shows the spatial coverage of A = 1,089 km², centered on a pixel in Adelboden. The map in Figure 11b shows that the largest values are observed in Ticino regardless of the season; the Ticino region consistently exhibits the highest levels of extreme precipitation in Switzerland. North of the Alps, the plateau displays lower levels compared to the pre-Alps (along the Glarus Alps). These results emphasize the influence of spatiotemporal scale on the seasonality and regional patterns of rainfall hazard in Switzerland. Smaller scales show a higher hazard during summer, while larger scales show a higher hazard during autumn, particularly in the Ticino region. Note that the Ticino region remains at a higher hazard of extreme precipitation irrespective of the scale, while the inner valleys of Valais and Grisons exhibit lower susceptibility to extreme precipitation events.
In conclusion, these results provide insights into the seasonal and regional patterns of rainfall hazards in Switzerland, highlighting the importance of considering spatiotemporal scales when assessing extreme precipitation hazards. Note that while this assessment focuses on the hazard of extreme precipitation, other factors, such as exposure and vulnerability at specific locations, must also be considered to fully evaluate the overall risk.

Conclusions
This paper focused on modeling the relationship of extreme precipitation across duration and area through Intensity-Duration-Area-Frequency (IDAF) curves in Switzerland. We proposed a novel approach to model IDAF curves by using all the non-zero (low, medium, and extreme) precipitation data, instead of only the extremes. To build the IDAF curves, we used the EGPD as the parametric distribution for the precipitation intensities. The EGPD has the key advantage of adequately modeling the entire distribution of non-zero precipitation while being compliant with extreme value theory in both tails. We followed the approach of Overeem et al. (2010) to model the IDAF curves in a data-driven manner, modeling the EGPD parameters as a function of area and duration, with the form of the relationships determined empirically from the data. We used 17 years of data from the radar-reanalysis product CombiPrecip (CPC) (Sideris et al., 2014a) to build the EGPD-IDAF model at each pixel location in the study area.
We used the model to assess areal rainfall hazards at selected spatiotemporal scales in Switzerland. The results showed that Ticino, located south of the Alps, is the region most exposed to extreme precipitation for all the scales considered. Overall, the results provided insights into the seasonal and regional patterns of rainfall hazards in Switzerland, highlighting the importance of considering multiple spatiotemporal scales when assessing extreme precipitation hazards. We note that although we used the EGPD-IDAF model for areal rainfall hazard assessment, it can serve several other applications, such as the design of hydraulic structures (Bertini et al., 2020) or the determination of thresholds for early warning systems (Panziera et al., 2016). Another potential application is as a marginal distribution in stochastic weather generators for areal rainfall generation, since the EGPD models the whole distribution of non-zero precipitation, not only the upper tail. The model would provide a robust marginal distribution, given the quantity of data used to train it.
Additional results from a point-to-pixel comparison showed that CPC and rain gauge data provide similar return level estimates, especially for longer durations. While this can be seen as a form of validation of CPC for extreme value analysis, the inferred return levels have to be interpreted with caution, mainly due to the limited length of the data. Notwithstanding, our work provides a framework for further analysis with longer time series, for example, series simulated with weather generators. Another limitation concerning the use of the EGPD is that the data have to be temporally declustered to reduce serial dependence in the time series. For durations longer than 10 hr (see Section 3.2), the temporal declustering means the data are taken in blocks. This can lead to the omission of high-intensity events, which might result in underestimation of the return levels. Potential correction methods have to be explored, since existing methods, to our knowledge, are for annual maxima series (e.g., Blanchet et al., 2016; Hershfield, 1962). One perspective for the present work is to use splines to model the relationships in the EGPD-IDAF model rather than parametric regression forms. A possibility is to use Generalized Additive Models as implemented in Youngman (2020), or the extension using a censored likelihood as in Haruna et al. (2022). While splines are promising due to their flexibility, a likely drawback is the enormous computational time required for inference with the EGPD, which uses all non-zero data. Our experience in Haruna et al. (2022) shows that such a model requires significant time before convergence; the problem would be more severe here, where 100 time series are used at each of more than 7,000 pixels. Another avenue for further research is to develop an Areal-Reduction-Factor (ARF)-based IDAF model and compare it with the data-driven approach used in this study. While empirical (e.g., Mineo et al., 2018) and analytical (e.g., De Michele et al., 2001) ARF formulations exist in the literature (see Svensson and Jones (2010) for a review), our suggestion is to empirically develop an ARF model suited to the study area, because previous research by Mélèse et al. (2019) showed that in mountainous regions ARF formulations can exhibit unusual behavior (e.g., ARF increasing with area, or ARF > 1). Furthermore, from an inference point of view, it would be interesting to explicitly account for dependence in the likelihood of the EGPD model (Equation 8). Beyond addressing potential likelihood misspecification, this would allow estimating the conditional probability of observing an extreme event at a particular spatiotemporal scale, given that an extreme at another scale has been observed. This kind of information is invaluable in practice for risk management and planning.
Finally, an avenue for further research is an objective comparison of the performance of the EGPD against other distributions, such as the GEV or GPD, or the recently proposed metastatistical extreme value (MEV) distribution (Marani & Ignaccolo, 2015), in modeling IDAF relationships. The MEV distribution, in particular, has become increasingly popular in hydrological applications (e.g., Gründemann et al., 2023; Schellander et al., 2019) because it does not require the asymptotic assumption and uses more data than the GEV and GPD. The evaluation framework and criteria used in this study (see Section 3.7) can be applied to compare these distributions. Additionally, the criterion proposed by Gründemann et al. (2023) to measure the heaviness of the distribution tail can be computed. This would allow a thorough evaluation of the advantages and potential drawbacks of using the EGPD when the interest is only in the extremes. In any case, the EGPD has an edge over the GPD/GEV/MEV distributions since it models the entire non-zero precipitation (low, medium, and extreme), while the latter only model the extremes.

Figure 1 .
Figure 1. Map of Switzerland, the study area. The background color denotes the elevation (meters above mean sea level). The radar symbols show the locations of the five radars in Switzerland, with their names in white boxes. The names of some cities are shown in black, and the names of some mountains and regions are shown in red. The colored embedded squares show, as examples, the extent of 7 of the 10 square windows used for data aggregation. They cover areas of 25, 49, 169, 289, 529, 729, and 1,089 km², from the innermost to the outermost. The red circles show the locations of the 79 rain gauges used for evaluating the gridded product (CPC).
κ(D, A) > 0, σ(D, A) > 0, and ξ(D, A) ≥ 0 are the three EGPD parameters for duration D and area A. T is the return period in years, and δ(D, A) is the average number of non-zero precipitation intensities for duration D and area A per year, estimated from the long-term average of the non-zero precipitation intensities per year.

Figure 2 .
Figure 2. (a) Conceptual illustration of Intensity-Duration-Area-Frequency (IDAF) curves in three dimensions. The Intensity-Duration-Frequency curves for A = 81 km² (shown in panel (b)) are obtained by cutting the IDAF surface in panel (a) with a plane at A = 81 km² (red broken lines). The Intensity-Area-Frequency curves in panel (c) are obtained by cutting panel (a) with a plane at D = 6 hr (blue broken lines).

Figure 3 .
Figure 3. Boxplots of (a) linear correlation and (b) probability of detection for the four seasons. Each boxplot contains 79 points, one for each pair of rain gauge and underlying CPC pixel.

Figure 4 .
Figure 4. Boxplots of the relative bias in the 20-year return level estimate for the four seasons. Each boxplot contains 79 points, one for each pair of rain gauge and underlying CPC pixel.

Figure 5 .
Figure 5. Extended generalized Pareto distribution (EGPD) parameters as a function of duration D and area A at a pixel located in Adelboden (elevation of 1,354 m, see Figure 1), for the four seasons (columns). The first row shows κ(D, A), the second row σ(D, A), and the last row ξ(D, A). In each panel, the lines represent the modeled relationship using the EGPD-Intensity-Duration-Area-Frequency model, and the points show the parameter estimates using the base model. The lines and points are colored by duration.

Figure 6 .
Figure 6. Goodness of fit of the extended generalized Pareto distribution-Intensity Duration Area Frequency model computed on extremes, defined as the exceedances of the 1-year return level for each (D, A). (a) Boxplots of (1 - normalized root-mean-square error); note that the vertical axis is cut at zero although negative values exist, which account for less than 0.06% of the scores. (b) Normalized bias for the four seasons. Each boxplot contains 7,056 × 10 points, one for each pixel and spatiotemporal scale (D, A).
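The two goodness-of-fit scores in Figure 6 can be sketched as follows, assuming normalization by the mean of the empirical extremes; the paper's exact normalization may differ.

```python
import numpy as np

def nrmse_score(empirical, modeled):
    """1 - NRMSE between modeled and empirical extreme quantiles.
    Normalization by the mean empirical value is an assumption;
    1 is a perfect fit, and the score can go negative."""
    e = np.asarray(empirical, dtype=float)
    m = np.asarray(modeled, dtype=float)
    rmse = np.sqrt(np.mean((m - e) ** 2))
    return 1.0 - rmse / e.mean()

def normalized_bias(empirical, modeled):
    """Mean error normalized by the mean empirical value."""
    e = np.asarray(empirical, dtype=float)
    m = np.asarray(modeled, dtype=float)
    return float((m - e).mean() / e.mean())
```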

Figure 9 .
Figure 9. Application of the fitted extended generalized Pareto distribution-Intensity Duration Area Frequency model at a pixel location in Adelboden for the summer season. The top row (a) shows Intensity-Duration-Frequency curves for four spatial scales (one per column). The bottom row (b) shows Intensity-Area-Frequency curves for four temporal scales (one per column). The lines and points show the modeled and empirical levels, respectively, colored by return period. The colored envelopes are the 95% confidence intervals of the model estimates obtained by block bootstrap. The 50-year empirical values are not shown due to the short record length of the data.
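The block-bootstrap confidence intervals in Figure 9 can be sketched generically: resample whole years with replacement (preserving within-year dependence), re-estimate, and take empirical quantiles of the resampled statistics. This is a generic sketch, not the paper's exact resampling scheme, and `estimator` here is a hypothetical placeholder for a fitted return level.

```python
import numpy as np

def block_bootstrap_ci(yearly_series, estimator, n_boot=500, alpha=0.05, seed=0):
    """(1 - alpha) CI for 'estimator' by resampling whole years (blocks)
    with replacement. 'yearly_series' is a list of per-year arrays;
    'estimator' maps a concatenated sample to a scalar."""
    rng = np.random.default_rng(seed)
    n_years = len(yearly_series)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_years, size=n_years)   # resampled year indices
        sample = np.concatenate([yearly_series[i] for i in idx])
        stats.append(estimator(sample))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```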

Figure 11 .
Figure 11. Map of the seasonal 20-year return level obtained with the extended generalized Pareto distribution-Intensity Duration Area Frequency model for the spatiotemporal scales (a) (D = 1 hr, A = 1 km2) and (b) (D = 24 hr, A = 1,089 km2). The black square in panel (b) shows, as an example, the maximum extent of the square window used for data aggregation, that is, 1,089 km2.

Figure 10 .
Figure 10. Same as Figure 9 but for autumn at a location in Sion, in the Canton of Valais (see Figure 1).

Figure A1 .
Figure A1. (a) Boxplots of bias (β) for the four seasons. (b) Boxplots of the variability ratio (γ) for the four seasons. Each boxplot contains 79 points, one for each pair of gauge and underlying CPC pixel.
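The bias (β) and variability ratio (γ) in Figure A1 can be sketched assuming the common definitions from the Kling-Gupta decomposition, β as the ratio of means and γ as the ratio of coefficients of variation; the paper's exact definitions may differ.

```python
import numpy as np

def bias_ratio(radar, gauge):
    """beta = mean(radar) / mean(gauge); 1 means unbiased.
    This common definition is an assumption about the paper's beta."""
    return float(np.mean(radar) / np.mean(gauge))

def variability_ratio(radar, gauge):
    """gamma = CV(radar) / CV(gauge), with CV = std/mean; 1 means
    matching relative variability (assumed definition)."""
    cv_r = np.std(radar) / np.mean(radar)
    cv_g = np.std(gauge) / np.mean(gauge)
    return float(cv_r / cv_g)
```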

Figure A2 .
Figure A2. (a) Boxplots of the bias in extreme precipitation totals for the four seasons. (b) Boxplots of the false alarm ratio for the four seasons. Each boxplot contains 79 points, one for each pair of gauge and underlying CPC pixel.
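The false alarm ratio in Figure A2 complements the probability of detection: it counts radar-detected wet steps that the gauge records as dry. The contingency-table definition and the 0.1 mm wet threshold are assumptions, not the paper's stated values.

```python
import numpy as np

def false_alarm_ratio(gauge, radar, wet=0.1):
    """FAR = false alarms / (hits + false alarms): fraction of
    radar-detected wet steps that the gauge records as dry.
    Assumes at least one radar-detected wet step; the 0.1 mm
    threshold is an assumption."""
    obs_wet = np.asarray(gauge) >= wet
    sim_wet = np.asarray(radar) >= wet
    hits = np.sum(obs_wet & sim_wet)
    false_alarms = np.sum(~obs_wet & sim_wet)
    return float(false_alarms / (hits + false_alarms))
```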

Figure A3 .
Figure A3. Map of Switzerland showing the locations of the 79 stations used for comparing CPC against rain gauge time series. The locations shown by circles are managed by MeteoSwiss, while those shown by triangles are managed by the canton of Lucerne. The shapes are colored according to the value of the criterion, from top left in clockwise direction: linear correlation (r), probability of detection, bias (β), and relative bias in the 20-year return level estimate.

Figure B1 .
Figure B1. (a) Seasonal heat maps of the median score over 80 values (2 × 40 resamplings) for various spatiotemporal scales (D, A). Top left: NRMSE_CV; top right: the FF criterion. The bottom panel shows the same maps for the case of SPAN20, over 40 resamplings.
Appendix B: Cross-Validation Criteria for Various Spatiotemporal Scales

denotes the average of the 300 estimates. The normalization enables the comparison of uncertainty widths across intensities of different scales and return periods.
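The normalized uncertainty width described above can be sketched as follows. The exact quantile pair and the normalization by the mean of the estimates are assumptions, since the precise definition of the criterion is not reproduced in this excerpt.

```python
import numpy as np

def normalized_span(estimates, alpha=0.05):
    """Width of the central (1 - alpha) interquantile range of
    resampled return-level estimates, normalized by their average so
    that scores are comparable across spatiotemporal scales and
    return periods (assumed definition)."""
    e = np.asarray(estimates, dtype=float)
    lo, hi = np.quantile(e, [alpha / 2, 1 - alpha / 2])
    return float((hi - lo) / e.mean())
```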