Modeling Intensity‐Duration‐Frequency Curves for the Whole Range of Non‐Zero Precipitation: A Comparison of Models

Intensity-duration-frequency (IDF) curves are useful in water resources engineering for the planning and design of hydrological structures. As opposed to the common use of only extreme data to build IDF curves, here we use all the non-zero rainfall intensities, thereby making efficient use of the available information. We use the extended generalized Pareto distribution to model the distribution of the non-zero rainfall intensities. We consider three commonly used approaches for building IDF curves. The first approach is based on the scale-invariance property of rainfall, the second relies on the general IDF formulation of Koutsoyiannis et al. (1998, https://doi.org/10.1016/S0022-1694(98)00097-3), and the last approach is purely data-driven (Overeem et al., 2008, https://doi.org/10.1016/j.jhydrol.2007.09.044). Using these three approaches, and some extensions around them, we build a total of 10 models for the IDF curves, and we then compare them in a split-sampling cross-validation framework. We consider a total of 81 stations at 10 min resolution in Switzerland. Due to the marked seasonality of rainfall in the study area, we perform a season-based analysis. The results reveal the model based on the data-driven approach as the best model. It correctly models the observed intensities across durations while being reliable and robust, and it reproduces the space and time variability of extreme rainfall across Switzerland.

Regarding the model for the dependence of intensity on duration, many formulations that are based on different approaches have been proposed in the literature. Here, we identify and focus on three major approaches.
The first approach is based on scale invariance. It has been shown that rainfall exhibits this property within some range of scales (see Gupta & Waymire, 1990; Harris et al., 1997; Lima, 1998; Molnar & Burlando, 2008; Over, 1995; Paschalis, 2013; Schertzer & Lovejoy, 1987; Veneziano & Lepore, 2012). This property provides the physical justification for modeling IDF curves, and thus the possibility of inferring return levels of interest across scales. This approach is arguably the most commonly used, possibly because of its rich theoretical background, physical basis, and ease of application in regions with scarce availability of sub-daily rainfall series. IDF curves based on this approach can be found in several applications (e.g., Burlando & Rosso, 1996; Innocenti et al., 2017; Menabde et al., 1999; Sane et al., 2018; Van de Vyver & Demarée, 2010; Willems, 2000).
The second approach is based on the general formulation of Koutsoyiannis et al. (1998), a generalization of the various empirical formulations for modeling IDF curves. This formulation has the key advantage of being a separable function of return level and duration. It is also consistent with both probabilistic theory and the physical constraints of scaling across durations. Several applications of this formulation to build IDF curves can be found in the literature (e.g., Blanchet et al., 2016; Fauer et al., 2021; Koutsoyiannis et al., 1998; Roksvåg et al., 2021; Sane et al., 2018; Ulrich et al., 2020; Van de Vyver & Demarée, 2010). This method can be seen as an extension of the scaling approach in which an additional parameter (θ) allows for the curvature of the return level curves at short durations.
The third approach is based on Overeem et al. (2008) and is called the data-driven approach. Here, the functional relationship (linkage) between IDF model parameters and duration is empirically determined from the data itself. The method involves fitting a parametric model, for example, GEV, to data of each duration. A particular regression model is then fitted for each parameter as a function of duration. As a consequence, the return level of any duration can be inferred from the inverse of the distribution, with parameters obtained from the regression model. This approach imposes neither the assumption/existence of scaling nor the separability condition in the case of the general formulation of Koutsoyiannis et al. (1998). Interestingly, both approaches can be seen as special cases of the data-driven approach with particular functional relationships imposed on the parameters. We note that, although the other two approaches also consider the link between parameter and duration, the specific forms rely on some underlying theoretical hypothesis. However, with the data-driven method, the functional relationship is empirically determined from the data itself, hence its name.
There are also nonparametric approaches which, rather than imposing a parametric model on the intensities, use stochastic rainfall models to estimate the IDF curves (for a brief review, see Veneziano et al., 2007; Tyralis & Langousis, 2019). Here, we focus on the class that uses parametric models for the intensities.
The common use of the GEV and GPD as models for the distribution of rainfall intensities in IDF curves is justified by the practical use of IDF curves, which is to infer high return levels for hydrological designs. However, these distributions have a major drawback: poor utilization of already scarce data, since only one value per block is retained (GEV) or only values above a given threshold (GPD). This is why Langousis et al. (2016) noted that, compared to the GEV, the GPD is favored for extreme return level estimation, since it makes more efficient use of the hydrologic information and hence decreases estimation uncertainty. Even so, the delicate issue of threshold selection remains with the GPD.
HARUNA ET AL.
10.1029/2022WR033362

As a remedy, Naveau et al. (2016) proposed the extended generalized Pareto distribution (EGPD) to model all the non-zero precipitation intensities. It has the advantage of using all the information present in the sample of non-zero rainfall data, and not only one value per block (as with the GEV) or only values above a given threshold (as with the GPD), thereby reducing uncertainty. It does not require the choice of a GPD threshold and has the advantage of being compliant with extreme value theory (EVT) in both the lower and upper tails. Another advantage is its ability to model all non-zero precipitation. This has applications in water resources management for urban water supplies, hydropower, flood forecasting, and irrigation systems. As an example, marginal distributions for the entire range of non-zero precipitation are required in stochastic generators (see examples of application in Evin et al., 2018; Viviroli et al., 2022). Simulated precipitation from the generators is used as input to conceptual hydrological models for subsequent flood modeling and risk assessment. Other practical applications are the evaluation of numerical weather simulations and the investigation of the climatology of rainfall events, as outlined by Blanchet et al. (2019). On the other hand, the assumption of independence in estimation is more likely to be violated with the EGPD, which uses all the information, than with the GEV and GPD. A common approach to reducing the dependence is to apply temporal declustering, for example, by taking one-third of the data as done in Naveau et al. (2016), Le Gall et al. (2022), and Haruna et al. (2022).
The goal of this article is to use the EGPD to build IDF curves for the non-zero precipitation intensities, based on the three outlined approaches, that is, scale invariance, the general formulation of Koutsoyiannis et al. (1998), and the data-driven approach. Due to the marked seasonality of rainfall in the study area, we consider a season-based analysis, with winter (Dec-Jan-Feb), spring (Mar-Apr-May), summer (Jun-Jul-Aug), and autumn (Sep-Oct-Nov). We use a split-sampling cross-validation framework, based on well-chosen criteria, to compare and select the best model. To the best of our knowledge, this study is the first to present an objective comparison of the three outlined IDF curve modeling approaches. Furthermore, while several applications of the EGPD have been presented in the literature (e.g., Blanchet et al., 2019; Evin et al., 2018; Haruna et al., 2022; Le Gall et al., 2022; Rivoire et al., 2021; Tencaliec et al., 2020), our study presents the first attempt to use this distribution in IDF curve development. In addition to the common application of IDF curves for return level estimation, IDF curves based on all the non-zero precipitation data (modeled with the EGPD) will allow for a more comprehensive validation of stochastic weather generators. Finally, apart from the traditional IDF curves (see Eicher & Krejci, 1996), as far as we know, this study presents the first attempt to model IDF curves that link parameters with duration over the whole of Switzerland.
The rest of the paper is organized as follows: Section 2 introduces the data and the area under study, and Section 3 presents the EGPD, the models for the IDF curves, and the evaluation framework. Section 4 presents the results and discussion, and finally, Section 5 draws the conclusions and gives relevant perspectives.

Data and Area Under Study
The study area is Switzerland, a small country with an area of 41,285 km². Despite its relatively small size, it presents a complex topography, with elevations ranging from 191 to 4,127 m above mean sea level and around 30% of the area located above 1,500 m. This results in marked spatial variability in both the intensity and the occurrence of precipitation.
Point precipitation data from a total of 81 stations, with a minimum record length of 20 years, are available for this study. They are spread across Switzerland, and their locations are shown in Figure 1. Out of this total, 71 stations belong to the SwissMetNet of the Swiss Federal Office for Meteorology and Climatology (MeteoSwiss), while 10 belong to the canton of Lucerne, a partner network of MeteoSwiss. The precipitation data are measured with tipping-bucket gauges of 0.1 mm depth resolution at a sampling resolution of 10 min. Most of the stations at high altitudes are shielded from wind, and the tipping gauge is heated to account for snow. Heating-related losses can result in up to 24% underestimation compared to measurements with an electronic weighing system (Savina et al., 2012). The records have variable lengths ranging from 20 to 40 years within the period 1981 to 2020. The stations are located at elevations ranging from 203 to 3,294 m, with an average of 952.4 m.
The climate of Switzerland is influenced by the Alps, the Atlantic Ocean, and the Mediterranean Sea (Giannakaki & Martius, 2015; Sodemann & Zubler, 2009). As a result, precipitation is characterized by seasonality and spatial variability in both intensity and occurrence (Frei & Schär, 1998; Molnar & Burlando, 2008; Sevruk, 1997; Sevruk et al., 1998). Annual precipitation is largest in the Alps and their rims, along the Jura (in the northwest), and in Ticino (south of the Alps). In these areas, annual sums are generally larger than 2,000 mm. On the other hand, the inner valleys (Rhône and Inn) receive the least annual precipitation (less than 700 mm). Summer is the main precipitation season all over Switzerland, except in Ticino, where autumn is the main season. For all regions, winter receives the least precipitation. In terms of heavy precipitation (see Panziera et al., 2018), the spatial distribution depends on the accumulation duration. For short-duration accumulations (e.g., 1 hr), the heaviest precipitation occurs in summer everywhere. It is largest in Ticino, the Jura, and the northern rim, where maximum summer hourly intensities can reach up to 30 mm/hr. For longer accumulations (1 day and greater), Ticino receives the heaviest precipitation, where a maximum of more than 130 mm over 24 hr can be expected in autumn. For the other regions, the heaviest precipitation almost always occurs in summer.
Due to this marked seasonality, we consider a seasonally based analysis. We divide the data into four seasons of 3 months each. Winter includes Dec-Jan-Feb, Spring Mar-Apr-May, Summer Jun-Jul-Aug, while Autumn includes Sep-Oct-Nov. A similar seasonal approach was used in previous studies in this area (see Evin et al., 2018;Fukutome et al., 2015;Haruna et al., 2022;Molnar & Burlando, 2008).

Methodology
In this section, we start by presenting the distribution for the non-zero precipitation intensities, we then present the various IDF models, and finally, the inference strategy to estimate the parameters. A flowchart for the framework is shown in Figure 2.

Marginal Distribution for Non-Zero Precipitation Intensities
We define the random variable I to represent non-zero rainfall intensities. We intend to find a distribution that allows us to model the entire distribution of I, that is, its small, medium, and extreme values. A natural choice would be the gamma distribution, but it has been reported to underestimate extreme precipitation (Katz et al., 2002).
When looking only at the extremes, EVT (Coles, 2001) tells us that the probability that an excess of I above a high threshold u is less than or equal to a value i can be approximated by the GPD:

$$\mathbb{P}(I - u \leq i \mid I > u) \approx H_{\sigma,\xi}(i), \quad (1)$$

with the cumulative distribution function (CDF)

$$H_{\sigma,\xi}(i) = 1 - \left(1 + \xi \frac{i}{\sigma}\right)_+^{-1/\xi}, \quad (2)$$

where a₊ = max(a, 0), σ > 0 is the scale parameter, and −∞ < ξ < +∞ is the GP shape parameter that controls the upper tail of the distribution (for ξ = 0, Equation 2 is interpreted as its limit, 1 − exp(−i/σ)). The bounded case (short tail) is when ξ < 0, the exponential case (light tail) when ξ = 0, and the unbounded case (heavy tail) when ξ > 0.
The GPD has a key advantage over the GEV in that it makes use of all the observations above the threshold u, whereas with the GEV only the largest observation within a block, usually a season or a year, is retained. However, some major issues remain. First, there is as yet no unified method for choosing the optimal threshold u above which excesses are GP distributed. Second, the observations below the chosen threshold u, although precious and not easy to come by, are entirely discarded. Finally, the question of modeling the whole range of non-zero precipitation, which has applications in water resources management, remains.
Many parametric, semi-parametric, and non-parametric models have been proposed in the literature as extensions of the GPD to model the full distribution of I, and not only its upper tail as given in Equation 1 (see the review in Scarrott & MacDonald, 2012). In particular, Naveau et al. (2016) proposed a family of models, called the EGPD, defined by the CDF

$$F(i) = G\{H_{\sigma,\xi}(i)\}, \quad (3)$$

where G is a continuous CDF on [0, 1], with constraints given in Naveau et al. (2016) ensuring that the distribution of I remains GP in its upper tail, is gamma-like in its lower tail, and transitions smoothly in between. Four parametric families of G were proposed by the authors, the simplest of which is defined as G(v) = v^κ. The CDF of this model is

$$F(i) = \{H_{\sigma,\xi}(i)\}^{\kappa} = \left\{1 - \left(1 + \xi \frac{i}{\sigma}\right)_+^{-1/\xi}\right\}^{\kappa}. \quad (4)$$

The model thus has three parameters: κ > 0 controls the lower tail, σ > 0 is the scale parameter, and ξ ≥ 0 controls the upper tail. H_{σ,ξ}(·) is as defined in Equation 2. We exclude the bounded case (ξ < 0) since we are dealing with precipitation; we thus assume precipitation to be either light-tailed (ξ = 0) or heavy-tailed (ξ > 0). The density of the model, alongside that of the gamma distribution, is illustrated in Figure 3. With just one additional parameter, κ, the distribution is parsimonious and able to adequately model the full range of non-zero precipitation (see applications in Evin et al., 2018; Haruna et al., 2022; Le Gall et al., 2022). We thus use this model in the rest of the paper.
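As a concrete illustration, the CDF of Equation 4 and its inverse (needed later for return levels) reduce to a few lines of code. The sketch below is our own illustrative code, not the authors' implementation; the function names are ours.

```python
import numpy as np

def egpd_cdf(i, kappa, sigma, xi):
    """CDF of the simplest EGPD family, F(i) = {H(i)}^kappa (Equation 4),
    where H is the GPD CDF of Equation 2 with lower endpoint 0."""
    i = np.asarray(i, dtype=float)
    if xi == 0.0:
        h = 1.0 - np.exp(-i / sigma)  # light-tailed (exponential) limit
    else:
        h = 1.0 - np.maximum(1.0 + xi * i / sigma, 0.0) ** (-1.0 / xi)
    return h ** kappa

def egpd_quantile(p, kappa, sigma, xi):
    """Inverse CDF: solve {H(i)}^kappa = p for i."""
    hp = np.asarray(p, dtype=float) ** (1.0 / kappa)
    if xi == 0.0:
        return -sigma * np.log1p(-hp)
    return sigma / xi * ((1.0 - hp) ** (-xi) - 1.0)
```

Setting κ = 1 recovers the GPD exactly, consistent with the κ = 1 case shown in Figure 3.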

IDF Models
We define the random variable I_d as the average non-zero precipitation intensity over the duration d. It is described by the CDF F_d(i) = ℙ(I_d ≤ i). The exceedance frequency is defined as p_d(i) = 1 − F_d(i). The return period of any non-zero intensity i, as a function of p_d, is given by

$$T(I_d \geq i) = \frac{1}{\delta_d\, p_d(i)},$$

with δ_d the average number of non-zero precipitation intensities of duration d per year. We estimate δ_d from the long-term average number of non-zero precipitation intensities per year. Consequently, the T-year return level over duration d, i(T, d), is defined as the intensity whose return period equals T. Accordingly, the IDF relationship is a mathematical function (T, d) ↦ i(T, d) that relates non-zero rainfall intensity i to its duration d and its frequency of exceedance p_d (or, equivalently, its return period T). In this article, the CDF of I_d, F_d(i), is the EGPD presented in Section 3.1. The different formulations considered here differ only in how they define this mathematical relationship between i, T, and d, while taking F_d(i) as an EGPD model.

[Figure 3. Densities of the EGPD with G(v) = v^κ for σ = 1, ξ = 0.5, and κ = 0.8, 1, and 2 (adapted from Naveau et al., 2016), alongside the density of the gamma distribution with shape and scale 1.4 (in blue). The right panel zooms in on the right tail: the gamma tail decays much faster while having a similar shape in the bulk of the distribution for κ = 2. The case κ = 1 is the exact GP case.]

In the following subsections, we present the different IDF-EGPD models based on the three outlined approaches, that is, scale invariance, the general formulation of Koutsoyiannis et al. (1998), and the data-driven approach. For the sake of simplicity, we drop the "EGPD" term and simply refer to the IDF models as IDF_modelname, where the subscript "modelname" refers to the approach used to build the model.
For all the models, the IDF curve, corresponding to the (1 − 1/(δ_d T)) quantile of the EGPD, is defined in Equation 5 as:

$$i(T, d) = F_d^{-1}\left(1 - \frac{1}{\delta_d T}\right), \quad (5)$$

where κ_d > 0, σ_d > 0, and ξ_d ≥ 0 are the three EGPD parameters for the duration d, T is the return period in years, and δ_d is the average number of non-zero precipitation intensities per year for the duration d. The choice of the model determines whether each of the three parameters (κ, σ, and ξ) varies with d or not, and the form of the relationship.
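For the simplest EGPD family, Equation 5 has a closed form. The sketch below (our own illustrative code, covering the cases ξ > 0 and ξ = 0) computes i(T, d) from the fitted parameters at duration d.

```python
import math

def egpd_quantile(p, kappa, sigma, xi):
    """Inverse of the EGPD CDF F(i) = {H(i)}^kappa (Equation 4)."""
    hp = p ** (1.0 / kappa)
    if xi == 0.0:
        return -sigma * math.log1p(-hp)  # light-tailed case
    return sigma / xi * ((1.0 - hp) ** (-xi) - 1.0)

def return_level(T, delta_d, kappa_d, sigma_d, xi_d):
    """T-year return level at duration d (Equation 5): the
    (1 - 1/(delta_d * T)) quantile of the EGPD fitted at duration d."""
    p = 1.0 - 1.0 / (delta_d * T)
    return egpd_quantile(p, kappa_d, sigma_d, xi_d)
```

With, say, δ_d = 400 non-zero intensities per year at some duration (a made-up value for illustration), the 10-year level is `return_level(10, 400, ...)`; return levels increase with T, as expected.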
We consider 13 durations, that is, d = 30 min, 40 min, 1, 2, 3, 6, 10, 12, 16, 18, 24, 48, and 72 hr. We consider durations up to 72 hr (3 days) because according to Froidevaux et al. (2015), precipitation accumulations from 0 to 3 days before an event are the most relevant for triggering floods of high magnitudes in Switzerland. The intermediate durations are meant to ensure a good spread in a logarithmic scale and for a later comparison of our quantiles with those provided by MeteoSwiss (not in this paper). We use a fixed window to aggregate the data from the gauge resolution of 10 min, to the various durations. For instance, the 24 hr intensities correspond to amounts accumulated from 00h00 to 24h00 UTC of every day, divided by 24. We chose this time window because, in the study area, convective events are mostly in the evening. This will allow us to differentiate between the first and last 12 hr starting from 00 hr. The choice of a fixed window over a moving window will likely result in omitting the highest intensities in each duration. However, since we are using all the non-zero precipitation intensities, using a moving window will result in significant dependence in the time series.
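The fixed-window aggregation described above can be sketched as follows; this is our own illustrative code, assuming a gap-free 10-min series of intensities.

```python
import numpy as np

def aggregate_fixed(intensity_10min, steps):
    """Average 10-min intensities over non-overlapping (fixed) windows of
    `steps` consecutive values, e.g. steps = 6 for 1-hr intensities and
    steps = 144 for 24-hr intensities aligned on 00h00 UTC."""
    x = np.asarray(intensity_10min, dtype=float)
    n = len(x) // steps * steps  # drop a trailing partial window
    return x[:n].reshape(-1, steps).mean(axis=1)
```

Note that averaging the 10-min intensities over a window is equivalent to accumulating the depths over the window and dividing by its duration, as described in the text for the 24 hr case.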

Scaling IDF Model
Scale invariance in the strict sense of Gupta and Waymire (1990) refers to the property whereby the probability distribution of I_d can be inferred from the distribution of I_{d_0} at the reference duration d_0 through:

$$I_d \overset{d}{=} C_\lambda\, I_{d_0}, \qquad \lambda = d/d_0, \quad (6)$$

where $\overset{d}{=}$ denotes equality in distribution and the parameter C_λ determines the type of scaling: simple scaling or multi-scaling. For our case, the reference duration d_0 is taken as 1 hr.
A weaker assumption, the so-called "wide sense scaling" (Gupta & Waymire, 1990), is when the scaling is in the moments according to:

$$E\left[I_d^q\right] = \left(\frac{d}{d_0}\right)^{k(q)} E\left[I_{d_0}^q\right], \quad (7)$$

where q is the order of the moment, k(q) is called the moment scaling function, and d_0 is the reference duration. Moment scaling analysis as described by Gupta and Waymire (1990) is used to determine the type of scaling.
When the parameter in Equation 6 is C_λ = (d/d_0)^{−H}, that is, a scalar that depends only on the ratio of the scales, we have "strict sense simple scaling." This is expressed in Equation 8:

$$I_d \overset{d}{=} \left(\frac{d}{d_0}\right)^{-H} I_{d_0}, \quad (8)$$

where 0 < H < 1 is the scaling exponent. The other variables retain their meanings.
Wide sense simple scaling is when the moment scaling function in Equation 7 is linear in q, that is, k(q) = −Hq, as expressed in Equation 9:

$$E\left[I_d^q\right] = \left(\frac{d}{d_0}\right)^{-Hq} E\left[I_{d_0}^q\right]. \quad (9)$$
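A moment scaling analysis in the spirit of Gupta and Waymire (1990) can be sketched as follows (our own illustrative code): regress the log empirical moments on log(d/d_0) to estimate k(q), then check whether k(q) is linear in q. Under the convention we assume here, I_d equal in distribution to (d/d_0)^{−H} I_{d_0}, the slope of k(q) against q is −H.

```python
import numpy as np

def moment_scaling(samples_by_duration, durations, d0=1.0,
                   qs=(0.5, 1.0, 1.5, 2.0, 2.5, 3.0)):
    """For each moment order q, regress log E[I_d^q] on log(d/d0);
    the slope is the moment scaling function k(q) (Equation 7).
    Simple scaling holds when k(q) is linear in q, here k(q) = -H*q."""
    logd = np.log(np.asarray(durations, dtype=float) / d0)
    kq = []
    for q in qs:
        logm = [np.log(np.mean(np.asarray(s, dtype=float) ** q))
                for s in samples_by_duration]
        kq.append(np.polyfit(logd, logm, 1)[0])  # slope = k(q)
    H_est = -np.polyfit(qs, kq, 1)[0]            # -slope of k(q) vs q
    return np.array(kq), H_est
```

In practice one would plot k(q) against q: a straight line through the origin indicates simple scaling, while curvature indicates multi-scaling.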
It can be shown that, under strict sense simple scaling, only one parameter of the EGPD scales with duration, namely σ, whereas κ and ξ are independent of duration. For the rest of the paper, we drop the term "strict sense" and simply use "simple scaling" for convenience. Accordingly, the simple-scaling EGPD model, IDF_ss, is defined such that κ_d = κ_0, ξ_d = ξ_0, and σ_d follows the power law

$$\sigma_d = \sigma_0 \left(\frac{d}{d_0}\right)^{-H}, \quad (10)$$

where κ_0 > 0, σ_0 > 0, and ξ_0 ≥ 0 are the parameters at the reference duration d_0 = 1 hr, and 0 < H < 1 is the scaling exponent. The inference method for the parameters is described in Section 3.3.
An important issue is the existence of multiple scaling regimes in precipitation: different scaling behaviors (scaling exponents) hold over different ranges of duration, and IDF curves have to be modeled considering this change in scaling (e.g., Bougadis & Adamowski, 2006; Courty et al., 2019; Yu et al., 2004). To illustrate this behavior, we consider the case of the station at Robbia in Graubünden in winter. We fit the EGPD to the data of each of the 13 durations separately by maximum likelihood and then inspect how the estimated scale parameter (σ) varies with duration. The result is given in Figure 4a. Here, a single power law (log-log, in blue, given in Equation 10) is not enough to explain the scaling. To account for this break in the scaling relationship, we define the two-regime simple-scaling EGPD IDF model, IDF_ss_TR, as:

$$\sigma_d = \begin{cases} \sigma_0\, d^{-H_1}, & d \leq K, \\ \sigma_0\, K^{H_2 - H_1}\, d^{-H_2}, & d > K, \end{cases} \quad (11)$$

An illustration of this equation is given in Figure 4a (log-log TR, in red). Here K is the duration of the scaling break, σ_0 > 0 is the scale parameter at the reference duration d_0 = 1 hr, and σ_d is continuous at d = K. The parameters 0 < H_1 < 1 and 0 < H_2 < 1 are the scaling exponents of the first and second regimes. The other two EGPD parameters, κ_d = κ_0 and ξ_d = ξ_0, remain independent of duration. Hence, for this model, a total of six parameters has to be estimated: κ_0, σ_0, ξ_0, H_1, H_2, and K.
Finally, although the simple-scaling EGPD model imposes a constraint on the linkage of ξ with d, that is, ξ_d = ξ_0, we notice that some of the stations show an apparent variation of ξ with d. For illustration, we consider a station in Zurich in summer. We fit the EGPD to the data of each of the 13 durations separately by maximum likelihood, then inspect how the estimated shape parameter (ξ) varies with duration, and model the relationship through a linear-log form as expressed in Equation 12.
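A parameterization of the two-regime power law for σ_d that satisfies the continuity requirement at the break d = K can be sketched as follows (our own illustrative code):

```python
import numpy as np

def sigma_two_regime(d, sigma0, H1, H2, K):
    """Two-regime power law for the EGPD scale parameter, continuous at
    the scaling break d = K (reference duration d0 = 1 hr): the second
    regime is multiplied by K**(H2 - H1) so both pieces match at d = K."""
    d = np.asarray(d, dtype=float)
    first = sigma0 * d ** (-H1)                    # regime 1: d <= K
    second = sigma0 * K ** (H2 - H1) * d ** (-H2)  # regime 2: d > K
    return np.where(d <= K, first, second)
```

The factor K^(H2 − H1) is what enforces continuity: at d = K both branches evaluate to σ_0 K^(−H1).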
Figure 4b illustrates this and shows how the linear-log model fits the points correctly:

$$\xi_d = a_\xi + b_\xi \log(d), \quad (12)$$

where a_ξ and b_ξ are the intercept and slope, respectively. This leads to two additional IDF models with ξ = f(d), namely:
• IDF_ss_ξ(d): an extension of the basic simple-scaling model IDF_ss, allowing ξ to vary with d according to Equation 12.
• IDF_ss_TR_ξ(d): an extension of the two-regime simple-scaling model IDF_ss_TR, allowing ξ to vary with d according to Equation 12.

General IDF Formulation
Koutsoyiannis et al. (1998) proposed a general formulation encompassing the various traditional formulations of the IDF curves in the literature. They showed that all of them can be simplified into the form:

$$i(T, d) = \frac{a(T)}{b(d)}, \qquad b(d) = (d + \theta)^{H}, \quad (13)$$

where a(T) is the T-year quantile of the re-scaled intensities I_d b(d). a(T) is independent of d and completely determined by the statistical model considered for I_d, in our case, the EGPD.
This formulation has the key advantage of being a separable function of return level a(T) and duration b(d) that is consistent with both probabilistic theory and the physical constraints of scaling across durations. Menabde et al. (1999) showed that this formulation is the same as the scale-invariant model if θ is set to 0.
When applied to the EGPD, IDF_kouts is defined such that κ_d = κ_0, ξ_d = ξ_0, and

$$\sigma_d = \sigma_0 (d + \theta)^{-H},$$

where σ_0 > 0, θ > 0, and 0 < H < 1 have to be inferred. The reference duration here is taken as d_0 = 1 − θ, so that d_0 + θ = 1 and σ_{d_0} = σ_0.
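With d_0 = 1 − θ, the shifted duration at the reference is d_0 + θ = 1, so σ at the reference duration equals σ_0. A one-line sketch (our own illustrative code):

```python
def sigma_kouts(d, sigma0, theta, H):
    """EGPD scale under the general formulation of Koutsoyiannis et al.
    (1998): a power law in the shifted duration d + theta. At the
    reference duration d0 = 1 - theta, the shifted duration is 1,
    so sigma equals sigma0."""
    return sigma0 * (d + theta) ** (-H)
```

Setting θ = 0 recovers the simple-scaling power law of Equation 10 with d_0 = 1 hr, consistent with the observation of Menabde et al. (1999).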
Following the same arguments discussed in Section 3.2.1 regarding the existence of a break in the scaling relationship and the dependence of ξ on d, we propose three extensions to this model:
• IDF_kouts_TR: allowing for a break in the scaling regime. This model is defined as:

$$\sigma_d = \begin{cases} \sigma_0 (d + \theta)^{-H_1}, & d \leq K, \\ \sigma_0 (K + \theta)^{H_2 - H_1} (d + \theta)^{-H_2}, & d > K, \end{cases} \quad (14)$$

where K is the duration of the scaling break, σ_0 > 0 is the scale parameter at the reference duration d_0 = 1 − θ, and σ_d is continuous at d = K. The parameters 0 < H_1 < 1 and 0 < H_2 < 1 are the scaling exponents of the first and second regimes. The other two EGPD parameters, κ_d = κ_0 and ξ_d = ξ_0, remain independent of duration. Hence, for this model, a total of six parameters has to be estimated: κ_0, σ_0, ξ_0, H_1, H_2, and K.
• IDF_kouts_ξ(d): an extension of the basic model IDF_kouts, allowing ξ to vary with d according to Equation 12.
• IDF_kouts_TR_ξ(d): an extension of the two-regime model IDF_kouts_TR, allowing ξ to vary with d according to Equation 12.

Data-Driven IDF Model
The scaling theory and the specific form of Equation 13 impose particular functions for the relation between the scale parameter σ of the EGPD and the duration d. In the data-driven models, by contrast, the expression of the relationship for each of the three EGPD parameters is empirically determined by the data itself. To guide our choice of the appropriate functional relationships, we inspected how each locally estimated EGPD parameter varies with duration. Figure 4 gives an example for the σ and ξ parameters at two stations. We finally settled on the following functions to model the three parameters with respect to duration:

$$\log \kappa_d = \begin{cases} a_\kappa + b_{1,\kappa} \log d, & d \leq K_\kappa, \\ a_\kappa + b_{1,\kappa} \log K_\kappa + b_{2,\kappa} (\log d - \log K_\kappa), & d > K_\kappa, \end{cases} \quad (15)$$

$$\log \sigma_d = \begin{cases} a_\sigma + b_{1,\sigma} \log d, & d \leq K_\sigma, \\ a_\sigma + b_{1,\sigma} \log K_\sigma + b_{2,\sigma} (\log d - \log K_\sigma), & d > K_\sigma, \end{cases} \quad (16)$$

For the first two parameters, κ and σ, the function is a continuous two-piece linear model in log space. K_* is the duration of the break point (the function is continuous at d = K_*), and a_*, b_{1,*}, and b_{2,*} are the intercept and the slopes of the first and second lines, respectively. In the case of ξ, the function was already given in Equation 12.
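A continuous two-piece linear fit in log-log space of this kind can also be obtained, as a rough stand-in for a proper segmented-regression algorithm, by a grid search over candidate break points. The sketch below is our own illustrative code, not the algorithm of Muggeo (2003) used in the paper.

```python
import numpy as np

def fit_piecewise_loglinear(durations, values, n_grid=60):
    """Continuous two-piece linear fit in log-log space (the form of
    Equations 15-16): intercept a, slope b1 up to the break K, slope b2
    beyond, with K chosen by grid search to minimize the squared error."""
    x = np.log(np.asarray(durations, dtype=float))
    y = np.log(np.asarray(values, dtype=float))
    best = None
    for logK in np.linspace(x.min(), x.max(), n_grid)[1:-1]:
        hinge = np.maximum(x - logK, 0.0)  # (log d - log K)_+ basis term
        X = np.column_stack([np.ones_like(x), x, hinge])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        sse = float(np.sum((y - X @ beta) ** 2))
        if best is None or sse < best[0]:
            best = (sse, logK, beta)
    _, logK, (a, b1, db) = best
    return a, b1, b1 + db, np.exp(logK)  # intercept, slope1, slope2, break K
```

The hinge basis (log d − log K)₊ enforces continuity at the break by construction, which is the same constraint the paper imposes on Equations 15 and 16.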
Note that, by keeping κ and ξ independent of duration and imposing a single power law on σ, either the simple-scaling formulation or the general formulation of Koutsoyiannis et al. (1998), presented in Sections 3.2.1 and 3.2.2 respectively, can be recovered from this data-driven approach.
We consider two IDF models in this class; both impose the same type of functional relationships (Equations 12, 15, and 16) but differ in the way the regression parameters are estimated. The inference strategy is explained in detail in Section 3.3.
The different models compared in this study are summarized in Table 1. For the list of some key variables and their meanings, refer to Table A1.

Inference
In this section, we describe the inference methods used to estimate the parameters of the 10 IDF models presented in Table 1. As a prerequisite to an objective comparison, the same estimation strategy has to be employed for all the models. For this reason, we use global maximum likelihood estimation for all the models (as done by Blanchet et al. (2016)), which we describe in the next paragraph. The exception is a single model, the two-step data-driven model. In this case, first, we fit, for a given station, the EGPD to the data of each duration separately (using maximum likelihood estimation). Second, we fit, for each estimated parameter, the chosen regression model as a function of duration. The parameters in Equations 15 and 16 are estimated by segmented regression (i.e., a regression model with break points), while those of Equation 12 are estimated by ordinary least squares. The segmented regression we use is based on the algorithm of Muggeo (2003), as implemented in the R package segmented.
We now come back to the global maximum likelihood for the other nine IDF models. This method involves pooling, for each station, all the data from the 13 durations to estimate the model parameters, with the duration d used as a covariate. We note that by pooling all the data, we make the hypothesis of independence between the intensities of different time steps and different durations (since we use an independence likelihood, as in Blanchet et al. (2016)). This hypothesis is difficult to justify given that we consider all the non-zero intensities. Note that by taking fixed windows rather than moving windows, we have reduced the dependence between the different time steps. Previous studies considered similar approaches to reduce temporal dependence between time steps, by taking a fraction of the data (Haruna et al., 2022; Le Gall et al., 2022; Naveau et al., 2016). The dependence between the intensities of different durations, for example, intensities of 1 hr versus those of 2 hr, however, remains. While it might be possible to account for this dependence using strategies proposed in the literature (e.g., Davison et al., 2012; Sebille et al., 2017), we believe this requires a complete study of its own and would likely complicate the optimization of the models, in addition to the possibility of performance deterioration due to misspecification of the dependence structure. We take this as a limitation of the present study and a topic for future research.
The log-likelihood (ll) that is maximized here takes left censoring into account. The importance of left censoring when fitting rainfall data by maximum likelihood was pointed out by Naveau et al. (2016), who showed that better performance is obtained by taking it into account. The log-likelihood is given in Equation 17 as:

$$\ell\ell = \ell\ell_{\text{censored}} + \ell\ell_{\text{uncensored}}, \quad (17)$$

where ll_censored and ll_uncensored are the contributions of the censored and uncensored data, given in Equations 18 and 19, respectively, as

$$\ell\ell_{\text{censored}} = \sum_{d} \sum_{j:\, i_{d,j} \leq c_d} \log F(c_d;\, \kappa_d, \sigma_d, \xi_d), \quad (18)$$

$$\ell\ell_{\text{uncensored}} = \sum_{d} \sum_{j:\, i_{d,j} > c_d} \log f(i_{d,j};\, \kappa_d, \sigma_d, \xi_d), \quad (19)$$

where F and f are the EGPD CDF and density, d ranges over the 13 durations, and j ranges over the number of time steps in the data of duration d. κ_d > 0, σ_d > 0, and ξ_d ≥ 0 are the EGPD parameters for duration d, i_{d,j} is the precipitation intensity for duration d and time step j, and c_d ≥ 0 is the left-censoring threshold applied to the data of duration d. Many authors have taken this censoring approach into account, but they usually take a uniform threshold value for all the stations (e.g., Tencaliec et al. (2020) used 2 mm for daily rainfall). Here, we did not find the use of a common threshold over the 81 stations sufficient. We instead select, for each station and duration, the lower threshold c_d that minimizes the Normalized Root Mean Square Error (NRMSE) of Equation 20 in Section 3.5.
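Under the assumed forms of Equations 18 and 19, and restricting to ξ > 0 for brevity, the pooled censored log-likelihood can be sketched as follows (our own illustrative code; for G(v) = v^κ the EGPD density is f(i) = κ h(i) H(i)^{κ−1}, with h and H the GPD density and CDF):

```python
import numpy as np

def egpd_loglik(data_by_duration, params_by_duration, thresholds):
    """Left-censored independence log-likelihood pooled over durations
    (form of Equations 17-19), for the EGPD with G(v) = v^kappa, xi > 0.
    Observations at or below c_d contribute log F(c_d); the rest
    contribute the log density log f(i)."""
    ll = 0.0
    for d, obs in data_by_duration.items():
        i = np.asarray(obs, dtype=float)
        kappa, sigma, xi = params_by_duration[d]
        c = thresholds[d]
        H = lambda x: 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)     # GPD CDF
        h = lambda x: (1.0 + xi * x / sigma) ** (-1.0 / xi - 1.0) / sigma
        cens = i <= c
        n_cens = int(np.sum(cens))
        if n_cens:                                   # censored part (Eq. 18)
            ll += n_cens * kappa * np.log(H(c))
        u = i[~cens]                                 # uncensored part (Eq. 19)
        ll += float(np.sum(np.log(kappa) + np.log(h(u))
                           + (kappa - 1.0) * np.log(H(u))))
    return float(ll)
```

In the global MLE of the paper, the parameters (κ_d, σ_d, ξ_d) would be generated from the duration-dependence model of each IDF formulation, and this pooled log-likelihood maximized over the model's parameters.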
For the cases where two linear models are fitted (see Equations 11, 12, 14-16), we use segmented regression to estimate the regression parameters. We then use the fitted parameters as initial values in the optimization of the likelihood function.
We note that estimation methods other than the global MLE have been used in other studies for the simple-scaling approach and the general IDF formulation of Koutsoyiannis et al. (1998). For the simple-scaling models, a two-step procedure has been used, in which the scaling exponent in Equation 8 is first obtained through moment scaling analysis, and then all re-scaled intensities from all the durations are pooled to fit the IDF model using MLE (see Innocenti et al., 2017; Nhat et al., 2008; Panthou et al., 2014). For the general IDF formulation of Koutsoyiannis et al. (1998), the authors proposed two estimation strategies: the so-called "robust estimation" and the "one-step least squares method." The robust estimation is a two-step procedure that involves estimating the parameters of b(d), and then those of a(T) (see Equation 13), through minimization of the Kruskal-Wallis statistic. The one-step least squares method jointly estimates all the parameters of Equation 13 by minimizing the squared error between the observed and modeled quantiles from the IDF model. Nevertheless, we use only the global MLE method so as to objectively compare the models under the same inference strategy. In addition, it has the advantage of being a one-step estimation procedure and is better suited than least squares for the estimation of non-Gaussian distributions.

Uncertainty Estimation
We employ a block bootstrap approach for the uncertainty estimation in all the models considered in this study. The principle of the block bootstrap is to sample, with replacement, blocks of consecutive data of a given size B, and to repeat this R times. Here, for computational reasons, we use R = 500. We then use the percentile method to estimate the 95% confidence interval (CI). The choice of an appropriate block size B is a delicate issue, but a common rule is to choose B large enough that the temporal dependence in the data is preserved; a block size B that is too small will underestimate the uncertainty.
Here, we checked the autocorrelation in the seasonal data (result not shown) and found that it no longer decreases beyond a lag of about 1 week; hence we choose B = 2 weeks. To maintain the dependence between data of different durations d, we always ensure that all the data of the different durations contained in the same 2-week block are sampled together. The block bootstrap proceeds as follows, for each station s and season:

1. Aggregate the data to intensities of the 13 durations d, and store them in a matrix of n rows by 13 columns, where n is the number of observations. We call this matrix M_orig.
2. Sample blocks, with replacement, from the rows of M_orig to form the bootstrap matrix M_boot. Both M_orig and M_boot have the same dimensions. By sampling rows of M_orig, we keep the data of the different durations together, and hence the dependence structure.
3. Fit the IDF model on the data in M_boot and estimate the intended return level.
4. Repeat steps 2 and 3 a total of R = 500 times to obtain the bootstrap distribution of the return level.
5. Obtain the 95% CI of the intended return level by the percentile method, that is, by taking the empirical 0.025 and 0.975 quantiles of the bootstrap distribution obtained in step 4.
We note here that, to ensure all the models use the same bootstrap matrix M_boot, we fix the random number generator seed (via set.seed in R) for the sampling in step 2. Examples of the CI obtained with this approach are given later in Figure 10.
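The bootstrap steps above can be sketched as follows (an illustrative Python translation, not the authors' R implementation; the function name and toy data are ours):

```python
import numpy as np

def block_bootstrap_rows(M_orig, block_len, rng):
    """Resample whole blocks of consecutive rows of M_orig with replacement.

    Rows are time steps and columns are the aggregation durations, so
    sampling entire rows keeps the cross-duration dependence, while sampling
    them in contiguous blocks keeps the temporal dependence within a block.
    """
    n = M_orig.shape[0]
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_len) for s in starts])[:n]
    return M_orig[idx]

rng = np.random.default_rng(1)          # fixed seed, as with set.seed in R
M_orig = rng.normal(size=(8 * 14, 13))  # toy data: 16 weeks, 13 durations
M_boot = block_bootstrap_rows(M_orig, block_len=14, rng=rng)  # 2-week blocks

# Percentile 95% CI of a bootstrap statistic (here the first-column mean
# stands in for a fitted return level)
stats = [block_bootstrap_rows(M_orig, 14, rng)[:, 0].mean() for _ in range(500)]
lo, hi = np.quantile(stats, [0.025, 0.975])
```

In the study itself, step 3 fits the full IDF model on each M_boot; the toy statistic above only illustrates the resampling and the percentile CI.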

Evaluation Framework
We evaluate the performance of the models in two respects: first, in calibration, that is, how well a given model fits the data used to train it; second, their predictive performance in a cross-validation framework.
HARUNA ET AL.

Calibration
We use two criteria to evaluate the performance of the models in calibration (in-sample performance):

• NRMSE: the normalization, here by the mean, allows the comparison of intensities of different durations across different stations. For each station s and duration d, we compute the NRMSE over the non-zero precipitation intensities (Equation 20), where NRMSE_s(d) denotes the score computed at station s and duration d, and n_s(d) is the number of non-zero precipitation intensities for duration d.
• Akaike Information Criterion (AIC) (Akaike, 1974): this criterion rewards goodness of fit, as measured by the likelihood, but penalizes the additional number of parameters to be estimated. It is computed as AIC = 2p − 2 log L, where L is the maximized likelihood from Equation 17 and p is the number of parameters to be estimated (see Table 1). The lower the AIC, the better the model.
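A minimal sketch of the two criteria, assuming Equation 20 is an RMSE normalized by the mean of the observations (the AIC formula itself is standard):

```python
import numpy as np

def nrmse(obs, fitted):
    """RMSE normalized by the mean of the observations (the form we assume
    for Equation 20, which normalizes by the mean)."""
    obs, fitted = np.asarray(obs, float), np.asarray(fitted, float)
    return np.sqrt(np.mean((obs - fitted) ** 2)) / np.mean(obs)

def aic(max_log_lik, n_params):
    """Akaike Information Criterion: 2p - 2 log L; lower is better."""
    return 2 * n_params - 2 * max_log_lik

# Toy check: a uniform 5% over-prediction gives an NRMSE proportional to
# sqrt(E[i^2]) / E[i]
obs = np.array([1.0, 2.0, 4.0, 8.0])
score = nrmse(obs, 1.05 * obs)
```

Because both criteria are computed per station and duration, the comparison across the 81 stations and 13 durations reduces to aggregating these scalar scores.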

Cross-Validation
We follow a split-sampling procedure in a cross-validation framework. For each station s, we divide the 10 min precipitation intensities into two sub-samples of equal length, covering different, randomly chosen years. We then aggregate the data into intensities of the various durations, d = 30 min, 40 min, 1, 2, 3, 6, 10, 12, 16, 18, 24, 48, and 72 hr, and fit each of the 10 IDF models.
We then evaluate the performance of the models fitted on sub-sample 1 on the observations in sub-sample 2 and vice versa, by computing the relevant cross-validation criteria. We repeat this procedure 10 times.
In the following, we present three different criteria we use to evaluate the models. These criteria have been used in several studies to evaluate and compare competing models (see Blanchet et al., 2015;Evin et al., 2016;Garavaglia et al., 2011;Haruna et al., 2022;Renard et al., 2013).
• The robustness criterion, SPAN, measures the stability of the estimate of a high return level when the training data change. It is computed as

SPAN_{s,T}(d) = |q̂_T^{(1)}(d) − q̂_T^{(2)}(d)| / [ (q̂_T^{(1)}(d) + q̂_T^{(2)}(d)) / 2 ],

where q̂_T^{(1)}(d) and q̂_T^{(2)}(d) are the T-year return levels for duration d estimated at station s on sub-samples 1 and 2, respectively. For each duration d, a regional score over all N stations (N = 81) is computed as SPAN_reg,T(d) = 1 − (1/N) Σ_{s=1}^{N} SPAN_{s,T}(d), and a perfect model in terms of robustness should have SPAN_reg,T(d) = 1.
• The reliability of the model fitted on sub-sample 1 in predicting the maxima in sub-sample 2, and vice versa, is measured by the FF criterion:

FF_s^{(12)}(d) = [ F_s^{(1)}( i_max^{(2)}(d) ) ]^{n_s^{(2)}(d)},

where FF_s^{(12)}(d) is the cross-validation criterion computed at station s and duration d by predicting the probability of the maximum value i_max^{(2)}(d) in sub-sample 2, of sample size n_s^{(2)}(d), using the model F_s^{(1)} fitted on sub-sample 1. FF_s^{(21)}(d) is computed symmetrically.
For a given duration, Renard et al. (2013) and Blanchet et al. (2015) showed that each FF_s^{(12)}(d) at a station should be a realization of a uniform distribution. The area difference, diff, between a theoretical uniform distribution and the distribution of the N values of FF_s^{(12)}(d) (computed over the N stations) should therefore be close to 0. The regional score FF_reg(d) = 1 − diff should thus equal 1 for a perfectly reliable model and 0 for a completely unreliable one; the lower the value, the less reliable the model.
• The reliability/accuracy of the model in predicting the entire observations in cross-validation is measured by the NRMSE_CV.
where NRMSE_CV_s^{(12)}(d) = sqrt( (1/m) Σ_{j=1}^{m} [ q̂_{T_j}^{(2)}(d) − q_{T_j}^{(1)}(d) ]² ) / ī_s^{(2)}(d) is the score computed at station s and duration d, with the sum running over the m return periods T_j considered. Here, n_s^{(2)}(d) is the sample size, q̂_{T_j}^{(2)}(d) is the empirical quantile with return period T_j in sub-sample 2, and q_{T_j}^{(1)}(d) is the corresponding T_j return level estimated from the model fitted on sub-sample 1. The denominator ī_s^{(2)}(d) is the average rainfall in sub-sample 2 at site s.
Finally, for each duration d, the regional score computed over the N stations is given as NRMSE_CV_reg^{(12)}(d) = 1 − (1/N) Σ_{s=1}^{N} NRMSE_CV_s^{(12)}(d); NRMSE_CV_reg^{(21)}(d) is computed in a similar way. NRMSE_CV_reg = 1 indicates a perfect model, with complete agreement between the empirical and modeled quantiles for all return periods; the closer the value is to 1, the more accurate the model.
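The three cross-validation scores and their regional aggregation can be sketched as follows (our own minimal Python rendering with illustrative numbers; the discretization of the area difference for FF_reg is an assumption, not the authors' code):

```python
import numpy as np

def span(q1, q2):
    """SPAN: absolute difference of a return level estimated on the two
    sub-samples, normalized by the mean of the two estimates."""
    return abs(q1 - q2) / (0.5 * (q1 + q2))

def ff_reg(ff_values):
    """FF_reg = 1 - diff, where diff approximates the area between the
    empirical distribution of the station-wise FF values and Uniform(0, 1),
    here via the mean absolute gap to the plotting positions i / (n + 1)."""
    u = np.sort(np.asarray(ff_values, float))
    positions = np.arange(1, len(u) + 1) / (len(u) + 1)
    return 1.0 - float(np.mean(np.abs(u - positions)))

def regional_score(local_scores):
    """Regional aggregation used for SPAN and NRMSE_CV: 1 - mean local score."""
    return 1.0 - float(np.mean(local_scores))

# 100-year levels of 52 and 48 mm/hr estimated on the two halves of a record
s = span(52.0, 48.0)                      # = 4 / 50 = 0.08
reg = regional_score([0.08, 0.12, 0.10])  # = 1 - 0.10 = 0.90
uniform_ff = ff_reg(np.arange(1, 100) / 100)  # uniform-looking FF values
piled_ff = ff_reg(np.full(99, 0.99))          # maxima systematically missed
```

As intended, FF values that look uniform score near 1, while FF values piled near one value score much lower.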

Results and Discussion
We present the results in the following order: first, we investigate the appropriateness of the EGPD to fit the data of each duration. Then we present the results of the comparison of the IDF models in calibration, and then in cross-validation. Finally, we show some IDF curves modeled with the best model.

Assessment of EGPD Goodness of Fit
The first issue is to investigate whether the EGPD is an appropriate model for the precipitation data at hand. To check this, we fitted the model at each station and for each duration independently; we refer to this EGPD model fitted on each data set separately as the "base" model. We then assess the quality of the resulting fits by computing the NRMSE given in Equation 20. The seasonal boxplots of the score for each duration are shown in Figure 5; the higher the score, the better the model. In spring and summer, the quality of the fit is poorer for durations shorter than 2 hr. In winter, on the other hand, the fit is poorer for d = 48 and 72 hr. Overall, more than 74% of the scores fall above 0.9 and 96% above 0.8. We therefore consider the EGPD a reasonable model for the data.
The fitted shape parameter ξ with respect to duration is shown on Figure 6. Each boxplot contains 81 values, one for each station. We can observe strong dependence of this parameter on duration, especially in summer. For this season, while 75% of the stations have a ξ > 0.17 for d = 1 hr, only 25% have ξ > 0.06 at d = 24 hr. In winter, however, the dependence is not very strong, as judged by the large variability of the boxplots.

Comparison of Models
Results of the model comparison are presented under two frameworks: first in calibration, and then in cross-validation based on split sampling. Figure 7 presents the seasonal boxplots of 1 − NRMSE for the 10 IDF models and the base model. Each boxplot contains 1,053 points, summarizing the score over 81 stations and 13 durations. For the base model (in yellow), the scores are the same as in Figure 5, but here we merge the scores of all the durations together.

Normalized Root Mean Square Error
For all seasons, the two data-driven IDF models always show the best performance compared to the others. Between the two, the globally fitted version generally outperforms the interpolated one, meaning that the global fitting of the model improves the estimation performance compared to the simple interpolation of the locally estimated parameters.
Comparing the IDF_ss and the IDF_kouts (white vs. red boxplots), the results show that for all seasons, the IDF_kouts performs better than the IDF_ss. Recall that the two models differ by the additional parameter θ in the IDF_kouts, which accounts for the curvature at short durations. Allowing for ξ = f(d) (models with subscript _ξ(d)) increases the performance of the models mainly in summer, where all the models without this addition show very poor performance. For the other seasons, the gain in performance is not as pronounced.
Finally, the models allowing for a scaling break (those with subscript _TR) show improved performance compared to those with a single regime for all seasons except summer (e.g., IDF_ss vs. IDF_ss_TR, i.e., the white and violet boxplots).
We note here that we used the NRMSE (Figure 7) to measure the accuracy over all non-zero intensities, because the EGPD model is intended to model all the non-zero intensities correctly. However, we also computed the NRMSE on extremes only, defined as the exceedances over the 98% quantile of all intensities (including zeros), with the normalization done by the average of the exceedances. The result (not shown) yields the same performance ranking of the models as in Figure 7.

Akaike Information Criteria
The result in Figure 7, as measured by the NRMSE, assesses the in-sample accuracy of the models but does not reward parsimony in terms of the number of model parameters. A natural question is whether the additional performance is worth the additional complexity. To answer this, we compute the AIC for each model at each of the 81 stations. Each time, we rank the models from the best (rank = 1, smallest AIC) to the worst (rank = 10, largest AIC). Figure 8 shows the results in the form of stacked-bar charts for the four seasons. The horizontal axis shows the rank from 1 to 10, and the vertical axis shows the percentage of stations at which a model attains that rank. For instance, in winter, the data-driven model is the best (rank = 1) at 82.7% of the stations, while the base model is the best at 17.3% of the stations.
The result shows that for all the seasons, the first and second ranks are almost exclusively shared between the data-driven model and the base model. To summarize, Table 2 gives the ranking of the models for the four seasons; for each rank, the model with the highest percentage is selected (the most likely model at each rank). For all the seasons, the data-driven model is the best, with the base model following behind. The results also reveal the relevance of the models allowing the shape parameter to vary with duration only in summer. For the other three seasons, the gain in performance is not worth the additional parameters required to model the shape parameter as a function of duration. Comparing the models with a constant shape parameter, the IDF_kouts is always the most parsimonious, while the simple-scaling (IDF_ss) remains the worst. In summary, the data-driven model, which had the best in-sample performance among the IDF models (as measured by the NRMSE in Figure 7), remains the best across all the seasons based on the AIC criterion. Although our focus has been on comparing the IDF models that link parameters with duration, we still included the base model in the AIC comparison; the result shows that the base model (with 39 parameters) is less parsimonious than the data-driven model for most of the stations.
Finally, it was not possible to include the interpolated data-driven model here because, unlike the other models, which are estimated by global maximum likelihood, it involves a two-step estimation process: a maximum likelihood step, followed by a second step that estimates the link between the parameters and duration through least squares and segmented regressions. This is not an issue, however, because this model is based on the same principle as the globally fitted data-driven model and simply differs in the estimation method. Comparing the two can thus be seen as comparing two estimation methods within the same model, and according to the NRMSE results in Figure 7, better performance is achieved with the global fitting.

Note. The model with rank 1 is the best, that is, it has the smallest AIC for most of the stations in that season. A rank of 10 indicates the worst-performing model.

Figure 9. Boxplots of the regional cross-validation criteria, NRMSE_CV, FF, and SPAN. For the first two criteria, each boxplot contains 2 × 130 points, corresponding to one regional score for each of the 13 durations and 10 repetitions of the split sampling. For the SPAN, each boxplot contains 130 points. The optimal value for each criterion is equal to 1.

Evaluation in Cross-Validation
The split-sampling procedure allows the comparison of the models in a cross-validation framework. We use three regional criteria, NRMSE_CV, FF, and SPAN (see Section 3.5.2), to compare the models based on their predictive capabilities. We want to select a model which, in addition to fitting the data used to train it, performs reliably and robustly on new data.

Figure 10. Simple-scaling (IDF_ss) and data-driven curves (a) in summer at a station in Zurich (northeast); (b) in autumn at a station in Locarno (Ticino area in the south). The curves are for the return periods T = 2, 5, 10, 50, and 100 years. The points are the empirical quantiles corresponding to T = 2, 5, and 10 years. The envelopes represent the 95% confidence bounds obtained by block bootstrap.
In the following, we present the results in three paragraphs: first, the reliability/accuracy of the models in predicting all the observations as measured by NRMSE_CV; then the reliability in predicting the maxima as measured by the FF criterion; and finally, the robustness in predicting the 100-year return level as measured by SPAN100. Figure 9 presents the results for the four seasons. For all the criteria, a model with a regional score equal to 1 is optimal. For all seasons, the NRMSE_CV shows the data-driven models, specifically the globally fitted one, to be the most accurate/reliable in predicting the entire observations. In winter, however, the difference in performance between the models is less clear. Looking at the summer results, the models that do not account for ξ = f(d) always have the worst performance.
In terms of the FF criterion, the best performance in predicting the maxima in winter is shown by the IDF_ss_TR model. In fact, all the models with no allowance for ξ = f(d) happen to be the most reliable in this season. The converse is true for the remaining seasons: in summer, the data-driven model is the best, while IDF_ss_ξ(d) is the best in spring and autumn.
The robustness criterion, SPAN100, shows the models with no allowance for ξ = f(d) to be the most robust, with the exception of summer, where the data-driven model is the most robust. Higher robustness is thus found for the models not accounting for ξ = f(d) compared to their counterparts, for example, IDF_ss versus IDF_ss_ξ(d), even though the former performs poorly in calibration and is the least performing according to the other cross-validation criteria of reliability. This confirms the earlier observation of Garavaglia et al. (2011) that a robust model can completely fail to model or predict the data. Hence the robustness criterion should only be used alongside reliability criteria, such that the most robust model is selected only among models of similar reliability.
To summarize, the best IDF model should perform well in calibration and should not be overly sensitive to the data used to train it. In calibration, the data-driven model showed the best performance compared to the other nine models; it also remains accurate and reliable at predicting the entire observations in the split-sampling cross-validation (as measured by the NRMSE_CV), especially in summer. This is an important feature since we are interested in the complete range of intensities. Finally, it generally showed more robustness than the other models of similar reliability.

Figure 10a shows the IDF curves from two models, IDF_ss and the data-driven model, along with their 95% CI in summer, at a station in Zurich in the northeast of Switzerland, where summer is the main season of heavy rainfall. As a reminder, the IDF_ss allows scaling only in the scale parameter σ of the EGPD; the other two parameters (κ and ξ) are independent of duration. The data-driven model, on the other hand, allows each of the three parameters to vary with duration. The curves are for return periods T = 2, 5, 10, 50, and 100 years, while the points are the empirical levels for T = 2, 5, and 10 years. The IDF_ss performs poorly at predicting the empirical quantiles, whereas the curves modeled by the data-driven model agree with the empirical levels. Similar IDF curves for autumn are shown in Figure 10b for a station in Locarno in the Ticino area in the south of Switzerland, the area subject to the heaviest precipitation in the country. Again, the data-driven model captures the empirical levels correctly for both short and long durations.

IDF Curves
The 95% confidence bounds for the return periods T = 2, 5, 10, 50, and 100 years at both stations are shown in Figure 11, along with the empirical levels for T = 2, 5, and 10 years (broken lines). For comparison's sake, we also include those of the base model, that is, where the EGPD is fitted to the data of each duration separately (without linkage of parameters to duration). Narrower CIs are obtained with the simple-scaling (IDF_ss) and data-driven models than with the base model. This is because the base model uses less data in its inference than the other two models, where all the data of the 13 durations are pooled in the estimation; the width of the CI is expected to narrow as the available data for estimation increase. The IDF_ss has six parameters fewer than the data-driven model, so a natural question is whether the CI is wider in the latter. Indeed, for levels where both models predict the empirical level correctly (d = 1 hr in the case of Locarno, second row of Figure 11), the CI is slightly narrower for the IDF_ss. But at the same station for d = 24 hr, the bounds are wider, in addition to the bias in predicting the empirical level. Of course, a narrower CI is only relevant if it contains the empirical level.
In Figure 10, the curves of the simple-scaling model (IDF_ss) are not parallel. This behavior results from the definition of IDF models for non-zero precipitation in Equation 5, from which the T-year return level is i_T(d) = F_d^{-1}(1 − 1/(δ_d T)). The term δ_d, representing the average number of non-zero precipitation events per year, varies across the durations, leading to a non-constant slope for the different curves.
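For illustration, this return-level definition can be coded for an EGPD of the first family of Naveau et al. (2016), F(x) = [H_ξ(x/σ)]^κ with H_ξ the unit generalized Pareto CDF (a sketch under that assumption; the parameter values below are arbitrary, not fitted values):

```python
import math

def egpd_quantile(p, kappa, sigma, xi):
    """Quantile of the EGPD F(x) = H(x / sigma) ** kappa, with H the unit
    generalized Pareto CDF (first family of Naveau et al., 2016)."""
    v = p ** (1.0 / kappa)                        # invert the power transform
    if xi > 0:
        return sigma * ((1.0 - v) ** (-xi) - 1.0) / xi
    return -sigma * math.log(1.0 - v)             # xi -> 0 limit

def return_level(T, delta_d, kappa, sigma, xi):
    """T-year level of non-zero intensities: the 1 - 1/(delta_d * T) quantile,
    where delta_d is the mean number of non-zero events per year (Equation 5)."""
    return egpd_quantile(1.0 - 1.0 / (delta_d * T), kappa, sigma, xi)

# delta_d shifts the target probability with duration, which is why the
# simple-scaling curves are not parallel across durations
q100 = return_level(100, 50.0, 1.2, 2.0, 0.15)
q10 = return_level(10, 50.0, 1.2, 2.0, 0.15)
```

With kappa = 1 the EGPD reduces to the generalized Pareto distribution, so the quantile function collapses to the usual GPD formula.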
We finally show, in Figures 12 and 13, respectively, the seasonal 100-year return level maps for d = 1 and 24 hr. The levels in the maps can be interpreted as the levels expected to be exceeded once every 100 seasons. They were obtained with the best-performing model, that is, the data-driven model. Looking at the return levels for d = 1 hr (Figure 12), we see that the levels in winter are the lowest, with no specific spatial pattern or variability. In spring, the levels in the north and Ticino start to increase. Summer has the highest levels, and similar levels are obtained all along the plateau in the north. In autumn, while the levels in the north are comparable to those in winter, those in Ticino are comparable to those in summer. A different spatial pattern is, however, observed for the 100-year return level for d = 24 hr. Specifically, in summer, the levels in the plateau are lower than those along the northern alpine rim (Prealps). The spatial pattern of the levels produced by this model is similar to those observed in earlier studies (see Fukutome et al. (2015) for the hourly, and Haruna et al. (2022) for the 24 hr precipitation).

Figure 11. Return level estimates (points) and their 95% confidence interval bounds (lines) for return periods T = 2, 5, 10, 50, and 100 years, estimated with the simple-scaling (IDF_ss), data-driven, and base models (the base model has no linkage of extended generalized Pareto distribution parameters with duration). The confidence bounds were obtained using block bootstrap. The top row is for a station in Zurich (northeast) in summer, for d = 1 and 24 hr. The second row is for a station in Locarno (Ticino area in the south) in autumn, for d = 1 and 24 hr. In all cases, the broken lines in red, black, and yellow are the empirical levels for T = 2, 5, and 10 years, respectively.

Discussion
In the following paragraphs, we briefly discuss some of our choices in terms of the functional forms of the data-driven models, taking into account the varying shape parameter with duration, and the issue of scaling break in the data.
First, for the data-driven models, we limited our choice of functional relationships to simple parametric models, specifically piece-wise linear models. Other choices are possible, such as smooth regression splines (e.g., Youngman, 2019, 2020). This choice has advantages and drawbacks. The advantage of splines is that they automatically adjust to fit any form of relationship. The main drawback is that they are inherently non-parametric, so mapping the IDF models to allow predictions at ungauged locations is not directly possible; one can only map the three EGPD parameters for a particular duration, which for 13 durations means 3 × 13 = 39 maps. With our choice of linear functions, 10 parameters describe the IDF curves at each station, and hence 10 maps suffice for the whole study area.

Regarding the variation of the shape parameter ξ with duration d, some earlier studies observed or discussed it (e.g., Ulrich et al., 2021; Veneziano et al., 2007), but did not model it, either because of the weak form of the relationship or because the IDF model did not allow for it. Here, especially in summer, we found a very strong dependence, and the results showed that taking it into account is valuable. For the other seasons, however, the AIC results showed that it is more parsimonious to let the parameter be independent of duration. We note that the strength of the variation of the shape parameter with duration differs across the four seasons: in our case, it is strongest in summer and weakest in winter. A possible explanation is that the precipitation formation mechanisms responsible for short- and long-duration precipitation differ between seasons. For instance, in winter, the same frontal system is responsible for both short- and long-duration events, and hence the variation of the shape parameter with duration is weak.
In summer where the linkage is strong, different systems are responsible for the short-duration intensities (e.g., due to short convective events) and long-lasting duration intensities (e.g., due to frontal systems). The other seasons (spring and autumn) present a mixture of the two behaviors.
Finally, we observed a break in the scaling relationship of the EGPD scale parameter. The existence of scaling regimes has been investigated by Fraedrich and Larnder (1993), who linked it to atmospheric processes such as the structure of frontal systems, the diurnal cycle, the seasonal periodicity, and climatic variability. Hence, break points observed at stations could be explained by a transition from one rainfall system to another, for example, from short convective events at short durations to frontal systems at longer durations. Breaks at very short durations, around 1 hr, are however generally regarded as artificial, imposed by the resolution of the gauge (measurement precision errors). That said, the identification of scaling breaks involves some uncertainty depending on the method used. For instance, Paschalis (2013) tried to identify scaling breaks in Switzerland using the same 10 min data sets as in our case, with spectral density analysis based on the Fourier transform and on wavelet decomposition. While both methods agreed on the existence of scaling breaks, the exact break locations differed, and the author mentioned that no clear seasonal or regional pattern was observed. In our case, we rely on the estimated EGPD scale parameter to identify the scaling breaks; other authors relied on moment analysis (plots of moments vs. duration in log-log scale for different orders) (Bougadis & Adamowski, 2006; Nhat et al., 2008). In this light, we do not rule out a possible effect of the EGPD model on the observed scaling breaks. A dedicated study would be required to characterize and interpret the scaling breaks in the study area, and any detailed interpretation here would be premature.

Conclusions
Our aim in this paper was to build IDF curves using all the non-zero precipitation data in Switzerland. To achieve this, we used the EGPD model as the parametric model for the precipitation intensities. The literature presents various approaches to link the different durations together in IDF curves. We considered three of these approaches to build the IDF curves while using the EGPD as the parametric model. The first approach is based on the scale invariance theory, where IDF curves are built based on the scaling behavior of precipitation. The second approach is based on the general IDF formulation of Koutsoyiannis et al. (1998), which generalizes the various traditional IDF formulations. The last approach is called the data-driven approach, where each parameter can vary with duration, and the form of the relationship is empirically determined by the data at hand.
We started from these three approaches and added some extensions to account for scaling break and varying shape parameter with duration. We ended up with a total of 10 IDF models. We then compared them, first in calibration, and then in a split-sample cross-validation approach.
The results showed that, given the EGPD as the parametric model, the data-driven IDF-EGPD model is the best for the data at hand. The AIC also showed this model to be more parsimonious than the base model, where no linkage of parameters with duration is considered. The IDF curves based on simple scaling and on the general formulation of Koutsoyiannis et al. (1998) did not perform as well, even with the added extensions for the scaling break and for the variation of the shape parameter with duration. The fact that the simple-scaling IDF models performed poorly in summer confirms the previous findings of Molnar and Burlando (2008) and Paschalis (2013) that, in Switzerland, summer precipitation shows multiscaling behavior.
Although our work focused on Switzerland, the data-driven approach can be applied everywhere, especially for regions where high-resolution data are available. This is because the approach considers the empirical relationship between the model parameters and duration, and is not constrained by the hypothesis of scaling across durations. Our result also showed that it is possible to model the linkage of the shape parameter with duration in IDF curve modeling. Moreover, the study highlights the need to explore in detail the empirical relationship of model parameters with duration before the application of any of the widely used IDF construction approaches. Finally, we showed that it is possible to use the EGPD for IDF curve modeling.
In terms of perspectives, it would be interesting to produce maps of the parameters to allow for predictions at ungauged sites. This could be achieved by simple interpolation of the local IDF parameters as done by Blanchet et al. (2016), or through quantile regression methods (Ouali & Cannon, 2018), or by global estimation using spatial covariates (e.g., Ulrich et al., 2020). Another possibility is to use a regionalization technique, such as the method of Hosking and Wallis (2005) and then interpolate the index flood to allow predictions at the ungauged sites (e.g., Mascaro, 2020).
It should be mentioned that throughout this work, we estimated the IDF models through the independence likelihood, thus omitting the correlation between different timesteps and durations. In the case of annual maximum series (GEV), Nadarajah et al. (1998) have modeled the dependence between the different time steps using multivariate extreme value distributions, and Tyralis and Langousis (2019) followed suit by using max-stable processes. Later, Jurado et al. (2020) investigated the impact of accounting for this dependence in extremes and showed that there is little gain in performance, in addition to the added complexity of using max-stable processes. Since all these authors focused on the distributions that are based on extreme data only, an interesting perspective will be to investigate the effect of accounting for the dependence in IDF curve modeling for the case of the EGPD that uses all the non-zero data.
Finally, consideration of the effect of climate change in building IDF curves is invaluable. For instance, Cheng and AghaKouchak (2014) showed that by neglecting nonstationarity in modeling IDF curves, there could be up to 60% underestimation of extreme precipitation, especially for short durations. It would therefore be interesting to model the curves while accounting for a warmer climate (e.g., Cheng & AghaKouchak, 2014;Kristvik et al., 2019;Mirhosseini et al., 2013;Ouarda et al., 2019;Ragno et al., 2018).

B2. Plots of the EGPD Parameters Versus Duration
In this section, we explore the relationship between the three EGPD parameters and duration. Two stations, located at Zurich (KLO) and Robbia in Graubünden (ROB), are given as examples. For each season, the fitted parameter from the base model is plotted against the duration d.

Figure B1. Relationship between the extended generalized Pareto distribution (EGPD) κ parameter and duration d for a station in Zurich (KLO). To obtain this, we fitted the EGPD to the data of each of the 13 durations separately by maximum likelihood. The black points are the fitted κ. The broken lines in red represent the 95% confidence interval for the fitted model (log-log TR); see Equation 11.

Figure B2. Same as in Figure B1, but for the extended generalized Pareto distribution scale parameter σ.

Figure B3. Same as in Figure B1, but for the extended generalized Pareto distribution shape parameter ξ. The equations for the fitted models are given in Appendix B. The broken lines represent the 95% confidence interval for the linear-log model.

Water Resources Research
Figure B4. Same as in Figure B1, but for a station at Robbia in Graubünden (ROB).

Figure B5. Same as in Figure B4, but for the extended generalized Pareto distribution scale parameter σ.

Figure B6. Same as in Figure B4, but for the extended generalized Pareto distribution shape parameter ξ. The equations for the fitted models are given in Appendix B. The broken lines represent the 95% confidence interval for the linear-log model.

Data Availability Statement
The data used in this study are maintained by the Swiss Federal Office of Meteorology and Climatology, MeteoSwiss, and can be freely obtained from IDAWEB (2021). The software for this research is freely available as an R package (Haruna, 2023) under the GPL-3 license.