Assessment of parametric approaches to calculate the Evaporative Demand Drought Index

The Evaporative Demand Drought Index (EDDI), based on atmospheric evaporative demand, was proposed by Hobbins et al. (2016) to analyse and monitor drought. The EDDI uses a nonparametric approach in which empirically derived probabilities are converted to standardized values. This study evaluates the suitability of eight probability distributions to compute the EDDI at 1‐, 3‐ and 12‐month time scales, in order to provide more robust calculations. The results showed that the Log‐logistic distribution is the best option for generating standardized values over very different climate conditions. Likewise, we contrasted this new parametric methodology to compute EDDI with the original nonparametric formulation. Our findings demonstrate the advantages of adopting a robust parametric approach based on the Log‐logistic distribution for drought analysis, as opposed to the original nonparametric approach. The method proposed in this study enables effective implementation of EDDI in the characterization and monitoring of droughts.


| INTRODUCTION
Drought is one of the main climate hazards affecting society and the environment, with severe impacts on agriculture, natural ecosystems and water supplies (Wilhite, 1993;Wilhite and Pulwarty, 2017). It is not easy to identify and quantify droughts in terms of intensity, magnitude, duration and spatial extent (Wilhite and Glantz, 1985;Vicente-Serrano, 2016). For this reason, a great deal of effort has been invested in developing objective methods to quantify drought severity, as well as the impacts on various natural and socioeconomic sectors.
Climatic drought indices are one of the most broadly used approaches in identifying and quantifying this type of event (Heim, 2002;Keyantash and Dracup, 2002;Mukherjee et al., 2018). Currently, there is a wide variety of drought indices (Mishra and Singh, 2010) based on climatic information, and typically used in drought analysis and monitoring (Mckee et al., 1993a;McKee et al., 1993b;Vicente-Serrano et al., 2010).
Traditionally, drought indices are calculated from precipitation data. However, this perspective is insufficient in that it does not include all the variables causing drought severity, among which the atmospheric evapora-tive demand (AED) is also highly significant (Hobbins et al., 2017;Vicente-Serrano et al., 2020). Several studies have suggested that AED is crucial in the development and intensification of certain drought events (Ciais et al., 2005;Hunt et al., 2014;Otkin et al., 2016;Zhang and He, 2016;García-Herrera et al., 2019). Thus, recent drought indices include AED in calculations, among others the Standardized Precipitation Evapotranspiration Index (SPEI; Vicente-Serrano et al., 2010;Beguería et al., 2014) or the Standardized Evapotranspiration Deficit Index (SEDI; Kim and Rhee, 2016;Vicente-Serrano et al., 2018). Hobbins et al. (2016) and McEvoy et al. (2016) formulated the Evaporative Demand Drought Index (EDDI), based exclusively on AED data. Adopting the AED as a unique metric of drought severity could give rise to problems in certain circumstances given its complex influence on drought severity (Vicente-Serrano et al., 2020), although it could be very useful during periods of very low soil moisture and strong landatmosphere feedbacks (Hobbins et al., 2017;Miralles et al., 2019). Thus, the EDDI could identify anomalous increases in the AED that can trigger drought conditions (Pendergrass et al., 2020). Given the complexity of droughts, as many drought indices as possible should be included, since they can complement each other and provide a more accurate picture of drought severity, so the EDDI is a valuable tool that must be tested and used and, if possible, improved.
Unlike other widely used drought indices such as the Standardized Precipitation Index (SPI; Mckee et al., 1993aMckee et al., , 1993b and the SPEI, the EDDI is based on a nonparametric approach using empirically derived probabilities to obtain a standardized index that can be compared in space and time. This methodological approach is very flexible as it can be used without adopting a specific probability distribution for the reference variable (e.g., precipitation, AED, soil moisture, etc.). Given that nonparametric approaches do not assume that there is a suitable probability distribution representative of the data, there is no need to estimate parameters and evaluate goodness-of-fit, which is a computational advantage (Farahmand and AghaKouchak, 2015). However, as parametric approaches are not bound by the highest and lowest observed values, they have the advantage over nonparametric approaches. This is critical for drought monitoring, as if the new value corresponds to the lowest and highest ones, the index cannot be adequately modelled above or below these maximum and minimum values. Thus, parametric methods are more suitable than nonparametric approaches to calculate drought indices based on a defined reference period so that a new value can be easily placed within the range of the theoretical probability distribution (Beguería et al., 2014;Stagge et al., 2015;Vicente-Serrano and Beguería, 2016;Svensson et al., 2017). In addition, the range of index values in nonparametric methods is a function of the length of the reference climatology, which limits their use when long-term data are not available.
In general, parametric approaches are better at modelling distribution tails corresponding to the most extreme values (Vicente-Serrano and Beguería, 2016), and at determining the anomalous character of a single value, because they are not heavily constrained by the available observations, as in the case of nonparametric approaches. This is a very relevant issue in accurate drought characterization, as these extreme values are crucial for determining the severity and intensity of drought episodes.
If the EDDI is intended to be included in precise drought quantification and monitoring, it is necessary to find the most accurate approach (i.e., probability distribution) to calculate it parametrically. Studies have been made to determine the most accurate probability distributions for calculating the SPI, SPEI and SEDI (Mckee et al., 1993a;1993b;Stagge et al., 2015;Vicente-Serrano and Beguería, 2016;Vicente-Serrano et al., 2018). There is general agreement that the Gamma distribution performs better in calculating the SPI, and the Log-logistic distribution provides the best results for the SPEI and the SEDI.
In this study, we tested several probability distributions and proposed a method to calculate the EDDI by means of a parametric approach. Likewise, we compared this new EDDI formulation based on a parametric approach with the original nonparametric formulation proposed by Hobbins et al. (2016). For this purpose, we used a recently developed high spatial resolution gridded dataset of the AED in Spain (Tom as-Burguera et al., 2019), since Spain is characterized by large spatial and seasonal differences in the AED Tomas-Burguera et al., 2017;Tomas-Burguera et al., 2021), and enabling the capacity of different probability distributions to be assessed over a wide range of climate conditions.

| Atmospheric evaporative demand dataset
This study used a high spatial resolution (1.21 km 2 ) gridded climate dataset with coverage for mainland Spain and the Balearic Islands at monthly temporal resolution over the period 1961-2018. The dataset is based on the entire daily observational information from the National Spanish Meteorological Service (AEMET), which was subjected to a thorough quality control and homogenization process (Tom as-Burguera et al., 2016). Details of the process followed in developing the dataset are available in Tomas-Burguera et al. (2019). The reference evapotranspiration (ET o ), a robust metric of the AED (Vicente-Serrano et al., 2020), was calculated from the maximum and minimum temperature, relative humidity, wind speed, and sunshine duration, using the FAO-56 Penman-Monteith equation (Allen et al., 1998).
The AED is driven by a radiative component, reflecting the available energy to vaporize water, and an aerodynamic component that reflects the capacity of the air to store water . The role of AED in the development and intensification of droughts can be very complex and it is closely related to climatic characteristics, exhibiting large contrasts between humid regions characterized by energy-limited conditions and dry regions characterized by water-limited conditions (Vicente-Serrano et al., 2020). Thus, under the former, increases in AED are not expected to result in a drought, T A B L E 1 Cumulative distribution functions F(x) of the eight probability distributions tested

| Evaluation of different probability distributions for EDDI computation
We tested eight probability distributions to calculate the EDDI, including the three-parameter General Extreme Value, Log-logistic, Lognormal, Pearson III, Generalized Pareto and Weibull distributions, and the two-parameter Normal and Exponential distributions. The probability distributions have been widely tested for different hydroclimatic applications (Hosking, 1990;Rao and Hamed, 2000) and are the most commonly used for scientific and applied purposes (Bobee and Ashkar, 1991;Guttman, 1999;Vicente-Serrano et al., 2010, 2012Stagge et al., 2015;Barker et al., 2016). Although some studies suggest using other distributions to calculate standardized drought indices (e.g., the Tweedie distribution; Svensson et al., 2017), our preference was for those more widely used by the scientific community. The cumulative distribution functions of the eight probability distributions tested for EDDI computation are described in the Table 1. More details about the probability distributions are described in depth in Hosking et al. (1985), Hosking (1986Hosking ( , 1990, Singh et al. (1993), and Rao and Hamed (2000). The parameters of each probability distribution were calculated using unbiased probability weighted moments (UB-PWMs; Hosking, 1990). Each monthly AED series aggregated at different time scales (i.e., 1, 3 and 12 months) over the period 1961-2018 were fitted to each of the eight probability distributions. If the probability distribution proved suitable for fitting monthly AED series, the cumulative probabilities of AED values were calculated and transformed into a normal distribution with a standard deviation equal to 0 and 1 [N(0,1)] to obtain standardized units (i.e., values of EDDI) using the classic approach of Abramowitz and Stegun (1965). The EDDI was calculated for 1-, 3-and 12-month time scales, reversing the index sign (i.e., higher AED results in lower EDDI values).
The flowchart followed to select the optimal probability distribution to calculate the EDDI is presented in Figure 1. We used four approaches to test the suitability of the different distributions: 1. Visual checking of the goodness-of-fit of probability density functions to monthly AED series: for a first evaluation of the suitability of the probability distributions, we examined the fit of the probability density functions of each distribution to the monthly AED series aggregated at 1, 3 and 12 months in locations and seasons with significant climatic contrasts. 2. Determining the percentage of monthly AED series that cannot be fitted by the selected distribution and do not provide a solution for the EDDI: to check the goodness-of-fit of the probability distributions, we fitted the probability density functions of each of the eight distributions to monthly AED series aggregated at 1, 3 and 12 months. Given that the parameters from a specific probability distribution cannot be fitted to the AED data, a solution cannot be found for the EDDI. Also, there are some cases in which the origin parameter of the distribution can be higher than the lowest observed AED value, again providing no solution for the EDDI. In order to evaluate the robustness of the eight probability distributions, we calculated the percentage of monthly series for each probability distribution with no solution for the EDDI. 3. Examining the normality of the resulting EDDI series: to determine the normality of the resulting EDDI series from the probability distributions, we used the Shapiro-Wilks test. A rejection rate of p < .05, corresponding to a 95% confidence level, was selected to accept that the EDDI series follows a normal distribution. 4. Analysing the frequency of high and low EDDI values: since distributions model the low and high values, they are very important in assessing the quality of the fit (Vicente-Serrano and Beguería, 2016;Vicente-Serrano et al., 2018). Therefore, we also compared the frequency of low and high EDDI values (on 1-, 3-and 12-month time scales) and the associated return period obtained by the most suitable probability distributions. More specifically, we compared the relative frequency of negative extreme EDDI values (threshold of −2.58, corresponding to a return period of 1 in 200 years).

| Comparison between parametric and nonparametric EDDI formulation
We contrasted the new parametric approach suggested in this study with the original nonparametric formulation proposed by Hobbins et al. (2016) for EDDI computation. For this purpose, we compared the EDDI series obtained by both approaches at 1-, 3-and 12-month time scales over the period 1961-2018 in several locations of Spain that represent a wide variety of annual AED values ( Figure 2). In order to illustrate some of the advantages of the parametric approach, especially those related to drought characterization and monitoring, we also calculated and compared the EDDI series obtained by both approaches based on a reference period (i.e., 1961-1989) at 1-, 3-and 12-month time scales over the period 1961-2018 as well as during two extreme drought events. The sign of the nonparametric EDDI was also reversed (i.e., higher AED results in lower EDDI values).

| Evaluation of different probability distributions for EDDI computation
The eight candidate probability distributions for EDDI calculation were evaluated and successively filtered following the four criteria proposed (see Figure 1): 1. Figure 3 presents several examples from the February and August AED series from different locations over Spain aggregated at 1-, 3-and 12-month time scales with the eight theoretical distributions that fit the data. In general, all probability distributions exhibit great flexibility and goodness-of-fit, with the exception of Exponential and Generalized Pareto distributions that, in most cases, do not fit the AED data and are therefore unsuitable for EDDI calculation. As depicted, both the peak and the lower and upper tails of the probability density function of General Extreme Value, Log-logistic, Lognormal, Pearson III, Weibull and Normal distributions are generally well adapted to AED histograms, regardless of the time-scale and climatic conditions. Therefore, and given that these six probability distributions exhibit a similar fit at this stage, it is difficult to determine which distribution is most suitable for EDDI computation. 2. Table 2 shows the percentage of monthly AED series computed at the time scales of 1-, 3-and 12-months with no solution from each of the eight probability distributions tested. As expected, the Generalized Pareto and Exponential distributions exhibited a T A B L E 2 Percentage of the total monthly series of AED with no fitting solution tested through the eight probability distributions at 1-, 3-and 12-month time scales on mainland Spain and the Balearic Islands percentage of series with no solution for the EDDI close to 100% in all months and time scales, so they were rejected for EDDI calculation. The Weibull, Lognormal and General Extreme Value (GEV) distributions also showed high percentages of monthly series with no solution at 1-, 3-and 12-month time scales, so they were also discarded as unreliable alternatives for calculating the EDDI. Thus, further evaluations were based on only the three distributions that exhibited a low percentage of monthly AED series with no fitting solution (i.e., the Log-logistic, Pearson III and Normal distributions) to EDDI computation at 1-, 3-and 12-month time scales. 3. Table 3 depicts the percentage of monthly EDDI series obtained at 1-, 3-and 12-month time scales from the three remaining probability distributions (Log-logistic, Pearson III and Normal) that follow a normal distribution according to the Shapiro-Wilks normality test (95% confidence level). The Log-logistic and Pearson III distributions were the highest overall, with values that commonly exceeded 95% at 1-, 3-and 12-month time scales. The normal distribution exhibited the highest percentage of series in which the normality of the series is rejected, especially in November and December at short time scales. Figure 4 shows the spatial distribution of monthly EDDI series calculated through Log-logistic, Pearson III and Normal for which the null hypothesis of normality was rejected over mainland Spain and the Balearic Islands. The EDDI calculated by the Log-logistic returned series that follow a normal standard distribution for almost all months and time scales over the whole of Spain. Similarly, the EDDI series obtained by Pearson III distribution followed a normal distribution in most of the study area at 1-, 3-and 12-month time scales. In addition, Log-logistic and Pearson III did not reveal any spatial biases. On the contrary, the normal distribution showed wide areas in which the null hypothesis of normality was rejected, especially in November and January at the 1-month time scale, but also in central regions in July and September. Likewise, there are wide areas of the northwest in August at 3-month and southwest at 12-month time scales in which the series do not follow a normal distribution. Therefore, only Log-logistic and Pearson III distributions were used for further assessment. 4. Figure 5 shows the percentage of the total monthly EDDI series which returned values of less than −2.58 (corresponding to a return period of 1 in 200 years) for Log-logistic and Pearson III distributions at 1-, 3and 12-month time scales. Given the available length of the AED series , it was expected that these extreme values would be infrequent. Nevertheless, the Pearson III distribution provided a large percentage of series with extreme values, which unrealistically overestimates the frequency of these extreme drought events in comparison with a more coherent frequency provided by the Log-logistic distribution. The spatial distribution of these extreme values showed wide variability at 1-, 3-and 12-month time scales (Figure 6). In general, the Log-logistic distribution displayed a low frequency of extreme negative values in all months and across the whole study area, regardless of the time scale. On the contrary, the Pearson III distribution provided a high number of extreme negative values in several months at 1-, 3and 12-month time scales. For example, in December and July EDDI series at the 1-month time scale showed cases below −2.58 across most of the study area. Likewise, large parts of the study area exhibited EDDI series with cases below −2.58 in February and November at the 3-month time scale, and also at the 12-month time scale in western regions. This demonstrates that the Pearson III distribution generally provides an unrealistic frequency of extreme EDDI values. Figure 7 illustrates the relationship between EDDI values at 1-, 3-and 12-month time scales and the associated return periods obtained by the Log-logistic and Pearson III distributions. The EDDI values obtained by both distributions at different time scales showed a high degree of consistency over a wide range (±1.80σ) in which the Log-logistic and Pearson III distribution provided similar values. However, there are notable differences in the lower and upper tails of distributions, corresponding to the extreme EDDI values. As depicted, the Pearson III distribution exhibited more extreme negative and positive EDDI values than the Log-logistic across all time scales, but especially at short time scales. Consequently, the associated return periods obtained through the Pearson III distribution are higher than for the Log-logistic, regardless of time scale. It was noted that Pearson III resulted in some cases in return period of 1 in 500 years for EDDI values, which reported periods shorter than 1 in 100 years with the Log-logistic distribution. Therefore, the Pearson III distribution was rejected in favour of the Log-logistic distribution, which provides much more coherent extreme values and associated return periods for EDDI computation.

| Comparison between parametric and nonparametric EDDI formulation
The parametric approach providing the best performance for EDDI computation (i.e., the Log-logistic distribution) was contrasted with the original nonparametric approach (i.e., that of Hobbins et al. (2016)) at a variety of time scales and climatic conditions. For several locations in F I G U R E 4 Spatial distribution of monthly EDDI series calculated through the Log-logistic, Pearson III and Normal at 1-, 3-and 12-month time scales for which the null hypothesis of normality was rejected by the SW test at a confidence level p < .05 on mainland Spain and the Balearic Islands over the period  Spain, the EDDI was estimated using both approaches at 1-, 3-, and 12-month time scales and using two different reference periods: a 29-year period from 1961 to 1989 and the full 58-year period of record from 1961 to 2018. In general, both approaches exhibited a robust performance to model EDDI values when the index was calculated retrospectively for long-term periods regardless of time scale and climatic conditions (Figure 8). Only in few cases (i.e., very extreme dry/wet episodes) did the nonparametric approach show limitations in modelling extreme EDDI values when the entire period (1961-2018) was used to calculate the index. However, when the shorter reference period  is used to compute the EDDI, a common practice in operational drought monitoring, the nonparametric approach cannot model the extreme values at different time scales if the new values exceed the maximum or minimum value of the reference climatology ( Figure 9). As depicted, the limitations of a nonparametric approach to modelling extreme EDDI values are frequent and easily recognized during dry/wet F I G U R E 6 Spatial distribution of monthly EDDI series with cases below −2.58 (return period of 1 in 200 years) at 1-, 3-and 12-month time scales on mainland Spain and the Balearic Islands over the period 1961-2018 F I G U R E 5 Percentage of the total monthly EDDI series with cases below −2.58 (return period of 1 in 200 years) at 1-(all months), 3-(February, May, August and November) and 12-month (December) time scales on mainland Spain and the Balearic Islands periods at different time scales and climatic conditions, but especially at long time scales as evidenced in Madrid (Figure 9d) or Seville (Figure 9f). This issue of nonparametric approaches to modelling EDDI values at long time scales is very common during drought episodes in central and southern Spain (Figure 10), since these areas generally show a positive trend in AED and the periods characterized by strong increases in AED were recurrent over the last two decades. In contrast, the parametric approach based on Log-logistic distribution shows wellmodelled extreme EDDI values, even if the new values are outside of the reference climatology, regardless of time scale and climatic conditions (Figure 9). Likewise, this approach reported a robust performance with series that show a trend or high frequency of extreme drought episodes ( Figure 10).
To illustrate the relevance of this issue in detail, we compared the EDDI series obtained through parametric and nonparametric approaches based on the 29-year reference period (i.e., 1961-1989) at 12-month time scales in several locations during two extreme drought episodes in 1990 and 2017 ( Figure 11). During these periods characterized by severe drought conditions affecting large areas of northern (Figure 11a), central, and southern F I G U R E 7 Relationship between EDDI values and the associated return periods calculated with Log-logistic and Pearson III distribution at 1-, 3-and 12-month time scales. The colours represent the point density, with the highest density shown in red [Colour figure can be viewed at wileyonlinelibrary.com] Spain ( Figure 11b) and lasting for approximately 1 year, the nonparametric approach cannot adequately model EDDI values because these anomalous AED values are outside the climatology used as a reference to compute the index. On the other hand, the parametric approach based on Log-logistic provides very relevant information on the severity and intensity of the drought events, making it possible to accurately identify how the drought conditions developed over time and space. Thus, for example, it can be seen how the drought of 1990 reached its maximum intensity in summer (Figure 11a) or how the drought of the 2017 progressed in intensity from the central (i.e., Madrid) to the southern regions of Spain (i.e., Mérida and Seville) over the period (Figure 11b).

| CONCLUSIONS
This study assessed the suitability of eight parametric distributions of probability to calculate the Evaporative Demand Drought Index (EDDI). This was tested in mainland Spain and the Balearic Islands over the period . The majority of the tested probability distributions had no fitting solution to calculate the EDDI and were rejected. From the eight probability distributions tested, only the Log-logistic, Pearson III, and Normal provided solutions for EDDI calculation over most of the study area at 1-, 3-and 12-month time scales. However, the normal distribution was also discarded because it exhibited a high percentage of EDDI series that did not follow a normal distribution relative to the Pearson III and Log-logistic distributions. Finally, the Pearson III F I G U R E 9 EDDI series from (a) Santander, (b) Zaragoza, (c) Valladolid, (d) Madrid, (e) Valencia and (f) Seville at 1-, 3-and 12-month time scales computed through a parametric and a nonparametric approach based on a reference period  [Colour figure can be viewed at wileyonlinelibrary.com] F I G U R E 1 0 EDDI series from Toledo, Mérida and Granada at 12-month time scale over the period 1961-2018, computed through a parametric and a nonparametric approach based on a reference period  [Colour figure can be viewed at wileyonlinelibrary.com] F I G U R E 1 1 EDDI series during drought events of (a) 1990 (Oviedo, Santander and Bilbao) and (b) 2017 (Madrid, Mérida and Seville) at 12-month time scale, computed through a parametric and a nonparametric approach based on a reference period  [Colour figure can be viewed at wileyonlinelibrary.com] distribution was also discarded because it yielded a higher frequency of positive and negative extreme values and longer return periods than Log-logistic distribution, regardless of the time scale analysed. Therefore, we conclude that the Log-logistic is the most suitable and robust probability distribution for EDDI computation using a parametric approach.
The parametric approach based on Log-logistic distribution proposed in this study also performed better when compared to the original nonparametric approach, which is heavily constrained by the length of the series since the distribution is bound by the highest and lowest observational values, which limits the modelling of new values more extreme than that observed in the reference climatology. Thus, the original nonparametric approach showed very similar values to the parametric approach based on Log-logistic distribution when the index is computed retrospectively and long-term periods are available, but it exhibited notable limitations in modelling new EDDI values when the index calculations are based on a previous reference period. This demonstrates the issues of adopting a nonparametric approach to modelling the extreme values of EDDI, especially if long-term series are not available. In contrast, the parametric approach based on Log-logistic distribution modelled extreme EDDI values very well, even when using a reference period, as it can model the new values outside the reference climatology, providing an important advantage for drought analysis and monitoring.
Therefore, based on the results obtained in this study, we recommend the use of Log-logistic distribution to calculate the Evaporative Demand Drought Index (EDDI). This distribution proved to be the best fit to AED series for EDDI calculation and provided robust results, regardless of the time scale and climate region. Likewise, Loglogistic distribution also returned a better performance compared to the original nonparametric formulation for EDDI computation, since this parametric approach is less limited by the length of the climatology. The Log-logistic distribution has already been recommended for calculating other drought indices such as SPEI (Vicente-Serrano et al., 2010;Vicente-Serrano and Beguería, 2016) and SEDI (Vicente-Serrano et al., 2018) worldwide. Our study focused exclusively on Spain, but given the wide range of climatic conditions characteristic of the country and the absence of spatial bias in fitting AED series, we consider that the results seen here may be representative for other regions; we therefore also recommend the Log-logistic distribution for calculating the EDDI in other areas of the world. In summary, this study provided a robust parametric approach for EDDI computation, indicating that this standardized drought index can be optimally implemented in drought analysis and monitoring. The code used to calculate EDDI based on the Log-logistic distribution in the R programming language is available on (https://github.com/ivannoguera/EDDI-Log-logistic).

ACKNOWLEDGEMENTS
This work was supported by the research projects CGL2017-82216-R, PCI2019-103631 and PID2019-108589RA-I00 financed by the Spanish Commission of Science and Technology and FEDER; CROSSDRO project financed by the AXIS (Assessment of Cross(X)-sectoral climate Impacts and pathways for Sustainable transformation), JPI-Climate co-funded call of the European Commission and INDECIS which is part of ERA4CS, an ERA-NET initiated by JPI Climate, and funded by FORMAS (SE), DLR (DE), BMWFW (AT), IFD (DK), MINECO (ES), ANR (FR) with co-funding by the European Union (grant 690462). The first author is the beneficiary of the predoctoral fellowship 'Subvenciones destinadas a la contrataci on de personal investigador predoctoral en formaci on (2017)' awarded by the Gobierno de Arag on (Government of Arag on, Spain).